The power of large language models (LLMs) has grown substantially during the last couple of years. These versatile AI-powered tools are deep learning artificial neural networks trained on massive datasets, with billions of parameters (or machine learning variables) that they use to perform various natural language processing (NLP) tasks.
These tasks run the gamut from generating, analyzing and classifying text, to generating rather convincing images from a text prompt, to translating content into different languages, to powering chatbots that can hold human-like conversations. Well-known LLMs include proprietary models like OpenAI’s GPT-4, as well as a growing roster of open source contenders like Meta’s LLaMA.
But despite their considerable capabilities, LLMs have some significant disadvantages. Their sheer size often means that they require hefty computational resources and energy to run, which can put them out of reach of smaller organizations that lack the deep pockets to bankroll such operations. With larger models there is also the risk of algorithmic bias being introduced via datasets that are not sufficiently diverse, which can lead to faulty or inaccurate outputs. A related problem is the dreaded “hallucination,” as the industry calls output that is stated confidently but is false.
These issues may be among the many factors behind the recent rise of small language models, or SLMs.
Small language models are slimmed-down versions of their larger cousins, and for smaller enterprises with tighter budgets, SLMs are becoming a more attractive option, because they are generally easier to train, fine-tune and deploy, and also cheaper to run.
Small language models are essentially more streamlined versions of LLMs, with smaller neural networks and simpler architectures.
Compared to LLMs, SLMs have fewer parameters and need less data and time to train: think minutes or a few hours of training time, versus many hours or even days for an LLM. Because of their smaller size, SLMs are generally more efficient and more straightforward to deploy on-site or on smaller devices.
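To make the link between parameter count and hardware needs concrete, here is a rough back-of-the-envelope sketch. It assumes that the memory needed just to hold a model's weights is roughly the number of parameters times the bytes used to store each one. The helper name `weight_memory_gb` and the 70-billion-parameter comparison model are illustrative assumptions, and the 2.7 billion figure is the Phi-2 size cited later in this article.

```python
def weight_memory_gb(num_params: int, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes).

    Ignores activations, caches and framework overhead, so real usage is higher.
    """
    return num_params * bytes_per_param / 1e9


# 2.7 billion parameters is the Phi-2 size quoted in this article.
SMALL_MODEL_PARAMS = 2_700_000_000
# Hypothetical larger model, chosen only for comparison (an assumption).
LARGE_MODEL_PARAMS = 70_000_000_000

for label, params in [
    ("small model", SMALL_MODEL_PARAMS),
    ("large model (hypothetical)", LARGE_MODEL_PARAMS),
]:
    fp16 = weight_memory_gb(params, 2)  # 16-bit floats take 2 bytes each
    int8 = weight_memory_gb(params, 1)  # 8-bit integers take 1 byte each
    print(f"{label}: about {fp16:.1f} GB at 16-bit, {int8:.1f} GB at 8-bit")
```

By this estimate the smaller model's weights could fit on a single consumer-grade device, while the hypothetical larger one would need far more memory than such devices typically offer.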
Similar to their larger cousins, small language models use a type of deep learning neural network architecture known as the transformer model. Introduced by Google researchers back in 2017 in a paper titled “Attention Is All You Need,” transformers have revolutionized NLP during the last few years, paving the way for the generative pre-trained transformers (GPTs) that underlie some of today’s most massive and powerful large language models.
Generally, these are the basic building blocks of the transformer model architecture:
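One building block at the heart of the transformer is attention, the mechanism named in the paper's title. As a minimal sketch, assuming plain Python lists in place of real tensors, the scaled dot-product form compares each query vector with every key vector, turns the scores into weights with a softmax, and returns a weighted average of the value vectors. The function names here are illustrative and not taken from any library.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """For each query, return a weighted average of the value vectors.

    queries, keys and values are lists of equal-length vectors (lists of floats);
    there must be one value vector per key vector.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Blend the value vectors according to the attention weights.
        blended = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(blended)
    return outputs


# Two identical keys receive equal weight, so the output is the mean of the values.
result = scaled_dot_product_attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [1.0, 0.0]],
    values=[[0.0, 2.0], [4.0, 6.0]],
)
print(result)
```

Production models run many such attention "heads" in parallel over learned projections of the input, which this sketch leaves out.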
Small language models are typically made from large language models using an approach called model compression, which results in smaller models that are more resource-efficient and performant, yet still relatively accurate.
Some techniques of model compression include:
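Quantization is one widely used compression technique: weights are stored at lower numeric precision so that each one takes less memory. The following is a minimal sketch of symmetric 8-bit quantization, assuming a single scale factor shared by a whole list of weights. The function names and sample weights are illustrative, and production toolkits use more elaborate schemes.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero list, where any scale would work.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers."""
    return [q * scale for q in quantized]


# Illustrative weights only; real models hold millions or billions of these.
weights = [0.5, -1.0, 0.25, 0.0, 0.8]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

for original, approx in zip(weights, restored):
    print(f"{original:+.4f} -> {approx:+.4f} (error {abs(original - approx):.5f})")
```

Each restored weight differs from the original by at most about half of the scale factor, which illustrates the trade-off: a smaller model at the cost of a small loss in precision.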
Nevertheless, despite some of these potential limitations, some SLMs, like Microsoft’s recently introduced 2.7 billion-parameter Phi-2, demonstrate state-of-the-art performance in mathematical reasoning, common sense, language understanding and logical reasoning that is remarkably comparable to, and in some cases exceeds, that of much heftier LLMs. According to Microsoft, the efficiency of the transformer-based Phi-2 makes it an ideal choice for researchers who want to improve safety, interpretability and ethical development of AI models.
Other SLMs of note include:
Because of their smaller size, and reduced computational and operational cost, businesses and institutions can easily fine-tune and tailor small language models to a specific use.
For instance, SLMs could be used as chatbots to offer timely customer service, or to summarize content or create calendar events for users. These smaller models could also be used to translate foreign languages in real time, generate programming code, or monitor and help schedule preventive maintenance for devices linked to the Internet of Things (IoT). Within automotive systems, SLMs can go a long way in offering real-time traffic updates for smarter road navigation, or in improving voice commands and hands-free calling.
Ultimately, the emergence of small language models signals a potential shift from expensive and resource-heavy LLMs to more streamlined and efficient language models, arguably making it easier for more businesses and organizations to adopt and tailor generative AI technology to their specific needs. As language models evolve to become more versatile and powerful, it seems that going small may be the best way to go.