
Small LLMs are the Future of Agentic AI
Image by Editor | ChatGPT
Introduction
This article provides a summary of, and commentary on, the recent position paper Small Language Models are the Future of Agentic AI. The paper lays out several insightful postulates about the potential of small language models (SLMs) to drive innovation in agentic AI systems, compared to their larger counterparts, the LLMs, which are currently the predominant component fueling modern agentic AI solutions in organizations.
A couple of quick definitions before we jump into the paper:
- Agentic AI systems are autonomous systems capable of reasoning, planning, making decisions, and acting in complex and dynamic environments. Recently, this paradigm, which has been investigated for decades, has gained renewed attention due to its significant potential and impact when used alongside state-of-the-art language models and other cutting-edge AI-driven applications. You can find a list of 10 Agentic AI Key Terms Explained in this article.
- Language models are natural language processing (NLP) solutions trained on large datasets of text to perform a variety of language understanding and language generation tasks, including text generation and completion, question-answering, text classification, summarization, translation, and more.
Throughout this article, we will distinguish between small language models (SLMs), those “small” enough to run efficiently on end-consumer hardware, and large language models (LLMs), which are much larger and usually require cloud infrastructure. At times, we will simply use “language models” to refer to both from a more general perspective.
Authors’ Position
The article opens by highlighting the increasing relevance of agentic AI systems and their significant level of adoption by organizations today, usually in a symbiotic relationship with language models. State-of-the-art solutions, however, traditionally rely on LLMs due to their deep, general reasoning capabilities and their broad knowledge, gained from being trained on vast datasets.
This “status quo” and assumption that LLMs are the universal go-to approach for integration into agentic AI systems is precisely what the authors challenge through their position: they suggest shifting some attention to SLMs that, despite their smaller size compared to LLMs, could be a better approach for agentic AI in terms of efficiency, cost-effectiveness, and system adaptability.
Some key views underpinning the claim that SLMs, rather than LLMs, are “the future of agentic AI” are summarized below:
- SLMs are sufficiently powerful to undertake most current agentic tasks
- SLMs are better suited for modular agentic AI architectures
- SLMs’ deployment and maintenance are more feasible
The paper further elaborates on these views with the following arguments:
SLMs’ Aptitude for Agentic Tasks
Several arguments are provided to support this view. One is empirical: SLM performance is rapidly improving, with models like Phi-2, Phi-3, and SmolLM2 reporting promising results. Moreover, since AI agents are typically instructed to excel at a limited range of language model capabilities, properly fine-tuned SLMs should be appropriate for most domain-specific applications, with the added benefits of efficiency and flexibility.
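This argument suggests a simple design pattern: route the narrow, well-understood subtasks to a fine-tuned SLM and reserve the LLM for open-ended requests. Below is a minimal sketch of such a router; the model names, parameter counts, and task list are hypothetical placeholders, not something prescribed by the paper.

```python
# Sketch of a capability-based model router for an agentic pipeline.
# All model identifiers, sizes, and the routing heuristic are illustrative
# assumptions, not taken from the paper.

from dataclasses import dataclass


@dataclass
class ModelEndpoint:
    name: str
    params_b: float  # parameter count in billions (illustrative)


SLM = ModelEndpoint("slm-domain-tuned", 3.0)
LLM = ModelEndpoint("llm-generalist", 70.0)

# Narrow, repetitive agentic subtasks the SLM is assumed to be fine-tuned for.
SLM_TASKS = {"summarize_ticket", "extract_fields", "classify_intent", "format_tool_call"}


def route(task: str) -> ModelEndpoint:
    """Send known narrow subtasks to the SLM; fall back to the LLM otherwise."""
    return SLM if task in SLM_TASKS else LLM


print(route("extract_fields").name)       # slm-domain-tuned
print(route("open_ended_research").name)  # llm-generalist
```

In a real system, the routing decision itself could be learned or rule-based, but the point stands: most agent calls are routine, so the cheap specialized model handles the bulk of the traffic.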
SLMs’ Suitability for Agentic AI Architectures
The small size and reduced pre-training and fine-tuning costs of SLMs make them easier to accommodate in typically modular agentic AI architectures, and easier to adapt to ever-evolving user needs, behaviors, and requirements. While LLMs will generally have a broader understanding of language and the world as a whole, an SLM well fine-tuned on a selected set of domain-specific prompts can be sufficient for specialized systems and settings. Moreover, because AI agents frequently interact with code, conformance to specific formatting requirements is a concern for ensuring consistency; SLMs trained to narrower formatting specifications would therefore be preferable.
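The formatting-conformance point can be made concrete: an agent typically expects a model's tool call to parse as strict JSON with a known shape, and free-form chatty output breaks the pipeline. A small stdlib-only sketch of such a check follows; the expected keys (`tool`, `arguments`) are an invented example schema, not one from the paper.

```python
import json

# Hypothetical schema an agent might require for tool calls;
# the keys are illustrative, not from the paper.
REQUIRED_KEYS = {"tool", "arguments"}


def is_valid_tool_call(raw: str) -> bool:
    """Return True only if `raw` is strict JSON with the expected shape."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(obj, dict)
        and REQUIRED_KEYS <= obj.keys()
        and isinstance(obj["tool"], str)
        and isinstance(obj["arguments"], dict)
    )


# A model trained to a narrow formatting spec passes...
print(is_valid_tool_call('{"tool": "search", "arguments": {"query": "SLM"}}'))  # True
# ...while chatty free-form output fails.
print(is_valid_tool_call('Sure! I will call search("SLM") now.'))  # False
```

An SLM fine-tuned exclusively on outputs that satisfy such a validator will fail this check far less often than a general-purpose model prompted to "please respond in JSON."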
The heterogeneity inherent in agentic systems and their interactions is another reason why SLMs are argued to be more suitable for agentic architectures: these interactions naturally generate usage data that can be collected and used to fine-tune specialized SLMs over time.
SLMs’ Economic Feasibility
The flexibility of SLMs translates directly into a greater potential for democratization, largely thanks to the reduced operational costs mentioned above. In more economic terms, the paper compares SLMs against LLMs with respect to inference efficiency, fine-tuning agility, edge deployment, and parameter usage: aspects in which SLMs are considered superior.
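The inference-efficiency gap can be sketched with back-of-the-envelope arithmetic, using the common approximation that a dense transformer's forward pass costs roughly 2N FLOPs per generated token for an N-parameter model. The parameter counts below are illustrative placeholders, not figures from the paper.

```python
# Rough per-token inference compute comparison between an SLM and an LLM.
# Uses the standard approximation of ~2 * N FLOPs per token for a dense
# N-parameter model; the parameter counts are illustrative assumptions.

def flops_per_token(params: float) -> float:
    """Approximate forward-pass FLOPs to generate one token."""
    return 2.0 * params


slm_params = 3e9   # e.g. a 3B-parameter SLM (illustrative)
llm_params = 70e9  # e.g. a 70B-parameter LLM (illustrative)

ratio = flops_per_token(llm_params) / flops_per_token(slm_params)
print(f"The LLM needs ~{ratio:.1f}x more compute per generated token")
```

This ignores memory bandwidth, batching, and serving overheads, all of which tend to widen the gap further in favor of small models on commodity or edge hardware.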
Alternative Views, Barriers, and Discussion
The authors not only present their own position; they also outline and address counterarguments grounded in the existing literature. These include the following claims:
- LLMs will generally outperform SLMs because of scaling laws, which may not hold for narrow subtasks or after task-specific fine-tuning.
- Centralized LLM infrastructure is cheaper at scale, which can be countered by falling costs and by modular SLM deployments that avoid bottlenecks.
- Industry inertia favors LLMs over SLMs, which, while true, does not outweigh other SLM advantages such as adaptability and economic efficiency.
The main barrier to adopting SLMs as the universal go-to approach alongside agentic systems is the well-established dominance of LLMs from many perspectives, not just technical ones, accompanied by substantial investments made in LLM-centric pipelines. Clearly demonstrating the discussed advantages of SLMs is paramount to motivating and facilitating a transition from LLMs to SLMs in agentic solutions.
To close this analysis and summary of the paper, here are some of my own perspectives on what we have outlined and discussed. While the claims made throughout the paper are well-founded and convincing, paradigm shifts in our rapidly changing field are often subject to friction. Accordingly, I consider the following to be three major barriers to adopting SLMs as the main approach underlying agentic AI systems:
- The huge investments made in LLM infrastructure (already highlighted by the authors) make it difficult to change the status quo, at least in the short term, due to the strong economic inertia behind LLM-centric pipelines.
- We may have to rethink evaluation benchmarks to adapt them for SLM-based frameworks, as current benchmarks are designed to prioritize general performance aspects rather than narrow, specialized performance in agentic systems.
- Last, and perhaps simplest, there is still work to be done in raising public awareness of the advances made by SLMs and their potential. The “LLM” buzzword is deeply rooted in society, and the LLM-first mindset will take time and effort to evolve before decision-makers and practitioners jointly view SLMs as a viable replacement with their own advantages, especially regarding their integration into real-world agentic AI solutions.
On a final, personal note, if major cloud infrastructure providers were to embrace and more aggressively promote the authors’ view on the potential of SLMs to lead agentic AI development, perhaps a significant portion of this journey could be covered in the blink of an eye.