The year of agents
Last year was widely anticipated to be the year of AI, and it indeed turned out to be the year of reasoning AI. I believe this year will be the year of agents, but before we get there, let’s review what happened last year that led us to where we are today.
Data 💿
The amount of data used to train models, at least open ones, increased dramatically from 2 trillion tokens in Llama 2 to 15 trillion in Llama 3. That, in turn, narrowed the performance gap between open and closed-source models, excluding the O series. We also seem to have hit a ceiling on real-world data, which is unsurprising given that the total amount of high-quality training data has been estimated at about 100 trillion tokens. Note that this excludes synthetic data as well as non-text data, both of which will continue to drive progress forward.
Cost 💰
The cost of AI decreased significantly during the year, dropping from $10 per million tokens at the start to $2.50 by the end, and half that again when using caching. Performance at ever-smaller model sizes became a hotly contested area, with distillation from larger models becoming the norm. Models such as Phi-4 (14B) achieved near-GPT-4 performance, while the latest Llama 70B came close to matching Llama 405B, Meta's largest model to date.
Speed 🏎️
AI is also running faster thanks to innovations from Groq and Cerebras, with speeds exceeding 1,000 tokens per second — at least an order of magnitude faster than the top generation speeds at the start of the year. As reasoning and agentic AI take off, generation speed will become increasingly important because both require more compute to tackle complex tasks.
AI Integration 🖱️
Many companies experimented with AI features last year, with varying degrees of success. These ranged from summarizing unread messages and notifications on phones and communication platforms, to assisting with content creation and refinement, to helping users locate relevant information and answers more quickly. Most of these attempts turned out to be less transformational than they first seemed; identifying truly transformative AI use cases is proving harder than anticipated.
Reasoning 🤔
The biggest development last year, in my opinion, was reasoning AI, with models such as the O series from OpenAI. The impact of these models has been somewhat understated — there was no “ChatGPT moment,” as the chat interface may not be the ideal medium for them to demonstrate their full potential. One way to grasp the significance of this advance: AI went from performing like an average programmer to ranking as the 175th best programmer in the world. Reasoning is, in many ways, a necessary but not sufficient step toward more useful AI that can automate human tasks. Now that reasoning has improved, we can move more reliably to the next phase of AI: agents.
The Year Ahead
At a minimum, we can expect the same incremental improvements as last year, including cheaper and faster models, along with broader AI integration into everyday apps. On top of that, I believe this will be the year of agents. Initially, small tasks will be delegated to AI, such as checking you in for a flight or emailing a colleague the document they requested, simply by prompting a model. At the same time, companies will likely see significant innovation and disruption as they experiment further with agentic workflows, aiming to achieve more with fewer resources.
This should be an exciting year — not that any year is particularly boring in AI.
The year of agents was originally published in MantisNLP on Medium.