4 bold AI predictions for 2025


This article is part of VentureBeat’s special issue, “AI at Scale: From Vision to Feasibility.” Read more from this special issue here.

As 2024 draws to a close, we can look back on a year of impressive and innovative advances in artificial intelligence. At the current pace, predicting what kind of surprises 2025 will hold for AI is virtually impossible. But several trends paint a compelling picture of what enterprises can expect in the coming year and how they can prepare to make the most of it.

Plummeting costs of inference

Over the past year, the cost of using frontier models has steadily decreased. The price per million tokens of OpenAI’s highest-performing large language model (LLM) has dropped by a factor of more than 200 in the past two years.
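To make that drop concrete, here is a rough back-of-the-envelope sketch of what falling per-token prices mean for a fixed workload. All prices and volumes below are illustrative assumptions, not published figures from any provider.

```python
# Rough illustration of how falling per-token prices change monthly inference
# spend. All prices and volumes here are illustrative assumptions, not
# published figures from any provider.

def monthly_cost(requests_per_day: int, tokens_per_request: int,
                 price_per_million_tokens: float) -> float:
    """Estimate monthly spend in dollars at a given per-token price."""
    tokens_per_month = requests_per_day * tokens_per_request * 30
    return tokens_per_month / 1_000_000 * price_per_million_tokens

# A hypothetical workload: 50,000 requests per day, ~1,500 tokens each.
for price in (36.0, 4.0, 0.15):  # assumed dollars per million tokens
    print(f"${price:>6.2f}/M tokens -> ${monthly_cost(50_000, 1_500, price):,.0f}/month")
```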

A key factor driving down the price of inference is increasing competition. For many enterprise applications, most frontier models will be adequate, making it easy to switch from one to another, shifting competition toward price. Improvements in accelerator chips and specialized inference hardware are also making it possible for AI labs to provide their models at lower costs.

To take advantage of this trend, companies should start experimenting with more advanced LLMs and prototyping applications around them, even if costs are currently high. The continued reduction in model prices means that many of these applications will soon be scalable. At the same time, model capabilities continue to improve, meaning you can do a lot more with the same budget as last year.

The rise of large reasoning models

The launch of OpenAI o1 has unleashed a new wave of innovation in the LLM space. The trend of letting models “think” for longer and review their answers lets them solve reasoning problems that were impossible with a single inference call. Although OpenAI has not published the details of o1, its impressive capabilities have sparked a new race in the AI space. There are now several open-source models that replicate o1’s reasoning capabilities and extend the paradigm to new areas, such as answering open-ended questions.
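The pattern behind this “think longer, then review” behavior can be pictured as a simple generate-critique-revise loop. The sketch below is a minimal illustration of that pattern, not a reconstruction of o1’s unpublished method; `complete` is a hypothetical wrapper around whatever LLM API you use.

```python
# Minimal generate-critique-revise loop: the basic pattern behind letting a
# model "think" longer and review its own answers. An illustration of the
# idea, not a reconstruction of o1's unpublished method.

def complete(prompt: str) -> str:
    # Hypothetical wrapper: plug in your LLM provider's completion call here.
    raise NotImplementedError

def solve_with_review(question: str, max_rounds: int = 3) -> str:
    answer = complete(f"Think step by step, then answer:\n{question}")
    for _ in range(max_rounds):
        critique = complete(
            f"Question: {question}\nProposed answer: {answer}\n"
            "List any errors in the reasoning, or reply OK if it is correct."
        )
        if critique.strip() == "OK":
            break  # the model found nothing to fix
        answer = complete(
            f"Question: {question}\nPrevious answer: {answer}\n"
            f"Critique: {critique}\nWrite a corrected answer."
        )
    return answer
```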

Advances in o1-like models, sometimes referred to as large reasoning models (LRMs), may have two important implications for the future. First, given the immense number of tokens that LRMs must generate to produce their responses, hardware companies will have more incentive to create specialized AI accelerators with higher token throughput.

Second, LRMs can help address one of the key bottlenecks for the next generation of language models: high-quality training data. There are already reports that OpenAI is using o1 to generate training examples for its next generation of models. We can also expect LRMs to help train a new generation of small, specialized models on synthetic data tailored to very specific tasks.
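The reports don’t describe OpenAI’s pipeline, but the general recipe of distilling a strong reasoning model into training data can be sketched as below. It reuses the same hypothetical `complete` wrapper as the earlier sketch; the verification step is an assumption, and in practice it is the part that determines data quality.

```python
# Sketch of using a strong reasoning model to produce synthetic training pairs
# for a small specialized model. A generic recipe, not OpenAI's pipeline.

def complete(prompt: str) -> str:
    # Hypothetical wrapper around a reasoning model's API.
    raise NotImplementedError

def make_examples(task_description: str, n: int) -> list[dict]:
    examples = []
    for _ in range(n):
        problem = complete(f"Write one new, self-contained problem for: {task_description}")
        solution = complete(f"Solve step by step, then give a final answer:\n{problem}")
        verdict = complete(f"Problem: {problem}\nSolution: {solution}\n"
                           "Reply VALID if the solution is correct, otherwise INVALID.")
        if verdict.strip() == "VALID":  # keep only examples that pass verification
            examples.append({"prompt": problem, "completion": solution})
    return examples  # ready to serialize as JSONL for a standard fine-tuning job
```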

To take advantage of these advances, companies should dedicate time and budget to experimenting with potential applications of frontier LRMs. They should always test the limits of frontier models and think about what types of applications would be possible if the next generation of models overcome those limitations. Combined with the continued reduction of inference costs, LRMs can unlock many new applications over the next year.

Alternatives to transformers are gaining momentum

The memory and compute bottlenecks of transformers, the primary deep learning architecture used in LLMs, have given rise to a field of alternative models with linear complexity. The most popular of these architectures, the state space model (SSM), has seen many advances over the past year. Other promising models include liquid neural networks (LNNs), which use new mathematical formulations to do much more with far fewer artificial neurons and compute cycles.
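To see where the linear complexity comes from, here is a toy, fixed-parameter state space recurrence in NumPy. Real SSMs such as Mamba add input-dependent parameters and hardware-aware parallel scans; this stripped-down loop only shows that each token costs a constant amount of work, so total cost grows linearly with sequence length.

```python
# Toy diagonal state space model: h_t = A * h_{t-1} + B x_t, y_t = C h_t.
# One constant-cost update per token gives O(T) total work, unlike
# self-attention's O(T^2). Real SSMs (e.g., Mamba) add input-dependent
# parameters and fast parallel scans that this sketch omits.
import numpy as np

def ssm_scan(x, A, B, C):
    """x: (T, d_in); A: (d_state,) diagonal; B: (d_state, d_in); C: (d_out, d_state)."""
    h = np.zeros(A.shape[0])
    ys = np.empty((x.shape[0], C.shape[0]))
    for t in range(x.shape[0]):
        h = A * h + B @ x[t]  # constant-cost state update
        ys[t] = C @ h         # constant-cost readout
    return ys

rng = np.random.default_rng(0)
T, d_in, d_state, d_out = 1024, 8, 16, 8
y = ssm_scan(rng.normal(size=(T, d_in)),
             A=np.full(d_state, 0.9),
             B=0.1 * rng.normal(size=(d_state, d_in)),
             C=0.1 * rng.normal(size=(d_out, d_state)))
print(y.shape)  # (1024, 8)
```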

Last year, AI researchers and labs released pure SSM models, as well as hybrid models that combine the strengths of transformers and linear models. Although these models have yet to perform at the level of cutting-edge transformer-based models, they are quickly catching up and are already orders of magnitude faster and more efficient. If progress in the field continues, many simpler LLM applications can be offloaded to these models and run on edge devices or local servers, where companies can use custom data without sending it to third parties.

Changes in scaling laws

LLM scaling laws are constantly evolving. The release of GPT-3 in 2020 demonstrated that scaling model size would continue to deliver impressive results and allow models to perform tasks they were not explicitly trained for. In 2022, DeepMind published the Chinchilla paper, which marked a new direction in data scaling laws. Chinchilla showed that training a model on a massive dataset containing several times more tokens than the model has parameters can keep delivering improvements. This development allowed smaller models to compete with frontier models that have hundreds of billions of parameters.
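Chinchilla’s compute-optimal recipe is often summarized with two rules of thumb: train on roughly 20 tokens per parameter, and expect training compute of about 6ND FLOPs for N parameters and D tokens. Taking those approximations at face value (they are simplifications of the paper’s fitted laws, not exact results), the arithmetic looks like this:

```python
# Back-of-the-envelope Chinchilla arithmetic using two widely cited
# approximations: compute-optimal data D ≈ 20 * N tokens, and training
# compute C ≈ 6 * N * D FLOPs. Both are rules of thumb, not exact laws.
for n_billion in (1, 7, 70):
    n = n_billion * 1e9          # parameters
    d = 20 * n                   # compute-optimal training tokens
    flops = 6 * n * d            # approximate training FLOPs
    print(f"{n_billion:>3}B params: ~{d / 1e9:,.0f}B tokens, ~{flops:.1e} FLOPs")
```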

Today, there are fears that both scaling laws are reaching their limits. Reports indicate that frontier labs are experiencing diminishing returns on training larger models. At the same time, training datasets have already grown to tens of trillions of tokens, and obtaining quality new data is becoming increasingly difficult and expensive.

Meanwhile, LRMs promise a new vector: inference-time scaling. Where scaling model and dataset size falls short, we may be able to break new ground by letting models run more inference cycles and correct their own errors.
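One minimal, concrete form of inference-time scaling is self-consistency: sample several independent answers to the same question and take a majority vote over the final answers, trading extra inference calls for accuracy. A sketch, again assuming a hypothetical `complete` wrapper:

```python
# Minimal inference-time scaling via self-consistency: spend more inference
# calls per question and majority-vote the final answers. Real systems also
# tune sampling temperature and parse answers more robustly than this does.
from collections import Counter

def complete(prompt: str) -> str:
    # Hypothetical wrapper around an LLM API with sampling enabled.
    raise NotImplementedError

def self_consistent_answer(question: str, n_samples: int = 8) -> str:
    finals = []
    for _ in range(n_samples):
        reply = complete(f"Think step by step, then end with 'ANSWER: <x>'.\n{question}")
        if "ANSWER:" in reply:
            finals.append(reply.rsplit("ANSWER:", 1)[1].strip())
    if not finals:
        raise ValueError("No parsable answers were sampled.")
    return Counter(finals).most_common(1)[0][0]  # most frequent final answer
```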

As we approach the year 2025, the AI landscape continues to evolve in unexpected ways, with new architectures, reasoning capabilities, and economic models reshaping what is possible. For companies willing to experiment and adapt, these trends represent not just a technological advance, but a fundamental shift in how we can leverage AI to solve real-world problems.
