Is Artificial Intelligence Affected by the Law of Diminishing Marginal Returns?

2024-06-19

Large language models (LLMs) have been at the forefront of artificial intelligence advancements, relying on vast amounts of data and computational power. However, research suggests that the commonly held belief in the AI industry that "more data and computational power equals more progress" may be misleading.


While companies have been actively collecting data to train LLMs, the law of diminishing returns suggests that pursuing model gains through scale alone may become economically infeasible.





Matei Zaharia, Chief Technology Officer at Databricks, expressed a similar view in a recent interview, stating, "Whenever your training costs double, the quality only improves by about 1% or so."
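To see why this relationship implies diminishing returns, consider a toy model (an illustration of the quoted figure, not Databricks' actual data) in which quality rises by roughly one point each time training cost doubles, i.e. quality grows with the logarithm of cost:

```python
# Toy model of the "1% per cost doubling" observation quoted above.
# All constants are illustrative assumptions, not measured values.

GAIN_PER_DOUBLING = 1.0   # assumed quality points gained per 2x training cost
BASE_COST = 1.0           # arbitrary cost unit for the baseline training run
BASE_QUALITY = 80.0       # assumed baseline quality score

for doublings in range(11):
    cost = BASE_COST * 2 ** doublings
    quality = BASE_QUALITY + GAIN_PER_DOUBLING * doublings
    print(f"cost = {cost:6.0f}x   quality = {quality:.1f}")
```

Under this assumption, a 1,024-fold increase in training cost buys only about ten points of quality: the textbook shape of diminishing marginal returns.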


"Apart from the law of scale itself, another fact is that we've already put all the data on the internet into these models," he added.


Zaharia further explained, "Even if you repeat the data or add more, maybe you can tweak it and create variants of it, but it's not clear that this provides much new information."


He acknowledged that while researchers will continue to work on making the process more efficient and scaling it to higher levels, "we may be hitting the limits of what the average consumer can do."


Different perspectives on scaling limits


On the other hand, Kevin Scott, Chief Technology Officer at Microsoft, believes that as computational scale increases, AI models can continue to become more powerful. He stated, "We are nowhere near the point of diminishing returns in terms of how much more powerful we can make AI models by increasing computational scale."


However, this viewpoint is not universally accepted.


Matei Zaharia, Gary Marcus, and Yann LeCun all express skepticism about the sustainability of this approach. They question whether existing infrastructure can scale fast enough to support increasingly large AI models.


This concern is not just theoretical but based on real-world evidence, such as the slowdown in revenue growth for data center physical infrastructure (DCPI) in the first quarter of 2024.


This slowdown is attributed to a design shift toward accelerated computing infrastructure for AI workloads, deployments of which take longer to complete.


Energy constraints and data center challenges


In contrast, Mark Zuckerberg, CEO of Meta, argues that energy constraints are the most significant factor limiting AI growth, given how much electricity data centers consume.


It is estimated that by 2030, data centers will consume 848 terawatt-hours (TWh) of electricity, nearly double the current 460 TWh. To put this number into perspective, India, with a population of over 1 billion, consumed a total of 1,443 TWh of electricity in 2021.
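The arithmetic behind these comparisons is easy to verify with the figures as quoted:

```python
# Sanity-check the data center energy figures quoted above (all in TWh).
current_dc = 460          # estimated data center consumption today
projected_dc_2030 = 848   # projected data center consumption by 2030
india_2021 = 1443         # India's total electricity consumption in 2021

print(f"2030 vs. today: {projected_dc_2030 / current_dc:.2f}x")              # ~1.84x, "nearly double"
print(f"Share of India's 2021 total: {projected_dc_2030 / india_2021:.0%}")  # ~59%
```

In other words, by 2030 data centers alone are projected to draw well over half of what all of India consumed in 2021.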


Zuckerberg also discusses the challenges of planning around exponential growth in AI, asking, "How long does it last when you're facing an exponential curve?"


He believes that the current exponential growth in AI is likely to continue, making it worthwhile for companies to invest billions or even hundreds of billions of dollars in building the necessary infrastructure.


However, he also acknowledges that no one in the industry can definitively say how long this growth rate will be sustained.


Potential solutions and future directions


Despite these challenges, there are still opportunities to sustain the current wave of generative AI. Matei Zaharia believes there is untapped potential for AI applications in specific domains.


He emphasizes that "most enterprise use cases are building something multi-step," which he refers to as "compound AI systems." Building these systems is a complex task, "and there's a lot of research to be done, like how best to design them."


Similarly, Yann LeCun, Chief AI Scientist at Meta, points to "objective-driven AI" architectures as scaling autoregressive LLMs yields diminishing returns.


He states, "As I've said repeatedly, for the next leap in capability, there will be a new architecture."


Following a similar line of thinking, Databricks is focused on helping people achieve the best quality in their domain-specific AI applications.


This will be achieved by building compound AI systems that combine multiple components: invoking different models, retrieving relevant data, calling external APIs and databases, and breaking problems down into smaller steps.
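As a rough illustration of what such a system looks like in practice, here is a minimal sketch of a compound AI pipeline. Every function below is a hypothetical stand-in, not a real Databricks API; a production system would wire in actual model endpoints, retrievers, and external services:

```python
# Minimal sketch of a compound AI system: decompose a problem into steps,
# retrieve supporting data for each step, invoke a model, combine the results.
# All components are hypothetical stubs, not a real Databricks API.

def decompose(question: str) -> list[str]:
    """Break the problem into smaller sub-questions (stub)."""
    return [f"{question} (step {i})" for i in (1, 2)]

def retrieve(sub_question: str) -> str:
    """Fetch relevant data from a vector store or database (stub)."""
    return f"context for: {sub_question}"

def call_model(prompt: str, context: str) -> str:
    """Invoke an LLM; different steps could route to different models (stub)."""
    return f"answer to {prompt!r} given {context!r}"

def compound_answer(question: str) -> str:
    """Run the multi-step pipeline and combine the partial answers."""
    partials = []
    for sub in decompose(question):
        context = retrieve(sub)      # ground each step in external data
        partials.append(call_model(sub, context))
    return " | ".join(partials)      # a real system might synthesize these

if __name__ == "__main__":
    print(compound_answer("Summarize last quarter's sales anomalies"))
```

The point of the structure is that quality gains come from better system design (routing, retrieval, decomposition) rather than from a single ever-larger model.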


At the same time, Databricks continues to invest in open-source models.


As the AI industry grapples with the law of diminishing returns, collaboration between data center operators, utility companies, and policymakers is crucial to ensuring a reliable and sustainable power supply while meeting the growing demands of AI.


The future of AI progress lies in finding innovative solutions and architectures that can overcome the limitations of current approaches.