Large language models have come in for criticism in recent weeks from those who argue they are not the right path to advancing artificial intelligence. A new report from The Economist, however, strongly contradicts that claim. In its deep look at foundation models, the story explains how making such AI models larger “by feeding them more data and increasing the number of parameters” only makes them better.
The Good Computer
The Economist story begins by looking at the scale of a new supercomputer being built by Graphcore, a British chip designer. Called the Good Computer, it is designed to carry out 10^19 calculations per second. In other words, the Good Computer will be 100 million times faster than a machine capable of doing 100 billion calculations per second.
In fact, when it debuts, the Good Computer will be ten times faster than Frontier, which topped the most recent “Top500” list of the world’s most powerful supercomputers. Named after Jack Good, who worked with Alan Turing as a codebreaker during the Second World War, the machine is designed to meet the compute demands of ever-larger AI models.
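As a rough sanity check of those figures (Frontier’s roughly 1.1 exaflops is taken from the Top500 list and is an assumption here, not a number quoted in the article), the ratios work out as follows:

```python
# Rough sanity check of the compute-scale claims (all figures approximate).
good_computer_ops = 1e19     # target: 10^19 calculations per second
baseline_ops = 100e9         # a machine doing 100 billion calculations per second
frontier_ops = 1.1e18        # Frontier's Top500 figure, ~1.1 exaflops (assumed)

print(good_computer_ops / baseline_ops)   # 1e8  -> 100 million times faster
print(good_computer_ops / frontier_ops)   # ~9   -> roughly ten times faster than Frontier
```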
The most advanced AI models today are 10,000 times larger than BERT, which had 110 million parameters when it appeared just four years ago. “Today’s most advanced AI programs are 10,000 times larger, with over a trillion parameters. The Good computer’s incredibly ambitious specifications are driven by the desire to run programs with something like 500trn parameters,” the Economist notes.
The way the AI world is advancing is further evidence that adding parameters to models has not yet run into diminishing returns. “The new models far outperformed older machine-learning models on tasks such as suggesting the next words in an email or naming things which are present in an image, as well as on more recondite ones like crafting poetry,” says the report.
Flexibility is another promising feature
The deep dive by The Economist on foundation models also looks at how flexible AI models have become. The earliest generations of AI systems were built for a single, specific purpose and could not be used for anything else. The newer models can be reassigned from one type of problem to another with relative ease.
This relative ease with which AI models can be fine-tuned is also what gives them the name “foundation models”.
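For readers curious what “reassigning” such a model looks like in practice, here is a minimal sketch using the open-source Hugging Face transformers library (the library, the checkpoint name, and the two-class task are illustrative assumptions, not details from the article):

```python
# Illustrative sketch: re-purposing a pre-trained foundation model for a new task.
# Requires: pip install transformers torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load a general-purpose pre-trained checkpoint (example choice).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Attach a small two-class classification head; only this new head starts untrained,
# while the rest of the network keeps what it learned during pre-training.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# From here, a short round of fine-tuning on task-specific examples is typically
# enough, rather than training a new model from scratch.
inputs = tokenizer("An example sentence to classify.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # torch.Size([1, 2])
```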
“AI models used to be very speculative and artisanal, but now they have become predictable to develop,” Jack Clark, a co-founder of Anthropic, an AI startup, explains to The Economist. “AI is moving into its industrial age.”
Oren Etzioni, who runs the Allen Institute for AI, a research outfit, says more than 80 per cent of AI research is now focused on foundation models. The report says Kevin Scott, chief technology officer of Microsoft, is also devoting 80 per cent of his time to the development of foundation models.
Foundation models form the bedrock of the AI experience delivered by the likes of Microsoft, Facebook parent Meta, Google parent Alphabet, and even Tesla. China, which is widely considered to be ahead of most countries in AI use and deployment, has reportedly made foundation models a national priority.
Foundation models have become practical mainly because of the major gains in high-performance computing over the past decade. With the introduction of graphics processing units by companies like NVIDIA, it has become possible to run enormous numbers of calculations in parallel without spending a lot of money.
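As a toy illustration of that parallelism (using PyTorch, a library the article does not mention), a single large matrix multiplication dispatches on the order of a hundred billion floating-point operations to the GPU at once:

```python
# Toy illustration of GPU parallelism (assumed PyTorch setup, not from the article).
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# One 4096x4096 matrix multiply is roughly 1.4e11 floating-point operations,
# executed largely in parallel on a GPU rather than one at a time.
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)
c = a @ b

print(device, c.shape)
```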
Fei-Fei Li, the co-director of Stanford University’s Institute for Human-Centered AI, also points to self-supervised learning as another major factor driving AI progress. Because models now learn from raw, unlabelled data rather than relying on pre-labelled data sets, they can be trained on far more material, and their inferences have become more accurate as a result.
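The sketch below shows the basic idea in miniature: training labels are manufactured from the raw text itself by hiding words and asking the model to predict them, so no human annotation is needed (the word-level tokenisation and single-mask setup here are simplified assumptions, not the exact recipe of any production model):

```python
import random

# Minimal sketch of the masked-word objective behind much self-supervised pre-training.
text = "foundation models learn from unlabelled text"
tokens = text.split()

# Hide a random word; the hidden word itself becomes the training label,
# so the "labels" come for free from the raw data.
i = random.randrange(len(tokens))
label = tokens[i]
masked = tokens.copy()
masked[i] = "[MASK]"

print("input :", " ".join(masked))
print("label :", label)
```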
Concentration of power and national interest
The Economist report is not a mere glorification of foundation models; it also looks at how the advance of AI has concentrated power within a small set of people, organisations, and countries.
“Some worry that the technology’s heedless spread will further concentrate economic and political power, up-end swathes of the economy in ways which require some redress even if they offer net benefits and embed unexamined biases ever deeper into the automated workings of society,” the report explains.
The concentration of power is already evident in the roles played by Google and Microsoft: both own the models as well as the cloud computing platforms on which those models are hosted. That combination makes centralisation another defining feature of the world of AI.
From China’s treatment of Wu Dao as a national champion to France offering free computing power to BigScience, a form of national interest is already at play. The bigger challenge amid all this is the harm these general-purpose technologies could cause, since they are essentially trained on data that carries the biases already present in the system. It is worth reading the article in full here.