10/18/2023

Headline, October 19 2023/ ''' DIET A.I. DIRE '''

WHEN IT COMES to ''large language models'' (LLMs) such as GPT - which powers ChatGPT, a popular chatbot made by OpenAI, an American research lab - the clue is in the name.

Modern AI systems are powered by vast artificial neural networks, bits of software modelled, very loosely, on biological brains. GPT-3, an LLM released in 2020, was a behemoth. It had 175 billion ''parameters'', as the simulated connections between those neurons are called.

It was trained by having thousands of GPUs (specialised chips that excel at AI work) crunch through hundreds of billions of words of text over the course of several weeks. All that is thought to have cost at least $4.6 million.

But the most consistent result from modern AI research is that, while big is good, bigger is better. Models have therefore been growing at a blistering pace. GPT-4, released in March, is thought to have around 1 trillion parameters - nearly six times as many as its predecessor.

Sam Altman, OpenAI's boss, put its development costs at more than $100 million. Similar trends exist across the industry. Epoch AI, a research firm, estimated in 2022 that the computing power needed to train a cutting-edge model was doubling every six to ten months.

This gigantism is becoming a problem. If Epoch AI's ten-month doubling figure is right, then training costs could exceed a billion dollars by 2026 - assuming, that is, that models do not run out of data first.
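As a back-of-the-envelope check on that claim, here is a minimal sketch in Python. The starting cost and the 36-month horizon are assumptions for illustration (the $100 million figure is Mr. Altman's GPT-4 estimate, cited above); only the doubling period comes from Epoch AI.

```python
# Rough projection of frontier-model training costs.
# Starting cost and time horizon are illustrative assumptions.
cost_2023 = 100e6        # assumed 2023 training cost in dollars
doubling_months = 10     # Epoch AI's slower doubling estimate
months = 36              # 2023 to 2026

projected = cost_2023 * 2 ** (months / doubling_months)
print(f"Projected training cost in 2026: ${projected / 1e9:.1f} billion")
# Prints roughly $1.2 billion, consistent with the article's
# "exceed a billion dollars by 2026".
```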

An analysis published in October 2022 forecast that the stock of high-quality text for training may well be exhausted around the same time. And even once the training is complete, actually using the resulting model can be expensive as well.

The bigger the model, the more it costs to run. Earlier this year Morgan Stanley, a bank, guessed that, were half of Google's searches to be handled by a current GPT-style program, it could cost the firm an additional $6 billion a year. As models get bigger, that number will probably rise.

Many in the field therefore think the ''bigger is better'' approach is running out of road. If AI models are to carry on improving - never mind fulfilling the AI-related dreams currently sweeping the tech industry - their creators will need to work out how to get more performance out of fewer resources.

As Mr. Altman put it in April, ''I think we are at the end of an era.''

''' Quantitative Tightening ''' : Instead, researchers are beginning to turn their attention to making their models more efficient, rather than simply bigger.

One approach is to make trade-offs, cutting the number of parameters but training models with more data.

In 2022, researchers at DeepMind, a division of Google, trained Chinchilla, an LLM with 70 billion parameters, on a corpus of 1.4 trillion words. The model outperforms GPT-3, which has 175 billion parameters trained on 300 billion words.

Feeding a smaller LLM more data means it takes longer to train. But the result is a smaller model that is faster and cheaper to use.
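The arithmetic behind the trade-off can be made concrete with the figures above. A minimal sketch, using only the parameter and token counts the article reports:

```python
def tokens_per_parameter(params: float, tokens: float) -> float:
    """How many training tokens each parameter 'sees' during training."""
    return tokens / params

# GPT-3: 175 billion parameters trained on roughly 300 billion words.
print(f"GPT-3     : {tokens_per_parameter(175e9, 300e9):.1f} tokens per parameter")

# Chinchilla: 70 billion parameters trained on 1.4 trillion words.
print(f"Chinchilla: {tokens_per_parameter(70e9, 1.4e12):.1f} tokens per parameter")

# GPT-3 comes out near 1.7 and Chinchilla near 20: the smaller model spends
# its compute budget on data rather than parameters, which is how it can
# outperform a model two and a half times its size.
```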

Another option is to make the maths fuzzier. Tracking fewer decimal places for each number in the model - rounding them off, in other words - can cut hardware requirements drastically.

In March researchers at the Institute of Science and Technology Austria showed that rounding could squash the amount of memory consumed by a model similar to GPT-3, allowing the model to run on one high-end GPU instead of five, with only ''negligible accuracy degradation''.
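The memory arithmetic behind such rounding is easy to demonstrate. Below is a toy sketch of symmetric 8-bit quantisation - a generic illustration of the idea, not the Austrian team's actual method: storing each weight as an 8-bit integer plus a shared scale factor takes a quarter of the space of 32-bit floats.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Round float32 weights onto 256 integer levels plus one scale factor."""
    scale = np.abs(w).max() / 127.0           # one scale per tensor (simplistic)
    q = np.round(w / scale).astype(np.int8)   # the 'rounding off'
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(1024, 1024).astype(np.float32)
q, s = quantize_int8(w)
error = np.abs(w - dequantize(q, s)).mean()
print(f"memory: {w.nbytes / 1e6:.0f} MB -> {q.nbytes / 1e6:.0f} MB, "
      f"mean absolute error: {error:.4f}")
```

Production systems use finer-grained scales and cleverer rounding schemes, but the four-fold memory saving is the essence of how a five-GPU model can be made to fit on one.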

Some users fine-tune general-purpose LLMs to focus on a specific task, such as generating legal documents or detecting fake news. That is not as cumbersome as training an LLM in the first place, but it can still be costly and slow.

Fine-tuning LLaMA, an open-source model with 65 billion parameters that was built by Meta, Facebook's corporate parent, takes multiple GPUs anywhere from several hours to a few days.

Researchers at the University of Washington have invented a more efficient method that allowed them to create a new model, Guanaco, from LLaMA on a single GPU in a day, without sacrificing much, if any, performance.

Part of the trick was to use a rounding technique similar to the Austrians'. But they also used a technique called ''low-rank adaptation'', which involves freezing a model's existing parameters, then adding a new, smaller set of parameters in between.

The fine-tuning is done by altering only those new variables.
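A minimal sketch of the low-rank idea, assuming the standard formulation: the frozen weight matrix is left untouched, and a small trainable low-rank pair is added alongside it. The dimensions and rank below are illustrative, not Guanaco's actual settings.

```python
import numpy as np

d, k, r = 4096, 4096, 8                        # layer dimensions; r is the low rank
W = np.random.randn(d, k).astype(np.float32)   # pretrained weights: frozen
A = np.random.randn(r, k).astype(np.float32) * 0.01   # trainable
B = np.zeros((d, r), dtype=np.float32)         # trainable; zero-initialised so the
                                               # model is unchanged before fine-tuning

def forward(x: np.ndarray) -> np.ndarray:
    """Frozen path plus the small trainable low-rank path."""
    return x @ W.T + x @ (B @ A).T

x = np.random.randn(2, k).astype(np.float32)   # a dummy batch of inputs
y = forward(x)

trainable = A.size + B.size
print(f"trainable parameters: {trainable:,} of {W.size:,} "
      f"({100 * trainable / W.size:.2f}%)")
# Only about 0.4% of the layer's weights need gradients, which helps explain
# why a single GPU can suffice for fine-tuning.
```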

The Honour and Serving of the Latest Global Operational Research on AI, its Developments and Future, continues. The World Students Society thanks The Economist.

With respectful dedication to Global Founder Framers of The World Students Society - the exclusive and eternal ownership of every student in the world: wssciw.blogspot.com and Twitter X !E-WOW! - The Ecosystem 2011.

Good Night and God Bless

SAM Daily Times - the Voice of the Voiceless
