GPT-5 is training secretly! DeepMind Lianchuang broke the news that this model is 100 times larger than GPT-4

Hayo News
September 4th, 2023
GPT-5 is still being trained in secret! DeepMind's co-founder revealed in a recent interview that within the next three years, Inflection's models will be 1,000 times larger than the current GPT-4.

Recently, DeepMind co-founder Mustafa Suleyman, now CEO of Inflection AI, dropped a bombshell in an interview:

OpenAI is secretly training GPT-5.

I think we would all be better off just saying it. That's why we disclose the total amount of computing we have.

Within the next 18 months, Inflection AI will train models 100 times larger than current leading-edge models. Over the next 3 years, Inflection's model will be 1000 times larger than it is today.

In fact, Sam Altman has previously denied that OpenAI is training GPT-5.

In response, some netizens suggested that OpenAI may simply have given the model a new name, which lets it say it is not training "GPT-5."

This echoes the launch of Code Interpreter, when many people felt its capabilities had gone beyond the GPT-4 model and deserved to be called GPT-4.5.

During the interview, Suleyman also revealed a good deal of inside information about his time at DeepMind and Inflection AI, including Google's acquisition of DeepMind and the internal tug-of-war that followed, which partly explains why DeepMind, despite starting much earlier than OpenAI, ended up arriving late, as the saying goes, "getting up early but reaching the market late."

He also believes that open-sourcing models may increase the instability and harm AI brings to humans.

The biggest threat to AI security is not the big language model, but the autonomous agents that may appear in the future.

Full interview

When asked whether AI might become an intelligent agent capable of autonomous evolution in the future, Suleyman said:

In the short term, it is unlikely that we will see an agent that can operate autonomously: setting its own goals, recognizing new information and new reward signals in its environment, learning to use them as self-supervision, and updating its own weights over time.

But this kind of autonomously evolving AI is something no one should ignore, because if an AI system really does display this ability, the potential risks would be very high.

At least as far as he knows, neither Inflection AI nor DeepMind is going in this direction.

Inflection AI is not an AGI company. Its goal is to build a highly useful personal assistant, one that provides deeply customized AI services on the premise that it has full access to the user's personal information.

Will the Model Training Arms Race Exacerbate the Risks of AI?

His company, Inflection AI, is building one of the world's largest supercomputers, and he thinks that within the next 18 months they may be able to run a language-model training run 10 or 100 times larger than the one used to train GPT-4.

When asked whether this arms race-style training model might increase the risk of AI, he replied:

A 100x training run will still produce what is essentially a better GPT-4 chatbot, and although the model will be more impressive, it will not be dangerous, because it lacks autonomy and cannot act on the physical world, the fundamental elements that would make a model dangerous in itself.

Merely producing a very good, better GPT-4 is not dangerous; to make it dangerous, other capabilities would have to be added, such as letting the model iterate on itself and set its own goals, as mentioned above.

That's about five, ten, fifteen, twenty years from now.

Suleyman believes that Sam Altman may not be telling the truth when he recently said that they did not train GPT-5. (Come on. I don't know. I think it's better that we're all just straight about it.)

He wants all companies with large-scale computing power to be as transparent as possible, which is why they disclose the total amount of computing power they have.

They are training larger models than GPT-4. Currently, they have 6,000 H100s training models.

By December, 22,000 H100s will be fully operational, and from now on 1,000 to 2,000 more will be added every month.

He believes Google DeepMind should do the same and disclose how many FLOPs were used to train Gemini.

How AI training costs will change

From the perspective of compute cost, future AI training is unlikely to reach the $10 billion sometimes cited for a single model, unless someone actually spends three years on one training run, because beyond a point, stacking ever more compute to train a larger model mainly makes training take longer.

Higher cost may buy stronger capabilities, but this is not an unbounded mathematical exercise; many practical constraints have to be considered.

That said, because compute costs keep falling as chips improve, a model whose training would have cost the equivalent of US$10 billion in 2022 will become far cheaper to train.

Since chip performance improves by roughly 2 to 3 times per generation, the cost of training a model of that scale will end up far lower than it appears today.
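As a back-of-the-envelope illustration of this compounding effect (the 2.5x annual gain below is a hypothetical stand-in for the "2 to 3 times" figure above, not a number from the interview):

```python
# Sketch of how compounding hardware price-performance gains shrink training
# costs. The 2.5x-per-year figure is a hypothetical stand-in for the "2-3x"
# improvement mentioned above.
cost_2022 = 10_000_000_000   # $10B to train a given model at 2022 prices
gain_per_year = 2.5

for year in range(1, 4):
    cost = cost_2022 / gain_per_year ** year
    print(f"after {year} year(s): ~${cost / 1e9:.2f}B")
```

Under this assumption, a $10 billion 2022-era training run falls to about $4 billion after one year and roughly $640 million after three.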

For example, open-source models such as Llama 2 and Falcon are said to match the capabilities of GPT-3, which has 175 billion parameters, with only around 1.5 to 2 billion parameters.

Views on open source

Having spent his career at closed-source technology companies, Suleyman has a distinctive perspective on the value, and the possible risks, of open-source models.

First, he believes that for at least the next five years, open-source models will lag the cutting-edge closed-source models by three to five years.

Moreover, open source models will increase the social risks brought by AI.

If everyone has unrestricted access to the latest models, we will see what he calls a "rapid diffusion of power."

For example, new media platforms allow everyone to function as a complete newspaper, with millions of fans and even influence around the world.

Unlimited access to cutting-edge models will amplify this power, as within the next 3 years humans will be able to train models that are 1,000 times larger than existing models.

Even Inflection AI alone will, within the next 18 months, have 100 times the compute used to train today's most cutting-edge models.

Large open-source models would put this power into everyone's hands, effectively handing everyone a potentially destabilizing and destructive tool at scale.

And by then, trying to contain the destructive consequences of these tools would be, as one apt metaphor puts it, like trying to stop the rain by catching it in your hands.

He once explained to regulatory authorities that AI technology will lower the threshold for the development of many potentially dangerous compounds or weapons in the future.

AI can offer real help in actually making such things, for example telling you where to obtain tools when you hit technical obstacles in the lab. But it is also true that removing this content from pre-training data and aligning the model can effectively reduce such risks.

In short, for people who want to use large models to do harm, you need to make doing those things as difficult as possible.

However, if every model is open-sourced as widely as possible, ever more powerful future models will expose more of these risks.

So although open-source models are genuinely good for many people, letting everyone obtain a model and experiment with it, driving technological innovation and improvement, we must also recognize the risks of open source, because not everyone is well-intentioned and friendly.

He also emphasized that he did not make these remarks to attack the open source community:

"Although what I say may be interpreted by many people as a conflict of interest between what I do and the open source community, so many people may be angry, but I still want to express my views and hope to gain people's support."

The tug-of-war between Google and DeepMind

During his 10 years at DeepMind, he spent much of his time trying to incorporate more outside oversight into the process of building AI technology.

This is quite a painful process. While he thinks Google has good intentions, it still operates like a traditional bureaucracy.

When we established Google's ethics committee, the plan was to have nine independent members; it was an important step toward external oversight of the sensitive technology we were developing.

However, one appointee was a conservative who had made controversial remarks in the past; many netizens attacked her on Twitter and elsewhere, along with several other members who supported her, demanding they withdraw from the committee.

It's a complete tragedy and very upsetting. We have spent two years establishing this committee, which is the first step toward an external review of the very sensitive technology we are developing.

Unfortunately, within a week, three of the nine members resigned, then she resigned as well, and we had lost half of the committee.

Then the company turned around and said, "Why are we limiting ourselves by hiring these people? It's a waste of time."

In fact, when DeepMind was acquired, we made a condition of the acquisition that we have an ethics and safety committee.

After the ethics and safety committee, we planned to turn DeepMind into a global benefit company: one in which all stakeholders have a voice in decision-making.

It would be a company limited by guarantee. We then planned to draw up a charter setting out ethical and safety goals for the development of AGI, which would let us devote a large portion of our revenue to scientific and social missions.

This was a very creative and experimental structure. But when Alphabet saw what happened with the ethics committee, they became timid. They said, "This is completely crazy. The same thing will happen to your global benefit company. Why would you do that?"

In the end, DeepMind was merged into Google; it was never truly independent, and is now wholly owned by Google.

Google’s next-generation large model Gemini

The Information exclusively reported that Google’s multi-modal artificial intelligence model Gemini is about to be launched and will directly benchmark OpenAI’s GPT-4.

In fact, at this year’s Google I/O conference, Pichai announced to the public that Google is developing the next generation model Gemini.

Rumor has it that the model will have at least 1 trillion parameters, and training will use tens of thousands of Google TPU AI chips.

Like OpenAI with GPT-4, Google is reportedly building the model out of multiple expert models, each with specific capabilities.

In short, Gemini is also a mixture-of-experts (MoE) model.

This may also mean that Google intends to offer Gemini in different parameter sizes, a sensible choice from a cost-effectiveness perspective.
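To make the MoE idea concrete, here is a minimal sketch of top-k expert routing in NumPy. It is purely illustrative: the expert count, dimensions, and gating scheme are invented for this example and say nothing about Gemini's actual internals.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_expert(w):
    # each "expert" is just a small linear map with its own weights
    return lambda x: x @ w

d, n_experts, top_k = 4, 8, 2
experts = [make_expert(rng.normal(size=(d, d))) for _ in range(n_experts)]
gate_w = rng.normal(size=(d, n_experts))  # gating network weights

def moe_forward(x):
    # the gate scores every expert, but only the top-k actually run
    logits = x @ gate_w
    top = np.argsort(logits)[-top_k:]        # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the selected experts
    # output is the gate-weighted sum of the selected experts' outputs
    return sum(w * experts[i](x) for w, i in zip(weights, top))

y = moe_forward(rng.normal(size=d))
print(y.shape)  # (4,)
```

The appeal of this design is that total parameter count can grow with the number of experts while per-token compute stays roughly constant, since only `top_k` experts run for any given input.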

In addition to generating images and text, Gemini was trained on YouTube video transcription data and can also generate simple videos, similar to RunwayML Gen-2.

In addition, Gemini has also been significantly improved in terms of coding capabilities compared to Bard.

After Gemini launches, Google also plans to gradually integrate it into its own product line, including upgrades to Bard, the Google Workspace suite, Google Cloud, and more.

In fact, before Gemini, DeepMind had a model codenamed "Goodall," built on an unannounced model called Chipmunk and comparable to ChatGPT.

However, after the birth of GPT-4, Google finally decided to abandon the development of this model.

It is said that at least 20 executives took part in Gemini's development, led by DeepMind founder Demis Hassabis, with Google co-founder Sergey Brin also involved.

Hundreds of Google DeepMind employees are also involved, including former Google Brain head Jeff Dean and others.

Demis Hassabis said in a previous interview that Gemini will combine some of the advantages of the AlphaGo type system with the amazing language capabilities of large models.

Clearly, Google is gearing up for battle and waiting for Gemini to launch its counterattack.




Reprinted from 新智元

