100,000 tokens at a time! GPT-4's strongest rival gets an epic upgrade: a hundred pages of material summarized in one minute
Claude, billed as ChatGPT's "strongest competitor", has just received an epic update:
The model's memory has shot up overnight, and it can now read a novel of tens of thousands of words in about a minute.
As soon as the news broke, the comment section exploded, with netizens piling in to exclaim "whoa" over and over:
Crazy, crazy! Things are moving so fast. Another day of worrying for humanity!
It turns out that this update raises the model's context window to a full 100,000 tokens, equivalent to roughly 75,000 words!
This shores up large models' notoriously poor "memory": we can now hand Claude material running to hundreds of pages and tens of thousands of words, such as financial reports, technical documents, or even a whole book.
And it can analyze and summarize the material for you within a minute!
Keep in mind that almost every AI chatbot on the market can read only a limited amount of text at a time, and they struggle badly to follow context across long material.
Yet we humans are far too slow at processing large volumes of text ourselves. Reading 100,000 tokens of material takes about 5 hours, to say nothing of the extra time needed to understand, digest, and summarize it.
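The figures above are easy to sanity-check with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: the 0.75 words-per-token ratio is the one implied by the article (100,000 tokens for about 75,000 words), while the reading speed of 250 words per minute is an assumption added here for illustration, not a number from Anthropic.

```python
# Rough conversion figures; both are approximations for illustration only.
WORDS_PER_TOKEN = 0.75    # implied by the article: 100,000 tokens ~ 75,000 words
READING_SPEED_WPM = 250   # ASSUMPTION: average adult reading speed, words per minute


def tokens_to_words(tokens: int) -> int:
    """Estimate the English word count for a given number of tokens."""
    return round(tokens * WORDS_PER_TOKEN)


def reading_hours(tokens: int, wpm: int = READING_SPEED_WPM) -> float:
    """Estimate how many hours a person needs to read that many tokens."""
    return tokens_to_words(tokens) / wpm / 60


if __name__ == "__main__":
    window = 100_000
    print(tokens_to_words(window))  # 75000
    print(reading_hours(window))    # 5.0
```

Under these assumptions the numbers line up with the article: about 75,000 words and about 5 hours of reading. A faster or slower reader would shift the second figure accordingly.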
Now Claude does it in one go.
This move simply leapfrogs GPT-4, which only recently reached 32,000 tokens.
So, with roughly 3 times GPT-4's capacity in a single pass, how much better does Claude actually perform?
Claude's major update: 100,000 tokens remembered at a time
According to Anthropic's official announcement, the upgraded Claude-100k has greatly improved at both conversation and task handling.
First, the increase in how much text it can process at once directly broadens the kinds of jobs Claude can take on.
Previously, large models could handle documents of a few dozen pages at most.
Now Claude can speed-read company financial reports and technical documentation, identify risks in legal documents, read hundreds of pages of research papers, and even process an entire codebase.
Most importantly, it can not only read the full text and summarize the key points but also complete specific tasks, such as writing code and organizing tables.
For example, it can quickly digest hundreds of pages of developer documentation and build an application demo based on it.
Take LangChain, a new technology Claude had not seen before, as an example.
Given a 240-page LangChain API report, it was asked to quickly produce a LangChain demo:
Almost immediately, Claude returned a demo application built on LangChain:
As another example, give it a long but must-listen 5-hour knowledge podcast:
It can not only extract the key points as text but also quickly organize them into tables and analyze the views expressed:
Stepping up the difficulty slightly, it handles a 30-page research paper just fine, and you can even point it at a specific paragraph of a specific section to summarize:
It can also help film directors with tasks such as choosing shooting locations. For example, given the script of "Dune" and asked for the most suitable filming locations, Claude quickly suggests several candidate sites:
Finally, Anthropic also described a test using "The Great Gatsby", though without a demo.
They altered one character, Mr. Carraway, making him "an engineer at Anthropic", then fed the modified book to Claude and asked it to spot the difference.
In just 22 seconds, Claude read the book and caught Mr. Carraway's changed identity.
Second, the larger "memory" means better control over topics and better conversational ability.
Previously, large models often lost the thread mid-conversation: once the dialogue exceeded a few thousand words, they began to talk nonsense.
For example, if you set up a catgirl persona in ChatGPT with a long prompt, it may forget what it told you after a few hours of chatting and start drifting out of character (just kidding).
But Claude, which can now hold 100,000+ tokens at a time, is unlikely to do this. Instead, it can firmly remember what you have talked about and keep the conversation going for days.
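Why a small window leads to "forgetting" can be illustrated with a toy sketch. The code below is not how Claude or ChatGPT manages conversation history; it is a simplified assumption in which a chat client keeps only the most recent messages that fit the window, and it estimates tokens from word counts rather than using a real tokenizer. The names `estimate_tokens` and `fit_to_window` are hypothetical helpers invented for this example.

```python
from typing import List

WORDS_PER_TOKEN = 0.75  # ASSUMPTION: the same rough ratio the article uses


def estimate_tokens(text: str) -> int:
    """Crude token estimate from a whitespace word count (not a real tokenizer)."""
    return max(1, round(len(text.split()) / WORDS_PER_TOKEN))


def fit_to_window(messages: List[str], window_tokens: int) -> List[str]:
    """Keep the most recent messages whose estimated tokens fit in the window."""
    kept: List[str] = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > window_tokens:
            break                       # this message and everything older is dropped
        kept.append(message)
        used += cost
    return list(reversed(kept))         # restore chronological order


if __name__ == "__main__":
    persona = "PERSONA: " + "stay in character " * 100  # long setup prompt
    chat = [persona] + [f"turn {i}: " + "chatter " * 300 for i in range(40)]
    small = fit_to_window(chat, 9_000)
    large = fit_to_window(chat, 100_000)
    print(persona in small)  # False: the setup prompt no longer fits
    print(persona in large)  # True: the whole conversation fits
```

In this toy setup the persona prompt falls out of a 9k-token window after a few dozen turns but stays comfortably inside a 100k-token one, which is the effect described above.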
So how can we use the latest version of Claude right now?
Both the API and the web version are live
Anthropic first announced the API version of Claude-100k and then quickly rolled out the web version.
△ Anthropic engineer
So whether you use the web page or the API, you can already try this "super long memory" version of Claude directly.
Some netizens couldn't wait to try it out.
Matt Shumer, CEO of OthersideAI, tried using Claude-100k on the web to summarize a technical report.
He first tested Claude-9k and found that it still "hallucinated" when faced with hundreds of pages of the GPT-4 technical report; then he tested the new Claude-100k, which gave a well-reasoned estimate:
GPT-4 has roughly 500 billion parameters!
Here's how it reasoned:
Who knows whether OpenAI's Altman will come out to deny the rumor this time (just kidding).
Another user, from AssemblyAI, tested the API version of Claude-100k.
In a video demo, he used Claude-100k to summarize Lex Fridman's 5-hour podcast (with John Carmack), and the result looked pretty good too:
But neither the web version nor the API can be tried directly without registering.
The version we covered previously, which needs no registration, application, or workarounds and still offers a very good experience, is the Slack version, which is very simple to use.
Unfortunately, that one is still the Claude-9k "experimental version" for now.
So, to sum up, this Claude-100k version:
- Is available through the API, but not for free;
- Is available on the web, but you need trial access; if you don't have it, apply and wait;
- Is not yet available on Slack, which still runs the trial version.
Out-competing GPT-4; netizens: competing in the right direction
Just yesterday, Google announced several major updates at its I/O conference, including:
- Revamping Google Search and opening up AI conversations
- Releasing the large model PaLM 2
- Fully opening Bard, with no waitlist
...
This is seen as a series of counterattacks against Microsoft and OpenAI.
Now Anthropic's Claude has followed suit with a major update of its own, directly out-competing GPT-4.
Some netizens commented:
Claude is competing in the right direction.
Indeed, most language models today can handle only 2-8k tokens, and everyone is racking their brains for ways to improve model memory.
For example, a paper last month claiming to raise the Transformer token limit to 1 million or even 2 million drew a lot of attention, but netizens' tests of it seem to have been underwhelming.
Now Claude is the first in the industry to announce 100k and actually make it available for everyone to use. It is hard not to applaud.
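To put the window sizes mentioned in this article side by side, here is a small sketch that counts how many window-sized pieces a 100,000-token document would have to be split into. It is a simplification: it ignores the room a real prompt needs for instructions and for the model's answer, and the labels in `WINDOWS` are just display names chosen for this example.

```python
import math

DOCUMENT_TOKENS = 100_000  # a document that fills Claude-100k's window

# Window sizes mentioned in the article; labels are for display only.
WINDOWS = {
    "typical model (2k)": 2_000,
    "typical model (8k)": 8_000,
    "GPT-4 (32k)": 32_000,
    "Claude-100k": 100_000,
}


def chunks_needed(document_tokens: int, window_tokens: int) -> int:
    """Number of window-sized pieces needed to cover the whole document."""
    return math.ceil(document_tokens / window_tokens)


if __name__ == "__main__":
    for name, size in WINDOWS.items():
        print(f"{name}: {chunks_needed(DOCUMENT_TOKENS, size)} chunk(s)")
```

Run as written, this reports 50, 13, 4, and 1 chunks respectively. Only the last case lets the model see the whole document at once, without stitching together partial summaries.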
Other netizens took a loftier view:
Competition under capitalism really is wonderful.
The implication: competition is good, competition is wonderful. Without the rivalry among the giants and the various specialized companies, how could we have witnessed so many significant developments in just two days?
Then again, Anthropic was founded by a few former OpenAI employees who were unhappy that OpenAI was getting too close to Microsoft, and Google has invested $300 million in the company.
Bold guess:
Was this one-two punch from the two companies coordinated in advance?