Baichuan releases a 53-billion-parameter large model with built-in search: a first hands-on test
The parameter count has grown, a search engine has been integrated, and the target customers are businesses (the B side).
Baichuan Intelligent, founded by Wang Xiaochuan, released a large model yesterday, making this the third consecutive month with a release.
On August 8, Baichuan Intelligent announced in Beijing the official launch of its new-generation large model, Baichuan-53B. At the same event, Wang Xiaochuan and other company executives were interviewed by the media.
"People usually think that it takes at least half a year to release a large model, from data accumulation and preparation through training to fine-tuning. It took us only two months to launch our first model, and the final quality has also been praised by the outside world," said Wang Xiaochuan, founder of Baichuan Intelligent and former CEO of Sogou. "Compared with the previous models, Baichuan-53B has a much larger parameter scale, and its writing ability has improved greatly."
From the 7B model on June 15 and the 13B model on July 11 to the current 53 billion parameters, the Baichuan models have grown rapidly in size. This time, Baichuan Intelligent also announced a website for the model and opened applications for internal testing.
At the event, Wang Xiaochuan personally demonstrated some of the new model's capabilities.
Writing a WeChat Moments post in the style of Gu Long's prose:
Complete the script for a short video ad:
Baichuan Intelligent said that Baichuan-53B performs well in the creativity, style imitation, and practicality of its text generation, and gives good responses on most tasks.
After yesterday's release, Heart of the Machine (Jiqizhixin) was also invited to the internal test and ran a simple trial, focusing on the text generation and search capabilities that Baichuan Intelligent highlighted.
Trying an essay prompt from the 2023 Beijing college entrance examination:
As can be seen, Baichuan-53B understands some recent hot news and can work it into its answers:
At the same time, however, the model does not seem to think it has the ability to fetch real-time news.
For Baichuan-53B, Baichuan Intelligent emphasized that the large model and search have been integrated to a high degree, and it hopes this mechanism will lay a foundation for future search models.
Baichuan believes that search enhancement is an effective way to address the timeliness and hallucination problems of large models. Combining search technology with the capabilities of a large language model makes new kinds of model optimization possible and improves the usability of AI answers.
According to the company, the search enhancement system of the Baichuan model integrates multiple modules, including instruction intent understanding, intelligent search, and result enhancement. The system builds an in-depth understanding of the user's instruction, uses it to drive the search with precise query terms, and then combines the search results with the large language model to make the generated output more reliable. Working together, these modules let the model give more precise and intelligent answers and reduce hallucinations.
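The three stages described above (intent understanding, search, result enhancement) can be sketched roughly as follows. This is only an illustrative toy and not Baichuan's implementation, which has not been published: the in-memory `CORPUS`, the stopword list, and the functions `extract_query_terms`, `search`, and `build_augmented_prompt` are all hypothetical names, and keyword overlap stands in for a real search engine and a real intent model.

```python
from typing import List

# Toy in-memory "index" (assumption); a real system would query a search engine.
CORPUS = [
    "Baichuan-53B was launched in Beijing on August 8.",
    "Baichuan-13B was released on July 11.",
    "Gu Long was a novelist known for terse wuxia prose.",
]

# Hypothetical stopword list used by the toy intent-understanding step.
STOPWORDS = {"the", "a", "an", "is", "was", "when", "what", "who",
             "please", "tell", "me", "of", "in", "on"}


def extract_query_terms(instruction: str) -> List[str]:
    """Intent understanding (toy): keep only content-bearing words."""
    words = [w.strip("?.,!").lower() for w in instruction.split()]
    return [w for w in words if w and w not in STOPWORDS]


def search(terms: List[str], corpus: List[str], top_k: int = 2) -> List[str]:
    """Search (toy): rank documents by how many query terms they contain."""
    scored = []
    for doc in corpus:
        lowered = doc.lower()
        score = sum(1 for t in terms if t in lowered)
        if score > 0:  # screening: drop documents with no matching term
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_augmented_prompt(instruction: str, evidence: List[str]) -> str:
    """Result enhancement (toy): put retrieved evidence in front of the question."""
    if not evidence:
        return f"Question: {instruction}\nNo search results were found."
    lines = "\n".join(f"- {doc}" for doc in evidence)
    return (
        "Answer using only the search results below.\n"
        f"Search results:\n{lines}\n"
        f"Question: {instruction}"
    )


question = "When was Baichuan-53B launched?"
terms = extract_query_terms(question)
prompt = build_augmented_prompt(question, search(terms, CORPUS))
print(prompt)
```

The resulting prompt would then be passed to the language model. Grounding the answer in retrieved text is what lets a search-enhanced system cover events after its training cutoff and reduces the room for hallucination.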
Whereas ChatGPT connects to Bing search through a plug-in, Baichuan's model integrates search more deeply. Baichuan did not, however, disclose which search engine it works with.
Baichuan also takes its own approach to dynamic response strategy, dividing instruction tasks into 16 independent categories. These categories cover the various scenarios of user instructions, including precise question answering, logical reasoning, and brainstorming, and each category is optimized individually. To do this, the new model relies on a technique called Prompt Augmentation, which guides the model toward the desired output by constructing specific input prompts, so that the model responds appropriately to different types of instructions.
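As a rough illustration of the idea of routing an instruction to a category and then augmenting the prompt for that category, consider the minimal sketch below. It is an assumption-laden toy: only the three categories named above are included (the other 13 are not listed in the article), the keyword cues stand in for whatever classifier Baichuan actually uses, and `CATEGORY_PROMPTS`, `CATEGORY_CUES`, `classify_instruction`, and `augment_prompt` are hypothetical names.

```python
from typing import Dict, Tuple

# Category-specific guidance (assumption): one prompt prefix per category.
CATEGORY_PROMPTS: Dict[str, str] = {
    "precise_qa": "Answer concisely and state only facts you are sure of.",
    "logical_reasoning": "Reason step by step before giving the final answer.",
    "brainstorming": "Offer several distinct ideas as a short list.",
}

# Hypothetical keyword cues standing in for a learned intent classifier.
CATEGORY_CUES: Dict[str, Tuple[str, ...]] = {
    "logical_reasoning": ("why", "prove", "deduce"),
    "brainstorming": ("ideas", "brainstorm", "suggest"),
}


def classify_instruction(instruction: str) -> str:
    """Pick the first category whose cue appears; default to precise Q&A."""
    lowered = instruction.lower()
    for category, cues in CATEGORY_CUES.items():
        if any(cue in lowered for cue in cues):
            return category
    return "precise_qa"


def augment_prompt(instruction: str) -> str:
    """Prepend the guidance for the detected category to the instruction."""
    category = classify_instruction(instruction)
    return f"{CATEGORY_PROMPTS[category]}\n\nUser instruction: {instruction}"


print(augment_prompt("Give me ideas for a short video ad"))
```

The point of the design is that the user's text is left unchanged; only the surrounding prompt differs by category, so each type of instruction can be tuned separately.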
Baichuan Intelligent also discussed techniques such as dynamic hyperparameter adjustment, intelligent search term generation, screening for high-quality search results, and RLHF-based enhancement of search results. Beyond pre-training, Baichuan emphasized the importance of alignment tuning for improving the quality of replies.
"I feel a greater sense of accomplishment now than when I was working on search engines," Wang Xiaochuan said. "Before the era of large models, Sogou had already applied transformers very early on, but we were never able to turn search into a practical question-answering model. Now we can achieve that capability much more easily."
It is worth noting that as the model has grown larger, Baichuan has not continued its earlier open-source approach. The company plans to open APIs and components for Baichuan-53B next month, and to strengthen business alignment and support for professional fields in order to promote real-world adoption.
"The large models we provide can be used directly for running benchmark tests, which is rare in the industry. These products are not optimized for individual scenarios, and they are ready to serve as foundation models for to-B (enterprise) use," said Wang Xiaochuan.
On April 10, 2023, Wang Xiaochuan officially announced the founding of Baichuan Intelligent, which is committed to building general intelligence technology benchmarked against OpenAI, including a foundation large model and disruptive applications on top of it. While its technical team has continued to expand, Baichuan has launched a series of self-developed large models.
On June 15, Baichuan Intelligent launched Baichuan-7B, a Chinese-English language model with 7 billion parameters, which took first place on a number of authoritative international benchmark leaderboards. On July 11, Baichuan Intelligent released the general-purpose large language model Baichuan-13B-Base with 13 billion parameters, the dialogue model Baichuan-13B-Chat, and two quantized versions, INT4 and INT8.
In terms of financing, Baichuan Intelligent completed the angel round of financing in May and received joint investment from more than ten institutions including Tencent, Xiaomi, Kingsoft, Muhua Capital, and Tsinghua University Asset Management Co., Ltd.
In terms of business model, Baichuan Intelligent hopes in the long run to be able to acquire and build "super applications" in the consumer market. In the to-B market, where goals are relatively clear, the company has not been the fastest to enter, but it has demonstrated its strength through open source and other means.
"From a to-B perspective, both open-source and closed-source large models have room to develop. We believe that in the future 80% of companies will need to build their intelligence on open-source models," Wang Xiaochuan said. "Currently, more than 150 companies are applying to use our large model."
Baichuan Intelligent plans to go on to release large models at the scale of hundreds of billions and then trillions of parameters in the third and fourth quarters of this year, and to build products that reach the highest level in China and are benchmarked against the GPT series.