Baichuan AI

About Baichuan AI

Baichuan Intelligence was founded on April 10, 2023 by former Sogou CEO Wang Xiaochuan. The company's mission is to make world knowledge and professional services easily and broadly accessible to the public, and it aims to build the best foundation large model in China through breakthroughs in language AI. The core team is made up of top AI talent from well-known technology companies such as Sogou, Baidu, Huawei, Microsoft, ByteDance, and Tencent. Less than 100 days after its founding, Baichuan Intelligence released two open-source Chinese large models, Baichuan-7B and Baichuan-13B, both free for commercial use. They rank among the best on many authoritative evaluation leaderboards, and their downloads have exceeded one million.

Product description


The Baichuan-53B large model integrates intent understanding, information retrieval, and reinforcement learning technologies with supervised fine-tuning and alignment to human intent, and it performs strongly in knowledge question answering and text creation.


Baichuan-13B is an open-source, commercially usable large language model with 13 billion parameters, developed by Baichuan Intelligence as the successor to Baichuan-7B. It achieves the best results among models of the same size on authoritative Chinese and English benchmarks. This release includes two versions: a pre-trained model (Baichuan-13B-Base) and an aligned model (Baichuan-13B-Chat). Baichuan-13B has the following characteristics:

  • Larger size, more data: Baichuan-13B expands the parameter count from Baichuan-7B to 13 billion and was trained on 1.4 trillion tokens of high-quality corpus, 40% more than LLaMA-13B. It is currently the open-source model at the 13B size trained on the most data. It supports both Chinese and English, uses ALiBi positional encoding, and has a context window length of 4096.

  • Pre-trained and aligned models open-sourced together: The pre-trained model is the "base" for developers, while ordinary users have a stronger need for an aligned model with dialogue capabilities. For this reason, this release also includes the aligned model (Baichuan-13B-Chat), which has strong dialogue ability, works out of the box, and can be deployed with a few lines of code.

  • More efficient inference: To make the model usable by more people, we have also open-sourced int8 and int4 quantized versions. Compared with the non-quantized version, they greatly lower the hardware requirements for deployment with almost no loss in quality, and can be deployed on a consumer graphics card such as the Nvidia 3090.

  • Open source, free, and commercially usable: Baichuan-13B is fully open for academic research, and developers can also use it commercially free of charge after applying by email and obtaining an official commercial license.
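
Two of the figures in the list above can be sanity-checked with simple arithmetic. The sketch below is only a rough estimate, not an official sizing guide. It assumes (none of this is stated on this page) that LLaMA-13B was trained on 1.0 trillion tokens, that an Nvidia RTX 3090 has 24 GB of memory, that the non-quantized weights are stored in 16-bit precision, and that only weight storage is counted. The function names are illustrative.

```python
# Back-of-the-envelope checks for the Baichuan-13B figures quoted above.
# Assumptions (not from this page): LLaMA-13B saw 1.0 trillion training
# tokens, an RTX 3090 has 24 GB of memory, non-quantized weights are fp16,
# and only weight storage is counted (no activations or KV cache).

PARAMS = 13_000_000_000  # 13 billion parameters, as stated above
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}
GPU_MEMORY_GB = 24  # assumed memory of an Nvidia RTX 3090


def weight_memory_gb(num_params: int, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


def token_increase_pct(tokens: float, baseline_tokens: float) -> float:
    """Percentage by which `tokens` exceeds `baseline_tokens`."""
    return (tokens / baseline_tokens - 1.0) * 100.0


if __name__ == "__main__":
    for precision, nbytes in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(PARAMS, nbytes)
        verdict = "fits" if gb < GPU_MEMORY_GB else "does not fit"
        print(f"{precision}: ~{gb:.1f} GB of weights, {verdict} in {GPU_MEMORY_GB} GB")
    # 1.4 trillion tokens versus the assumed 1.0 trillion for LLaMA-13B
    print(f"token increase: {token_increase_pct(1.4e12, 1.0e12):.0f}%")
```

Under these assumptions, the 16-bit weights alone (about 26 GB) exceed a 24 GB card, while the int8 (about 13 GB) and int4 (about 6.5 GB) versions leave headroom. Actual memory use will be higher once activations and the attention cache are included.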


Baichuan-7B is an open-source, commercially usable large-scale pre-trained language model. Built on the Transformer architecture, this 7-billion-parameter model was trained on about 1.2 trillion tokens, supports both Chinese and English, and has a context window length of 4096. It achieves the best results among models of the same size on the standard Chinese and English benchmarks (C-Eval/MMLU).

You can also train the model on your own data; you only need to organize the data into the specified format.
The quality of Baichuan's base model is quite good, and it can punch above its weight against LLaMA. The 7B model is not inferior to the various 13B LLaMA derivatives.
Community Posts
doom guy
Baichuan-13B, Baichuan's open-source large model, performs well across multiple capabilities, though some tasks still have considerable room for improvement.
master chief117
Baichuan-53B is aligned with human values through alignment tuning so that it generates "more satisfying" responses.
Baichuan Intelligence launched the Baichuan-53B large model today. The model is currently open for beta test applications, which can be approved within half an hour, allowing users to try the model directly on the official website.
Baichuan Large Model: Bringing Together World Knowledge, Creating...

This model integrates technologies of intent understanding, information retrieval, and reinforcement learning. Combined with supervised fine-tuning and alignment with human intent, it excels in areas like knowledge Q&A and text creation, supporting both Chinese and English.