ChatGLM2-6B is the second-generation version of the open-source Chinese-English bilingual dialogue model ChatGLM-6B. While retaining many excellent features of the first-generation model, such as smooth dialogue and a low deployment threshold, ChatGLM2-6B introduces the following new features:
More powerful performance: Drawing on development experience from the first-generation ChatGLM model, we have fully upgraded the base model of ChatGLM2-6B. ChatGLM2-6B uses the hybrid objective function of GLM and has undergone pre-training on 1.4T Chinese and English tokens as well as human preference alignment training. Evaluation results show that, compared with the first-generation model, ChatGLM2-6B achieves substantial performance improvements on datasets such as MMLU (+23%), C-Eval (+33%), GSM8K (+571%), and BBH (+60%), making it highly competitive among open-source models of the same size.
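The dialogue ability described above is driven by a simple multi-round prompt layout. Below is a minimal sketch of how such a prompt can be assembled; the `[Round N] / 问 / 答` layout follows the convention used by the ChatGLM family, but treat the exact string format as an assumption rather than the tokenizer's verbatim preprocessing:

```python
def build_prompt(query, history=None):
    """Assemble a multi-round Chinese-English dialogue prompt.

    Each past (query, response) pair becomes one numbered round;
    the new query opens the final round with an empty answer slot.
    The "[Round N]" layout is illustrative of the ChatGLM prompt
    convention, not guaranteed to match the model's tokenizer exactly.
    """
    history = history or []
    prompt = ""
    for i, (old_query, response) in enumerate(history):
        prompt += "[Round {}]\n\n问：{}\n\n答：{}\n\n".format(
            i + 1, old_query, response)
    prompt += "[Round {}]\n\n问：{}\n\n答：".format(len(history) + 1, query)
    return prompt


print(build_prompt("What is GLM?", history=[("Hello", "Hi, how can I help?")]))
```

In this scheme the model is conditioned on the full conversation on every turn, which is why context length (see the 32K variant below) directly bounds how much dialogue history can be kept.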
Welcome to chatglm.cn to experience larger-scale ChatGLM models.
The ChatGLM2-6B open-source model aims to advance large-model technology together with the open-source community. Developers and users are urged to abide by the open-source license and not to use the open-source model, code, or derivatives of this project for any purpose that may cause harm to the country or society, or for any service that has not undergone security assessment and filing. Currently, the project team has not developed any applications based on ChatGLM2-6B, including web, Android, Apple iOS, or Windows apps.
Although every effort has been made to ensure the compliance and accuracy of the data at all stages of training, the accuracy of the output cannot be guaranteed: the ChatGLM2-6B model is small and its output is affected by probabilistic randomness, and the model is easily misled. This project does not assume the risks or liabilities of data security or public-opinion risks caused by the open-source model and code, or any risks or liabilities arising from the model being misled, misused, disseminated, or otherwise improperly exploited.
[2023/07/31] Released the ChatGLM2-6B-32K model, improving the ability to understand long texts.
[2023/07/25] Released the CodeGeeX2 model, based on ChatGLM2-6B with additional code pre-training; its coding ability is comprehensively improved.
[2023/07/04] Released P-Tuning v2 and full-parameter fine-tuning scripts; see P-Tuning.
Open-source projects that accelerate ChatGLM2:
ChatGLM2-TPU: A TPU-accelerated inference solution that runs at about 3 tokens/s in real time on the edge-side compute chip BM1684X (16 TOPS@FP16, 16 GB memory). Project: visit the official website.