This project provides the community with the Chinese dialogue model Linly-ChatFlow, the Chinese foundation models Chinese-LLaMA (1 and 2) and Chinese-Falcon, and their training data.
The models are trained with full-parameter fine-tuning on the TencentPretrain pre-training framework.
The Chinese foundation models are built on LLaMA and Falcon: incremental pre-training on Chinese and Chinese-English parallel corpora extends their language ability from English to Chinese. The project also aggregates the currently public multilingual instruction datasets and performs large-scale instruction-following training on the Chinese models, producing the Linly-ChatFlow dialogue model.
In addition, this project open-sources the Linly-OpenLLaMA models trained from scratch at the 3B, 7B, and 13B scales. The models are released under the Apache 2.0 license.
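As a quick orientation, here is a minimal sketch of loading one of the released models with Hugging Face Transformers. The repository id `Linly-AI/Chinese-LLaMA-2-7B-hf` is an illustrative assumption, not a confirmed checkpoint name; check the project's model hub page for the actual released weights.

```python
# Minimal sketch: load a Linly checkpoint and generate a reply.
# The repo id below is hypothetical; substitute the real released checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Linly-AI/Chinese-LLaMA-2-7B-hf"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "请介绍一下大语言模型。"  # "Please introduce large language models."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```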
Project Content
Chinese pre-training corpus | Chinese instruction fine-tuning datasets | model quantization and deployment | domain fine-tuning examples
Visit the official website