See BELLE/train for details: a training code implementation kept as simple as possible, which integrates DeepSpeed-Chat, supports both full-parameter finetuning and LoRA, and provides the related Docker setup.
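As a rough illustration of the LoRA path only (a minimal sketch, not the project's actual training entry point; the base checkpoint and hyperparameters below are assumptions), wiring up LoRA with HuggingFace `transformers` and `peft` might look like:

```python
# Minimal LoRA sketch. The base checkpoint "bigscience/bloomz-7b1-mt"
# (the model BELLE tunes from) and the hyperparameters are illustrative,
# not the project's actual training settings.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

base = "bigscience/bloomz-7b1-mt"  # assumption: base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# "query_key_value" is the fused attention projection in BLOOM blocks.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["query_key_value"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```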
See BELLE/data/1.5M for details: a Chinese dataset of 1M + 0.5M samples generated following the Stanford Alpaca approach.
Continuously released open datasets; see BELLE/data/10M for details.
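The released data follows the Stanford Alpaca instruction format. As a hedged sketch (the file name is a placeholder, and the JSON-lines layout with `instruction`/`input`/`output` fields is an assumption about the release based on the Alpaca convention), reading one of the files could look like:

```python
# Sketch of reading an Alpaca-style instruction file; the path is a
# placeholder and the JSON-lines layout is an assumption about the release.
import json

samples = []
with open("Belle_open_source_1M.train.json", encoding="utf-8") as f:
    for line in f:
        rec = json.loads(line)
        # Alpaca convention: instruction, optional input, and target output.
        samples.append((rec["instruction"], rec.get("input", ""), rec["output"]))

print(len(samples), samples[0])
```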
See BELLE/eval for details: a test set of 1k+ cases with the corresponding scoring prompts, covering multiple categories and scored with GPT-4 or ChatGPT. A scoring webpage is also provided for conveniently checking individual cases. Everyone is welcome to contribute more test cases via PR.
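A minimal sketch of the scoring idea (the rubric text and model choice are placeholders, not the project's actual scoring prompts, which live in BELLE/eval; the snippet assumes the pre-1.0 `openai` Python API):

```python
# Sketch only: the rubric is a placeholder, not the project's scoring prompt;
# uses the legacy (pre-1.0) openai Python API.
import openai

def score_answer(question: str, answer: str) -> str:
    prompt = (
        "You are grading a model's answer on a 0-1 scale.\n"
        f"Question: {question}\nAnswer: {answer}\n"
        "Reply with a score and a one-sentence justification."
    )
    resp = openai.ChatCompletion.create(
        model="gpt-4",  # or "gpt-3.5-turbo" for ChatGPT-based scoring
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp["choices"][0]["message"]["content"]
```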
See BELLE/models for details.
- Models tuned from BLOOMZ-7B1-mt: BELLE-7B-0.2M, BELLE-7B-0.6M, BELLE-7B-1M, BELLE-7B-2M
- Models tuned from Meta LLaMA: BELLE-LLaMA-7B-0.6M-enc, BELLE-LLaMA-7B-2M-enc, BELLE-LLaMA-7B-2M-gptq-enc, BELLE-LLaMA-13B-2M-enc, BELLE-on-Open-Datasets, plus BELLE-LLaMA-EXT-7B, a model further pre-trained on LLaMA with an expanded Chinese vocabulary.
- Please refer to the [Meta LLaMA License](https://github.com/facebookresearch/llama/blob/main/LICENSE); these models are currently for research and learning exchange only, and LLaMA's usage restrictions must be strictly observed. The LLaMA license does not allow releasing the full tuned model weights, but releasing a diff against the original model is permitted. We therefore apply a byte-wise XOR between files, so that only those licensed to use the original LLaMA weights can convert the models released by this project into a usable format; see [BELLE/models](https://github.com/LianjiaTech/BELLE/tree/main/models) for the conversion code, sketched below.
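To make the XOR-diff idea concrete, here is a minimal sketch (file names are placeholders; the project's real conversion scripts are in BELLE/models). XOR-ing the released diff against the original LLaMA weights byte-for-byte recovers the tuned weights, and the same operation applied to the original and tuned weights produces the diff:

```python
# XOR two files byte-for-byte. Applied to (original LLaMA weights, released
# diff) this recovers the tuned weights; applied to (original, tuned) it
# produces the diff. By construction the two inputs have identical length.
def xor_files(path_a: str, path_b: str, out_path: str, chunk: int = 1 << 20) -> None:
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb, open(out_path, "wb") as out:
        while True:
            a = fa.read(chunk)
            b = fb.read(chunk)
            if not a and not b:
                break
            out.write(bytes(x ^ y for x, y in zip(a, b)))

# File names are placeholders for illustration.
xor_files("llama-7b.bin", "belle-diff.bin", "belle-7b.bin")
```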
See BELLE/gptq for details: quantized versions of the models in this project, following the GPTQ implementation.
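GPTQ itself uses second-order (Hessian) information to minimize each layer's output error under quantization. As a much simpler stand-in that only illustrates what weight quantization does, here is a round-to-nearest int8 sketch; this is explicitly not the GPTQ algorithm:

```python
# Round-to-nearest symmetric int8 quantization of a weight tensor.
# Deliberately simplified stand-in, NOT the GPTQ algorithm, which
# additionally minimizes layer output error using Hessian information.
import torch

def quantize_rtn(w: torch.Tensor, bits: int = 8):
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for int8
    scale = w.abs().max() / qmax        # per-tensor scale (per-channel in practice)
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(4, 4)
q, s = quantize_rtn(w)
print((w - dequantize(q, s)).abs().max())  # quantization error
```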
Provides inference code that can be run on Colab.
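A minimal inference sketch with HuggingFace `transformers` (the checkpoint id and the `Human:`/`Assistant:` prompt template are assumptions about the release, not verified against the Colab notebook):

```python
# Inference sketch; the checkpoint id and prompt template are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "BelleGroup/BELLE-7B-2M"  # assumption: HF hub id of a released model
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(ckpt)

prompt = "Human: 帮我写一首关于春天的诗\n\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```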
See BELLE/chat for details: a cross-platform offline chat app based on the BELLE model. Built with Flutter on top of the quantized offline model, it can run on macOS (already supported), Windows, Android, iOS, and other devices.
See BELLE/docs for details; research reports related to this project are regularly updated there.
Everyone is welcome to contribute more prompts through issues!
Visit Official Website