📝 Main content of the project

🚀 Training code

See BELLE/train for details: a training code implementation kept as simple as possible, which integrates DeepSpeed-Chat, supports fine-tuning and LoRA, and provides a related Docker setup.
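
The LoRA technique mentioned above can be sketched in a few lines. This is a minimal NumPy illustration (not the project's actual training code): a frozen pretrained weight matrix is augmented with a trainable low-rank update, so only the small factors need gradients.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16):
    """Linear layer with a LoRA update: y = x @ (W + (alpha/r) * A @ B).

    W is frozen; only the low-rank factors A (d_in x r) and
    B (r x d_out) would be trained.
    """
    r = A.shape[1]
    return x @ W + (alpha / r) * (x @ A) @ B

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 32, 4
W = rng.normal(size=(d_in, d_out))      # frozen pretrained weight
A = rng.normal(size=(d_in, r)) * 0.01   # trainable low-rank factor
B = np.zeros((r, d_out))                # zero init: no change at start
x = rng.normal(size=(1, d_in))

# With B initialized to zero, the LoRA branch contributes nothing,
# so the output equals the frozen layer's output.
assert np.allclose(lora_forward(x, W, A, B), x @ W)
```

The zero initialization of `B` is the standard LoRA choice: training starts from the pretrained model's behavior exactly.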

📊 Open data

🧐 Evaluation set & evaluation methods

See BELLE/eval for details: a test set of 1k+ examples and the corresponding scoring prompts. It covers multiple categories and uses GPT-4 or ChatGPT for scoring. A scoring webpage is also provided, which is convenient for evaluating single cases. Everyone is welcome to contribute more test cases through PRs.

🤖 Models

See BELLE/models for details.

- Models optimized based on BLOOMZ-7B1-mt: BELLE-7B-0.2M, BELLE-7B-0.6M, BELLE-7B-1M, BELLE-7B-2M

- Models tuned based on Meta LLaMA: BELLE-LLaMA-7B-0.6M-enc, BELLE-LLaMA-7B-2M-enc, BELLE-LLaMA-7B-2M-gptq-enc, BELLE-LLaMA-13B-2M-enc, BELLE-on-Open-Datasets, and a pre-trained model BELLE-LLaMA-EXT-7B based on LLaMA with an expanded Chinese vocabulary.

- Please refer to the [Meta LLaMA License](https://github.com/facebookresearch/llama/blob/main/LICENSE); these models are currently for research and learning purposes only, and LLaMA's usage restrictions must be strictly observed. The LLaMA license does not permit releasing the full tuned model weights, but it does permit releasing a diff against the original model. Therefore, we use a file-level XOR, which guarantees that only those with authorized access to the original LLaMA model can convert the models released by this project into a usable format. See [BELLE/models](https://github.com/LianjiaTech/BELLE/tree/main/models) for the conversion code.
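
The XOR scheme above can be illustrated with a minimal sketch (the real conversion script lives in BELLE/models; the byte strings here are hypothetical stand-ins for weight files). XOR-ing the released diff with the original LLaMA weights recovers the tuned weights; without the original file, the diff alone reveals only noise.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))

# Hypothetical contents of two weight files, shown as raw bytes.
original = b"\x01\x02\x03\x04"   # original LLaMA weights (license required)
tuned    = b"\x11\x22\x33\x44"   # fine-tuned weights

# The project releases only the XOR diff, which is useless on its own.
diff = xor_bytes(tuned, original)

# Anyone holding the original weights can recover the tuned model.
assert xor_bytes(diff, original) == tuned
```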

⚖️ Model quantization (GPTQ)

See BELLE/gptq for details: following the GPTQ implementation, we quantize the relevant models in this project.
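
As a rough intuition for what weight quantization does, here is a minimal round-to-nearest 4-bit sketch. Note this is *not* the actual GPTQ algorithm, which additionally uses second-order information to minimize each layer's output error; it only shows the basic map from floats to a small integer grid and back.

```python
import numpy as np

def quantize_4bit(w: np.ndarray):
    """Symmetric round-to-nearest 4-bit quantization.

    Maps floats to integers in [-8, 7] using one per-tensor scale.
    """
    scale = np.abs(w).max() / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the 4-bit integers."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, scale = quantize_4bit(w)
w_hat = dequantize(q, scale)

# Round-to-nearest bounds the error by half a quantization step.
assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
```

GPTQ improves on this baseline by quantizing weights column by column and compensating the remaining weights for the error introduced at each step.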

🌐 Colab

Provides inference code that can be run directly on Colab.

💬 ChatBELLE App

See BELLE/chat for details: a cross-platform offline chat app based on the BELLE model. Built with Flutter on top of a quantized offline model, it can run on macOS (already supported), Windows, Android, iOS, and other devices.

📑 Research report

See BELLE/docs for details; it will be regularly updated with research reports related to this project.

Everyone is welcome to contribute more prompts through issues!
