Fudan's version of ChatGPT is open source! The 16-billion-parameter MOSS gains a number of new capabilities
In February this year, we reported that Fudan University had launched a Chinese counterpart to ChatGPT, which attracted widespread attention. At the time, Professor Qiu Xipeng said the team would open-source MOSS in April. Yesterday, the open-source version of MOSS finally arrived.
MOSS is an open-source conversational language model that supports both Chinese and English as well as a variety of plug-ins, although its parameter count is far smaller than ChatGPT's. After v0.0.2, the team continued to refine the model and released MOSS v0.0.3, which is the version now being open-sourced.
Compared with the previous version, it also brings a number of functional updates.
In the initial public test, MOSS's basic functionality was similar to ChatGPT's: it can complete various natural language processing tasks according to the user's instructions, including text generation, text summarization, translation, code generation, and casual conversation. After the open beta, the team continued to add Chinese corpora to the pre-training:
"As of now, the base language model of MOSS 003 has been trained on 100B Chinese tokens, and the total number of training tokens has reached 700B, which also contains about 300B codes."
After the open beta, we also collected some user data. We found that user intent in the real Chinese-speaking world differs considerably from the user-prompt distribution disclosed in OpenAI's InstructGPT paper (this is related not only to the users' countries but also to the product itself: data collected by an early product contains many adversarial and test inputs). We therefore used this real data as seeds to regenerate about 1.1 million regular conversation samples, covering finer-grained helpfulness data and a wider range of harmlessness data.
Currently, the team has uploaded three models to Hugging Face: moss-moon-003-base, moss-moon-003-sft, and moss-moon-003-sft-plugin. Three more models will be open-sourced later.
According to the project homepage, the moss-moon series models have 16 billion parameters and can run on a single A100/A800 or two 3090 graphics cards at FP16 precision, or on a single 3090 graphics card at INT4/INT8 precision.
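For reference, loading one of these checkpoints with Hugging Face transformers is straightforward. The sketch below is only an illustration: it assumes the checkpoints are published under the fnlp organization on Hugging Face (model ID fnlp/moss-moon-003-sft) and shows FP16 loading on a single GPU, matching the hardware requirements above.

# Minimal sketch: load moss-moon-003-sft in FP16 on a single GPU.
# The model ID "fnlp/moss-moon-003-sft" is an assumption; check the project page for the exact repo names.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fnlp/moss-moon-003-sft"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # FP16 fits on a single A100/A800
    trust_remote_code=True,
).cuda().eval()
# The project homepage also describes running at INT4/INT8 precision on a single 3090;
# see the repository for the quantized setup.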
Plug-in enhancements give MOSS new abilities
In MOSS v0.0.3, the team constructed about 300,000 plug-in-augmented dialogue examples, covering search engines, text-to-image generation, calculators, equation solving, and more, bringing many new capabilities to MOSS. The team will explain how to use the plug-in version of MOSS on GitHub later.
So, let's take a quick look at what these plug-ins enable:
The ability to invoke a search engine.
The ability to call an equation solver.
The ability to generate images from text.
Regarding the plug-in capabilities, project author Sun Tianxiang added that MOSS 003's plug-in support is controlled through meta instructions, similar to the system prompt in gpt-3.5-turbo.
"Because it is model-controlled, it cannot guarantee a 100% control rate, and there are still some defects that cannot be called correctly when multiple plug-ins are selected, and plug-ins fight with each other. We are developing a new model as soon as possible to alleviate these problems."
How to install
Clone the MOSS repository to your local machine or a remote server:
git clone https://github.com/OpenLMLab/MOSS.git
cd MOSS
Create a conda environment:
conda create --name moss python=3.8
conda activate moss
Install the dependencies:
pip install -r requirements.txt
Note: the versions of torch and transformers should not be lower than the recommended versions.
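After installation, a minimal single-turn inference run might look like the sketch below. It assumes the moss-moon-003-sft checkpoint (model ID fnlp/moss-moon-003-sft on Hugging Face) and a simple <|Human|>/<|MOSS|> dialogue format; both are assumptions that should be verified against the repository's own examples.

# Minimal single-turn inference sketch (model ID and prompt format are assumptions; verify against the repo).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fnlp/moss-moon-003-sft"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, trust_remote_code=True
).cuda().eval()

prompt = "You are an AI assistant whose name is MOSS.\n<|Human|>: Hello! Please introduce yourself.<eoh>\n<|MOSS|>:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        do_sample=True,
        temperature=0.7,
        top_p=0.8,
        max_new_tokens=256,
    )
# Decode only the newly generated tokens, i.e. MOSS's reply.
reply = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(reply)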
Under its license, the open-source version of MOSS can be used for commercial purposes.
Developers have already started building on the open-source release, for example experimenting with video question answering through VideoChat.

You can experience it through this website.
VideoChat is a versatile video question-answering tool that combines the power of action recognition, visual captioning, and StableLM. The tool can generate dense, descriptive captions for any object and action in a video, and offers a range of language styles to suit different user preferences. It supports conversations of varying length, emotional tone, and degree of factuality.
Reference link:
MOSS project address: https://github.com/OpenLMLab/MOSS
video_chat_with_MOSS project address: https://github.com/OpenGVLab/Ask-Anything/tree/main/video_chat_with_MOSS