The Aquila language model is the first open-source language model that combines Chinese-English bilingual knowledge, a license that permits commercial use, and compliance with domestic data requirements.

  • 🌟 Open-source license that permits commercial use. The source code of the Aquila series models is released under the Apache 2.0 license, and the model weights are released under the "Zhiyuan Aquila Series Model License Agreement". Users may use the models for commercial purposes as long as they comply with the license terms.
  • ✍️ Knowledge of both Chinese and English. The Aquila series models are trained from scratch on high-quality Chinese and English corpora, with Chinese accounting for about 40% of the data. This ensures that the model accumulates native Chinese world knowledge during pre-training, rather than knowledge obtained through translation.
  • 👮‍♀️ Meets domestic data compliance requirements. The Chinese corpus of the Aquila series models comes from the Chinese datasets that Zhiyuan has accumulated over many years. It includes Chinese Internet data from more than 10,000 website sources (more than 99% of which are domestic), high-quality Chinese literature data supported by authoritative domestic organizations, Chinese book data, and more. We continue to accumulate high-quality, diverse datasets and will add them to subsequent training of the Aquila base model.
  • 🎯 Continuous iteration, open source and open. We will keep improving the training data, optimizing the training methods, and raising model performance. Building on a stronger base model, we will grow a flourishing "model tree" and continue to open-source and release newer versions.

More details about the Wudao·Aquila model will be presented in the official technical report. Please follow the official channels for updates, including the FlagAI GitHub repository, the FlagAI Zhihu account, the FlagAI official technical exchange group, the Zhiyuan Research Institute WeChat official account, and the Zhiyuan Community WeChat official account.

