PolyLM is a polyglot large language model that aims to address the following gaps and limitations in current LLM research.
Coverage of 18 of the most commonly spoken languages. PolyLM is proficient in the major non-English languages spoken worldwide, such as Spanish, Russian, Arabic, Japanese, Korean, Thai, Indonesian, and Chinese. It complements existing open-source models, including (1) LLaMA, whose training data is predominantly English, and (2) BLOOM, which does not cover languages spoken by significant populations, such as Japanese, Korean, and Thai.
Better multilingual instruction-following capability. We propose MULTIALPACA to complement ALPACA and CHINESEALPACA, helping LLMs better follow multilingual instructions, particularly those from non-native English speakers.
Strong performance. Compared with popular multilingual LLMs of similar model size, PolyLM performs strongly on a variety of tasks, including QA, understanding, and generation.