Today, Stability AI released a new open-source language model, StableLM. The Alpha version of the model is available in 3 billion and 7 billion parameter sizes, with 15 billion to 65 billion parameter models to follow. Developers can freely inspect, use, and adapt our StableLM base models for commercial or research purposes, subject to the terms of the CC BY-SA 4.0 license.
In 2022, Stability AI drove the public release of Stable Diffusion, a revolutionary image model that represents a transparent, open, and scalable alternative to proprietary AI. With the launch of the StableLM suite of models, Stability AI is continuing to make foundational AI technology accessible to all. Our StableLM models can generate text and code and will power a range of downstream applications. They demonstrate how small and efficient models can deliver high performance with appropriate training.
The release of StableLM builds on our experience in open-sourcing earlier language models with EleutherAI, a nonprofit research hub. These language models include GPT-J, GPT-NeoX, and the Pythia suite, which were trained on The Pile open-source dataset. Many recent open-source language models continue to build on these efforts, including Cerebras-GPT and Dolly-2.
StableLM is trained on a new experimental dataset built on The Pile but three times larger, containing 1.5 trillion tokens of content. We will release details on the dataset in due course. The richness of this dataset gives StableLM surprisingly high performance in conversational and coding tasks, despite its small size of 3 to 7 billion parameters (by comparison, GPT-3 has 175 billion parameters).
We are also releasing a set of research models that are instruction fine-tuned. Initially, these fine-tuned models will use a combination of five recent open-source datasets for conversational agents: Alpaca, GPT4All, Dolly, ShareGPT, and HH. These fine-tuned models are intended for research use only and are released under a noncommercial CC BY-NC-SA 4.0 license, in line with Stanford’s Alpaca license.
Language models will form the backbone of our digital economy, and we want everyone to have a voice in their design. Models like StableLM demonstrate our commitment to AI technology that is transparent, accessible, and supportive:
Transparent. We open-source our models to promote transparency and foster trust. Researchers can “look under the hood” to verify performance, work on interpretability techniques, identify potential risks, and help develop safeguards. Organizations across the public and private sectors can adapt (“fine-tune”) these open-source models for their own applications without sharing their sensitive data or giving up control of their AI capabilities.
Accessible. We design for the edge so that everyday users can run our models on local devices. Using these models, developers can build independent applications compatible with widely-available hardware instead of relying on proprietary services from one or two companies. In this way, the economic benefits of AI are shared by a broad community of users and developers. Open, fine-grained access to our models allows the broad research and academic community to develop interpretability and safety techniques beyond what is possible with closed models.
Supportive. We build models to support our users, not replace them. We are focused on efficient, specialized, and practical AI performance – not a quest for god-like intelligence. We develop tools that help everyday people and everyday firms use AI to unlock creativity, boost their productivity, and open up new economic opportunities.
The models are now available in our GitHub repository. We will publish a full technical report in the near future, and look forward to ongoing collaboration with developers and researchers as we roll out the StableLM suite. In addition, we will be kicking off our crowd-sourced RLHF program, and working with community efforts such as Open Assistant to create an open-source dataset for AI assistants.
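For illustration, here is a minimal sketch of sampling from one of the base Alpha checkpoints with the Hugging Face transformers library. The model ID, half-precision setting, and sampling parameters below are assumptions chosen for illustration rather than a prescribed setup; consult the repository for the supported loading instructions.

```python
# A minimal sketch of loading and sampling from a StableLM Alpha checkpoint
# with Hugging Face transformers. ASSUMPTIONS: the Hub model ID
# "stabilityai/stablelm-base-alpha-7b", the fp16 dtype, and the sampling
# settings are illustrative and may differ from the actual release.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-base-alpha-7b"  # assumed Hub model ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so a 7B model fits on consumer GPUs
    device_map="auto",          # place layers on whatever devices are available
)

prompt = "What will the future of open-source language models look like?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sampling settings trade coherence for variety; adjust to taste.
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```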