The Stable Diffusion team releases a large language model: is the era of an open source ChatGPT coming?
Stable Diffusion, the open source model released by Stability AI, is currently one of the most widely used and influential AI image-generation models.
Building on Stable Diffusion, the developer community has created many interesting plug-ins and models, such as the ControlNet project, which can control the shapes and composition of generated images, along with more than 1,000 related development projects.
Now this AI company, which is keen on open source, wants to do another big thing: release an open source large language model similar to ChatGPT.
A large language model for everyone
2023 can fairly be called a boom year for large language models. In the past few months, a new large language model has been released almost every week. Large models, small models, text generation, multimodal, closed source, open source... This is the springtime of large language models, with every approach flourishing at once. The enthusiasm belongs not only to Internet companies such as Microsoft, Google, Baidu, and Alibaba, but to AI-related technology companies of every kind.
Compared with the existing large models, what is special about the StableLM large language model?
According to the official introduction, StableLM is an open source, transparent model that researchers and developers can freely inspect, use, and modify. Just as with Stable Diffusion, users can freely configure StableLM to create a large language model tailored to their needs. The current StableLM Alpha release comes in two sizes, with 3 billion and 7 billion parameters. In the future, Stability AI will also provide versions with 15 billion and 65 billion parameters.
Although StableLM is much smaller than the GPT-3 model with its 175 billion parameters, Stability AI said that StableLM is trained on an expanded data set that is 3 times larger than The Pile data set, and that it performs well in natural-language dialogue.
The Pile data set itself already includes a large number of books, GitHub repositories, web pages, chat logs, and other data, and also collects papers in medicine, physics, mathematics, computer science, and philosophy. It provides a good baseline for training general-purpose large language models and for cross-domain text generation. Therefore, in actual use, the gap between StableLM and GPT-3 is not as obvious as the difference in parameter counts on paper would suggest.
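To put the parameter counts quoted above in perspective, the following is a minimal back-of-the-envelope sketch in Python. It assumes weights are stored as 16-bit floats (2 bytes per parameter) and counts weights only, ignoring activations and any runtime overhead. The function names `weight_memory_gb` and `size_ratio` are hypothetical helpers written for this illustration, not part of any StableLM tooling.

```python
# Parameter counts quoted in the article, in billions.
MODELS = {
    "StableLM Alpha 3B": 3,
    "StableLM Alpha 7B": 7,
    "StableLM 15B (planned)": 15,
    "StableLM 65B (planned)": 65,
    "GPT-3": 175,
}

# Assumption: weights are stored as 16-bit floats, i.e. 2 bytes each.
BYTES_PER_PARAM_FP16 = 2


def weight_memory_gb(params_billions, bytes_per_param=BYTES_PER_PARAM_FP16):
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    # (params_billions * 1e9 parameters) * bytes_per_param / 1e9 bytes per GB
    return params_billions * bytes_per_param


def size_ratio(larger, smaller):
    """How many times more parameters `larger` has than `smaller`."""
    return MODELS[larger] / MODELS[smaller]


if __name__ == "__main__":
    for name, billions in MODELS.items():
        gb = weight_memory_gb(billions)
        print(f"{name:<24} {billions:>4}B params  ~{gb:>4} GB of weights")
    ratio = size_ratio("GPT-3", "StableLM Alpha 7B")
    print(f"GPT-3 has {ratio:.0f}x the parameters of StableLM Alpha 7B")
```

Under these assumptions the 7-billion-parameter model needs roughly 14 GB for its weights, while a 175-billion-parameter model needs roughly 350 GB, which is one reason the smaller sizes are practical to run locally.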
For example, in dialogue, if you ask "What would you say to a friend who is about to graduate from high school?", StableLM will answer:
You should be proud of yourself and what you have achieved, and you should have expectations for the future.
StableLM also handles tasks like "writing an email" with ease.
In terms of creative writing, given a prompt such as "Write an epic rap battle song between deep learning neural network and symbolic artificial intelligence", StableLM can write rap lyrics in seconds.
In addition, Stability AI showed some examples of "unconventional paths", such as writing a C program that can compute the meaning of life.
Stability AI also hosts StableLM on the Hugging Face community website, where users who want to try it can chat with the model and test it.
In a short test we conducted, we found that StableLM's Chinese is not very good, and it falls far short of a top performer like ChatGPT. We therefore recommend using English when talking to it.
How far is it from ChatGPT?
So how does it compare to ChatGPT? At least for now, it's best not to compare them. In fact, the factual accuracy of its output is almost nonexistent. In one example, it claims that on January 6, 2021, Trump supporters took control of the Legislature, which is a dangerously misleading account of recent events.
And ChatGPT answered this way:
Closed source or open source?
Like many open source large language models, including Stanford University's Alpaca, StableLM gives developers the opportunity to freely customize a large language model locally or on their own servers, without worrying about data leaking to the model provider's back end.
After ChatGPT became popular, questions about AI model data privacy arose one after another. Not long ago, it was reported that several Samsung employees had leaked confidential data to ChatGPT, prompting Samsung Semiconductor to decide to develop its own internal AI tools so that similar problems would not happen again.
Beyond the advantage of high transparency, the open source model also makes it easier for developers to build more creative applications. For example, users can customize StableLM to create a tireless web writer, or a senior programmer or copywriter who is familiar with company projects. You could also tune it into a horoscope expert for Weibo.
The open source model gives developers more creative room, but it also hands more advanced tools to bad actors. For those with ulterior motives, open source large language models may become a powerful tool for telecom fraud: they can realistically simulate conversations and defraud people of their property.
Open source technology is always accompanied by controversy, which Stability AI has long expected. Stability AI has faced numerous copyright infringement lawsuits over its open source Stable Diffusion, as well as controversy over users generating pornography with the tool.
Emad Mostaque, CEO of Stability AI, said in a previous interview that large models need to be subject to more oversight, rather than being locked in a small black box by large companies. The openness of large models is therefore very important to the community. Stability AI insists on open source in order to bring the technology to more people and to encourage more reflection on it.
StableLM is yet another instance of Stability AI keeping that promise, and it has the potential to open a new chapter in a future where everyone has their own language model.