Sequoia Capital published an analysis: how are companies implementing AI applications?
The article summarizes interviews with 33 companies in Sequoia's portfolio, ranging from seed-stage startups to large public companies. It makes 8 points of analysis, all of them substantive. This translation is based on ChatGPT's output, with summary commentary added by the Twitter blogger @Will 3.6-6.16 Silicon Valley, in the hope of being helpful to AI entrepreneurs.
ChatGPT has put large language models (LLMs) in the spotlight, triggering a wave of innovation. More companies than ever are bringing the ability to interact through natural language into their products. Adoption of language model APIs is creating a new stack. To better understand the applications companies are building and the stacks they are using, we talked to 33 companies in the Sequoia network, ranging from seed-stage startups to large public companies. We spoke to them two months ago and again last week to capture the pace of change. Many founders and developers are still working out their AI strategies, so we want to share our research even as the field rapidly evolves.
1. Almost all companies use language models in their products
· Almost every company is building language models into their products
· From code and data science co-pilots to chatbots for customers, developers, employees, and pure entertainment.
· Many companies are reimagining entire workflows with an AI-first lens.
· These are just a few examples, and they are just the beginning.
Almost every company in the Sequoia Network is incorporating language models into their products. We've seen all sorts of amazing autocompletion for everything from code (Sourcegraph, Warp, Github) to data science (Hex). We're seeing better chatbots, from customer support to employee support to consumer entertainment. Others are reimagining entire workflows with an AI-first lens: visual arts (Midjourney), marketing (Hubspot, Attentive, Drift, Jasper, Copy, Writer), sales (Gong), contact centers (Cresta), legal (Ironclad, Harvey), accounting (Pilot), productivity (Notion), data engineering (dbt), search (Glean, Neeva), grocery shopping (Instacart), consumer payments (Klarna) and travel planning (Airbnb). These are just a few examples, and it's just the beginning.
2. These applications are mainly based on API, retrieval and orchestration, but the use of open source is also growing
- 65% of companies have applications in production
- 94% of companies are using a base model API
- 88% of companies believe retrieval mechanisms will remain a key part of their stack
- 38% of companies are interested in frameworks like LangChain
- 15% of companies built custom language models from scratch or with open-source resources
- Every practitioner we talked to said AI is moving too fast to have high confidence in the final tech stack
- But there is consensus that the LLM API will continue to serve as a key pillar
- Next come retrieval mechanisms and development frameworks like LangChain
- Training and tuning of open-source and custom models also appear to be growing
- Other areas of the language model stack are also important but less mature
The new stack for these applications is centered around language model APIs, retrieval and orchestration, but open source usage is also growing.
⭐️ 65% of companies have these applications in production, up from 50% two months ago; the rest are still experimenting.
⭐️ 94% of companies use a base model API. OpenAI's GPT is clearly the most popular model in our sample at 91% usage, but interest in Anthropic has also grown to 15% over the past quarter (some companies use more than one model).
⭐️ 88% of companies believe retrieval mechanisms, such as vector databases, will remain a key part of their stack. Retrieving context relevant to a model's inference helps improve result quality, reduce "hallucinations" (inaccuracies), and address data freshness. Some companies use purpose-built vector databases (Pinecone, Weaviate, Chroma, Qdrant, Milvus, etc.), while others use pgvector or services provided by AWS.
⭐️ 38% of companies are interested in LLM orchestration and application development frameworks like LangChain. Some use it for prototyping, while others use it in production; adoption has increased in recent months. A handful of companies are working with complementary generative technologies, such as combining generated text and speech, which we also see as an exciting growth area.
⭐️ 15% of companies built custom language models from scratch or with open-source technologies, often alongside an LLM API. This is up meaningfully from a few months ago. Custom training requires a separate stack of compute, model libraries, hosting, training frameworks, experiment tracking, and more; companies such as Hugging Face, Replicate, Foundry, Tecton, Weights & Biases, PyTorch, and Scale provide related services.
Every practitioner said that AI is developing too fast to have a high degree of confidence in the final stack, but there is a consensus that the LLM API will continue to be the key pillar, followed by retrieval mechanisms and development frameworks like LangChain. Open source and custom model training and tuning also appear to be growing. Other areas of the stack are also important but less mature.
3. Companies want to tailor language models to their context
Currently, there are three main ways to customize a language model:
- Train a custom model from scratch (most difficult).
- Fine-tune a base model (moderate difficulty).
- Use a pre-trained model and retrieve relevant context (least difficult).
Companies want to tailor language models to their unique context. Generic language models are powerful, but not differentiated or sufficient for many use cases. Companies want to enable natural language interaction on their data - developer documentation, product listings, HR or IT rules, etc. In some cases, companies also want to customize their models based on their user data: your personal notes, design layouts, data metrics, or code bases.
Today, there are three main ways to customize a language model (see Andrej's recent State of GPT talk at Microsoft Build for a more in-depth technical explanation):
⭐️ Train a custom model from scratch. The most difficult. This is the traditional and hardest way to solve the problem. It often requires highly skilled ML scientists, large amounts of relevant data, training infrastructure, and compute. This is one of the main reasons much NLP innovation has historically happened inside hyperscale technology companies. BloombergGPT is a great example of a custom model built outside the hyperscalers, using Hugging Face and other open-source tools. As open-source tooling improves and more companies innovate with LLMs, we expect to see more custom and pre-trained models.
⭐️ Fine-tune a base model. Moderate difficulty. This means updating the weights of a pre-trained model by training it further on proprietary or domain-specific data. Open-source innovation is making this approach increasingly accessible to specialist teams, but it still often requires a sophisticated team. Some practitioners privately admit that fine-tuning is harder than it sounds: it can have unintended consequences like model drift, and can "break" the model in other ways without warning. While this approach is likely to become more common, it is not yet accessible to most companies. However, that is changing rapidly.
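To make "updating the weights" concrete, here is a toy sketch (all names and numbers are invented for illustration): a tiny linear "model" whose pre-trained weights are nudged by gradient steps on new domain data. Real fine-tuning applies the same idea to billions of weights using a framework like PyTorch.

```python
# Toy illustration of fine-tuning: start from "pre-trained" weights
# and update them with gradient steps on new, domain-specific data.

def predict(w: float, b: float, x: float) -> float:
    return w * x + b

def fine_tune(w, b, data, lr=0.1, epochs=200):
    """Stochastic gradient descent on squared error."""
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, b, x) - y
            w -= lr * err * x  # gradient of (err**2)/2 w.r.t. w
            b -= lr * err      # gradient of (err**2)/2 w.r.t. b
    return w, b

# "Pre-trained" weights learned elsewhere
w0, b0 = 1.0, 0.0
# Proprietary data that happens to follow y = 2x + 1
domain_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w1, b1 = fine_tune(w0, b0, domain_data)
# w1, b1 should now be close to 2 and 1
```

The pitfalls practitioners mention (drift, silent breakage) arise because real models have vastly more parameters than new data points, so updates that fit the new data can degrade older capabilities.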
⭐️ Use a pre-trained model and retrieve relevant context. The least difficult. People often think they want a model fine-tuned on their data, when in reality they just want the model to reason about their information at the right time. There are many ways to provide the right information: structured queries against SQL databases, searching a product catalog, calling an external API, or embedding-based search. The benefit of embedding-based retrieval is that unstructured data can be searched easily using natural language. Technically, this works by converting the data into embeddings, storing them in a vector database, and, when a query arrives, searching those embeddings for the most relevant context, which is then fed to the model. This approach helps work around the model's finite context window, is less expensive, addresses data freshness (e.g. ChatGPT knows nothing about the world after September 2021), and can be built by a single developer with no formal machine learning training.
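The retrieval flow just described can be sketched in a few lines. This is a deliberately minimal, self-contained illustration: the bag-of-words `embed` function and in-memory `VectorStore` are stand-ins for a real embedding model and vector database, and all data here is invented for the example.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector.
    A real system would call a learned embedding model instead."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class VectorStore:
    """In-memory stand-in for a vector database (Pinecone, pgvector, ...)."""
    def __init__(self):
        self.items = []  # (embedding, original text) pairs

    def add(self, text: str):
        self.items.append((embed(text), text))

    def search(self, query: str, k: int = 1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[0]), reverse=True)
        return [text for _, text in ranked[:k]]

# Index some documents, then retrieve context for a user query
store = VectorStore()
store.add("Refunds are processed within 5 business days.")
store.add("Our office is closed on public holidays.")

context = store.search("how long do refunds take")[0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: how long do refunds take?"
# `prompt` would now be sent to the LLM API.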
Vector databases are useful because they make it easy to store, search, and update embeddings at scale. So far we've observed that larger companies tend to follow their enterprise cloud agreements and use their cloud provider's tools, while startups tend to use purpose-built vector databases. However, the field is very active: context windows keep growing (OpenAI just expanded to 16K tokens, and Anthropic introduced a 100K-token context window), and base model providers and cloud databases may build embedding retrieval directly into their services. We will watch this market closely.
4. The LLM API stack and custom model training are separate today, but will gradually merge
We may feel that we face a choice between two technology stacks:
- one that builds on LLM APIs,
- another for training custom language models.
More and more companies are interested in training and fine-tuning their own models. Over time, the LLM API stack and the custom model stack will become increasingly integrated.
Today, the LLM API stack may feel separate from the custom model training stack, but these are gradually merging together.
Sometimes these feel like two different stacks: one for utilizing LLM APIs (more closed-source, aimed mainly at developers), and another for training custom language models (more open-source, historically geared toward more sophisticated machine learning teams). Some have wondered whether the ease of accessing LLMs through APIs means companies will do less custom model training. So far, we've seen the opposite.
As interest in AI continues to grow and open-source development accelerates, many companies are becoming more interested in training and fine-tuning their own models. We think the LLM API and custom model stacks will gradually converge. For example, a company might train its own language model from open source, but use vector database retrieval to account for data freshness.
Smart startups building tools for custom model stacks are also working to expand their offerings to be more relevant to the LLM API revolution.
5. Tech stacks are becoming more accessible to developers
LangChain helps developers build LLM applications by abstracting common problems:
- combining models into higher-level systems,
- chaining together multiple model calls,
- connecting models to tools and data sources,
- building agents capable of manipulating those tools,
- and helping avoid vendor lock-in by simplifying the process of switching language models.
This stack is becoming more developer-friendly. Language model APIs put powerful out-of-the-box models in the hands of ordinary developers, not just machine learning teams. Now that the population using language models has expanded dramatically to all developers, we expect to see more developer-focused tools. For example, LangChain helps developers build LLM applications by abstracting away common problems: composing models into higher-level systems, chaining together calls to multiple models, connecting models with tools and data sources, building agents that can operate those tools, and helping avoid vendor lock-in by making it easy to switch language models. Some people use LangChain for prototyping, while others continue to use it in production.
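The chaining idea can be illustrated without the framework itself. The sketch below (all names invented for the example) hand-rolls the pattern that frameworks like LangChain abstract: each step's output feeds the next step, with plain functions standing in for real model calls.

```python
from typing import Callable, List

class Chain:
    """Minimal illustration of call chaining: run steps in order,
    piping each step's output into the next step's input."""
    def __init__(self, steps: List[Callable[[str], str]]):
        self.steps = steps

    def run(self, text: str) -> str:
        for step in self.steps:
            text = step(text)
        return text

def summarize(text: str) -> str:
    # Stand-in for a "summarize this" model call
    return text.split(".")[0] + "."

def shout(text: str) -> str:
    # Stand-in for a second model call that restyles the text
    return text.upper()

pipeline = Chain([summarize, shout])
result = pipeline.run("LLMs are powerful. They are also new.")
```

Because each step only sees plain text in and text out, swapping one model provider for another only changes a single step, which is the vendor lock-in benefit mentioned above.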
6. Language models need to become more reliable (output quality, data privacy and security)
Many companies want better tools for handling data privacy, isolation, security, copyright, and monitoring model output.
Companies in regulated industries, from fintech to healthcare, are particularly concerned about this.
Language models will be better trusted as policies are clarified and more safeguards are put in place.
Language models need to become more trustworthy (output quality, data privacy, security) to be fully adopted. Before fully unleashing LLMs in their applications, many companies want better tools for handling data privacy, isolation, security, copyright, and monitoring of model output. Companies in regulated industries, from fintech to healthcare, are particularly concerned and report difficulty finding software solutions (an attractive area for entrepreneurs). Ideally, software would warn of, if not prevent, models producing errors/hallucinations, discriminatory content, dangerous content, or other problems. Some companies also worry about how data shared with the models is used for training: for example, few people know that consumer ChatGPT data is used for training by default, while ChatGPT Business and API data are not. As policies are clarified, more safeguards are put in place, and language models become more trusted, we may see another step change in adoption.
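As a rough illustration of the output monitoring described above, here is a minimal sketch (the patterns and names are invented for the example). A production system would rely on trained classifiers or a moderation API rather than keyword matching, but the warn-or-block shape is the same.

```python
import re

# Hypothetical blocklist for this example; real systems would use
# trained classifiers or a moderation service, not regex matching.
BLOCKED_PATTERNS = [
    r"\bssn\b",                 # mentions of social security numbers
    r"\b\d{3}-\d{2}-\d{4}\b",   # SSN-shaped digit patterns
]

def check_output(text: str) -> tuple[bool, list[str]]:
    """Return (is_safe, list of matched patterns) for a model output.
    The caller can warn the user or block the response when unsafe."""
    hits = [p for p in BLOCKED_PATTERNS if re.search(p, text, re.IGNORECASE)]
    return (not hits, hits)

safe, hits = check_output("Your SSN is 123-45-6789.")
# safe is False here: both the keyword and the digit pattern match
```

Running such a check between the model and the user is one place the software safeguards companies are asking for could sit.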
7. Language model applications will become increasingly multimodal
Language model applications will become increasingly multimodal. Companies have found interesting ways to combine multiple generative models to great effect: chatbots that combine text and speech generation unlock entirely new conversational experiences, and text and speech models can be combined to patch an error in a video recording instead of re-recording the whole thing. The models themselves are also becoming increasingly multimodal. We can imagine a future rich in consumer and enterprise AI applications that combine text, speech/audio, and image/video generation to create more engaging user experiences and accomplish more complex tasks.
8. It’s still early days
AI is just starting to permeate every corner of technology: only 65% of respondents have applications in production today, and many of these are relatively simple.
The infrastructure layer (which the translator takes to mean the middleware layer) will develop rapidly over the next few years.
It's still early days. Artificial intelligence is just starting to permeate every corner of technology. Only 65 percent of respondents have applications in production, and many of those are relatively simple. As more companies roll out LLM applications, new hurdles will arise, creating more entrepreneurial opportunities. The infrastructure layer will continue to evolve rapidly over the next few years. If even half of the demos we've seen make it to production, we're in for an exciting ride. It has been exciting to see founders, from our earliest Arc investment to Zoom's founders, all focused on the same thing: using artificial intelligence to delight users.
If you are starting a company that will be a key pillar of the language model stack or AI-oriented applications, Sequoia would love to meet you.
Thanks to all the founders and builders who contributed to this work, as well as Sequoia partners Charlie Curnin, Pat Grady, Sonya Huang, Andrew Reed, Bill Coughran, and friends at OpenAI for their comments and reviews.
Original article: https://www.sequoiacap.com/article/llm-stack-perspective/