Large models compete in AI search, and "Tiangong AI Search" is the first in China to open for testing

Hayo News
August 30th, 2023
As large models overturn the foundations of the digital world, where will the first wave of applications land?

Lately, every new tech product touts being "powered by large models," and the technology race has turned fierce. Google, Microsoft, and Meta alike seem to have regained their youthful energy overnight.

As the technology advances rapidly, more and more people are discussing how large models will actually be applied. When it comes to real deployment, the first area to feel the impact is search, where Google has long held a dominant position.

Not long after ChatGPT was released, Microsoft, which had a first-mover advantage, chose a search engine as its first wave of large-model products. In the early morning of February 8, Beijing time, Microsoft made a major announcement, racing against the clock to bring large-model technology into its own search engine.

This time Google Search, dominant for decades, felt the "shock" of Microsoft's new Bing, which showed us that the AI search engine has become strategic ground for applying large-model technology.

Recently, the Chinese company Kunlun Wanwei joined the "AI search engine" battle, announcing Tiangong AI Search, the first search engine in China to incorporate a large language model. It has opened applications for internal testing and launched the app.

In this article, let's look at how Tiangong AI Search challenges traditional search and how it performs in practice.

Why start with search?

Kunlun Wanwei released the "Tiangong" large language model early on. Why did it choose the search engine as the setting for its first consumer-facing large-model product?

The underlying reason is the importance of search and the potential for innovation that large-model technology brings.

As the technology iterates rapidly, many tech companies have put forward "foundation models" on which developers can build commercial applications for their own needs. So far, however, the large-scale industrial transformation expected from them has yet to show results.

But in the consumer field, generative AI seems to have clearer application prospects. Since February this year, pioneers such as Microsoft, OpenAI, Google, and Baidu have all brought large-model capabilities into their own search engines, and users have welcomed the move.

The era of large models has arrived. What will change in our lives? After seeing the striking results of ChatGPT, we have all imagined the answer, whether soberly or with exaggeration. The consensus is that large models may become ubiquitous in tech companies' products, and that the more a job requires interacting with computers, the more intense the disruption will be.

Among the ways we interact with computers, search engines are a basic application that people barely notice. For a long time, the form of search has hardly changed, and people have increasingly gravitated toward the top few services.

With the arrival of large models, traditional search may be upended and this pattern broken. With AI technology that has undergone a qualitative change, a search that used to start with keywords becomes an instruction to "let the AI do the work." We no longer need to think about how to phrase a query or tediously sift through results for potentially useful content or entries. The AI handles the problem in one stop.

Drawing on the chain-of-thought (CoT) capability characteristic of large models, the new generation of search systems can understand both the questions people ask and the content retrieved, analyze the user's intent, sustain an effective back-and-forth, and generate meaningful content.

Put simply, AI now has a bit of "logic." It can truly serve as a personal assistant, become a traffic portal because it meets a large number of complex needs, and also act as a first-line productivity tool for solving problems at work.

With large-model search, we can expect that in the near future the demand for information will be largely met: letting AI integrate data can greatly improve the efficiency of acquiring knowledge, and AI generation can complete tasks at a speed that was previously unimaginable.

On the other hand, an AI that fully understands human intent can also connect to various services, so that itinerary planning and meeting minutes no longer take up our time, and it will grow smarter the more it is used.

If such a large-model application exists, isn't it the "super app" we have long imagined, one that can help us deal with the world?

A full AI search experience, and a more convenient one

Now that the product has launched, how well does it actually work?

The app is called "Tiangong AI Assistant." New users can download it, and existing users only need to update the app. Its interface is simple: tap the search box and ask any question you want answered. In addition, the "AI dialogue" function lets you chat with the Tiangong AI assistant, generate text, and use the other capabilities typical of large models.

Traditional search engines are mainly keyword-oriented. You enter text and get a large number of results matching the keywords, ranked by relevance (advertisements aside). But this approach may not give you the answer you really want. After all, even papers can have clickbait titles, and if you search with a long paragraph, search engines rarely take the logic of the input into account.

Tiangong AI Search focuses on natural language search: you ask questions in plain language, without stringing keywords together or using the "operators" taught in information retrieval courses, and you can ask whatever you want. Tiangong AI Search can work out your real intent and capture the context of the question, making the results more accurate and relevant.

It has also greatly changed how a search engine presents its results. Ask a question and you will see that the Tiangong AI Search interface is divided into three parts from top to bottom: references, answer, and follow-up.

This is exactly where Tiangong AI Search differs from traditional search: it first displays its reference sources, the ones most valuable for answering the question, and then, setting aside redundant and irrelevant information, generates a concise answer more efficiently and accurately.

Citing information sources alongside the answer is the first highlight of Tiangong AI Search. The listed references keep answers traceable and trustworthy, and each index number links directly to the original information. The sources are also varied, including news websites and Q&A platforms as well as institutions' official websites and videos.

At the bottom is the follow-up question function of Tiangong AI Search, which reflects the large model behind the search engine. It lets you carry on 20+ rounds of in-depth interaction around a question.

The strength of a search engine is that it returns timely, accurate information on demand, while the strength of a large model is that it breaks down the barrier between human and machine: it can communicate effectively, fully understand the context, and give an accurate response.

Next, I asked about the landmark paper Google published in 2017 that shaped the direction of natural language processing (NLP). Tiangong AI Search gave the paper's title, architectural principles, and impact, much like a summary of the paper.

We followed up: the Transformer, so prominent in NLP, has already spread to computer vision, so what is notable about Google's Vision Transformer (ViT) work? Tiangong AI Search described ViT's advantages over traditional convolutional and recurrent neural networks, namely better modeling ability and stronger interpretability, as well as its positive impact on the field of computer vision.

The authors of the original Transformer paper are now regarded as leading figures. How are they doing these days? We kept asking.

As you can see, the unlimited follow-up questions in Tiangong AI Search let you get to the very bottom of things and trace the ins and outs of any matter.

Beyond turning you into a "know-it-all" through follow-up questions, Tiangong AI Search, backed by large models, has strong abilities to integrate, distill, and connect information, so it handles open-ended questions with more ease and gives more meaningful answers.

This time I asked about a hot topic in the large-model field that remains unsettled: open source or closed source? Let's see what answer Tiangong AI Search gives. It first noted that no blanket rule applies, then listed the advantages of open source and closed source in detail, and finally suggested that enterprises and research institutions choose according to their own circumstances. The answer was very comprehensive.

A large-model search engine can not only answer questions but also grasp many details. Tiangong AI Search is stronger than traditional search on knowledge-oriented and creative queries.

For example, given a programming problem, it first explains the problem and then outputs code that solves it. The source links for the solution are listed as well.

You can also ask follow-up questions about the result to understand, step by step, how the code works.

Here is another, creative question: I want to use the generative tools Stable Diffusion and Runway to make a sci-fi blockbuster, but I don't know how. Tiangong AI Search gave very detailed preparation steps, clearly much faster than compiling them yourself.

As a follow-up, I want to write a story in which nuclear pollution of the ocean leads to human extinction, but I don't know how to write a script. I handed this to Tiangong AI Search as well, and the answer was again clear and logical.

Timeliness is an important requirement for search engines, and Tiangong AI Search does particularly well here. It treats the entire web as its database to keep its output up to date.

For example, I wanted to know about Code Llama, the large code model Meta released last Friday, and about the model built on it that surpasses GPT-4. Tiangong AI Search told us that the model surpassing GPT-4 is WizardCoder 34B and reported its pass rate on a single generation attempt.

Finally, a very user-friendly touch in Tiangong AI Search: the results of every round are not lost but saved in "My History," so you can look back at past searches at any time. The history is also synchronized across all clients.

Tiangong large model and AI enhancement technology

What technology lies behind Tiangong AI Search, which looks so useful? Its most important support is "Tiangong," the 100-billion-scale large language model Kunlun Wanwei launched earlier.

As China's first "dual 100-billion-scale" large language model benchmarked against ChatGPT, "Tiangong" is deployed on a leading domestic GPU cluster and combines a 100-billion-scale pre-trained base model with a 100-billion-scale RLHF model. As a result, the model has strong natural language processing and intelligent interaction capabilities. Backed by a rich store of knowledge, it can meet diverse generative AI needs such as knowledge Q&A, copywriting, logical reasoning, mathematical calculation, and coding.

Kunlun Wanwei said that the new generation of search engines is becoming smarter by drawing on large models. Conversely, grounding generation in real-time search results reduces the probability of hallucinations and similar problems in the large model's output. Behind Tiangong AI Search, Kunlun Wanwei has made improvements from several angles, overhauling the traditional search engine experience.

Specifically, the improvement is mainly reflected in five aspects:

Intent recognition and understanding: With traditional search engines, users often have to rephrase a query several times. Tiangong AI Search uses a large model to rewrite the user's query before retrieval, which both uncovers the user's real intent and captures the context of the query, yielding more accurate and relevant search results.

Intelligent summarization: For open-ended questions, it uses Dense Passage Retrieval (DPR), in which a dual-encoder model encodes the question and potentially relevant documents (such as wiki pages or forum articles) and computes the similarity between them, to ensure that highly relevant documents and key passages are retrieved accurately.

Vector semantic retrieval: Kunlun Wanwei has built a large-scale real-time vector retrieval system for the search engine. It plays a role at several stages of search, including locating content precisely, increasing the diversity of content, and improving contextual coherence. By recalling results from the user's earlier queries, it makes search results more coherent with the ongoing interaction, creating a more natural and fluid conversational search experience.

Intelligent questioning technology: This supports the unlimited follow-up questions in Tiangong AI Search. Kunlun Wanwei said that the core of the technology is to fully understand the user's query and to ask questions when more information is needed. Implementing it depends on the steps of "intent recognition, information completeness detection, question generation, user feedback reception, dynamic adjustment and learning, context awareness," and also requires continuous training on large amounts of data such as dialogues, user query logs, and feedback on follow-up questions. It needs ongoing iteration and optimization as well. By pinning down the user's needs over multiple rounds, it keeps the answer on the right track.

In addition, Tiangong AI Search implements cross-language information retrieval (CLIR). Even if you ask in Chinese, the information the AI draws on when generating content is not limited to Chinese, and when the results are presented, all of it has been translated and integrated into Chinese. This approach not only greatly expands the knowledge boundary of search but also ensures that users can access the latest and most comprehensive global information and research results.
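
The dual-encoder retrieval mentioned under intelligent summarization can be illustrated with a small sketch. It is not Kunlun Wanwei's implementation: in DPR the question and the passages are encoded by trained neural encoders, whereas here a toy bag-of-words encoder stands in for both. The names `toy_encode`, `dot`, and `retrieve`, as well as the sample passages, are hypothetical. What the sketch keeps is the overall shape of the technique: encode the question and each passage independently into vectors, then rank passages by inner product.

```python
import math
import re
from collections import Counter


def toy_encode(text):
    """Hypothetical stand-in for a trained encoder.

    Returns an L2-normalized bag-of-words vector as a sparse dict.
    A real DPR system would use neural encoders here.
    """
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {token: c / norm for token, c in counts.items()}


def dot(u, v):
    """Inner product of two sparse vectors stored as dicts."""
    if len(u) > len(v):
        u, v = v, u
    return sum(weight * v.get(token, 0.0) for token, weight in u.items())


def retrieve(question, passages, top_k=2):
    """Score every passage against the question and return the top_k."""
    q_vec = toy_encode(question)
    scored = [(dot(q_vec, toy_encode(p)), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Illustrative passages, invented for this sketch.
passages = [
    "The Transformer architecture relies on self-attention instead of recurrence.",
    "Stable Diffusion generates images from text prompts.",
    "Code Llama is a large language model for code released by Meta.",
]

results = retrieve("Which model for code did Meta release?", passages, top_k=1)
for score, passage in results:
    print(f"{score:.3f}  {passage}")
```

Because passages are encoded independently of the question, their vectors can be computed ahead of time and stored in a vector index, which is what makes this style of retrieval practical at web scale.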

More importantly, Tiangong AI Search automatically filters out paid webpages and invalid information, shows no advertisements, and places only valid reference links at the top.

With these capabilities, AI search can not only understand long, difficult sentences but also gather information from across the global web and organize it into clear, logical answers. With your feedback, it can keep improving. A general-purpose AI that can solve all problems has begun to take shape.

Perhaps this is how super apps begin.

Reprinted from 机器之心