Microsoft officially released a tutorial in person, grasping the advanced gameplay of "Prompt Project"
With the big model, the next step is to design the prompt.
Over the past few months, large models such as ChatGPT and GPT-4 have been released one after another. These models exhibit a strong emergent ability, but the results generated by the models are random, good and bad, part of the reason is closely related to the design of Prompt.
Many people compare Prompt to the spell of a large model, which has a great influence on guiding the model to generate content. How to choose Prompt has become a concern of every AI researcher. Recently, Microsoft officially released a tutorial, which introduces some advanced gameplay in Prompt design and engineering, covering system messages, few-sample learning, non-chat scenarios, etc.
Each part of the content has a technical introduction and example display. Let's take a look at the specific content below.
About Prompt, you should know these
System messages are included at the beginning of the Prompt to provide the model with context, instructions, or other information relevant to the use case. Users can describe what the model should and should not answer through system messages, and define the format of the model's reply.
The following figure shows an example of a system message and a model-generated reply:
Usually, system messages can also look like this:
- The Assistant in the above table is a large language model trained by OpenAI.
- Assistant is an intelligent chatbot designed to help users answer questions. Require the model to answer questions only using the given context, and if you are unsure of the answer, you can say "I don't know".
- Assistant is an intelligent chatbot that, for example, helps users answer tax-related questions.
- Another example is that you are an Assistant designed to extract entities from text. The user will paste a string of text, and you will respond in the form of a JSON object with the entities you extracted from the text.
Here is an example of the output format:
The above is about the introduction of system messages, but an important detail is that even with well-designed system messages, the model may still generate error replies that contradict the system message instructions.
A common way to adapt a language model to a new task is to use few-shot learning. Few-shot learning provides a set of training samples as part of a prompt to provide additional contextual information to the model.
A sequence of messages (written in the new Prompt format) between the user and the Assistant can serve as an example for few-shot learning. These examples can be used to guide models to respond in a certain way, simulate specific behaviors, and provide seed answers to common questions.
The basic structure of a prompt.
Although the main application scenario of the current large model is the dialog generation scenario, it can also be used in non-dialog scenarios. For example, for a sentiment analysis scenario, you might use the following prompt:
use explicit instructions
In general, the order in which information appears in a prompt is important. Since GPT-like models are built in a specific way, the building process defines what the model does with the input. Research has shown that telling the model what you want it to do at the beginning of a prompt, before sharing additional contextual information or examples, can help the model produce higher-quality output.
Repeat the command one last time
Models are susceptible to recent biases, in which case the end prompt information may have a greater impact on the output than the beginning prompt information. Therefore, repeating instructions at the end of the prompt is worth a try.
operations on the output
This is the case when several words or phrases are included at the end of the Prompt to obtain a model response that conforms to the desired form. For example, using a prompt such as "Here's a bulleted list of key points: 🧥🧳🧥-" can help ensure that the output is formatted as a bulleted list.
Add syntax to prompts, such as punctuation, headings, etc. Doing so makes the output easier to parse.
In the example below, a separator (in this case ---) is added between different sources of information or steps. This operation allows the use of --- as a stop condition for generation. Also, some headings or special variables are capitalized for distinction.
break down tasks
Large language models (LLMs) often perform better if the task is broken down into smaller steps.
Note that syntax is used here to distinguish the various parts and to initialize the output. Breaking down the task from one step into two steps is not immediately obvious in this simple example, but when trying to do this for large blocks of text containing many factual statements, breaking the task down makes a significant difference.
chain of thought tips
This is a variation of the split task technique. In this approach, instead of splitting the task into smaller steps, the model is instructed to respond step-by-step and present all involved steps. Doing so reduces inaccurate results and makes it easier to evaluate model responses.
provide real context
Under this method, this paper proposes to provide real data to the model. In general, the closer the raw data is to the final answer, the less work the model has to do, which means the less chance the model has to make mistakes. In the example below, the system message provides the latest articles, and then asks the model to give some early customers, and the model gives the answer accurately.
In addition, Microsoft also introduced other tips about Prompt in this guide. You can go to the original text to get more information.