Chain of Thought Prompting

diodio
March 22nd, 2023

Chain of Thought (CoT) prompting is a novel method for generating prompts that encourages the large language model (LLM) to provide an explanation of its reasoning. The image below illustrates a few-shot standard prompt on the left and a chain of thought prompt on the right.

The key concept behind CoT is that by presenting the LLM with few-shot exemplars that demonstrate a clear explanation of the reasoning process, the LLM will also produce its own reasoning process when generating responses to the prompt. This explicit reasoning often results in more accurate outcomes.
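As a rough illustration of how such a prompt can be assembled, here is a minimal sketch in Python. The `Exemplar` class and the `build_cot_prompt` function are hypothetical names made up for this example, not part of any library. The sketch only builds the prompt string; sending it to an LLM is left out.

```python
from dataclasses import dataclass


@dataclass
class Exemplar:
    """One worked example: a question plus the reasoning that answers it."""
    question: str
    reasoning: str


def build_cot_prompt(exemplars, question):
    """Join the worked exemplars, then append the new question unanswered."""
    blocks = [f"{ex.question}\n{ex.reasoning}" for ex in exemplars]
    blocks.append(question)
    # A blank line separates each exemplar from the next block.
    return "\n\n".join(blocks)


# Hypothetical exemplar, written in the style of a commute word problem.
exemplar = Exemplar(
    question=(
        "Which is a faster way to get home?\n"
        "Option 1: Take a 10 minute bus, then a 40 minute bus, "
        "and finally a 10 minute train.\n"
        "Option 2: Take a 90 minute train, then a 45 minute bike ride, "
        "and finally a 10 minute bus."
    ),
    reasoning=(
        "Option 1 will take 10+40+10 = 60 minutes.\n"
        "Option 2 will take 90+45+10 = 145 minutes.\n"
        "Since Option 1 takes 60 minutes and Option 2 takes 145 minutes, "
        "Option 1 is faster."
    ),
)

new_question = (
    "Which is a faster way to get to work?\n"
    "Option 1: Take a 1000 minute bus, then a half hour train, "
    "and finally a 10 minute bike ride.\n"
    "Option 2: Take an 800 minute bus, then an hour train, "
    "and finally a 30 minute bike ride."
)

prompt = build_cot_prompt([exemplar], new_question)
print(prompt)
```

The prompt ends with the unanswered question, so the model is nudged to continue in the same step-by-step style as the exemplar.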

Example

Here are two demos. The first shows GPT-3 (davinci-003) failing to solve a simple word problem. The second shows GPT-3 (davinci-003) successfully solving the same problem by using CoT prompting.

Incorrect

Prompt:

Which is a faster way to get to work?
Option 1: Take a 1000 minute bus, then a half hour train, and finally a 10 minute bike ride.
Option 2: Take an 800 minute bus, then an hour train, and finally a 30 minute bike ride.

Output:

Option 1 is a faster way to get to work.

Correct

Prompt:

Which is a faster way to get home?
Option 1: Take a 10 minute bus, then a 40 minute bus, and finally a 10 minute train.
Option 2: Take a 90 minute train, then a 45 minute bike ride, and finally a 10 minute bus.
Option 1 will take 10+40+10 = 60 minutes.
Option 2 will take 90+45+10=145 minutes.
Since Option 1 takes 60 minutes and Option 2 takes 145 minutes, Option 1 is faster.

Which is a faster way to get to work?
Option 1: Take a 1000 minute bus, then a half hour train, and finally a 10 minute bike ride.
Option 2: Take an 800 minute bus, then an hour train, and finally a 30 minute bike ride.

Output:

Option 1 will take 1000+30+10 = 1040 minutes.
Option 2 will take 800+60+30 = 890 minutes.
Since Option 2 takes 890 minutes and Option 1 takes 1040 minutes, Option 2 is faster.
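
The arithmetic in the CoT output can be checked independently. The short Python sketch below uses the leg durations from the prompt, converting "a half hour" to 30 minutes and "an hour" to 60 minutes; the variable names are chosen only for this illustration.

```python
# Leg durations in minutes, taken from the prompt above.
option_1 = [1000, 30, 10]  # bus, half hour train, bike ride
option_2 = [800, 60, 30]   # bus, one hour train, bike ride

totals = {"Option 1": sum(option_1), "Option 2": sum(option_2)}

# The faster option is the one with the smaller total travel time.
fastest = min(totals, key=totals.get)

print(totals)   # {'Option 1': 1040, 'Option 2': 890}
print(fastest)  # Option 2
```

This agrees with the CoT output and confirms that the standard prompt's answer (Option 1) was wrong.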

Results

CoT has been shown to improve results on arithmetic, commonsense, and symbolic reasoning tasks [1]. In particular, prompted PaLM 540B [2] achieves a 57% solve rate on GSM8K [3] (state of the art at the time).

Limitations

Importantly, according to Wei et al., "CoT only yields performance gains when used with models of ∼100B parameters". Smaller models wrote illogical chains of thought, which led to worse accuracy than standard prompting. The performance boost from CoT prompting generally grows with the size of the model.

Notes

No language models were finetuned in the process of writing this chapter 😊.
