Chain-of-Thought

Chain-of-Thought Prompting (Manual CoT)

→ How would a person arrive at the answer 11 in this example? The chain of thoughts that leads there is written out as text and inserted between the Question and the Answer as an item named "Reasoning process".

→ When the demonstrations shown to the model include such a reasoning process (Chain-of-Thought Prompting), the LLM first generates a reasoning process and then generates the Answer.

• When the manual reasoning prompts were written for the original CoT paper, the annotators were not given specific instructions; they were simply asked to write a step-by-step reasoning process that leads to the final answer

→ This was done to reflect each annotator's own linguistic style

• Following GPT-3, few-shot prompting is the default setting
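A few-shot CoT prompt of this form can be assembled as in the minimal sketch below. The `Demo` class and `build_cot_prompt` function are hypothetical names introduced for illustration. The demonstration is the tennis-ball example from the CoT paper, which is assumed to be the example with answer 11 that these notes refer to; the test question is made up.

```python
from dataclasses import dataclass


@dataclass
class Demo:
    """One hand-written demonstration (hypothetical container)."""
    question: str
    reasoning: str  # manually written reasoning process
    answer: str


def build_cot_prompt(demos: list[Demo], question: str) -> str:
    """Build a few-shot CoT prompt: each demo is Q -> reasoning -> answer."""
    parts = [
        f"Q: {d.question}\nA: {d.reasoning} The answer is {d.answer}."
        for d in demos
    ]
    # The test question ends with an open "A:" so that the model writes
    # its own reasoning process before the final answer.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


demos = [
    Demo(
        question=(
            "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
            "Each can has 3 tennis balls. How many tennis balls does he have now?"
        ),
        reasoning=(
            "Roger started with 5 balls. 2 cans of 3 tennis balls each is "
            "6 tennis balls. 5 + 6 = 11."
        ),
        answer="11",
    )
]

# Made-up test question, used only to show the prompt layout.
prompt = build_cot_prompt(
    demos,
    "A baker has 4 trays of 6 rolls and sells 5 rolls. How many rolls are left?",
)
print(prompt)
```

The only difference from standard few-shot prompting is the `reasoning` field placed between the question and the answer in each demonstration.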

Self-Consistency

• After writing the CoT prompt → ask the LM to generate a diverse set of possible solutions.

• Run the forward pass several times without fixing the random seed = this produces a diverse set of reasoning paths ⇒ take a majority vote over the final answers.

• In effect, this is an ensemble over the model's own outputs
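The sample-then-vote procedure above can be sketched as follows. This is a simplified illustration, not the paper's code: `sample` is a hypothetical callable standing in for one stochastic forward pass of an LLM (for example, decoding with a non-zero temperature), and the `The answer is X` pattern used to parse completions is an assumption about how the CoT demonstrations end.

```python
import re
from collections import Counter
from typing import Callable, Optional


def extract_answer(completion: str) -> Optional[str]:
    """Parse the final numeric answer from a 'The answer is X' completion."""
    match = re.search(r"The answer is\s*(-?\d+(?:\.\d+)?)", completion)
    return match.group(1) if match else None


def self_consistency(
    prompt: str,
    sample: Callable[[str], str],  # hypothetical: one stochastic LLM call
    n_samples: int = 5,
) -> Optional[str]:
    """Sample several reasoning paths and majority-vote on the final answer."""
    answers = []
    for _ in range(n_samples):
        answer = extract_answer(sample(prompt))
        if answer is not None:  # skip completions with no parsable answer
            answers.append(answer)
    if not answers:
        return None
    # The vote is over final answers only; the reasoning paths may differ.
    return Counter(answers).most_common(1)[0][0]


# Stand-in for a real model: three canned reasoning paths, one of them wrong.
canned = iter([
    "5 + 6 = 11. The answer is 11.",
    "5 + 2 = 7. The answer is 7.",
    "2 * 3 = 6, and 5 + 6 = 11. The answer is 11.",
])


def fake_sample(prompt: str) -> str:
    return next(canned)


result = self_consistency("Q: ...\nA:", fake_sample, n_samples=3)
print(result)  # prints 11
```

Two of the three paths agree on 11, so the single wrong path is outvoted.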

Zero-shot CoT

→ Few-shot CoT relied on manually written, task-specific CoT prompts; Zero-shot CoT argues that a single simple prompt can be effective instead

→ Regardless of the task, “Let’s think step by step” is added at the start of the Answer so that the model generates a plausible reasoning process

→ The pipeline proposed in the original paper has two stages, run in order: (1) Reasoning Extraction, (2) Answer Extraction

(1) Reasoning Extraction

• Q: [X], A: [T]

• [X]: input question, [T]: trigger sentence

(2) Answer Extraction

• [X’] [Z] [A], where [X’] is the full prompt from step (1)

• [Z]: the output generated in the first step

• [A]: trigger sentence to extract the answer
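The two-stage pipeline can be sketched as below. `generate` is a hypothetical callable standing in for an LLM completion call, and `fake_generate` is a stub that returns canned text so the example runs without a model. The newline between the question and `A:` and the wording of `ANSWER_TRIGGER` are assumptions; the answer trigger in particular depends on the task's answer format.

```python
from typing import Callable

REASONING_TRIGGER = "Let's think step by step."  # [T]
# Assumed wording of [A] for a numeric task; adapt it to the answer format.
ANSWER_TRIGGER = "Therefore, the answer (arabic numerals) is"


def zero_shot_cot(question: str, generate: Callable[[str], str]) -> tuple[str, str]:
    """Run both stages and return (reasoning [Z], extracted answer)."""
    # (1) Reasoning extraction: prompt is "Q: [X] A: [T]".
    x_prime = f"Q: {question}\nA: {REASONING_TRIGGER}"
    z = generate(x_prime)
    # (2) Answer extraction: prompt is "[X'] [Z] [A]".
    answer_prompt = f"{x_prime} {z} {ANSWER_TRIGGER}"
    answer = generate(answer_prompt).strip()
    return z, answer


# Stub model: records every prompt it receives and returns canned text.
prompts_seen: list[str] = []


def fake_generate(prompt: str) -> str:
    prompts_seen.append(prompt)
    if prompt.endswith(ANSWER_TRIGGER):
        return " 11."
    return "Roger starts with 5 balls. 2 cans of 3 balls is 6 balls. 5 + 6 = 11."


reasoning, answer = zero_shot_cot(
    "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?",
    fake_generate,
)
print(answer)
```

The model is called twice per question: the second prompt reuses the first prompt and its output verbatim, then appends the answer trigger.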

→ Among the candidate triggers [T], “Let’s think step by step” performs best.

→ Even a misleading trigger does better in some cases, but instructive triggers give a much larger improvement. This suggests that an appropriate prompt can improve LLM performance.

Since then, research on finding suitable instructions like the one above to elicit the reasoning capabilities of LLMs through prompting has been very active.