Rethinking the role of the demonstrations

Category
PaperReview
Venue
EMNLP 2022
Backbone
GPT-J
OPT
Text
- What factors make In-context learning work?
PPT
Comprehensive Review of the Three Papers
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? [EMNLP 2022, cited by 512] – University of Washington
Larger language models do in-context learning differently [ICLR 2024 under review, cited by 78] – Google Research
What in-context learning “learns” in-context: disentangling task recognition and task learning [ACL 2023, cited by 15] – Princeton University
How LLM Use Demonstration in In-Context Learning.pdf
#### As long as the label space (positive, negative) is preserved, assigning random labels within the demonstrations does not significantly hurt performance.
#### If the demonstrations are drawn from out-of-domain inputs, ICL performance drops sharply (with gold labels and random labels alike)
unless the model has been meta-trained beforehand to understand the input distribution and the label distribution.
Channel ICL predicts X conditioned on Y; at inference time, each candidate label y is scored by the likelihood it assigns to the input x, and the label under which x is most likely is chosen.
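
To make the direct vs. channel distinction concrete, below is a minimal scoring sketch. It is illustrative only: the GPT-2 checkpoint, the sentiment prompt template, and the label verbalizers are assumptions for the example, not the setup used in the papers.

```python
# Minimal sketch of direct vs. channel ICL scoring (illustrative assumptions:
# a GPT-2 stand-in model, a simple "Review / Sentiment" template).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
LABELS = ["positive", "negative"]

def sequence_logprob(prefix: str, continuation: str) -> float:
    """Log-probability the LM assigns to `continuation` given `prefix`."""
    full = tok(prefix + continuation, return_tensors="pt").input_ids
    n_prefix = tok(prefix, return_tensors="pt").input_ids.shape[-1]
    with torch.no_grad():
        logits = model(full).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    # Sum the log-probs of the continuation tokens (positions n_prefix .. end).
    return sum(log_probs[p - 1, full[0, p]].item() for p in range(n_prefix, full.shape[-1]))

def direct_score(demos: str, x: str, y: str) -> float:
    # Direct ICL: score the label given the demonstrations and the input, P(y | x).
    return sequence_logprob(f"{demos}Review: {x}\nSentiment:", f" {y}")

def channel_score(demos: str, x: str, y: str) -> float:
    # Channel ICL: condition on the label and score the input, P(x | y).
    return sequence_logprob(f"{demos}Sentiment: {y}\nReview:", f" {x}")

def predict(demos: str, x: str, scorer=channel_score) -> str:
    # Either way, the prediction is the candidate label with the highest likelihood.
    return max(LABELS, key=lambda y: scorer(demos, x, y))
```

The only difference between the two scorers is which side of the (X, Y) pair is conditioned on; both reduce prediction to an argmax over candidate-label likelihoods.
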
#### When the label space is replaced with random words, direct-style ICL performance drops.
#### Presenting demonstrations as (X, Y) pairs is important (see the prompt-construction sketch below).
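
The ablations noted above (gold labels, random labels from the same label space, random words as labels, and formats that break the input-label pairing) can be sketched as prompt-construction variants. The tiny example pool, templates, and word list below are made up for illustration and are not the paper's data.

```python
import random

# Tiny illustrative demonstration pool (not the paper's data).
POOL = [
    ("the acting is superb and the plot moves quickly", "positive"),
    ("a dull, predictable mess with no redeeming qualities", "negative"),
    ("funny, warm, and surprisingly moving", "positive"),
    ("i walked out halfway through", "negative"),
]
LABEL_SPACE = ["positive", "negative"]
RANDOM_WORDS = ["cloud", "table", "river", "lamp"]  # out-of-label-space tokens

def format_pair(x: str, y: str) -> str:
    return f"Review: {x}\nSentiment: {y}\n\n"

def build_demos(variant: str, rng: random.Random) -> str:
    """Return the demonstration string for one ablation variant."""
    parts = []
    for x, gold in POOL:
        if variant == "gold":             # correct input-label pairs
            parts.append(format_pair(x, gold))
        elif variant == "random_label":   # labels resampled from the same label space
            parts.append(format_pair(x, rng.choice(LABEL_SPACE)))
        elif variant == "random_word":    # label space replaced by unrelated words
            parts.append(format_pair(x, rng.choice(RANDOM_WORDS)))
        elif variant == "inputs_only":    # drop the labels (no X-Y pairing)
            parts.append(f"Review: {x}\n\n")
        elif variant == "labels_only":    # drop the inputs (no X-Y pairing)
            parts.append(f"Sentiment: {gold}\n\n")
    return "".join(parts)

rng = random.Random(0)
for v in ["gold", "random_label", "random_word", "inputs_only", "labels_only"]:
    demos = build_demos(v, rng)
    # `demos` would be prepended to a test input and scored,
    # e.g., with the direct/channel scorers sketched earlier.
    print(f"--- {v} ---\n{demos}")
```

Given the findings above, the random_label prompts would be expected to stay close to gold, while random_word, inputs_only, and labels_only break the label space or the (X, Y) pairing and should hurt direct ICL.
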
→ This suggests that the model has learned the (implicit notion of) input-label correspondence from the language modeling objective alone, e.g., associating a positive review with the word ‘positive’. This is in line with Reynolds and McDonell (2021), who claim that the demonstrations are for task location and the intrinsic ability to perform the task is obtained at pretraining time. On one hand, this suggests that the language modeling objective has led to great zero-shot capacity, even if it is not always evident from the naive zero-shot accuracy. On the other hand, this suggests that in-context learning may not work on a task whose input-label correspondence is not already captured in the LM.