๋ณธ๋ฌธ ๋ฐ”๋กœ๊ฐ€๊ธฐ

1๏ธโƒฃ AI•DS/๐Ÿ“— NLP21

ํ…์ŠคํŠธ ๋ถ„์„ โ‘  ๐Ÿ“Œ ํŒŒ์ด์ฌ ๋จธ์‹ ๋Ÿฌ๋‹ ์™„๋ฒฝ๊ฐ€์ด๋“œ ๊ณต๋ถ€ ๋‚ด์šฉ ์ •๋ฆฌ ๐Ÿ“Œ ์‹ค์Šต ์ฝ”๋“œ https://colab.research.google.com/drive/1UzQNyu-rafb1SQEDcQCeCyYO54ECgULT?usp=sharing 08. ํ…์ŠคํŠธ ๋ถ„์„.ipynb Colaboratory notebook colab.research.google.com 1๏ธโƒฃ ํ…์ŠคํŠธ ๋ถ„์„์˜ ์ดํ•ด ๐Ÿ‘€ ๊ฐœ์š” ๐Ÿ’ก NLP ์™€ ํ…์ŠคํŠธ ๋งˆ์ด๋‹ โœ” NLP ์ธ๊ฐ„์˜ ์–ธ์–ด๋ฅผ ์ดํ•ดํ•˜๊ณ  ํ•ด์„ํ•˜๋Š”๋ฐ ์ค‘์ ์„ ๋‘๊ณ  ๋ฐœ์ „ ํ…์ŠคํŠธ ๋งˆ์ด๋‹์„ ํ–ฅ์ƒํ•˜๊ฒŒ ํ•˜๋Š” ๊ธฐ๋ฐ˜ ๊ธฐ์ˆ  ๊ธฐ๊ณ„๋ฒˆ์—ญ, ์งˆ์˜์‘๋‹ต ์‹œ์Šคํ…œ ๋“ฑ โœ” ํ…์ŠคํŠธ ๋งˆ์ด๋‹ ๋น„์ •ํ˜• ํ…์ŠคํŠธ์—์„œ ์˜๋ฏธ์žˆ๋Š” ์ •๋ณด๋ฅผ ์ถ”์ถœํ•˜๋Š” ๊ฒƒ์— ์ค‘์  1. ํ…์ŠคํŠธ ๋ถ„๋ฅ˜ : ๋ฌธ์„œ๊ฐ€ ํŠน์ • ๋ถ„๋ฅ˜ ๋˜๋Š” ์นดํ…Œ๊ณ ๋ฆฌ์— ์†ํ•˜๋Š” ๊ฒƒ์„ ์˜ˆ์ธกํ•˜๋Š” ๊ธฐ๋ฒ• ex. ์‹ ๋ฌธ ๊ธฐ์‚ฌ ์นดํ…Œ๊ณ ๋ฆฌ ๋ถ„.. 2022. 5. 14.
[cs224n] 10๊ฐ• ๋‚ด์šฉ ์ •๋ฆฌ ๐Ÿ’ก ์ฃผ์ œ : Question Answering ๐Ÿ“Œ ํ•ต์‹ฌ Task : QA ์งˆ๋ฌธ ์‘๋‹ต, reading comprehension, open-domain QA SQuAD dataset BiDAF , BERT 1๏ธโƒฃ Introduction 1. Motivation : QA โœ” QA ์™€ IR system ์˜ ์ฐจ์ด โ—ฝ IR = information retrieval ์ •๋ณด๊ฒ€์ƒ‰ ๐Ÿ’จ QA : Query (specifit) → Answer : ๋ฌธ์„œ์—์„œ ์ •๋‹ต ์ฐพ๊ธฐ ex. ์šฐ๋ฆฌ๋‚˜๋ผ ์ˆ˜๋„๋Š” ์–ด๋””์•ผ? - ์„œ์šธ ๐Ÿ’จ IR : Query (general) → Document list : ์ •๋‹ต์„ ํฌํ•จํ•˜๊ณ  ์žˆ๋Š” ๋ฌธ์„œ ์ฐพ๊ธฐ ex. ๊น€์น˜๋ณถ์Œ๋ฐฅ์€ ์–ด๋–ป๊ฒŒ ๋งŒ๋“ค์–ด? - ์œ ํŠœ๋ธŒ ์˜์ƒ ๋ฆฌ์ŠคํŠธ, ๋ธ”๋กœ๊ทธ ๋ฆฌ์ŠคํŠธ ๐Ÿ‘‰ ์ตœ๊ทผ์—๋Š” ์Šค๋งˆํŠธํฐ, ์ธ๊ณต์ง€๋Šฅ ์Šคํ”ผ์ปค ๊ธฐ.. 2022. 5. 13.
[cs224n] 9๊ฐ• ๋‚ด์šฉ ์ •๋ฆฌ ๐Ÿ“‘ 9์žฅ. NLP ์—ฐ๊ตฌ ์ „๋ฐ˜, CS224N ํ”„๋กœ์ ํŠธ 1๏ธโƒฃ Starting Research โœจ SQuAD ์Šคํƒ ํฌ๋“œ ๋Œ€ํ•™์˜ NLP ๊ทธ๋ฃน์—์„œ ํฌ๋ผ์šฐ๋“œ ์†Œ์‹ฑ์„ ํ†ตํ•ด ๋งŒ๋“  ์œ„ํ‚คํ”ผ๋””์•„ ์•„ํ‹ฐํด์— ๋Œ€ํ•œ 107,785๊ฐœ์˜ ์งˆ๋ฌธ-๋Œ€๋‹ต ๋ฐ์ดํ„ฐ์…‹์ด๋‹ค. ํ•œ๊ตญ์—๋Š” KorQuAD ๊ฐ€ ์žˆ๋‹ค. ์ง€๋ฌธ(Context) - ์งˆ๋ฌธ(Question) - ๋‹ต๋ณ€ (Answer) ์œผ๋กœ ์ด๋ฃจ์–ด์ง„ ๋ฐ์ดํ„ฐ์…‹ ํ˜•ํƒœ์ด๋‹ค. ์งˆ๋ฌธ์˜ ๋‹ต๋ณ€ ์—ฌ๋ถ€์— ๋”ฐ๋ผ 70๋งŒ๊ฑด์€ ์ •๋‹ต์ด ์žˆ๋Š” ๋ฐ์ดํ„ฐ์…‹, 30๋งŒ๊ฑด์€ ์ •๋‹ต์ด ์—†๋Š” ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ๋‹ค. โœจ ์—ฐ๊ตฌ์˜ ์‹œ์ž‘์€ 1. ๋…ผ๋ฌธ์„ ์—ด์‹ฌํžˆ ์ฝ๋Š”๋‹ค. 2. NLP ๋…ผ๋ฌธ์— ๋Œ€ํ•œ ACL Anthology ์ฐธ๊ณ  3. ์ฃผ์š” ML ์ปจํผ๋Ÿฐ์Šค๋“ค์˜ ๋…ผ๋ฌธ ์ฐธ๊ณ  : NeurlPS, ICML, ICLR 4. ๊ธฐ์กด ํ”„๋กœ์ ํŠธ ์ฐธ์กฐ โœจ NLP ์—ฐ๊ตฌ์—์„œ ๊ฐ€.. 2022. 5. 9.
[cs224n] 8๊ฐ• ๋‚ด์šฉ ์ •๋ฆฌ ๐Ÿ’ก ์ฃผ์ œ : Seq2Seq , Attention, ๊ธฐ๊ณ„๋ฒˆ์—ญ ๐Ÿ“Œ ํ•ต์‹ฌ Task : machine translation ๊ธฐ๊ณ„๋ฒˆ์—ญ Seq2Seq Attention ๊ธฐ๊ณ„๋ฒˆ์—ญ์€ ๋Œ€ํ‘œ์ ์ธ Seq2Seq ํ˜•ํƒœ์˜ ํ™œ์šฉ ์˜ˆ์ œ ์ค‘ ํ•˜๋‚˜์ด๊ณ , attention ์ด๋ผ๋Š” ๋ฐฉ๋ฒ•๋ก ์„ ํ†ตํ•ด ์„ฑ๋Šฅ์ด ๊ฐœ์„ ๋˜์—ˆ๋‹ค. 1๏ธโƒฃ Machine Translation 1. ๊ธฐ๊ณ„๋ฒˆ์—ญ โœ” ์ •์˜ ์ž…๋ ฅ์œผ๋กœ ๋“ค์–ด์˜จ Source language ๋ฅผ target language ํ˜•ํƒœ๋กœ ๋ฒˆ์—ญํ•˜๋Š” Task โœ” ์—ญ์‚ฌ โžฐ 1950's : The early history of MT ๋Ÿฌ์‹œ์•„์–ด๋ฅผ ์˜์–ด๋กœ ๋ฒˆ์—ญํ•˜๋Š” ๋“ฑ์˜ ๊ตฐ์‚ฌ ๋ชฉ์ ์œผ๋กœ ๊ฐœ๋ฐœ๋˜๊ธฐ ์‹œ์ž‘ํ•˜์˜€๋‹ค. Rule-based ์˜ ๋ฒˆ์—ญ ์‹œ์Šคํ…œ์œผ๋กœ ๊ฐ™์€ ๋œป์˜ ๋‹จ์–ด๋ฅผ ๋Œ€์ฒดํ•˜๋Š” ๋‹จ์ˆœํ•œ ๋ฐฉ์‹์„ ์‚ฌ์šฉํ–ˆ๋‹ค. โžฐ 1990s - 2010s :.. 2022. 5. 9.
[cs224n] 7๊ฐ• ๋‚ด์šฉ ์ •๋ฆฌ Vanishing Gradients and Fancy RNNs ๐Ÿ’ก ์ฃผ์ œ : Vanishing Gradients and Fancy RNNs ๐Ÿ“Œ ํ•ต์‹ฌ Task : ๋ฌธ์žฅ์ด ์ฃผ์–ด์งˆ ๋•Œ ์ง€๊ธˆ๊นŒ์ง€ ๋‚˜์˜จ ๋‹จ์–ด๋“ค ์ดํ›„์— ๋‚˜์˜ฌ ๋‹จ์–ด๋ฅผ ์˜ˆ์ธก Sequential data : ์ˆœ์„œ๊ฐ€ ์˜๋ฏธ ์žˆ์œผ๋ฉฐ ์ˆœ์„œ๊ฐ€ ๋‹ฌ๋ผ์งˆ ๊ฒฝ์šฐ ์˜๋ฏธ๊ฐ€ ์†์ƒ๋˜๋Š” ๋ฐ์ดํ„ฐ๋กœ ์ˆœํ™˜์‹ ๊ฒฝ๋ง์„ ์‚ฌ์šฉํ•˜๋Š” ์ด์œ ๋Š” ์ž…๋ ฅ์„ ์ˆœ์ฐจ๋ฐ์ดํ„ฐ๋กœ ๋ฐ›๊ฑฐ๋‚˜, ์ถœ๋ ฅ์„ ์ˆœ์ฐจ ๋ฐ์ดํ„ฐ๋กœ ๋‚ด๊ธฐ ์œ„ํ•ด์„œ๋‹ค. RNN : ๋‹ค์Œ์— ์˜ฌ ๋‹จ์–ด๋ฅผ ์˜ˆ์ธกํ•˜๋Š” ๊ณผ์ œ๋ฅผ ํšจ๊ณผ์ ์œผ๋กœ ์ˆ˜ํ–‰ํ•˜๊ธฐ ์œ„ํ•ด ๋„์ž…ํ•œ NN ์˜ ์ผ์ข… ๐Ÿ‘‰ ๋ฌธ์ œ์  : ๊ธฐ์šธ๊ธฐ์†Œ์‹ค/ํญ์ฆ, ์žฅ๊ธฐ์˜์กด์„ฑ LSTM : RNN ์˜ ์žฅ๊ธฐ์˜์กด์„ฑ์˜ ๋ฌธ์ œ์ ์„ ๋ณด์™„ํ•ด ๋“ฑ์žฅํ•œ ๋ชจ๋ธ ๐Ÿ‘‰ cell state , 3 ๊ฐœ์˜ gate ๊ฐœ๋…์„ ๋„์ž… 1๏ธโƒฃ Language model, RNN.. 2022. 4. 21.
[cs224n] 6๊ฐ• ๋‚ด์šฉ ์ •๋ฆฌ ๐Ÿ’ก ์ฃผ์ œ : Language models and RNN (Recurrent Neural Network) ๐Ÿ“Œ ํ•ต์‹ฌ Task : ๋ฌธ์žฅ์ด ์ฃผ์–ด์งˆ ๋•Œ ์ง€๊ธˆ๊นŒ์ง€ ๋‚˜์˜จ ๋‹จ์–ด๋“ค ์ดํ›„์— ๋‚˜์˜ฌ ๋‹จ์–ด๋ฅผ ์˜ˆ์ธก RNN : ๋‹ค์Œ์— ์˜ฌ ๋‹จ์–ด๋ฅผ ์˜ˆ์ธกํ•˜๋Š” ๊ณผ์ œ๋ฅผ ํšจ๊ณผ์ ์œผ๋กœ ์ˆ˜ํ–‰ํ•˜๊ธฐ ์œ„ํ•ด ๋„์ž…ํ•œ NN ์˜ ์ผ์ข… ๐Ÿ“Œ ๋ชฉ์ฐจ / ๋‚ด์šฉ 1. Language model (1) Language model ์ด๋ž€ โœ” ์ •์˜ ๋‹จ์–ด์˜ ์‹œํ€€์Šค(๋ฌธ์žฅ) ์— ๋Œ€ํ•ด ์–ผ๋งˆ๋‚˜ ์ž์—ฐ์Šค๋Ÿฌ์šด ๋ฌธ์žฅ์ธ์ง€๋ฅผ 'ํ™•๋ฅ ' ์„ ์ด์šฉํ•ด ์˜ˆ์ธกํ•˜๋Š” ๋ชจ๋ธ Language modeling = ์ฃผ์–ด์ง„ ๋‹จ์–ด์˜ ์‹œํ€€์Šค์— ๋Œ€ํ•ด ๋‹ค์Œ์— ๋‚˜ํƒ€๋‚  ๋‹จ์–ด๊ฐ€ ์–ด๋–ค ๊ฒƒ์ธ์ง€๋ฅผ ์˜ˆ์ธกํ•˜๋Š” ์ž‘์—… ํŠน์ • ๋ฌธ์žฅ์— ํ™•๋ฅ ์„ ํ• ๋‹นํ•œ๋‹ค. ๋ฌธ์žฅ์˜ ๋‹จ์–ด w(1), w(2) , ... w(t) ๊ฐ€ ์ฃผ์–ด์กŒ์„ ๋•Œ ๋‹ค์Œ์— ์˜ฌ ๋‹จ์–ด w(t+1).. 2022. 3. 24.
[cs224n] 5๊ฐ• ๋‚ด์šฉ ์ •๋ฆฌ ๐Ÿ’ก ์ฃผ์ œ : Dependency Parsing ๐Ÿ“Œ ํ•ต์‹ฌ Task : ๋ฌธ์žฅ์˜ ๋ฌธ๋ฒ•์ ์ธ ๊ตฌ์„ฑ, ๊ตฌ๋ฌธ์„ ๋ถ„์„ Dependency Parsing : ๋‹จ์–ด ๊ฐ„ ๊ด€๊ณ„๋ฅผ ํŒŒ์•…ํ•˜์—ฌ ๋‹จ์–ด์˜ ์ˆ˜์‹ (๋ฌธ๋ฒ•) ๊ตฌ์กฐ๋ฅผ ๋„์ถœํ•ด๋‚ด๊ธฐ ๐Ÿ“Œ ๋ชฉ์ฐจ 1. Dependency Parsing ์ด๋ž€ (1) Parsing โœ” ์ •์˜ ๊ฐ ๋ฌธ์žฅ์˜ ๋ฌธ๋ฒ•์ ์ธ ๊ตฌ์„ฑ์ด๋‚˜ ๊ตฌ๋ฌธ์„ ๋ถ„์„ํ•˜๋Š” ๊ณผ์ • ์ฃผ์–ด์ง„ ๋ฌธ์žฅ์„ ์ด๋ฃจ๋Š” ๋‹จ์–ด ํ˜น์€ ๊ตฌ์„ฑ ์š”์†Œ์˜ ๊ด€๊ณ„๋ฅผ ๊ฒฐ์ •ํ•˜๋Š” ๋ฐฉ๋ฒ•์œผ๋กœ, parsing์˜ ๋ชฉ์ ์— ๋”ฐ๋ผ Consitituency parsing๊ณผ Dependency parsing์œผ๋กœ ๊ตฌ๋ถ„ โœ” ๋น„๊ต ํ† ํฌ๋‚˜์ด์ง• : ๋ฌธ์žฅ์ด ๋“ค์–ด์˜ค๋ฉด ์˜๋ฏธ๋ฅผ ๊ฐ€์ง„ ๋‹จ์œ„๋กœ ์ชผ๊ฐœ์ฃผ๋Š” ๊ฒƒ pos-tagging : ํ† ํฐ๋“ค์— ํ’ˆ์‚ฌ tag ๋ฅผ ๋ถ™์—ฌ์ฃผ๋Š” ๊ณผ์ • Paring : ๋ฌธ์žฅ ๋ถ„์„ ๊ฒฐ๊ณผ๊ฐ€ Tree ํ˜•ํƒœ๋กœ ๋‚˜.. 2022. 3. 22.
[cs224n] 4๊ฐ• ๋‚ด์šฉ ์ •๋ฆฌ ๐Ÿ’ก ์ฃผ์ œ : Backpropagation and Computation Graphs ๐Ÿ“Œ ๋ชฉ์ฐจ ์ •๋ฆฌ 1. Matrix gradient for NN (1) NN ์˜ ๊ณผ์ • feedforward : X * W = output vector = predict ๊ฐ’ backpropagation : output vector ๋ฅผ weight matrix ์— ๋Œ€ํ•ด ๋ฏธ๋ถ„ (2) ๊ฐ€์ค‘์น˜ ํ–‰๋ ฌ (parameter) ์˜ ๋ฏธ๋ถ„ Chain Rule : ํ•จ์ˆ˜์˜ ์—ฐ์‡„๋ฒ•์น™์„ ๊ธฐ๋ฐ˜์œผ๋กœ ์ด๋ฃจ์–ด์ง€๋Š” ๊ณ„์‚ฐ ๊ทœ์น™ (ํ•ฉ์„ฑํ•จ์ˆ˜์˜ ๋ฏธ๋ถ„) NN ์€ chain rule ์„ ์ด์šฉํ•ด ์ตœ์ข… scalar ๊ฐ’์„ weight ๋กœ ๋ฏธ๋ถ„ํ•ด๊ฐ€๋ฉฐ ๊ฐ€์ค‘์น˜๋ฅผ ์—…๋ฐ์ดํŠธ ํ•˜๋Š” ๋ฐฉ์‹์œผ๋กœ ํ•™์Šต์„ ์ง„ํ–‰ํ•œ๋‹ค. dz/dw ๋ฅผ ๊ณ„์‚ฐํ•˜๋Š” ๊ณผ์ • (3) Gradient Tips ๋ณ€์ˆ˜๋ฅผ ์ž˜ ์ •์˜ํ•˜๊ณ  .. 2022. 3. 18.
NLP deep learning ๐Ÿ‘€ ์œ„ํ‚ค๋…์Šค : https://wikidocs.net/35476 ์˜ ๋”ฅ๋Ÿฌ๋‹ ๊ฐœ์š” ํŒŒํŠธ ๊ณต๋ถ€ํ•œ ๊ฒƒ ์ •๋ฆฌ (์ด๋ฏธ์ง€ ์ถœ์ฒ˜๋Š” ๋ชจ๋‘ ์œ„ํ‚ค๋…์Šค ํ™ˆํŽ˜์ด์ง€) ๐Ÿ“Œ ์†Œํ”„ํŠธ๋งฅ์Šค ํšŒ๊ท€ ๋กœ์ง€์Šคํ‹ฑํšŒ๊ท€ : ์ด์ง„ ๋ถ„๋ฅ˜ ๋ฌธ์ œ VS ์†Œํ”„ํŠธ๋งฅ์Šค ํšŒ๊ท€ : ๋‹ค์ค‘ ํด๋ž˜์Šค ๋ถ„๋ฅ˜ ๋ฌธ์ œ ๋Œ€ํ‘œ์ ์ธ ๋‹ค์ค‘ ํด๋ž˜์Šค ๋ถ„๋ฅ˜ ์˜ˆ์ œ : iris ๋ถ“๊ฝƒ ํ’ˆ์ข… ๋ถ„๋ฅ˜ (k=3) Softmax function ํด๋ž˜์Šค์˜ ๊ฐœ์ˆ˜๊ฐ€ k ๊ฐœ์ผ ๋•Œ, k ์ฐจ์›์˜ ๋ฒกํ„ฐ๋ฅผ ์ž…๋ ฅ๋ฐ›์•„ ๊ฐ ํด๋ž˜์Šค์— ๋Œ€ํ•œ ํ™•๋ฅ ์„ ์ถ”์ •ํ•œ๋‹ค. zi : k์ฐจ์›์˜ ๋ฒกํ„ฐ์—์„œ i ๋ฒˆ์งธ ์›์†Œ pi : i ๋ฒˆ์งธ ํด๋ž˜์Šค๊ฐ€ ์ •๋‹ต์ผ ํ™•๋ฅ  k ์ฐจ์›์˜ ๋ฒกํ„ฐ๋ฅผ ์ž…๋ ฅ → ๋ฒกํ„ฐ ์›์†Œ ๊ฐ’์„ 0๊ณผ 1 ์‚ฌ์ด์˜ ๊ฐ’์œผ๋กœ ๋ณ€๊ฒฝ → ๋‹ค์‹œ k ์ฐจ์›์˜ ๋ฒกํ„ฐ๋ฅผ ๋ฐ˜ํ™˜ ๐Ÿ‘€ ์ƒ˜ํ”Œ ๋ฐ์ดํ„ฐ ๋ฒกํ„ฐ(4์ฐจ์›) ์„ ์†Œํ”„ํŠธ๋งฅ์Šค ํ•จ์ˆ˜์˜ ์ž…๋ ฅ๋ฒกํ„ฐ 3์ฐจ์›์œผ๋กœ ์ถ•์†Œํ•˜๋Š” ๋ฐฉ๋ฒ•? ๐Ÿ‘‰.. 2022. 3. 15.
[cs224n] 3๊ฐ• ๋‚ด์šฉ ์ •๋ฆฌ ๐Ÿ’ก ์ฃผ์ œ : Word Window Classification, NN and Matrix Calculus ๐Ÿ“Œ ํ•ต์‹ฌ Task : ๋ถ„๋ฅ˜ - ๊ฐœ์ฒด๋ช… ๋ถ„๋ฅ˜ (Named Entity Recognition) ๐Ÿ“Œ ๋ชฉ์ฐจ ์ •๋ฆฌ 1. Classification Review / introduction NLP ์—์„œ์˜ ๋ถ„๋ฅ˜ ๋ฌธ์ œ ๐Ÿ‘‰ input data : ๋‹จ์–ด, ๋ฌธ์žฅ, ๋ฌธ์„œ ๋“ฑ ๐Ÿ‘‰ Class : ๊ฐ์ •๋ถ„๋ฅ˜, ๊ฐœ์ฒด๋ช… ๋ถ„๋ฅ˜ (Named entity) , ๊ฐ™์€ ์˜๋ฏธ/ํ’ˆ์‚ฌ์˜ ๋‹จ์–ด๋ผ๋ฆฌ ๋ถ„๋ฅ˜ ๋“ฑ ๐Ÿ‘‰ ๊ฒฐ์ •๊ฒฝ๊ณ„ (decision boundary) ๋ฅผ ๊ฒฐ์ •ํ•  Weight ๋ฅผ ํ•™์Šต ์ง€๋„ํ•™์Šต ๐Ÿ‘‰ Train set → Loss function → Validation / Test set ์†์‹คํ•จ์ˆ˜ ๐Ÿ‘‰ ์˜ˆ์ธกํ•œ ๋ฐ์ดํ„ฐ(y hat) ์˜ ํ™•๋ฅ ๋ถ„ํฌ์™€ ์‹ค์ œ ๋ฐ์ดํ„ฐ(.. 2022. 3. 14.