๋ณธ๋ฌธ ๋ฐ”๋กœ๊ฐ€๊ธฐ
1๏ธโƒฃ AI•DS/๐Ÿ“’ Deep learning

[๋”ฅ๋Ÿฌ๋‹ ํŒŒ์ดํ† ์น˜ ๊ต๊ณผ์„œ] 7์žฅ ์‹œ๊ณ„์—ด I

by isdawell 2022. 11. 10.

 

1๏ธโƒฃ  ์‹œ๊ณ„์—ด ๋ฌธ์ œ 


 

๐Ÿ”น  ์‹œ๊ณ„์—ด ๋ถ„์„์ด๋ž€ 

 

โ†ช ์‹œ๊ฐ„์— ๋”ฐ๋ผ ๋ณ€ํ•˜๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ์‚ฌ์šฉํ•ด ์ถ”์ด๋ฅผ ๋ถ„์„ํ•˜๋Š” ๊ฒƒ์œผ๋กœ ์ฃผ๊ฐ€/ํ™˜์œจ๋ณ€๋™, ๊ธฐ์˜จ/์Šต๋„๋ณ€ํ™” ๋“ฑ์ด ๋Œ€ํ‘œ์ ์ธ ์‹œ๊ณ„์—ด ๋ถ„์„์— ํ•ด๋‹นํ•œ๋‹ค. 

 

โ†ช ์ถ”์„ธํŒŒ์•…, ํ–ฅํ›„์ „๋ง ์˜ˆ์ธก์— ์‹œ๊ณ„์—ด ๋ถ„์„์„ ์‚ฌ์šฉํ•œ๋‹ค. 

 

 

 

๐Ÿ”น  ์‹œ๊ณ„์—ด ํ˜•ํƒœ 

 

โ†ช ๋ฐ์ดํ„ฐ ๋ณ€๋™ ์œ ํ˜•์— ๋”ฐ๋ผ ๊ตฌ๋ถ„ํ•  ์ˆ˜ ์žˆ๋‹ค. 

 

๋ถˆ๊ทœ์น™๋ณ€๋™ ์˜ˆ์ธก ๋ถˆ๊ฐ€๋Šฅํ•˜๊ณ  ์šฐ์—ฐ์ ์œผ๋กœ ๋ฐœ์ƒํ•˜๋Š” ๋ณ€๋™. ์ „์Ÿ, ํ™์ˆ˜, ์ง€์ง„, ํŒŒ์—… ๋“ฑ
์ถ”์„ธ๋ณ€๋™ GDP, ์ธ๊ตฌ์ฆ๊ฐ€์œจ ๋“ฑ ์žฅ๊ธฐ์ ์ธ ๋ณ€ํ™” ์ถ”์„ธ๋ฅผ ์˜๋ฏธํ•œ๋‹ค. ์žฅ๊ธฐ๊ฐ„์— ๊ฑธ์ณ ์ง€์†์ ์œผ๋กœ ์ฆ๊ฐ€, ๊ฐ์†Œํ•˜๊ฑฐ๋‚˜ ์ผ์ • ์ƒํƒœ๋ฅผ ์œ ์ง€ํ•˜๋ ค๋Š” ์„ฑ๊ฒฉ์„ ๋ˆ๋‹ค. 
์ˆœํ™˜๋ณ€๋™ 2~3๋…„ ์ •๋„์˜ ์ผ์ •ํ•œ ๊ธฐ๊ฐ„์„ ์ฃผ๊ธฐ๋กœ ์ˆœํ™˜์ ์œผ๋กœ ๋‚˜ํƒ€๋‚˜๋Š” ๋ณ€๋™
๊ณ„์ ˆ๋ณ€๋™ ๊ณ„์ ˆ์ ์ธ ์˜ํ–ฅ๊ณผ ์‚ฌํšŒ์  ๊ด€์Šต์— ๋”ฐ๋ผ 1๋…„ ์ฃผ๊ธฐ๋กœ ๋ฐœ์ƒํ•˜๋Š” ๊ฒƒ์„ ์˜๋ฏธ 

 

 

๐Ÿ”น  ์‹œ๊ณ„์—ด ๋ฐ์ดํ„ฐ 

 

โ†ช ๊ทœ์น™์  ์‹œ๊ณ„์—ด vs ๋ถˆ๊ทœ์น™์  ์‹œ๊ณ„์—ด 

 

• ๊ทœ์น™์  ์‹œ๊ณ„์—ด : ํŠธ๋ Œ๋“œ์™€ ๋ถ„์‚ฐ์ด ๋ถˆ๋ณ€ํ•˜๋Š” ๋ฐ์ดํ„ฐ 

• ๋ถˆ๊ทœ์น™์  ์‹œ๊ณ„์—ด : ํŠธ๋ Œ๋“œ ํ˜น์€ ๋ถ„์‚ฐ์ด ๋ณ€ํ™”ํ•˜๋Š” ์‹œ๊ณ„์—ด ๋ฐ์ดํ„ฐ 

 

โ–ธ ์‹œ๊ณ„์—ด์„ ์ž˜ ๋ถ„์„ํ•œ๋‹ค๋Š” ๊ฒƒ์€ ๋ถˆ๊ทœ์น™์„ฑ์„ ๊ฐ–๋Š” ์‹œ๊ณ„์—ด ๋ฐ์ดํ„ฐ์— ํŠน์ •ํ•œ ๊ธฐ๋ฒ•์ด๋‚˜ ๋ชจ๋ธ์„ ์ ์šฉํ•˜์—ฌ ๊ทœ์น™์ ์ธ ํŒจํ„ด์„ ์ฐพ๊ฑฐ๋‚˜ ์˜ˆ์ธกํ•˜๋Š” ๊ฒƒ์„ ์˜๋ฏธํ•œ๋‹ค. 

 

โ†ช AR,MA,ARMA,ARIMA,๋”ฅ๋Ÿฌ๋‹ ๊ธฐ๋ฒ• ๋“ฑ์ด ์‚ฌ์šฉ๋œ๋‹ค. 

 

 

 

 

 

 

2๏ธโƒฃ  AR, MA, ARMA, ARIMA 


 

โ†ช ์‹œ๊ณ„์—ด ๋ถ„์„์€ ์ผ๋ฐ˜์ ์ธ ๋จธ์‹ ๋Ÿฌ๋‹์—์„œ "์‹œ๊ฐ„" ์„ ๋…๋ฆฝ๋ณ€์ˆ˜๋กœ ์‚ฌ์šฉํ•œ๋‹ค๋Š” ํŠน์ง•์ด ์žˆ๋‹ค. 

 

๐Ÿ”น  AR ๋ชจ๋ธ 

 

 

•  Auto Regressive (AR): the autoregressive model

• ์ด์ „ ๊ด€์ธก๊ฐ’์ด ์ดํ›„ ๊ด€์ธก๊ฐ’์— ์˜ํ–ฅ์„ ์ค€๋‹ค๋Š” ์•„์ด๋””์–ด์—์„œ ์‹œ์ž‘ํ•œ ๊ฒƒ์œผ๋กœ ์ด์ „ ๋ฐ์ดํ„ฐ์˜ '์ƒํƒœ' ์—์„œ ํ˜„์žฌ ๋ฐ์ดํ„ฐ์˜ ์ƒํƒœ๋ฅผ ์ถ”๋ก ํ•œ๋‹ค. 

• โ‘  ๋ฒˆ ๋ถ€๋ถ„ : ์‹œ๊ณ„์—ด ๋ฐ์ดํ„ฐ์˜ ํ˜„์žฌ์‹œ์  

• โ‘ก ๋ฒˆ ๋ถ€๋ถ„ : ๊ณผ๊ฑฐ๊ฐ€ ํ˜„์žฌ์— ๋ฏธ์น˜๋Š” ์˜ํ–ฅ์„ ๋‚˜ํƒ€๋‚ด๋Š” ๋ชจ์ˆ˜์— ์‹œ๊ณ„์—ด ๋ฐ์ดํ„ฐ์˜ ๊ณผ๊ฑฐ ์‹œ์ ์„ ๊ณฑํ•œ ๊ฒƒ

• โ‘ข ๋ฒˆ ๋ถ€๋ถ„ : ๋ฐฑ์ƒ‰์žก์Œ, ์‹œ๊ณ„์—ด ๋ถ„์„์—์„œ์˜ ์˜ค์ฐจํ•ญ 

 

 

 

๐Ÿ”น  MA ๋ชจ๋ธ 

 

 

• Moving Average (MA): the moving-average model 

• ํŠธ๋ Œ๋“œ (ํ‰๊ท  or ์‹œ๊ณ„์—ด ๊ทธ๋ž˜ํ”„์—์„œ y ๊ฐ’) ๊ฐ€ ๋ณ€ํ™”ํ•˜๋Š” ์ƒํ™ฉ์— ์ ํ•ฉํ•œ ํšŒ๊ท€๋ชจ๋ธ 

• It is called a moving-average model because it uses the concept of a window that slides along the series by its own size. 

• โ‘ก ๋ฒˆ ๋ถ€๋ถ„ : ๋งค๊ฐœ๋ณ€์ˆ˜ θ ์— ๊ณผ๊ฑฐ ์‹œ์ ์˜ ์˜ค์ฐจ๋ฅผ ๊ณฑํ•œ ๊ฒƒ์œผ๋กœ, ์ด์ „ ๋ฐ์ดํ„ฐ์˜ ์˜ค์ฐจ์—์„œ ํ˜„์žฌ ๋ฐ์ดํ„ฐ์˜ ์ƒํƒœ๋ฅผ ์ถ”๋ก ํ•œ๋‹ค. 

 

 

 

๐Ÿ”น  ARMA ๋ชจ๋ธ 

 

• ์ž๊ธฐํšŒ๊ท€ ์ด๋™ํ‰๊ท  ๋ชจ๋ธ, ์ฃผ๋กœ ์—ฐ๊ตฌ๊ธฐ๊ด€์—์„œ ์‚ฌ์šฉํ•œ๋‹ค. 

• Uses past data from both the AR and the MA perspectives (see the combined formula below). 
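Combining the AR(p) and MA(q) forms above gives the ARMA(p, q) model:

$$Z_t = \phi_1 Z_{t-1} + \cdots + \phi_p Z_{t-p} + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q}$$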

 

 

 

 

 

๐Ÿ”น  ARIMA ๋ชจ๋ธ 

 

• ์ž๊ธฐ ํšŒ๊ท€ ๋ˆ„์ (integrated) ์ด๋™ ํ‰๊ท  ๋ชจ๋ธ : AR๊ณผ MA ๋ฅผ ๋ชจ๋‘ ๊ณ ๋ คํ•˜๋Š” ๋ชจํ˜•์ธ๋ฐ, ARMA ์™€ ๋‹ฌ๋ฆฌ ๊ณผ๊ฑฐ ๋ฐ์ดํ„ฐ์˜ ์„ ํ˜•์  ๊ด€๊ณ„ ๋ฟ ์•„๋‹ˆ๋ผ ์ถ”์„ธ๊นŒ์ง€ ๊ณ ๋ คํ•œ ๋ชจ๋ธ์ด๋‹ค. 

 

from statsmodels.tsa.arima_model import ARIMA  # this module was removed in statsmodels >= 0.13; use statsmodels.tsa.arima.model.ARIMA instead

 

• ARIMA(data, order = (p,d,q)) 

 

โ–ธp : ์ž๊ธฐํšŒ๊ท€์ฐจ์ˆ˜

โ–ธd : ์ฐจ๋ถ„์ฐจ์ˆ˜

โ–ธq : ์ด๋™ํ‰๊ท ์ฐจ์ˆ˜ 

โ–ธmodel.fit() : ํ›ˆ๋ จ 

โ–ธmodel.forecast() : ์˜ˆ์ธก 

 

 

 

 

3๏ธโƒฃ  RNN 


 

๐Ÿ”น  Recurrent Neural Network 

 

 

• ์ด์ „ ์€๋‹‰์ธต์ด ํ˜„์žฌ ์€๋‹‰์ธต์˜ ์ž…๋ ฅ์ด ๋˜๋ฉด์„œ ๋ฐ˜๋ณต๋˜๋Š” ์ˆœํ™˜ ๊ตฌ์กฐ๋ฅผ ๊ฐ–๋Š”๋‹ค. 

• ์ด์ „์˜ ์ •๋ณด๋ฅผ ๊ธฐ์–ตํ•˜๊ณ  ์žˆ๊ธฐ ๋•Œ๋ฌธ์— ์ตœ์ข…์ ์œผ๋กœ ๋‚จ๊ฒจ์ง„ ๊ธฐ์–ต์€ ๋ชจ๋“  ์ž…๋ ฅ ์ „์ฒด๋ฅผ ์š”์•ฝํ•œ ์ •๋ณด๊ฐ€ ๋œ๋‹ค. 

• ์Œ์„ฑ์ธ์‹, ๋‹จ์–ด์˜ ์˜๋ฏธํŒ๋‹จ ๋ฐ ๋Œ€ํ™” ๋“ฑ์˜ ์ž์—ฐ์–ด์ฒ˜๋ฆฌ์— ํ™œ์šฉ๋˜๊ฑฐ๋‚˜ ์†๊ธ€์”จ, ์„ผ์„œ๋ฐ์ดํ„ฐ ๋“ฑ์˜ ์‹œ๊ณ„์—ด ๋ฐ์ดํ„ฐ ์ฒ˜๋ฆฌ์— ํ™œ์šฉ๋œ๋‹ค. 

 

๐Ÿ”น  ๋‹ค์–‘ํ•œ ์œ ํ˜•์˜ RNN 

 

 

 

 

์ผ๋Œ€๋‹ค : ์ž…๋ ฅ์ด ํ•˜๋‚˜๊ณ  ์ถœ๋ ฅ์ด ๋‹ค์ˆ˜์ธ ๊ตฌ์กฐ์ด๋‹ค. ์ด๋ฏธ์ง€๋ฅผ ์ž…๋ ฅํ•ด, ์ด๋ฏธ์ง€์— ๋Œ€ํ•œ ์„ค๋ช…์„ ๋ฌธ์žฅ์œผ๋กœ ์ถœ๋ ฅํ•˜๋Š” ์ด๋ฏธ์ง€ ์บก์…˜์ด ๋Œ€ํ‘œ์ ์ธ ์‚ฌ๋ก€์ด๋‹ค. 

 

• ๋‹ค๋Œ€์ผ : ์ž…๋ ฅ์ด ๋‹ค์ˆ˜๊ณ  ์ถœ๋ ฅ์ด ํ•˜๋‚˜์ธ ๊ตฌ์กฐ๋กœ, ๋ฌธ์žฅ์„ ์ž…๋ ฅํ•ด ๊ธ/๋ถ€์ •์„ ์ถœ๋ ฅํ•˜๋Š” ๊ฐ์„ฑ๋ถ„์„๊ธฐ์—์„œ ์‚ฌ์šฉ๋˜๋Š” ๊ตฌ์กฐ์ด๋‹ค. 

 

# ๋‹ค๋Œ€์ผ ๋ชจ๋ธ 

self.em = nn.Embedding(len(TEXT.vocab.stoi), embedding_dim) # embedding layer 
self.rnn = nn.RNNCell(input_dim, hidden_size) # RNN cell 
self.fc1 = nn.Linear(hidden_size, 256) # fully connected layer
self.fc2 = nn.Linear(256,3) # output layer

 

 

• ๋‹ค๋Œ€๋‹ค : ์ž…๋ ฅ๊ณผ ์ถœ๋ ฅ์ด ๋‹ค์ˆ˜์ธ ๊ตฌ์กฐ, ์ž๋™๋ฒˆ์—ญ๊ธฐ๊ฐ€ ๋Œ€ํ‘œ์ ์ธ ์‚ฌ๋ก€์ด๋‹ค. ํŒŒ์ดํ† ์น˜์—์„œ๋Š” ์•„๋ž˜์˜ ํ•œ์ค„ ์ฝ”๋“œ๋กœ ๊ฐ„๋‹จํ•˜๊ฒŒ ๊ตฌํ˜„์ด ๊ฐ€๋Šฅํ•˜๋‚˜, ํŒŒ์ดํ† ์น˜์—์„œ๋Š” seq2seq ๊ตฌ์กฐ๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ์‹์œผ๋กœ ๊ตฌํ˜„๋œ๋‹ค. 

 

keras.layers.SimpleRNN(100, return_sequences = True, name='RNN') 

โ†ช ํ…์„œํ”Œ๋กœ์šฐ์—์„œ๋Š” return_sequences =True ์˜ต์…˜์œผ๋กœ ์‹œํ€€์Šค๋ฅผ ๋ฆฌํ„ดํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•œ๋‹ค

 

โ–ธ ํŒŒ์ดํ† ์น˜๋กœ ๊ตฌํ˜„ 

 

# ๋‹ค๋Œ€๋‹ค ๋ชจ๋ธ 

Seq2Seq(
  (encoder): Encoder(
    (embedding): Embedding(7855, 256)
    (rnn): LSTM(256, 512, num_layers=2, dropout=0.5)
    (dropout): Dropout(p=0.5, inplace=False)
  )
  (decoder): Decoder(
    (embedding): Embedding(5893, 256)
    (rnn): LSTM(256, 512, num_layers=2, dropout=0.5)
    (fc_out): Linear(in_features=512, out_features=5893, bias=True)
    (dropout): Dropout(p=0.5, inplace=False)
  )
)

 

 

 

 

๋™๊ธฐํ™” ๋‹ค๋Œ€๋‹ค : ์ž…๋ ฅ๊ณผ ์ถœ๋ ฅ์ด ๋‹ค์ˆ˜์ธ ๊ตฌ์กฐ๋กœ, ํ”„๋ ˆ์ž„ ์ˆ˜์ค€์˜ ๋น„๋””์˜ค ๋ถ„๋ฅ˜๊ฐ€ ๋Œ€ํ‘œ์ ์ธ ์‚ฌ๋ก€์ด๋‹ค. 

 

 

 

๐Ÿ”น  RNN ๊ณ„์ธต๊ณผ ์…€ 

 

 

• Cell : a unit that processes only a single time step. As the building block of an RNN layer used in the actual computation, it takes a single input and the previous state and produces an output and a new state. 

 

โ†ช ์…€ ์œ ํ˜• 

 

nn.RNNCell : the RNN cell corresponding to the SimpleRNN layer 
nn.GRUCell : the GRU cell corresponding to the GRU layer
nn.LSTMCell : the LSTM cell corresponding to the LSTM layer 

 

• Layer : wraps a cell, applying the same cell across multiple time steps. 

 

 

๐Ÿ‘€ ํŒŒ์ดํ† ์น˜์—์„œ๋Š” ๋ ˆ์ด์–ด์™€ ์…€์„ ๋ถ„๋ฆฌํ•ด์„œ ๊ทœํ˜„ํ•˜๋Š” ๊ฒƒ์ด ๊ฐ€๋Šฅํ•˜๋‹ค. 

 

 

 

 

4๏ธโƒฃ  RNN ๊ตฌ์กฐ 


 

๐Ÿ”น ๊ตฌ์กฐ 

 

 

• h(t-1) is obtained from x(t-1); at the next step, h(t-1) and x(t) are used together, so both past and current information are reflected. 

 

• ๊ฐ€์ค‘์น˜ 

โ†ช Wxh : ์ž…๋ ฅ์ธต์—์„œ ์€๋‹‰์ธต์œผ๋กœ ์ „๋‹ฌ๋˜๋Š” ๊ฐ€์ค‘์น˜

โ†ช Whh : t ์‹œ์ ์˜ ์€๋‹‰์ธต์—์„œ t+1 ์‹œ์ ์˜ ์€๋‹‰์ธต์œผ๋กœ ์ „๋‹ฌ๋˜๋Š” ๊ฐ€์ค‘์น˜ 

โ†ช Why : ์€๋‹‰์ธต์—์„œ ์ถœ๋ ฅ์ธต์œผ๋กœ ์ „๋‹ฌ๋˜๋Š” ๊ฐ€์ค‘์น˜ 

โ–ธ ๋ชจ๋“  ์‹œ์ ์— ๊ฐ€์ค‘์น˜ ๊ฐ’์€ ๋™์ผํ•˜๊ฒŒ ์ ์šฉ๋œ๋‹ค. 

 

 

๐Ÿ”น t ์‹œ์ ์˜ RNN ๊ณ„์‚ฐ 

 

โ‘  ์€๋‹‰์ธต ๊ณ„์‚ฐ 

 

โ†ช ํ˜„์žฌ ์ž…๋ ฅ๊ฐ’๊ณผ ์ด์ „ ์‹œ์ ์˜ ์€๋‹‰์ธต์„ ๊ฐ€์ค‘์น˜ ๊ณ„์‚ฐํ•œ ํ›„, ํ•˜์ดํผ๋ณผ๋ฆญ ํƒ„์  ํŠธ ํ™œ์„ฑํ™” ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•ด ํ˜„์žฌ ์‹œ์ ์˜ ์€๋‹‰์ธต์„ ๊ณ„์‚ฐํ•œ๋‹ค. 

 

โ‘ก ์ถœ๋ ฅ์ธต 

 

โ†ช  ์ถœ๋ ฅ์ธต ๊ฐ€์ค‘์น˜์™€ ํ˜„์žฌ ์€๋‹‰์ธต์„ ๊ณฑํ•˜์—ฌ ์†Œํ”„ํŠธ๋งฅ์Šค ํ•จ์ˆ˜๋ฅผ ์ ์šฉํ•œ๋‹ค. 

 

 

โ‘ข ์ˆœ๋ฐฉํ–ฅ ํ•™์Šต ๋ฐ ์˜ค์ฐจ E

 

 

โ†ช ์‹ฌ์ธต ์‹ ๊ฒฝ๋ง์—์„œ ์ผ๋ฐ˜์ ์ธ feedforward ์ „๋ฐฉํ–ฅ ํ•™์Šต๊ณผ ๋‹ฌ๋ฆฌ  ๊ฐ ๋‹จ๊ณ„ t ๋งˆ๋‹ค ์˜ค์ฐจ๋ฅผ ์ธก์ •ํ•œ๋‹ค. 

 

 

โ‘ฃ ์—ญ์ „ํŒŒ

 

 

โ†ช RNN ์—์„œ๋Š” BPTT (backpropagation through time) ๋ฅผ ์ด์šฉํ•˜์—ฌ ๋ชจ๋“  ๋‹จ๊ณ„๋งˆ๋‹ค ์ฒ˜์Œ๋ถ€ํ„ฐ ๋๊นŒ์ง€ ์—ญ์ „ํŒŒํ•œ๋‹ค. 

โ†ช ๊ฐ ๋‹จ๊ณ„๋งˆ๋‹ค ์˜ค์ฐจ๋ฅผ ์ธก์ •ํ•˜๊ณ  ์ด์ „ ๋‹จ๊ณ„๋กœ ์ „๋‹ฌ๋˜๋Š” ๊ฒƒ์„ ์˜๋ฏธํ•˜๋Š”๋ฐ, โ‘ข ๊ณผ์ •์—์„œ ๊ตฌํ•œ ์˜ค์ฐจ๋ฅผ ์ด์šฉํ•ด ๊ฐ€์ค‘์น˜์™€ bias ๋ฅผ ์—…๋ฐ์ดํŠธ ํ•œ๋‹ค. 

โ†ช ๊ธฐ์šธ๊ธฐ ์†Œ๋ฉธ๋ฌธ์ œ : ์˜ค์ฐจ๊ฐ€ ๋ฉ€๋ฆฌ ์ „ํŒŒ๋ ๋•Œ ๊ณ„์‚ฐ๋Ÿ‰์ด ๋งŽ์•„์ง€๊ณ  ์ „ํŒŒ๋˜๋Š” ์–‘์ด ์ ์ฐจ ์ ์–ด์ง€๋Š” ๋ฌธ์ œ์  

 

 

 

๐Ÿ”น RNN ์…€ ๊ตฌํ˜„ : IMDB ์˜ํ™”๋ฆฌ๋ทฐ ๊ธ๋ถ€์ • ์˜ˆ์ œ 

 

โ‘  ๋ฐ์ดํ„ฐ ์ค€๋น„ 

 

(1) ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ๊ฐ€์ ธ์˜ค๊ธฐ 

 

pip install torchtext==0.10.1 # restart the runtime after installing 

import torch 
import torchtext 
import numpy as np 
import torch.nn as nn 
import torch.nn.functional as F 
import time

 

 

โ–ธ torchtext : NLP ๋ถ„์•ผ์—์„œ ์‚ฌ์šฉํ•˜๋Š” ๋ฐ์ดํ„ฐ๋กœ๋”๋กœ, ํŒŒ์ผ ๊ฐ€์ ธ์˜ค๊ธฐ, ํ† ํฐํ™”, ๋‹จ์–ด ์ง‘ํ•ฉ ์ƒ์„ฑ, ์ธ์ฝ”๋”ฉ, ๋‹จ์–ด๋ฒกํ„ฐ ์ƒ์„ฑ ๋“ฑ์˜ ์ž‘์—…์„ ์ง€์›ํ•œ๋‹ค. 

 

โ–ธ ์šฉ์–ด์ •๋ฆฌ

 

โœ” ํ† ํฐํ™” : ํ…์ŠคํŠธ๋ฅผ ๋ฌธ์žฅ์ด๋‚˜ ๋‹จ์–ด๋กœ ๋ถ„๋ฆฌํ•˜๋Š” ๊ฒƒ 

โœ” ๋‹จ์–ด์ง‘ํ•ฉ vocabulary : ์ค‘๋ณต์„ ์ œ๊ฑฐํ•œ ํ…์ŠคํŠธ์˜ ์ด ๋‹จ์–ด์˜ ์ง‘ํ•ฉ 

โœ” ์ธ์ฝ”๋”ฉ : ์‚ฌ๋žŒ์˜ ์–ธ์–ด์ธ ๋ฌธ์ž๋ฅผ ์ปดํ“จํ„ฐ์˜ ์–ธ์–ด์ธ ์ˆซ์ž๋กœ ๋ฐ”๊พธ๋Š” ์ž‘์—… 

โœ” ๋‹จ์–ด๋ฒกํ„ฐ : ๋‹จ์–ด์˜ ์˜๋ฏธ๋ฅผ ๋‚˜ํƒ€๋‚ด๋Š” ์ˆซ์ž ๋ฒกํ„ฐ 

 

 

(2) ๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ 

 

from torchtext.legacy.data import Field
start = time.time() 
TEXT = torchtext.legacy.data.Field(lower = True, fix_length = 200, batch_first=False) 
LABEL = torchtext.legacy.data.Field(sequential = False)

 

โ–ธ torchtext.legacy.data.Field

 

โœ” lower = True : ๋Œ€๋ฌธ์ž๋ฅผ ๋ชจ๋‘ ์†Œ๋ฌธ์ž๋กœ ๋ณ€๊ฒฝ 

 

โœ” fix_length = 200 : ๊ณ ์ •๋œ ๊ธธ์ด์˜ ๋ฐ์ดํ„ฐ๋ฅผ ์–ป์„ ์ˆ˜ ์žˆ๋‹ค. ์—ฌ๊ธฐ์ฒ˜๋Ÿผ 200์œผ๋กœ ๊ณ ์ •ํ•œ๋‹ค๋ฉด ๋ฐ์ดํ„ฐ์˜ ๊ธธ์ด๋ฅผ 200์œผ๋กœ ๋งž์ถ”๋Š” ๊ฒƒ์ด๊ณ , ๋งŒ์•ฝ 200๋ณด๋‹ค ์งง์€ ๊ธธ์ด๋ผ๋ฉด ํŒจ๋”ฉ ์ž‘์—…์„ ํ†ตํ•ด ์ด์— ๋งž์ถ”์–ด ์ค€๋‹ค. 

 

โœ” batch_first = True : ์‹ ๊ฒฝ๋ง์— ์ž…๋ ฅ๋˜๋Š” ํ…์„œ์˜ ์ฒซ๋ฒˆ์งธ ์ฐจ์›์˜ ๊ฐ’์ด ๋ฐฐ์น˜ํฌ๊ธฐ๊ฐ€ ๋  ์ˆ˜ ์žˆ๋„๋ก ํ•œ๋‹ค. ์›๋ž˜ ๋ชจ๋ธ์˜ ๋„คํŠธ์›Œํฌ๋กœ ์ž…๋ ฅ๋˜๋Š” ๋ฐ์ดํ„ฐ๋Š” (seq_len, batch_size, hidden_size) ํ˜•ํƒœ์ธ๋ฐ, ์ด ์˜ต์…˜์„ True ๋กœ ์„ค์ •ํ•˜๊ฒŒ ๋˜๋ฉด (batch_size, seq_len, hidden_size) ํ˜•ํƒœ๋กœ ๋ณ€๊ฒฝ๋œ๋‹ค.  

 

โ• ํŒŒ์ดํ† ์น˜๋Š” ๊ฐ ๊ณ„์ธต๋ณ„ ๋ฐ์ดํ„ฐ ํ˜•ํƒœ๋ฅผ ๋งž์ถ”๋Š” ๊ฒƒ์—์„œ ์‹œ์ž‘ํ•˜์—ฌ ๋๋‚ ์ •๋„๋กœ ๋งค์šฐ ์ค‘์š”ํ•˜๊ธฐ ๋•Œ๋ฌธ์— ์ž…๋ ฅ์ธต, ์€๋‹‰์ธต ๋ฐ์ดํ„ฐ๋“ค์— ๋Œ€ํ•ด ๊ฐ ์ˆซ์ž๊ฐ€ ์˜๋ฏธํ•˜๋Š” ๊ฒƒ์„ ์ดํ•ดํ•ด์•ผ ํ•œ๋‹ค. 

 

โœ” sequential = False : ๋ฐ์ดํ„ฐ์— ์ˆœ์„œ๊ฐ€ ์žˆ๋Š”์ง€ ๋‚˜ํƒ€๋‚ด๋Š” ๊ฒƒ์œผ๋กœ ๊ธฐ๋ณธ๊ฐ’์€ True ์ด๋‹ค. ์˜ˆ์ œ์˜ ๋ ˆ์ด๋ธ”์€ ๊ธ๋ถ€์ • ๊ฐ’๋งŒ ๊ฐ€์ง€๋ฏ€๋กœ False ๋กœ ์„ค์ •ํ•œ๋‹ค. 

 

 

(3) ๋ฐ์ดํ„ฐ์…‹ ์ค€๋น„ 

 

from torchtext.legacy import datasets 
train_data, test_data = datasets.IMDB.splits(TEXT, LABEL)

 

โ–ธ splits : ์ „์ฒด ๋ฐ์ดํ„ฐ์…‹์„ TEXT ์™€ LABEL ๋กœ ๋ถ„ํ• ํ•˜์—ฌ, TEXT ๋Š” ํ›ˆ๋ จ์šฉ๋„๋กœ LABEL ์€ ํ…Œ์ŠคํŠธ ์šฉ๋„๋กœ ์‚ฌ์šฉํ•œ๋‹ค. 

 

โ–ธ ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ๋Š” text ์™€ label ์„ ๊ฐ€์ง€๋Š” ์‚ฌ์ „ํ˜•์‹์œผ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ๋‹ค. { 'text' : ['A','B', ..] , 'label' : 'pos' }

 

 

 

 

โ‘ก ๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ 

 

(1) ํ…์ŠคํŠธ ์ „์ฒ˜๋ฆฌ 

 

# ๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ 
import string 

for example in train_data.examples : # examples: iterate over the dataset's contents 
  text = [x.lower() for x in vars(example)['text']] # convert to lowercase 
  text = [x.replace("<br","") for x in text] # replace "<br" with the empty string 
  text = [''.join(c for c in s if c not in string.punctuation) for s in text] # remove punctuation 
  text = [s for s in text if s] # drop empty strings 
  vars(example)['text'] = text

 

โ–ธ ๋ถˆํ•„์š”ํ•œ ๋ฌธ์ž ์ œ๊ฑฐ, ๊ณต๋ฐฑ์ฒ˜๋ฆฌ ๋“ฑ์ด ํฌํ•จ๋œ๋‹ค. 

 

 

(2) ํ›ˆ๋ จ๊ณผ ๊ฒ€์ฆ ๋ฐ์ดํ„ฐ์…‹ ๋ถ„๋ฆฌ 

 

import random 
train_data , valid_data = train_data.split(random_state = random.seed(0), split_ratio = 0.8)

 

โ–ธ random_state : ๋ฐ์ดํ„ฐ ๋ถ„ํ•  ์‹œ ๋ฐ์ดํ„ฐ๊ฐ€ ์ž„์˜๋กœ ์„ž์ธ ์ƒํƒœ์—์„œ ๋ถ„ํ• ๋œ๋‹ค. seed ๊ฐ’์„ ์‚ฌ์šฉํ•˜๋ฉด ๋™์ผํ•œ ์ฝ”๋“œ๋ฅผ ์—ฌ๋Ÿฌ๋ฒˆ ์ˆ˜ํ–‰ํ•ด๋„ ๋™์ผํ•œ ๊ฐ’์˜ ๋ฐ์ดํ„ฐ๋ฅผ ๋ฐ˜ํ™˜ํ•œ๋‹ค. 

 

 

(3) ๋‹จ์–ด์ง‘ํ•ฉ ๋งŒ๋“ค๊ธฐ : build.vocab() 

 

#๋‹จ์–ด์ง‘ํ•ฉ ๋งŒ๋“ค๊ธฐ 
TEXT.build_vocab(train_data, max_size = 10000, min_freq = 10, vectors=None) 
LABEL.build_vocab(train_data)

 

โ–ธ ๋‹จ์–ด์ง‘ํ•ฉ : IMDB ๋ฐ์ดํ„ฐ์…‹์— ํฌํ•จ๋œ ๋‹จ์–ด๋“ค์„ ์ด์šฉํ•ด ํ•˜๋‚˜์˜ ๋”•์…”๋„ˆ๋ฆฌ์™€ ๊ฐ™์€ ์ง‘ํ•ฉ์„ ๋งŒ๋“œ๋Š” ๊ฒƒ์œผ๋กœ ๋‹จ์–ด๋“ค์˜ ์ค‘๋ณต์€ ์ œ๊ฑฐ๋œ ์ƒํƒœ์—์„œ ์ง„ํ–‰๋œ๋‹ค. 

 

โ–ธ max_size : ๋‹จ์–ด ์ง‘ํ•ฉ์˜ ํฌ๊ธฐ๋กœ ๋‹จ์–ด ์ง‘ํ•ฉ์— ํฌํ•จ๋˜๋Š” ์–ดํœ˜ ์ˆ˜ 

 

โ–ธ min_freq : ํŠน์ • ๋‹จ์–ด์˜ ์ตœ์†Œ ๋“ฑ์žฅ ํšŸ์ˆ˜๋กœ, ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ์…‹์—์„œ ํŠน์ • ๋‹จ์–ด๊ฐ€ ์ตœ์†Œ 10๋ฒˆ ๋“ฑ์žฅํ•œ ๊ฒƒ๋งŒ ๋‹จ์–ด์ง‘ํ•ฉ์— ํฌํ•จํ•˜๊ฒ ๋‹ค๋Š” ์˜๋ฏธ์ด๋‹ค. 

 

โ–ธ vectors : ์ž„๋ฒ ๋”ฉ ๋ฒกํ„ฐ๋ฅผ ์ง€์ •ํ•  ์ˆ˜ ์žˆ๋‹ค. Word2Vec, Glove ๋“ฑ์ด ์žˆ์œผ๋ฉฐ ํŒŒ์ดํ† ์น˜์—์„œ๋„ nn.embedding() ์„ ํ†ตํ•ด ๋žœ๋คํ•œ ์ˆซ์ž๊ฐ’์œผ๋กœ ๋ณ€ํ™˜ํ•˜์—ฌ ๊ฐ€์ค‘์น˜๋ฅผ ํ•™์Šตํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์ œ๊ณตํ•œ๋‹ค.

 

 

print('Number of tokens in the TEXT vocabulary (no duplicates):', len(TEXT.vocab)) 
# 10002

print('Number of tokens in the LABEL vocabulary (no duplicates):', len(LABEL.vocab)) 
# 3 : we would expect only the two labels (pos and neg); the extra token turns out to be <unk>, as shown below

 

 

(4) ๋‹จ์–ด ์ง‘ํ•ฉ ํ™•์ธ 

 

# ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ์…‹์˜ ๋‹จ์–ด์ง‘ํ•ฉ ํ™•์ธ 

print(LABEL.vocab.stoi)


defaultdict(<bound method Vocab._default_unk_index of <torchtext.legacy.vocab.Vocab object at 0x7f134796de10>>, {'<unk>': 0, 'pos': 1, 'neg': 2})

 

→ <unk> : stands for words not in the vocabulary 

 

 

(5) ๋ฐ์ดํ„ฐ์…‹ ๋ฉ”๋ชจ๋ฆฌ๋กœ ๊ฐ€์ ธ์˜ค๊ธฐ 

 

# ๋ฐ์ดํ„ฐ์…‹ ๋ฉ”๋ชจ๋ฆฌ๋กœ ๊ฐ€์ ธ์˜ค๊ธฐ 

BATCH_SIZE = 64 
device = torch.device('cuda:0' if torch.cude.is_available() else 'cpu') 

embedding_dim = 100 # ๊ฐ ๋‹จ์–ด๋ฅผ 100์ฐจ์›์œผ๋กœ ์กฐ์ •(์ž„๋ฒ ๋”ฉ ๊ณ„์ธต์„ ํ†ต๊ณผํ•œ ํ›„ ๊ฐ ๋ฒกํ„ฐ์˜ ํฌ๊ธฐ) 
hidden_size = 300 

train_iterator, valid_iterator, test_iterator = torchtext.legacy.data.BucketIterator.splits(
    (train_data, valid_data, test_data),
    batch_size = BATCH_SIZE, 
    device = device
)

 

โ–ธ hidden size : ์€๋‹‰์ธต์˜ ์œ ๋‹›(๋‰ด๋Ÿฐ) ๊ฐœ์ˆ˜๋ฅผ ์ •ํ•œ๋‹ค. ์ผ๋ฐ˜์ ์œผ๋กœ ๋น„์„ ํ˜• ๋ฌธ์ œ๋ฅผ ์ข€ ๋” ํ•™์Šตํ•  ์ˆ˜ ์žˆ๋„๋ก, ์€๋‹‰์ธต์˜ ์œ ๋‹› ๊ฐœ์ˆ˜๋ฅผ ๋Š˜๋ฆฌ๊ธฐ๋ณด๋‹จ, ๊ณ„์ธต ์ž์ฒด์˜ ๊ฐœ์ˆ˜๋ฅผ ๋Š˜๋ฆฌ๋Š” ๊ฒƒ์ด ์„ฑ๋Šฅ์— ๋” ์ข‹๋‹ค. ์ตœ์ ์˜ ์€๋‹‰์ธต ๊ฐœ์ˆ˜์™€ ์œ ๋‹› ๊ฐœ์ˆ˜๋ฅผ ์ฐพ๋Š” ๊ฒƒ์€ ๋งค์šฐ ์–ด๋ ค์šด ์ผ์ด๊ธฐ ๋•Œ๋ฌธ์— ๊ณผ์ ํ•ฉ์ด ๋ฐœ์ƒํ•˜์ง€ ์•Š๋„๋ก, ์‹ค์ œ ํ•„์š”ํ•œ ๊ฐœ์ˆ˜๋ณด๋‹ค ๋” ๋งŽ์€ ์ธต๊ณผ ์œ ๋‹›์„ ๊ตฌ์„ฑํ•ด ๊ฐœ์ˆ˜๋ฅผ ์กฐ์ •ํ•ด ๋‚˜๊ฐ€๋Š” ๋ฐฉ์‹์„ ์‚ฌ์šฉํ•œ๋‹ค. 

 

โ–ธBucketIterator : dataloader ์™€ ๋น„์Šทํ•˜๊ฒŒ ๋ฐฐ์น˜ ํฌ๊ธฐ ๋‹จ์œ„๋กœ ๊ฐ’์„ ์ฐจ๋ก€๋กœ ๊บผ๋‚ด์–ด ๋ฉ”๋ชจ๋ฆฌ๋กœ ๊ฐ€์ ธ์˜ค๊ณ  ์‹ถ์„ ๋•Œ ์‚ฌ์šฉํ•œ๋‹ค. ๋น„์Šทํ•œ ๊ธธ์ด์˜ ๋ฐ์ดํ„ฐ๋ฅผ ํ•œ ๋ฐฐ์น˜์— ํ• ๋‹นํ•˜์—ฌ ํŒจ๋”ฉ์„ ์ตœ์†Œํ™” ์‹œ์ผœ์ค€๋‹ค. 

 

 

 

โ‘ข ์›Œ๋“œ ์ž„๋ฒ ๋”ฉ ๋ฐ RNN ์…€์ •์˜ 

 

โœ” ์•ž์„œ ๋‹จ์–ด์ง‘ํ•ฉ ์ƒ์„ฑ ๊ณผ์ •์—์„œ vectors=none ์œผ๋กœ ์„ค์ •ํ•˜์˜€์œผ๋ฏ€๋กœ ์ž„๋ฒ ๋”ฉ์ด ์ง„ํ–‰๋˜์ง€ ์•Š์•˜๊ธฐ ๋•Œ๋ฌธ์—, nn.Embedding() ์„ ์ด์šฉํ•ด ์ž„๋ฒ ๋”ฉ ์ฒ˜๋ฆฌ๋ฅผ ์‹œ์ผœ์ค€๋‹ค. 

 

class RNNCell_Encoder(nn.Module) : 

  def __init__(self, input_dim, hidden_size) : 
    super(RNNCell_Encoder, self).__init__() 
    self.rnn = nn.RNNCell(input_dim, hidden_size) # the RNN cell 

  def forward(self, inputs) : # inputs is the input sequence, shaped (seq_len, batch, embedding)
    bz = inputs.shape[1] # get the batch size 
    ht = torch.zeros((bz, hidden_size)).to(device) # initialize the state to zeros: (batch, hidden_size) 
    for word in inputs : 
      ht = self.rnn(word, ht) # feed each step's input together with the recurrently produced state 
    return ht

 

โ–ธ nn.RNNCell 

 

  • input_dim : the number of features in the training data; the input has the shape (batch, input_size)
  • hidden_size : the number of neurons in the hidden layer; the state has the shape (batch, hidden_size) 

 

โ–ธ ht = self.rnn(word, ht) 

 

  • ht (returned) : the new, current state
  • word : the current input vector X_t, shaped (batch, input_size) 
  • ht (argument) : the previous state, shaped (batch, hidden_size) 

 

class Net(nn.Module) : 

  def __init__(self) : 
    super(Net, self).__init__() 
    self.em = nn.Embedding(len(TEXT.vocab.stoi), embedding_dim) # embedding layer 
    self.rnn = RNNCell_Encoder(embedding_dim, hidden_size) 
    self.fc1 = nn.Linear(hidden_size, 256) 
    self.fc2 = nn.Linear(256,3) 
  
  def forward(self,x) : 
    x = self.em(x) 
    x = self.rnn(x) 
    x = F.relu(self.fc1(x)) 
    x = self.fc2(x) 
    return x

 

โ–ธ nn.Embedding : ์ž„๋ฒ ๋”ฉ ์ฒ˜๋ฆฌ๋ฅผ ์œ„ํ•œ ๊ตฌ๋ฌธ์œผ๋กœ ์ž„๋ฒ ๋”ฉ์„ ํ•  ๋‹จ์–ด ์ˆ˜ (๋‹จ์–ด์ง‘ํ•ฉํฌ๊ธฐ) ์™€  ์ž„๋ฒ ๋”ฉํ•  ๋ฒกํ„ฐ์˜ ์ฐจ์›์„ ์ง€์ •ํ•ด์ค€๋‹ค. 

 

 

 

 

 

 

โ‘ฃ ์˜ตํ‹ฐ๋งˆ์ด์ €์™€ ์†์‹คํ•จ์ˆ˜ ์ •์˜ 

 

model = Net() 
model.to(device) 

loss_fn = nn.CrossEntropyLoss()  
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)

 

โ–ธ nn.CrossEntropyLoss : ๋‹ค์ค‘๋ถ„๋ฅ˜์— ์‚ฌ์šฉ๋˜๋Š” ์†์‹คํ•จ์ˆ˜ 

 

 

โ‘ค ๋ชจ๋ธ ํ•™์Šต 

 

โ–ธ ๋ชจ๋ธ ํ•™์Šต์„ ์œ„ํ•œ ํ•จ์ˆ˜๋ฅผ ์ •์˜ 

 

# ๋ชจ๋ธ ํ•™์Šต 

def training(epoch, model, trainloader, validloader) : 

  correct = 0 
  total = 0 
  running_loss = 0 

  model.train() 

  for b in trainloader : 
    x,y = b.text, b.label # pull the text and label out of the batch 
    x,y = x.to(device) , y.to(device) # move the data to the device (CPU or GPU) the model uses 
    y_pred = model(x) 
    loss = loss_fn(y_pred, y) # compute the error with the CrossEntropyLoss loss function 
    optimizer.zero_grad() # reset the gradients 
    loss.backward() # backpropagation 
    optimizer.step() # update the parameters 

    with torch.no_grad() : # turning autograd off reduces memory usage and speeds up computation 
      y_pred = torch.argmax(y_pred, dim =1) 
      correct += (y_pred == y).sum().item() 
      total += y.size(0) 
      running_loss += loss.item() 
  
  epoch_loss = running_loss / len(trainloader.dataset) 
  # divide the accumulated error by the size of the dataset to get the loss for each epoch 
  epoch_acc = correct/total 

  valid_correct = 0 
  valid_total = 0 
  valid_running_loss = 0 

  model.eval() # automatically turns off layers (e.g. dropout) that must not be active during evaluation
  with torch.no_grad() : # use no_grad for evaluation/validation 
    for b in validloader : 
      x,y = b.text, b.label 
      x,y = x.to(device) , y.to(device) 
      y_pred = model(x) 
      loss = loss_fn(y_pred,y) 
      y_pred = torch.argmax(y_pred, dim =1) 
      valid_correct += (y_pred == y).sum().item() 
      valid_total += y.size(0) 
      valid_running_loss += loss.item() 

  epoch_valid_loss = valid_running_loss / len(validloader.dataset) 
  epoch_valid_acc = valid_correct / valid_total 

  print('epoch :', epoch, 
        'loss :', round(epoch_loss,3), 
        'accuracy : ', round(epoch_acc,3), 
        'valid_loss :', round(epoch_valid_loss,3), 
        'valid_accuracy :', round(epoch_valid_acc,3)
        )
  
  return epoch_loss, epoch_acc, epoch_valid_loss, epoch_valid_acc

 

 

โ–ธ ๋ชจ๋ธ ํ•™์Šต 

 

# ๋ชจ๋ธ ํ•™์Šต ์ง„ํ–‰ 

epochs = 5 
train_loss = [] 
train_acc = [] 
valid_loss = [] 
valid_acc = [] 

for epoch in range(epochs) : 
  epoch_loss, epoch_acc, epoch_valid_loss, epoch_valid_acc = training(epoch,
                                                                      model,
                                                                      train_iterator,
                                                                      valid_iterator)
  train_loss.append(epoch_loss) # loss when the model is applied to the training set 
  train_acc.append(epoch_acc) # accuracy on the training set 
  valid_loss.append(epoch_valid_loss) # loss on the validation set 
  valid_acc.append(epoch_valid_acc) # accuracy on the validation set 

end = time.time() 
#print(end-start)

 

โ†ช ์—ํฌํฌ๊ฐ€ 5๋ผ ์ •ํ™•๋„๋Š” ๋‚ฎ์ง€๋งŒ ํ•™์Šต๊ณผ ๊ฒ€์ฆ ๋ฐ์ดํ„ฐ์…‹์— ๋Œ€ํ•œ ์˜ค์ฐจ๊ฐ€ ์œ ์‚ฌํ•˜๋ฏ€๋กœ ๊ณผ์ ํ•ฉ์€ ๋ฐœ์ƒํ•˜์ง€ ์•Š์Œ์„ ํ™•์ธํ•ด๋ณผ ์ˆ˜ ์žˆ๋‹ค. 

 

โ‘ฅ ๋ชจ๋ธ ์˜ˆ์ธก 

 

โ–ธ ํ…Œ์ŠคํŠธ์…‹์— ๋Œ€ํ•œ ๋ชจ๋ธ ์˜ˆ์ธกํ•จ์ˆ˜ ์ •์˜ 

 

def evaluate(epoch, model, testloader) : 
  test_correct = 0 
  test_total = 0 
  test_running_loss = 0 

  model.eval() 
  with torch.no_grad() : 
    for b in testloader : 
      x,y = b.text, b.label 
      x,y = x.to(device) , y.to(device) 
      y_pred = model(x) 
      loss = loss_fn(y_pred, y) 
      y_pred = torch.argmax(y_pred, dim=1) 
      test_correct += (y_pred == y).sum().item() 
      test_total += y.size(0) 
      test_running_loss += loss.item() 
  
  epoch_test_loss = test_running_loss/len(testloader.dataset) 
  epoch_test_acc = test_correct/test_total 

  print('epoch : ', epoch, 
        'test_loss : ', round(epoch_test_loss,3), 
        'test_accuracy :', round(epoch_test_acc,3)) 
  
  return epoch_test_loss, epoch_test_acc

 

 

โ–ธ ํ…Œ์ŠคํŠธ์…‹์— ๋Œ€ํ•œ ๋ชจ๋ธ ์˜ˆ์ธก ๊ฒฐ๊ณผ ํ™•์ธ 

 

epochs = 5  
test_loss = [] 
test_acc = [] 

for epoch in range(epochs) : 
  epoch_test_loss, epoch_test_acc = evaluate(epoch, model, test_iterator) 
  test_loss.append(epoch_test_loss) 
  test_acc.append(epoch_test_acc) 

end = time.time()

 

๋” ๋†’์€ ์ •ํ™•๋„๋ฅผ ์›ํ•œ๋‹ค๋ฉด ์—ํฌํฌ ํšŸ์ˆ˜๋ฅผ ๋Š˜๋ฆฌ๋ฉด ๋œ๋‹ค. 

 

 

 

 

๐Ÿ”น RNN ๊ณ„์ธต ๊ตฌํ˜„ 

โ•  RNN ์…€ ๋„คํŠธ์›Œํฌ์™€ ํฌ๊ฒŒ ๋‹ค๋ฅด์ง„ ์•Š๋‹ค. ๋ฏธ์„ธํ•œ ์ฐจ์ด ์œ„์ฃผ๋กœ ์‚ดํŽด๋ณด๊ธฐ!

 

โ‘  ๋ฐ์ดํ„ฐ ๋กœ๋“œ ๋ฐ ์ „์ฒ˜๋ฆฌ (RNN ์…€ ๊ณผ์ •๊ณผ ๊ฐ™์œผ๋ฏ€๋กœ ์ƒ๋žต) 

 

โ‘ก ๋ชจ๋ธ ๋„คํŠธ์›Œํฌ ์ •์˜ 

 

โ–ธ ๋ณ€์ˆ˜๊ฐ’ ์ง€์ • 

 

vocab_size = len(TEXT.vocab) # the size of the vocabulary built from the movie reviews 
n_classes = 2  # positive / negative

 

โ–ธ RNN layer ๋„คํŠธ์›Œํฌ 

 

class BasicRNN(nn.Module) : 
  def __init__(self, n_layers, hidden_dim, n_vocab, embed_dim, n_classes, dropout_p=0.2) :
    super(BasicRNN,self).__init__() 
    self.n_layers = n_layers # the number of RNN layers 
    self.embed = nn.Embedding(n_vocab, embed_dim) # word embedding 
    self.hidden_dim = hidden_dim 
    self.dropout = nn.Dropout(dropout_p) # dropout 

    self.rnn = nn.RNN(embed_dim, self.hidden_dim, num_layers = self.n_layers, batch_first = True) 
    self.out = nn.Linear(self.hidden_dim, n_classes) 
  
  def forward(self,x) : 
    x = self.embed(x) # convert tokens (numbers) to vectors 
    h_0 = self._init_state(batch_size = x.size(0)) # initialize the first hidden state to zeros
    x,_ = self.rnn(x, h_0) # the RNN layer
    h_t = x[:,-1,:] # the output of the last word after the whole network (the final hidden state) 

    h_t = self.dropout(h_t) # apply dropout to the final hidden state 
    logit = torch.sigmoid(self.out(h_t)) 
    return logit 
  
  def _init_state(self, batch_size =1) : 
    weight = next(self.parameters()).data # grab a parameter tensor to match the model's dtype and device 
    return weight.new(self.n_layers, batch_size, self.hidden_dim).zero_() 
    # create a hidden-state tensor of shape (num_layers, batch_size, hidden_dim), zero it, and return it

 

โ†ช nn.RNN 

 

  • embed_dim : the number of input features (here, the embedding dimension) 
  • hidden_dim : the number of neurons in the hidden layer 
  • num_layers : the number of RNN layers 
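A shape sketch for nn.RNN with batch_first=True, matching BasicRNN above (the batch size 32 is arbitrary; 200/128/256 follow this example):

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=128, hidden_size=256, num_layers=1, batch_first=True)
x = torch.randn(32, 200, 128)  # (batch, seq_len, embed_dim)
h0 = torch.zeros(1, 32, 256)   # (num_layers, batch, hidden_dim), as in _init_state
out, h_n = rnn(x, h0)
print(out.shape)               # torch.Size([32, 200, 256]): the output at every step
print(out[:, -1, :].shape)     # torch.Size([32, 256]): the last step, i.e. h_t above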

 

 

โ‘ข ์†์‹คํ•จ์ˆ˜์™€ ์˜ตํ‹ฐ๋งˆ์ด์ € ์„ค์ • 

 

model = BasicRNN(n_layers = 1, hidden_dim = 256, n_vocab = vocab_size, embed_dim = 128, n_classes = n_classes, dropout_p = 0.5)
model.to(device)

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)

 

 

โ‘ฃ ๋ชจ๋ธ ํ›ˆ๋ จ ๋ฐ ํ‰๊ฐ€ 

 

 

def train(model, optimizer, train_iter):
    model.train()
    for b, batch in enumerate(train_iter):
        x, y = batch.text.to(device), batch.label.to(device)
        y.data.sub_(1) 
        # the vocabulary maps pos to 1 and neg to 2, so subtract 1 to turn the labels into 0 and 1


        optimizer.zero_grad()

        logit = model(x)
        loss = F.cross_entropy(logit, y)
        loss.backward()
        optimizer.step()

        if b % 50 == 0:
            print("Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}".format(e,
                                                                           b * len(x),
                                                                           len(train_iter.dataset),
                                                                           100. * b / len(train_iter),
                                                                           loss.item()))
                                                                           
 

def evaluate(model, val_iter):
    model.eval()
    corrects, total, total_loss = 0, 0, 0

    for batch in val_iter:
        x, y = batch.text.to(device), batch.label.to(device)
        y.data.sub_(1) 
        logit = model(x)
        loss = F.cross_entropy(logit, y, reduction = "sum")
        total += y.size(0)
        total_loss += loss.item()
        corrects += (logit.max(1)[1].view(y.size()).data == y.data).sum()
        
    avg_loss = total_loss / len(val_iter.dataset)
    avg_accuracy = corrects / total
    return avg_loss, avg_accuracy

 

 

 

BATCH_SIZE = 100  # note: the iterators above were already created with batch_size 64
LR = 0.001        # note: the optimizer above was already created with lr = 0.0001
EPOCHS = 5
for e in range(1, EPOCHS + 1):
    train(model, optimizer, train_iterator)
    val_loss, val_accuracy = evaluate(model, valid_iterator)
    print("[EPOCH: %d], Validation Loss: %5.2f | Validation Accuracy: %5.2f" % (e, val_loss, val_accuracy))

 

 

test_loss, test_acc = evaluate(model,test_iterator)
print("Test Loss: %5.2f | Test Accuracy: %5.2f" % (test_loss, test_acc))

 

 

์ •ํ™•๋„๊ฐ€ ๊ทธ๋‹ฅ ๋†’์ง€ ์•Š๋‹ค. ์—ํฌํฌ๋ฅผ ์ฆ๊ฐ€์‹œ์ผœ๋ณด๊ฑฐ๋‚˜, ๋‹ค๋ฅธ ๋ชจ๋ธ๋กœ ๋ณ€๊ฒฝํ•ด๋ณธ๋‹ค. ์—ฌ๋Ÿฌ ์œ ํ˜•์˜ ๋ชจ๋ธ์„ ์ ์šฉํ•œ ํ›„ ๊ฐ€์žฅ ๊ฒฐ๊ณผ๊ฐ€ ์ข‹์€ ๋ชจ๋ธ์„ ์„ ํƒํ•œ๋‹ค. ๋˜ํ•œ ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ (๋ฐฐ์น˜ํฌ๊ธฐ, ํ•™์Šต๋ฅ  ๋“ฑ) ๋ฅผ ํŠœ๋‹ํ•ด๊ฐ€๋Š” ๊ณผ์ •์ด ํ•„์š”ํ•˜๋‹ค. 

 

 

 


๋Œ“๊ธ€