
[๋”ฅ๋Ÿฌ๋‹ ํŒŒ์ดํ† ์น˜ ๊ต๊ณผ์„œ] 2์žฅ ์‹ค์Šต ํ™˜๊ฒฝ ์„ค์ •๊ณผ ํŒŒ์ดํ† ์น˜ ๊ธฐ์ดˆ

by isdawell 2022. 9. 23.

 

 

 

โœ…  ํŒŒ์ดํ† ์น˜ ๊ธฐ์ดˆ 

 

https://colab.research.google.com/drive/1ki4W3rwTExhmZp5E-Ic81ab2NMe8iRHB?usp=sharing 

 

[๋”ฅ๋Ÿฌ๋‹ ํŒŒ์ดํ† ์น˜ ๊ต๊ณผ์„œ] chapter 02 ํŒŒ์ดํ† ์น˜ ๊ธฐ์ดˆ.ipynb

Colaboratory notebook

colab.research.google.com

 

 

 

1๏ธโƒฃ  ํŒŒ์ดํ† ์น˜ ๊ฐœ์š” 


 

๐Ÿ”น  ํŠน์ง• ๋ฐ ์žฅ์  

 

โˆ˜  ์—ฐ์‚ฐ์„ ์œ„ํ•œ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ → ๋”ฅ๋Ÿฌ๋‹ ํ”„๋ ˆ์ž„์›Œํฌ๋กœ ๋ฐœ์ „ 

โˆ˜  GPU ์—์„œ ํ…์„œ ์กฐ์ž‘ ๋ฐ ๋™์  ์‹ ๊ฒฝ๋ง ๊ตฌ์ถ•์ด ๊ฐ€๋Šฅํ•œ ํ”„๋ ˆ์ž„์›Œํฌ 

 

โœ” GPU : ์—ฐ์‚ฐ์„ ๋น ๋ฅด๊ฒŒ ํ•˜๋Š” ์—ญํ• , ๋‚ด๋ถ€์ ์œผ๋กœ CUDA, cuDNN ๊ฐ™์€ API ๋ฅผ ํ†ตํ•ด ์—ฐ์‚ฐ ๊ฐ€๋Šฅ

 

โœ” ํ…์„œ : ํŒŒ์ดํ† ์น˜์˜ ๋ฐ์ดํ„ฐ ํ˜•ํƒœ๋กœ, ๋‹ค์ฐจ์› ํ–‰๋ ฌ ๊ตฌ์กฐ๋ฅผ ๊ฐ€์ง„๋‹ค. .cuda() ๋ฅผ ์‚ฌ์šฉํ•ด GPU ์—ฐ์‚ฐ์„ ์ˆ˜ํ–‰ํ•  ์ˆ˜ ์žˆ๊ฒŒ ํ•  ์ˆ˜ ์žˆ๋‹ค. 

 

  • axis 0 : vector, 1-D axis 
  • axis 1 : matrix, 2-D axis 
  • axis 2 : tensor, 3-D axis 
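As a quick check of these dimension levels, a tensor's `ndim` and `shape` attributes report them directly (a minimal sketch, not from the book):

```python
import torch

vec = torch.tensor([1, 2, 3])         # 1-D: a vector
mat = torch.tensor([[1, 2], [3, 4]])  # 2-D: a matrix
ten = torch.zeros(2, 3, 4)            # 3-D: a tensor

print(vec.ndim, mat.ndim, ten.ndim)   # 1 2 3
print(ten.shape)                      # torch.Size([2, 3, 4])
```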

 

โœ” ๋™์  ์‹ ๊ฒฝ๋ง : ํ›ˆ๋ จํ•  ๋•Œ๋งˆ๋‹ค ๋„คํŠธ์›Œํฌ ๋ณ€๊ฒฝ์ด ๊ฐ€๋Šฅํ•œ (์€๋‹‰์ธต ์ถ”๊ฐ€ ๋ฐ ์ œ๊ฑฐ) ์‹ ๊ฒฝ๋ง 

 

 

import torch 

torch.tensor([[1, -1], [1, -1]])  # the elements must be passed as one nested list

 

 

๐Ÿ”น ์•„ํ‚คํ…์ฒ˜ 

 

โˆ˜  ํŒŒ์ดํ† ์น˜ API - ํŒŒ์ดํ† ์น˜ ์—”์ง„ - ์—ฐ์‚ฐ์ฒ˜๋ฆฌ 

โˆ˜  ํŒŒ์ดํ† ์น˜ API : ์‚ฌ์šฉ์ž๊ฐ€ ์‚ฌ์šฉํ•˜๋Š” ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋กœ ์•„ํ‚คํ…์ณ ๊ฐ€์žฅ ์œ—๊ณ„์ธต์— ์œ„์น˜ํ•ด์žˆ๋‹ค. 

โˆ˜  ํŒŒ์ดํ† ์น˜ ์—”์ง„ : ๋‹ค์ฐจ์› ํ…์„œ ๋ฐ ์ž๋™ ๋ฏธ๋ถ„์„ ์ฒ˜๋ฆฌํ•œ๋‹ค 

โˆ˜  ์—ฐ์‚ฐ์ฒ˜๋ฆฌ : ํ…์„œ์— ๋Œ€ํ•œ ์—ฐ์‚ฐ์„ ์ฒ˜๋ฆฌํ•œ๋‹ค. 

 

 

โˆ˜  API 

 

  • torch : the tensor package with GPU support; a large amount of computation can be done at high speed. 
  • torch.autograd : the automatic differentiation package. This is the biggest differentiator from TensorFlow and Caffe: in those frameworks, even a minor change to the network, such as altering the number of hidden-layer nodes, generally meant rebuilding it from scratch, whereas PyTorch's automatic differentiation reflects network modifications in real time, so users can experiment with many different networks. 
  • torch.nn : includes CNNs, RNNs, normalization, and more, so neural networks can be built easily. 
  • torch.multiprocessing : enables tensor memory sharing across PyTorch processes; different processes can access and use the same data (tensors). 

 

 

โˆ˜ ํ…์„œ๋ฅผ ๋ฉ”๋ชจ๋ฆฌ์— ์ €์žฅํ•˜๊ธฐ 

 

  • Storage : no matter how many dimensions a tensor has, it is stored in memory as a 1-D array. This flattened 1-D array is called the storage. 
  • Offset : the index in the storage where the tensor's first element is stored. 
  • Stride : for each dimension, the number of storage elements (indices) to skip to reach the next element along that dimension (the tensor's layout in memory). Because elements are stored contiguously in row-major order, the stride along the last (column) dimension is 1. 

 

 Source :  https://hiddenbeginner.github.io/deeplearning/2020/01/21/pytorch_tensor.html

 

stride = (number of indices to skip to reach the next element down a column, number of indices to skip to reach the next element along a row)
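The storage, offset, and stride described above can be inspected directly on a small tensor; a minimal sketch (not from the book):

```python
import torch

t = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])

# the 2x3 tensor is backed by a flat 1-D storage of 6 elements
print(list(t.storage()))    # [1, 2, 3, 4, 5, 6]  (flattened row by row)
print(t.storage_offset())   # 0 : index of the first element within the storage
print(t.stride())           # (3, 1) : skip 3 to reach the next row, 1 for the next column
```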

 

 

 

 

 

 

 

 

 

 

2๏ธโƒฃ  ํŒŒ์ดํ† ์น˜ ๊ธฐ์ดˆ ๋ฌธ๋ฒ• 


 

๐Ÿ”น ํ…์„œ ๋‹ค๋ฃจ๊ธฐ 

 

A. ํ…์„œ ์ƒ์„ฑ ๋ฐ ๋ณ€ํ™˜

 

• torch.tensor( ) 

  • device = 'cuda:0'    :  option that creates the tensor on the GPU 
  • dtype = torch.float64 : creates the tensor with the given dtype 

 

# ํ…์„œ ์ƒ์„ฑ ๋ฐ ๋ณ€ํ™˜ 

import torch 

print(torch.tensor([[1,2],[3,4]])) # create a 2-D tensor 
print(torch.tensor([[1,2],[3,4]], device = 'cuda:0')) # create the tensor on the GPU 
print(torch.tensor([[1,2],[3,4]], dtype = torch.float64))  # create the tensor with a dtype

 

 

 

• .numpy() 

  • converts a tensor to an ndarray 
  • .to('cpu').numpy() : moves a GPU tensor to the CPU, then converts it to an ndarray 

 

# Convert a tensor to an ndarray 

temp = torch.tensor([[1,2],[3,4]]) 
print(temp.numpy()) 

temp = torch.tensor([[1,2],[3,4]] , device = 'cuda:0') 
print(temp.to('cpu').numpy())

 

 

 

 

B. ํ…์„œ ์ธ๋ฑ์Šค ์กฐ์ž‘ 

 

• Indices can be specified directly or sliced, just as with arrays. 

• torch.FloatTensor : 32-bit floating point 

• torch.DoubleTensor : 64-bit floating point 

• torch.LongTensor : 64-bit signed integer 

 

• Index manipulation : use square brackets [ ] to fetch the desired elements. 

 

# ์ธ๋ฑ์‹ฑ & ์Šฌ๋ผ์ด์‹ฑ

temp = torch.FloatTensor([1,2,3,4,5,6,7]) 
print(temp[0],temp[1],temp[-1]) # ์ธ๋ฑ์Šค๋กœ ์ ‘๊ทผ
print('------------------------------------------') 
print(temp[2:5], temp[4:-1]) # ์Šฌ๋ผ์ด์Šค๋กœ ์ ‘๊ทผ

 

 

 

C. ํ…์„œ ์—ฐ์‚ฐ ๋ฐ ์ฐจ์› ์กฐ์ž‘

 

• ํ…์„œ ๊ฐ„์˜ ํƒ€์ž…์ด ๋‹ค๋ฅด๋ฉด ์—ฐ์‚ฐ์ด ๋ถˆ๊ฐ€๋Šฅํ•˜๋‹ค (FloatTensor ์™€ DoubleTensor ๊ฐ„์— ์‚ฌ์น™ ์—ฐ์‚ฐ์„ ์ˆ˜ํ–‰ํ•˜๋ฉด ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒ) 

 

• view( ) : changes a tensor's shape (similar to reshape) 

• stack, cat : combine tensors 

• t, transpose : swap dimensions 

 

 

# Operations 
temp = torch.tensor([[1,2],[3,4]]) 

print(temp.shape) 
print()
print(temp.view(4,1))
print() 
print(temp.view(-1)) # flatten the 2x2 matrix into a 1-D vector 
print() 
print(temp.view(1,-1)) # reshape into a 1x4 matrix
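Beyond `view()`, the `cat`, `stack`, and `transpose` operations listed above can be sketched as follows (the example values and shapes are chosen just for illustration):

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[5, 6], [7, 8]])

print(torch.cat([a, b], dim=0).shape)    # torch.Size([4, 2]) : joined along an existing axis
print(torch.stack([a, b], dim=0).shape)  # torch.Size([2, 2, 2]) : joined along a brand-new axis
print(a.t())                             # transpose of a 2-D tensor
print(a.transpose(0, 1).shape)           # torch.Size([2, 2]) : swap dims 0 and 1
```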

 

 

 

 

๐Ÿ”น  ๋ฐ์ดํ„ฐ ์ค€๋น„ 

 

โˆ˜ ๋ฐ์ดํ„ฐ ํ˜ธ์ถœ ๋ฐฉ๋ฒ• : pandas ๋ฅผ ์ด์šฉํ•˜๋Š” ๋ฐฉ๋ฒ• or ํŒŒ์ดํ† ์น˜์—์„œ ์ œ๊ณตํ•˜๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ์ด์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•

 

โˆ˜ ์ด๋ฏธ์ง€ ๋ฐ์ดํ„ฐ : ๋ถ„์‚ฐ๋œ ํŒŒ์ผ์—์„œ ๋ฐ์ดํ„ฐ๋ฅผ ์ฝ์€ ํ›„ ์ „์ฒ˜๋ฆฌ๋ฅผ ํ•˜๊ณ  ๋ฐฐ์น˜ ๋‹จ์œ„๋กœ ๋ถ„ํ• ํ•˜์—ฌ ์ฒ˜๋ฆฌํ•œ๋‹ค. 

โˆ˜ ํ…์ŠคํŠธ ๋ฐ์ดํ„ฐ : ์ž„๋ฒ ๋”ฉ ๊ณผ์ •์„ ๊ฑฐ์ณ ์„œ๋กœ ๋‹ค๋ฅธ ๊ธธ์ด์˜ ์‹œํ€€์Šค๋ฅผ ๋ฐฐ์น˜ ๋‹จ์œ„๋กœ ๋ถ„ํ• ํ•˜์—ฌ ์ฒ˜๋ฆฌํ•œ๋‹ค. 

 

 

โˆ˜ Custom dataset ์ปค์Šคํ…€ ๋ฐ์ดํ„ฐ์…‹ : ๋ฐ์ดํ„ฐ๋ฅผ ํ•œ๋ฒˆ์— ๋‹ค ๋ถ€๋ฅด์ง€ ์•Š๊ณ  ์กฐ๊ธˆ์”ฉ ๋‚˜๋ˆ„์–ด ๋ถˆ๋Ÿฌ์„œ ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ์‹ 

 

  • torch.utils.data.Dataset → subclass it and override its methods to build a custom dataset. 
  • torch.utils.data.DataLoader → holds the full training data and hands out batch-size chunks during training. It does not pre-split the data; internally it uses the indices from an iterator to return batch-size portions. 
  • Mini-batch training, shuffling, and parallel loading are all easy to do with it. 

 

https://wikidocs.net/57165

 

 

import pandas as pd 
import torch 
from torch.utils.data import Dataset 
from torch.utils.data import DataLoader 

class CustomDataset(Dataset) : 

    # load the dataset via the csv_file parameter 
    def __init__(self, csv_file) : 
        self.label = pd.read_csv(csv_file) 

    # return the size of the whole dataset 
    def __len__(self) : 
        return len(self.label) 

    # fetch the idx-th sample and label from the x and y data 
    def __getitem__(self, idx) : 
        sample = torch.tensor(self.label.iloc[idx, 0:3]).int() 
        label = torch.tensor(self.label.iloc[idx, 3]).int() 
        return sample, label 


tensor_dataset = CustomDataset('../covtype.csv') 
dataset = DataLoader(tensor_dataset, batch_size = 4, shuffle = True) 
# pass the dataset to torch.utils.data.DataLoader

 

 

 

 

 

 

 

โˆ˜ MNIST  ๋ฐ์ดํ„ฐ ๋ถˆ๋Ÿฌ์˜ค๊ธฐ 

 

  • torchvision : the package of datasets PyTorch provides, including well-known datasets such as ImageNet and MNIST. 

 

import torchvision.transforms as transforms 

mnist_transform = transforms.Compose([
    transforms.ToTensor(), 
    transforms.Normalize((0.5,), (1.0,))  # normalize with mean 0.5 and std 1.0 
])


from torchvision.datasets import MNIST  

download_root = '/content/sample_data/mnist_dataset'

train_dataset = MNIST(download_root, transform = mnist_transform , train=True, download=True) 
valid_dataset = MNIST(download_root, transform = mnist_transform , train=False, download=True) 
test_dataset = MNIST(download_root, transform = mnist_transform , train=False, download=True)

 

 

๐Ÿ”น  ๋ชจ๋ธ์ •์˜ 

 

ํŒŒ์ดํ† ์น˜ ๊ตฌํ˜„์ฒด๋“ค์€ ๊ธฐ๋ณธ์ ์œผ๋กœ Class ๋ผ๋Š” ๊ฐœ๋…์„ ์• ์šฉํ•œ๋‹ค. 

 

 

โˆ˜ ๋ชจ๋ธ๊ณผ ๋ชจ๋“ˆ์˜ ์ฐจ์ด 

 

  • Layer : a single layer that makes up a module, e.g., a convolutional layer or a linear layer. 
  • Module : one or more layers combined; modules can be combined to build new modules. 
  • Model : the final desired network; a single module can itself be a model. 

 

 

 

A. Defining a simple neural network 

 

  • nn.Module ์„ ์ƒ์†๋ฐ›์ง€ ์•Š๋Š” ๋งค์šฐ ๋‹จ์ˆœํ•œ ๋ชจ๋ธ์„ ๋งŒ๋“ค ๋•Œ ์‚ฌ์šฉํ•œ๋‹ค. 
  • ๊ตฌํ˜„์ด ์‰ฝ๊ณ  ๋‹จ์ˆœํ•˜๋‹ค. 

 

# Defining a simple neural network 
import torch.nn as nn
model = nn.Linear(in_features = 1, out_features = 1, bias = True)

 

 

B. Defining a model by inheriting nn.Module 

 

  • __init__() : defines the modules (nn.Linear, nn.Conv2d), activation functions, etc., that the model will use 
  • forward() : defines the computation the model performs 

 

import torch 
import torch.nn as nn

class MLP(nn.Module) :  # inherit from nn.Module 

  def __init__(self, inputs) : 
    super(MLP, self).__init__() 
    self.layer = nn.Linear(inputs, 1)    # define the layer 
    self.activation = nn.Sigmoid()       # define the activation function 

  def forward(self, X) : 
    X = self.layer(X) 
    X = self.activation(X) 
    return X

 

 

• Example : building a linear regression model 

 

class LinearRegressionModel(nn.Module) : 
  
  def __init__(self) : 
    super().__init__() 
    self.linear = nn.Linear(1,1) 

  def forward(self, x) : 
    return self.linear(x)

 

 

C. Defining a network with nn.Sequential 

 

  • nn.Sequential : lets you define the network's building blocks in __init__() while keeping the computation in forward() much more readable. 
  • Several layers can be defined, and the Sequential object runs each module it contains in order. 
  • nn.Sequential becomes more effective as the model's layers grow more complex. 

 

import torch.nn as nn 

class MLP(nn.Module) : 

  def __init__(self) : 

    super(MLP, self).__init__() 

    self.layer1 = nn.Sequential(
        nn.Conv2d(in_channels = 3, out_channels = 64, kernel_size = 5), 
        nn.ReLU(inplace = True), 
        nn.MaxPool2d(2) 
    )

    self.layer2 = nn.Sequential( 
        nn.Conv2d(in_channels = 64, out_channels = 30, kernel_size = 5), 
        nn.ReLU(inplace=True), 
        nn.MaxPool2d(2)
    )

    self.layer3 = nn.Sequential(
        nn.Linear(in_features = 30*5*5, out_features = 10, bias = True), 
        nn.ReLU(inplace=True) 
    )
  
  def forward(self,x) : 
    x = self.layer1(x) 
    x = self.layer2(x) 
    x = x.view(x.shape[0],-1) 
    x = self.layer3(x) 
    return x 
  

model = MLP() # create the model object 

print("printing children \n ---------------------") 
print(list(model.children())) 
print("\n\nprinting Modules\n-------------------------") 
print(list(model.modules()))

 

 

 

โœ” model.modules() : ๋ชจ๋ธ์˜ ๋„คํŠธ์›Œํฌ์— ๋Œ€ํ•œ ๋ชจ๋“  ๋…ธ๋“œ๋“ค์„ ๋ฐ˜ํ™˜

 

 

 

โœ” model.children() : ๊ฐ™์€ ์ˆ˜์ค€ level ์˜ ํ•˜์œ„ ๋…ธ๋“œ๋ฅผ ๋ฐ˜ํ™˜ 

 

 

 

 

 

D. Defining a network with a function 

 

โˆ˜ Sequential ์ด์šฉ๊ณผ ๋™์ผํ•˜์ง€๋งŒ, ํ•จ์ˆ˜๋กœ ์„ ์–ธํ•  ๊ฒฝ์šฐ ๋ณ€์ˆ˜์— ์ €์žฅํ•ด๋†“์€ ๊ณ„์ธต๋“ค์„ ์žฌ์‚ฌ์šฉ ํ•  ์ˆ˜ ์žˆ๋Š” ์žฅ์ ์ด ์žˆ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ๋ชจ๋ธ์ด ๋ณต์žกํ•ด์ง€๋Š” ๋‹จ์ ๋„ ์žˆ๋‹ค. ๋ณต์žกํ•œ ๋ชจ๋ธ์€ ํ•จ์ˆ˜๋ฅผ ์ด์šฉํ•˜๋Š” ๊ฒƒ๋ณด๋‹ค nn.Module ์„ ์ƒ์†๋ฐ›์•„ ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ์ด ํŽธ๋ฆฌํ•˜๋‹ค. 

 

def MLP(in_features = 1, hidden_features = 20, out_features = 1) : 
  
  hidden = nn.Linear(in_features = in_features, out_features = hidden_features, bias = True) 
  activation = nn.ReLU() 
  output = nn.Linear(in_features = hidden_features, out_features = out_features, bias = True) 
  net = nn.Sequential(hidden, activation, output) 
    
  return net

 

 

 

 

๐Ÿ”น  ๋ชจ๋ธ ํŒŒ๋ผ๋ฏธํ„ฐ ์ •์˜ 

 

๐Ÿ’จ ์†์‹คํ•จ์ˆ˜ 

 

โˆ˜  ํ•™์Šตํ•˜๋Š” ๋™์•ˆ ์ถœ๋ ฅ๊ณผ ์‹ค์ œ ๊ฐ’ ์‚ฌ์ด์˜ ์˜ค์ฐจ๋ฅผ ์ธก์ •ํ•˜์—ฌ ๋ชจ๋ธ์˜ ์ •ํ™•์„ฑ์„ ์ธก์ •ํ•œ๋‹ค. 

โˆ˜  BCELoss : ์ด์ง„ ๋ถ„๋ฅ˜๋ฅผ ์œ„ํ•ด ์‚ฌ์šฉ 

โˆ˜  CrossEntropyLoss : ๋‹ค์ค‘ ํด๋ž˜์Šค ๋ถ„๋ฅ˜๋ฅผ ์œ„ํ•ด ์‚ฌ์šฉ 

โˆ˜  MSELoss : ํšŒ๊ท€ ๋ชจ๋ธ์—์„œ ์‚ฌ์šฉ 

 

 

 

๐Ÿ‘‰ ์†์‹ค ํ•จ์ˆ˜์˜ ๊ฐ’์„ ์ตœ์†Œํ™” ํ•˜๋Š” ๊ฐ€์ค‘์น˜์™€ ๋ฐ”์ด์–ด์Šค๋ฅผ ์ฐพ๋Š” ๊ฒƒ์ด ํ•™์Šต์˜ ๋ชฉํ‘œ ๐Ÿ‘ˆ

 

 

 

๐Ÿ’จ ์˜ตํ‹ฐ๋งˆ์ด์ € 

 

โˆ˜  ์†์‹คํ•จ์ˆ˜๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ ๋ชจ๋ธ์˜ ์—…๋ฐ์ดํŠธ ๋ฐฉ๋ฒ•์„ ๊ฒฐ์ •ํ•œ๋‹ค. 

โˆ˜  step() ๋ฉ”์„œ๋“œ๋ฅผ ํ†ตํ•ด ์ „๋‹ฌ๋ฐ›์€ ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ์—…๋ฐ์ดํŠธ ํ•œ๋‹ค. 

โˆ˜  ํŒŒ๋ผ๋ฏธํ„ฐ๋ณ„๋กœ ๋‹ค๋ฅธ ๊ธฐ์ค€ (ex. ํ•™์Šต๋ฅ ) ์„ ์ ์šฉ์‹œํ‚ฌ ์ˆ˜ ์žˆ๋‹ค. 

โˆ˜  torch.optim.Optimizer(params, defaults) : ์˜ตํ‹ฐ๋งˆ์ด์ €์˜ ๊ธฐ๋ณธ ํด๋ž˜์Šค 

โˆ˜  zero_grad() : ํŒŒ๋ผ๋ฏธํ„ฐ์˜ gradient ๋ฅผ 0 ์œผ๋กœ ๋งŒ๋“œ๋Š” ๋ฉ”์„œ๋“œ 

โˆ˜  torch.optim.lr_scheduler : ์—ํฌํฌ์— ๋”ฐ๋ผ ํ•™์Šต๋ฅ ์„ ์กฐ์ •ํ•  ์ˆ˜ ์žˆ๋‹ค. 

โˆ˜  ์ข…๋ฅ˜ 

 

  • optim.Adadelta, optim.Adagrad, optim.Adam, optim.SparseAdam, optim.Adamax 
  • optim.ASGD, optim.LBFGS 
  • optim.RMSprop, optim.Rprop, optim.SGD 

 

 

 

 

๐Ÿ’จ ํ•™์Šต ์Šค์ผ€์ฅด๋Ÿฌ learning rate scheduler

 

โˆ˜  ์ง€์ •ํ•œ ํšŸ์ˆ˜์˜ ์—ํฌํฌ๋ฅผ ์ง€๋‚  ๋•Œ๋งˆ๋‹ค ํ•™์Šต๋ฅ ์„ ๊ฐ์†Œ (decay) ์‹œ์ผœ์ค€๋‹ค. 

โˆ˜  ์ด๋ฅผ ์‚ฌ์šฉํ•˜๋ฉด ํ•™์Šต ์ดˆ๊ธฐ์—๋Š” ๋น ๋ฅธ ํ•™์Šต์„ ์ง„ํ–‰ํ•˜๋‹ค๊ฐ€ global minimum ๊ทผ์ฒ˜์— ์˜ค๋ฉด ํ•™์Šต๋ฅ ์„ ์ค„์—ฌ ์ตœ์ ์ ์„ ์ฐพ์•„๊ฐˆ ์ˆ˜ ์žˆ๋„๋ก ํ•ด์ค€๋‹ค. 

 

 

  • optim.lr_scheduler.LambdaLR : sets the learning rate to the result of a lambda function 
  • optim.lr_scheduler.StepLR : decays the learning rate by the gamma factor at fixed step intervals 
  • optim.lr_scheduler.MultiStepLR : like StepLR, but decays by gamma only at the specified epochs rather than at fixed intervals 
  • optim.lr_scheduler.ExponentialLR : multiplies the previous learning rate by gamma every epoch 
  • optim.lr_scheduler.CosineAnnealingLR : varies the learning rate like a cosine curve; the learning rate both shrinks and grows 
  • optim.lr_scheduler.ReduceLROnPlateau : changes the learning rate dynamically depending on whether training is progressing well 
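For instance, `StepLR` from the list above can be sketched like this (the model, learning rate, and schedule values are arbitrary choices for illustration):

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# halve the learning rate every 2 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.5)

for epoch in range(6):
    optimizer.step()      # a real loop would run backward() before this
    scheduler.step()
    print(epoch, optimizer.param_groups[0]['lr'])
# lr: 0.1 → 0.05 after 2 epochs → 0.025 after 4 → 0.0125 after 6
```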

 

๐Ÿ’จ ์ง€ํ‘œ metrics 

 

โˆ˜  ํ›ˆ๋ จ๊ณผ ํ…Œ์ŠคํŠธ ๋‹จ๊ณ„๋ฅผ ๋ชจ๋‹ˆํ„ฐ๋ง ํ•œ๋‹ค. 

 

 

๐Ÿ’จ ์ตœ์†Œ์  Minimum

 

โˆ˜  Global minimum : ์˜ค์ฐจ๊ฐ€ ๊ฐ€์žฅ ์ž‘์„ ๋•Œ์˜ ๊ฐ’์„ ์˜๋ฏธํ•˜๋ฉฐ, ์šฐ๋ฆฌ๊ฐ€ ์ตœ์ข…์ ์œผ๋กœ ์ฐพ๊ณ ์ž ํ•˜๋Š”๊ฒƒ (์ตœ์ ์ )

โˆ˜  local minimum : ์ „์—ญ ์ตœ์†Œ์ ์„ ์ฐพ์•„๊ฐ€๋Š” ๊ณผ์ •์—์„œ ๋งŒ๋‚˜๋Š” ๊ตฌ๋ฉ ๊ฐ™์€ ๊ณณ์œผ๋กœ ์˜ตํ‹ฐ๋งˆ์ด์ €๊ฐ€ ์ง€์—ญ ์ตœ์†Œ์ ์—์„œ ํ•™์Šต์„ ๋ฉˆ์ถ”๋ฉด ์ตœ์†Ÿ๊ฐ’์„ ๊ฐ–๋Š” ์˜ค์ฐจ๋ฅผ ์ฐพ์„ ์ˆ˜ ์—†๋Š” ๋ฌธ์ œ๊ฐ€ ๋ฐœ์ƒํ•œ๋‹ค. 

 

 

# ๋ชจ๋ธ ํŒŒ๋ผ๋ฏธํ„ฐ ์ •์˜ ์˜ˆ์‹œ ์ฝ”๋“œ 

from torch.optim import optimizer 

criterion = torch.nn.MSELoss() 
optimizer = torch.optim.SGD(model.parameters(), lr = 0.01, momentum = 0.9) 
scheduler = torch.optim.lr_scheduler.LamdaLR(optimizer = optimizer, 
                                             lr_lambda = lambda epoch : 0.95**epoch)

for epoch in range(1, 100+1) : # ์—ํฌํฌ ์ˆ˜๋งŒํผ ๋ฐ์ดํ„ฐ๋ฅผ ๋ฐ˜๋ณตํ•˜์—ฌ ์ฒ˜๋ฆฌ 

  for x, y in dataloader : # ๋ฐฐ์น˜ ํฌ๊ธฐ๋งŒํผ ๋ฐ์ดํ„ฐ๋ฅผ ๊ฐ€์ ธ์™€์„œ ํ•™์Šต ์ง„ํ–‰
    optimizer.zero_grad() 
  

loss_fn(model(x), y).backward() 
optimizer.step() 
scheduler.step()

 

 

 

 

๐Ÿ”น  ๋ชจ๋ธํ›ˆ๋ จ 

 

๐Ÿ’จ ํ•™์Šต

 

โˆ˜  ๋ชจ๋ธ์„ ํ•™์Šต์‹œํ‚จ๋‹ค๋Š” ๊ฒƒ์€ y = wx + b ๋ผ๋Š” ํ•จ์ˆ˜์—์„œ w ์™€ b ์˜ ์ ์ ˆํ•œ ๊ฐ’์„ ์ฐพ๋Š”๋‹ค๋Š” ๊ฒƒ์˜ ์˜๋ฏธ์ด๋‹ค. 

โˆ˜  w ์™€ b ์— ์ž„์˜์˜ ๊ฐ’์„ ์ ์šฉํ•˜์—ฌ ์‹œ์ž‘ํ•ด ์˜ค์ฐจ๊ฐ€ ์ค„์–ด๋“ค์–ด ์ „์—ญ ์ตœ์†Œ์ ์— ์ด๋ฅผ๋•Œ๊นŒ์ง€ ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ๊ณ„์† ์ˆ˜์ •ํ•ด ๋‚˜์•„๊ฐ„๋‹ค. 

 

๋”ฅ๋Ÿฌ๋‹ ํ•™์Šต์ ˆ์ฐจ ํŒŒ์ดํ† ์น˜ ํ•™์Šต ์ ˆ์ฐจ
๋ชจ๋ธ, ์†์‹คํ•จ์ˆ˜, ์˜ตํ‹ฐ๋งˆ์ด์ € ์ •์˜ • ๋ชจ๋ธ, ์†์‹คํ•จ์ˆ˜, ์˜ตํ‹ฐ๋งˆ์ด์ € ์ •์˜

•  optimizer.zero_grad() : ์ „๋ฐฉํ–ฅ ํ•™์Šต, ๊ธฐ์šธ๊ธฐ ์ดˆ๊ธฐํ™”
์ „๋ฐฉํ–ฅ ํ•™์Šต (์ž…๋ ฅ → ์ถœ๋ ฅ ๊ณ„์‚ฐ) output = model(input) : ์ถœ๋ ฅ ๊ณ„์‚ฐ 
์†์‹ค ํ•จ์ˆ˜๋กœ ์ถœ๋ ฅ๊ณผ ์ •๋‹ต์˜ ์ฐจ์ด (์˜ค์ฐจ) ๊ณ„์‚ฐ loss = loss_fn(output, target) : ์˜ค์ฐจ ๊ณ„์‚ฐ 
์—ญ์ „ํŒŒ ํ•™์Šต (๊ธฐ์šธ๊ธฐ ๊ณ„์‚ฐ) loss.backward() : ์—ญ์ „ํŒŒ ํ•™์Šต 
๊ธฐ์šธ๊ธฐ ์—…๋ฐ์ดํŠธ  optimizer.step() : ๊ธฐ์šธ๊ธฐ ์—…๋ฐ์ดํŠธ 

 

 

โˆ˜  loss.backward( ) : ํŒŒ์ดํ† ์น˜์—์„œ ๊ธฐ์šธ๊ธฐ ๊ฐ’์„ ๊ณ„์‚ฐํ•˜๊ธฐ ์œ„ํ•ด ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ์ธ๋ฐ, ์ด๋Š” ์ƒˆ๋กœ์šด ๊ธฐ์šธ๊ธฐ๊ฐ’์ด ์ด์ „ ๊ธฐ์šธ๊ธฐ ๊ฐ’์— ๋ˆ„์ ํ•˜์—ฌ ๊ณ„์‚ฐ๋˜๋ฏ€๋กœ, RNN ๊ฐ™์€ ๋ชจ๋ธ์„ ๊ตฌํ˜„ํ•  ๋•Œ์—๋Š” ํ•„์š”ํ•˜์ง€๋งŒ, ๊ทธ๋ ‡์ง€ ์•Š์€ ๋”ฅ๋Ÿฌ๋‹ ์•„ํ‚คํ…์ณ์—์„œ๋Š” ๋ถˆํ•„์š”ํ•˜๋‹ค. ๋”ฐ๋ผ์„œ ๊ธฐ์šธ๊ธฐ ๊ฐ’์˜ ๋ˆ„์  ๊ณ„์‚ฐ์ด ํ•„์š”ํ•˜์ง€ ์•Š์„ ๋•Œ์—๋Š” ์ž…๋ ฅ ๊ฐ’์„ ๋ชจ๋ธ์— ์ ์šฉํ•˜๊ธฐ ์ „์— optimizer.zero_grad() ๋ฅผ ํ˜ธ์ถœํ•˜์—ฌ ๋ฏธ๋ถ„๊ฐ’ (๊ธฐ์šธ๊ธฐ๋ฅผ ๊ตฌํ•˜๋Š” ๊ณผ์ •์—์„œ ๋ฏธ๋ถ„์„ ์‚ฌ์šฉ) ์ด ๋ˆ„์ ๋˜์ง€ ์•Š๊ฒŒ ์ดˆ๊ธฐํ™” ํ•ด์ฃผ์–ด์•ผ ํ•œ๋‹ค. 

 

 

โˆ˜ ๋ชจ๋ธ ํ›ˆ๋ จ ์ฝ”๋“œ 

 

for epoch in range(100) : 
  yhat = model(x_train)  
  loss = criterion(yhat, y_train)  # criterion = torch.nn.MSELoss() 
  optimizer.zero_grad() # reset the gradients so errors do not accumulate
  loss.backward() # backpropagate (compute the gradients) 
  optimizer.step() # update the weights

 

 

 

 

๐Ÿ”น  ๋ชจ๋ธํ‰๊ฐ€

 

๐Ÿ’จ evaluation

 

โˆ˜  ์ฃผ์–ด์ง„ ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ์…‹์„ ์‚ฌ์šฉํ•ด ๋ชจ๋ธ์„ ํ‰๊ฐ€ 

โˆ˜  torchmetrics ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ์‚ฌ์šฉ 

 

 

๐Ÿ’จ ํ•จ์ˆ˜๋ฅผ ์ด์šฉํ•˜์—ฌ ๋ชจ๋ธ์„ ํ‰๊ฐ€ํ•˜๋Š” ์ฝ”๋“œ 

 

โˆ˜  torchmetrics.functional.accuracy( ) 

 

# install the package : pip install torchmetrics

# evaluating the model : using the functional API 

import torch 
import torchmetrics  

preds = torch.randn(10,5).softmax(dim = -1)  # predictions : softmax() 
target = torch.randint(5, (10,)) # ground truth 

acc = torchmetrics.functional.accuracy(preds, target) # ⭐ evaluate the model

 

 

 

๐Ÿ’จ ๋ชจ๋“ˆ์„ ์ด์šฉํ•˜์—ฌ ๋ชจ๋ธ์„ ํ‰๊ฐ€ํ•˜๋Š” ์ฝ”๋“œ

 

โˆ˜  torchmetrics.Accuracy() 

 

# ๋ชจ๋ธ์„ ํ‰๊ฐ€ํ•˜๋Š” ์ฝ”๋“œ : ๋ชจ๋“ˆ์„ ์ด์šฉํ•œ ์ฝ”๋“œ 

metric = torchmetrics.Accuracy() # ๋ชจ๋ธ ํ‰๊ฐ€ ์ดˆ๊ธฐํ™” 

n_batches = 10 

for i in range(n_batches) : 
  preds = torch.randn(10,5).softmax(dim = -1) 
  target = torch.randint(5, (10,)) 

  acc = metric(preds, target) # โญ
  print(f'{i} ๋ฒˆ์งธ ๋ฐฐ์น˜์—์„œ์˜ ์ •ํ™•๋„ : {acc}') # ํ˜„์žฌ ๋ฐฐ์น˜์—์„œ์˜ ๋ชจ๋ธ ํ‰๊ฐ€ 

############################

acc = metric.compute() 
print(f'์ „์ฒด ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•œ ์ •ํ™•๋„ : {acc}') # ๋ชจ๋“  ๋ฐฐ์น˜์—์„œ์˜ ๋ชจ๋ธ ํ‰๊ฐ€

 

โˆ˜  ์‚ฌ์ดํ‚ท๋Ÿฐ์—์„œ ์ œ๊ณตํ•˜๋Š” metrics ๋ชจ๋“ˆ์„ ์‚ฌ์šฉํ•  ์ˆ˜๋„ ์žˆ๋‹ค → confusion_matrix , accuracy_score, classification_report 

 

 

๐Ÿ”น  ํ›ˆ๋ จ๊ณผ์ • ๋ชจ๋‹ˆํ„ฐ๋ง 

 

๐Ÿ’จ ๋ชจ๋‹ˆํ„ฐ๋ง 

 

โˆ˜  ํ•™์Šต์ด ์ง„ํ–‰๋˜๋Š” ๊ณผ์ •์—์„œ ๊ฐ ํŒŒ๋ผ๋ฏธํ„ฐ ๊ฐ’๋“ค์ด ์–ด๋–ป๊ฒŒ ๋ณ€ํ™”ํ•˜๋Š”์ง€ ์‚ดํŽด๋ณด๋Š” ๊ฒƒ 

โˆ˜  ํ…์„œ๋ณด๋“œ : ํ•™์Šต์— ์‚ฌ์šฉ๋˜๋Š” ๊ฐ์ข… ํŒŒ๋ผ๋ฏธํ„ฐ๊ฐ’์ด ์–ด๋–ป๊ฒŒ ๋ณ€ํ™”ํ•˜๋Š”์ง€ ์†์‰ฝ๊ฒŒ ์‹œ๊ฐํ™”ํ•˜์—ฌ ๋ณผ ์ˆ˜ ์žˆ์Œ + ์„ฑ๋Šฅ์„ ์ถ”์ ํ•˜๊ฑฐ๋‚˜ ํ‰๊ฐ€ํ•˜๋Š” ์šฉ๋„๋กœ ์‚ฌ์šฉ 

 

1. Set up TensorBoard 
2. Write to TensorBoard 
3. Use TensorBoard to inspect the model's structure. 

 

pip install tensorboard

 

import torch 
from torch.utils.tensorboard import SummaryWriter #⭐

writer = SummaryWriter('../chap02/tensorboard') # 📌 where the values needed for monitoring will be saved 

for epoch in range(num_epochs) : 

  model.train() # switch to training mode (dropout enabled) 
  batch_loss = 0.0 

  for i, (x,y) in enumerate(dataloader) : 
    x, y = x.to(device).float(), y.to(device).float() 
    outputs = model(x) 
    loss = criterion(outputs, y) 
    writer.add_scalar('Loss', loss, epoch) # 📌 log the scalar value (the loss) 
    optimizer.zero_grad() 
    loss.backward() 
    optimizer.step() 



writer.close() # call close() once the SummaryWriter is no longer needed

 

 

tensorboard --logdir=../chap02/tensorboard --port=6006 

# Running the command above starts TensorBoard.

Entering http://localhost:6006 in a web browser then brings up the TensorBoard dashboard.

 

 

 

 

๐Ÿ’จ model.train() 

 

โˆ˜  ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ์…‹์— ์‚ฌ์šฉ, ๋ชจ๋ธ ํ›ˆ๋ จ์ด ์ง„ํ–‰๋  ๊ฒƒ์ž„์„ ์•Œ๋ฆผ

โˆ˜  dropout ์ด ํ™œ์„ฑํ™”๋จ 

โˆ˜  model.train ๊ณผ model.eval ์„ ์„ ์–ธํ•ด์•ผ ๋ชจ๋ธ์˜ ์ •ํ™•๋„๋ฅผ ๋†’์ผ ์ˆ˜ ์žˆ๋‹ค. 

 

 

๐Ÿ’จ model.eval() 

 

โˆ˜  ๋ชจ๋ธ์„ ํ‰๊ฐ€ํ•  ๋•Œ ๋ชจ๋“  ๋…ธ๋“œ๋ฅผ ์‚ฌ์šฉํ•˜๊ฒ ๋‹ค๋Š” ์˜๋ฏธ

โˆ˜  ๊ฒ€์ฆ๊ณผ ํ…Œ์ŠคํŠธ์…‹์— ์‚ฌ์šฉํ•œ๋‹ค. 

 

# model.eval() 

import torch.nn.functional as F

model.eval() # switch to evaluation mode (dropout disabled) ⭐

with torch.no_grad() : 
  valid_loss = 0 
  y_hat = [] 

  for x,y in valid_dataloader : 
    outputs = model(x) 
    loss = F.cross_entropy(outputs, y.long().squeeze()) 
    valid_loss += float(loss) 
    y_hat += [outputs] 


valid_loss = valid_loss / len(valid_dataloader)

 

๊ฒ€์ฆ ๊ณผ์ •์—๋Š” ์—ญ์ „ํŒŒ๊ฐ€ ํ•„์š”ํ•˜์ง€ ์•Š๊ธฐ ๋•Œ๋ฌธ์— with torch.no_grad() ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๊ธฐ์šธ๊ธฐ ๊ฐ’์„ ์ €์žฅํ•˜์ง€ ์•Š๋„๋ก ํ•œ๋‹ค. (๋ฉ”๋ชจ๋ฆฌ, ์—ฐ์‚ฐ ์‹œ๊ฐ„ ์ค„์ด๊ธฐ) 

 

 

โœ”  python with ์ ˆ 

 

โˆ˜  ์ž์›์„ ํš๋“ํ•˜๊ณ , ์‚ฌ์šฉํ•˜๊ณ , ๋ฐ˜๋‚ฉํ•  ๋•Œ ์ฃผ๋กœ ์‚ฌ์šฉํ•œ๋‹ค. 

โˆ˜  ํŒŒ์ด์ฌ์—์„œ open() ํ•จ์ˆ˜๋ฅผ ํ†ตํ•ด ํŒŒ์ผ์„ ์—ด๋ฉด ๊ผญ close() ๋ฅผ ํ•ด์ฃผ์–ด์•ผ ํ•˜๋Š”๋ฐ, with ๊ตฌ๋ฌธ์„ ์“ฐ๋ฉด close() ๋ฅผ ์ž๋™์œผ๋กœ ํ˜ธ์ถœํ•ด์ฃผ๊ธฐ ๋•Œ๋ฌธ์— close ๋ฌธ์„ ์“ฐ์ง€ ์•Š์•„๋„ ๋˜๋Š” ์žฅ์ ์ด ์žˆ๋‹ค. 

 

 

 

 

 

๐Ÿ”น  ์ฝ”๋“œ ๋ง›๋ณด๊ธฐ 

 

๐Ÿ’จ ๋ถ„๋ฅ˜ ๋ฌธ์ œ 

 

โˆ˜  car_evaluation  dataset : output ์€ ์ฐจ์˜ ์ƒํƒœ์— ๊ด€ํ•œ ๋ฒ”์ฃผํ˜• ๊ฐ’์œผ๋กœ 4๊ฐœ์˜ ๋ฒ”์ฃผ๋ฅผ ๊ฐ–๋Š”๋‹ค. 

 

(1) ๋ฐ์ดํ„ฐ ๋ถˆ๋Ÿฌ์˜ค๊ธฐ 

 

import torch 
import torch.nn as nn 
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 
import seaborn as sns 

%matplotlib inline 

# ๋ฐ์ดํ„ฐ ํ˜ธ์ถœ 

dataset = pd.read_csv('/content/car_evaluation.csv') 
dataset.head()

 

 

→ The words need an embedding step to turn them into vectors. 

 

 

 

(2) ๋ฐ์ดํ„ฐ ๋ถ„ํฌ ์‹œ๊ฐํ™” 

 

# target ๋ถ„ํฌ ์‹œ๊ฐํ™” 
fig_size = plt.rcParams["figure.figsize"] 
fig_size[0] = 8 
fig_size[1] = 6 

plt.rcParams["figure.figsize"] = fig_size 
dataset.output.value_counts().plot(kind = 'pie', autopct = '%0.05f%%', 
                                   colors = ['lightblue', 'lightgreen', 'orange','pink'], 
                                   explode = (0.05,0.05,0.05,0.05))

 

 

 

(3) ๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ 

 

 

For the categorical variables : convert with astype('category') → NumPy array → Tensor 

 

# Preprocessing : convert the columns to categorical type 

categorical_columns = ['price', 'maint', 'doors', 'persons', 'lug_capacity', 'safety']

for category in categorical_columns : 
  dataset[category] = dataset[category].astype('category')  # 📌 convert to categorical 


# 📌 convert the categorical data (words) to numbers (NumPy arrays) : cat.codes

price = dataset['price'].cat.codes.values 
maint = dataset['maint'].cat.codes.values
doors = dataset['doors'].cat.codes.values
persons = dataset['persons'].cat.codes.values
lug_capacity = dataset['lug_capacity'].cat.codes.values
safety = dataset['safety'].cat.codes.values

print(price)
print(price.shape)

 

categorical_data = np.stack([price, maint, doors, persons, lug_capacity, safety], axis=1) 
# six arrays of shape (1728,) stacked along axis=1 → (1728, 6)

categorical_data[:10] # print only the first 10 rows

 

 

 

 

โœ” np.concatenate ์™€ np.stack ์˜ ์ฐจ์ด 

 

: stack ์€ ์ง€์ •ํ•œ axis ๋ฅผ ์™„์ „ํžˆ ์ƒˆ๋กœ์šด axis ๋กœ ์ƒ๊ฐํ•˜์—ฌ ์—ฐ๊ฒฐํ•˜๋ฏ€๋กœ ๋ฐ˜๋“œ์‹œ ๋‘ ๋„˜ํŒŒ์ด ๋ฐฐ์—ด์˜ ์ฐจ์›์ด ๋™์ผํ•ด์•ผ ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•˜์ง€ ์•Š๋Š”๋‹ค. 

 

โ€ป stack ์ฐธ๊ณ  : https://everyday-image-processing.tistory.com/87 

 

 

→ axis ์ถ•์„ ์ง€์ •ํ•˜๋Š” ์œ„์น˜๋Œ€๋กœ ์ฐจ์›์ด 1๋กœ ์ฆ๊ฐ€ํ•œ ๊ฒƒ์„ ํ™•์ธํ•ด๋ณผ ์ˆ˜ ์žˆ๋‹ค. 

 

 

โˆ˜ ๋ฐฐ์—ด์„ ํ…์„œ๋กœ ๋ณ€ํ™˜ : torch.tensor( ) 

 

# ๋ฐฐ์—ด์„ ํ…์„œ๋กœ ๋ณ€ํ™˜ 
categorical_data = torch.tensor(categorical_data, dtype = torch.int64) 
categorical_data[:10]

 

โˆ˜ target ๋ณ€์ˆ˜์— ๋Œ€ํ•ด์„œ๋„ ํ…์„œ๋กœ ๋ณ€ํ™˜ : get_dummies ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋„˜ํŒŒ์ด ๋ฐฐ์—ด๋กœ ๋ณ€ํ™˜ํ•œ ํ›„, 1์ฐจ์› ํ…์„œ๋กœ ๋ณ€ํ™˜ํ•œ๋‹ค. 

 

# target ์นผ๋Ÿผ๋„ ํ…์„œ๋กœ ๋ณ€ํ™˜ 

outputs = pd.get_dummies(dataset.output)
outputs = outputs.values 
outputs = torch.tensor(outputs).flatten() # 1์ฐจ์› ํ…์„œ๋กœ ๋ณ€ํ™˜ 

print(categorical_data.shape) 
print(outputs.shape)

 

โœ” ravel(), reshape(), flatten() ์€ ํ…์„œ์˜ ์ฐจ์›์„ ๋ฐ”๊ฟ€ ๋•Œ ์‚ฌ์šฉํ•œ๋‹ค. 

 

 

(4) Word embedding 

 

โˆ˜ ์œ ์‚ฌํ•œ ๋‹จ์–ด๋ผ๋ฆฌ ์œ ์‚ฌํ•˜๊ฒŒ ์ธ์ฝ”๋”ฉ๋˜๋„๋ก ํ‘œํ˜„ํ•˜๋Š” ๋ฐฉ๋ฒ• 

โˆ˜ ๋†’์€ ์ฐจ์›์˜ ์ž„๋ฒ ๋”ฉ์ผ์ˆ˜๋ก ๋‹จ์–ด ๊ฐ„์˜ ์„ธ๋ถ€ ๊ด€๊ณ„๋ฅผ ์ž˜ ํŒŒ์•…ํ•  ์ˆ˜ ์žˆ๊ฒŒ ๋œ๋‹ค. 

โˆ˜ ๋‹จ์ผ ์ˆซ์ž๋กœ ๋ณ€ํ™˜๋œ ๋„˜ํŒŒ์ด ๋ฐฐ์—ด์„ N ์ฐจ์›์œผ๋กœ ๋ณ€๊ฒฝํ•œ๋‹ค. 

โˆ˜ ์ž„๋ฒ ๋”ฉ ํฌ๊ธฐ (๋ฒกํ„ฐ ์ฐจ์›) ์— ๋Œ€ํ•œ ์ •ํ™•ํ•œ ๊ทœ์น™์€ ์—†์œผ๋‚˜, ์นผ๋Ÿผ์˜ ๊ณ ์œ ๊ฐ’ ์ˆ˜ (๋ฒ”์ฃผ ๊ฐœ์ˆ˜) ๋ฅผ 2๋กœ ๋‚˜๋ˆ„๋Š” ๊ฒƒ์„ ๋งŽ์ด ์‚ฌ์šฉํ•œ๋‹ค. 

 

# define the word-embedding dimensions 

## number of categories (unique values) per column
categorical_column_sizes = [len(dataset[column].cat.categories) for column in categorical_columns]

## embedding size = (number of unique values, embedding dimension)
categorical_embedding_sizes = [(col_size, min(50, (col_size+1)//2)) for col_size in categorical_column_sizes]
print(categorical_embedding_sizes)

 

 

 

(5) ๋ฐ์ดํ„ฐ์…‹ ๋ถ„๋ฆฌ 

 

# ๋ฐ์ดํ„ฐ์…‹ ๋ถ„๋ฆฌ 

total_records = 1728 
test_records = int(total_records*0.2) # ์ „์ฒด ๋ฐ์ดํ„ฐ ์ค‘ 20%๋ฅผ ํ…Œ์ŠคํŠธ ์šฉ๋„๋กœ ์‚ฌ์šฉ 

categorical_train_data = categorical_data[:total_records - test_records]
categorical_test_data = categorical_data[total_records - test_records : total_records] 
train_outputs = outputs[:total_records - test_records] 
test_outputs = outputs[total_records - test_records : total_records]

 

 

(6) ๋ชจ๋ธ ๋„คํŠธ์›Œํฌ ์ƒ์„ฑ 

 

# ๋ชจ๋ธ ๋„คํŠธ์›Œํฌ ์ƒ์„ฑ 

class Model(nn.Module) : # 1๏ธโƒฃ 

  def __init__(self, embedding_size, output_size, layers, p = 0.4) : # 2๏ธโƒฃ
    super().__init__() # 3๏ธโƒฃ
    self.all_embeddings = nn.ModuleList(
        [nn.Embedding(ni,nf) for ni,nf in embedding_size])
    
    self.embedding_dropout = nn.Dropout(p) 

    all_layers = [] 
    num_categorical_cols = sum((nf for ni, nf in embedding_size))
    input_size = num_categorical_cols # ์ž…๋ ฅ์ธต์˜ ํฌ๊ธฐ๋ฅผ ์ฐพ๊ธฐ ์œ„ํ•ด ๋ฒ”์ฃผํ˜• ์นผ๋Ÿผ์˜ ๊ฐœ์ˆ˜๋ฅผ ์ €์žฅ 

    for i in layers : # 4๏ธโƒฃ
      all_layers.append(nn.Linear(input_size, i)) 
      all_layers.append(nn.ReLU(inplace=True)) 
      all_layers.append(nn.BatchNorm1d(i)) 
      all_layers.append(nn.Dropout(p)) 
      input_size = i 

    all_layers.append(nn.Linear(layers[-1], output_size)) 
    self.layers = nn.Sequential(*all_layers) 
    # ์‹ ๊ฒฝ๋ง์˜ ๋ชจ๋“  ๊ณ„์ธต์ด ์ˆœ์ฐจ์ ์œผ๋กœ ์‹คํ–‰๋˜๋„๋ก ๋ชจ๋“  ๊ณ„์ธต์— ๋Œ€ํ•œ ๋ชฉ๋ก์„ Sequential ํด๋ž˜์Šค๋กœ ์ „๋‹ฌ 
  
  def forward(self, x_categorical): # 5๏ธโƒฃ
    embeddings = []
    for i,e in enumerate(self.all_embeddings):
        embeddings.append(e(x_categorical[:,i]))
    x = torch.cat(embeddings, 1)
    x = self.embedding_dropout(x)
    x = self.layers(x)
    return x

 

 

โ‘  class ํ˜•ํƒœ๋กœ ๊ตฌํ˜„๋˜๋Š” ๋ชจ๋ธ์€ nn.Module ์„ ์ƒ์†๋ฐ›๋Š”๋‹ค. 

 

โ‘ก __init__ : ๋ชจ๋ธ์—์„œ ์‚ฌ์šฉ๋  ํŒŒ๋ผ๋ฏธํ„ฐ์™€ ์‹ ๊ฒฝ๋ง์„ ์ดˆ๊ธฐํ™”ํ•˜๊ธฐ ์œ„ํ•œ ์šฉ๋„๋กœ ์‚ฌ์šฉ๋˜๋ฉฐ, ๊ฐ์ฒด๊ฐ€ ์ƒ์„ฑ๋  ๋•Œ ์ž๋™์œผ๋กœ ํ˜ธ์ถœ๋œ๋‹ค. 

 

  • self : the first parameter, referring to the object being created 
  • embedding_size : embedding sizes of the categorical columns 
  • output_size : size of the output layer 
  • layers : the list of hidden-layer sizes 
  • p : dropout probability (0.4 here) 

 

โ‘ข super().__init__() : ๋ถ€๋ชจ ํด๋ž˜์Šค์— ์ ‘๊ทผํ•  ๋•Œ ์‚ฌ์šฉํ•˜๋ฉฐ super ์•ˆ์—๋Š” self ๋ฅผ ์‚ฌ์šฉํ•˜์ง€ ์•Š๋Š”๋‹ค. 

 

โ‘ฃ ๋ชจ๋ธ์˜ ๋„คํŠธ์›Œํฌ๋ฅผ ๊ตฌ์ถ•ํ•˜๊ธฐ ์œ„ํ•ด for ๋ฌธ์„ ์‚ฌ์šฉํ•˜์—ฌ ๊ฐ ๊ณ„์ธต์„ all_layers ๋ชฉ๋ก์— ์ถ”๊ฐ€ํ•œ๋‹ค. 

 

โ‘ค forward() :  ํ•™์Šต ๋ฐ์ดํ„ฐ๋ฅผ ์ž…๋ ฅ๋ฐ›์•„ ์—ฐ์‚ฐ์„ ์ง„ํ–‰ํ•œ๋‹ค. 

 

 

(7) ๊ฐ์ฒด ์ƒ์„ฑ 

 

โˆ˜ ๊ฐ์ฒด๋ฅผ ์ƒ์„ฑํ•˜๋ฉด์„œ ํŒŒ๋ผ๋ฏธํ„ฐ ๊ฐ’์„ ์ž…๋ ฅํ•œ๋‹ค. 

 

# ๋ชจ๋ธ ๊ฐ์ฒด ์ƒ์„ฑ 

model = Model(categorical_embedding_sizes, 4, [200,100,50], p =0.4)  
# (๋ฒ”์ฃผํ˜• ์นผ๋Ÿผ์˜ ์ž„๋ฒ ๋”ฉ ํฌ๊ธฐ, ์ถœ๋ ฅํฌ๊ธฐ, ์€๋‹‰์ธต์˜ ๋‰ด๋Ÿฐ, ๋“œ๋กญ์•„์›ƒ) 
# [200,100,50] : ๋‹ค๋ฅธ ํฌ๊ธฐ๋กœ ์ง€์ •ํ•ด์„œ ํ…Œ์ŠคํŠธ ํ•ด๋„ ๊ดœ์ฐฎ์Œ

 

 

 

(8) Defining the loss function and optimizer, allocating resources 

 

โˆ˜ ๋ถ„๋ฅ˜ ๋ฌธ์ œ์ด๊ธฐ ๋•Œ๋ฌธ์— ์†์‹คํ•จ์ˆ˜๋Š” ํฌ๋กœ์Šค ์—”ํŠธ๋กœํ”ผ๋ฅผ ์‚ฌ์šฉํ•˜๊ณ , ์˜ตํ‹ฐ๋งˆ์ด์ €๋Š” ์•„๋‹ด์„ ์‚ฌ์šฉํ•œ๋‹ค. 

 

loss_function = nn.CrossEntropyLoss() 
optimizer = torch.optim.Adam(model.parameters(), lr = 0.001)

 

โˆ˜ ํŒŒ์ดํ† ์น˜๋Š” GPU ์— ์ตœ์ ํ™”๋œ ๋”ฅ๋Ÿฌ๋‹ ํ”„๋ ˆ์ž„์›Œํฌ์ด๋ฏ€๋กœ GPU ์‚ฌ์šฉ์ด ๊ฐ€๋Šฅํ•˜๋‹ค๋ฉด ์ž์›์„ ํ• ๋‹นํ•ด์ค€๋‹ค. 

 

# choose GPU or CPU 

if torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')

 

 

 

(9) ๋ชจ๋ธ ํ•™์Šต 

 

# ๋ชจ๋ธ ํ•™์Šต 

epochs = 500
aggregated_losses = []
train_outputs = train_outputs.to(device=device, dtype=torch.int64)


for i in range(epochs):# ๊ฐ ๋ฐ˜๋ณต๋งˆ๋‹ค ์†์‹คํ•จ์ˆ˜๊ฐ€ ์˜ค์ฐจ๋ฅผ ๊ณ„์‚ฐํ•œ๋‹ค. 
    i += 1
    y_pred = model(categorical_train_data).to(device)
    single_loss = loss_function(y_pred, train_outputs)
    aggregated_losses.append(single_loss)  # ๋ฐ˜๋ณตํ• ๋•Œ๋งˆ๋‹ค ์˜ค์ฐจ๋ฅผ ์ถ”๊ฐ€ 

    if i%25 == 1:
        print(f'epoch: {i:3} loss: {single_loss.item():10.8f}')

    optimizer.zero_grad() 
    single_loss.backward() # ๊ฐ€์ค‘์น˜ ์—…๋ฐ์ดํŠธ 
    optimizer.step() # ๊ธฐ์šธ๊ธฐ ์—…๋ฐ์ดํŠธ

print(f'epoch: {i:3} loss: {single_loss.item():10.10f}') # 25 ์—ํฌํฌ๋งˆ๋‹ค ์˜ค์ฐจ ์ถœ๋ ฅ

 

 

 

 

 

 

(10) Predicting on the test dataset

 

# predict 

test_outputs = test_outputs.to(device = device, dtype = torch.int64) 
with torch.no_grad() : 
  y_val = model(categorical_test_data) 
  loss = loss_function(y_val, test_outputs) 

print(f'Loss : {loss:.8f}')

ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ์—์„œ ๋„์ถœ๋œ loss ๊ฐ’๊ณผ ๋น„์Šทํ•œ ๊ฐ’์ด๋ฏ€๋กœ ๊ณผ์ ํ•ฉ์€ ๋ฐœ์ƒํ•˜์ง€ ์•Š์•˜๋‹ค๊ณ  ํ•ด์„ ๊ฐ€๋Šฅ

 

print(y_val[:5]) # print the first 5 predictions 

# because the network's output size was set to 4, each row shows the four neuron values for one prediction

 

 

→ Since the original target had 4 categories, the network's output size was also set to 4, and the index of the largest value in each row is the predicted category. 

 

5๊ฐœ์˜ ์˜ˆ์ธก๊ฒฝ์šฐ์— ๋Œ€ํ•ด ๋ชจ๋‘ ์ธ๋ฑ์Šค 0์ด ์ถœ๋ ฅ๋จ

 

 

 

โ€ป argmax 

 

  • axis=1 : compares elements along each row and returns the index of the maximum
  • axis=0 : compares elements along each column and returns the index of the maximum 
  • without axis, all elements are treated as one 1-D array in order, and the index of the maximum is returned. 
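A small NumPy sketch of these three cases (values chosen only for illustration):

```python
import numpy as np

scores = np.array([[0.1, 0.7, 0.2],
                   [0.8, 0.1, 0.1]])

print(np.argmax(scores, axis=1))  # [1 0] : best column index per row
print(np.argmax(scores, axis=0))  # [1 0 0] : best row index per column
print(np.argmax(scores))          # 3 : index within the flattened array
```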

 

(11) ๋ชจ๋ธ์„ฑ๋Šฅ ํ‰๊ฐ€ 

 

import warnings
warnings.filterwarnings('ignore') 
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

test_outputs = test_outputs.cpu().numpy()
y_val = np.argmax(y_val.cpu().numpy(), axis=1) # turn each row of four neuron values into a class index

print(confusion_matrix(test_outputs,y_val))
print(classification_report(test_outputs,y_val))
print(accuracy_score(test_outputs, y_val))

 

๋ชจ๋“  ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ๋ฌด์ž‘์œ„๋กœ ์„ ํƒํ–ˆ๋‹ค๋Š” ๊ฒƒ์„ ๊ฐ์•ˆํ•˜๋ฉด ๋‚˜์˜์ง€ ์•Š์€ ์„ฑ๋Šฅ (์ •ํ™•๋„ 75%)

 

→ ํŒŒ๋ผ๋ฏธํ„ฐ (ํ›ˆ๋ จ/ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ์…‹ ๋ถ„ํ• , ์€๋‹‰์ธต ๊ฐœ์ˆ˜ ๋ฐ ํฌ๊ธฐ ๋“ฑ) ์„ ๋ณ€๊ฒฝํ•˜๋ฉด์„œ ๋” ๋‚˜์€ ์„ฑ๋Šฅ ์ฐพ์•„๋ณด๊ธฐ 

 

 

 

 

 

 

 
