๋ณธ๋ฌธ ๋ฐ”๋กœ๊ฐ€๊ธฐ
1๏ธโƒฃ AI•DS/๐ŸฅŽ Casual inference

์ธ๊ณผ์ถ”๋ก ์˜ ๋ฐ์ดํ„ฐ ๊ณผํ•™ - ๊ฐ€์ƒ์˜ ํ†ต์ œ์ง‘๋‹จ

by isdawell 2023. 4. 25.
728x90

์ฐธ๊ณ ์˜์ƒ : Bootcamp 3-4.  ๊ฐ€์ƒ์˜ ํ†ต์ œ์ง‘๋‹จ 

 

 

 

1. synthetic control vs DID 


 

โ—ฏ  Synthetic control 

 

 

•  synthetic control ์€ DID ์˜ ํ™•์žฅ๋ฒ„์ „์ด๋‚˜ ์ข€ ๋” ์œ ์—ฐํ•œ ๋ฐฉ๋ฒ•์ด๋‹ค. 

•  ๋งค์นญ์ด ์„ฑ๋ฆฝ๋˜์ง€ ์•Š๊ณ , parallel trend ๊ฐ€์ •์ด ์„ฑ๋ฆฝํ•˜์ง€ ์•Š๋”๋ผ๋„ ์ ์šฉํ•  ์ˆ˜ ์žˆ๋‹ค. 

•  ์ตœ๊ทผ ๊ฐ€์žฅ ์ฃผ๋ชฉ๋ฐ›๋Š” ๋ฐฉ๋ฒ•๋ก ์ด๊ธฐ๋„ ํ•˜๋‹ค. 

 

 

•  control group ์„ ์กฐํ•ฉํ•จ์œผ๋กœ์„œ ๊ฐ€์ƒ์˜ ๋น„๊ต๊ฐ€๋Šฅํ•œ ํ†ต์ œ์ง‘๋‹จ์„ ๊ตฌ์„ฑํ•  ์ˆ˜ ์žˆ๋Š” ๋ฐฉ๋ฒ•์ด๋‹ค. 

 

 

 

 

 

 

 

2. Example


 

โ—ฏ  ์บ˜๋ฆฌํฌ๋‹ˆ์•„์˜ ๋‹ด๋ฐฐ ๊ทœ์ œ์˜ ๋‹ด๋ฐฐ ํŒ๋งค๋Ÿ‰์— ๋ฏธ์นœ ํšจ๊ณผ 

 

 

•  ์บ˜๋ฆฌํฌ๋‹ˆ์•„์—์„œ๋งŒ 1988๋…„์— ๋„์ž…๋จ, ๊ทœ์ œ๊ฐ€ ๋„์ž…๋˜์ง€ ์•Š์€ 49๊ฐœ์˜ ๋‹ค๋ฅธ ์ฃผ์™€ ๋น„๊ตํ•˜๊ณ ์ž ํ•˜์ง€๋งŒ, ์œ„์˜ ๊ทธ๋ฆผ๊ณผ ๊ฐ™์ด parallel trend ๋ฅผ ๋”ฐ๋ฅด์ง€ ์•Š์Œ โ‡จ synthetic control ์ด ํ•„์š” 

 

 

•  synthetic ์บ˜๋ฆฌํฌ๋‹ˆ์•„๋ฅผ ๋งŒ๋“ค๊ธฐ ์œ„ํ•ด, ๊ฐ ์ฃผ์— ๋Œ€ํ•ด weight linear combination ์„ ํ†ตํ•ด ์บ˜๋ฆฌํฌ๋‹ˆ์•„๋ฅผ ๋ชจ๋ฐฉํ•  ์ˆ˜ ์žˆ๋Š” ๊ฐ€์ƒ์˜ ์ฃผ๋ฅผ ๋งŒ๋“ ๋‹ค. 

•  Donor pool : synthetic control ์„ ๋งŒ๋“œ๋Š”๋ฐ ํˆฌ์ž…๋˜๋Š” control unit 

 

 

โ—ฏ  ์„œ๋…๊ณผ ๋™๋…์˜ ํ†ต์ผ์ด ๊ฒฝ์ œ์— ๋ฏธ์นœ ์ธ๊ณผ์  ํšจ๊ณผ 

 

•  ํ†ต์ผ์ด ๋˜์ง€ ์•Š์•˜๋”๋ผ๋ฉด ์žˆ์—ˆ์„ counterfactual ์ž์ฒด๋Š” ๊ด€์ฐฐํ•  ์ˆ˜ ์—†์Œ 

•  ๋…์ผ๊ณผ ์œ ์‚ฌํ•œ ์ƒํ™ฉ์˜ ๊ตญ๊ฐ€๋ฅผ ์ฐพ๊ธฐ๋„ ์–ด๋ ค์›€ 

 

 

•  Synthetic control ์„ ์ ์šฉํ•˜์—ฌ, ์ฃผ๋ณ€ ๊ตญ๊ฐ€๋“ค์„ ์ ์ ˆํžˆ weight ์„ ์ฃผ์–ด์„œ ์กฐํ•ฉํ–ˆ๋”๋‹ˆ, ์„œ๋…์˜ ๊ฒฝ์ œ๋ฅผ ๋ชจ๋ฐฉํ•˜๋Š” ๊ฐ€์ƒ์˜ ๊ตญ๊ฐ€๋ฅผ ๋งŒ๋“ค ์ˆ˜ ์žˆ์—ˆ๋‹ค. 

•  Donor pool ์—์„œ weight ์„ ์–ด๋–ป๊ฒŒ ์„ค์ •ํ•˜๋Š๋ƒ๊ฐ€ ๊ด€๊ฑด 

 

 

 

 

 

 

3. How to construct the synthetic control 


 

โ—ฏ  Original Method

 

 

•  treatment ๋ฅผ ๋ฐ›๊ธฐ ์ด์ „์˜ outcome ๊ณผ predictor ์— ๋Œ€ํ•ด์„œ ์ฒ˜์น˜๊ทธ๋ฃน์—์„œ์˜ ๊ฐ’๊ณผ ํ†ต์ œ์ง‘๋‹จ์—์„œ์˜ ์กฐํ•ฉ์—์„œ์˜ ์ฐจ์ด๊ฐ€ ์ตœ์†Œํ™” ๋˜๋„๋ก rate ์„ ๊ตฌํ•œ๋‹ค. 

 

 

•  control unit ์—์„œ์˜ ์กฐํ•ฉ์œผ๋กœ treated unit ์— ๊ฐ€๊นŒ์›Œ์งˆ ์ˆ˜ ์žˆ๋Š” weight ์„ ์ฐพ๊ธฐ โ‡จ optimization ๋ฌธ์ œ 

•  t=1 ์ธ ์‹œ์ ์˜ outcome, t=2 ์ธ ์‹œ์ ์˜ outcome, t=2 ์ธ ์‹œ์ ์—์„œ์˜ predictor X โ‡จ ์„ธ ๋ณ€์ˆ˜๋“ค ๊ฐ„์˜ weight ์„ ๊ณ ๋ ค : w-weights 

•  control unit ๋ณ„๋กœ ์–ด๋–ป๊ฒŒ weight ์„ ์ฃผ์–ด์„œ synthetic control ์„ ๋งŒ๋“ค์ง€ : v-weights 

 

โ‡จ  W1โˆ™(Y1' - (V1โˆ™Y1,1 + V2โˆ™Y2,1 + V3โˆ™Y3,1)) + W2โˆ™(Y2' - .....) + W3โˆ™(Y3' - .....) ๋ฅผ minimize ํ•˜๋Š” weight ์ฐพ๊ธฐ 

โ‡จ  W, V์˜ ์ด์ฐจํ•ญ ๋ฌธ์ œ : Quadratic programming 

 

 

•  synthetic control ์€ ๊ฐ๊ฐ์˜ treated unit ์— ๋Œ€ํ•ด ๋”ฐ๋กœ ๊ตฌํ•  ์ˆ˜ ๋ฐ–์— ์—†๋‹ค. 

•  optimization ๋ฌธ์ œ์ด๊ธฐ ๋•Œ๋ฌธ์— control variable ์ด ๋งŽ์œผ๋ฉด ์‹œ๊ฐ„์ด ์˜ค๋ž˜๊ฑธ๋ฆฌ๊ณ  ์ตœ์ ๊ฐ’์ด ๋„์ถœ๋˜์ง€ ์•Š์„ ์ˆ˜ ์žˆ๋‹ค. ๋ชจ๋“  control variable ์ด ๋ชจ๋‘ donor pool ์ด์ง€ ์•Š์•„๋„ ๋œ๋‹ค. donor pool ์„ ์ ์ ˆํžˆ ์ค„์ด๋Š” ๊ฒƒ๋„ ์ค‘์š”ํ•˜๋‹ค. 

 

 

โ—ฏ  Various methods 

 

 

•  negative weight ์„ ์“ฐ๊ฑฐ๋‚˜, ์—˜๋ผ์Šคํ‹ฑ๋„ท/๋ผ์˜ ๋“ฑ์„ ์‚ฌ์šฉํ•˜๊ฑฐ๋‚˜ ๋‹ค์–‘ํ•œ ๋ฐฉ๋ฒ•๋“ค์ด ์ƒ๊ฒจ๋‚˜๊ณ  ์žˆ๋‹ค. 

 

 

โ—ฏ  Synthetic difference in difference 

 

 

•  ๊ธฐ์กด์˜ synthetic control ์€ ์‹œ๊ฐ„์— ๋”ฐ๋ฅธ ๋ณ€ํ™”๋ฅผ ํฌ๊ฒŒ ๊ณ ๋ คํ•˜์ง€ ์•Š๊ณ , control unit ๊ณผ ๋ณ€์ˆ˜์— ๋Œ€ํ•œ weight ๋งŒ์„ ๊ณ ๋ คํ•œ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜, synthetic DID ๋Š” ๊ธฐ์กด์˜ DID ์ฒ˜๋Ÿผ unit fixed effect ์™€ time fixed effect ๋ฅผ ๋ชจ๋‘ ๊ณ ๋ คํ•˜๊ณ , ๋™์‹œ์— ๊ธฐ์กด์˜ synthetic control ์ฒ˜๋Ÿผ ๋™์ž‘ํ•˜๋ฉด์„œ, time weight ๋„ ๊ณ ๋ คํ•œ๋‹ค. ์ด๋ฅผํ†ตํ•ด fixed effect ์™€ weight ์„ ๋ชจ๋‘ ๊ณ ๋ คํ•  ์ˆ˜ ์žˆ๊ฒŒ ๋œ๋‹ค. 

 

 

 

โ—ฏ  Bayesian synthetic control 

 

 

 

 

•  synthetic control ์€ prediction problem ์œผ๋กœ ๊ท€๊ฒฐ๋  ์ˆ˜ ์žˆ๋‹ค. out of sample prediction ๋ฌธ์ œ์ด๊ธฐ ๋•Œ๋ฌธ์— ๋จธ์‹ ๋Ÿฌ๋‹ ๋ฐฉ๋ฒ•๋ก ์ด ๋งŽ์ด ์ ์šฉ๋˜๋Š” ๋ถ„์•ผ์ด๋‹ค. 

 

 

 

 

 

 

 

4. Inference for synthetic control 


 

โ—ฏ  Placebo test

 

•  ํ†ต๊ณ„์ ์œผ๋กœ synthetic theory ์— ๊ธฐ๋ฐ˜์— standard error ๋‚˜ p-value ๋ฅผ ๊ตฌํ•  ์ˆ˜๊ฐ€ ์—†๋‹ค. 

•  synthetic control ์„ ํ•  ๋•Œ, ์–ด๋– ํ•œ ์‹์œผ๋กœ inference ๋ฅผ ํ•˜๋ƒ โ‡จ Placebo test : synthetic control ๊ณผ actual ๊ฐ’์˜ ์ฐจ์ด๊ฐ€ treatment ์ „ํ›„๋กœ ์–ผ๋งˆ๋‚˜ ์ฐจ์ด๊ฐ€ ๋‚˜๋Š”์ง€ ํ™•์ธํ•œ๋‹ค. 

 

 

•  treatment ์ดํ›„์— ์‹ค์ œ๊ฐ’๊ณผ synthetic control ๊ฐ„์˜ ์ฐจ์ด๊ฐ€ 100๋ฐฐ ์ด์ƒ์œผ๋กœ ํผ 

•  ์˜…์€ ํšŒ์ƒ‰ ์„ ๋“ค์€ ์บ˜๋ฆฌํฌ๋‹ˆ์•„๋ฅผ ์ œ์™ธํ•œ ๋‚˜๋จธ์ง€ ์ฃผ๋“ค์— ๋Œ€ํ•œ synthetic control ๊ณผ ์‹ค์ œ ๊ฐ’์˜ ์ฐจ์ด 

•  ๋‹ค๋ฅธ ๋Œ€๋ถ€๋ถ„์˜ ์ฃผ๋“ค์—์„œ๋Š” treatment ์ด์ „๊ณผ ์ดํ›„์— ์ฐจ์ด๊ฐ€ ๋ณ„๋กœ ์—†๋‹ค (์‹ค์ œ treatment ๊ฐ€ ์—†์—ˆ๊ธฐ ๋•Œ๋ฌธ) โ‡จ ์บ˜๋ฆฌํฌ๋‹ˆ์•„์— ๋Œ€ํ•œ ํšจ๊ณผ๊ฐ€ ์œ ์˜ํ•˜๋‹ค๊ณ  ๊ฒฐ๋ก ๋‚ด๋ฆด ์ˆ˜ ์žˆ๋‹ค. 

 

 

 

 

5. Sensitivity tests for synthetic control 


 

 

•  synthetic control ์„ ์–ด๋–ค ๋ณ€์ˆ˜๋กœ ๊ตฌ์„ฑํ•˜๋Š”์ง€์— ๋”ฐ๋ผ ๋‹ฌ๋ผ์ง : variable (predictors) ์— ๋Œ€ํ•œ sensitivity test ๊ฐ€ ํ•„์š” 

•  donor pool ๋‚ด์—์„œ control unit ์— ๋Œ€ํ•œ weight ์— ๋Œ€ํ•ด์„œ๋„ sensitivity test ๊ฐ€ ํ•„์š” 

•  ์—ฌ๋Ÿฌ๊ฐ€์ง€ ๊ฐ€๋Šฅ์„ฑ์— ๋Œ€ํ•ด robust ํ•œ ๊ฒฝ์šฐ์˜ ์ˆ˜๋ฅผ ์ฐพ๊ธฐ 

 

 

•  prediction ๋ฌธ์ œ์ด๊ธฐ ๋•Œ๋ฌธ์— ๋จธ์‹ ๋Ÿฌ๋‹์—์„œ ์ ์šฉํ•˜๋Š” Train , test ๋ฐ์ดํ„ฐ split ๋„ synthetic control ์— ์ ์šฉํ•ด๋ณผ ์ˆ˜ ์žˆ๋‹ค. 

 

 

 

6. Requirements for synthetic control 


•  ๋” ๊ณต๋ถ€ํ•ด๋ณด๊ณ  ์‹ถ๋‹ค๋ฉด review paper ์‚ดํŽด๋ณด๊ธฐ! 

 

 

 

 

๐Ÿ‘€ Control group ์ด ์—†๋‹ค๋ฉด → Time series forecasting model can be used to predict the counterfactual, ๊ทธ๋Ÿฌ๋‚˜ ์ด๋Ÿฌํ•œ ๊ฒฝ์šฐ ์™ธ๋ถ€ ์ถฉ๊ฒฉ ์ดํ›„์— ๋Œ€ํ•œ ์˜ˆ์ธก์€ ๋‹ค์†Œ ์–ด๋ ค์šธ ์ˆ˜ ์žˆ์Œ

 

 

728x90

๋Œ“๊ธ€