1️⃣ Lecture 6 Review
🔹 Main Topic: Graph Neural Networks
① Review: Node embedding
• When similar nodes in a graph are embedded into d dimensions through a function f, they should end up close together in the embedding space.
▪ Encoder: maps each node to a low-dimensional vector
▪ Similarity function: ensures that node similarity in the original graph corresponds to the dot product of the node vectors in the embedding space
• Shallow encoding (embedding lookup): each node's embedding vector is stored as one column of an embedding matrix, and encoding a node is simply reading that vector off, as in the sketch below → 🤨 Since no parameters are shared between nodes, the matrix keeps growing as the number of nodes increases, and no embedding can be produced for nodes unseen during training. Node feature information is not used either.
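A minimal sketch of shallow encoding, assuming a hypothetical graph with num_nodes nodes (note torch.nn.Embedding stores the matrix row-wise rather than column-wise):

import torch

num_nodes, d = 2708, 64                  # one d-dimensional vector per node
Z = torch.nn.Embedding(num_nodes, d)     # the embedding matrix itself is the full parameter set
z_v = Z(torch.tensor([5]))               # "encoding" node 5 is just a table lookup
print(z_v.shape)                         # torch.Size([1, 64])
# parameters grow as num_nodes * d, and an unseen node id has no row to look up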
② GNN
• Uses an encoder built from multiple layers to overcome the limitations of the simple lookup embedding
• Tasks: node classification, link prediction, community detection, network similarity
👉 However, there are challenges:
▪ Networks have arbitrary size and complex topological structure.
▪ There is no fixed reference point and no canonical node ordering.
▪ They are dynamic and carry multimodal features.
🔹 Deep Learning for Graphs
① Notation
• V: set of nodes
• A: adjacency matrix (binary; encodes whether two nodes are connected)
• X: node feature matrix
• N(v): set of neighbor nodes of v
② Convolutional Networks
※ Reference notes: https://manywisdom-career.tistory.com/71
• Convolution operation: produces the output by summing all the information gathered through a sliding window
• A node is embedded by transforming and combining information from its neighbor nodes.
• Layer-k embedding: an embedding built from the information of neighbors up to k hops away
• Neighborhood aggregation: how information is aggregated from the neighbors is what distinguishes different architectures. The aggregation function must be permutation invariant (unaffected by the order of its inputs); averaging the neighbor information is the most commonly used choice, as the check below illustrates.
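A quick numeric check of permutation invariance (the tensors here are made-up toy values):

import torch

msgs = torch.tensor([[1., 2.], [3., 4.], [5., 6.]])   # feature messages from 3 neighbors
perm = torch.tensor([2, 0, 1])                        # the same neighbors, visited in a different order
print(torch.allclose(msgs.mean(dim=0), msgs[perm].mean(dim=0)))  # True: the mean ignores ordering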
③ Mathematical formulation
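The layer-wise update this section refers to (the standard mean-aggregation form from CS224W lecture 6) can be written as:

h_v^{(0)} = x_v
h_v^{(k+1)} = \sigma\Big( W_k \sum_{u \in N(v)} \frac{h_u^{(k)}}{|N(v)|} + B_k \, h_v^{(k)} \Big), \quad k = 0, \dots, K-1
z_v = h_v^{(K)}

where σ is a non-linearity such as ReLU and K is the number of layers.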
👉 W_k, B_k: trainable weight parameters. W_k weights the information aggregated from the neighbor nodes, and B_k weights the node's own embedding from the previous layer; together they control whether the model focuses on neighbor information or on the node's own state. Because these parameters are shared across all nodes when embedding any particular node, the model can generalize to new nodes and unseen graphs. A minimal implementation sketch follows.
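A minimal dense-tensor sketch of this update for one layer (an illustrative simplification; gnn_layer and its arguments are hypothetical names, not the PyG implementation used later):

import torch

def gnn_layer(H, A, W, B):
    # H: (N, d_in) previous-layer embeddings, A: (N, N) binary adjacency matrix
    deg = A.sum(dim=1, keepdim=True).clamp(min=1)   # |N(v)|, clamped to avoid division by zero
    neigh_mean = (A @ H) / deg                      # average of neighbor embeddings
    return torch.relu(neigh_mean @ W + H @ B)       # W_k on aggregated neighbors, B_k on self

H = torch.randn(5, 8); A = (torch.rand(5, 5) < 0.4).float()
W = torch.randn(8, 4); B = torch.randn(8, 4)
print(gnn_layer(H, A, W, B).shape)  # torch.Size([5, 4])

Because W and B are plain (d_in, d_out) matrices applied to every row of H, the same parameters work for a graph of any size, which is exactly why the model generalizes to unseen nodes.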
④ How GNNs are trained
• Goal: node embedding z_v
• Input: graph
▪ Unsupervised setting: use the graph structure itself as supervision
▪ Supervised setting [node classification]: for node labels y, define a loss function between the true labels and the labels predicted from the node embeddings, and train on it
2️⃣ Code Review
https://colab.research.google.com/drive/1DsdBei9OSz4yRZ-KIGEU6iaHflTTRGY-?usp=sharing
🔹 Dataset
• Cora dataset
- A citation network: edges represent one paper citing another
- 2708 scientific papers, each belonging to exactly one of 7 classes
- 5429 links (edges)
- Each node is a binary word vector over a fixed dictionary: 0 (word absent) or 1 (word present). The dictionary contains 1433 words 👉 node_features = 1433
- Main task: node classification (CrossEntropyLoss, multi-class)
① Data Normalization
→ T.NormalizeFeatures() normalizes each node's feature row to sum to 1; this feature-level normalization is separate from the node-degree normalization that GCN itself applies internally during message passing.
from torch_geometric.datasets import Planetoid
import torch_geometric.transforms as T

dataset = Planetoid("/tmp/Cora", name="Cora")
print(f'Row sums of the feature matrix without normalization: {dataset[0].x.sum(dim=-1)}')
dataset = Planetoid("/tmp/Cora", name="Cora", transform=T.NormalizeFeatures())  # row-normalizing transform
print(f'Row sums of the feature matrix with normalization applied: {dataset[0].x.sum(dim=-1)}')  # dim = axis
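After the transform, every row of dataset[0].x sums to 1: T.NormalizeFeatures() divides each node's feature vector by its row sum, so papers with many words and papers with few words enter the model at the same scale.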
② GCN model architecture

import torch
from torch import Tensor
from torch_geometric.nn import GCNConv

class GCN(torch.nn.Module):
    def __init__(self, num_node_features: int, num_classes: int,
                 hidden_dim: int = 16, dropout_rate: float = 0.5):
        super().__init__()
        self.dropout1 = torch.nn.Dropout(dropout_rate)
        self.conv1 = GCNConv(num_node_features, hidden_dim)  # (conv1): GCNConv(1433, 16)
        self.relu = torch.nn.ReLU(inplace=True)
        self.dropout2 = torch.nn.Dropout(dropout_rate)
        self.conv2 = GCNConv(hidden_dim, num_classes)        # (conv2): GCNConv(16, 7)

    def forward(self, x: Tensor, edge_index: Tensor) -> Tensor:
        x = self.dropout1(x)
        x = self.conv1(x, edge_index)
        x = self.relu(x)
        x = self.dropout2(x)
        x = self.conv2(x, edge_index)
        return x
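Note that forward returns raw logits without a softmax: torch.nn.CrossEntropyLoss used below applies log-softmax internally, so an explicit softmax layer here would be redundant.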
③ Training and Evaluation

from typing import Callable, Literal, Tuple
from torch_geometric.data import Data

# Type aliases assumed from the notebook context:
LossFn = Callable[[Tensor, Tensor], Tensor]
Stage = Literal['train', 'val', 'test']

def train_step(model: torch.nn.Module, data: Data, optimizer: torch.optim.Optimizer,
               loss_fn: LossFn) -> Tuple[float, float]:
    model.train()
    optimizer.zero_grad()
    mask = data.train_mask
    logits = model(data.x, data.edge_index)[mask]
    preds = logits.argmax(dim=1)
    y = data.y[mask]
    loss = loss_fn(logits, y)
    acc = (preds == y).sum().item() / y.numel()  # ⭐ numel: number of elements in the tensor
    loss.backward()
    optimizer.step()
    return loss.item(), acc

@torch.no_grad()
def eval_step(model: torch.nn.Module, data: Data, loss_fn: LossFn,
              stage: Stage) -> Tuple[float, float]:
    model.eval()
    mask = getattr(data, f'{stage}_mask')
    logits = model(data.x, data.edge_index)[mask]
    preds = logits.argmax(dim=1)
    y = data.y[mask]
    loss = loss_fn(logits, y)
    acc = (preds == y).sum().item() / y.numel()  # ⭐
    return loss.item(), acc

• optimizer.zero_grad(): resets the accumulated gradients to zero before the new backward pass (it does not reset the parameters)
• preds = logits.argmax(dim=1): returns the index of the largest logit 👉 one of the 7 class indices
④ Train function definition and training

import numpy as np

SEED = 42
MAX_EPOCHS = 200
LEARNING_RATE = 0.01
WEIGHT_DECAY = 5e-4
EARLY_STOPPING = 10

def train(model: torch.nn.Module, data: Data, optimizer: torch.optim.Optimizer,
          loss_fn: LossFn = torch.nn.CrossEntropyLoss(), max_epochs: int = 200,
          early_stopping: int = 10, print_interval: int = 20, verbose: bool = True):
    history = {'loss': [], 'val_loss': [], 'acc': [], 'val_acc': []}
    for epoch in range(max_epochs):
        loss, acc = train_step(model, data, optimizer, loss_fn)
        val_loss, val_acc = eval_step(model, data, loss_fn, 'val')
        history['loss'].append(loss)
        history['acc'].append(acc)
        history['val_loss'].append(val_loss)
        history['val_acc'].append(val_acc)
        # early stopping: halt when the current val loss exceeds its mean over the previous `early_stopping` epochs
        if epoch > early_stopping and val_loss > np.mean(history['val_loss'][-(early_stopping + 1):-1]):
            if verbose:
                print('\nearly stopping ...')
            break
        if verbose and epoch % print_interval == 0:
            print(f'\nEpoch: {epoch}\n----------')
            print(f'Train loss: {loss:.4f} | Train acc: {acc:.4f}')
            print(f'  Val loss: {val_loss:.4f} |   Val acc: {val_acc:.4f}')
    test_loss, test_acc = eval_step(model, data, loss_fn, 'test')
    if verbose:
        print(f'\nEpoch: {epoch}\n----------')
        print(f'Train loss: {loss:.4f} | Train acc: {acc:.4f}')
        print(f'  Val loss: {val_loss:.4f} |   Val acc: {val_acc:.4f}')
        print(f' Test loss: {test_loss:.4f} |  Test acc: {test_acc:.4f}')
    return history

• Sets the loss function and hyperparameters (max_epochs, early_stopping)
• Prints accuracy and loss at regular intervals during training
import matplotlib.pyplot as plt

torch.manual_seed(SEED)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = GCN(dataset.num_node_features, dataset.num_classes).to(device)
data = dataset[0].to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE, weight_decay=WEIGHT_DECAY)
history = train(model, data, optimizer, max_epochs=MAX_EPOCHS, early_stopping=EARLY_STOPPING)
plt.figure(figsize=(12, 4))
plot_history(history, 'GCN')  # plot_history: plotting helper defined elsewhere in the notebook