Summary Questions of the lecture


(1)

Summary Questions of the lecture

 Explain the spectral filtering of a graph signal

 Decompose the graph signal $\boldsymbol{f}$ with the graph Fourier transform, multiply the transformed signal by the filter $\hat{g}$ in the spectral domain, and then reconstruct the filtered signal in the spatial/temporal domain with the inverse graph Fourier transform.
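The decompose → filter → reconstruct pipeline above can be sketched in a few lines of NumPy. The graph, the signal, and the low-pass filter $g(\lambda) = e^{-\lambda}$ below are illustrative choices, not from the lecture:

```python
import numpy as np

# Graph: a 4-node path. A = adjacency, L = D - A (combinatorial Laplacian).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

lam, U = np.linalg.eigh(L)             # L = U diag(lam) U^T
f = np.array([1.0, -1.0, 1.0, -1.0])   # a "high-frequency" graph signal

f_hat = U.T @ f                        # GFT: decompose into spectral coefficients
g_hat = np.exp(-lam)                   # example low-pass filter g(lambda) = exp(-lambda)
f_out = U @ (g_hat * f_hat)            # filter, then inverse GFT (reconstruction)
```

The elementwise product `g_hat * f_hat` is exactly multiplication by the diagonal matrix $\hat{g}(\Lambda)$ in the spectral domain.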

(2)

Summary Questions of the lecture

 How to design the spectral filter for GCN?

 The spectral filters are learned in a data-driven way; to enable end-to-end learning, the filters are parametrized.

(3)

Summary Questions of the lecture

 What is the channel in a graph signal?

 Letting $N$ and $d_1$ denote the number of nodes and the feature dimension of each node, respectively, the graph signal is represented by an $N \times d_1$ matrix. Each column of this matrix is one 'channel'.

(4)

Summary Questions of the lecture

 How to deal with multi-channel signals in GCN?

 To create the $n$-th output channel, we transform each input channel with a different filter and add up the resulting signals. Thus, we need to learn $d_1 \times d_2$ filters.

(5)

Geometric deep learning via graph convolution … (continued)

GCN (𝒢)

$$\boldsymbol{h}_i^{(l+1)} = \sum_{v_j \in N(v_i)} f\big(\boldsymbol{x}_i,\, w_{ij},\, \boldsymbol{h}_j^{(l)},\, \boldsymbol{x}_j\big)$$

[Figure: example graph with nodes $v_1, \dots, v_5$, each carrying features $x_i$ and hidden state $h_i^{(l)}$]

(6)

Outline of Lecture (3)

 Graph Convolution Networks (GCN)

 What are the issues in GCN?

 Graph Filtering in GCN

 Graph Pooling in GCN

 Original GNN (Scarselli et al. 2005)

 Spectral GCN

 Spectral Filtering

 Graph Spectral Filtering in GCN

 Spectral Graph CNN (Bruna et al. ICLR 2014)

 ChebNet (Defferrard et al. NIPS 2016)

 Simplified ChebNet (Kipf & Welling, ICLR 2017)

 Spatial GCN

 GraphSage (Hamilton et al. NIPS 2017)

 GCN (Kipf & Welling. ICLR 2017)

 GAT (Veličković et al. ICLR 2018)

 MPNN (Gilmer et al. ICML 2017)

 Link Analysis

 PageRank

 Diffusion

 Propagation using graph diffusion

 Predict Then Propagate [ICLR’19]

 Graph Diffusion-Embedding Networks [CVPR’19]

 Making a new graph

 Diffusion Improves Graph Learning [NIPS’19]

 Graph Learning-Convolutional Nets. [CVPR’19]

(7)

GCN: Types of Operations for Graph Filtering

Spatial filtering:

 Original GNN (Scarselli et al. 2005)
 Simplified ChebNet (Kipf & Welling, ICLR 2017)
 GAT (Veličković et al. ICLR 2018)
 GraphSage (Hamilton et al. NIPS 2017)
 MPNN (Gilmer et al. ICML 2017)

Spectral filtering:

 Spectral Graph CNN (Bruna et al. ICLR 2014)
 ChebNet (Defferrard et al. NIPS 2016)

(8)

GCN: Graph Filtering

How to design the spectral filter for GCN?

Data-driven! Learn $\hat{g}(\Lambda)$ from data!

How to deal with multi-channel signals?

Each input channel contributes to each output channel:

$$\boldsymbol{F}_O[:, n] = \sum_{m=1}^{d_1} \hat{g}_{nm}(\boldsymbol{L})\,\boldsymbol{F}_I[:, m], \quad n = 1, \dots, d_2, \qquad \boldsymbol{F}_I \in \mathbb{R}^{N \times d_1} \to \boldsymbol{F}_O \in \mathbb{R}^{N \times d_2}$$

$\hat{g}_{nm}$ is the filter from input channel $m$ to output channel $n$: we learn $d_2 \times d_1$ filters.
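A hedged NumPy sketch of the multi-channel rule: each output channel sums $d_1$ filtered input channels, so $d_2 \times d_1$ filter responses are needed. All filter values below are random stand-ins for learned filters; the graph is illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
N, d1, d2 = 4, 2, 3                        # nodes, input channels, output channels

A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(1)) - A
lam, U = np.linalg.eigh(L)                 # graph Fourier basis

F_I = rng.standard_normal((N, d1))         # multi-channel input signal
g_hat = rng.standard_normal((d2, d1, N))   # one spectral response per (n, m) pair

# F_O[:, n] = sum_m U diag(g_hat[n, m]) U^T F_I[:, m]
F_O = np.zeros((N, d2))
for n in range(d2):
    for m in range(d1):
        F_O[:, n] += U @ (g_hat[n, m] * (U.T @ F_I[:, m]))

print(F_O.shape)   # (4, 3)
```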

(9)

GCN: Graph Filtering

$\hat{g}_{nm}(\Lambda)$: non-parametric → not learnable; $n = 1, \dots, d_2$, $m = 1, \dots, d_1$

$$\hat{g}_{nm}(\Lambda) = \begin{pmatrix} \hat{g}_{nm}(\lambda_1) & & \\ & \ddots & \\ & & \hat{g}_{nm}(\lambda_N) \end{pmatrix} \;\Longrightarrow\; \hat{g}_{nm}(\boldsymbol{L}) = \boldsymbol{U}\,\hat{g}_{nm}(\Lambda)\,\boldsymbol{U}^T$$

(10)

GCN: Graph Filtering

$\hat{g}_{nm}(\Lambda)$: learnable parameterization; $n = 1, \dots, d_2$, $m = 1, \dots, d_1$

$$\hat{g}_{nm}(\Lambda) = \begin{pmatrix} \theta_1^{(nm)} & & \\ & \ddots & \\ & & \theta_N^{(nm)} \end{pmatrix} \;\Longrightarrow\; \hat{g}_{nm}(\boldsymbol{L}) = \boldsymbol{U}\,\hat{g}_{nm}(\Lambda)\,\boldsymbol{U}^T$$

Spectral GCN: Spectral Networks and Deep Locally Connected Networks on Graphs (Bruna et al. ICLR 2014)

$\Longrightarrow d_1 \times d_2 \times N$ parameters

(11)

GCN: Graph Filtering

$\hat{g}_{nm}(\Lambda)$: polynomial parameterization; $n = 1, \dots, d_2$, $m = 1, \dots, d_1$

$$\hat{g}_{nm}(\Lambda) = \begin{pmatrix} \sum_{k=0}^{K} \theta_k^{(nm)} \lambda_1^k & & \\ & \ddots & \\ & & \sum_{k=0}^{K} \theta_k^{(nm)} \lambda_N^k \end{pmatrix} \;\Longrightarrow\; \hat{g}_{nm}(\boldsymbol{L}) = \boldsymbol{U}\,\hat{g}_{nm}(\Lambda)\,\boldsymbol{U}^T$$

(12)

GCN: Graph Filtering

$\hat{g}_{nm}(\Lambda)$: polynomial parameterization; $n = 1, \dots, d_2$, $m = 1, \dots, d_1$

$$\hat{g}_{nm}(\Lambda) = \begin{pmatrix} \sum_{k=0}^{K-1} \theta_k^{(nm)} \lambda_1^k & & \\ & \ddots & \\ & & \sum_{k=0}^{K-1} \theta_k^{(nm)} \lambda_N^k \end{pmatrix} = \sum_{k=0}^{K-1} \theta_k^{(nm)} \Lambda^k$$

$$\Longrightarrow\; \hat{g}_{nm}(\boldsymbol{L}) = \boldsymbol{U}\,\hat{g}_{nm}(\Lambda)\,\boldsymbol{U}^T = \sum_{k=0}^{K-1} \theta_k^{(nm)} \boldsymbol{L}^k$$

$$\boldsymbol{F}_O[:, n] = \sum_{m=1}^{d_1} \boldsymbol{U}\,\hat{g}_{nm}(\Lambda)\,\boldsymbol{U}^T\boldsymbol{F}_I[:, m] = \sum_{m=1}^{d_1} \sum_{k=0}^{K-1} \theta_k^{(nm)}\,\boldsymbol{L}^k\,\boldsymbol{F}_I[:, m]$$

$\Longrightarrow d_1 \times d_2 \times K$ parameters

(13)

GCN: Graph Filtering

$\hat{g}_{nm}(\Lambda)$: polynomial parameterization; $n = 1, \dots, d_2$, $m = 1, \dots, d_1$

$$\boldsymbol{F}_O[:, n] = \sum_{m=1}^{d_1} \boldsymbol{U}\,\hat{g}_{nm}(\Lambda)\,\boldsymbol{U}^T\boldsymbol{F}_I[:, m] = \sum_{m=1}^{d_1} \sum_{k=0}^{K-1} \theta_k^{(nm)}\,\boldsymbol{L}^k\,\boldsymbol{F}_I[:, m]$$

If node $v_j$ is more than $K-1$ hops away from node $v_i$, then $\Big(\sum_{k=0}^{K-1} \theta_k^{(nm)} \boldsymbol{L}^k\Big)_{ij} = 0$.

Example: $\boldsymbol{h} = \boldsymbol{L}\boldsymbol{f}$ gives $h_i = \sum_{v_j \in \mathcal{N}(v_i)} (f_i - f_j)$, a 1-hop operation; $\boldsymbol{g} = \boldsymbol{L}\boldsymbol{h} = \boldsymbol{L}^2\boldsymbol{f}$ gives $g_k = \sum_{v_j \in \mathcal{N}(\mathcal{N}(v_k))} (f_k - f_j)$, a 2-hop operation.

(14)

GCN: Graph Filtering

$\hat{g}_{nm}(\Lambda)$: polynomial parameterization; $n = 1, \dots, d_2$, $m = 1, \dots, d_1$

$$\boldsymbol{F}_O[:, n] = \sum_{m=1}^{d_1} \boldsymbol{U}\,\hat{g}_{nm}(\Lambda)\,\boldsymbol{U}^T\boldsymbol{F}_I[:, m] = \sum_{m=1}^{d_1} \sum_{k=0}^{K-1} \theta_k^{(nm)}\,\boldsymbol{L}^k\,\boldsymbol{F}_I[:, m]$$

If node $v_j$ is more than $K-1$ hops away from node $v_i$, then

$$\Big(\sum_{k=0}^{K-1} \theta_k^{(nm)} \boldsymbol{L}^k\Big)_{ij} = 0$$

$\Longrightarrow$ the filter is localized within $K-1$ hops of each node in the spatial domain.
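The localization claim can be checked numerically on a small path graph (an illustrative example, not from the lecture): the $k$-th power of the Laplacian only reaches nodes up to $k$ hops away.

```python
import numpy as np

# Path graph 0-1-2-3-4: (L^k)_{ij} = 0 whenever i and j are more than k hops apart.
A = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
L = np.diag(A.sum(axis=1)) - A

L2 = np.linalg.matrix_power(L, 2)
# Nodes 0 and 3 are 3 hops apart, so L^2 cannot connect them:
print(L2[0, 3])   # 0.0
# Nodes 0 and 2 are 2 hops apart, so L^2 can:
print(L2[0, 2])   # 1.0 (reachable in 2 hops)
```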

(15)

GCN: Chebyshev Polynomials

A high-order polynomial has the non-orthogonal basis $1, x, x^2, \dots$: $\; g(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \cdots$

This makes the filter unstable under perturbation of its coefficients.

(16)

GCN: Chebyshev Polynomials

A high-order polynomial has the non-orthogonal basis $1, x, x^2, \dots$: $\; g(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \cdots$

Chebyshev polynomials give a recursive formulation for fast filtering.

Recursive definition:

 $T_0(x) = 1$; $T_1(x) = x$

 $T_k(x) = 2x\,T_{k-1}(x) - T_{k-2}(x)$

The Chebyshev polynomials $T_k$ form an orthogonal basis for the Hilbert space $L^2\big([-1,1], \frac{dx}{\sqrt{1-x^2}}\big)$.
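The three-term recurrence can be verified against the closed form $T_k(x) = \cos(k \arccos x)$, which holds on $[-1, 1]$. A quick numerical check, not part of the lecture:

```python
import numpy as np

# Generate T_0 .. T_4 by the recursion and compare to cos(k * arccos(x)).
x = np.linspace(-1, 1, 101)
T = [np.ones_like(x), x.copy()]          # T_0 = 1, T_1 = x
for k in range(2, 5):
    T.append(2 * x * T[-1] - T[-2])      # T_k = 2x T_{k-1} - T_{k-2}

for k in range(5):
    assert np.allclose(T[k], np.cos(k * np.arccos(x)), atol=1e-10)
```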

(17)

GCN: Chebyshev Polynomials

Chebyshev polynomials:

Recursive definition:

 𝑇0 𝑥 = 1; 𝑇1 𝑥 = 𝑥

 𝑇𝑘 𝑥 = 2𝑥𝑇𝑘−1 𝑥 − 𝑇𝑘−2 𝑥

The Chebyshev polynomials $T_k$ form an orthogonal basis for the Hilbert space $L^2\big([-1,1], \frac{dx}{\sqrt{1-x^2}}\big)$.

The Chebyshev filter therefore has the orthogonal basis $T_0(x), T_1(x), T_2(x), \dots$

(18)

GCN: ChebNet

Parameterize $\hat{g}_{nm}(\boldsymbol{\Lambda})$ with Chebyshev polynomials:

$$\hat{g}_{nm}(\boldsymbol{\Lambda}) = \sum_{k=0}^{K} \theta_k^{(mn)}\,T_k(\tilde{\boldsymbol{\Lambda}}), \quad \text{with } \tilde{\boldsymbol{\Lambda}} = \frac{2\boldsymbol{\Lambda}}{\lambda_{max}} - \mathbf{I}, \text{ eigenvalues rescaled into } [-1, 1]$$

$d_2 \times d_1 \times K$ parameters

ChebNet: Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering (Defferrard et al. NIPS 2016)

(19)

GCN: ChebNet

Parameterize $\hat{g}_{nm}(\boldsymbol{\Lambda})$ with Chebyshev polynomials:

$$\hat{g}_{nm}(\boldsymbol{\Lambda}) = \sum_{k=0}^{K} \theta_k^{(mn)}\,T_k(\tilde{\boldsymbol{\Lambda}}), \quad \text{with } \tilde{\boldsymbol{\Lambda}} = \frac{2\boldsymbol{\Lambda}}{\lambda_{max}} - \mathbf{I}, \text{ eigenvalues rescaled into } [-1, 1]$$

Single channel:

$$\boldsymbol{f}_O = \boldsymbol{U}\,\hat{g}(\boldsymbol{\Lambda})\,\boldsymbol{U}^T\boldsymbol{f}_I = \sum_{k=0}^{K} \theta_k\,T_k(\tilde{\boldsymbol{L}})\,\boldsymbol{f}_I, \quad \text{with } \tilde{\boldsymbol{L}} = \frac{2\boldsymbol{L}}{\lambda_{max}} - \mathbf{I}$$

(20)

GCN: ChebNet

Single channel:

$$\boldsymbol{f}_O = \boldsymbol{U}\,\hat{g}(\boldsymbol{\Lambda})\,\boldsymbol{U}^T\boldsymbol{f}_I = \sum_{k=0}^{K} \theta_k\,T_k(\tilde{\boldsymbol{L}})\,\boldsymbol{f}_I, \quad \text{with } \tilde{\boldsymbol{L}} = \frac{2\boldsymbol{L}}{\lambda_{max}} - \mathbf{I}$$

Multiple channel ($d_2 \times d_1 \times K$ parameters):

$$\boldsymbol{F}_O[:, n] = \sum_{m=1}^{d_1} \sum_{k=0}^{K} \theta_k^{(mn)}\,T_k(\tilde{\boldsymbol{L}})\,\boldsymbol{F}_I[:, m]$$
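ChebNet's fast filtering applies the recurrence directly to vectors $T_k(\tilde{\boldsymbol{L}})\boldsymbol{f}$, so no eigendecomposition of $\boldsymbol{L}$ is needed at filtering time. A minimal single-channel sketch; the graph and the $\theta$ values are illustrative, not learned:

```python
import numpy as np

# Small undirected graph and its combinatorial Laplacian.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(1)) - A
lam_max = np.linalg.eigvalsh(L).max()
L_t = 2 * L / lam_max - np.eye(4)        # rescale spectrum into [-1, 1]

theta = np.array([0.5, -0.3, 0.1])       # K = 2 (three Chebyshev coefficients)
f = np.array([1.0, 2.0, 0.0, -1.0])      # input graph signal

# Three-term recurrence on vectors: T_0 f = f, T_1 f = L~ f,
# T_k f = 2 L~ (T_{k-1} f) - T_{k-2} f.
t_prev, t_curr = f, L_t @ f
out = theta[0] * t_prev + theta[1] * t_curr
for k in range(2, len(theta)):
    t_prev, t_curr = t_curr, 2 * L_t @ t_curr - t_prev
    out += theta[k] * t_curr
```

Each step costs one sparse matrix-vector product, which is what makes the filtering fast and $K$-localized.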

(21)

GCN: Simplified ChebNet

Chebyshev parameterization of $\hat{g}_{nm}(\boldsymbol{\Lambda})$ with $K = 1$ and $\lambda_{max} = 2$ (approximation, valid for $\boldsymbol{L}_{sym}$):

$$\hat{g}_{nm}(\boldsymbol{\Lambda}) = \theta_0^{(mn)} + \theta_1^{(mn)}(\boldsymbol{\Lambda} - \mathbf{I})$$

Further constrain $\theta = \theta_0 = -\theta_1$:

$$\hat{g}_{nm}(\boldsymbol{\Lambda}) = \theta^{(mn)}(2\mathbf{I} - \boldsymbol{\Lambda})$$

Single channel ($n = 1$, $m = 1$), for $\boldsymbol{L} = \boldsymbol{L}_{sym}$:

$$\boldsymbol{f}_O = \boldsymbol{U}\,\hat{g}(\boldsymbol{\Lambda})\,\boldsymbol{U}^T\boldsymbol{f}_I = \theta(2\mathbf{I} - \boldsymbol{L})\boldsymbol{f}_I = \theta\big(\mathbf{I} + \boldsymbol{D}^{-1/2}\boldsymbol{A}\boldsymbol{D}^{-1/2}\big)\boldsymbol{f}_I$$

(22)

GCN: Simplified ChebNet

Single channel: $n = 1$, $m = 1$

$$\boldsymbol{f}_O = \boldsymbol{U}\,\hat{g}(\boldsymbol{\Lambda})\,\boldsymbol{U}^T\boldsymbol{f}_I = \theta(2\mathbf{I} - \boldsymbol{L})\boldsymbol{f}_I = \theta\big(\mathbf{I} + \boldsymbol{D}^{-1/2}\boldsymbol{A}\boldsymbol{D}^{-1/2}\big)\boldsymbol{f}_I$$

Since $\lambda\big(\mathbf{I} + \boldsymbol{D}^{-1/2}\boldsymbol{A}\boldsymbol{D}^{-1/2}\big) \in [0, 2]$, repeated application leads to numerical instabilities.

Renormalization trick:

$$\mathbf{I} + \boldsymbol{D}^{-1/2}\boldsymbol{A}\boldsymbol{D}^{-1/2} \;\to\; \tilde{\boldsymbol{D}}^{-1/2}\tilde{\boldsymbol{A}}\tilde{\boldsymbol{D}}^{-1/2}, \quad \text{with } \tilde{\boldsymbol{A}} = \boldsymbol{A} + \mathbf{I} \text{ and } \tilde{D}_{ii} = \sum_j \tilde{A}_{ij}.$$

This yields $\boldsymbol{f}_O = \theta\big(\tilde{\boldsymbol{D}}^{-1/2}\tilde{\boldsymbol{A}}\tilde{\boldsymbol{D}}^{-1/2}\big)\boldsymbol{f}_I$.
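The effect of the renormalization trick on the spectrum can be seen on a small example (a triangle graph here; the code is an illustrative sketch, not from the lecture):

```python
import numpy as np

# Triangle graph: every node has degree 2.
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)
d = A.sum(1)
# Unnormalized operator I + D^{-1/2} A D^{-1/2}: eigenvalues in [0, 2].
S = np.eye(3) + np.diag(d**-0.5) @ A @ np.diag(d**-0.5)

# Renormalized operator with self-loops: A~ = A + I, D~_ii = sum_j A~_ij.
A_t = A + np.eye(3)
d_t = A_t.sum(1)
S_t = np.diag(d_t**-0.5) @ A_t @ np.diag(d_t**-0.5)

print(np.linalg.eigvalsh(S).max())    # ~2: repeated application can blow up
print(np.linalg.eigvalsh(S_t).max())  # ~1: renormalized operator stays bounded
```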

(23)

GCN: Simplified ChebNet

Single channel: $m = 1$, $n = 1$

$$\boldsymbol{f}_O = \theta\,\tilde{\boldsymbol{D}}^{-1/2}\tilde{\boldsymbol{A}}\tilde{\boldsymbol{D}}^{-1/2}\,\boldsymbol{f}_I, \quad \text{with } \tilde{\boldsymbol{A}} = \boldsymbol{A} + \mathbf{I} \text{ and } \tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$$

Multiple channel: $m = 1, \dots, d_1$; $n = 1, \dots, d_2$

$$\boldsymbol{F}_O[:, n] = \sum_{m=1}^{d_1} \theta^{(mn)}\,\tilde{\boldsymbol{D}}^{-1/2}\tilde{\boldsymbol{A}}\tilde{\boldsymbol{D}}^{-1/2}\,\boldsymbol{F}_I[:, m]$$

(24)

GCN: Simplified ChebNet

Multiple channel: $m = 1, \dots, d_1$; $n = 1, \dots, d_2$

$$\boldsymbol{F}_O[:, n] = \sum_{m=1}^{d_1} \theta^{(mn)}\,\tilde{\boldsymbol{D}}^{-1/2}\tilde{\boldsymbol{A}}\tilde{\boldsymbol{D}}^{-1/2}\,\boldsymbol{F}_I[:, m] = \tilde{\boldsymbol{D}}^{-1/2}\tilde{\boldsymbol{A}}\tilde{\boldsymbol{D}}^{-1/2} \sum_{m=1}^{d_1} \theta^{(mn)}\,\boldsymbol{F}_I[:, m]$$

$$\boldsymbol{F}_O[:, n] = \tilde{\boldsymbol{D}}^{-1/2}\tilde{\boldsymbol{A}}\tilde{\boldsymbol{D}}^{-1/2}\,\big[\boldsymbol{F}_I[:, 1] \cdots \boldsymbol{F}_I[:, d_1]\big] \begin{pmatrix} \theta^{(1n)} \\ \vdots \\ \theta^{(d_1 n)} \end{pmatrix}$$

$$\big[\boldsymbol{F}_O[:, 1] \cdots \boldsymbol{F}_O[:, d_2]\big] = \tilde{\boldsymbol{D}}^{-1/2}\tilde{\boldsymbol{A}}\tilde{\boldsymbol{D}}^{-1/2}\,\big[\boldsymbol{F}_I[:, 1] \cdots \boldsymbol{F}_I[:, d_1]\big] \begin{pmatrix} \theta^{(11)} & \cdots & \theta^{(1 d_2)} \\ \vdots & & \vdots \\ \theta^{(d_1 1)} & \cdots & \theta^{(d_1 d_2)} \end{pmatrix}$$

$$\boldsymbol{F}_O = \tilde{\boldsymbol{D}}^{-1/2}\tilde{\boldsymbol{A}}\tilde{\boldsymbol{D}}^{-1/2}\,\boldsymbol{F}_I\boldsymbol{\Theta}, \quad \text{with } \boldsymbol{\Theta} \in \mathbb{R}^{d_1 \times d_2} \text{ and } \boldsymbol{\Theta}[m, n] = \theta^{(mn)}$$
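The matrix form is one line of NumPy. A shape-level sketch of the layer (before any nonlinearity) with a random graph and random weights; all names and values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d1, d2 = 5, 4, 3                      # nodes, input channels, output channels

# Random symmetric adjacency without self-loops.
A = (rng.random((N, N)) < 0.4).astype(float)
A = np.triu(A, 1)
A = A + A.T

# Renormalization trick: A~ = A + I, D~_ii = sum_j A~_ij.
A_t = A + np.eye(N)
D_inv_sqrt = np.diag(A_t.sum(1) ** -0.5)
A_hat = D_inv_sqrt @ A_t @ D_inv_sqrt    # D~^{-1/2} A~ D~^{-1/2}

F_I = rng.standard_normal((N, d1))       # input signal, N x d1
Theta = rng.standard_normal((d1, d2))    # learnable weights, d1 x d2
F_O = A_hat @ F_I @ Theta                # F_O = A_hat F_I Theta, N x d2

print(F_O.shape)   # (5, 3)
```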

(25)

GCN: Simplified ChebNet

Multiple channel: $m = 1, \dots, d_1$; $n = 1, \dots, d_2$

$$\boldsymbol{F}_O[:, n] = \sum_{m=1}^{d_1} \theta^{(mn)}\,\tilde{\boldsymbol{D}}^{-1/2}\tilde{\boldsymbol{A}}\tilde{\boldsymbol{D}}^{-1/2}\,\boldsymbol{F}_I[:, m]$$

Matrix form:

$$\boldsymbol{F}_O = \tilde{\boldsymbol{D}}^{-1/2}\tilde{\boldsymbol{A}}\tilde{\boldsymbol{D}}^{-1/2}\,\boldsymbol{F}_I\boldsymbol{\Theta}, \quad \text{with } \boldsymbol{\Theta} \in \mathbb{R}^{d_1 \times d_2} \text{ and } \boldsymbol{\Theta}[m, n] = \theta^{(mn)}$$

Why spectral filtering?

(26)

GCN: Simplified ChebNet

Matrix form:

$$\boldsymbol{F}_O = \tilde{\boldsymbol{D}}^{-1/2}\tilde{\boldsymbol{A}}\tilde{\boldsymbol{D}}^{-1/2}\,\boldsymbol{F}_I\boldsymbol{\Theta}, \quad \text{with } \boldsymbol{\Theta} \in \mathbb{R}^{d_1 \times d_2} \text{ and } \boldsymbol{\Theta}[m, n] = \theta^{(mn)}$$

The spectral-filtering pipeline behind this formula:

$$\boldsymbol{f} \;\xrightarrow[\text{decompose}]{\text{GFT}}\; \hat{\boldsymbol{f}} = \boldsymbol{U}^T\boldsymbol{f} \;\xrightarrow[\text{filter}]{\hat{g}(\Lambda)}\; \hat{g}(\Lambda)\,\boldsymbol{U}^T\boldsymbol{f} \;\xrightarrow[\text{reconstruct}]{\text{IGFT}}\; \boldsymbol{U}\,\hat{g}(\Lambda)\,\boldsymbol{U}^T\boldsymbol{f} = \hat{g}(\boldsymbol{L})\,\boldsymbol{f}$$

(27)

Summary Questions of the lecture

 Explain the key idea of Spectral GCN: Spectral Networks and Deep Locally Connected Networks on Graphs (Bruna et al. ICLR 2014)

 Explain the key idea of ChebNet: Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering (Defferrard et al. NIPS 2016)

 Explain the key idea of Simplified ChebNet: Semi-Supervised Classification with Graph Convolutional Networks (Kipf & Welling, ICLR 2017)
