Summary Questions of the lecture

Explain the spectral filtering of a graph signal.
Decompose the graph signal 𝒇 with the graph Fourier transform, multiply by the filter ĝ to filter the transformed signal in the spectral domain, and then reconstruct the filtered signal in the spatial/temporal domain with the inverse graph Fourier transform.
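The three steps above can be sketched numerically; this is a minimal illustration on a toy 4-node path graph with an illustrative low-pass filter ĝ(λ) = exp(−λ), not the lecture's learned filters.

```python
import numpy as np

# 4-node path graph: adjacency A and unnormalized Laplacian L = D - A
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

# Graph Fourier basis: eigenvectors U, eigenvalues lam of L
lam, U = np.linalg.eigh(L)

f = np.array([1.0, -1.0, 1.0, -1.0])   # a "high-frequency" graph signal

f_hat = U.T @ f                        # 1) GFT: decompose into spectral coefficients
g_hat = np.exp(-lam)                   # 2) apply filter g(lambda) on the spectrum
f_out = U @ (g_hat * f_hat)            # 3) inverse GFT: reconstruct in the node domain

print(f_out)                           # smoothed version of f
```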
How to design the spectral filter for GCN?
The spectral filters are learned in a data-driven way: the filters are parametrized so that they can be trained end to end.
What is the channel in a graph signal?
Letting N and d₁ denote the number of nodes and the feature dimension of each node respectively, the graph signal is represented by an N × d₁ matrix. Each column of this matrix is one 'channel'.
How to deal with multi-channel signals in GCN?
To create the n-th output channel, we transform each input channel with a different filter and add up the resulting signals. Thus, we need to learn d₁ × d₂ filters.
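As a sketch of this channel mixing, the snippet below applies d₁ × d₂ = 6 distinct filters on a toy graph; each filter is an illustrative first-order polynomial ĝ_nm(L) = a + b·L with random coefficients standing in for learned ones.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d1, d2 = 4, 2, 3

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

F_in = rng.standard_normal((N, d1))      # graph signal: one column per channel
coef = rng.standard_normal((d2, d1, 2))  # (a, b) for each of the d2*d1 filters

F_out = np.zeros((N, d2))
for n in range(d2):                      # each output channel sums over all
    for m in range(d1):                  # input channels, one filter per pair
        a, b = coef[n, m]
        g_L = a * np.eye(N) + b * L      # g_nm(L) = a*I + b*L (illustrative)
        F_out[:, n] += g_L @ F_in[:, m]

print(F_out.shape)  # (4, 3): d1 x d2 filters were applied in total
```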
Geometric deep learning via graph convolution … (continue)

GCN (𝒢):
h_i^(l+1) = Σ_{v_j ∈ N(v_i)} f(x_i, w_ij, h_j^(l), x_j)

[Figure: a small graph with nodes v₁, …, v₅, each carrying features x_i and hidden state h_i^l; the update of node v_i aggregates over its neighbors N(v_i).]
Outline of Lecture (3)

Graph Convolution Networks (GCN)
- What are the issues in GCN?
- Graph Filtering in GCN
- Graph Pooling in GCN
- Original GNN (Scarselli et al. 2005)
- Spectral GCN
  - Spectral Filtering
  - Graph Spectral Filtering in GCN
  - Spectral Graph CNN (Bruna et al. ICLR 2014)
  - ChebNet (Defferrard et al. NIPS 2016)
  - Simplified ChebNet (Kipf & Welling, ICLR 2017)
- Spatial GCN
  - GraphSage (Hamilton et al. NIPS 2017)
  - GCN (Kipf & Welling, ICLR 2017)
  - GAT (Veličković et al. ICLR 2018)
  - MPNN (Gilmer et al. ICML 2017)
- Link Analysis
  - PageRank
- Diffusion
  - Propagation using graph diffusion
    - Predict Then Propagate [ICLR'19]
    - Graph Diffusion-Embedding Networks [CVPR'19]
  - Making a new graph
    - Diffusion Improves Graph Learning [NIPS'19]
    - Graph Learning-Convolutional Nets. [CVPR'19]
GCN: Types of Operations for Graph Filtering

Spatial filtering:
- Original GNN (Scarselli et al. 2005)
- GraphSage (Hamilton et al. NIPS 2017)
- GAT (Veličković et al. ICLR 2018)
- MPNN (Gilmer et al. ICML 2017)

Spectral filtering:
- Spectral Graph CNN (Bruna et al. ICLR 2014)
- ChebNet (Defferrard et al. NIPS 2016)
- Simplified ChebNet (Kipf & Welling, ICLR 2017)
- …
GCN: Graph Filtering

How to design the spectral filter for GCN?
Data-driven! Learn ĝ(Λ) from data!

How to deal with multi-channel signals?
Each input channel contributes to each output channel:
F_O[:, n] = Σ_{m=1}^{d₁} ĝ_nm(L) F_I[:, m],  n = 1, …, d₂;  F_I ∈ ℝ^{N×d₁} → F_O ∈ ℝ^{N×d₂}
One filter per (input, output) channel pair: learn d₂ × d₁ filters.
GCN: Graph Filtering

ĝ_nm(Λ): non-parametric → not learnable;  n = 1, …, d₂,  m = 1, …, d₁

ĝ_nm(Λ) = diag(ĝ_nm(λ₁), …, ĝ_nm(λ_N))  ⟹  ĝ_nm(L) = U ĝ_nm(Λ) Uᵀ
GCN: Graph Filtering

ĝ_nm(Λ): learnable parameterization;  n = 1, …, d₂,  m = 1, …, d₁

ĝ_nm(Λ) = diag(θ₁^(nm), …, θ_N^(nm))  ⟹  ĝ_nm(L) = U ĝ_nm(Λ) Uᵀ
⟹ d₁ × d₂ × N parameters

Spectral GCN: Spectral Networks and Deep Locally Connected Networks on Graphs (Bruna et al. ICLR 2014)
GCN: Graph Filtering

ĝ_nm(Λ): polynomial parameterization;  n = 1, …, d₂,  m = 1, …, d₁

ĝ_nm(Λ) = Σ_{k=0}^{K−1} θ_k^(nm) Λᵏ = diag(Σ_{k=0}^{K−1} θ_k^(nm) λ₁ᵏ, …, Σ_{k=0}^{K−1} θ_k^(nm) λ_Nᵏ)
⟹ ĝ_nm(L) = U ĝ_nm(Λ) Uᵀ
⟹ d₁ × d₂ × K parameters

F_O[:, n] = Σ_{m=1}^{d₁} U ĝ_nm(Λ) Uᵀ F_I[:, m] = Σ_{m=1}^{d₁} Σ_{k=0}^{K−1} θ_k^(nm) Lᵏ F_I[:, m]
(using U Λᵏ Uᵀ = Lᵏ, so no eigendecomposition is needed at filtering time)
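The identity U (Σ_k θ_k Λᵏ) Uᵀ = Σ_k θ_k Lᵏ can be checked numerically; the snippet below uses a toy graph and arbitrary coefficients to confirm that the spectral and spatial forms of a polynomial filter coincide.

```python
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
lam, U = np.linalg.eigh(L)

theta = [0.5, -0.3, 0.1]                 # theta_0 .. theta_{K-1}, K = 3

# Spectral form: filter the eigenvalues, then rotate back with U
g_lam = sum(t * lam**k for k, t in enumerate(theta))
spectral = U @ np.diag(g_lam) @ U.T

# Spatial form: the same polynomial applied directly to L
spatial = sum(t * np.linalg.matrix_power(L, k) for k, t in enumerate(theta))

print(np.allclose(spectral, spatial))    # True
```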
GCN: Graph Filtering

ĝ_nm(Λ): polynomial parameterization;  n = 1, …, d₂,  m = 1, …, d₁
F_O[:, n] = Σ_{m=1}^{d₁} Σ_{k=0}^{K−1} θ_k^(nm) Lᵏ F_I[:, m]

Example: applying L once mixes 1-hop neighborhoods; applying it twice mixes 2-hop neighborhoods:
h = L 𝒇,  h_i = Σ_{v_j ∈ N(v_i)} (f_i − f_j)
g = L h = L L 𝒇 = L² 𝒇,  g_k = Σ_{v_j ∈ N(v_k)} (h_k − h_j), which involves f-values up to 2 hops from v_k
GCN: Graph Filtering

ĝ_nm(Λ): polynomial parameterization;  n = 1, …, d₂,  m = 1, …, d₁
F_O[:, n] = Σ_{m=1}^{d₁} Σ_{k=0}^{K−1} θ_k^(nm) Lᵏ F_I[:, m]

If node v_j is more than K−1 hops away from node v_i, then
(Σ_{k=0}^{K−1} θ_k^(nm) Lᵏ)_ij = 0
⟹ The filter is localized within the (K−1)-hop neighborhood in the spatial domain.
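This localization can be seen directly: on a toy 5-node path graph, (Lᵏ)[i, j] vanishes whenever nodes i and j are more than k hops apart, so a polynomial of degree K−1 mixes only (K−1)-hop neighborhoods.

```python
import numpy as np

N = 5
A = np.zeros((N, N))
for i in range(N - 1):                 # path graph: 0-1-2-3-4
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

L2 = np.linalg.matrix_power(L, 2)      # degree-2 term of a polynomial filter
print(L2[0, 3], L2[0, 4])              # 0.0 0.0 -> nodes 3, 4 are >2 hops from node 0
print(L2[0, 2])                        # nonzero: node 2 is exactly 2 hops away
```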
GCN: Chebyshev Polynomials

A high-order polynomial has the non-orthogonal basis 1, x, x², …:
g(x) = θ₀ + θ₁x + θ₂x² + ⋯
→ unstable under perturbation of the coefficients.

Chebyshev polynomials give a recursive formulation for fast filtering.
Recursive definition:
T₀(x) = 1;  T₁(x) = x
T_k(x) = 2x T_{k−1}(x) − T_{k−2}(x)

The Chebyshev polynomials T_k form an orthogonal basis for the Hilbert space L²([−1, 1], dx/√(1−x²)).
⟹ The Chebyshev filter has the orthogonal basis T₀(x), T₁(x), T₂(x), …
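The recurrence is what makes filtering fast: Σ_k θ_k T_k(L̃) 𝒇 can be built from repeated matrix-vector products, with no eigendecomposition. A minimal sketch on a toy 3-node path graph with arbitrary coefficients:

```python
import numpy as np

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

lam_max = np.linalg.eigvalsh(L)[-1]
L_t = 2.0 * L / lam_max - np.eye(3)        # rescale spectrum into [-1, 1]

f = np.array([1.0, 0.0, -1.0])
theta = [0.4, 0.3, 0.2]                    # theta_0, theta_1, theta_2

T_prev, T_curr = f, L_t @ f                # T_0(L_t) f = f,  T_1(L_t) f = L_t f
out = theta[0] * T_prev + theta[1] * T_curr
for k in range(2, len(theta)):
    T_next = 2.0 * L_t @ T_curr - T_prev   # recurrence applied to vectors
    out += theta[k] * T_next
    T_prev, T_curr = T_curr, T_next

print(out.shape)  # (3,)
```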
GCN: ChebNet

Parameterize ĝ_nm(Λ) with Chebyshev polynomials:
ĝ_nm(Λ̃) = Σ_{k=0}^{K} θ_k^(mn) T_k(Λ̃),  with Λ̃ = 2Λ/λ_max − I, so the rescaled eigenvalues lie in [−1, 1]
⟹ d₂ × d₁ × (K+1) parameters

Single channel:
𝒇_O = U ĝ(Λ̃) Uᵀ 𝒇_I = Σ_{k=0}^{K} θ_k T_k(L̃) 𝒇_I,  with L̃ = 2L/λ_max − I

Multiple channels:
F_O[:, n] = Σ_{m=1}^{d₁} Σ_{k=0}^{K} θ_k^(mn) T_k(L̃) F_I[:, m]

ChebNet: Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering (Defferrard et al. NIPS 2016)
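One ChebNet layer can be sketched as follows; random coefficients stand in for the learned ones, and a toy 4-cycle graph replaces real data (d₁ = 2 input channels, d₂ = 3 output channels, K = 2).

```python
import numpy as np

rng = np.random.default_rng(1)
N, d1, d2, K = 4, 2, 3, 2

A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
L_t = 2.0 * L / np.linalg.eigvalsh(L)[-1] - np.eye(N)

F_in = rng.standard_normal((N, d1))
theta = rng.standard_normal((K + 1, d1, d2))   # d1 x d2 x (K+1) parameters

# Precompute T_k(L_t) @ F_in for k = 0..K with the recurrence
T = [F_in, L_t @ F_in]
for k in range(2, K + 1):
    T.append(2.0 * L_t @ T[-1] - T[-2])

# Each T_k(L_t) F_in is N x d1; contract the input channels with theta_k
F_out = sum(T[k] @ theta[k] for k in range(K + 1))
print(F_out.shape)  # (4, 3)
```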
GCN: Simplified ChebNet

Chebyshev parameterization ĝ_nm(Λ) with K = 1 and λ_max ≈ 2 for L_sym:
ĝ_nm(Λ) = θ₀^(mn) + θ₁^(mn)(Λ − I)
Further constrain θ = θ₀ = −θ₁:
ĝ_nm(Λ) = θ^(mn)(2I − Λ)

Single channel (n = 1, m = 1), with L = L_sym:
𝒇_O = U ĝ(Λ) Uᵀ 𝒇_I = θ(2I − L) 𝒇_I = θ(I + D^(−1/2) A D^(−1/2)) 𝒇_I

Since λ(I + D^(−1/2) A D^(−1/2)) ∈ [0, 2], repeated application leads to instabilities.
Renormalization trick:
I + D^(−1/2) A D^(−1/2) → D̃^(−1/2) Ã D̃^(−1/2),  with Ã = A + I and D̃_ii = Σ_j Ã_ij. This yields
𝒇_O = θ (D̃^(−1/2) Ã D̃^(−1/2)) 𝒇_I
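The effect of the renormalization trick can be checked numerically: on a toy graph, the eigenvalues of D̃^(−1/2) Ã D̃^(−1/2) stay within [−1, 1], so stacked layers do not blow up.

```python
import numpy as np

A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)

A_t = A + np.eye(3)                           # add self-loops: A~ = A + I
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_t.sum(axis=1)))
P = D_inv_sqrt @ A_t @ D_inv_sqrt             # renormalized propagation matrix

ev = np.linalg.eigvalsh(P)
print(ev.min(), ev.max())                     # spectrum stays within [-1, 1]
```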
GCN: Simplified ChebNet

Single channel (m = 1, n = 1):
𝒇_O = θ (D̃^(−1/2) Ã D̃^(−1/2)) 𝒇_I,  with Ã = A + I and D̃_ii = Σ_j Ã_ij

Multiple channels (m = 1, …, d₁; n = 1, …, d₂):
F_O[:, n] = Σ_{m=1}^{d₁} θ^(mn) (D̃^(−1/2) Ã D̃^(−1/2)) F_I[:, m]
         = (D̃^(−1/2) Ã D̃^(−1/2)) Σ_{m=1}^{d₁} θ^(mn) F_I[:, m]
         = (D̃^(−1/2) Ã D̃^(−1/2)) [F_I[:, 1] ⋯ F_I[:, d₁]] [θ^(1n); ⋮; θ^(d₁n)]

Stacking all output channels:
[F_O[:, 1] ⋯ F_O[:, d₂]] = (D̃^(−1/2) Ã D̃^(−1/2)) [F_I[:, 1] ⋯ F_I[:, d₁]] [θ^(11) ⋯ θ^(1d₂); ⋮ ⋱ ⋮; θ^(d₁1) ⋯ θ^(d₁d₂)]

Matrix form:
F_O = (D̃^(−1/2) Ã D̃^(−1/2)) F_I 𝚯,  with 𝚯 ∈ ℝ^{d₁×d₂} and 𝚯[m, n] = θ^(mn)
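The matrix form of one simplified-ChebNet (GCN) layer is a sketch away; here a random 𝚯 stands in for the learned d₁ × d₂ weight matrix, on a toy graph and without a nonlinearity.

```python
import numpy as np

rng = np.random.default_rng(2)
N, d1, d2 = 4, 3, 2

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)

A_t = A + np.eye(N)                      # A~ = A + I
d_t = A_t.sum(axis=1)                    # D~ diagonal
P = A_t / np.sqrt(np.outer(d_t, d_t))    # D~^(-1/2) A~ D~^(-1/2), elementwise

F_in = rng.standard_normal((N, d1))
Theta = rng.standard_normal((d1, d2))

F_out = P @ F_in @ Theta                 # one linear GCN layer
print(F_out.shape)  # (4, 2)
```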
GCN: Simplified ChebNet

Matrix form:
F_O = (D̃^(−1/2) Ã D̃^(−1/2)) F_I 𝚯,  with 𝚯 ∈ ℝ^{d₁×d₂} and 𝚯[m, n] = θ^(mn)

Why spectral filtering? The whole filter ĝ(L) is the pipeline:
𝒇 → (GFT: decompose) → 𝒇̂ = Uᵀ𝒇 (coefficients) → (filter) → ĝ(Λ)Uᵀ𝒇 (filtered coefficients) → (IGFT: reconstruct) → U ĝ(Λ)Uᵀ𝒇
Summary Questions of the lecture

Explain the key idea of Spectral GCN: Spectral Networks and Deep Locally Connected Networks on Graphs (Bruna et al. ICLR 2014).
Explain the key idea of ChebNet: Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering (Defferrard et al. NIPS 2016).
Explain the key idea of Simplified ChebNet: Semi-Supervised Classification with Graph Convolutional Networks (Kipf & Welling, ICLR 2017).