Recurrent Neural Network
Wanho Choi
Handling Sequence Data
• For example: speech recognition
• FCNN, CNN, AE, and GAN?
• RNN!
• The current state ➜ the next state
[Figure: an RNN cell shown folded (one cell with input x_t, hidden state h_t, output y_t) and unfolded over a sequence of length n, with inputs x_1 … x_n, hidden states h_1 … h_{n−1}, and outputs y_1 … y_n]

h_t = f(h_{t−1}, x_t)
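A minimal sketch of this recurrence in plain Python (the step function f, initial state h0, and sequence xs are illustrative placeholders, not from the slides):

def unfold(f, h0, xs):
    # Apply the same step function at every time step:
    # h_t = f(h_{t-1}, x_t) -- the current state feeds the next state
    h = h0
    states = []
    for x in xs:
        h = f(h, x)
        states.append(h)
    return states

# e.g. with f = lambda h, x: h + x, unfold(f, 0, [1, 2, 3]) == [1, 3, 6]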
Vanilla RNN Structure
[Figure: the vanilla RNN, folded as a single recurrent cell (x_t → h_t → y_t) and unfolded across time steps 0 … n−1, where n is the sequence length]
Vanilla RNN Weights
[Figure: the same unfolded RNN annotated with its three shared weight matrices: W_xh (input → hidden), W_hh (hidden → hidden), and W_hy (hidden → output); the same weights are reused at every time step]
Vanilla RNN Computation

h_t = tanh(W_hh h_{t−1} + W_xh x_t)
y_t = W_hy h_t

[Figure: inside the cell, W_hh h_{t−1} and W_xh x_t are added (+) and passed through tanh to produce h_t; W_hy then maps h_t to y_t]
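A concrete NumPy sketch of this computation (the sizes, random weights, and input sequence are made-up examples, not from the slides; the output size equals the hidden size, as above):

import numpy as np

H, D = 4, 3                          # hidden size, input size (example values)
W_hh = np.random.randn(H, H) * 0.1   # hidden -> hidden
W_xh = np.random.randn(H, D) * 0.1   # input  -> hidden
W_hy = np.random.randn(H, H) * 0.1   # hidden -> output (output size = hidden size)

h = np.zeros(H)                               # initial hidden state
xs = [np.random.randn(D) for _ in range(5)]   # a length-5 input sequence

for x in xs:
    h = np.tanh(W_hh @ h + W_xh @ x)   # h_t = tanh(W_hh h_{t-1} + W_xh x_t)
    y = W_hy @ h                       # y_t = W_hy h_t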
How to use RNN in PyTorch
rnn = torch.nn.RNN(input_size, hidden_size)  # <creation>
outputs, hidden_state = rnn(input_data)      # <run>
input x_t shape = (batch_size, sequence_length, input_size)
output y_t shape = (batch_size, sequence_length, output_size) (output_size = hidden_size)
(these shapes assume batch_first=True; by default, sequence_length comes first)
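A quick shape check of the call above (the sizes here are arbitrary examples):

import torch

rnn = torch.nn.RNN(input_size=4, hidden_size=4, batch_first=True)
input_data = torch.randn(1, 5, 4)   # (batch_size, sequence_length, input_size)
outputs, hidden = rnn(input_data)
print(outputs.shape)                # torch.Size([1, 5, 4]) -> one output per time step
print(hidden.shape)                 # torch.Size([1, 1, 4]) -> final hidden state only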
Example: ‘hello’
import torch
import numpy as np

torch.manual_seed(0)  # to make results deterministic and reproducible

unique_characters = ['h', 'e', 'l', 'o']
INPUT_SIZE = len(unique_characters)   # = 4
HIDDEN_SIZE = len(unique_characters)  # = 4

x_data = [[0, 1, 2, 2]]  # hell (batch_size = 1)
x_one_hot = [[[1, 0, 0, 0],   # 'h'
              [0, 1, 0, 0],   # 'e'
              [0, 0, 1, 0],   # 'l'
              [0, 0, 1, 0]]]  # 'l'
y_data = [[1, 2, 2, 3]]  # ello

x = torch.FloatTensor(x_one_hot)
y = torch.LongTensor(y_data)
# batch_first guarantees the order = (batch_size, sequence_length, data_size)
model = torch.nn.RNN(INPUT_SIZE, HIDDEN_SIZE, batch_first=True)

CostFunc = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

for i in range(10):
    outputs, _ = model(x)
    cost = CostFunc(outputs.view(-1, INPUT_SIZE), y.view(-1))
    cost.backward()
    optimizer.step()
    optimizer.zero_grad()

result = outputs.data.numpy().argmax(axis=2)
result = ''.join([unique_characters[c] for c in np.squeeze(result)])
print(result)  # ello
Example: ‘hello’ (one-hot encoding)
‘h’ = [1,0,0,0]
‘e’ = [0,1,0,0]
‘l’ = [0,0,1,0]
‘o’ = [0,0,0,1]
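The same table can be built programmatically rather than by hand; a small sketch using torch.eye (this construction is an assumption, not from the original slides):

import torch

x_data = [[0, 1, 2, 2]]           # character indices for 'hell'
x_one_hot = torch.eye(4)[x_data]  # row i of the identity matrix = one-hot code of i
# x_one_hot has shape (1, 4, 4): one 4-dim one-hot vector per character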
RNN is flexible.
• one-to-one: vanilla RNN
• one-to-many: image captioning (image → sentence, a sequence of words)
• many-to-one: sentiment analysis (sentence → sentiment); see the sketch after this list
• many-to-many: language modeling, translation, and video captioning
• Stacked RNN: cells stacked along the depth axis as well as unrolled along the time axis
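For instance, a many-to-one classifier can read a whole sequence and predict from the final hidden state; a minimal sketch, where all sizes and names are illustrative assumptions:

import torch

class ManyToOne(torch.nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        # num_layers=2 stacks two RNN layers along the depth axis
        self.rnn = torch.nn.RNN(input_size, hidden_size,
                                num_layers=2, batch_first=True)
        self.fc = torch.nn.Linear(hidden_size, num_classes)

    def forward(self, x):              # x: (batch, seq_len, input_size)
        _, hidden = self.rnn(x)        # hidden: (num_layers, batch, hidden_size)
        return self.fc(hidden[-1])     # classify from the top layer's final state

logits = ManyToOne(4, 8, 2)(torch.randn(1, 5, 4))  # -> shape (1, 2)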
RNN Applications
• Temporal analysis: time-series anomaly detection and time-series prediction
• Computer vision: image description, video tagging, and video analysis
• NLP: sentiment analysis, speech recognition, language modeling, machine translation, and text generation
CNN vs RNN: Which Neural Network Is Right for You?
https://missinglink.ai/guides/neural-network-concepts/cnn-vs-rnn-neural-network-right/
RNN CNN Hybrids
Object Recognition using RCNN
Advanced RNN
• LSTM (Long Short-Term Memory)
• GRU (Gated Recurrent Unit)
https://www.youtube.com/watch?v=-SHPG_KMUkQ
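Both are drop-in replacements for torch.nn.RNN in the examples above; a minimal sketch (sizes are illustrative):

import torch

lstm = torch.nn.LSTM(input_size=4, hidden_size=4, batch_first=True)
gru  = torch.nn.GRU(input_size=4, hidden_size=4, batch_first=True)

x = torch.randn(1, 5, 4)
out_lstm, (h_n, c_n) = lstm(x)   # an LSTM also carries a cell state c_n
out_gru, h_n = gru(x)            # a GRU has the same interface as nn.RNN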