Reinforcement Learning(RL)

Academic year: 2022

(1)

Reinforcement Learning(RL)

김형욱

(2)

IVIS Lab, Changwon National University

Reinforcement Learning

(3)

Atari Breakout Game (2013, 2015)

(4)

Reinforcement Learning

(5)

Deep reinforcement learning

(6)

Games with RL

(7)

AlphaGo with RL

(8)

Google Data Center

(9)

Reinforcement Learning Applications

• Robotics: torques at joints

• Business operations

– Inventory management: how much inventory or spare parts to purchase

– Resource allocation: e.g. in a call center, whom to serve first

• Finance: investment decisions, portfolio design

• E-commerce/media

– What content to present to users (using click-through / visit time as reward)

– What ads to present to users (avoiding ad fatigue)

(10)

Example - OpenAI Gym Game

(11)

Frozen Lake World

(12)

Frozen Lake World (OpenAI Gym)

(13)

Frozen Lake World (OpenAI Gym)

(14)

Frozen Lake World (OpenAI Gym)

(15)

Frozen Lake World (OpenAI Gym)

(16)

Frozen Lake World (OpenAI Gym)

(17)

Basic installation steps

• OpenAI Gym

– sudo apt install cmake

– sudo apt-get install zlib1g-dev

– sudo -H pip install gym

– sudo -H pip install "gym[atari]"

(18)

Frozen Lake: Random?
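
The slide image itself was not captured in this export, but the point is that a purely random agent rarely reaches the goal. Below is a minimal self-contained sketch; the `step` helper is a hypothetical stand-in for `env.step()` on a deterministic 4×4 Frozen Lake (i.e. `is_slippery=False`), used here so the example runs without Gym installed.

```python
import random

MAP = "SFFFFHFHFFFHHFFG"             # 4x4 lake, row-major: S(0), holes (H), G(15)
LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3   # gym's action encoding for Frozen Lake

def step(state, action):
    """Deterministic stand-in for env.step(): returns (next_state, reward, done)."""
    row, col = divmod(state, 4)
    if action == LEFT:
        col = max(col - 1, 0)
    elif action == DOWN:
        row = min(row + 1, 3)
    elif action == RIGHT:
        col = min(col + 1, 3)
    elif action == UP:
        row = max(row - 1, 0)
    nxt = 4 * row + col
    return nxt, float(MAP[nxt] == "G"), MAP[nxt] in "HG"

def random_episode():
    """Walk randomly from the start until falling into a hole or reaching the goal."""
    state, total = 0, 0.0
    for _ in range(100):             # cap episode length
        state, reward, done = step(state, random.choice([LEFT, DOWN, RIGHT, UP]))
        total += reward
        if done:
            break
    return total

wins = sum(random_episode() for _ in range(1000))
print(f"random agent reached the goal in {wins:.0f} of 1000 episodes")
```

Only a small fraction of random episodes succeed, which is what motivates learning a Q-function in the following slides.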

(19)

Q-function(state-action value function)
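
The slide body was not captured in this export. For reference, Q(s, a) denotes the expected total reward from taking action a in state s and then acting optimally. For a small discrete environment it can be stored as a table; a minimal sketch (NumPy assumed, with the array shape chosen for the 4×4 Frozen Lake):

```python
import numpy as np

N_STATES, N_ACTIONS = 16, 4           # 4x4 Frozen Lake grid, 4 moves
Q = np.zeros((N_STATES, N_ACTIONS))   # Q[s, a]: estimated value of action a in state s

# Before any learning, every entry is 0 -- the agent has no preferences yet.
print(Q.shape)    # (16, 4)
print(Q[0])       # [0. 0. 0. 0.]
```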

(20)

Q-function(state-action value function)

(21)

Policy using Q-function
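
Once a Q-table exists, a policy follows by acting greedily: π(s) = argmax over a of Q(s, a). A sketch (random tie-breaking among maximal actions is one common choice, not necessarily the slides' exact scheme):

```python
import numpy as np

def greedy_action(Q, state):
    """pi(s) = argmax_a Q(s, a), breaking ties randomly among the maxima."""
    q = Q[state]
    best = np.flatnonzero(q == q.max())  # indices of all maximal actions
    return int(np.random.choice(best))

Q = np.zeros((16, 4))
Q[0] = [0.0, 0.5, 0.9, 0.1]   # toy values: RIGHT (index 2) is best in state 0
print(greedy_action(Q, 0))    # -> 2
```

With an all-zero row every action ties, so the greedy policy degenerates to a random one until learning begins.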

(22)

Optimal Policy π and Max Q

(23)

Finding, Learning Q

• Assume (believe) that Q in the next state s′ exists!

• My condition:

– I am in state s

– When I take action a, I will move to s′

– When I take action a, I will get reward r

– Q in s′, i.e. Q(s′, a′), exists

• How can we express Q(s, a) using Q(s′, a′)?
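
The relationship the question points to is the recursive one: Q(s, a) = r + γ · max over a′ of Q(s′, a′), where γ is a discount factor (γ = 1 gives the undiscounted form). A tiny numeric check:

```python
GAMMA = 0.9   # discount factor; gamma = 1 gives the undiscounted version

def backed_up_q(reward, q_next_row, gamma=GAMMA):
    """Q(s, a) = r + gamma * max_a' Q(s', a')."""
    return reward + gamma * max(q_next_row)

# Acting from s yields r = 0 and lands in s' whose Q-row is [0, 0, 1, 0]:
print(backed_up_q(0.0, [0.0, 0.0, 1.0, 0.0]))   # -> 0.9
# A transition directly into the goal (r = 1, terminal s' has all-zero Q):
print(backed_up_q(1.0, [0.0, 0.0, 0.0, 0.0]))   # -> 1.0
```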

(24)

Learning Q(s, a)

(25)

State, action, reward

(26)

Future reward

(27)

Learning Q(s, a)

(28)

Learning Q(s, a)

(29)

Learning Q(s, a) - initial Q values are 0

(30)

Learning Q(s, a)

(31)

Learning Q(s, a)

(32)

Learning Q(s, a)
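
Putting the pieces together, the learning loop sketched across these slides can be written out as follows. This is a self-contained stand-in for the OpenAI Gym version: the deterministic 4×4 lake (`is_slippery=False`), ε-greedy exploration with random tie-breaking, and γ = 0.9 are assumptions for illustration, not necessarily the slides' exact settings.

```python
import random

random.seed(0)                      # reproducible run
MAP = "SFFFFHFHFFFHHFFG"            # 4x4 lake, row-major; start=0, goal=15
GAMMA, EPSILON, EPISODES = 0.9, 0.1, 5000

def step(state, action):
    """Deterministic stand-in for env.step(); actions: 0=LEFT, 1=DOWN, 2=RIGHT, 3=UP."""
    row, col = divmod(state, 4)
    row = max(0, min(3, row + (action == 1) - (action == 3)))
    col = max(0, min(3, col + (action == 2) - (action == 0)))
    nxt = 4 * row + col
    return nxt, float(MAP[nxt] == "G"), MAP[nxt] in "HG"

def pick(q_row, eps):
    """Epsilon-greedy action choice, breaking ties among maxima at random."""
    if random.random() < eps:
        return random.randrange(4)
    best = max(q_row)
    return random.choice([a for a in range(4) if q_row[a] == best])

Q = [[0.0] * 4 for _ in range(16)]  # initial Q values are all 0

for _ in range(EPISODES):
    state = 0                       # env.reset()
    for _ in range(100):            # cap episode length
        action = pick(Q[state], EPSILON)
        nxt, reward, done = step(state, action)
        # the update from the slides: Q(s, a) <- r + gamma * max_a' Q(s', a')
        Q[state][action] = reward + GAMMA * max(Q[nxt])
        state = nxt
        if done:
            break

# Greedy rollout with the learned table:
state, reward = 0, 0.0
for _ in range(20):
    state, reward, done = step(state, pick(Q[state], 0.0))
    if done:
        break
print("goal reached:", reward == 1.0)
```

Because the environment is deterministic, the update is exact: values along a shortest path settle at powers of γ, and the greedy rollout then walks straight to the goal.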
