2 002 , V ol. 13, N o.2 p p 195~2 00
A N ot e on t h e T w o D e pe n de n t B e rn ou lli A rm s D al H o K im 1) ・Y oung Joon Ch a・Jae M an Le e 2 )
A b s tra c t
W e con sider th e Bern ou lli t w o - arm ed b an dit problem . It is w ell kn ow n th at th e m y optic str at eg y is optim al w h en th e prior distribut ion is con cen tr at ed at t w o p oint s in th e un it s qu ar e. W e in v estig at e s ev er al ca se s in t h e u nit squ ar e w h et h er th e m y opt ic st r at egy is opt im al or n ot . In g en eral, t h e m y optic st r at egy is n ot optim al w h en t h e prior dist ribu tion is n ot con cen t rat ed at t w o point s in t h e u nit squ ar e
K e y w o rd s : Ban dit pr oblem , Bern ou lli, m y opic, opt im al., prior dist ribu t ion , t w o - arm ed.
1 . T w o - A rm e d B an dit P ro ble m s
Con sider t w o depen dent B ern oulli arm s (or ex perim en t s ) w ith t h e prior for (
1,
2) on ly con cen tr at ed on t w o point s in t h e u nit squ are . S o arm 1 g en er at es i.i.d. Bern oulli r an dom v ariables (g en erically den ot ed by X ) w it h m ean
1, an d arm 2 g en er at es i.i.d . Bern ou lli r an dom v ariab les (g en erically den ot ed by Y ) w it h m ean
2. F u rth erm or e, ev ery X is in dep en den t of ev ery Y . T h e dis cou nt s equ en ce is t h e N - h orizon un iform . Obj ect iv e is t o m ax im ize t h e ex p ect ed sum of N ob serv at ion s w h en N is fix ed . T his t w o- arm ed Ban dit problem is w ell int r odu ced in B erry an d F rist edt (1985).
It is of int er est t o kn ow un der w h at con dition s t h e m y opic str at eg y is opt im al : alw ay s select th e arm w it h g reat er m ean . It is called m y opic b ecau s e it "b eh av es "
a s if t h er e w er e alw ay s ju st on m ore trial t o b e allocat ed. W h en th e m y opic str at eg y is opt im al, it m ean s t h at t h e opt im al st rat eg y does n ot depen d on th e n um b er of t rials r em ainin g : it is tim e in v arian t , s o t o sp eak .
F eldm an (1962) con sider ed t his pr oblem in t h e ca se t h at (
1,
2) is eith er
1. Depar tm ent of St atist ics , Kyungpook Nation al Univ er sity , 702- 701, Kor ea E - m ail : dalkim @knu .ac.kr
2. Depar tm ent of St atist ics , Andon g Nation al Univ er sity , Kyeon gbuk 760- 749, Kor ea
( a , b) or ( b , a ) w h ere 0 a < b 1 . S o t h e dist ribu tion of (
1,
2) can b e specified by th e sin gle n um b er = P (
1= a ) = P (
2= b) . H e con sider ed th e pr ocedu re w hich m in im ize s t h e ex pect ed nu m b er of "m ist ak es " m ade by th e st atist ician du rin g th e pr ocedu r e (or m inim izes t h e ex pect ed n um b er of t h e in ferior arm ). In fact , t his is equ iv alen t t o m ax im izin g t h e ex pect ed nu m b er of su cce s ses in t his situ at ion .
F or j = 1 , , N an d 0 1 , w e sh all con sider th e sit u ation in w hich th e t ot al n um b er of ob serv ation s r em ainin g t o b e t ak en is j an d t h e dist ribu t ion of
1an d
2
is specified by t h e pr ob ability = P (
1= a ) . In t his sit u ation , let
jX
den ot e t h e pr ocedur e w hich specifies t h at th e fir st ob s erv at ion sh ould b e t ak en on X an d t h en an opt im al procedur e sh ould b e adopt ed ov er t h e rem ain in g j - 1 ob serv ation s , an d let m
jX( ) den ot e th e ex pect ed nu m b er of m ist ak es durin g th e
j ob serv ation s for w h ich t h e procedur e
jX
is u sed . S im ilarly , let
jY
den ot e t h e pr ocedu re w hich specifies t h at t h e fir st ob serv at ion sh ou ld b e t ak en on Y an d t h en an opt im al procedur e sh ould b e adopt ed ov er t h e rem ain in g j - 1 ob serv ation s , an d let m
jY( ) den ot e t h e ex p ect ed n um b er of m ist ak es w h en th e pr ocedu re
jY
is u s ed.
F u rth erm or e, let
jX Y
b e th e procedur e w hich specifies t h at t h e fir st ob serv ation sh ould b e t ak en on X , t h e secon d ob serv at ion sh ou ld b e t ak en on Y , an d th en an opt im al pr ocedu r e sh ould b e adopt ed ov er th e r em ain in g j - 2 ob serv ation s . S im ilarly , let
jYX
b e t h e pr ocedu r e w h ich specifies t h at t h e fir st ob serv ation sh ould b e t ak en on Y , t h e secon d ob s erv at ion sh ould b e t ak en on X , an d th en an opt im al pr ocedu r e sh ould b e adopt ed ov er th e r em ain in g j - 2 ob serv ation s . A lso, let m
jX Y( ) an d m
jY X( ) b e t h e ex pect ed n um b er s of m ist ak es for t h es e pr ocedu r es .
A s u su al, for t h e prior pr ob ab ilit y , let ( X ) , ( Y ) , ( X , Y ) , or ( Y , X ) den ot e t h e post erior pr ob ab ilit y w h en eit h er X (or Y ) is t ak en or ( X , Y ) (or ( Y , X ) ) ar e t ak en in th at or der . T h e follow in g t w o lem m a s an d T h eor em 1 ar e g iv en in DeGr oot (1970).
L e m m a 1 . F or an d for 0 1 , j = 2 , 3 , m
jX Y( ) = m
jYX( ) .
L e m m a 2 . F or each fix ed v alu e of t in t h e int erv al 0 t 1 , t h e pr ob ab ilit ies P { ( X ) t } an d P { ( Y ) t } ar e n on decr ea sin g fu n ction s or ( 0 1 ).
T h e o re m 1 . Let
*b e a procedur e w hich specifie s t h at an ob serv at ion
sh ould b e t ak en on X at an y st ag e for w h ich = P (
1= a ) < 1 / 2 an d t h at an
ob serv ation sh ould b e t ak en on Y at an y st a g e for w hich > 1 / 2 . T h en
*is an opt im al sequ ent ial pr ocedur e.
T h eor em 1 m ean s th at at ev ery st ag e th e st at ist ician sh ould t ak en an ob serv ation on t h e ran dom v ariab le X an d Y for w h ich t h er e is t h e gr eat er pr ob ab ilit y t h at th e ob serv ed v alu e w ill b e 1. In oth er w ord s , th e opt im al pr ocedu re is t h e m y opic pr ocedur e un der w hich th e st at ist ician m ak es a ch oice at each st ag e a s if it w er e t h e fin al st ag e.
K elley (1974 ) con sider ed t h e ca se t h at t h e prior for (
1,
2) is con cent r at ed on t w o p oint s : ( a , b) an d ( c , d ) w h er e 0 a , b , c , d 1 . F or n = 0 , , N , let V
n( ) den ot e t h e optim al ex pect ed g ain for t h e r em ain in g n trials w h en is t h e curr ent prior distribut ion for (
1,
2) . T h es e fu n ction s ar e defin ed by th e follow in g r ecur siv e form ula s .
V
0( ) = 0 , an d
V
n( ) = m ax {E [ X + V
n - 1( ( X ) ) ] , E [ Y + V
n - 1( ( Y ) ) ]} for n = 1 , , N , w h er e ( X ) den ot es t h e p ost erior distribut ion aft er an ob serv at ion on X , an d
( Y ) t h e p ost erior dist rib ut ion aft er an ob serv at ion on Y . It follow s from ab ov e form u la s t h at th er e ex ist s fu n ct io V
n( ) = max {F
n( ) , G
n( ) } n s F
n( ) an d
G
n( ) su ch t h at
for n = 1 , , N
T h es e fun ction s are als o defin ed r ecu r siv ely . Let F
0( ) = 0 an d G
0( ) = 0 . T h en for n = 1 , , N ,
F
n( ) = E [ X + m ax {F
n - 1( ( X ) ) , G
n - 1( ( X ) ) }] , an d
G
n( ) = E [ Y + m ax {F
n - 1( ( Y ) ) , G
n - 1( ( Y ) ) }] .
Let D
n( ) = F
n( ) - G
n( ) , t h e r elat iv e a dv ant ag e of ex p erim en t 1 ov er ex p erim en t 2. Recur siv e form ula s m ay b e dev elop ed for defin in g D
n( ) . In fact ,
D
1( ) = E (
1) - E (
2) , an d for n = 2 , , N ,
D
n( ) = E [ D
n - 1( ( X ) )
+] + E [ D
n - 1( ( Y ) )
-] w h er e X
+den ot es m ax {X , 0 } an d X
-den ot es m in {X , 0 } .
U sin g th ese form ula s , t h e optim al st r at egies can b e ch ar act erized . If a b an d
c d , t h en D
n( ) 0 for all an d for n = 1 , , N . S o th e optim al st r at egy is
t o alw ay s u se arm 2. N ow a s su m e th at b > a an d c > d .
T h e o re m 2 . F or each n = 1 , 2 , . N , t h e follow in g ar e t ru e :
(ⅰ) D
n( ) is a st rict ly in cr ea sin g fun ct ion of . (ⅱ) D
n( ) is a con t in u ou s fun ct ion of .
(ⅲ) D
n( ) < 0 an d D
n( ) > 0 .
(ⅳ) T h er e ex ist s an u niqu e
n( 0 , 1) su ch th at D
n(
n) = 0 .
T h eor em 2 w a s pr ov ed by K elly (1974 ). T his t h eor em m ean s t h at t h e optim al str at eg y is det erm in ed by an u niqu e sequ en ce of con st ant s
1,
2, , ,
N. F r om ab ov e r esu lt s , it follow s th at w h en ev er a b an d c d th e m y opic str at eg y is opt im al. A lso, t his is t ru e w h en ev er a b an d c d . W h en b > a an d c > d , from ab ov e t h eor em , t h e m y opic st r at egy w ill b e opt im al if an d only if
1