A New Redescending M-Estimating Function

(1)

J ournal of K or ean

D a ta & I nf orma t ion S cience S ocie ty 2 002 , Vol. 13, N o.1 p p . 4 7～53

A N e w R e de s c e n din g M - E s t im at in g F u n c t io n Ro Jin P ak¹⁾

A b s tra c t

A n ew r edes cen din g M - est im atin g fun ction is int r odu ced. T h e est im at or s by th is n ew r edescen din g fu n ction at t ain t h e sam e lev el of r obu st n es s a s t h e ex ist in g r edescen din g M - estim at or s , but h av e les s a sy m pt otic v ar ian ces th an ot h er s ex cept few ca ses . W e h av e focu sed on est im atin g a locat ion p ar am et er , but th e m eth od can b e ex t en ded for a scale estim ation .

K e y w ord s : M in im um L² dist an ce ; M - estim atin g fu n ction

1. In trodu c ti on

Redes cen din g M - e st im at or s h av e fun ct ion s w h ich ar e n on decr ea sin g n ear t h e or ig in b ut t h en decr ea se t ow ar d th e ax is a s t h ey g o far fr om th e origin . T h ey u su ally s at isfy ( x ) = 0 for all x w it h |x | r , r is a finit e nu m b er w h ich m ay b e con sider ed a s th e m inim um r ej ection p oint . T h ey w er e v er y su cces sfu l in th e P r in cet on Rob u st n es s S tu dy (A n dr ew s , et al., 1972). T h er e ar e m an y r epr esen t ativ e r edescen din g fun ction s : thr e e - p ar t r ed es cen d ing f un ction (A n dr ew s , et al., 1972), s in e f unct ion (A n dr ew s , et al, 1972), b iw e ig ht f un ction (Beat on an d T u k ey , 1974 ), T anh f unct ion (A n dr ew s , et al., 1972).

N ew r edescen din g fun ct ion is b a sed on m in im ization of L ₂ dist an ce of a m odel den sity an d it s den sity estim at or . L et g b e a fam ily of pr ob abilit y den sit ies in dex ed by . T h e m inim um dist an ce est im at or is defin ed by a st at ist ical qu an tit y m inim izin g L 2 dist an ce, w hich is a solu tion t o L 2

w h er e w e a s sum e p ( x ) , g ( x ) L 2an d r epr e sen t s a der iv ativ e w it h r esp ect t o

1. A ssociat e Pr ofes sor , Departm ent of Comput er Sciences & St atistics , Dankook Univ er sity , Seoul, Kor ea 140- 714. T he r esear ch w a s con ducted by the r esear ch fund of Dankook Univ er sity in 2001.

( p ( x ) - g ( x ) )²dx , (1)

(2)

박노진

. T h e equation (1) can be writt en as

Since w e h av e g ( x ) g ( x ) dx = ( 1/ 2) g ²( x ) dx = 0 , if is a location par am et er , the equation (2) b ecom es

Giv en a r an dom sample, X ₁, X ₂, , X _n, h avin g a den sity g ( x ) , let p ( x ) be a den sity estim at or for g ( x ) such as p ( x ) = 1

n

n i = 1

1

h K

(

^{x - X}h ⁱ

)

, wh er e K ( ) is a k ernel and h is the win dow width (Silv erman , 1986). T h e equation (3) can be writt en a s

If w e follow Huber (1981), and if w e den ot e T _n as an estim at e of , w e hav e

Suppose a k ernel is a Gau s sian an d a m odel is the n orm al with m ean an d v arian ce

2, then ( 1/ h) K {( x - X _i) / h }g ( x ) dx , a conv olution of a Gau ssian kern el and a n orm al den sity , is th e n orm al with m ean an d v ariance h²+ ². After droppin g unn eces sary con st ant s , w e hav e

P ro p o s e d F u n c t io n : Redefin e - fun ction as

where r is a tuning con st ant .

W e pr opose for g en er ality t o r edefine the _r( t) with r as a tunin g con stant . T he

r( t) is origin ally deriv ed b ased on n orm al den sity , but in the section w e will show that it w ork s quite w ell for dat a sampled fr om n ot only n orm al distribution but also other distribution s . One may con sider r as a function of h and rath er th an ju st a con st ant , an d try t o find a proper v alu e of r by r eplacin g h an d by pr oper estim at or s. T her e ar e a lot of thin g s t o t alk about this idea , but this tim e w e w ould like

( p ( x ) - g ( x ) ) g ( x ) dx = 0 . (2)

p ( x ) g ( x ) dx = p ( x )g ( x ) dx = 0 . (3)

n i = 1

1

h K

(

^{x - X}h ⁱ

)

g ( x ) dx = 0 .

( X _i; T _n) = 1

h K

(

^{x - X}h ⁱ

)

^{g ( x ) dx}

= Tn

.

( X i; T n) = ( X i- ) ex p [ - ( X i- )²/ 2 ( h^{2 +} ²) ]| ^{= T}n

.

r( t) = t ex p [ - t²/ 2 r²] for t ( - , ) , (4) 48

(3)

t o hav e a fun ction similar to the ex isting function s , which ar e contr olled by con st ant s lik e tuning con st ant , trimming con st ant or bendin g con stant .

2 . P rope rti e s of th e propo s e d fu n c ti on

1 . S h ap e : F or a giv en finit e con stant c >0 , the clas s of r edescen ding function s c

con sist s of all mapping s : R R satisfying

is continu ou s on R , ( - t) = - ( t) for all t, and ( t) 0 for t 0;

th e set D ( ) of point s in which ' is n ot defined or n ot continu ou s is finit e;

( t) = 0 for |t| c .

T h e r in (4) satisfies the condition 1 an d 2. T h ou gh it does not fully satisfy th e con dition 3, it is ba sically r edescending (Figure 1). Sin ce th e pr opose _r fun ction is slowly decayin g but nev er becom es 0, w e can av oid som e comput ational pr oblem s m entioned by Huber (1981, p103) and Hampel, et al. (1986, p 152), while it can effectiv ely handle ex tr em e outlier s lik e th e other r edescen din g M - estim ating function s .

Not e th at the estim at es w e got in actu al calculation s ar e so called one - step M - estim at es

which ar e defin ed by

where the initial estim ates of location T ^{( 0 )}_n is the median of the ob serv ation s x ₁, ,x _n, an d S _n= 1 .483 m ed _i{|x _i- m ed_j( x_j) }. In the pr ocess of g etting one - step M - estim at es, w e ar e not g oing to hav e ' 0 ' in denomin at or , which could cau se ' ov erflow ' during comput ation , so that the pr oposed _r( t) pr odu ces v ery stable estim at es compar e to the oth er r edescending M - estim ating function s. T he fun ction _r deals v ariou s M - estim ating fun ction s fr om a r edescendin g function t o a non - decr easin g ( t) = t accor din g t o r . Wh en r is infinit e, w e hav e ( t) = t for all t, which pr oduces (n on - r obu st , but m ost efficient ) least - squar es estim at or s . Of cour se, if r is m oderate, the pr opose _r( t) play s like a r edescendin g M - estim ating fun ction . With a pr oper v alue of r , w e can hav e the r( t) which produces the estim at or as efficient as least squ ar es estim ator while w e keep reasonable lev el of r obu stn ess .

T n= T n^{( 0 )}+ S n

n

i = 1 ( ( x i- T n^{( 0 )}) / Sn)

n

i = 1 ' ( ( x _i- T ^{( 0 )}_n / S _n) ,

(4)

박노진

2 . R o b u s tn e s s : Figure 1 display s _r with r = 2 an d 5 (t op ), Influ en ce F un ction s (IF ) at F = (middle) and Chan ge- of- v arian ce function s (CVF ) at F = (bott om ).

F or each r , IF an d CVF ar e boun ded, an d CVF with r = 5 look s similar t o th e CVF of the logistic likelihood estim ator . F igur e 2 display s ^* ( gr oss - err or - sen sitivity ), ^* (chang e- of - v arian ce sen sitivity ) an d eff iciency (asympt otic efficien cy ) of _r for r ( 0 , 5 ] at F = . It can b e said that when r is m oder at e, an M - estim at or by r is b oth V - r obu st and B - r obu st . W hen r is 1.6, 3.3, eff iciency is about 0.9, 0.99, r espectiv ely , and it will conv er ge t o 1 as r in cr eases . Both ^* an d ^* are goin g upw ar d to , but minimized n ear r = 1 .5 , which w ould be a good choice for r .

3 . E f f i c i e n c y : W e compar e the pr oposed _r with some of the w ell- known r edescending M - estim ating fun ction s (T able 1) as Hampel, et al. (pp . 166 - 167, 1986).

T he asympt otic efficiency of the pr oposed estimator at the st andar d norm al distribution is little high er than or appr ox im at ely equal to th ose of the other estim at or s ex cept Huber - estim at or . T he asympt otic v ariances of the pr oposed estim at or under v ariou s distribution s ar e smaller t h an t h os e for t h e ot h er e st im a t or s ex cept t h e v ar ia n ce of H u b er - e st im at or u n der 5% 3N . W e h a v e sim u la t ed s et s of ob s er v a tion s an d calcu lat ed th e e s tim at e s for t w o r epr e s en t ativ e M - e s tim at ion fun ct ion s ; H ub er ' s , Biw eig ht ; an d th e e st im at e s by t h e fu n ct ion pr op os e d in t his ar t icle . 500 s am ple s of 20 an d 40 ob s er v a tion s a r e sim ulat ed for v ar iou s dist r ib ut ion s list ed in T ab le 1, a sy m pt otic efficien cie s an d v ar ian ce s ar e r e cor ded . T h e sim u lat ed s t a tist ic s dis play s th e s im ilar p at t er n t o t h e t h e or et ica l v alu e s in T able 1, ex cept t h e ca s e of Cau ch y dist r ibu tion .

4 . S t a b l e n e s s : W e als o sim u lat ed 500 s et s of 20 an d 40 ob s er v at ion s fr om 0 . 5N( 0 , 1) + 0 . 25N( - 5 , 1) + 0 .25N(5 , 1) , th at is , alm ost h a lf of sim ulat e d ob s er v a tion s b elow - 4 or ab ov e +4. W e calculat ed H u b er ' s e s tim at e s , th e e st im at e s by u sin g t h e B iw eigh t a n d t h e e st im a t e s by t h e pr op os e d fu n ction . T h os e th r e e fun ct ion s ar e th e s am e fu n ction s a s w e u s ed in th e ab ov e sim u lat ion ; H u b er ' s w ith b= 1 .4088 , Biw eig ht w it h r= 4 , an d th e pr op os ed fun ct ion w ith r= 1 . 9388 . T h e v ar ian ce s for H ub er ' e st im at e s ar e 1.3730 ( n=20) an d 1.4827 (n=40); t h os e for th e pr opos ed e s tim at e s ar e 1.0869 (n=20) an d 1.2981(n=40), w h ile 3.6291 (n=20) an d 2.4543 (n=40) for t h e B iw eigh t . T h e v alu e s for Biw eig ht a r e a lm os t dou ble com pa r e t o t h os e for t h e pr opos ed fu n ct ion . S in c e t h e B iw eigh t fu n ct ion b ecom e s 0 out side (- 4, 4 ), w h ile t h e pr opos ed fun ct ion is s low ly de cay b ut n ev er b ecom e s 0, t h e pr op os ed fun ction pr odu ce d m or e st able e st im at e s t h an Biw eig ht fu n ct ion .

3 . Co n c lu s ion s

W e pr opos ed a n ew r e de s cen din g t y pe of M - e st im a tin g fu n ct ion in du ce d by m in im izin g 5 0

(5)

L ₂ dist an ce b et w een a n orm al den s it y an d a Gau s s ian k er n el den s it y e st im a t or . T his n ew ly pr op os ed M - e st im at in g fu n ct ion is ev er y w h er e differ en tiable w hile it is r ede s cen din g , a n d it h a s b een sh ow n t h a t e s tim at or s by t h e n ew M - e s tim atin g fun ction p er for m b ett er t h an ex ist in g M - e st im a t or s in t er m s of r obu s tn e s s , efficien cy an d c om pu t a tion al s t ab len e s s u n der v ar iou s dist r ibu t ion s .

F igu r e 1: _r (t op ), IF (m iddle ) an d CV F (b ott om )

(6)

박노진

Figur e 2: ^*, ^*, and eff icien cy T able 1: Comparison of Som e Redescending M - estim at or s

A sympt otic Variances

E stim ator efficiency 5%3N 10%10N t³ ^25%3N ^{Cau chy}

Sin e 0.9093 1.1991 1.2691 1.5769 1.7687 2.2688

Huber - Collin s 0.9107 1.1966 1.2689 1.5581 1.7583 2.2591

T hr ee - P art 0.9119 1.1954 1.2662 1.5783 1.7603 2.3306

T anh 0.9205 1.1866 1.2590 1.5625 1.7579 2.2977

Scaled- logistic

MLE 0.9344 1.1872 1.4624 1.5380 1.7989 2.6390

Huber 0.9563 1.1649 1.4385 1.5663 1.7877 2.7890

n =20 0.8879 1.2063 1.4129 1.5848 1.8338 3.9073

n =40 0.9514 1.1330 1.5215 1.5225 1.8216 3.7798

Biw eight 0.9100 1.1978 1.2683 1.5708 1.7645 2.2593

n =20 0.7849 1.2660 1.3046 1.6716 1.7322 3.2951

n =40 0.8953 1.1834 1.3249 1.6160 1.8080 2.7647

Pr oposed 0.9344 1.1709 1.2491 1.5279 1.7360 2.2498

n =20 0.8502 1.2015 1.2721 1.5950 1.7488 3.3522

n =40 0.9341 1.1303 1.2931 1.5319 1.7961 2.8858

T he figur es in this t able except the ones for `Pr opos ed estim at or ' ar e a s s am e as thos e in T able 3 on p.167 of Ham pel et al. (1986). All est im at or s s atisfy ^* = 1.6749 at the st andar d n orm al dis tribution , w her e als o the asympt otic efficiency is ev aluat ed. T he estim at or s un der study ar e the follow ing : sine, ( t) = s in (x / a) for |x | < a and zer o otherw is e, w ith a =1.142; biw eight , 5 2

(7)

( x ) = x ( r²- x²)² for |x | < r and zer o ot herw is e, w ith r=4; Huber - collins , p =1.277, x1=1.344, r =4;

t hr ee- par t r edescending , ben ds at 1.31, 2.039, 4; t anh - es tim at or , r=4, k =3.732, p =1.312, A =0.667, B =0.783; Huber - estim at or ; bends at b=1.4088; nd scaled logist ic M L E , ( x ) = [ ex p (x / a) - 1] / [ ex p (x / a) + 1] w ith a=1.036; the pr opos ed es tim at or , r=1.9388. T he abbreviation % N st ands for the distribution ( 1 - / 100) (x ) + ( / 100) (x / ) and t₃ is the t distribution w ith 3 degrees of freedom and Cauchy is the Cauchy distribution w ith 0 (location ) and 1 (scale).

R ef e re n c e s

1. Andr ew s, D. F ., Bick el, P . J ., Hampel, F . R ., Huber P . J ., Rog er s W . H ., &

T ukey J . W . (1972). R ob us t E s tim a t es of L oca t ion: S urv ey and A d vances. P r in cet on N J : P r in cet on U niv . P r e s s .

2. B eat on , A . E . & T u k ey , J . W . (1974 ). Ou tlie r in S ta tis tical D a ta. W iley , N ew Y or k .

3. H a m p el, F . R ., R on ch ett i, E . M ., R ou s s eeu w , P . J ., & S t ah el, W . A . (1986 ).

R ob us t S ta t is tics , the A p p roach B as ed on I nf lu en ce F un ct ions. J ohn W iley

& S on s , N e w Y or k .

4. H u b er , P . J . (1981), R obus t S ta t is t ics, J ohn W iley & S on s , N ew Y or k .

5. S ilv er m an , B . W . (1986). D ens ity E s t im a t ion f or S ta tis tics and D a ta A naly s is. L on don : Ch apm an & H all.