Journal of Korean Data & Information Science Society
2002, Vol. 13, No. 1, pp. 113~119
Goodness-of-Fit Test for the Pareto Distribution Based on the Transformed Sample Lorenz Curve 1)
Suk-Bok Kang 2) and Young-Suk Cho 3)
Abstract
A powerful and easily computed goodness-of-fit test for the Pareto distribution, which does not depend on the unknown location and scale parameters, is proposed based on the transformed sample Lorenz curve.
We compare the power of the proposed test statistic with that of other goodness-of-fit tests for the Pareto distribution against various alternatives through Monte Carlo methods.
1. Introduction
A continuous random variable X has the Pareto distribution with location parameter a, scale parameter b, and shape parameter c if it has a cumulative distribution function (cdf) of the form
$$F(x) = 1 - \left[1 + \frac{x - a}{b}\right]^{-c}, \quad x \ge a, \quad b, c > 0. \tag{1.1}$$

The Pareto distribution is important in the statistical analysis of income, wealth, and service times in queueing systems. Fisk (1961) cited several examples of economic data which follow the Pareto distribution. Berger and Mandelbrot (1963) used the Pareto in studies of error clusters in communication circuits. Harris (1968) found the Pareto to be useful in modeling service times in queueing systems. Kaminsky and Nelson (1975) showed how the Pareto can be used in life testing, reliability, and replacement policy. Davis and Feldstein (1979) used the Pareto to model survival data based on times-to-failure of an observed sample. Lawless (1983) stated that the 3-parameter Pareto distribution could have a decreasing or increasing hazard function depending on the parameters.

1) This research was supported by the Yeungnam University research grants in 2001.
2) Professor, Department of Statistics, Yeungnam University, Kyongsan, 712-749, Korea
3) Adjunct Assistant Professor, Department of Statistics, Yeungnam University, Kyongsan, 712-749, Korea
Likes (1969) derived the uniformly minimum variance unbiased estimators (UMVUE) of the parameters in the Pareto distribution. Malik (1970) derived distributions of the maximum likelihood estimators (MLE) of the parameters in the Pareto distribution. Kulldorff and Vannman (1973) studied estimation of the location and scale parameters of the Pareto distribution. Woo and Kang (1990) considered a more general class of UMVUE for the function of two parameters in the Pareto distribution. Kang and Cho (1996) obtained the jackknife estimator, the generalized jackknife estimator, and the minimum risk estimator (MRE) of two parameters in the Pareto distribution.
Moothathu (1985) derived the MLEs of the Lorenz curve and the Gini index of a Pareto distribution, together with their exact and asymptotic distributions and moments. Moothathu (1990) also obtained the UMVUE and a strongly consistent asymptotically normal unbiased estimator (SCANUE) of the Lorenz curve, the Gini index, and the Theil entropy index of a Pareto distribution. Kang and Cho (1999) proposed several estimators of the Lorenz curve in the Pareto distribution.
Use of the Pareto distribution for practical applications can be enhanced by an accurate method of determining whether a set of data comes from a population governed by the Pareto distribution. One class of goodness-of-fit tests that can be used for this purpose consists of tests based on the distance between the empirical distribution function (edf) and the hypothesized cdf. Three of the better known tests in this class, Kolmogorov-Smirnov (K-S), Anderson-Darling (A-D), and Cramer-von Mises (C-vM), are valid when there are no unknown parameters in the hypothesized distribution.
Lilliefors (1967, 1969) used Monte Carlo methods to construct tables for the modified K-S test when the parameters of a normal or an exponential distribution are estimated. Green and Hegazy (1976) constructed modified K-S, A-D, and C-vM critical value tables for the uniform, Laplace, Cauchy, and other distributions. Porter III, Coleman, and Moore (1992) provided modified K-S, A-D, and C-vM critical value tables for the Pareto distribution.
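Because the cdf (1.1) has a closed-form inverse, Pareto variates for Monte Carlo work can be drawn by inverse-transform sampling. The following is a minimal sketch only: the function names and the use of NumPy are assumptions for illustration and are not taken from the original study.

```python
import numpy as np

def pareto_cdf(x, a, b, c):
    """CDF (1.1): F(x) = 1 - [1 + (x - a)/b]^(-c) for x >= a, and 0 below a."""
    x = np.asarray(x, dtype=float)
    z = np.maximum((x - a) / b, 0.0)  # clip so that F(x) = 0 for x < a
    return 1.0 - (1.0 + z) ** (-c)

def pareto_quantile(u, a, b, c):
    """Inverse of (1.1): F^{-1}(u) = a + b[(1 - u)^(-1/c) - 1] for 0 <= u < 1."""
    u = np.asarray(u, dtype=float)
    return a + b * ((1.0 - u) ** (-1.0 / c) - 1.0)

def pareto_sample(n, a, b, c, rng):
    """Draw n variates by applying the quantile function to uniform draws."""
    return pareto_quantile(rng.uniform(size=n), a, b, c)

rng = np.random.default_rng(0)          # seed chosen arbitrarily
x = pareto_sample(5, a=1.0, b=1.0, c=3.5, rng=rng)
print(np.sort(x))
```

Every draw is at least a, since the quantile function is increasing and equals a at u = 0.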
2. Goodness-of-fit Tests
Let $X_{(j)}$ ($j = 1, 2, \ldots, n$) be the j-th order statistic based on a random sample $X_1, X_2, \ldots, X_n$ from the Pareto distribution with cdf (1.1). Consider now the case when the shape parameter c is known and both the location parameter a and the scale parameter b are unknown. The best linear unbiased estimates (BLUEs) $\hat{a}$ and $\hat{b}$ were proposed by Kulldorff and Vannman (1973).
For $c > 2$, the BLUEs are

$$\hat{a} = x_{(1)} - \frac{Y}{(nc - 1)(nc - 2) - ncD} \tag{2.1}$$

and

$$\hat{b} = (x_{(1)} - \hat{a})(nc - 1) \tag{2.2}$$

where

$$B_i = \left(1 - \frac{2}{c(n - i + 1)}\right) B_{i-1}, \quad i = 1, 2, \ldots, n, \qquad B_0 \equiv 1,$$

$$D = (c + 1)\sum_{i=1}^{n-1} B_i + (c - 1) B_n, \qquad Y = (c + 1)\sum_{i=1}^{n-1} B_i x_{(i)} + (c - 1) B_n x_{(n)} - D x_{(1)}.$$

For $2/n < c \le 2$ with $2/c$ an integer, the BLUEs are

$$\hat{a} = x_{(1)} - \frac{\hat{b}}{nc - 1} \tag{2.3}$$

and

$$\hat{b} = \frac{(c + 1)(c + 2)(nc - 1)}{(nc - 2)(nc - c - 2)} \left[\sum_{i=1}^{n - 2/c} B_i x_{(i)} - \frac{nc - 2}{c + 2}\, x_{(1)}\right]. \tag{2.4}$$

The BLUEs were used to find the hypothesized cumulative distribution function values $P_i = F(x_{(i)}; \hat{a}, \hat{b}, c)$, for $i = 1, 2, \ldots, n$. Then the values of the three modified test statistics were calculated.
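The BLUE formulas (2.1) to (2.4) translate directly into code. The sketch below is one possible implementation, not the authors' program: the function name `pareto_blue` is hypothetical, and the denominator of (2.1) is taken as $(nc-1)(nc-2) - ncD$, the form under which the estimator is unbiased.

```python
import numpy as np

def pareto_blue(x, c):
    """Return (a_hat, b_hat), the BLUEs of location and scale for known shape c."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    i = np.arange(1, n + 1)
    # B_i = prod_{k<=i} (1 - 2 / (c (n - k + 1))), starting from B_0 = 1
    B = np.cumprod(1.0 - 2.0 / (c * (n - i + 1)))
    if c > 2:
        # Case (2.1)-(2.2): all n order statistics are used
        D = (c + 1) * B[:-1].sum() + (c - 1) * B[-1]
        Y = (c + 1) * (B[:-1] * x[:-1]).sum() + (c - 1) * B[-1] * x[-1] - D * x[0]
        a_hat = x[0] - Y / ((n * c - 1) * (n * c - 2) - n * c * D)
        b_hat = (x[0] - a_hat) * (n * c - 1)
    else:
        # Case (2.3)-(2.4): only the first m = n - 2/c order statistics are used
        m = n - 2.0 / c
        if c <= 2.0 / n or abs(m - round(m)) > 1e-9 or round(m) < 2:
            raise ValueError("need 2/n < c <= 2, 2/c an integer, and n - 2/c >= 2")
        m = int(round(m))
        factor = (c + 1) * (c + 2) * (n * c - 1) / ((n * c - 2) * (n * c - c - 2))
        b_hat = factor * ((B[:m] * x[:m]).sum() - (n * c - 2) / (c + 2) * x[0])
        a_hat = x[0] - b_hat / (n * c - 1)
    return a_hat, b_hat

# Illustrative data only (not from the paper)
sample = [1.20, 1.05, 2.70, 1.60, 1.31, 1.12, 1.93]
print(pareto_blue(sample, c=3.5))
```

Both estimators are linear in the order statistics, so a quick sanity check is to feed in the expected order statistics of a Pareto sample; the function should then return the true a and b.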
The K-S statistic was computed from

$$D = \max\{D^+, D^-\}, \qquad D^+ = \sup_{1 \le i \le n}\left[P_i - \frac{i - 1}{n}\right], \qquad D^- = \sup_{1 \le i \le n}\left[\frac{i}{n} - P_i\right].$$

The A-D statistic was computed from

$$A^2 = -n - \frac{1}{n}\sum_{i=1}^{n} (2i - 1)\left(\log P_i + \log(1 - P_{n+1-i})\right).$$

The C-vM statistic was computed from

$$W^2 = \frac{1}{12n} + \sum_{i=1}^{n}\left(P_i - \frac{2i - 1}{2n}\right)^2.$$
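The three EDF statistics depend on the data only through the fitted probabilities $P_i$. The sketch below evaluates them from a vector of such probabilities; the function name is hypothetical, and it assumes every $P_i$ lies strictly between 0 and 1 so that the logarithms in $A^2$ are finite.

```python
import numpy as np

def edf_statistics(P):
    """Return (D, A2, W2) for fitted cdf values P_i at the order statistics."""
    P = np.sort(np.asarray(P, dtype=float))
    n = P.size
    i = np.arange(1, n + 1)
    # Kolmogorov-Smirnov: largest deviation on either side of the edf steps
    d_plus = np.max(P - (i - 1) / n)
    d_minus = np.max(i / n - P)
    D = max(d_plus, d_minus)
    # Anderson-Darling: P[::-1] supplies P_{n+1-i}
    A2 = -n - np.sum((2 * i - 1) * (np.log(P) + np.log(1.0 - P[::-1]))) / n
    # Cramer-von Mises
    W2 = 1.0 / (12 * n) + np.sum((P - (2 * i - 1) / (2.0 * n)) ** 2)
    return D, A2, W2

# Illustrative probabilities only
print(edf_statistics([0.10, 0.35, 0.40, 0.80, 0.95]))
```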
This procedure was repeated 5,000 times for each sample size n and each shape parameter c, with a = b = 1, for all three tests. Critical values are tabulated in Porter III, Coleman, and Moore (1992). The null hypothesis that a set of sample data follows a Pareto distribution with specified shape parameter c is rejected at the desired significance level if the calculated value of the test statistic exceeds the table value.
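The Monte Carlo procedure for a critical value can be sketched as follows for the modified K-S statistic in the case c > 2. This is an illustration under stated assumptions, not a reproduction of the published tables: the function names, the seed, and the use of the empirical 1 - alpha quantile as the critical value are choices made here, and the BLUE denominator is taken as $(nc-1)(nc-2) - ncD$.

```python
import numpy as np

def blue_c_gt_2(x, c):
    """BLUEs of location and scale for sorted x and known shape c > 2."""
    n = x.size
    i = np.arange(1, n + 1)
    B = np.cumprod(1.0 - 2.0 / (c * (n - i + 1)))
    D = (c + 1) * B[:-1].sum() + (c - 1) * B[-1]
    Y = (c + 1) * (B[:-1] * x[:-1]).sum() + (c - 1) * B[-1] * x[-1] - D * x[0]
    a_hat = x[0] - Y / ((n * c - 1) * (n * c - 2) - n * c * D)
    return a_hat, (x[0] - a_hat) * (n * c - 1)

def modified_ks(x, c):
    """K-S statistic with a and b replaced by their BLUEs."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    a_hat, b_hat = blue_c_gt_2(x, c)
    P = 1.0 - (1.0 + (x - a_hat) / b_hat) ** (-c)
    i = np.arange(1, n + 1)
    return max(np.max(P - (i - 1) / n), np.max(i / n - P))

def ks_critical_value(n, c, alpha=0.05, reps=5000, seed=0):
    """Empirical (1 - alpha) quantile of the statistic under the Pareto null."""
    rng = np.random.default_rng(seed)
    stats = np.empty(reps)
    for r in range(reps):
        # Inverse-transform sample from (1.1) with a = b = 1
        x = (1.0 - rng.uniform(size=n)) ** (-1.0 / c)
        stats[r] = modified_ks(x, c)
    return np.quantile(stats, 1.0 - alpha)

print(ks_critical_value(n=25, c=3.5))
```

Since the BLUEs are equivariant under shifts and rescalings of the data, the statistic does not depend on the true a and b, which is why simulating with a = b = 1 suffices.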
The Lorenz curve is used extensively in the study of inequality of distributions and is a powerful tool for the analysis of a variety of scientific problems. The Lorenz curve is given by

$$L(y) = \frac{\int_0^y x \, dF(x)}{E(Y)}$$

where Y is a nonnegative income variable for which the mathematical expectation $E(Y)$ exists.
Assume that $X_1, X_2, \ldots, X_n$ are positive random variables with order statistics $X_{(1)} < \cdots < X_{(n)}$. Let $r = [np]$ denote the greatest integer less than or equal to $np$. Then the sample Lorenz curve (Gail and Gastwirth (1978)) is defined by

$$L_n(p) = \frac{\sum_{i=1}^{r} X_{(i)}}{\sum_{i=1}^{n} X_{(i)}}, \qquad r = [np].$$
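The sample Lorenz curve is the share of the total held by the r = [np] smallest observations. A minimal sketch follows; the function name is hypothetical, and the small tolerance added before taking the floor is an implementation choice made here to guard against floating-point error in n * p.

```python
import numpy as np

def sample_lorenz(x, p):
    """L_n(p): sum of the [np] smallest observations divided by the total."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    r = int(np.floor(n * p + 1e-9))  # r = [np], robust to rounding in n * p
    return x[:r].sum() / x.sum()

incomes = [4.0, 1.0, 3.0, 2.0]       # illustrative data only
print(sample_lorenz(incomes, 0.5))   # share held by the two smallest values
```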
Cho et al. (1999) proposed the transformed Lorenz curve, which can be used in the study of symmetric distributions. The transformed Lorenz curve is defined by

$$TL(p) = L(p) - p + 1.$$
To test $H_0: X \sim F(x)$, Kang and Cho (2001) proposed the Normalized Sample Lorenz Curve (NSLC). The NSLC is defined by

$$NSLC(p) = \frac{TSL(p)}{TSL_F(p)}, \qquad p = i/n, \quad i = 1, 2, \ldots, n,$$

where

$$TSL(p) = \frac{\sum_{j=1}^{i} (X_{j:n} - X_{1:n})}{\sum_{j=1}^{n} (X_{j:n} - X_{1:n})} - p + 1,$$

$$TSL_F(p) = \frac{\sum_{j=1}^{i} \left(F^{-1}(j/(n+1)) - F^{-1}(1/(n+1))\right)}{\sum_{j=1}^{n} \left(F^{-1}(j/(n+1)) - F^{-1}(1/(n+1))\right)} - p + 1.$$
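We propose a test statistic based on the NSLC for the Pareto distribution as follows:

$$TS = NSLC_{par}(0.5)$$

where

$$NSLC_{par}(p) = \frac{TSL(p)}{TSL_{par}(p)}, \qquad p = i/n, \quad i = 1, 2, \ldots, n,$$

$$TSL(p) = \frac{\sum_{j=1}^{i} (X_{j:n} - X_{1:n})}{\sum_{j=1}^{n} (X_{j:n} - X_{1:n})} - p + 1,$$

$$TSL_{par}(p) = \frac{\sum_{j=1}^{i} \left((1 - j/(n+1))^{-1/c} - (1 - 1/(n+1))^{-1/c}\right)}{\sum_{j=1}^{n} \left((1 - j/(n+1))^{-1/c} - (1 - 1/(n+1))^{-1/c}\right)} - p + 1.$$

Here $TSL_{par}$ is $TSL_F$ evaluated at the Pareto quantile function implied by (1.1), $F^{-1}(u) = a + b\left[(1 - u)^{-1/c} - 1\right]$. The location a cancels in each difference and the scale b cancels in the ratio, so TS does not depend on the unknown location and scale parameters.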
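A minimal sketch of the statistic TS is given below. Several points are assumptions made here rather than details stated in the text: the function names are hypothetical, the reference ordinates use the Pareto quantile function implied by (1.1), $F^{-1}(u) = a + b[(1-u)^{-1/c} - 1]$, and when np is not an integer (as with n = 25 and p = 0.5) the partial sums run to i = [np] while p itself is kept in the term -p + 1.

```python
import numpy as np

def tsl(values, i, p):
    """Transformed Lorenz ordinate of sorted values, measured from the minimum."""
    d = values - values[0]
    return d[:i].sum() / d.sum() - p + 1.0

def ts_statistic(x, c, p=0.5):
    """TS = TSL(p) / TSL_par(p) for a sample x and hypothesized shape c."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    i = int(np.floor(n * p + 1e-9))   # assumption: i = [np]
    j = np.arange(1, n + 1)
    # Pareto quantiles at j/(n+1), up to location and scale (both cancel in tsl)
    q = (1.0 - j / (n + 1.0)) ** (-1.0 / c)
    return tsl(x, i, p) / tsl(q, i, p)

# Illustrative data only (not from the paper)
sample = [1.20, 1.05, 2.70, 1.60, 1.31, 1.12, 1.93, 1.44]
print(ts_statistic(sample, c=3.5))
```

A sample that coincides with the hypothesized quantiles, up to any shift and positive rescaling, gives TS = 1.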
3. The Simulated Results
The exact distribution of the test statistic TS is hard to calculate. The critical values of TS are therefore obtained by Monte Carlo simulation with 5,000 replications for sample size 25 and each shape parameter c. The TS critical value is 0.2051726 (1.213659) for sample size 25, significance level $\alpha = 0.05$, and c = 1.0 (3.5). A power comparison was made among the K-S, A-D, C-vM, and TS goodness-of-fit tests for the three-parameter Pareto distribution with only the shape parameter specified. The power values were obtained by generating 5,000 random samples of size 25 from each alternative distribution for each test. Tables 1 and 2 contain the powers for hypothesized Pareto shape parameters c = 1.0 and 3.5, respectively. In both tables the proposed test statistic has greater power than the other three test statistics against every alternative considered. The test statistic TS provides a powerful and easily computed goodness-of-fit test for the Pareto distribution which does not depend on the unknown location and scale parameters.
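The simulation design described above can be sketched as follows. This is an illustration and is not expected to reproduce the published critical values or powers. Assumptions made here: the function names and seeds are hypothetical, the partial sums in TS run to i = [np], the reference ordinates use the Pareto quantile function implied by (1.1), and the test rejects for large TS, mirroring the rule stated for the EDF tests, since the text does not state the rejection region for TS.

```python
import numpy as np

def ts_statistic(x, c, p=0.5):
    """TS = TSL(p) / TSL_par(p), with partial sums up to i = [np] (assumption)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    i = int(np.floor(n * p + 1e-9))
    q = (1.0 - np.arange(1, n + 1) / (n + 1.0)) ** (-1.0 / c)

    def tsl(v):
        d = v - v[0]
        return d[:i].sum() / d.sum() - p + 1.0

    return tsl(x) / tsl(q)

def simulate_ts(sampler, c, n=25, reps=5000, seed=0):
    """TS values for `reps` samples of size n drawn by sampler(rng, n)."""
    rng = np.random.default_rng(seed)
    return np.array([ts_statistic(sampler(rng, n), c) for _ in range(reps)])

def pareto_sampler(c):
    """Sampler for (1.1) with a = b = 1, by inverse transform."""
    return lambda rng, n: (1.0 - rng.uniform(size=n)) ** (-1.0 / c)

c = 3.5
null_ts = simulate_ts(pareto_sampler(c), c, seed=1)
crit = np.quantile(null_ts, 0.95)            # upper 5% point (assumed rejection rule)
alt_ts = simulate_ts(lambda rng, n: rng.exponential(size=n), c, seed=2)
power = np.mean(alt_ts > crit)               # estimated power against Exp(0,1)
print(crit, power)
```

Swapping the sampler for `rng.uniform`, `rng.standard_normal`, `rng.beta(2, 2, size=n)`, or `rng.weibull(3.5, size=n)` would cover the other alternatives listed in the tables.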
Table 1. Monte Carlo power estimates based on 5,000 samples of size n = 25 using significance level $\alpha = 0.05$ with c = 1.0

Test    Exp(0,1)   U(0,1)   N(0,1)   Beta(2,2)   Weibull (shape = 3.5)
K-S     0.139      0.958    0.984    0.927       0.985
A-D     0.154      0.943    0.994    0.964       0.991
C-vM    0.165      0.920    0.992    0.956       0.989
TS      0.404      0.999    1.000    0.999       1.000
Table 2. Monte Carlo power estimates based on 5,000 samples of size n = 25 using significance level $\alpha = 0.05$ with c = 3.5

Test    Exp(0,1)   U(0,1)   N(0,1)   Beta(2,2)   Weibull (shape = 3.5)
K-S     0.085      0.760    0.936    0.745       0.924
A-D     0.089      0.881    0.985    0.917       0.982
C-vM    0.095      0.856    0.983    0.920       0.978
TS      0.198      0.996    0.999    0.995       0.999
References
1. Berger, J. M. and Mandelbrot, B. (1963). A new model for error clustering in telephone circuits. IBM J. Research & Development, Vol. 7, 224-236.
2. Cho, Y. S., Lee, J. Y., and Kang, S. B. (1999). A study on distribution based on the Transformed Lorenz Curve. The Korean Journal of Applied Statistics, Vol. 12(1), 153-163.
3. Davis, H. T. and Feldstein, M. L. (1979). The generalized Pareto law as a model for progressively censored survival data. Biometrika, Vol. 66, 299-306.
4. Fisk, P. R. (1961). The graduation of income distributions. Econometrica, Vol. 29, 171-185.
5. Gail, M. H. and Gastwirth, J. L. (1978). A Scale-Free Goodness-of-Fit test for the exponential distribution based on Lorenz curve. Journal of American Statistical Association, Vol. 73, 787-793.
6. Green, J. R. and Hegazy, Y. A. S. (1976). Powerful modified EDF goodness-of-fit tests. Journal of American Statistical Association, Vol. 71, 204-209.
7. Harris, C. M. (1968). The Pareto distribution as a queue service discipline. Operations Research, Vol. 16, 307-313.
8. Kaminsky, K. S. and Nelson, P. I. (1975). Best linear unbiased prediction of order statistics in location and scale families. Journal of American Statistical Association, Vol. 70, 145-150.
9. Kang, S. B. and Cho, Y. S. (1996). Estimation of the Parameters in a Pareto Distribution by Jackknife and Bootstrap Method. Journal of Information & Optimization Sciences, Vol. 18(2), 289-300.
10. Kang, S. B. and Cho, Y. S. (2001). A study on Distribution Based on the Normalized Sample Lorenz Curve. The Korean Communications in Statistics, Vol. 8(1), 185-192.
11. Kulldorff, G. and Vannman, K. (1973). Estimation of the Location and Scale Parameters of a Pareto Distribution by Linear Functions of Order Statistics. Journal of American Statistical Association, Vol. 68, 218-227.
12. Malik, H. J. (1970). Estimation of the Parameters of the Pareto Distribution. Metrika, Vol. 16, 126-132.
13. Moothathu, T. S. K. (1985). Sampling Distribution of Lorenz Curve and Gini Index of the Pareto Distribution. Sankhya, Vol. 47(B), 247-278.
14. Moothathu, T. S. K. (1990). The Best Estimator of Lorenz Curve, Gini Index and Theil Entropy Index of Pareto Distribution. Sankhya, Vol. 52(B), 125-127.
15. Lawless, J. F. (1983). Statistical Model for Life Data. John Wiley & Sons.
16. Likes, J. (1969). Minimum Variance Unbiased Estimation of the Parameters of power-function and Pareto's Distribution. Statistische Hefte, 10, 104-110.
17. Lilliefors, H. (1967). On the Kolmogorov-Smirnov test for normality with mean and variance unknown. Journal of American Statistical Association, Vol. 62, 399-402.
18. Lilliefors, H. (1969). On the Kolmogorov-Smirnov test for the exponential distribution with mean unknown. Journal of American Statistical Association, Vol. 64, 387-399.
19. Porter III, J. E., Coleman, J. W. and Moore, A. H. (1992). Modified KS, AD, C-vM tests for the Pareto distribution with unknown location & scale parameters. IEEE Transactions on Reliability, Vol. 41(1), 112-117.
20. Woo, J. and Kang, S. B. (1990). Estimation for Functions of Two Parameters in the Pareto Distribution. Youngnam Statistical Letters, Vol. 1, 67-76.