Estimation for the extreme value distribution under progressive Type-I interval censoring
Sol-Ji Nam 1 · Suk-Bok Kang 2
1,2 Department of Statistics, Yeungnam University
Received 8 March 2014, revised 8 April 2014, accepted 14 April 2014
Abstract
In this paper, we propose estimators for the parameters of the extreme value distribution under progressive Type-I interval censoring, based on an interval method and a mid-point approximation method. Because the log-likelihood function is non-linear in the parameters, we use Taylor series expansions to derive approximate likelihood equations. We compare the proposed estimators in terms of mean squared error through Monte Carlo simulation.
Keywords: Approximate maximum likelihood estimator, extreme value distribution, progressive Type-I interval censoring.
1. Introduction
The probability density function (pdf) and cumulative distribution function (cdf) of a random variable $X$ having an extreme value distribution with location parameter $\mu$ and scale parameter $\sigma$ are given by

$$
f(x) = \frac{1}{\sigma} \exp\left(\frac{x-\mu}{\sigma}\right) \exp\left[-\exp\left(\frac{x-\mu}{\sigma}\right)\right], \quad -\infty < x < \infty, \ \mu > 0, \ \sigma > 0, \tag{1.1}
$$

and

$$
F(x) = 1 - \exp\left[-\exp\left(\frac{x-\mu}{\sigma}\right)\right], \quad -\infty < x < \infty. \tag{1.2}
$$

In most cases of censoring, the maximum likelihood estimators of the parameters cannot be obtained in closed form. Because the likelihood equations do not admit closed-form solutions, it is useful to consider approximations to them that yield explicit estimators. The approximate maximum likelihood estimation method was originally developed by Balakrishnan (1989) for estimating the scale parameter of the Rayleigh distribution. Kang et al. (2001) obtained the approximate maximum likelihood estimators (AMLEs) for the parameters of the three-parameter Weibull distribution. Kang (2003) presented approximate MLEs for the exponential distribution under multiple Type-II censoring. Kang et al. (2014) presented a goodness-of-fit test for the logistic distribution based on multiply Type-II censored samples.

1 Graduate student, Department of Statistics, Yeungnam University, Gyeongsan 712-749, Korea.
2 Corresponding author: Professor, Department of Statistics, Yeungnam University, Gyeongsan 712-749, Korea. E-mail: [email protected]
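The density (1.1) and distribution function (1.2) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `ev_pdf`, `ev_cdf` and `ev_quantile` are hypothetical, and the quantile function is simply the algebraic inverse of (1.2), which is not written out in the text.

```python
import math

def ev_pdf(x, mu, sigma):
    """Density (1.1) of the extreme value distribution."""
    z = (x - mu) / sigma
    return math.exp(z - math.exp(z)) / sigma

def ev_cdf(x, mu, sigma):
    """Distribution function (1.2)."""
    z = (x - mu) / sigma
    return 1.0 - math.exp(-math.exp(z))

def ev_quantile(p, mu, sigma):
    """Inverse of (1.2): the x with F(x) = p, for 0 < p < 1."""
    return mu + sigma * math.log(-math.log(1.0 - p))
```

At the location parameter, `ev_cdf(mu, mu, sigma)` equals 1 - exp(-1) for any scale, which is a quick sanity check on an implementation.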
Aggarwala (2001) introduced the progressive Type-I interval censored sample. Progressive Type-I interval censoring is carried out as follows. Suppose $n$ units are put on a life test simultaneously at time $T_0 = 0$. The units are inspected at pre-set times $T_1 < T_2 < \cdots < T_m$, where $T_m$ is the scheduled end of the experiment. At each inspection time $T_i$, $i = 1, \ldots, m$, $R_i$ surviving units are randomly removed (censored) from the test.
Figure 1.1 Scheme of a progressive Type-I interval censoring
More concretely, $X_1$ is the number of units that fail in $(0, T_1]$, and $R_1$ is the number of surviving units removed at $T_1$, leaving $n - X_1 - R_1$ units on test. Further, $X_2$ is the number of units that fail in the second interval $(T_1, T_2]$, and $R_2$ is the number of units removed at $T_2$, leaving $n - X_1 - X_2 - R_1 - R_2$ units. Finally, $X_m$ is the number of units that fail in $(T_{m-1}, T_m]$, and all the remaining units, $R_m = n - \sum_{i=1}^{m} X_i - \sum_{i=1}^{m-1} R_i$, are removed at time $T_m$. A schematic representation of progressive Type-I interval censoring is presented in Figure 1.1 (see Ng and Wang, 2009).
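The bookkeeping of the scheme can be illustrated with a short Python sketch. The function name `units_on_test` and the numbers in the usage note are hypothetical; the sketch only tracks how many units remain after the failures and removals at each inspection time.

```python
def units_on_test(n, failures, removals):
    """Number of units still on test just after each inspection time T_i.

    failures[i] and removals[i] play the roles of X_{i+1} and R_{i+1}.
    """
    remaining = []
    at_risk = n
    for x, r in zip(failures, removals):
        at_risk -= x + r          # X_i units failed, R_i survivors were removed
        if at_risk < 0:
            raise ValueError("more failures and removals than units on test")
        remaining.append(at_risk)
    return remaining
```

For example, with n = 10, failures (2, 3, 1) and removals (1, 1, 2), the counts after each inspection are 7, 3 and 0, so the last removal count R_3 = 2 is exactly the number of units still alive at T_3.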
Chen and Lio (2010) studied the estimation of the parameters of the generalized exponential distribution; they proposed the mid-point estimators and derived the likelihood function. Shin et al. (2010) presented parameter estimation for the exponential distribution under progressive Type-I interval censoring. Recently, Cho et al. (2013) studied estimation for the generalized exponential distribution under progressive Type-I interval censoring.
In this paper, we derive several AMLEs of the location and scale parameters of an extreme value distribution under progressive Type-I interval censoring. We also compare the proposed estimators in terms of bias and mean squared error (MSE) for different combinations of parameter values, sample sizes, and censoring schemes. In Section 2, we obtain the MLE and AMLEs of the parameters of the extreme value distribution based on a progressive Type-I interval censored sample. In Section 3, we estimate the MSEs of all proposed estimators by Monte Carlo simulation and compare their performance under several censoring schemes.
2. Estimation for parameters
2.1. Mid-point approximation method
Suppose a progressive Type-I interval censored sample is collected, beginning with a random sample of $n$ units with the continuous lifetime distribution (1.2). Then, based on the observed data, Aggarwala (2001) gave the joint likelihood function as follows:

$$
\begin{aligned}
L(\theta) &= C\,[F(T_1, \beta; \theta)]^{X_1}[1 - F(T_1, \beta; \theta)]^{R_1} \times [F(T_2, \beta; \theta) - F(T_1, \beta; \theta)]^{X_2}[1 - F(T_2, \beta; \theta)]^{R_2} \\
&\quad \times \cdots \times [F(T_m, \beta; \theta) - F(T_{m-1}, \beta; \theta)]^{X_m}[1 - F(T_m, \beta; \theta)]^{R_m} \\
&= C \prod_{i=1}^{m} [F(T_i, \beta; \theta) - F(T_{i-1}, \beta; \theta)]^{X_i}[1 - F(T_i, \beta; \theta)]^{R_i},
\end{aligned} \tag{2.1}
$$

where $C = n(n - 1 - R_1)(n - 2 - R_1 - R_2) \cdots (n - m + 1 - R_1 - \cdots - R_{m-1})$ and $T_0 = 0$.
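As an illustration, the logarithm of (2.1) without the constant C can be evaluated as below. This is a minimal sketch with hypothetical function names. It assumes the first factor is $F(T_1)$, as in the first line of (2.1), so the cdf at $T_0$ is taken to be 0.

```python
import math

def ev_cdf(t, mu, sigma):
    """Extreme value cdf (1.2)."""
    return 1.0 - math.exp(-math.exp((t - mu) / sigma))

def log_likelihood(mu, sigma, times, failures, removals):
    """log of (2.1) without the constant C; times holds T_1, ..., T_m."""
    total = 0.0
    prev_cdf = 0.0  # assumption: the first factor of (2.1) is F(T_1) itself
    for t, x, r in zip(times, failures, removals):
        cur_cdf = ev_cdf(t, mu, sigma)
        if x > 0:
            total += x * math.log(cur_cdf - prev_cdf)   # X_i failures in the interval
        if r > 0:
            total += r * math.log(1.0 - cur_cdf)        # R_i units removed at T_i
        prev_cdf = cur_cdf
    return total
```

A numerical optimizer applied to this function would give the exact MLE against which the closed-form approximations can be compared.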
Ng and Wang (2009) introduced the mid-point estimators, which are obtained by assuming that the $X_i$ failures occurred at the center of the interval, $M_i = \frac{1}{2}(T_{i-1} + T_i)$, and that the $R_i$ censored items were censored at the censoring time $T_i$.

By putting $z_i = (M_i - \mu)/\sigma$ and $s_i = (T_i - \mu)/\sigma$, the likelihood function can be rewritten as

$$
L = C \prod_{i=1}^{m} \left[\frac{1}{\sigma} f(z_i)\right]^{X_i} [1 - F(s_i)]^{R_i}, \tag{2.2}
$$
where $f(z) = e^{z}\exp(-e^{z})$ and $F(z) = 1 - \exp(-e^{z})$ are the pdf and the cdf of the standard extreme value distribution. Therefore, we obtain the likelihood equations as follows:

$$
\frac{\partial \ln L}{\partial \mu} \simeq -\frac{1}{\sigma}\left[\sum_{i=1}^{m} X_i \frac{f'(z_i)}{f(z_i)} - \sum_{i=1}^{m} R_i \frac{f(s_i)}{1 - F(s_i)}\right] = 0 \tag{2.3}
$$

and

$$
\frac{\partial \ln L}{\partial \sigma} \simeq -\frac{1}{\sigma}\left[\sum_{i=1}^{m} X_i \left\{1 + \frac{f'(z_i) z_i}{f(z_i)}\right\} - \sum_{i=1}^{m} R_i \frac{f(s_i) s_i}{1 - F(s_i)}\right] = 0. \tag{2.4}
$$
We may expand the following functions in Taylor series around the points $\xi_i = F^{-1}(p_i)$:

$$
\frac{f'(z_i)}{f(z_i)}, \qquad \frac{f(s_i)}{1 - F(s_i)}. \tag{2.5}
$$

First, we can approximate these functions by

$$
\frac{f'(z_i)}{f(z_i)} \simeq -e^{\xi_i} z_i + 1 - e^{\xi_i}(1 - \xi_i), \tag{2.6}
$$

$$
\frac{f(s_i)}{1 - F(s_i)} \simeq e^{\xi_i} s_i + e^{\xi_i}(1 - \xi_i). \tag{2.7}
$$

By substituting the equations (2.6) and (2.7) into the equations (2.3) and (2.4), we obtain the approximate likelihood equations for $\mu$ and $\sigma$ as follows:
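$$
\frac{\partial \ln L}{\partial \mu} \simeq -\frac{1}{\sigma}\left[\sum_{i=1}^{m} X_i \left\{-e^{\xi_i} z_i + 1 - e^{\xi_i}(1 - \xi_i)\right\} - \sum_{i=1}^{m} R_i \left\{e^{\xi_i} s_i + e^{\xi_i}(1 - \xi_i)\right\}\right] = 0 \tag{2.8}
$$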
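and

$$
\begin{aligned}
\frac{\partial \ln L}{\partial \sigma} &\simeq -\frac{1}{\sigma}\left[\sum_{i=1}^{m} X_i + \sum_{i=1}^{m} X_i \left\{-e^{\xi_i} z_i + 1 - e^{\xi_i}(1 - \xi_i)\right\} z_i - \sum_{i=1}^{m} R_i \left\{e^{\xi_i} s_i + e^{\xi_i}(1 - \xi_i)\right\} s_i\right] \\
&= -\frac{1}{\sigma^{3}}\left[\sigma^{2} \sum_{i=1}^{m} X_i + \left\{\sum_{i=1}^{m} X_i \{1 - e^{\xi_i}(1 - \xi_i)\}(M_i - \mu) - \sum_{i=1}^{m} R_i e^{\xi_i}(1 - \xi_i)(T_i - \mu)\right\} \sigma \right. \\
&\qquad\qquad \left. - \sum_{i=1}^{m} X_i e^{\xi_i}(M_i - \mu)^{2} - \sum_{i=1}^{m} R_i e^{\xi_i}(T_i - \mu)^{2}\right] = 0.
\end{aligned} \tag{2.9}
$$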
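Upon solving the equations (2.8) and (2.9) for $\sigma$, we can derive an approximate estimator of $\sigma$ as follows:

$$
\hat{\sigma}_{M1} = \frac{-N + \sqrt{N^{2} - 4J \sum_{i=1}^{m} X_i}}{2 \sum_{i=1}^{m} X_i}, \tag{2.10}
$$

where

$$
N = \sum_{i=1}^{m} X_i \{1 - e^{\xi_i}(1 - \xi_i)\}(M_i - \hat{\mu}_M) - \sum_{i=1}^{m} R_i e^{\xi_i}(1 - \xi_i)(T_i - \hat{\mu}_M),
$$

$$
J = -\sum_{i=1}^{m} X_i e^{\xi_i}(M_i - \hat{\mu}_M)^{2} - \sum_{i=1}^{m} R_i e^{\xi_i}(T_i - \hat{\mu}_M)^{2}.
$$

Since $J$ is always negative, the square root in (2.10) exceeds $|N|$, and hence $\hat{\sigma}_{M1}$ is greater than 0.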
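The estimator (2.10) is the positive root of a quadratic in $\sigma$, which the following Python sketch evaluates for a given location estimate. The function names are hypothetical, and the expansion points $\xi_i$ are passed in directly as an assumption, since the choice of $p_i$ is not spelled out in this section.

```python
import math

def n_and_j(mu_hat, xi, times, failures, removals):
    """Coefficients N and J of (2.10); mid-points use T_0 = 0."""
    n_coef, j_coef, prev_t = 0.0, 0.0, 0.0
    for xi_i, t_i, x_i, r_i in zip(xi, times, failures, removals):
        m_i = 0.5 * (prev_t + t_i)            # mid-point M_i of (T_{i-1}, T_i]
        e = math.exp(xi_i)
        n_coef += x_i * (1.0 - e * (1.0 - xi_i)) * (m_i - mu_hat)
        n_coef -= r_i * e * (1.0 - xi_i) * (t_i - mu_hat)
        j_coef -= x_i * e * (m_i - mu_hat) ** 2 + r_i * e * (t_i - mu_hat) ** 2
        prev_t = t_i
    return n_coef, j_coef

def sigma_m1(mu_hat, xi, times, failures, removals):
    """Positive root of sum(X) * sigma^2 + N * sigma + J = 0, as in (2.10)."""
    sum_x = sum(failures)
    n_coef, j_coef = n_and_j(mu_hat, xi, times, failures, removals)
    disc = n_coef ** 2 - 4.0 * j_coef * sum_x
    return (-n_coef + math.sqrt(disc)) / (2.0 * sum_x)
```

Because J is non-positive and the number of failures is positive, the discriminant is never negative, so the square root is always defined when at least one failure is observed.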
Second, we can also approximate these functions by

$$
\frac{f'(z_i) z_i}{f(z_i)} \simeq a_{1i} z_i + b_{1i}, \tag{2.11}
$$

$$
\frac{f(s_i) s_i}{1 - F(s_i)} \simeq c_{1i} s_i + d_{1i}, \tag{2.12}
$$

where

$$
a_{1i} = 1 - (1 + \xi_i) e^{\xi_i}, \quad b_{1i} = \xi_i^{2} e^{\xi_i}, \quad c_{1i} = (\xi_i + 1) e^{\xi_i}, \quad d_{1i} = -\xi_i^{2} e^{\xi_i}.
$$
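The linearizations (2.6), (2.7), (2.11) and (2.12) can be checked numerically. The sketch below uses hypothetical function names and relies on the standard extreme value identities $f'(z)/f(z) = 1 - e^{z}$ and $f(z)/(1 - F(z)) = e^{z}$, which follow from the definitions of $f$ and $F$ given after (2.2). It compares the exact functions with their first-order expansions around a point $\xi$.

```python
import math

def exact_terms(z):
    """Exact f'/f, f/(1-F), (f'/f)*z and (f/(1-F))*z for the standard EV law."""
    score = 1.0 - math.exp(z)     # f'(z) / f(z)
    hazard = math.exp(z)          # f(z) / (1 - F(z))
    return score, hazard, score * z, hazard * z

def linear_terms(xi, z):
    """Right-hand sides of (2.6), (2.7), (2.11) and (2.12) expanded at xi."""
    e = math.exp(xi)
    a1 = 1.0 - (1.0 + xi) * e
    b1 = xi ** 2 * e
    c1 = (xi + 1.0) * e
    d1 = -xi ** 2 * e
    return (-e * z + 1.0 - e * (1.0 - xi),   # (2.6)
            e * z + e * (1.0 - xi),          # (2.7)
            a1 * z + b1,                     # (2.11)
            c1 * z + d1)                     # (2.12)
```

At z equal to the expansion point the two sets of values coincide, and the error grows quadratically with the distance from it, so the quality of the approximate estimators depends on how close the standardized times are to the chosen $\xi_i$.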
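By substituting the equations (2.11) and (2.12) into the equation (2.4), we obtain the approximate likelihood equation for $\sigma$ as follows:

$$
\begin{aligned}
\frac{\partial \ln L}{\partial \sigma} &\simeq -\frac{1}{\sigma}\left[\sum_{i=1}^{m} X_i + \sum_{i=1}^{m} X_i (a_{1i} z_i + b_{1i}) - \sum_{i=1}^{m} R_i (c_{1i} s_i + d_{1i})\right] \\
&= -\frac{1}{\sigma^{2}}\left[\left\{\sum_{i=1}^{m} X_i + \sum_{i=1}^{m} X_i b_{1i} - \sum_{i=1}^{m} R_i d_{1i}\right\} \sigma + \sum_{i=1}^{m} X_i a_{1i}(M_i - \mu) - \sum_{i=1}^{m} R_i c_{1i}(T_i - \mu)\right] = 0.
\end{aligned} \tag{2.13}
$$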
Upon solving the equations (2.8) and (2.13) for $\sigma$, we can derive an approximate estimator of $\sigma$ as follows:

$$
\hat{\sigma}_{M2} = -\frac{F_2 - F_3 \hat{\mu}_M}{F_1}, \tag{2.14}
$$

where

$$
F_1 = \sum_{i=1}^{m} X_i + \sum_{i=1}^{m} X_i b_{1i} - \sum_{i=1}^{m} R_i d_{1i},
$$

$$
F_2 = \sum_{i=1}^{m} X_i a_{1i} M_i - \sum_{i=1}^{m} R_i c_{1i} T_i,
$$

$$
F_3 = \sum_{i=1}^{m} X_i a_{1i} - \sum_{i=1}^{m} R_i c_{1i}.
$$
From the equations (2.8) and (2.13), we can obtain the estimator of $\mu$ as follows:

$$
\hat{\mu}_M = \frac{E_2 F_1 + E_3 F_2}{E_1 F_1 + E_3 F_3}, \tag{2.15}
$$

where

$$
E_1 = \sum_{i=1}^{m} X_i e^{\xi_i} + \sum_{i=1}^{m} R_i e^{\xi_i},
$$

$$
E_2 = \sum_{i=1}^{m} X_i e^{\xi_i} M_i + \sum_{i=1}^{m} R_i e^{\xi_i} T_i,
$$

$$
E_3 = \sum_{i=1}^{m} X_i \{1 - e^{\xi_i}(1 - \xi_i)\} - \sum_{i=1}^{m} R_i e^{\xi_i}(1 - \xi_i).
$$
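Since (2.8) and (2.13) are both linear in $\mu$ and $\sigma$ once multiplied through by $\sigma$, the mid-point estimates can also be obtained by solving a 2 x 2 linear system. The Python sketch below does this with Cramer's rule. It is an illustration under assumptions rather than the authors' code: the function name is hypothetical, the expansion points $\xi_i$ are supplied by the caller, and the sums `e1`..`e3` and `f1`..`f3` are built directly from the brackets of (2.8) and (2.13).

```python
import math

def midpoint_estimates(xi, times, failures, removals):
    """Solve the linearized equations (2.8) and (2.13) for (mu, sigma)."""
    e1 = e2 = e3 = f1 = f2 = f3 = 0.0
    prev_t = 0.0                                  # T_0 = 0
    for xi_i, t_i, x_i, r_i in zip(xi, times, failures, removals):
        m_i = 0.5 * (prev_t + t_i)                # mid-point M_i
        e = math.exp(xi_i)
        a1 = 1.0 - (1.0 + xi_i) * e
        b1 = xi_i ** 2 * e
        c1 = (xi_i + 1.0) * e
        d1 = -xi_i ** 2 * e
        e1 += (x_i + r_i) * e
        e2 += x_i * e * m_i + r_i * e * t_i
        e3 += x_i * (1.0 - e * (1.0 - xi_i)) - r_i * e * (1.0 - xi_i)
        f1 += x_i * (1.0 + b1) - r_i * d1         # coefficient of sigma in (2.13)
        f2 += x_i * a1 * m_i - r_i * c1 * t_i
        f3 += x_i * a1 - r_i * c1
        prev_t = t_i
    # (2.8):   e1 * mu + e3 * sigma =  e2
    # (2.13): -f3 * mu + f1 * sigma = -f2
    det = e1 * f1 + e3 * f3
    mu = (e2 * f1 + e3 * f2) / det
    sigma = (f3 * e2 - e1 * f2) / det
    return mu, sigma
```

Solving the system directly avoids sign slips when the closed forms are transcribed, and the returned pair can be substituted back into the brackets of (2.8) and (2.13) to confirm that both vanish.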
2.2. Approximate maximum likelihood estimation
In terms of $s_i = (T_i - \mu)/\sigma$, the likelihood function can be rewritten as

$$
L = C \prod_{i=1}^{m} [F(s_i) - F(s_{i-1})]^{X_i} [1 - F(s_i)]^{R_i}. \tag{2.16}
$$
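Hence,

$$
\frac{\partial \ln L}{\partial \mu} \simeq -\frac{1}{\sigma}\left[\sum_{i=1}^{m} X_i \frac{f(s_i) - f(s_{i-1})}{F(s_i) - F(s_{i-1})} - \sum_{i=1}^{m} R_i \frac{f(s_i)}{1 - F(s_i)}\right] = 0 \tag{2.17}
$$

and

$$
\frac{\partial \ln L}{\partial \sigma} \simeq -\frac{1}{\sigma}\left[\sum_{i=1}^{m} X_i \frac{f(s_i) s_i - f(s_{i-1}) s_{i-1}}{F(s_i) - F(s_{i-1})} - \sum_{i=1}^{m} R_i \frac{f(s_i) s_i}{1 - F(s_i)}\right] = 0. \tag{2.18}
$$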
First, we can approximate these functions by

$$
\frac{f(s_i)}{F(s_i) - F(s_{i-1})} \simeq a_{2i} + b_{2i} s_i + c_{2i} s_{i-1}, \tag{2.19}
$$

$$
\frac{f(s_{i-1})}{F(s_i) - F(s_{i-1})} \simeq a_{3i} + b_{3i} s_i + c_{3i} s_{i-1}, \tag{2.20}
$$

and

$$
\frac{f(s_i) - f(s_{i-1})}{F(s_i) - F(s_{i-1})} \simeq a_{4i} + b_{4i} s_i + c_{4i} s_{i-1}, \tag{2.21}
$$

$$
\frac{f(s_i)}{1 - F(s_i)} \simeq e^{\xi_i}(1 - \xi_i) + e^{\xi_i} s_i, \tag{2.22}
$$

where

$$
a_{2i} = \frac{1 - (1 - e^{\xi_i})\xi_i}{p_i - p_{i-1}} f(\xi_i) + \frac{f(\xi_i)\xi_i - f(\xi_{i-1})\xi_{i-1}}{(p_i - p_{i-1})^{2}} f(\xi_i),
$$

$$
b_{2i} = \frac{1 - e^{\xi_i}}{p_i - p_{i-1}} f(\xi_i) - \left[\frac{f(\xi_i)}{p_i - p_{i-1}}\right]^{2}, \qquad c_{2i} = \frac{f(\xi_i) f(\xi_{i-1})}{(p_i - p_{i-1})^{2}},
$$

$$
a_{3i} = \frac{1 - (1 - e^{\xi_{i-1}})\xi_{i-1}}{p_i - p_{i-1}} f(\xi_{i-1}) + \frac{f(\xi_i)\xi_i - f(\xi_{i-1})\xi_{i-1}}{(p_i - p_{i-1})^{2}} f(\xi_{i-1}),
$$

$$
b_{3i} = -\frac{f(\xi_i) f(\xi_{i-1})}{(p_i - p_{i-1})^{2}}, \qquad c_{3i} = \frac{1 - e^{\xi_{i-1}}}{p_i - p_{i-1}} f(\xi_{i-1}) + \left[\frac{f(\xi_{i-1})}{p_i - p_{i-1}}\right]^{2},
$$

$$
a_{4i} = a_{2i} - a_{3i}, \qquad b_{4i} = b_{2i} - b_{3i}, \qquad c_{4i} = c_{2i} - c_{3i}.
$$
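The coefficients in (2.19) are the value and the two partial derivatives of $f(s_i)/\{F(s_i) - F(s_{i-1})\}$ at $(\xi_i, \xi_{i-1})$. The Python sketch below computes them and the exact ratio so the two can be compared. The function names are hypothetical, and $p_i$ is taken to be $F(\xi_i)$, which follows from $\xi_i = F^{-1}(p_i)$.

```python
import math

def f_std(z):
    """Standard extreme value pdf."""
    return math.exp(z - math.exp(z))

def F_std(z):
    """Standard extreme value cdf."""
    return 1.0 - math.exp(-math.exp(z))

def ratio(s_i, s_prev):
    """Exact left-hand side of (2.19)."""
    return f_std(s_i) / (F_std(s_i) - F_std(s_prev))

def coeffs_219(xi_i, xi_prev):
    """Coefficients a_2i, b_2i, c_2i of the linearization (2.19)."""
    dp = F_std(xi_i) - F_std(xi_prev)        # p_i - p_{i-1}
    fi, fp = f_std(xi_i), f_std(xi_prev)
    e = math.exp(xi_i)
    a2 = (1.0 - (1.0 - e) * xi_i) * fi / dp + (fi * xi_i - fp * xi_prev) * fi / dp ** 2
    b2 = (1.0 - e) * fi / dp - (fi / dp) ** 2
    c2 = fi * fp / dp ** 2
    return a2, b2, c2
```

The same pattern (value at the expansion point plus first-order partial derivatives) gives the remaining coefficient sets, so one checked helper of this kind is a useful template for the others.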
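By substituting the equations (2.21) and (2.22) into the equation (2.17), we obtain the approximate likelihood equation for $\mu$ as follows:

$$
\begin{aligned}
\frac{\partial \ln L}{\partial \mu} &\simeq -\frac{1}{\sigma}\left[\sum_{i=1}^{m} X_i (a_{4i} + b_{4i} s_i + c_{4i} s_{i-1}) - \sum_{i=1}^{m} R_i \left\{e^{\xi_i}(1 - \xi_i) + e^{\xi_i} s_i\right\}\right] \\
&= -\frac{1}{\sigma^{2}}\left(-\mu A_1 + A_2 + A_3 \sigma\right) = 0,
\end{aligned} \tag{2.23}
$$

where

$$
A_1 = \sum_{i=1}^{m} X_i b_{4i} + \sum_{i=1}^{m} X_i c_{4i} - \sum_{i=1}^{m} R_i e^{\xi_i},
$$

$$
A_2 = \sum_{i=1}^{m} X_i b_{4i} T_i + \sum_{i=1}^{m} X_i c_{4i} T_{i-1} - \sum_{i=1}^{m} R_i e^{\xi_i} T_i,
$$

$$
A_3 = \sum_{i=1}^{m} X_i a_{4i} - \sum_{i=1}^{m} R_i e^{\xi_i}(1 - \xi_i).
$$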
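By substituting the equations (2.19), (2.20) and (2.22) into the equation (2.18), we obtain the approximate likelihood equation for $\sigma$ as follows:

$$
\begin{aligned}
\frac{\partial \ln L}{\partial \sigma} \simeq -\frac{1}{\sigma}\Bigg[&\sum_{i=1}^{m} X_i \left\{(a_{2i} + b_{2i} s_i + c_{2i} s_{i-1}) s_i - (a_{3i} + b_{3i} s_i + c_{3i} s_{i-1}) s_{i-1}\right\} \\
&- \sum_{i=1}^{m} R_i \left\{e^{\xi_i}(1 - \xi_i) + e^{\xi_i} s_i\right\} s_i\Bigg] = 0.
\end{aligned} \tag{2.24}
$$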
Upon solving the equations (2.23) and (2.24) for $\sigma$, we can derive an approximate estimator of $\sigma$ as follows:

$$
\hat{\sigma}_1 = \frac{B_2 + B_3 \hat{\mu} - A_1 \hat{\mu}^{2}}{B_1 - A_3 \hat{\mu}}, \tag{2.25}
$$

where

$$
B_1 = \sum_{i=1}^{m} X_i a_{2i} T_i - \sum_{i=1}^{m} X_i a_{3i} T_{i-1} - \sum_{i=1}^{m} R_i e^{\xi_i}(1 - \xi_i) T_i,
$$

$$
B_2 = -\sum_{i=1}^{m} X_i b_{2i} T_i^{2} - \sum_{i=1}^{m} X_i c_{2i} T_i T_{i-1} + \sum_{i=1}^{m} X_i b_{3i} T_i T_{i-1} + \sum_{i=1}^{m} X_i c_{3i} T_{i-1}^{2} + \sum_{i=1}^{m} R_i e^{\xi_i} T_i^{2},
$$

$$
B_3 = 2\sum_{i=1}^{m} X_i b_{2i} T_i + \sum_{i=1}^{m} X_i c_{2i}(T_{i-1} + T_i) - \sum_{i=1}^{m} X_i b_{3i}(T_{i-1} + T_i) - 2\sum_{i=1}^{m} X_i c_{3i} T_{i-1} - 2\sum_{i=1}^{m} R_i e^{\xi_i} T_i.
$$
Second, we can also approximate these functions by

$$
\frac{f(s_i) s_i}{F(s_i) - F(s_{i-1})} \simeq a_{5i} + b_{5i} s_i + c_{5i} s_{i-1}, \tag{2.26}
$$

$$
\frac{f(s_{i-1}) s_{i-1}}{F(s_i) - F(s_{i-1})} \simeq a_{6i} + b_{6i} s_i + c_{6i} s_{i-1}, \tag{2.27}
$$

$$
\frac{f(s_i) s_i - f(s_{i-1}) s_{i-1}}{F(s_i) - F(s_{i-1})} \simeq a_{7i} + b_{7i} s_i + c_{7i} s_{i-1}, \tag{2.28}
$$

$$
\frac{f(s_i) s_i}{1 - F(s_i)} \simeq -e^{\xi_i} \xi_i^{2} + e^{\xi_i}(1 + \xi_i) s_i, \tag{2.29}
$$

where

$$
a_{5i} = -\frac{(1 - e^{\xi_i})\xi_i}{p_i - p_{i-1}} f(\xi_i)\xi_i + \frac{f(\xi_i)\xi_i - f(\xi_{i-1})\xi_{i-1}}{(p_i - p_{i-1})^{2}} f(\xi_i)\xi_i,
$$

$$
b_{5i} = \frac{(1 - e^{\xi_i})\xi_i + 1}{p_i - p_{i-1}} f(\xi_i) - \xi_i \left[\frac{f(\xi_i)}{p_i - p_{i-1}}\right]^{2}, \qquad c_{5i} = \frac{f(\xi_i) f(\xi_{i-1}) \xi_i}{(p_i - p_{i-1})^{2}},
$$

$$
a_{6i} = -\frac{(1 - e^{\xi_{i-1}})\xi_{i-1}}{p_i - p_{i-1}} f(\xi_{i-1})\xi_{i-1} + \frac{f(\xi_i)\xi_i - f(\xi_{i-1})\xi_{i-1}}{(p_i - p_{i-1})^{2}} f(\xi_{i-1})\xi_{i-1},
$$

$$
b_{6i} = -\frac{f(\xi_i) f(\xi_{i-1}) \xi_{i-1}}{(p_i - p_{i-1})^{2}}, \qquad c_{6i} = \frac{(1 - e^{\xi_{i-1}})\xi_{i-1} + 1}{p_i - p_{i-1}} f(\xi_{i-1}) + \xi_{i-1} \left[\frac{f(\xi_{i-1})}{p_i - p_{i-1}}\right]^{2},
$$

$$
a_{7i} = a_{5i} - a_{6i}, \qquad b_{7i} = b_{5i} - b_{6i}, \qquad c_{7i} = c_{5i} - c_{6i}.
$$
By substituting the equations (2.28) and (2.29) into the equation (2.18), we obtain the approximate likelihood equation for $\sigma$ as follows:

$$
\frac{\partial \ln L}{\partial \sigma} \simeq -\frac{1}{\sigma}\left[\sum_{i=1}^{m} X_i (a_{7i} + b_{7i} s_i + c_{7i} s_{i-1}) - \sum_{i=1}^{m} R_i \left\{-e^{\xi_i} \xi_i^{2} + e^{\xi_i}(1 + \xi_i) s_i\right\}\right] = 0. \tag{2.30}
$$
Upon solving the equations (2.23) and (2.30) for $\sigma$, we can derive an approximate estimator of $\sigma$ as follows:

$$
\hat{\sigma}_2 = \frac{C_3 \hat{\mu} - C_2}{C_1}, \tag{2.31}
$$

where

$$
C_1 = \sum_{i=1}^{m} X_i a_{7i} + \sum_{i=1}^{m} R_i e^{\xi_i} \xi_i^{2},
$$

$$
C_2 = \sum_{i=1}^{m} X_i b_{7i} T_i + \sum_{i=1}^{m} X_i c_{7i} T_{i-1} - \sum_{i=1}^{m} R_i e^{\xi_i}(1 + \xi_i) T_i,
$$

$$
C_3 = \sum_{i=1}^{m} X_i b_{7i} + \sum_{i=1}^{m} X_i c_{7i} - \sum_{i=1}^{m} R_i e^{\xi_i}(1 + \xi_i).
$$
We can also obtain the following estimators by using equations (2.23), (2.25) and (2.31):

$$
\hat{\mu}_{I1} = \frac{A_2 B_1 + A_3 B_2}{A_1 B_1 + A_2 A_3 - A_3 B_3}, \tag{2.32}
$$

$$
\hat{\mu}_{I2} = \frac{A_2 C_1 - A_3 C_2}{A_1 C_1 - A_3 C_3}, \tag{2.33}
$$

$$
\hat{\sigma}_{I1} = \frac{C_3 \hat{\mu}_{I1} - C_2}{C_1}, \tag{2.34}
$$

$$
\hat{\sigma}_{I2} = \frac{B_2 + B_3 \hat{\mu}_{I2} - A_1 \hat{\mu}_{I2}^{2}}{B_1 - A_3 \hat{\mu}_{I2}}. \tag{2.35}
$$
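As a worked step, (2.33) can be recovered by eliminating $\sigma$ between (2.23) and (2.31). The display below is a sketch of that algebra and assumes $A_3 \neq 0$, $C_1 \neq 0$ and $A_1 C_1 \neq A_3 C_3$.

$$
\begin{aligned}
\sigma &= \frac{A_1 \mu - A_2}{A_3} && \text{from (2.23)}, \\
\sigma &= \frac{C_3 \mu - C_2}{C_1} && \text{from (2.31)}, \\
C_1 (A_1 \mu - A_2) &= A_3 (C_3 \mu - C_2) && \text{equating the two expressions}, \\
\mu &= \frac{A_2 C_1 - A_3 C_2}{A_1 C_1 - A_3 C_3}.
\end{aligned}
$$

Equating the expression from (2.23) with (2.25) instead makes the $\mu^{2}$ terms cancel, which is why (2.32) is also linear in the $A$ and $B$ sums.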
3. Simulation study
We compare the proposed estimators in terms of their mean squared errors through Monte Carlo simulation for various censoring schemes. The simulation procedure is repeated 5,000 times for sample sizes n = 30 and 50 and for various choices of censoring scheme.
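The comparison criterion can be made concrete with a small helper that turns a list of simulated estimates into an empirical bias and MSE. The name `bias_and_mse` is hypothetical, and the sketch assumes each replicate yields one estimate of a parameter whose true value is known.

```python
def bias_and_mse(estimates, true_value):
    """Empirical bias and mean squared error over Monte Carlo replicates."""
    k = len(estimates)
    bias = sum(e - true_value for e in estimates) / k
    mse = sum((e - true_value) ** 2 for e in estimates) / k
    return bias, mse
```

With 5,000 replicates per setting, `estimates` would hold 5,000 values of one estimator for one combination of sample size and censoring scheme.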
Following Aggarwala (2001), the progressive Type-I interval censored samples were generated by the following algorithm, in which $X_1 \sim \mathrm{Bin}(n, F(T_1))$ and, for $i = 2, 3, \ldots, m$,

$$
\begin{aligned}
X_i \mid X_{i-1}, \ldots, X_1, R_{i-1}, \ldots, R_1
&\sim \mathrm{Bin}\left(n - \sum_{j=1}^{i-1}(X_j + R_j), \ \frac{F(T_i) - F(T_{i-1})}{1 - \sum_{j=1}^{i-1}[F(T_j) - F(T_{j-1})]}\right) \\
&= \mathrm{Bin}\left(n - \sum_{j=1}^{i-1}(X_j + R_j), \ \frac{F(T_i) - F(T_{i-1})}{1 - F(T_{i-1})}\right).
\end{aligned} \tag{3.1}
$$
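The sampling scheme (3.1) can be sketched in Python as follows. Several details are assumptions made only for illustration: the function names are hypothetical, the rule for choosing $R_i$ (a fixed proportion of the surviving units, rounded down, with all survivors removed at $T_m$) is not taken from this section, the cdf at $T_0$ is set to 0 so that $X_1 \sim \mathrm{Bin}(n, F(T_1))$, and the binomial draw is a plain sum of Bernoulli trials.

```python
import math
import random

def ev_cdf(t, mu, sigma):
    """Extreme value cdf (1.2)."""
    return 1.0 - math.exp(-math.exp((t - mu) / sigma))

def draw_binomial(rng, trials, prob):
    """Bin(trials, prob) as a sum of Bernoulli draws (fine for small n)."""
    return sum(1 for _ in range(trials) if rng.random() < prob)

def generate_sample(n, times, removal_props, mu, sigma, rng):
    """Draw (X_1..X_m, R_1..R_m) following the conditional law (3.1)."""
    failures, removals = [], []
    at_risk = n                      # n - sum of earlier X_j + R_j
    prev_cdf = 0.0                   # assumption: X_1 ~ Bin(n, F(T_1))
    last = len(times) - 1
    for i, t in enumerate(times):
        cur_cdf = ev_cdf(t, mu, sigma)
        prob = (cur_cdf - prev_cdf) / (1.0 - prev_cdf)
        x = draw_binomial(rng, at_risk, prob)
        at_risk -= x
        # hypothetical removal rule: a proportion of survivors, all at T_m
        r = at_risk if i == last else math.floor(removal_props[i] * at_risk)
        at_risk -= r
        failures.append(x)
        removals.append(r)
        prev_cdf = cur_cdf
    return failures, removals
```

A call such as `generate_sample(30, [1.0, 2.0, 3.0, 4.0], [0.25, 0.25, 0.25, 1.0], 2.5, 1.0, random.Random(2014))` returns one simulated data set whose failures and removals always add up to n.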
The m binomial random variables were generated by the following steps.
step 1. Initialize i = 0, xsum = 0, rsum = 0.
step 2. i = i + 1.
step 3. If i = m, exit the algorithm.
step 4. Generate $X_i \sim \mathrm{Bin}\left(n - xsum - rsum, \ \dfrac{F(T_i) - F(T_{i-1})}{1 - F(T_{i-1})}\right)$.