Bayesian inference in finite population sampling under measurement error model †

You Mee Goo ¹ · Dal Ho Kim ²

¹,² Department of Statistics, Kyungpook National University

Received 3 October 2012, revised 7 November 2012, accepted 12 November 2012

Abstract

The paper considers empirical Bayes (EB) and hierarchical Bayes (HB) predictors of the finite population mean under a linear regression model with measurement errors.

We discuss how to calculate the mean squared prediction errors of the EB predictors using jackknife methods and the posterior standard deviations of the HB predictors based on the Markov Chain Monte Carlo methods. A simulation study is provided to illustrate the results of the preceding sections and compare the performances of the proposed procedures.

Keywords: Empirical Bayes, finite population mean, Gibbs sampler, hierarchical Bayes, jackknife method, mean squared prediction error, posterior standard deviation.

1. Introduction

We consider a finite population $U$ with units labeled $1, 2, \ldots, N$. Let $y_i$ denote the value of a single characteristic attached to unit $i$. The vector $y = (y_1, \ldots, y_N)^T$ is the unknown state of nature and is assumed to belong to $\Theta = R^N$. Here we are concerned exclusively with the finite population mean $\gamma = N^{-1} \sum_{i=1}^{N} y_i$. A subset $s$ of $\{1, 2, \ldots, N\}$ is called a sample.

A sample $s$ of size $n$ is selected from $U$ according to some specified sampling plan, and we let $\bar{s} = U - s$ be the unobserved part of $U$.

In this paper we use the superpopulation approach (Cassel et al., 1977) to survey sampling. Under this perspective, according to the conditionality principle (Basu, 1975), the sampling plan is not relevant for inference. Extensive bibliographies on the model-based superpopulation approach are given in Bolfarine and Zacks (1992). Throughout, we use the notation $y = (y_1, \ldots, y_n, y_{n+1}, \ldots, y_N)^T$ and $y = (y_s^T, y_{\bar{s}}^T)^T$ with $y_s = (y_1, \ldots, y_n)^T$ and $y_{\bar{s}} = (y_{n+1}, \ldots, y_N)^T$. We also write $1_{N-n}$ for the $(N-n)$-dimensional column vector of ones, $J_{N-n} = 1_{N-n} 1_{N-n}^T$, and $I_{N-n}$ for the identity matrix of order $N - n$.

Measurement errors may occur when the measuring device is biased or inaccurate. In human populations, respondents may not possess accurate information or may give biased information. As shown in Table 1.1 of Fuller (1987), even simple characteristics such as sex or age may be subject to measurement error. More complex population characteristics such as unemployment, income or salary may present a much more serious measurement bias. According to Fuller (1987), measurement error accounts for about 15 percent of the total variation for income. The effect of such errors on estimated regression coefficients has long been recognized as a serious problem. Cochran (1968) and Fuller (1975, 1987) report the distortions introduced into regression coefficient estimates when the variables in the regression equation are measured with error. There has been little explicit analytical treatment of the prediction problem for the finite population mean under regression superpopulation models with measurement errors.

† This research was supported by Kyungpook National University Research Fund, 2010.

¹ Ph.D. candidate, Department of Statistics, Kyungpook National University, Daegu 702-701, Korea.

² Corresponding author: Professor, Department of Statistics, Kyungpook National University, Daegu 702-701, Korea. E-mail: [email protected]

We consider the superpopulation model

$$y_i = \alpha + \beta x_i + e_i, \quad i = 1, \ldots, N; \tag{1.1}$$

$$X_i = x_i + \eta_i, \quad i = 1, \ldots, N. \tag{1.2}$$

It is assumed that the $x_i$, $\eta_i$ and $e_i$ are mutually independent with $x_i \overset{iid}{\sim} N(\mu_x, \sigma_x^2)$, $\eta_i \overset{iid}{\sim} N(0, \sigma_\eta^2)$ and $e_i \overset{iid}{\sim} N(0, \sigma_e^2)$. The vector of model parameters is denoted by $\phi = (\alpha, \beta, \mu_x, \sigma_x^2, \sigma_\eta^2, \sigma_e^2)^T$. We also assume that $X = (X_1, X_2, \ldots, X_N)^T$ is known. As with $y$, we write $X = (X_s^T, X_{\bar{s}}^T)^T$ with $X_s = (X_1, \ldots, X_n)^T$ and $X_{\bar{s}} = (X_{n+1}, \ldots, X_N)^T$.
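As an illustration of the superpopulation model (1.1) and (1.2), the following minimal sketch generates one finite population. The function name `simulate_population` and the seed are assumptions made for illustration only; the parameter values are the ones used later in Section 3, but a different random seed will not reproduce the population analyzed there.

```python
import numpy as np

def simulate_population(N, alpha, beta, mu_x, sigma2_x, sigma2_eta, sigma2_e, seed=0):
    """Draw (y, X, x) for N units from model (1.1)-(1.2). Hypothetical helper."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu_x, np.sqrt(sigma2_x), size=N)      # true covariate x_i
    eta = rng.normal(0.0, np.sqrt(sigma2_eta), size=N)   # measurement error eta_i
    e = rng.normal(0.0, np.sqrt(sigma2_e), size=N)       # regression error e_i
    y = alpha + beta * x + e                             # equation (1.1)
    X = x + eta                                          # equation (1.2)
    return y, X, x

y, X, x = simulate_population(N=200, alpha=1.0, beta=2.0, mu_x=5.0,
                              sigma2_x=10.0, sigma2_eta=10.0, sigma2_e=10.0)
gamma = y.mean()  # finite population mean of this simulated population
```

Only `y` for the sampled units and `X` for all units would be available to the analyst; `x` is returned here purely so that the simulation can be inspected.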

Bolfarine and Cordani (1993) provided likelihood-based inference on the slope parameter of a simple linear regression model with measurement errors when the reliability ratio is known. Bolfarine et al. (1996) considered the frequentist approach to the prediction of the finite population total under a regression superpopulation model with measurement errors. They studied the asymptotic behavior of the naive predictor based on the ordinary least squares estimator as well as a bias-adjusted estimator, establishing asymptotic normality. Later, Torabi et al. (2009) showed that the naive predictor given in Bolfarine et al. (1996) is essentially identical to the EB predictor.

In this article, we consider EB and HB predictors of the finite population mean when the covariate, say x, is measured with error. We also assume that x is stochastic. The EB procedure needs to estimate the hyperparameters but does not require any approximation of the posterior. The HB procedure likewise does not rely on any normal approximation.

The outline of the remaining sections is as follows. In Section 2, we provide the EB and HB predictors of the finite population mean when the covariates are measured with error. We also discuss how to calculate the mean squared prediction error (MSPE) of the EB predictors using jackknife methods and the posterior standard deviation of the HB predictors based on Markov Chain Monte Carlo (MCMC) methods. In Section 3, a simulation study is provided to illustrate the results of the preceding sections and compare the performances of the proposed procedures. Finally, we provide the summary and conclusion.

2. EB and HB predictors of the finite population mean

A sample $s$ of size $n$ is drawn from the finite population and the sample data are denoted by $(y_i, X_i; i \in s)$. From (1.1) and (1.2), the incidental parameters $x_i$ can be eliminated in such a way that $(y_i, X_i)$ has a bivariate normal distribution with

$$\begin{pmatrix} y_i \\ X_i \end{pmatrix} \sim N\left[ \begin{pmatrix} \alpha + \beta\mu_x \\ \mu_x \end{pmatrix}, \begin{pmatrix} \beta^2\sigma_x^2 + \sigma_e^2 & \beta\sigma_x^2 \\ \beta\sigma_x^2 & \sigma_x^2 + \sigma_\eta^2 \end{pmatrix} \right], \quad i = 1, \ldots, N.$$

Using the well-known properties of the bivariate normal distribution, it follows that

$$y_i \mid X_i \overset{ind}{\sim} N[\mu_y + \beta k_x (X_i - \mu_x),\ \sigma_e^2 + \beta^2\sigma_x^2(1 - k_x)], \quad i = 1, \ldots, N,$$

where $\mu_y = \alpha + \beta\mu_x$ and $k_x = \sigma_x^2/(\sigma_x^2 + \sigma_\eta^2)$. We are interested in the estimation of the finite population mean $\gamma = N^{-1}\sum_{i=1}^{N} y_i$ from the sample data. It can be rewritten as $\gamma = N^{-1}(1_n^T y_s + 1_{N-n}^T y_{\bar{s}})$.
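The conditional moments above follow from the standard bivariate normal formulas, and a small numerical check may help. The sketch below builds the joint covariance matrix, derives the conditional slope and variance from it, and compares them with $\beta k_x$ and $\sigma_e^2 + \beta^2\sigma_x^2(1 - k_x)$. The function name `conditional_moments` and the numerical values are assumptions chosen for illustration.

```python
import numpy as np

def conditional_moments(beta, sigma2_x, sigma2_eta, sigma2_e):
    """Slope and variance of y_i | X_i computed from the joint covariance matrix."""
    cov = np.array([[beta**2 * sigma2_x + sigma2_e, beta * sigma2_x],
                    [beta * sigma2_x, sigma2_x + sigma2_eta]])
    slope = cov[0, 1] / cov[1, 1]                     # Cov(y, X) / Var(X)
    cond_var = cov[0, 0] - cov[0, 1]**2 / cov[1, 1]   # Var(y) - Cov(y, X)^2 / Var(X)
    k_x = sigma2_x / (sigma2_x + sigma2_eta)          # reliability ratio k_x
    return k_x, slope, cond_var

k_x, slope, cond_var = conditional_moments(beta=2.0, sigma2_x=10.0,
                                           sigma2_eta=10.0, sigma2_e=10.0)
closed_form_var = 10.0 + 2.0**2 * 10.0 * (1 - k_x)   # sigma_e^2 + beta^2 sigma_x^2 (1 - k_x)
```

With these values the slope equals $\beta k_x$ and the two variance expressions agree, which is the attenuation effect of measurement error on the regression of $y_i$ on $X_i$.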

First we derive the EB predictor of $\gamma$. The Bayes predictor of $\gamma$ under squared error loss is

$$\hat{\gamma}^{B} = E(\gamma \mid y_s, X, \phi) = (1 - f)\bar{y}_s + N^{-1} 1_{N-n}^T E(y_{\bar{s}} \mid y_s),$$

where $f = (N - n)/N$ is the finite population correction factor and $\bar{y}_s = n^{-1}\sum_{i=1}^{n} y_i$. The basic problem in finite population sampling is to draw predictive inference about $y_{\bar{s}}$ conditional on $y_s$. Since the conditional distribution of $y_{\bar{s}}$ given $y_s$ is

$$y_{\bar{s}} \mid y_s \sim N[\mu_y 1_{N-n} + \beta k_x (X_{\bar{s}} - \mu_x 1_{N-n}),\ \{\sigma_e^2 + \beta^2\sigma_x^2(1 - k_x)\} I_{N-n}],$$

we have $E(y_{\bar{s}} \mid y_s) = \mu_y 1_{N-n} + \beta k_x (X_{\bar{s}} - \mu_x 1_{N-n})$. Thus the Bayes predictor of $\gamma$ is given by

$$\hat{\gamma}^{B} = (1 - f)\bar{y}_s + f\mu_y + f\beta k_x(\bar{X}_{\bar{s}} - \mu_x), \tag{2.1}$$

where $\bar{X}_{\bar{s}}$ is the mean of the $X_i$ over the unobserved units $i \in \bar{s}$. Also, the posterior variance of $\gamma$ given $\phi$ is

$$V(\gamma \mid y_s, X, \phi) = \frac{1}{N} f \{\sigma_e^2 + \beta^2\sigma_x^2(1 - k_x)\}. \tag{2.2}$$

Note that $V(\gamma \mid y_s, X, \phi)$ does not depend on $y_s$ and $X$. Hence the MSPE of $\hat{\gamma}^{B}$, $E(\hat{\gamma}^{B} - \gamma)^2$, is equal to the posterior variance of $\gamma$. Also, note that the posterior variance of $\gamma$ depends only on $\delta = (\beta, \sigma_x^2, \sigma_\eta^2, \sigma_e^2)^T$. We denote $g_1(\delta) \equiv MSPE(\hat{\gamma}^{B}) = E(\hat{\gamma}^{B} - \gamma)^2$. If $N$ is large and $n/N \approx 0$, then $g_1(\delta) \approx 0$.
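To make (2.1) and (2.2) concrete, here is a minimal sketch that evaluates the Bayes predictor and $g_1(\delta)$ for known parameters. The function names `bayes_predictor` and `g1` and the toy inputs are assumptions made for illustration, not part of the original development.

```python
import numpy as np

def bayes_predictor(y_s, X_nonsample, N, alpha, beta, mu_x, sigma2_x, sigma2_eta):
    """Bayes predictor (2.1) of the finite population mean for known parameters."""
    n = len(y_s)
    f = (N - n) / N                                   # f = (N - n) / N
    k_x = sigma2_x / (sigma2_x + sigma2_eta)
    mu_y = alpha + beta * mu_x
    return ((1 - f) * np.mean(y_s) + f * mu_y
            + f * beta * k_x * (np.mean(X_nonsample) - mu_x))

def g1(n, N, beta, sigma2_x, sigma2_eta, sigma2_e):
    """Posterior variance (2.2), which equals the MSPE of the Bayes predictor."""
    f = (N - n) / N
    k_x = sigma2_x / (sigma2_x + sigma2_eta)
    return f / N * (sigma2_e + beta**2 * sigma2_x * (1 - k_x))

# Toy inputs: two sampled y values and three unsampled X values.
pred = bayes_predictor([10.0, 12.0], [5.0, 5.0, 5.0], N=5, alpha=1.0, beta=2.0,
                       mu_x=5.0, sigma2_x=10.0, sigma2_eta=10.0)
mspe_bayes = g1(n=10, N=200, beta=2.0, sigma2_x=10.0, sigma2_eta=10.0, sigma2_e=10.0)
```

In the toy call the unsampled covariate mean equals $\mu_x$, so the last term of (2.1) vanishes and the predictor is a weighted average of $\bar{y}_s$ and $\mu_y$.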

The EB predictor $\hat{\gamma}^{EB}$ of $\gamma$ is obtained by replacing $\phi$ in the Bayes predictor $\hat{\gamma}^{B}$ by a consistent estimator $\hat{\phi}$. The components of $\phi$ are unknown and need to be estimated from the data. Let $\bar{X}_s = n^{-1}\sum_{i=1}^{n} X_i$, $SS_X = \sum_{i=1}^{n}(X_i - \bar{X}_s)^2$, $SS_y = \sum_{i=1}^{n}(y_i - \bar{y}_s)^2$, $S_{yX} = \sum_{i=1}^{n}(X_i - \bar{X}_s)(y_i - \bar{y}_s)$, $MS_X = (n-1)^{-1}SS_X$, $MS_y = (n-1)^{-1}SS_y$ and $MS_{yX} = (n-1)^{-1}S_{yX}$.

Under some regularity conditions, $\bar{y}_s$ and $\bar{X}_s$ are consistent estimators of $\mu_y$ and $\mu_x$, respectively, i.e., $\hat{\mu}_y = \bar{y}_s$, $\hat{\mu}_x = \bar{X}_s$. Under the superpopulation model (1.1) and (1.2), it can be shown that (see Fuller, 1987)

$$E[\hat{\beta}_{OLS}] = E\left(\frac{S_{yX}}{SS_X}\right) = k_x\beta,$$

where $\hat{\beta}_{OLS}$ is the ordinary least-squares estimator of $\beta$, and thus $k_x\beta$ is consistently estimated by $\hat{\beta}_{OLS}$. Thus the EB predictor of $\gamma$ is given by

$$\hat{\gamma}^{EB} = \bar{y}_s + f(\bar{X}_{\bar{s}} - \bar{X}_s)\hat{\beta}_{OLS}. \tag{2.3}$$

Now we obtain a nearly unbiased estimator of $MSPE(\hat{\gamma}^{EB}) = E(\hat{\gamma}^{EB} - \gamma)^2$, using the jackknife methods proposed by Jiang et al. (2002) and Chen and Lahiri (2002). We have the following orthogonal decomposition:

$$MSPE(\hat{\gamma}^{EB}) = E(\hat{\gamma}^{B} - \gamma)^2 + E(\hat{\gamma}^{EB} - \hat{\gamma}^{B})^2 = M_1 + M_2, \tag{2.4}$$

where $M_1 = g_1(\delta)$ is given by (2.2).
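The EB predictor (2.3) needs only sample moments and the mean of the observed covariate over the unsampled units. A minimal sketch follows; the function name `eb_predictor` and the toy data are assumptions made for illustration.

```python
import numpy as np

def eb_predictor(y_s, X_s, X_nonsample_mean, N):
    """EB predictor (2.3): ybar_s + f * (Xbar_nonsample - Xbar_s) * beta_OLS."""
    y_s = np.asarray(y_s, dtype=float)
    X_s = np.asarray(X_s, dtype=float)
    n = len(y_s)
    f = (N - n) / N
    S_yX = np.sum((X_s - X_s.mean()) * (y_s - y_s.mean()))
    SS_X = np.sum((X_s - X_s.mean()) ** 2)
    beta_ols = S_yX / SS_X            # estimates k_x * beta, not beta itself
    return y_s.mean() + f * (X_nonsample_mean - X_s.mean()) * beta_ols

# Toy data lying exactly on y = 1 + 2 X, so the OLS slope is 2.
pred_eb = eb_predictor(y_s=[3.0, 5.0, 7.0], X_s=[1.0, 2.0, 3.0],
                       X_nonsample_mean=4.0, N=6)
```

Because $\hat{\beta}_{OLS}$ targets the attenuated slope $k_x\beta$ directly, no separate estimate of $\sigma_\eta^2$ is needed for the point predictor.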

A plug-in estimator of $g_1(\delta)$ is $g_1(\hat{\delta})$. Following Fuller (1987), when $\sigma_\eta^2$ is assumed to be known, the estimators are given by $\hat{\sigma}_x^2 = MS_X - \sigma_\eta^2$ and $\hat{\sigma}_e^2 = MS_y - \hat{\beta} MS_{yX}$, where $\hat{\beta} = MS_{yX}/\{MS_X - \sigma_\eta^2\}$. We apply the jackknife method of bias reduction to $g_1(\hat{\delta})$ to get a nearly unbiased estimator of $M_1 = g_1(\delta)$. Let $\hat{\phi}_{-l}$ be the estimator of $\phi$ obtained by deleting the $l$th observation $(y_s^{(l)}, X_s^{(l)})$ from the full data set $(y_i, X_i; i \in s)$ and then applying the method of moments. This calculation is done for each $l$ in turn to get $n$ estimators of $\phi$: $(\hat{\phi}_{-l}; l = 1, \ldots, n)$. A jackknife estimator of $M_1$ is given by

$$\hat{M}_{1J} = g_1(\hat{\delta}) - \sum_{l=1}^{n} \frac{n-1}{n}\{g_1(\hat{\delta}_{-l}) - g_1(\hat{\delta})\}. \tag{2.5}$$

Turning to jackknife estimation of the last term, $M_2$, in (2.4), let $\hat{\gamma}^{B} = k(y_s, X_s, \phi)$ be the Bayes predictor expressed as a function of $y_s$, $X_s$ and $\phi$, so that $\hat{\gamma}^{EB} = k(y_s, X_s, \hat{\phi})$. Now replace $\hat{\phi}$ by $\hat{\phi}_{-l}$ to get $\hat{\gamma}^{EB}_{-l} = k(y_s, X_s, \hat{\phi}_{-l})$, $l = 1, \ldots, n$. Then a jackknife estimator of $M_2$ is given by

$$\hat{M}_{2J} = \sum_{l=1}^{n} \frac{n-1}{n}(\hat{\gamma}^{EB}_{-l} - \hat{\gamma}^{EB})^2. \tag{2.6}$$

By taking the sum of (2.5) and (2.6), a jackknife estimator of $MSPE(\hat{\gamma}^{EB})$ is obtained as

$$mspe_J(\hat{\gamma}^{EB}) = \hat{M}_{1J} + \hat{M}_{2J}. \tag{2.7}$$

Next, we consider a hierarchical Bayesian framework to predict the population mean $\gamma$.
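Before moving on to the hierarchical model, the jackknife recipe in (2.5) to (2.7) can be sketched in code. This is a minimal illustration under stated assumptions: $\sigma_\eta^2$ is treated as known, the moment estimators are written in terms of mean squares, variance estimates are truncated at a small positive value (a safeguard added here, not taken from the text), and all function names and the simulated data are hypothetical.

```python
import numpy as np

def moments(y, X, sigma2_eta):
    """Method-of-moments estimates with sigma2_eta known (truncation is an assumption)."""
    ms_x, ms_y = np.var(X, ddof=1), np.var(y, ddof=1)
    ms_yx = np.cov(X, y, ddof=1)[0, 1]
    sigma2_x = max(ms_x - sigma2_eta, 1e-8)
    beta = ms_yx / sigma2_x
    sigma2_e = max(ms_y - beta * ms_yx, 1e-8)
    return dict(mu_y=y.mean(), mu_x=X.mean(), beta_ols=ms_yx / ms_x,
                beta=beta, sigma2_x=sigma2_x, sigma2_e=sigma2_e)

def g1(est, sigma2_eta, n, N):
    """Plug-in version of (2.2)."""
    f = (N - n) / N
    k_x = est["sigma2_x"] / (est["sigma2_x"] + sigma2_eta)
    return f / N * (est["sigma2_e"] + est["beta"] ** 2 * est["sigma2_x"] * (1 - k_x))

def predictor(y_s, est, Xbar_ns, N):
    """Bayes predictor (2.1) with estimated parameters; beta_ols estimates beta * k_x."""
    f = (N - len(y_s)) / N
    return ((1 - f) * y_s.mean() + f * est["mu_y"]
            + f * est["beta_ols"] * (Xbar_ns - est["mu_x"]))

def jackknife_mspe(y_s, X_s, Xbar_ns, N, sigma2_eta):
    n = len(y_s)
    full = moments(y_s, X_s, sigma2_eta)
    g_full, p_full = g1(full, sigma2_eta, n, N), predictor(y_s, full, Xbar_ns, N)
    bias_sum, sq_sum = 0.0, 0.0
    for l in range(n):
        keep = np.arange(n) != l                      # delete the l-th observation
        est_l = moments(y_s[keep], X_s[keep], sigma2_eta)
        bias_sum += g1(est_l, sigma2_eta, n, N) - g_full
        sq_sum += (predictor(y_s, est_l, Xbar_ns, N) - p_full) ** 2
    w = (n - 1) / n
    m1 = g_full - w * bias_sum                        # (2.5)
    m2 = w * sq_sum                                   # (2.6)
    return m1 + m2, m1, m2                            # (2.7)

# Simulated population; the first 25 units play the role of the sample.
rng = np.random.default_rng(1)
x = rng.normal(5.0, np.sqrt(10.0), 200)
y = 1.0 + 2.0 * x + rng.normal(0.0, np.sqrt(10.0), 200)
X = x + rng.normal(0.0, np.sqrt(10.0), 200)
mspe, m1, m2 = jackknife_mspe(y[:25], X[:25], X[25:].mean(), N=200, sigma2_eta=10.0)
```

The bias-corrected term `m1` is not guaranteed to be positive in small samples, so in practice one may want to inspect it separately from `m2`.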

To this end, we begin with the following model:

I. $y_i \mid x_i, \alpha, \beta, \sigma_e^2 \overset{ind}{\sim} N(\alpha + \beta x_i, \sigma_e^2)$, $i = 1, \ldots, n$, where $e_i \overset{iid}{\sim} N(0, \sigma_e^2)$.

II. $X_i \mid x_i, \sigma_\eta^2 \overset{ind}{\sim} N(x_i, \sigma_\eta^2)$, $i = 1, \ldots, n$, where $\eta_i \overset{iid}{\sim} N(0, \sigma_\eta^2)$.

III. $x_i \overset{iid}{\sim} N(\mu_x, \sigma_x^2)$.

IV. $\alpha, \beta, \mu_x, \sigma_e^2, \sigma_\eta^2, \sigma_x^2$ are mutually independent with $\alpha, \beta, \mu_x \overset{iid}{\sim} \text{uniform}(-\infty, \infty)$, $\sigma_e^2 \sim IG(a_e/2, b_e/2)$, $\sigma_\eta^2 \sim IG(a_\eta/2, b_\eta/2)$, $\sigma_x^2 \sim IG(a_x/2, b_x/2)$. Here $IG(a, b)$ denotes an inverse gamma distribution with pdf $f_{a,b}(z) \propto \exp(-a/z)\, z^{-b-1} I_{[z>0]}$.
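The parametrization of $IG(a, b)$ in IV puts the scale first and the shape second, which differs from the argument order of many software libraries. As a hedged illustration, the sketch below draws from this density by inverting a gamma variate with shape $b$ and rate $a$; the helper name `sample_ig` and the values $a = 3$, $b = 4$ are assumptions chosen so that the mean $a/(b-1)$ exists and equals 1.

```python
import numpy as np

def sample_ig(a, b, size, rng):
    """Draw from IG(a, b) with density proportional to exp(-a/z) * z**(-b-1).

    If G ~ Gamma(shape=b, rate=a), then 1/G has this density. NumPy's gamma
    sampler takes a scale argument, which is the reciprocal of the rate.
    """
    return 1.0 / rng.gamma(shape=b, scale=1.0 / a, size=size)

rng = np.random.default_rng(0)
draws = sample_ig(a=3.0, b=4.0, size=200_000, rng=rng)
sample_mean = draws.mean()   # should be near a / (b - 1) = 1
```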

The implementation of the Bayesian procedure is greatly facilitated by the MCMC numerical integration technique, in particular the Gibbs sampler. This requires generating samples from the full conditionals of each of $x_i$, $\alpha$, $\beta$, $\mu_x$, $\sigma_e^2$, $\sigma_\eta^2$ and $\sigma_x^2$ given the remaining parameters and the data. The details are given below.


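By the HB model I to IV, the joint posterior distribution of the latent covariates $x = (x_1, \ldots, x_n)^T$ and the parameters is given by

$$\begin{aligned}
\pi(x, \alpha, \beta, \mu_x, \sigma_e^2, \sigma_\eta^2, \sigma_x^2 \mid y_s, X_s) \propto{}& (\sigma_e^2)^{-n/2} \exp\Big[-\frac{1}{2\sigma_e^2}\sum_{i=1}^{n}(y_i - \alpha - \beta x_i)^2\Big] \\
&\times (\sigma_\eta^2)^{-n/2} \exp\Big[-\frac{1}{2\sigma_\eta^2}\sum_{i=1}^{n}(X_i - x_i)^2\Big] \\
&\times (\sigma_x^2)^{-n/2} \exp\Big[-\frac{1}{2\sigma_x^2}\sum_{i=1}^{n}(x_i - \mu_x)^2\Big] \\
&\times \exp\Big[-\frac{a_x}{2\sigma_x^2}\Big](\sigma_x^2)^{-(b_x/2+1)} \times \exp\Big[-\frac{a_\eta}{2\sigma_\eta^2}\Big](\sigma_\eta^2)^{-(b_\eta/2+1)} \\
&\times \exp\Big[-\frac{a_e}{2\sigma_e^2}\Big](\sigma_e^2)^{-(b_e/2+1)}.
\end{aligned}$$

Then the full conditionals are obtained as follows: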

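(i) $[\alpha \mid x, \beta, \mu_x, \sigma_e^2, \sigma_\eta^2, \sigma_x^2, y, X] \sim N(\bar{y} - \beta\bar{x},\ \sigma_e^2/n)$;

(ii) $[\beta \mid x, \alpha, \mu_x, \sigma_e^2, \sigma_\eta^2, \sigma_x^2, y, X] \sim N\Big(\dfrac{\sum_{i=1}^{n}(y_i - \alpha)x_i}{\sum_{i=1}^{n} x_i^2},\ \dfrac{\sigma_e^2}{\sum_{i=1}^{n} x_i^2}\Big)$;

(iii) $[\mu_x \mid x, \alpha, \beta, \sigma_e^2, \sigma_\eta^2, \sigma_x^2, y, X] \sim N(\bar{x},\ \sigma_x^2/n)$;

(iv) $[\sigma_e^2 \mid x, \alpha, \beta, \mu_x, \sigma_\eta^2, \sigma_x^2, y, X] \sim IG\Big(\tfrac{1}{2}\big\{\sum_{i=1}^{n}(y_i - \alpha - \beta x_i)^2 + a_e\big\},\ \dfrac{n + b_e}{2}\Big)$;

(v) $[\sigma_\eta^2 \mid x, \alpha, \beta, \mu_x, \sigma_e^2, \sigma_x^2, y, X] \sim IG\Big(\tfrac{1}{2}\big\{\sum_{i=1}^{n}(X_i - x_i)^2 + a_\eta\big\},\ \dfrac{n + b_\eta}{2}\Big)$;

(vi) $[\sigma_x^2 \mid x, \alpha, \beta, \mu_x, \sigma_e^2, \sigma_\eta^2, y, X] \sim IG\Big(\tfrac{1}{2}\big\{\sum_{i=1}^{n}(x_i - \mu_x)^2 + a_x\big\},\ \dfrac{n + b_x}{2}\Big)$;

(vii) $[x_i \mid \alpha, \beta, \mu_x, \sigma_e^2, \sigma_\eta^2, \sigma_x^2, y, X] \overset{ind}{\sim} N\big[(\beta^2\sigma_e^{-2} + \sigma_\eta^{-2} + \sigma_x^{-2})^{-1}\{(y_i - \alpha)\beta\sigma_e^{-2} + X_i\sigma_\eta^{-2} + \mu_x\sigma_x^{-2}\},\ (\beta^2\sigma_e^{-2} + \sigma_\eta^{-2} + \sigma_x^{-2})^{-1}\big]$, $i = 1, \ldots, n$.

Here $\bar{y} = n^{-1}\sum_{i=1}^{n} y_i$ and $\bar{x} = n^{-1}\sum_{i=1}^{n} x_i$ are taken over the sampled units.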
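The full conditionals (i) to (vii) translate fairly directly into a Gibbs sampler. The sketch below is one possible implementation under stated assumptions: a common pair of small hyperparameters `a` and `b` is used for all three variances (the text only says that small values were used), starting values come from sample moments, $\mu_x$ in (iii) is drawn with variance $\sigma_x^2/n$ as implied by model III, and the function name `gibbs`, the update order, and the simulated data are all hypothetical.

```python
import numpy as np

def gibbs(y, X, n_iter=2000, a=0.01, b=0.01, seed=0):
    """Gibbs sampler for the HB model I-IV; returns draws of
    (alpha, beta, mu_x, sigma2_e, sigma2_eta, sigma2_x), one row per iteration."""
    rng = np.random.default_rng(seed)
    n = len(y)
    # Inverse gamma in the paper's parametrization: density ∝ exp(-scale/z) z^(-shape-1).
    ig = lambda scale, shape: 1.0 / rng.gamma(shape, 1.0 / scale)
    # Crude starting values from sample moments (an assumption of this sketch).
    beta = np.cov(X, y, ddof=1)[0, 1] / np.var(X, ddof=1)
    alpha, mu_x = y.mean() - beta * X.mean(), X.mean()
    s2e, s2eta, s2x = np.var(y, ddof=1) / 2, np.var(X, ddof=1) / 2, np.var(X, ddof=1) / 2
    out = np.empty((n_iter, 6))
    for t in range(n_iter):
        # (vii) latent true covariates x_i
        prec = beta**2 / s2e + 1.0 / s2eta + 1.0 / s2x
        mean = ((y - alpha) * beta / s2e + X / s2eta + mu_x / s2x) / prec
        x = rng.normal(mean, np.sqrt(1.0 / prec))
        # (i) intercept and (ii) slope
        alpha = rng.normal(y.mean() - beta * x.mean(), np.sqrt(s2e / n))
        sxx = np.sum(x**2)
        beta = rng.normal(np.sum((y - alpha) * x) / sxx, np.sqrt(s2e / sxx))
        # (iii) mean of the true covariate
        mu_x = rng.normal(x.mean(), np.sqrt(s2x / n))
        # (iv)-(vi) variance components
        s2e = ig(0.5 * (np.sum((y - alpha - beta * x) ** 2) + a), 0.5 * (n + b))
        s2eta = ig(0.5 * (np.sum((X - x) ** 2) + a), 0.5 * (n + b))
        s2x = ig(0.5 * (np.sum((x - mu_x) ** 2) + a), 0.5 * (n + b))
        out[t] = alpha, beta, mu_x, s2e, s2eta, s2x
    return out

# Simulated sample of size 25 from model (1.1)-(1.2).
rng = np.random.default_rng(1)
x_true = rng.normal(5.0, np.sqrt(10.0), 25)
y_s = 1.0 + 2.0 * x_true + rng.normal(0.0, np.sqrt(10.0), 25)
X_s = x_true + rng.normal(0.0, np.sqrt(10.0), 25)
draws = gibbs(y_s, X_s, n_iter=2000)
kept = draws[1000:]   # discard the first half as burn-in
```

Because the structural measurement error model is only weakly identified under diffuse priors, mixing can be slow; trace plots of the variance components are worth checking before the draws are used.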

Using Gibbs sampling, we obtain the HB estimate

$$\hat{\gamma}^{HB} = E[\gamma \mid y_s] \approx (1 - f)\bar{y}_s + f\sum_{t=1}^{T}\{\mu_y^{(t)} + \beta^{(t)} k_x^{(t)}(\bar{X}_{\bar{s}} - \mu_x^{(t)})\}/T, \tag{2.8}$$

where $\mu_y^{(t)} = \alpha^{(t)} + \beta^{(t)}\mu_x^{(t)}$ and $k_x^{(t)} = \sigma_x^{2(t)}/(\sigma_x^{2(t)} + \sigma_\eta^{2(t)})$. The corresponding posterior variance is given by

$$\begin{aligned}
V[\gamma \mid y_s] \approx{}& f^2\Big(\sum_{t=1}^{T}\{\mu_y^{(t)} + \beta^{(t)} k_x^{(t)}(\bar{X}_{\bar{s}} - \mu_x^{(t)})\}^2/T - \Big[\sum_{t=1}^{T}\{\mu_y^{(t)} + \beta^{(t)} k_x^{(t)}(\bar{X}_{\bar{s}} - \mu_x^{(t)})\}/T\Big]^2\Big) \\
&+ \frac{f}{N}\sum_{t=1}^{T}\{\sigma_e^{2(t)} + \beta^{2(t)}\sigma_x^{2(t)}(1 - k_x^{(t)})\}/T.
\end{aligned} \tag{2.9}$$
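Given stored Gibbs draws, (2.8) and (2.9) reduce to a few array operations. The following sketch assumes the draws are held in an array with columns ordered as $(\alpha, \beta, \mu_x, \sigma_e^2, \sigma_\eta^2, \sigma_x^2)$; that layout, the function name `hb_summary`, and the constant toy draws are assumptions made for illustration. The last term uses $\sigma_x^{2(t)}$, in line with (2.2).

```python
import numpy as np

def hb_summary(draws, y_s, Xbar_ns, N):
    """HB estimate (2.8) and posterior variance (2.9) from post-burn-in draws."""
    alpha, beta, mu_x, s2e, s2eta, s2x = np.asarray(draws, dtype=float).T
    n = len(y_s)
    f = (N - n) / N
    k_x = s2x / (s2x + s2eta)
    m = alpha + beta * mu_x + beta * k_x * (Xbar_ns - mu_x)   # braces in (2.8)
    v = s2e + beta**2 * s2x * (1 - k_x)                       # braces in last term of (2.9)
    estimate = (1 - f) * np.mean(y_s) + f * m.mean()
    variance = f**2 * (np.mean(m**2) - m.mean() ** 2) + f / N * v.mean()
    return estimate, variance

# Toy check with constant draws: the between-draw variance is zero,
# so (2.9) collapses to the known-parameter variance (2.2).
toy_draws = np.tile([1.0, 2.0, 5.0, 10.0, 10.0, 10.0], (4, 1))
est, var = hb_summary(toy_draws, y_s=np.full(10, 11.0), Xbar_ns=5.0, N=200)
psd = np.sqrt(var)   # posterior standard deviation
```

The first term of the variance captures uncertainty about the parameters and the second averages the conditional variance of $\gamma$ over the draws, which is the usual law-of-total-variance split.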

3. Numerical studies

This section analyzes a simulated data set to illustrate the methods obtained in the preceding section. We create a finite population of total size $N = 200$ under the superpopulation model with $\alpha = 1$, $\beta = 2$, $\mu_x = 5$, $\sigma_e^2 = 10$, $\sigma_\eta^2 = 10$ and $\sigma_x^2 = 10$. For this population, $\gamma = 10.35487$.

Then we select samples of size 10, 25, and 50 without replacement from this population.


We conducted similar simulation studies with several cases of the parameter values, and obtained similar results.

Using the given data set $(y_i, X_i)$ $(i = 1, \ldots, n)$, we compute the sample mean, the EB estimate and the corresponding jackknife estimate of MSPE. To obtain the HB estimators, we run a Gibbs chain of length 10,000 and discard the first 5,000 iterations as burn-in (to eliminate any possible instability in the initial generated samples). The HB estimate of the population mean $\gamma$ is the average over the remaining 5,000 Gibbs samples. The same method is applied to calculate the posterior standard deviation (PSD). We used small values for $a_e$, $b_e$, $a_\eta$, $b_\eta$, $a_x$, $b_x$ in the inverse gamma distributions to represent diffuse prior information.

Table 3.1 reports the sample sizes, the true value ($\gamma$), the sample mean ($\bar{y}_s$), the EB and the HB estimates as well as RMSE(EB) and PSD(HB) for the samples of size 10, 25, and 50.

Table 3.1 Sample means, Bayes predictors, RMSE and PSD for the sampled data.

n     γ          ȳ_s        γ̂^EB       γ̂^HB       MSPE (EB)   PSD (HB)
10    10.35487   14.31396   13.02840   14.29442   2.810525    2.806568
25    10.35487   12.62731   11.24981   12.49025   1.724124    1.701512
50    10.35487   10.61983   10.43887   10.54388   0.881237    0.855421

From Table 3.1, we find that the EB and HB predictors are well behaved in the sense that both are closer to $\gamma$ than the classical estimate $\bar{y}_s$, with the EB predictor slightly closer to $\gamma$ than the HB predictor. The EB and HB predictors are also comparable in terms of precision, as measured by MSPE and PSD.

4. Summary and conclusion

We have derived EB and HB predictors of a finite population mean under a linear regression model with a covariate subject to measurement error. Our simulation results show that the EB and HB predictors are quite comparable in closeness to $\gamma$ as well as in precision.

References

Basu, D. (1975). Statistical information and likelihood. Sankhya A, 37, 1-71.

Bolfarine, H. and Cordani, L. K. (1993). Estimation of a structural linear regression model with known reliability ratio. Annals of the Institute of Statistical Mathematics, 45, 531-540.

Bolfarine, H. and Zacks, S. (1992). Prediction theory for finite populations, Springer-Verlag, New York, NY.

Bolfarine, H., Zacks, S. and Sandoval, M. (1996). On predicting the population total under regression models with measurement errors. Journal of Statistical Planning and Inference, 55, 63-76.

Cassel, C., Särndal, C. E. and Wretman, J. H. (1977). Foundations of inference in survey sampling, Wiley, New York.

Chen, S. and Lahiri, P. (2002). On mean squared prediction error estimation in small area estimation problems. In Proceedings of the Survey Research Methods Section, American Statistical Association, 473-477.

Cochran, W. (1968). Errors of measurement in statistics. Technometrics, 10, 637-666.

Fuller, W. (1975). Regression analysis for sample surveys. Sankhya C, 37, 117-132.

Fuller, W. (1987). Measurement error models. Wiley, New York.

Jiang, J., Lahiri, P. and Wan, S. M. (2002). A unified jackknife theory for empirical best prediction with M-estimation. Annals of Statistics, 30, 1782-1810.

Torabi, M., Datta, G. S. and Rao, J. N. K. (2009). Empirical Bayes estimation of small area means under a nested error linear regression model with measurement errors in the covariates. Scandinavian Journal of Statistics, 36, 355-368.
