Journal of Korean Data & Information Science Society
2002, Vol. 13, No. 1, pp. 35~45
A Conditional Indirect Survey Method 1)
Gi Sung Lee 2) ・ Ki Hak Hong 3) ・ Chang Kyoon Son 4) ・ Ki Seong Nam 5)
Abstract
To improve the quality of survey data on sensitive characteristics, we suggest a conditional indirect survey method. In this method, only the respondents who answer "yes" to the direct, less sensitive question respond indirectly to the more sensitive one, using the one-sample unrelated question randomized response technique with known $\pi_y$, the true proportion of the unrelated group Y. We extend it to a two-sample method for the case in which $\pi_y$ is unknown. We also consider the case in which people who possess the less sensitive character answer untruthfully. Finally, we compare our method with the methods of Greenberg et al. and Carr et al.
Key Words : conditional indirect survey method, less sensitive character, sensitive character
1. Introduction
Most marketing and opinion research companies collect individual information through various surveys, and they continually work to improve survey quality in order to obtain more accurate data. However, some respondents who are asked questions related to privacy or anti-social problems may refuse to answer or may deliberately falsify their responses. This degrades the quality of the survey data and raises problems of confidence. The randomized response technique (RRT), one of the indirect survey methods, has been proposed as a means of obtaining an unbiased estimate of the proportion or quantity of persons in a population who possess a particular trait or characteristic that they may be reluctant to acknowledge. The method was first suggested by Warner in 1965 in a related question form. He proposed the RRT to procure trustworthy information about sensitive data from the respondents in a sample survey, and estimated the sensitive population proportion from data collected with a randomization device composed of a sensitive and a nonsensitive question, selected with known probabilities $p$ and $1 - p$, respectively.

Since then, many researchers have developed the method. Greenberg et al. (1969) improved the Warner method by replacing the nonsensitive question with an unrelated one. They also extended it to the two-sample case in which the unrelated proportion $\pi_y$ is not known.

Loynes (1976) replaced the nonsensitive question with a forced question that makes respondents answer only "yes". Carr et al. (1982) modified Loynes' method into a kind of two-stage method and proposed a conditional randomized response (CRR) method that reduces the standard error by asking questions conditional upon earlier answers. They borrowed the common-sense idea that, in a typical survey, a subject is not asked a subsequent question when the response can be deduced from the answer to a previous question. They asked the more sensitive question only of respondents who answered "yes" to the earlier, less sensitive question. However, in Carr et al.'s method respondents have to use the randomization device twice, and it is possible that some people who said "yes" to the forced question in the first stage must also say "yes" to the same type of question in the second stage. In that case the quality of the survey data degrades and problems of confidence arise.

We consider a conditional indirect survey method that improves the quality of survey data by asking a direct question about the less sensitive character. In this method, only the respondents who answer "yes" to the direct, less sensitive question respond indirectly to the more sensitive one, using the one-sample unrelated question randomized response technique with known $\pi_y$, the true proportion of the unrelated group Y. We extend it to a two-sample method for the case in which $\pi_y$ is unknown. We also consider the case in which people who possess the less sensitive character respond untruthfully. Finally, we compare our method with the methods of Greenberg et al. and Carr et al.

1. This paper was supported by Woosuk University.
2. Associate Professor, Division of Computer and Information Science, Woosuk University, 490 Hujung-ri, Samrye-up, Wanju-gun, Jeonbuk, 565-701, Korea. E-mail : gisung@woosuk.ac.kr
3. Associate Professor, Department of Computer Science, Dongshin University, 252 Daeho-dong, Naju, Chonnam, 520-714, Korea
4. Full-Time Lecturer, Department of Computer Science, Dongshin University, 252 Daeho-dong, Naju, Chonnam, 520-714, Korea
5. Lecturer, Department of Statistics, Changwon University, 9 Sarim-dong, Kyungnam, 641-773, Korea
2. One sample conditional indirect survey method
2.1 Truthful reporting
In this section, we suggest a conditional one-sample indirect survey method. According to the method, only the respondents who answer "yes" to the direct question on the less sensitive character B respond indirectly to the question on the more sensitive character A, using Greenberg et al.'s one-sample unrelated question randomized response technique with known $\pi_y$, the true proportion of the unrelated group Y.
A simple random sample with replacement (SRSWR) of size $n$ is drawn from the population. In the first stage each interviewee is asked the following question directly and answers "yes" or "no".
Do you possess the less sensitive character B?
If we assume that the respondents answer truthfully, $\lambda_1$, the probability of getting a "yes", corresponds to $\pi_1$, the population proportion of B. Let $n_1$ be the number of "yes" answers; then $\hat{\lambda}_1 = n_1 / n$. The estimator $\hat{\pi}_1$ of $\pi_1$ is
$$\hat{\pi}_1 = \frac{n_1}{n}. \tag{2.1}$$
In the second stage, the $n_1$ respondents who said "yes" in the first stage respond according to the outcome of a randomization device $R$, which is based on Greenberg et al.'s unrelated question technique.
< randomization device R >
Question 1: "I am a member of Group A", selected with probability $p$
Question 2: "I am a member of Group Y", selected with probability $1 - p$
Let $\lambda_2$ be the conditional probability of getting "yes" from the respondents who said "yes" in the first stage:
$$\lambda_2 = p\,\frac{\pi_2}{\pi_1} + (1 - p)\,\pi_y, \tag{2.2}$$
where $\pi_2$ is the population proportion of the sensitive group A, and $\pi_y$ is the known population proportion of the unrelated group Y.
Let $n_2$ be the number of "yes" answers among the $n_1$ respondents; then $\hat{\lambda}_2 = n_2 / n_1$. The estimator $\hat{\pi}_2$ of $\pi_2$ is
$$\hat{\pi}_2 = \frac{1}{np}\left[\,n_2 - (1 - p)\,\pi_y\,n_1\right]. \tag{2.3}$$
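The two-stage procedure and the estimator in (2.3) can be checked with a small simulation. The sketch below is illustrative and not part of the paper: the function name `simulate_pi2_hat`, the seed, and the parameter values are hypothetical, and it assumes that group A is contained in group B (so that P(A | B) = pi2 / pi1, as (2.2) implies) and that membership in Y is independent of A and B.

```python
import random


def simulate_pi2_hat(n, pi1, pi2, pi_y, p, rng):
    """Run the two-stage survey once and return the estimate given by (2.3)."""
    n1 = 0  # "yes" answers to the direct question about B
    n2 = 0  # "yes" answers produced through the randomization device R
    for _ in range(n):
        if rng.random() >= pi1:
            continue  # truthful "no" in the first stage: no second stage
        n1 += 1
        # Assumption: A is a subset of B, so P(A | B) = pi2 / pi1.
        in_a = rng.random() < pi2 / pi1
        # Assumption: membership in Y is independent of A and B.
        in_y = rng.random() < pi_y
        # The device selects Question 1 (about A) with probability p.
        asked_about_a = rng.random() < p
        if (in_a if asked_about_a else in_y):
            n2 += 1
    return (n2 - (1 - p) * pi_y * n1) / (n * p)


rng = random.Random(2002)  # fixed seed so the run is repeatable
estimates = [simulate_pi2_hat(1000, 0.5, 0.2, 0.6, 0.7, rng) for _ in range(400)]
mean_estimate = sum(estimates) / len(estimates)
print(f"mean of estimates: {mean_estimate:.4f} (true pi2 = 0.2)")
```

Averaged over many replications, the estimates should settle near the true value of 0.2 used here, which is what unbiasedness of the estimator would predict.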
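Since $n_1 \sim b(n, \pi_1)$ and $n_2 \sim b(n, \pi_1\lambda_2)$, the expected value of $\hat{\pi}_2$ is
$$\begin{aligned}
E(\hat{\pi}_2) &= \frac{1}{np}\left[E(n_2) - (1 - p)\,\pi_y\,E(n_1)\right] \\
&= \frac{1}{np}\left[n\pi_1\lambda_2 - (1 - p)\,\pi_y\,n\pi_1\right] \\
&= \pi_2,
\end{aligned}$$
where the last step uses $\pi_1\lambda_2 = p\pi_2 + (1 - p)\pi_1\pi_y$ from (2.2). Thus $\hat{\pi}_2$ is an unbiased estimator of $\pi_2$. The variance of $\hat{\pi}_2$ is
$$\begin{aligned}
\mathrm{Var}(\hat{\pi}_2) &= \mathrm{Var}\left[\frac{n_2 - (1 - p)\,\pi_y\,n_1}{np}\right] \\
&= \frac{\mathrm{Var}(n_2) + (1 - p)^2\pi_y^2\,\mathrm{Var}(n_1) - 2(1 - p)\,\pi_y\,\mathrm{Cov}(n_1, n_2)}{(np)^2}.
\end{aligned} \tag{2.4}$$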
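Since $n_1 \sim b(n, \pi_1)$, $n_2 \sim b(n, \pi_1\lambda_2)$, and $n_2 \mid n_1 \sim b(n_1, \lambda_2)$,
$$\mathrm{Var}(n_1) = n\pi_1(1 - \pi_1),$$
$$\begin{aligned}
\mathrm{Var}(n_2) &= E[\mathrm{Var}(n_2 \mid n_1)] + \mathrm{Var}[E(n_2 \mid n_1)] \\
&= E[n_1\lambda_2(1 - \lambda_2)] + \mathrm{Var}(n_1\lambda_2) \\
&= n\,[p\pi_2 + (1 - p)\pi_1\pi_y]\,[1 - p\pi_2 - (1 - p)\pi_1\pi_y],
\end{aligned}$$
$$\begin{aligned}
\mathrm{Cov}(n_1, n_2) &= E(n_1 n_2) - E(n_1)E(n_2) \\
&= E[n_1 E(n_2 \mid n_1)] - (n\pi_1)(n\pi_1\lambda_2) \\
&= n(1 - \pi_1)\,[p\pi_2 + (1 - p)\pi_1\pi_y].
\end{aligned}$$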
If we apply the above three results to equation (2.4), we obtain the variance of $\hat{\pi}_2$:
$$\mathrm{Var}(\hat{\pi}_2) = \frac{\pi_1(1 - p)\pi_y\{1 - (1 - p)\pi_y\} - p\pi_2\{2(1 - p)\pi_y + p\pi_2 - 1\}}{np^2}. \tag{2.5}$$
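As a numerical cross-check of the algebra, the sketch below assembles the variance once from the three component results substituted into (2.4) and once from the closed form (2.5). The function names and the parameter values are hypothetical choices made for illustration only.

```python
def variance_via_2_4(n, pi1, pi2, pi_y, p):
    """Variance assembled from Var(n1), Var(n2) and Cov(n1, n2) as in (2.4)."""
    m = p * pi2 + (1 - p) * pi1 * pi_y  # pi1 * lambda2, success probability of n2
    q = (1 - p) * pi_y
    var_n1 = n * pi1 * (1 - pi1)
    var_n2 = n * m * (1 - m)
    cov_n1_n2 = n * (1 - pi1) * m
    return (var_n2 + q ** 2 * var_n1 - 2 * q * cov_n1_n2) / (n * p) ** 2


def variance_via_2_5(n, pi1, pi2, pi_y, p):
    """Closed-form variance of the one-sample estimator, equation (2.5)."""
    q = (1 - p) * pi_y
    a = p * pi2
    return (pi1 * q * (1 - q) - a * (2 * q + a - 1)) / (n * p ** 2)


# Arbitrary illustrative parameter values (not taken from the paper's table).
params = dict(n=100, pi1=0.5, pi2=0.2, pi_y=0.6, p=0.7)
v_24 = variance_via_2_4(**params)
v_25 = variance_via_2_5(**params)
print(v_24, v_25)  # the two routes should agree up to rounding error
```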
2.2 Less than completely truthful reporting
Let $T$ $(0 < T < 1)$ denote the probability that respondents who belong to group B tell the truth when confronted with the direct question concerning membership in the first stage. It is further postulated that respondents confronted with a question relating to membership in Group A report truthfully, because they use the randomization device and respond indirectly. The probability of getting "yes" is
$$\lambda_2' = p\,\frac{\pi_2}{T\pi_1} + (1 - p)\,\pi_y. \tag{2.6}$$
In this case the estimator $\hat{\pi}_2$ is a biased estimator of $\pi_2$, and the bias is
$$B(\hat{\pi}_2) = \pi_2\left(\frac{1}{T} - 1\right). \tag{2.7}$$
Hence, the mean squared error (MSE) of $\hat{\pi}_2$ is obtained as follows.
$$\mathrm{MSE}(\hat{\pi}_2) = \frac{\pi_1(1 - p)\pi_y\{1 - (1 - p)\pi_y\} - p\pi_2\{2(1 - p)\pi_y + p\pi_2 - 1\}}{np^2} + \left\{\pi_2\left(\frac{1}{T} - 1\right)\right\}^2. \tag{2.8}$$
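Taking (2.7) and (2.8) as stated, the MSE is the variance term of (2.5) plus the squared bias, so the bias sets a floor that a larger sample cannot remove. The sketch below illustrates this; the function names, the argument `t` for the truth-telling probability in the first stage, and the parameter values are hypothetical.

```python
def variance_2_5(n, pi1, pi2, pi_y, p):
    """Variance term of the one-sample estimator, equation (2.5)."""
    q = (1 - p) * pi_y
    a = p * pi2
    return (pi1 * q * (1 - q) - a * (2 * q + a - 1)) / (n * p ** 2)


def bias_2_7(pi2, t):
    """Bias of the estimator under truth-telling probability t, equation (2.7)."""
    return pi2 * (1 / t - 1)


def mse_2_8(n, pi1, pi2, pi_y, p, t):
    """Mean squared error, equation (2.8): variance plus squared bias."""
    return variance_2_5(n, pi1, pi2, pi_y, p) + bias_2_7(pi2, t) ** 2


# Arbitrary illustrative values: the variance shrinks with n, the bias does not.
for n in (100, 1000, 10000):
    print(n, mse_2_8(n, 0.5, 0.2, 0.6, 0.7, t=0.9))
```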
3. Two sample conditional indirect survey method
In this section we consider the case in which the population proportion $\pi_y$ of the unrelated group Y is unknown, and extend the method of Section 2 to the two-sample case.
If $\pi_y$ is unknown in our suggested method, we need two independent samples to estimate $\pi_y$. We select two SRSWR samples of size $n_{1i}$ $(i = 1, 2)$ from the $n_1$ respondents who responded "yes" in the first stage.
The $n_{1i}$ respondents respond according to the outcome of the randomization device $R(i)$, which is based on Greenberg et al.'s two-sample unrelated question technique.
< randomization device R(i) >
Question 1: "I am a member of Group A", selected with probability $p_i$
Question 2: "I am a member of Group Y", selected with probability $1 - p_i$
Let $\lambda_{2i}$ $(i = 1, 2)$ be the conditional probability of getting "yes" from the $n_{1i}$ respondents:
$$\lambda_{2i} = p_i\,\frac{\pi_2}{\pi_1} + (1 - p_i)\,\pi_y, \tag{3.1}$$
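where $\pi_1$ and $\pi_2$ are the population proportions of the sensitive groups B and A, respectively, and $\pi_y$ is the unknown population proportion of the unrelated group Y. Let $n_{2i}$ be the number of "yes" answers among the $n_{1i}$ respondents; then $\hat{\lambda}_{2i} = n_{2i} / n_{1i}$. The estimator $\hat{\pi}_2$ of $\pi_2$ is
$$\begin{aligned}
\hat{\pi}_2 &= \frac{\hat{\pi}_1\left[(1 - p_2)\hat{\lambda}_{21} - (1 - p_1)\hat{\lambda}_{22}\right]}{p_1 - p_2} \\
&= \frac{1}{n(p_1 - p_2)}\left[\frac{n_1 n_{21}}{n_{11}}(1 - p_2) - \frac{n_1 n_{22}}{n_{12}}(1 - p_1)\right], \quad p_1 \neq p_2.
\end{aligned} \tag{3.2}$$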
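A minimal sketch of the two-sample estimator in (3.2) follows. The function name `pi2_hat_two_sample` and the counts in the usage example are hypothetical; the counts are set to their noise-free expected values for pi1 = 0.5, pi2 = 0.2, pi_y = 0.6, p1 = 0.8 and p2 = 0.2, so the estimate should recover pi2 up to rounding even though pi_y is never passed to the function.

```python
def pi2_hat_two_sample(n, n1, n11, n21, n12, n22, p1, p2):
    """Two-sample estimate of pi2 from equation (3.2).

    n        : total sample size in the first stage
    n1       : number of "yes" answers to the direct question about B
    n11, n12 : sizes of the two second-stage subsamples
    n21, n22 : numbers of "yes" answers obtained with devices R(1) and R(2)
    p1, p2   : probabilities of selecting Question 1 in R(1) and R(2)
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    pi1_hat = n1 / n
    lam21_hat = n21 / n11
    lam22_hat = n22 / n12
    return pi1_hat * ((1 - p2) * lam21_hat - (1 - p1) * lam22_hat) / (p1 - p2)


# Noise-free illustrative counts: lambda21 = 0.44 and lambda22 = 0.56,
# so n21 = 0.44 * 250 = 110 and n22 = 0.56 * 250 = 140.
estimate = pi2_hat_two_sample(n=1000, n1=500, n11=250, n21=110,
                              n12=250, n22=140, p1=0.8, p2=0.2)
print(estimate)
```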
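Since $n_1 \sim b(n, \pi_1)$ and $n_{2i} \mid n_{1i} \sim b(n_{1i}, \lambda_{2i})$, the expected value of $\hat{\pi}_2$ is obtained as follows.
$$\begin{aligned}
E(\hat{\pi}_2) &= \frac{1}{n(p_1 - p_2)}\left[\frac{1 - p_2}{n_{11}}E(n_1 n_{21}) - \frac{1 - p_1}{n_{12}}E(n_1 n_{22})\right] \\
&= \frac{1}{n(p_1 - p_2)}\left[\frac{1 - p_2}{n_{11}}E(n_1)E(n_{21} \mid n_{11}) - \frac{1 - p_1}{n_{12}}E(n_1)E(n_{22} \mid n_{12})\right] \\
&= \frac{1}{n(p_1 - p_2)}\left[\frac{1 - p_2}{n_{11}}\,n_{11}\lambda_{21}E(n_1) - \frac{1 - p_1}{n_{12}}\,n_{12}\lambda_{22}E(n_1)\right] \\
&= \frac{1}{n(p_1 - p_2)}\left[(1 - p_2)\lambda_{21}\,n\pi_1 - (1 - p_1)\lambda_{22}\,n\pi_1\right] \\
&= \pi_2.
\end{aligned}$$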
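Therefore, $\hat{\pi}_2$ is an unbiased estimator of $\pi_2$. The variance of $\hat{\pi}_2$ is
$$\begin{aligned}
\mathrm{Var}(\hat{\pi}_2) &= \mathrm{Var}\left[\frac{1}{n(p_1 - p_2)}\left\{\frac{n_1 n_{21}}{n_{11}}(1 - p_2) - \frac{n_1 n_{22}}{n_{12}}(1 - p_1)\right\}\right] \\
&= \frac{1}{n^2(p_1 - p_2)^2}\left[\frac{(1 - p_2)^2}{n_{11}^2}\mathrm{Var}(n_1 n_{21}) + \frac{(1 - p_1)^2}{n_{12}^2}\mathrm{Var}(n_1 n_{22}) - \frac{2(1 - p_1)(1 - p_2)\,\mathrm{Cov}(n_1 n_{21}, n_1 n_{22})}{n_{11}n_{12}}\right].
\end{aligned} \tag{3.3}$$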
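Since $n_1 \sim b(n, \pi_1)$ and $n_{2i} \mid n_{1i} \sim b(n_{1i}, \lambda_{2i})$, we can derive the following equations.
$$\begin{aligned}
\mathrm{Var}(n_1 n_{21}) &= E[\mathrm{Var}(n_1 n_{21} \mid n_{11})] + \mathrm{Var}[E(n_1 n_{21} \mid n_{11})] \\
&= E[n_1^2\,n_{11}\lambda_{21}(1 - \lambda_{21})] + \mathrm{Var}(n_1 n_{11}\lambda_{21}) \\
&= n_{11}\lambda_{21}(1 - \lambda_{21})\left[n\pi_1(1 - \pi_1) + (n\pi_1)^2\right] + (n_{11}\lambda_{21})^2\,n\pi_1(1 - \pi_1),
\end{aligned}$$
$$\mathrm{Var}(n_1 n_{22}) = n_{12}\lambda_{22}(1 - \lambda_{22})\left[n\pi_1(1 - \pi_1) + (n\pi_1)^2\right] + (n_{12}\lambda_{22})^2\,n\pi_1(1 - \pi_1),$$
$$\begin{aligned}
\mathrm{Cov}(n_1 n_{21}, n_1 n_{22}) &= E(n_1 n_{21}\,n_1 n_{22}) - E(n_1 n_{21})E(n_1 n_{22}) \\
&= E(n_1^2)\,E(n_{21}n_{22} \mid n_{11}, n_{12}) - E(n_1)E(n_{21} \mid n_{11})\,E(n_1)E(n_{22} \mid n_{12}) \\
&= n_{11}n_{12}\lambda_{21}\lambda_{22}\,n\pi_1(1 - \pi_1).
\end{aligned}$$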
Hence, if we apply the above three results to equation (3.3), we obtain the variance of $\hat{\pi}_2$:
$$\mathrm{Var}(\hat{\pi}_2) = \frac{\pi_1\left[(1 - \pi_1 + n\pi_1)\left\{(1 - p_2)^2\,\dfrac{\lambda_{21}(1 - \lambda_{21})}{n_{11}} + (1 - p_1)^2\,\dfrac{\lambda_{22}(1 - \lambda_{22})}{n_{12}}\right\}\right]}{n(p_1 - p_2)^2} + \frac{\pi_2^2(1 - \pi_1)}{n\pi_1}. \tag{3.4}$$
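The step from (3.3) to (3.4) can be cross-checked numerically. The sketch below evaluates the variance once by substituting the three component results into (3.3) and once from the closed form (3.4). As in the derivation, the subsample sizes n11 and n12 are treated as fixed constants; the function names and parameter values are hypothetical.

```python
import math


def lam(p_i, pi1, pi2, pi_y):
    """Conditional "yes" probability lambda_2i from equation (3.1)."""
    return p_i * pi2 / pi1 + (1 - p_i) * pi_y


def variance_via_3_3(n, n11, n12, pi1, pi2, pi_y, p1, p2):
    """Variance assembled from the component results substituted into (3.3)."""
    l1, l2 = lam(p1, pi1, pi2, pi_y), lam(p2, pi1, pi2, pi_y)
    var_n1 = n * pi1 * (1 - pi1)
    e_n1_sq = var_n1 + (n * pi1) ** 2  # E(n1^2)
    var_1 = n11 * l1 * (1 - l1) * e_n1_sq + (n11 * l1) ** 2 * var_n1
    var_2 = n12 * l2 * (1 - l2) * e_n1_sq + (n12 * l2) ** 2 * var_n1
    cov = n11 * n12 * l1 * l2 * var_n1
    total = ((1 - p2) ** 2 * var_1 / n11 ** 2
             + (1 - p1) ** 2 * var_2 / n12 ** 2
             - 2 * (1 - p1) * (1 - p2) * cov / (n11 * n12))
    return total / (n ** 2 * (p1 - p2) ** 2)


def variance_via_3_4(n, n11, n12, pi1, pi2, pi_y, p1, p2):
    """Closed-form variance of the two-sample estimator, equation (3.4)."""
    l1, l2 = lam(p1, pi1, pi2, pi_y), lam(p2, pi1, pi2, pi_y)
    sampling = ((1 - p2) ** 2 * l1 * (1 - l1) / n11
                + (1 - p1) ** 2 * l2 * (1 - l2) / n12)
    first = pi1 * (1 - pi1 + n * pi1) * sampling / (n * (p1 - p2) ** 2)
    return first + pi2 ** 2 * (1 - pi1) / (n * pi1)


# Arbitrary illustrative parameter values.
args = dict(n=1000, n11=250, n12=250, pi1=0.5, pi2=0.2, pi_y=0.6, p1=0.8, p2=0.2)
v_33 = variance_via_3_3(**args)
v_34 = variance_via_3_4(**args)
print(v_33, v_34)  # the two routes should agree up to rounding error
```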
4. Efficiency comparison
We compare our one-sample method with those of Greenberg et al. and Carr et al., and give the conditions under which our method is more efficient than theirs.
In Greenberg et al.'s method, let $\hat{\pi}_g$ be the estimator of the population proportion $\pi_2$ of the sensitive group A. The variance of $\hat{\pi}_g$ is
$$\mathrm{Var}(\hat{\pi}_g) = \frac{\pi_2(1 - \pi_2)}{n} + \frac{(1 - p)\left[p\pi_2(1 - 2\pi_y) + \pi_y\{1 - (1 - p)\pi_y\}\right]}{np^2}. \tag{4.1}$$
Now, the difference $\mathrm{Var}(\hat{\pi}_g) - \mathrm{Var}(\hat{\pi}_2)$ is
$$\mathrm{Var}(\hat{\pi}_g) - \mathrm{Var}(\hat{\pi}_2) = \frac{(1 - \pi_1)(1 - p)\pi_y\{1 - (1 - p)\pi_y\}}{np^2}. \tag{4.2}$$
In equation (4.2), if $p \neq 1$ and $\pi_1 \neq 1$, then $\mathrm{Var}(\hat{\pi}_g) - \mathrm{Var}(\hat{\pi}_2) > 0$. The conditions $p \neq 1$ and $\pi_1 \neq 1$ are satisfied in general, so the suggested one-sample method is more efficient than that of Greenberg et al. Hence we can improve the quality of survey data, even though our method is somewhat more complicated to use.
Next, in Carr et al.'s method, let $\hat{\pi}_c$ be the estimator of the population proportion $\pi_2$ of the sensitive group A. The variance of $\hat{\pi}_c$ is
$$\mathrm{Var}(\hat{\pi}_c) = \frac{\{p\pi_1 + (1 - p)\}(1 - p) - \pi_2(1 - 2p + p\pi_2)}{np}. \tag{4.3}$$
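The difference $\mathrm{Var}(\hat{\pi}_c) - \mathrm{Var}(\hat{\pi}_2)$ is
$$\mathrm{Var}(\hat{\pi}_c) - \mathrm{Var}(\hat{\pi}_2) = \frac{(1 - p)\left[\pi_1\{p^2 - \pi_y(1 - (1 - p)\pi_y)\} + p(1 - p) - 2p\pi_2(1 - \pi_y)\right]}{np^2}. \tag{4.4}$$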
From equation (4.4), we obtain the region that satisfies $\mathrm{Var}(\hat{\pi}_2) < \mathrm{Var}(\hat{\pi}_c)$ as follows:
$$\pi_2 < \frac{\pi_1\left[p^2 - \pi_y\{1 - (1 - p)\pi_y\}\right] + p(1 - p)}{2p(1 - \pi_y)}, \quad \pi_y \neq 1. \tag{4.5}$$
Hence, our method is more efficient than that of Carr et al. under condition (4.5), and it is simpler to use because it needs only one randomization device. However, as (4.5) shows, it is difficult to compare the two methods analytically. We therefore compare them numerically and find the conditions under which the suggested method is more efficient than Carr et al.'s method.
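To make condition (4.5) concrete, the sketch below computes the upper bound on $\pi_2$ and checks that it agrees with the sign of the difference in (4.4) at a few points. The function names and the test points are hypothetical, and the check assumes $0 < p < 1$ and $\pi_y < 1$.

```python
def difference_4_4(n, pi1, pi2, pi_y, p):
    """Var(pi_c hat) - Var(pi_2 hat), equation (4.4)."""
    q = (1 - p) * pi_y
    bracket = (pi1 * (p ** 2 - pi_y * (1 - q))
               + p * (1 - p)
               - 2 * p * pi2 * (1 - pi_y))
    return (1 - p) * bracket / (n * p ** 2)


def pi2_upper_bound(pi1, pi_y, p):
    """Right-hand side of condition (4.5); requires pi_y != 1."""
    q = (1 - p) * pi_y
    return (pi1 * (p ** 2 - pi_y * (1 - q)) + p * (1 - p)) / (2 * p * (1 - pi_y))


# Illustrative points: pi2 below the bound should give a positive difference.
bound = pi2_upper_bound(pi1=0.5, pi_y=0.6, p=0.7)
for pi2 in (0.2, 0.4):
    diff = difference_4_4(n=100, pi1=0.5, pi2=pi2, pi_y=0.6, p=0.7)
    print(pi2, pi2 < bound, diff > 0)
```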
<Table 1> shows the relative efficiency $RE = \mathrm{Var}(\hat{\pi}_c) / \mathrm{Var}(\hat{\pi}_2)$ obtained under the conditions $n = 100$ and $\pi_1 = 0.5$, with $\pi_2$ ranging from 0.1 to 0.4 in steps of 0.1, and $p$ and $\pi_y$ ranging from 0.1 to 0.9 in steps of 0.2.
In <Table 1>, values greater than one indicate gains in efficiency for the suggested method relative to Carr et al.'s method. The suggested method is generally more effective as $\pi_2$ decreases and as $p$ and $\pi_y$ increase.
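The relative efficiency over the stated grid can be recomputed from (4.3) and (2.5). The sketch below only shows how such a grid could be generated; the function names are hypothetical, and the output is not claimed to reproduce the printed entries of <Table 1>.

```python
def var_carr(n, pi1, pi2, p):
    """Variance of Carr et al.'s estimator, equation (4.3)."""
    return ((p * pi1 + (1 - p)) * (1 - p) - pi2 * (1 - 2 * p + p * pi2)) / (n * p)


def var_suggested(n, pi1, pi2, pi_y, p):
    """Variance of the suggested one-sample estimator, equation (2.5)."""
    q = (1 - p) * pi_y
    a = p * pi2
    return (pi1 * q * (1 - q) - a * (2 * q + a - 1)) / (n * p ** 2)


n, pi1 = 100, 0.5                   # conditions stated in the text
pi2_values = [0.1, 0.2, 0.3, 0.4]   # 0.1 to 0.4 in steps of 0.1
grid = [0.1, 0.3, 0.5, 0.7, 0.9]    # p and pi_y: 0.1 to 0.9 in steps of 0.2

# relative_efficiency[(pi2, p, pi_y)] = Var(pi_c hat) / Var(pi_2 hat)
relative_efficiency = {
    (pi2, p, pi_y): var_carr(n, pi1, pi2, p) / var_suggested(n, pi1, pi2, pi_y, p)
    for pi2 in pi2_values
    for p in grid
    for pi_y in grid
}

for key in sorted(relative_efficiency):
    print(key, round(relative_efficiency[key], 3))
```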
5. Conclusions
To improve survey data quality, we suggest a conditional indirect survey method in which only the respondents who answer "yes" to the direct, less sensitive question respond indirectly to the more sensitive one, using the unrelated question randomized response technique. We compare it with the methods of Greenberg et al. and Carr et al. In general, our method is more efficient than Greenberg et al.'s method, even though its procedure is somewhat more complicated. The suggested method reduces to Greenberg et al.'s one-sample unrelated question technique if we let $\pi_1 = 1$. Hence the suggested method is a generalized form of Greenberg et al.'s one-sample unrelated question technique, and that technique is a special case of the suggested method.
Compared with Carr et al.'s method, the suggested method is more effective as $\pi_2$ decreases and as $p$ and $\pi_y$ increase.
We also consider the case in which respondents who are confronted with a direct question tell the truth with probability $T$ $(0 < T < 1)$.
We extend our method to a two-sample conditional indirect survey method for the case in which the true proportion of the unrelated character Y is not known.
<Table 1> Efficiency comparison between the two methods, the suggested one-sample conditional indirect survey method and Carr et al.'s method (indexed by $\pi_2$ and $\pi_y$).