
Examining Data Quality from the Evidence and Data for Gender Equality Pilot Surveys

This chapter provides an assessment of the quality of the Evidence and Data for Gender Equality (EDGE) pilot survey operations and of the data collected to measure ownership of assets. The first section documents the steps undertaken before, during, and after survey field operations to ensure high data quality. The second section provides a quantitative assessment of the accuracy and precision of the survey estimates.

To evaluate accuracy, select indicators estimated from the EDGE pilot surveys are compared with those from external data sources to assess the representativeness of the surveys. Confidence intervals of the survey estimates are also presented as a measure of precision.

4.1  Quality Control Pre-Survey Field Operations

4.1.1 Questionnaire Design

The customization of the country survey questionnaires and the instructions manuals was based on the standard Asian Development Bank (ADB)–EDGE survey instruments. This was done mainly in-house by officials of the national statistics offices (NSOs) of the three countries, in close consultation with ADB and the United Nations Statistics Division (UNSD). During customization, no additional questions or modules were added to the country EDGE questionnaires. However, some amendments to the standard ADB–EDGE questionnaire were made, such as modifications to the wording of questions or to the response categories. Questions and modules that were not considered relevant to the country context were deleted; e.g., the National Statistics Office of Georgia (Geostat) dropped the module on small agricultural equipment and the questions on tenure status of dwelling and parcel.

The pilot countries also included other types of large agricultural equipment in their questionnaires.

In the questionnaire for Georgia, a distinction was made between hand tractors, mini tractors, and tractors with larger capacity. In Mongolia, other types of equipment were added: combines, horse and tractor rakers, jatka (combined header and harvesting equipment), manure fertilizer spreaders, irrigation systems, porters, and rickers. Respondents were also given the option to list other large agricultural equipment they may have.

Photographs of different agricultural equipment were also included in the enumerators' manual of instructions to facilitate the enumerators' and respondents' correct identification of large and small agricultural equipment.

To facilitate field interviews, the questionnaires were translated into the local languages.

The translation of the questionnaire and instructions manual was undertaken mostly by staff of the NSOs.

A translator was hired by the Georgian survey team to assist with the translation of the questionnaire and manual. In all three countries, the translation process did not reveal any major problems. In some cases, the NSOs sought the help of experts from other specialized institutions in their countries to translate specific English technical terms used in the survey and to find the most appropriate words in their languages.

The questionnaires were pretested prior to the pilot survey to assess their applicability in each country's context. More specifically, pretesting was done to gauge the clarity of the questions and skip patterns and to evaluate the operational feasibility of conducting the interviews. Pretesting of the questionnaire and instructions manual was done in one round in Georgia and in two rounds in Mongolia and in Cavite, Philippines. The pretesting was conducted from May 2015 to August 2015 in the three countries.

The pretesting team consisted of officials and staff from the NSOs, field supervisors, and enumerators.

In all three countries, ADB representatives also participated in the pretesting operations. Based on the pretest results, the questionnaires and instructions manuals were revised in line with each country's context.

4.1.2 Training and Organization of Fieldwork

Well-trained field staff are essential for collecting high-quality survey data. The training was organized in two phases: the first phase was for the trainers, while the second phase was for the enumerators and supervisors. The training of trainers (TOT) in each country was facilitated by resource persons from ADB and UNSD. The trainers, who were mostly staff of the three NSOs, conducted the second phase of training. Training duration varied from 2 to 5 days.

In each country, the TOT was primarily a classroom discussion covering the survey goals, questionnaire design, main concepts, definitions, and survey procedures. Clarifications were also made on specific topics that were unclear to, or misinterpreted by, the trainers; this was essential for conveying the concepts to the field operations staff. In Georgia, role-playing sessions were conducted for some modules, during which the enumerators interviewed each other. In Mongolia and the Philippines, however, no field trainings or mock interviews were done during the TOT, which focused instead on understanding the concepts, definitions, and items in the questionnaire.

The training of field enumerators and supervisors broadly comprised lectures, recapitulation, mock interviews, and field practice interviews in all countries. In Cavite, Philippines, written exercises were also conducted to evaluate participants' level of understanding of the survey's concepts and definitions. Participants carried out practice interviews in the field to familiarize themselves with the questionnaire and the data-gathering procedures, as well as to experience the actual environment of administering the survey questionnaires. A quiz session with 20 questions was also organized in Mongolia. In each country, the training ended with a debriefing session to discuss the issues observed in the field and to provide the necessary clarifications.

In Mongolia and Cavite, Philippines, a workshop was also organized for supervisors to discuss their roles and responsibilities during field operations: the process for distributing assignments; the protocols for nonresponsive households; data quality assurance, including the importance of reviewing questionnaires to ensure that all questions were asked and answers were recorded accurately; what to do if mistakes were found in completed questionnaires; how to deal with problems that might arise in the field; and maintaining contact with NSO headquarters.

As a result of these training activities, the questionnaires were modified to correct typographical errors, inconsistent skip patterns, and response categories. The instructions manuals were also revised for more clarity and better comprehension. Additional explanations and illustrations of possible field situations were provided. Survey concepts and questions that participants found more difficult were also documented during the training.

4.2  Quality Control During Survey Field Operations

4.2.1 Organization of Field Operations

As the pilot survey was the first of its kind for all three NSOs that participated in the project, the pre-survey preparation entailed learning new concepts. The three participating NSOs took the necessary quality control measures during the actual survey field operations.

The pilot surveys in the three countries were conducted from September 2015 to November 2015. Face-to-face interviews using paper questionnaires were used in all three countries to collect the data.

A team approach was used during field enumeration, wherein each field team consisted of one supervisor and two to four enumerators. This approach was considered suitable for implementing the survey protocol, which required interviewing all available respondents in a selected household simultaneously. Simultaneous interviews ensured that individuals responded independently, without influence from the presence or intervention of other household members.

In Mongolia, all enumerators and supervisors were centrally recruited as contractual staff for carrying out the field operations. In Georgia, the field teams were experienced contractual staff recruited at the regional level, while in Cavite, Philippines, the recruitment was done at the provincial level.

In Georgia and in Cavite, Philippines, while the enumerators were hired on contract, field supervision was undertaken by the regular staff of the two NSOs.

As part of the quality assurance of field operations, the team supervisors checked the completed questionnaires and advised the enumerators to correct any errors found. During the debriefing sessions, supervisors provided feedback on the inconsistencies or errors seen in the filled-out questionnaires, including omissions, improper recording of responses, and failure to follow the prescribed skip patterns and instructions. Where deemed necessary, sample households were revisited to allow for the correction of errors identified by the supervisors. Apart from these field supervisors, a team of officials from the headquarters of each country's NSO monitored the overall field operations and data quality control. In Mongolia, monitoring teams from the headquarters undertook field visits to check the fieldwork of each team of enumerators at the beginning of the fieldwork. This facilitated direct feedback from the headquarters team to ensure that the enumerators and supervisors correctly understood the survey concepts, instructions, and field protocols. To further ensure the quality of data reported by enumerators in Georgia, a special field monitoring team of Geostat conducted additional interviews, using a special questionnaire, in two households previously interviewed by each field enumerator.

After completing the monitoring activity, the filled-out actual and special questionnaires were compared for every household and checked for consistency.

Standard protocols set by each participating country on the safekeeping, validation, and processing of accomplished survey questionnaires were carefully followed.

4.2.2 General Sampling Design

The pilot surveys followed a multi-stage stratified sampling design. In Georgia and in Cavite, Philippines, a village or a cluster of villages constituted the primary sampling unit (PSU), while a household within each PSU made up the second-stage sampling unit (SSU).

In Mongolia, the design was extended at the first stage by selecting provinces within the different regions, leading to a three-stage selection process. The aimags within the four regions, and Ulaanbaatar city as the fifth region, constituted the primary sampling units (PSUs), while the bags and khesegs within the selected aimags and Ulaanbaatar city, respectively, comprised the second-stage sampling units (SSUs). The households within the selected bags and khesegs constituted the ultimate stage units (USUs). In all three countries, households within each PSU were stratified into two groups based on the number of adult members: one for households with three or more adults (second-stage stratum 1 [SSS-1] or ultimate-stage stratum 1 [USS-1]) and another for households with at most two adult members (second-stage stratum 2 [SSS-2] or ultimate-stage stratum 2 [USS-2]). For households in SSS-2 or USS-2, all adult members were selected for interview with a probability equal to one. For households in SSS-1 or USS-1, the primary respondent and his/her spouse were selected with a probability equal to one, and a third member was randomly selected from the remaining adult household members.
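To make this selection rule concrete, here is a minimal Python sketch. The household record fields ('adults', 'primary', 'spouse') are illustrative assumptions, not the pilot surveys' actual data layout.

```python
import random

def select_respondents(household):
    """Select interviewees per the rule described above.

    `household` is a dict with an 'adults' list of member IDs and
    'primary'/'spouse' entries (spouse may be None); these field
    names are illustrative, not the surveys' actual layout.
    """
    adults = list(household["adults"])
    if len(adults) <= 2:
        # SSS-2 / USS-2: all adult members are interviewed with certainty.
        return adults
    # SSS-1 / USS-1: the primary respondent and spouse are taken with
    # probability one; one more adult is drawn at random from the rest.
    selected = [household["primary"]]
    if household.get("spouse") is not None:
        selected.append(household["spouse"])
    remaining = [a for a in adults if a not in selected]
    selected.append(random.choice(remaining))
    return selected
```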

In Mongolia and Cavite, Philippines, where 16 households were targeted per PSU, eight households were selected from each second-stage stratum. This approach was expected to provide enough sample households with at least three adults and to achieve an adequate number of households with a principal couple, an important unit of analysis for the survey. A purely random selection of households could have led to an inadequate number of households with a principal couple, as observed in the sampling designs adopted in previous studies (e.g., the Uganda MEXA EDGE survey).

Achieving second- or ultimate-stage stratification required a recent or updated listing of households with information on the number of adults per household. Ideally, a fresh listing of all households in each selected PSU was desirable for this purpose; however, such an extra listing required additional resources. In Mongolia, the information on the number of adults in the selected PSUs was available in the population database, which served as the sampling frame and is expected to be updated dynamically. In Georgia, the 2014 population census frame (nearly 1 year old at the time of fieldwork) was used to obtain the information on the number of adults in the selected PSUs. In Cavite, Philippines, however, a fresh household listing was undertaken 2 months before fieldwork to gather information on the number of adults per household in each of the selected PSUs.

With the available data on the number of adults in the households of the selected PSUs, the households in each selected PSU were divided into SSS-1 and SSS-2 (or USS-1 and USS-2) to select the desired number of sample households from each stratum.

In Georgia, the PSUs were selected with probability proportional to size (PPS), and the second-stage units were selected following circular systematic sampling (CSS) with a random start. In Mongolia, the PSUs as well as the SSUs were selected with PPS, and the ultimate-stage units were selected following CSS with a random start. In the Philippines, where the survey was limited to one province (Cavite), both the PSUs and the SSUs were selected following CSS with a random start. In each country, the sample PSUs/SSUs in each stratum were drawn in the form of independent sub-samples with a view to generating unbiased estimates of the variance of the estimated parameters irrespective of the sampling design adopted.
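As an illustration of the household-selection step, here is a minimal Python sketch of circular systematic sampling with a random start; it is a generic textbook implementation, not the NSOs' production code.

```python
import random

def circular_systematic_sample(units, n):
    """Draw n of the N listed units by circular systematic sampling.

    A random start is chosen anywhere in the list and selection proceeds
    at a fixed (possibly fractional) interval, wrapping around the end of
    the list, so every unit has inclusion probability n/N.
    """
    N = len(units)
    interval = N / n               # sampling interval k = N/n
    start = random.randrange(N)    # random start r in {0, ..., N-1}
    return [units[(start + int(j * interval)) % N] for j in range(n)]
```

Within each sample PSU, such a routine would be applied separately to the SSS-1 and SSS-2 household lists (e.g., to draw eight households per stratum in Mongolia and Cavite).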

4.3  Quality Control Post-Survey Field Operations

4.3.1 Data Processing

In large-scale surveys, scrutiny and checking of the data collected through interviews are an essential part of data processing to ensure internal consistency.

Once the data were collected in the field through the survey instruments, the supervisors manually checked the completeness and consistency of the entries in the questionnaires. After manual checking and the necessary corrections, the questionnaires were submitted to each NSO's central office for data processing. A series of data processing activities was carried out to ensure the quality and consistency of the data. Prior to data entry, preliminary checks were made by the NSO staff at their headquarters. This step included coverage checks, to ensure that the questionnaires for all sample households on the sample list had been received, and checks for the completeness of each questionnaire's identification information.

In general, the entered data were subjected to different validation checks based on a series of rules for data consistency within and across survey questionnaires. These included checks for completeness of data entry for all survey questionnaires for which data were collected, as well as checks for duplicate records, omissions, range, skips, consistency, and typographical errors. A range check ensures that every variable in the survey lies within a limited domain of valid values. A skip check verifies whether the skip patterns and codes have been followed appropriately. A consistency check verifies that the values from one question are consistent with the values from another question. A typographic check detects figures that were transposed when encoded.
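The sketch below illustrates how such rule-based checks can be expressed in Python; the variable names and valid ranges are hypothetical stand-ins, not the pilot surveys' actual edit rules.

```python
import pandas as pd

def run_edit_checks(df):
    """Return (rule, row index) pairs for records failing example checks.

    Column names ('age', 'owns_dwelling', 'acquisition_year') and the
    valid ranges are illustrative, not the surveys' actual edit rules.
    """
    errors = []
    # Range check: every variable must lie in its valid domain.
    for i in df.index[(df["age"] < 0) | (df["age"] > 120)]:
        errors.append(("range: age outside 0-120", i))
    # Skip check: items after a 'no' filter answer must be blank.
    for i in df.index[(df["owns_dwelling"] == "no") & df["acquisition_year"].notna()]:
        errors.append(("skip: acquisition_year should be blank", i))
    # Consistency check: related answers must agree across questions.
    for i in df.index[df["acquisition_year"] > 2015]:  # 2015 = survey year
        errors.append(("consistency: acquisition after survey year", i))
    return errors
```

Each flagged (rule, record) pair can then be routed back for review against the paper questionnaire.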

A double data entry procedure was implemented in the three countries, wherein the questionnaires were entered into two different databases and compared. If any inconsistencies were found, the data encoder made the necessary corrections.
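A minimal sketch of the comparison step, assuming both databases share a common questionnaire key and an identical column layout (names here are illustrative):

```python
import pandas as pd

def double_entry_discrepancies(first, second, key="questionnaire_id"):
    """List (questionnaire, variable) cells where the two independently
    encoded databases disagree, for re-checking against the paper form.
    Assumes both frames hold the same questionnaires and columns.
    """
    a = first.set_index(key).sort_index()
    b = second.set_index(key).sort_index()
    # A cell mismatches unless both entries are equal or both are blank.
    mismatch = (a != b) & ~(a.isna() & b.isna())
    stacked = mismatch.stack()
    return list(stacked[stacked].index)
```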

Data cleaning code for each module, written in Stata, was created by the ADB–EDGE team to validate the data from the NSOs. A list of errors was generated per module and sent to the NSOs for review and action.

Cleaned datasets were then provided after each round of data validation. A regional workshop was also conducted for all participating countries; it served as a venue to sort out the problems encountered during data cleaning and validation.

4.3.2 Calculation of Sampling Weights

In statistical surveys, a survey sampling weight refers to the total number of units in the target population that each sampled unit represents. In the context of these pilot surveys, the target population corresponds to all adult household members in a specific country or geographic area.
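Formally, if sampled unit $k$ has overall inclusion probability $\pi_k$, its design weight and the resulting estimator of a population total $Y$ take the standard Horvitz–Thompson form:

$$ w_k = \frac{1}{\pi_k}, \qquad \hat{Y} = \sum_{k \in s} w_k \, y_k, $$

so a weighted count of sampled owners, for instance, estimates the number of owners in the whole adult population.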

As discussed earlier, the survey protocol required interviewing selected respondents within each household separately, asking them to provide information about assets that they own either exclusively or jointly with others, as well as proxy information on assets that other adult household members own. In principle, an unbiased estimate of any desired population parameter (e.g., the proportion of the adult household population owning a specific type of asset) can be derived using self-reported information from all survey respondents. However, complexity arises from the proxy information about assets that other household members own, as reported by survey respondents. The rationale for collecting proxy information was to provide a comprehensive inventory of all assets owned by all household members. Since the proxy information provided by two different respondents could refer to assets owned by one and the same person, proxy reports cannot be treated independently. Chapter 2 already discussed ownership assigned by any respondent (OAAR) and self-assigned ownership (SAO), two methodological approaches for analyzing proxy and self-reported information, respectively. The appropriate sampling weight depends on the estimation approach under consideration. This section discusses how the survey sampling weights were calculated under the two approaches.

Sampling Weights for Ownership Assigned by Any Respondent Approach

Under the OAAR approach, the combination of self-reported and proxy information provided by the respondents constitutes household-level information. More specifically, the OAAR approach follows the broadest definition of ownership: as long as an adult household member is identified as an owner of an asset by at least one respondent, that person is considered an owner.
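In processing terms, the OAAR owner flag is a logical OR across all respondent reports about a given member and asset. A minimal pandas sketch, with illustrative column names that are not the surveys' actual variable names, follows:

```python
import pandas as pd

def oaar_owner_flags(reports):
    """Collapse person-level reports into OAAR ownership flags.

    `reports` holds one row per (respondent, household member, asset)
    with a boolean 'reported_owner' column; column names are
    illustrative. A member is an OAAR owner of an asset if at least
    one respondent reports him or her as an owner.
    """
    keys = ["household_id", "member_id", "asset_type"]
    return (reports.groupby(keys)["reported_owner"]
                   .any()
                   .reset_index(name="is_owner_oaar"))
```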

Hence, the calculation of survey weights for estimating population parameters under the OAAR approach is akin to how survey weights are calculated in typical household surveys. The weight formulas for the three countries are given below.


Equation 1: Two-Stage PPS Design (Georgia)

$$ w_{his} = \frac{M_h}{n_h \, M_{hi}} \times \frac{H_{his}}{h_{his}} $$

where $n_h$ is the number of sample PSUs in stratum $h$; $M_{hi}$ is the measure of size (number of households) of PSU $i$; $M_h = \sum_i M_{hi}$ is the total measure of size of stratum $h$; $H_{his}$ is the number of households listed in second-stage stratum $s$ of PSU $i$; and $h_{his}$ is the number of sample households selected from that stratum.

Equation 2: Three-Stage PPS Design (Mongolia)

$$ w_{hijs} = \frac{M_h}{n_h \, M_{hi}} \times \frac{M_{hi}}{n_{hi} \, M_{hij}} \times \frac{H_{hijs}}{h_{hijs}} $$

where $n_h$ and $M_{hi}$ are as in Equation 1, with the strata $h$ being the five regions; $n_{hi}$ is the number of sample SSUs (bags/khesegs) drawn in PSU $i$; $M_{hij}$ is the measure of size of SSU $j$; and $H_{hijs}$ and $h_{hijs}$ are the listed and sample households in ultimate-stage stratum $s$ of SSU $j$. The first two factors are the inverse PPS selection probabilities of the sample aimag and the sample bag/kheseg, respectively, and the last factor is the inverse CSS selection probability of the household.

Equation 3: Two-Stage CSS Design (Philippines)

$$ w_{is} = \frac{N}{n} \times \frac{H_{is}}{h_{is}} $$

where $N$ is the total number of PSUs in Cavite; $n$ is the number of sample PSUs; $H_{is}$ is the number of households listed in second-stage stratum $s$ of PSU $i$; and $h_{is}$ is the number of sample households selected from that stratum.
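A minimal sketch of the base-weight computation implied by Equation 1; the symbols follow the definitions above, and the function name is illustrative:

```python
def georgia_base_weight(M_h, n_h, M_hi, H_his, h_his):
    """Base design weight under Equation 1 (two-stage PPS, Georgia).

    M_h   : total measure of size of first-stage stratum h
    n_h   : number of sample PSUs drawn in stratum h
    M_hi  : measure of size of sample PSU i
    H_his : households listed in second-stage stratum s of PSU i
    h_his : sample households drawn from that stratum
    """
    p_psu = n_h * M_hi / M_h   # PPS inclusion probability of the PSU
    p_hh = h_his / H_his       # CSS selection probability of the household
    return 1.0 / (p_psu * p_hh)
```

In practice, such base weights are typically adjusted further, e.g., for household nonresponse.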
