Approaches to Weighting Data from Dual-Frame Surveys Darren Pennay
Darren Pennay, Social Research Centre Pty Ltd
Michele Haynes, Mark Western, Bernard Baffour, Institute for Social Science Research, UQ
Social Research Centre Workshop (17 July 2012)

PHONE COVERAGE IN AUSTRALIA
• Currently 22.5 million people reside in Australia
  – 16 million adults
  – 8.5 million households.
• Approximately 19% of adults (3 million) live in mobile phone only households in Australia¹.
• An increase of almost 5% in 12 months.
• Landline sampling frames result in non-coverage of 21% of households.
¹ Australian Communications and Media Authority (2011)
CRICOS Provider No 00025B

DEMONSTRATION SURVEY 2010
• First Australian dual-frame survey conducted in September 2010
  – reported at 2010 ACSPRI Social Science Methodology Conference
• This was a Demonstration Survey implemented by SRC & ISSR.
• Questions on demographics and social issues sourced from reputable questionnaires and scales.
• Sampling design used 2 telephone sampling frames
  – A landline RDD sample (400 interviews)
  – A mobile phone sample (300 interviews).
• Participants from each frame were asked whether they used
  – Landline only, Landline mostly, Mixed, Mobile mostly, Mobile only.

                    Landline Sample                               Mobile Sample
Telephone usage     Total (LL)   Landline & Mobile   LLO          Total      Mobile & LL   MPO
                    (n=400)      (n=342)             (n=58)       (n=300)    (n=215)       (n=83)
                    %            %                   %            %          %             %
                    (a)          (b)                 (c)          (d)        (e)           (f)
Landline only       28.1         -                   100.0        -          -             -
Landline mostly     31.0         43.1                -            18.2       23.6          -
Mixed               29.1         40.5                -            24.4       31.7          -
Mobile mostly       11.8         16.4                -            32.9       42.8          -
Mobile only         -            -                   -            22.7       -             100.0
Not determined      -            -                   -            1.9        1.9           -

Of those interviewed via the landline sample, only 11.8% use 'mobile mostly'.
Of the landline sample dual users, only 16.4% use their 'mobile mostly'.

• Sample profile showed considerable differences in characteristics of those who primarily used mobile and landline phones.
• Compared to landline users, mobile phone users are more likely to be
  – Younger, living in a capital city, born outside Australia, renting, living in a group household, students, employed.
• Conclusion from demonstration dual-frame survey
  – 80% of the adult population have mobile phones... BUT
  – Dual-user respondents from landline and mobile frames are very different
  – Low chance of 'mobile mostly' user from a landline sampling frame
  – Response mechanism differs for landline & mobile sampling frames.
• Has implications for combining estimates.

THE OMNIBUS SURVEY 2011
• The larger omnibus survey was administered by SRC in Dec 2011.
• The sampling design again used 2 telephone sampling frames:
  – A landline RDD sample proportionally stratified by geographical location across Australia (1,012 interviews)
  – A mobile phone RDD sample (1,002 interviews).
• Survey questions were provided by subscriber organisations.
• Response rates were 22.2% for the landline frame and 12.7% for the mobile phone frame: people with mobile phones are less accessible.
• Once again, the sample profile was very different for the landline and mobile frame dual-users.

SAMPLE PROFILE

                            ABS Pop'n   Landline Frame                Mobile Frame
Selected Characteristics    2011        Landline only   Dual-user     Mobile only   Dual-user
                                        (n=174)         (n=838)       (n=295)       (n=707)
                            %           %               %             %             %
Gender
  Male                      49.3        35.1            36.9          56.9          50.4
  Female                    50.7        64.9            63.1          43.1          49.6
Age group (years)
  18-24                     12.8        2.3             3.7           23.1          19.0
  25-39                     28.3        6.9             17.1          48.1          26.9
  40-49                     18.0        6.3             20.9          8.8           18.8
  50-64                     24.0        24.7            32.7          15.6          27.2
  65+                       16.9        59.8            25.7          4.4           8.2
Time in neigh'hood
  5 years or less           -           17.2            23.4          73.2          41.9

APPROACHES TO WEIGHTING
• Purpose: to combine the samples from each frame to produce unbiased estimates for all adults that can be reached by telephone.
• A dual frame estimator needs to combine data from the
  – landline only sample
  – mobile phone only sample, and
  – the overlapping sample from both frames.

3-Stage Weighting Strategy
1. Design weights
   To adjust for unequal probabilities of selection (inverse of probability of selection)
2. Poststratification weights
   To adjust for non-response bias in the population
3. Composite weights
   To produce a weighted average of the two dual-user sample estimates.

THE WEIGHTS
1. Design weights
The probability that an individual is selected into the sample depends on their probability of being in the landline sample or mobile phone sample, less the probability of being in both (Best, 2010):

\[ P_{ind} = \left( \frac{S_{LL}}{U_{LL}} \times \frac{LL}{AD} \right) + \frac{S_{MP}}{U_{MP}} - \left( \frac{S_{LL}}{U_{LL}} \times \frac{LL}{AD} \right) \times \frac{S_{MP}}{U_{MP}} \]

• \(U_{LL}\) = 7,228,117: estimated universe of residential phones
  \(U_{MP}\) = 15,334,107: estimated universe of mobile phones
  \(S_{LL}\) = 1,012: number of interviews with landline phones
  \(S_{MP}\) = 1,002: number of interviews with mobile phones
  \(LL\) = number of landlines in household
  \(AD\) = number of in-scope adults in household
In Australia, we can only estimate the 'universe' from ABS and ACMA reports.

STAGE 2 WEIGHTS
2. Poststratification weights (by raking)
  – Have shown that non-response mechanism differs by sampling frame
  – BUT ... data needed for poststratification by telephone usage domain is not available in Australia
  – Have poststratified to population characteristics only
      Gender: male, female
      Location: capital city, rest of state or territory
      Age x education: 5 age categories by tertiary degree (or not)
      Birthplace: English speaking background or not
      Telephone status: mobile only (19%), dual user (72%), landline only (9%)
• Do not account for non-response due to inaccessibility.
• Is there an alternative approach when domain characteristics are unknown?

STAGE 3 WEIGHTS
3. Composite Weights
  – A weighted average of the dual-user estimates from each frame
  – We have poststratified and then averaged (rather than the other way round)
  – Both approaches are unbiased and consistent in the absence of nonsampling errors (Brick, 2011).
A = landline frame, B = mobile phone frame
• The composite estimator is

\[ \hat{y} = \hat{y}_a + \hat{y}_b + \hat{y}_{ab}, \quad \text{where} \quad \hat{y}_{ab} = \lambda \hat{y}_{ab}^{A} + (1 - \lambda) \hat{y}_{ab}^{B} \]

and \(\hat{y}_{ab}^{A}\) and \(\hat{y}_{ab}^{B}\) are non-response adjusted estimators for dual-users from frame A and frame B, respectively.

Choice of λ, the compositing factor
• Lambda can be fixed or vary with the quantity being estimated.
• Most researchers use a fixed λ = 0.5, as the probability of selecting a person in sample A is similar to the probability of selection in sample B.
• We also use λ = 0.68, where the probability of selecting a person from the landline frame is twice as high, relative to the mobile frame.
Remember: \(U_{LL}\) = 7,228,117, \(U_{MP}\) = 15,334,107, 2.2 adults per household.
Is

\[ \lambda = \frac{S_{LL} / U_{LL}}{S_{LL} / U_{LL} + S_{MP} / U_{MP}} = 0.68 \]

OR

\[ \lambda = \frac{S_{LL} / (2.2 \times U_{LL})}{S_{LL} / (2.2 \times U_{LL}) + S_{MP} / U_{MP}} = 0.49 \]

OPTIMAL LAMBDA
• Or could calculate an optimal value of λ which minimises the variance of the quantity \(\hat{Y}\) being estimated

\[ \hat{\lambda} = \frac{Var(\hat{Y}_{ab}^{B})}{Var(\hat{Y}_{ab}^{A}) + Var(\hat{Y}_{ab}^{B})} \]

• So λ close to optimal will have a small effect on the variance, but the bias may be more sensitive.
• Brick et al. (2011)
  – show that λ influences the bias and the variance of the estimator
  – propose that an alternative is to choose the compositing factor to eliminate bias of the average estimator.
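The design-weight and compositing formulas above can be checked numerically. The following is a minimal Python sketch, not the authors' code: the function names and the has_mobile flag are hypothetical, and the only inputs taken from the slides are the frame sizes, the interview counts and the figure of 2.2 adults per household. The variance-ratio rule for λ is the one shown under OPTIMAL LAMBDA; it minimises the variance of the weighted average when the two dual-user estimates are treated as independent.

```python
# Inputs quoted in the slides.
U_LL = 7_228_117    # estimated universe of residential (landline) phones
U_MP = 15_334_107   # estimated universe of mobile phones
S_LL = 1_012        # interviews from the landline frame
S_MP = 1_002        # interviews from the mobile frame
ADULTS_PER_HH = 2.2


def selection_probability(landlines: int, adults: int, has_mobile: bool = True) -> float:
    """P(landline sample) + P(mobile sample) - P(both), as in the Stage 1 formula.

    has_mobile is a hypothetical flag (not in the slides) so that
    landline-only respondents get a zero mobile-frame term.
    """
    p_ll = (S_LL / U_LL) * (landlines / adults)
    p_mp = S_MP / U_MP if has_mobile else 0.0
    return p_ll + p_mp - p_ll * p_mp


def lambda_from_rates(adults_per_landline: float = 1.0) -> float:
    """Compositing factor as the landline share of the two sampling rates."""
    rate_ll = S_LL / (adults_per_landline * U_LL)
    rate_mp = S_MP / U_MP
    return rate_ll / (rate_ll + rate_mp)


def optimal_lambda(var_a: float, var_b: float) -> float:
    """Variance-minimising lambda for two independent dual-user estimates."""
    return var_b / (var_a + var_b)


def composite(y_a: float, y_b: float, y_ab_a: float, y_ab_b: float, lam: float) -> float:
    """y = y_a + y_b + lam * y_ab^A + (1 - lam) * y_ab^B."""
    return y_a + y_b + lam * y_ab_a + (1.0 - lam) * y_ab_b


if __name__ == "__main__":
    # Design weight for a dual user in a household with 1 landline and 2 adults.
    p = selection_probability(landlines=1, adults=2)
    print(f"selection probability = {p:.3e}, design weight = {1 / p:,.0f}")

    print(f"lambda, per phone number: {lambda_from_rates():.2f}")
    print(f"lambda, per adult:        {lambda_from_rates(ADULTS_PER_HH):.2f}")
```

With these inputs the first rule gives λ of about 0.68 and the second about 0.49, matching the two values quoted above.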
RESULTS FOR OMNIBUS SURVEY

Variable                                       Optimal Choice of λ
Sex                                            0.50
Age                                            0.49
Country of Birth                               0.55
Degree Status                                  0.49
Tenure                                         0.57
Living Arrangement                             0.53
Time in neighbourhood                          0.58
Anxiety or Depression                          0.57
Hours of TV watched                            0.47
Smoking Status                                 0.55
Belief in Climate Change                       0.50
Part-time work (under 35 hours per week)       0.55

RESULTS
Estimated proportion of adults by sex, age, degree & weighting scheme

                  Unweighted   Weighted and raked    Landline frame   Mobile frame   Composite   Composite
                               to total population   only (raked)     only (raked)   with λ=0.68 with λ=0.5
Sex
  Male            0.444        0.493                 0.404            0.572          0.488       0.502
  Female          0.566        0.507                 0.597            0.428          0.512       0.498
Age
  18-24           0.118        0.128                 0.042            0.204          0.119       0.134
  25-39           0.242        0.283                 0.187            0.368          0.293       0.301
  40-49           0.171        0.180                 0.198            0.164          0.163       0.160
  50-64           0.276        0.240                 0.282            0.203          0.232       0.226
  65-74           0.129        0.117                 0.184            0.057          0.130       0.122
  75+             0.065        0.052                 0.106            0.005          0.063       0.057
Degree status
  Degree +        0.328        0.185                 0.162            0.205          0.173       0.178

RESULTS/2
Estimated proportion (SEs) of adults by degree, employment, housing tenure & weighting scheme

                  Unweighted   Weighted by population   Landline frame   Mobile frame    Composite       Composite
                               total raking             only (raked)     only (raked)    with λ=0.68     with λ=0.5
Employment
  Employed        0.649        0.663 (0.012)            0.592 (0.017)    0.725 (0.016)   0.630 (0.013)   0.641 (0.013)
Tenure
  Own             0.378        0.338 (0.012)            0.489 (0.018)    0.203 (0.014)   0.340 (0.013)   0.322 (0.013)
  Mortgage        0.329        0.339 (0.012)            0.359 (0.018)    0.321 (0.017)   0.308 (0.013)   0.306 (0.013)
  Rent            0.293        0.323 (0.012)            0.152 (0.013)    0.476 (0.018)   0.352 (0.014)   0.372 (0.014)

RESULTS/3
Estimated proportion (SEs) of adults by employment, housing tenure & weighting scheme

                  ABS Census        Weighted by     Landline frame   Mobile frame    Composite        Composite
                  population 2011   raking          only (raked)     only (raked)    with λ=0.68      with λ=0.5
Employment
  Employed        Unavailable       0.663 (0.012)   0.592 (0.017)    0.725 (0.016)   0.630¹ (0.013)   0.641 (0.013)
Tenure
  Own             0.321             0.338 (0.012)   0.489 (0.018)    0.203 (0.014)   0.340 (0.013)    0.322 (0.013)
  Mortgage        0.349             0.339 (0.012)   0.359 (0.018)    0.321 (0.017)   0.308 (0.013)    0.306² (0.013)
  Rent            0.296             0.323 (0.012)   0.152 (0.013)    0.476 (0.018)   0.352 (0.014)    0.372³ (0.014)

¹,² Significantly different to estimate from raked data at p-value=0.06
³ Significantly different to estimate from raked data at p-value=0.008

RESULTS/4
Estimated proportion of adults by degree, employment, housing tenure & weighting scheme

                              Unweighted   Weighted by population   Composite     Composite    Composite
                                           total raking             with λ=0.68   with λ=0.5   with optimal λ
Anxiety or Depression
  Yes                         0.192        0.198                    0.199         0.204        0.207
Hours of TV watched
  5 hours or more             0.108        0.108                    0.113         0.114        0.114
Smoking Status
  Yes, daily                  0.154        0.173                    0.180         0.182        0.183
Believe in climate change
  Yes                         0.786        0.774                    0.769         0.778        0.778
Living arrangement
  Group household             0.088        0.010                    0.116         0.122*       0.124*
Time in neigh'hood
  Less than 5 years           0.366        0.338                    0.397         0.413*       0.421*

WHERE TO FROM HERE?
• There is little to no population information on characteristics of mobile phone usage in Australia.
• How can we improve on poststratification weights to account for nonresponse
  – Due to inaccessibility to mobile phone users?
  – In different telephone usage domains?
• Should the average estimator be applied before or after poststratification?
• What is the best choice of compositing factor for dual-frame telephone surveys in Australia?

REFERENCES
• Best, J. (2010). First-stage weights for overlapping dual frame telephone surveys. Presented at AAPOR's 65th Annual Conference, Chicago, IL.
• Brick, J.M., Cervantes, I.F., Lee, S. and Norman, G. (2011). Nonsampling errors in dual frame telephone surveys. Survey Methodology. 37(1), pp.1-12.
• Brick, J., Dipko, S., Presser, S., Tucker, C. and Yuan, Y. (2006). Nonresponse bias in a dual frame sample of cell & landline numbers. Public Opinion Quarterly. 70(5), pp.780-793.
• Lohr, S.L. (2010). Dual frame surveys: Recent developments and challenges. Proceedings of the 45th Meeting of the Italian Statistical Society.
• Lohr, S.L. and Rao, J.N.K. (2000). Inference from dual frame surveys. Journal of the American Statistical Association. 95(449), pp.271-280.
• Pennay, D.W. and Vickers, N. (2012). Dual-frame Omnibus Survey. Technical and Methodological Summary Report, Social Research Centre Pty Ltd, Melbourne, Australia.