Multivariate Stratified Sample Design

Pyong Namkung; Jae-Hyuk Choi; Yon-Hyong Kim
Professor, Department of Statistics, Sungkyunkwan University, 53 Myeongyun-dong 3-ga, Jongno-gu,
Seoul (110-745), Korea.
E-mail: namkung@skku.edu; leonash@skku.edu; yhkim@jj.ac.kr
The ordinary stratified sample design uses a single interest variable, but real surveys involve more than one.
The survey designer often faces a choice among several potential interest variables. While each variable
taken separately may be well suited to a particular purpose, most surveys are multipurpose, and the designer
needs a multivariate stratification scheme that represents a compromise among these purposes
(Kish & Anderson, 1978).
The multivariate stratified sample design differs from the ordinary stratified sample design in two steps.
The first step is to form the strata using several stratification variables. The second step is to determine
the total sample size and its optimal allocation, that is, the sample size of each stratum, using several
interest variables.
Forming strata from a single stratification variable is easy; forming them from several is not. The problem
resembles multivariate analysis tasks such as clustering, classification, and discrimination. The simplest
and most popular method is to stratify on one variable, the most important one, selected from the
multivariate stratification variables. Another method is multi-way stratification, which resembles a
multi-way contingency table: optimum points of stratification are chosen for two or three continuous
stratification variables (Skinner, Holmes & Holt, 1994).
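As a minimal sketch of multi-way stratification, the following Python fragment crosses the intervals of two continuous stratification variables into grid cells. The cut points, the function name `cross_strata`, and the small frame are hypothetical; choosing optimum cut points is a separate problem that is not attempted here.

```python
from bisect import bisect_right
from collections import Counter

def cross_strata(units, cuts_x, cuts_z):
    """Assign each (x, z) unit to a cell of a two-way stratification grid.

    cuts_x and cuts_z are sorted cut points (hypothetical values picked by
    the designer); a stratum label is the pair of interval indices.
    """
    return [(bisect_right(cuts_x, x), bisect_right(cuts_z, z)) for x, z in units]

# Hypothetical frame: two stratification variables for six units.
frame = [(5, 1.2), (12, 0.8), (40, 3.5), (55, 9.0), (8, 4.1), (70, 2.0)]
labels = cross_strata(frame, cuts_x=[10, 50], cuts_z=[2.5])
counts = Counter(labels)  # stratum sizes N_h for each occupied cell
```

With three intervals on the first variable and two on the second, the grid has at most six strata; empty or very small cells would normally be collapsed before allocation.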
There are five methods of allocation to strata in the multivariate stratified sample design. The first is
proportional allocation. The second is univariate allocation using one interest variable selected from the
multivariate interest variables. The third is compromise allocation, a weighted average of the stratum sample
sizes obtained from the individual allocations (Cochran, 1963; Chatterjee, 1967). The fourth is optimal
allocation for a loss function of characteristic values that combines the variances of all variables
(Kish, 1976; Sukhatme et al., 1984). The fifth is optimal allocation by mathematical programming
(Causey, 1983; Bethel, 1989; Khan & Ahsan, 2003; Garcia & Cortez, 2006).
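The first three allocations can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the univariate allocation is taken to be Neyman allocation with equal unit costs, the compromise is taken to be a weighted average of the per-variable Neyman allocations, and all numbers and function names are hypothetical. Allocations are left as real numbers; rounding to integers is omitted.

```python
def proportional(n, W):
    """Proportional allocation: n_h = n * W_h."""
    return [n * w for w in W]

def neyman(n, W, S):
    """Univariate (Neyman) allocation: n_h proportional to W_h * S_h."""
    total = sum(w * s for w, s in zip(W, S))
    return [n * w * s / total for w, s in zip(W, S)]

def compromise(n, W, S_by_var, weights=None):
    """Weighted average of the per-variable Neyman allocations."""
    p = len(S_by_var)
    weights = weights or [1.0 / p] * p          # equal weights by default
    allocs = [neyman(n, W, S) for S in S_by_var]
    return [sum(a * alloc[h] for a, alloc in zip(weights, allocs))
            for h in range(len(W))]

# Hypothetical design: three strata, two interest variables.
W = [0.5, 0.3, 0.2]            # stratum weights N_h / N
S = [[2.0, 4.0, 10.0],         # stratum standard deviations, variable 1
     [5.0, 5.0, 5.0]]          # stratum standard deviations, variable 2
n = 100

n_prop = proportional(n, W)
n_var1 = neyman(n, W, S[0])
n_comp = compromise(n, W, S)
```

Because the second variable has equal standard deviations in every stratum, its Neyman allocation coincides with the proportional one, so the compromise lies between the proportional allocation and the allocation that is optimal for the first variable.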
Modified Ideas
Most multivariate stratified sample designs do not consider the correlation among interest variables,
although such correlation exists in real data. We therefore take the correlation among interest variables
into account in the multivariate stratified sample design.
The first method is a compromise allocation weighted by correlation coefficients or covariances, and the
second is the optimal allocation for a loss function of characteristic values of the variance-covariance
matrix. When two variables are highly correlated, the variation of one can explain the variation of the
other, so interest variables with lower correlation are more important than the others.
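One way to turn this idea into weights is sketched below in Python. The text gives no formula, so the rule used here is an assumption made only for illustration: each variable's weight is proportional to one minus its mean absolute correlation with the other variables, so that weakly correlated variables count more. The function names, the correlation matrix, and the individual allocations are all hypothetical.

```python
def correlation_weights(R):
    """Weights that fall as a variable's mean absolute correlation rises.

    ASSUMPTION (not from the source): w_j is proportional to
    1 - mean over k != j of |r_jk|. Requires at least two variables.
    """
    p = len(R)
    raw = [1.0 - sum(abs(R[j][k]) for k in range(p) if k != j) / (p - 1)
           for j in range(p)]
    total = sum(raw)
    if total == 0:                      # all variables perfectly correlated
        return [1.0 / p] * p
    return [r / total for r in raw]

def weighted_compromise(allocs, weights):
    """Weighted average of individual allocations, stratum by stratum."""
    H = len(allocs[0])
    return [sum(w * a[h] for w, a in zip(weights, allocs)) for h in range(H)]

# Hypothetical correlation matrix for three interest variables.
R = [[1.0, 0.9, 0.1],
     [0.9, 1.0, 0.2],
     [0.1, 0.2, 1.0]]
# Hypothetical individual (per-variable) allocations of n = 100 to three strata.
allocs = [[20.0, 30.0, 50.0],
          [25.0, 30.0, 45.0],
          [50.0, 30.0, 20.0]]

w = correlation_weights(R)
n_h = weighted_compromise(allocs, w)
```

Here the first two variables are strongly correlated with each other, so the third variable, which neither of them explains well, receives the largest weight.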
The third method is to use, in mathematical programming, an objective function weighted by the importance
of the interest variables, where importance is measured by the error of estimation, the correlation
coefficient, or the covariances. The multivariate stratified sample design is used for multi-objective
surveys, in which the interest variables differ in importance, so such weights are needed.
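A weighted objective of this kind can be illustrated with a simple special case. The Python sketch below assumes the objective is the importance-weighted sum of the variances of the stratified means, with the finite population correction ignored and a fixed total sample size as the only constraint. Under those assumptions the Lagrange solution has the closed form n_h proportional to W_h * sqrt(sum_j a_j * S_hj^2). The importance weights `a` and all data are hypothetical; with further constraints, such as a precision target for each variable, a numerical solver would be needed instead.

```python
from math import sqrt

def weighted_objective(n_h, W, S_by_var, a):
    """Importance-weighted sum of the variances of the stratified means (no fpc)."""
    return sum(a_j * sum(w * w * s * s / m for w, s, m in zip(W, S, n_h))
               for a_j, S in zip(a, S_by_var))

def optimal_allocation(n, W, S_by_var, a):
    """Minimiser of weighted_objective subject to sum(n_h) = n."""
    H = len(W)
    d = [W[h] * sqrt(sum(a_j * S[h] ** 2 for a_j, S in zip(a, S_by_var)))
         for h in range(H)]
    total = sum(d)
    return [n * d_h / total for d_h in d]

# Hypothetical design: three strata, two interest variables.
W = [0.5, 0.3, 0.2]            # stratum weights N_h / N
S = [[2.0, 4.0, 10.0],         # stratum standard deviations, variable 1
     [5.0, 5.0, 5.0]]          # stratum standard deviations, variable 2
a = [0.7, 0.3]                 # hypothetical importance weights
n = 100

n_opt = optimal_allocation(n, W, S, a)
n_prop = [n * w for w in W]    # proportional allocation, for comparison
gain = weighted_objective(n_prop, W, S, a) - weighted_objective(n_opt, W, S, a)
```

The quantity `gain` is the reduction in the weighted objective relative to proportional allocation; it cannot be negative, because `n_opt` minimises the objective for the given total sample size.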
Additional methods are to minimize the weighted average of the design effects of the interest variables and
to use new, mutually uncorrelated variables such as component scores from principal component analysis.
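The weighted average design effect can be evaluated for any candidate allocation, as in the Python sketch below. It assumes the design effect of a variable is the variance of its stratified mean divided by the variance under simple random sampling of the same total size, ignores the finite population correction, and approximates the population variance by the within-stratum plus between-stratum components. The stratum means, weights, and function names are hypothetical.

```python
def design_effect(n_h, W, S, mu):
    """deff = V_stratified / V_srs for one variable (fpc ignored)."""
    n = sum(n_h)
    mean = sum(w * m for w, m in zip(W, mu))
    # Population variance approximated as within- plus between-stratum parts.
    total_var = sum(w * (s * s + (m - mean) ** 2) for w, s, m in zip(W, S, mu))
    v_str = sum(w * w * s * s / k for w, s, k in zip(W, S, n_h))
    return v_str / (total_var / n)

def mean_design_effect(n_h, W, S_by_var, mu_by_var, a):
    """Weighted average of the design effects over the interest variables."""
    deffs = [design_effect(n_h, W, S, mu) for S, mu in zip(S_by_var, mu_by_var)]
    return sum(a_j * d for a_j, d in zip(a, deffs)) / sum(a)

# Hypothetical design: three strata, two interest variables.
W = [0.5, 0.3, 0.2]
S = [[2.0, 4.0, 10.0], [5.0, 5.0, 5.0]]      # stratum standard deviations
mu = [[10.0, 20.0, 40.0], [3.0, 3.0, 3.0]]   # stratum means
a = [0.5, 0.5]                               # importance weights
n_h = [50.0, 30.0, 20.0]                     # proportional allocation, n = 100

deff_1 = design_effect(n_h, W, S[0], mu[0])
deff_2 = design_effect(n_h, W, S[1], mu[1])
deff_avg = mean_design_effect(n_h, W, S, mu, a)
```

In this example stratification helps the first variable, whose stratum means differ, and does nothing for the second, whose strata are identical, so the average sits between the two. A search over allocations could use `mean_design_effect` as its criterion.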
REFERENCES
[1] Bethel, J. (1989), "Sample Allocation in Multivariate Surveys", Survey Methodology, Vol. 15, No. 1,
47-57.
[2] Causey, B.D. (1983), "Computational Aspects of Optimal Allocation in Multivariate Stratified
Sampling", SIAM Journal on Scientific and Statistical Computing, Vol. 4, No. 2, 322-329.
[3] Chatterjee, S. (1967), "A Note on Optimum Allocation", Scandinavian Actuarial Journal, 50, 40-44.
[4] Cochran, W.G. (1963), "Sampling Techniques", John Wiley & Sons, Inc., New York.
[5] Garcia, J.A.D. and Cortez, L.U. (2006), "Optimum Allocation in Multivariate Stratified Sampling:
Multi-Objective Programming", Comunicación Técnica, No. I-06-06, 1-22.
[6] Khan, M.G.M. and Ahsan, M.J. (2003), "A Note on Optimum Allocation in Multivariate Stratified
Sampling", South Pacific Journal of Natural Science, Vol. 21, 91-95.
[7] Kish, L. (1976), "Optima and Proxima in Linear Sample Designs", Journal of the Royal Statistical
Society, Series A, Vol. 139, No. 1, 80-95.
[8] Kish, L. and Anderson, D.W. (1978), "Multivariate and Multipurpose Stratification", Journal of the
American Statistical Association, Vol. 73, No. 361, 24-34.
[9] Skinner, C.J., Holmes, D.J. and Holt, D. (1994), "Multiple Frame Sampling for Multivariate
Stratification", International Statistical Review, Vol. 62, No. 3, 333-347.
[10] Sukhatme, P.V., Sukhatme, B.V., Sukhatme, S. and Asok, C. (1984), "Sampling Theory of Surveys
with Applications", Iowa State University Press, Ames, IA.