Detecting Overdispersion in Large Scale Surveys: Application to a Study of Education and Social Class in Britain [article]
Published on the portal: 17-12-2002. Garrett M. Fitzmaurice, Anthony F. Heath, David R. Cox. Applied Statistics. 1997. Vol. 46. No. 4. P. 415-432.
A practical problem with large scale survey data is the potential for overdispersion. Overdispersion occurs when the data display more variability than is predicted by the variance-mean relationship for the assumed sampling model. This paper describes a simple strategy for detecting and adjusting for overdispersion in large scale survey data. The method is primarily motivated by data on the relationship between social class and educational attainment obtained from a 2% sample from the 1991 census of the population of Great Britain. Overdispersion can be detected by first grouping the data into a number of strata of approximately equal size. Under the assumption that the observations are independent and there is no variability in the parameter of interest, there is a direct relationship between the nominal standard errors and the empirical or sample standard deviation of the parameter estimates obtained from each of the separate strata. With the 2% sample from the British census data, quite a discernible departure from this relationship was found, indicating overdispersion. After allowing for overdispersion, improved and more realistic measures of precision of the strength of the social class-education associations were obtained.
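The strata-based check described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: it assumes the parameter of interest is a simple proportion, uses a binomial nominal standard error, and works on simulated data. The names `dispersion_ratio` and `simulate` are hypothetical helpers introduced only for this example.

```python
import math
import random
import statistics


def dispersion_ratio(strata):
    """Ratio of the empirical SD of per-stratum estimates to their nominal SE.

    Assumption: each stratum is a list of 0/1 outcomes and the parameter
    of interest is a proportion. A ratio near 1 is what independence and a
    constant parameter would predict; a ratio well above 1 suggests
    overdispersion.
    """
    estimates = [sum(s) / len(s) for s in strata]
    pooled = sum(sum(s) for s in strata) / sum(len(s) for s in strata)
    # Nominal binomial variance of each stratum estimate under the
    # assumed sampling model (independent observations, common proportion).
    nominal_vars = [pooled * (1 - pooled) / len(s) for s in strata]
    nominal_se = math.sqrt(statistics.mean(nominal_vars))
    empirical_sd = statistics.stdev(estimates)
    return empirical_sd / nominal_se


def simulate(n_strata, size, draw_p, rng):
    """Simulate strata of binary outcomes; draw_p gives each stratum's proportion."""
    return [
        [1 if rng.random() < p else 0 for _ in range(size)]
        for p in (draw_p(rng) for _ in range(n_strata))
    ]


rng = random.Random(0)
# Case 1: every stratum shares the same proportion (no overdispersion).
independent = simulate(50, 2000, lambda r: 0.4, rng)
# Case 2: the proportion varies between strata (overdispersion).
heterogeneous = simulate(50, 2000, lambda r: r.uniform(0.2, 0.6), rng)

ratio_independent = dispersion_ratio(independent)
ratio_heterogeneous = dispersion_ratio(heterogeneous)
print(f"ratio, constant proportion: {ratio_independent:.2f}")
print(f"ratio, varying proportion:  {ratio_heterogeneous:.2f}")
```

In this sketch the squared ratio could serve as a crude variance inflation factor for rescaling nominal standard errors; how the paper itself performs the adjustment is not stated in the abstract.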
Published on the portal: 24-06-2004. Charles S. Davis, Michael A. Stephens. Applied Statistics. 1989. Vol. 38. No. 3. P. 535-582.
Empirical distribution function (EDF) statistics for goodness of fit are based on a comparison of the hypothesized distribution function F(x) with the empirical distribution function Fn(x). When F(x) is continuous and completely specified, it has long been known that, in general, EDF statistics give more powerful tests of H0 than the classical X² test. This work has made it possible to use EDF statistics very easily when F(x) is completely specified and also for two practical situations.
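As an illustration of comparing F(x) with Fn(x), the sketch below computes one well-known EDF statistic, the Kolmogorov-Smirnov distance, for a completely specified continuous F(x). It is not the algorithm from the paper; the function name `ks_statistic` and the uniform example are assumptions made for this example.

```python
def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov statistic D = sup |Fn(x) - F(x)|.

    Assumption: cdf is a fully specified continuous distribution function.
    Because Fn is a step function, the supremum is attained at a sample
    point, just before or just after a jump.
    """
    xs = sorted(sample)
    n = len(xs)
    # Largest amount by which Fn exceeds F (evaluated just after each jump).
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    # Largest amount by which F exceeds Fn (evaluated just before each jump).
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)


def uniform_cdf(x):
    """Distribution function of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)


d = ks_statistic([0.1, 0.4, 0.7], uniform_cdf)
print(f"D = {d:.4f}")
```

For the three-point sample above, the largest gap occurs at x = 0.7, where Fn jumps to 1 while F(x) = 0.7, so D is 0.3 up to floating-point rounding.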