Journal of the American Statistical Association
Published on the portal: 20-07-2004
Malay Ghosh, Narinder Nangia, Dal Ho Kim. Journal of the American Statistical Association. 1996. Vol. 91. No. 436. P. 1423-1431.
This article develops a general methodology for small domain estimation based on data from repeated surveys. The results are directly applied to the estimation of median income of four-person families for the 50 states and the District of Columbia. These estimates are needed by the U.S. Department of Health and Human Services (HHS) to formulate its energy assistance program for low-income families. The U.S. Bureau of the Census, by an informal agreement, has provided such estimates to HHS through a linear regression methodology since the latter part of the 1970s. The current method is an empirical Bayes (EB) method that uses the Current Population Survey (CPS) estimates as well as the most recent decennial census estimates updated by the per capita income estimates of the Bureau of Economic Analysis. However, with the existing methodology, standard errors associated with these estimates are not easy to obtain. The EB estimates, when used naively, can lead to underestimation of standard errors. Moreover, because the sample estimates are collected through the CPS every year, there is a very natural time series aspect of the data that is currently ignored. We have performed a full Bayesian analysis using a hierarchical Bayes (HB) time series model. In addition to providing the median income estimates as the posterior means, we have also provided the posterior standard deviations. Included in our model is the information on the median incomes of three- and five-person families as well. In this way a multivariate HB procedure is used. The Bayesian analysis requires evaluation of high-dimensional integrals. We have overcome this problem by using the Gibbs sampling technique, which has turned out to be a very convenient tool for Monte Carlo integration. Also, we have validated our results by comparing them against the 1989 four-person median income figures obtained from the 1990 census. We used four different criteria for such comparisons.
It turns out that the estimates obtained by using a bivariate time-series model are the best overall. We use a criterion based on deviances for model selection and also provide a sensitivity analysis of the proposed hierarchical model.
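
To illustrate how Gibbs sampling yields posterior means and posterior standard deviations in a hierarchical Bayes setting, the following is a minimal sketch in Python. It is not the article's multivariate time series model. It assumes a much simpler univariate model in which each direct estimate y_i has a known sampling variance D_i, the true small-domain values theta_i share a common mean mu (flat prior) and a common variance A (inverse-gamma prior). The function name, the prior parameters a and b, and the data are all hypothetical and chosen only for illustration.

```python
import math
import random
import statistics


def gibbs_hierarchical_normal(y, D, n_iter=4000, burn_in=1000,
                              a=2.0, b=1.0, seed=0):
    """Gibbs sampler for a toy hierarchical normal model (an assumption,
    not the model of the article):

        y_i | theta_i ~ N(theta_i, D_i),  D_i known
        theta_i | mu, A ~ N(mu, A)
        mu ~ flat prior,  A ~ Inverse-Gamma(a, b)

    Returns posterior means and posterior standard deviations of theta_i.
    """
    rng = random.Random(seed)
    m = len(y)
    theta = list(y)                        # start at the direct estimates
    mu = statistics.fmean(y)
    A = statistics.pvariance(y) or 1.0     # fall back to 1.0 if all y are equal
    draws = [[] for _ in range(m)]

    for it in range(n_iter):
        # theta_i | rest: precision-weighted mix of y_i and mu
        for i in range(m):
            w = A / (A + D[i])
            mean = w * y[i] + (1.0 - w) * mu
            var = w * D[i]                 # equals A * D_i / (A + D_i)
            theta[i] = rng.gauss(mean, math.sqrt(var))

        # mu | rest: normal around the current average of theta
        mu = rng.gauss(statistics.fmean(theta), math.sqrt(A / m))

        # A | rest: inverse-gamma; drawn as the reciprocal of a gamma variate
        ss = sum((t - mu) ** 2 for t in theta)
        shape = a + m / 2.0
        rate = b + ss / 2.0
        A = 1.0 / rng.gammavariate(shape, 1.0 / rate)

        if it >= burn_in:
            for i in range(m):
                draws[i].append(theta[i])

    post_mean = [statistics.fmean(d) for d in draws]
    post_sd = [statistics.stdev(d) for d in draws]
    return post_mean, post_sd


# Made-up direct estimates and sampling variances for five small domains.
y = [10.2, 11.5, 9.1, 12.8, 10.0]
D = [1.0, 0.5, 2.0, 4.0, 0.8]
post_mean, post_sd = gibbs_hierarchical_normal(y, D)
for i, (pm, ps) in enumerate(zip(post_mean, post_sd)):
    print(f"domain {i}: posterior mean {pm:.2f}, posterior sd {ps:.2f}")
```

In this sketch the Monte Carlo averages of the retained draws stand in for the high-dimensional integrals, and the spread of the draws gives the posterior standard deviation directly, which is the quantity the abstract notes is hard to obtain under the existing method. Domains with larger sampling variance are pulled more strongly toward the common mean.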