The paper considers estimating a parameter beta that defines an estimating function U(y, x, beta) for an outcome variable y and its covariate x when the outcome is missing for some of the observations. We assume that, in addition to the outcome and the covariate, a surrogate outcome is available for every observation. The efficiency of existing estimators for beta depends critically on correctly specifying the conditional expectation of U given the surrogate and the covariate. When this conditional expectation is misspecified, which is the most likely scenario in practice, the efficiency of estimation can be severely compromised even if the propensity function (of missingness) is correctly specified. We propose an estimator, based on empirical likelihood, that is robust against the choice of the conditional expectation. We demonstrate that the proposed estimator achieves a gain in efficiency whether the conditional expectation is correctly specified or not. When it is correctly specified, the estimator reaches the semiparametric variance bound within the class of estimating functions that are generated by U. The practical performance of the estimator is evaluated by simulation and with a data set that is based on the 1996 US presidential election.
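The setting described in the abstract can be illustrated with a minimal Python sketch. It is not the empirical likelihood estimator proposed in the paper. It only shows the kind of existing estimators the abstract refers to, under several assumptions made purely for illustration: the estimating function is U(y, x, beta) = y - beta (so beta is the mean of y), the propensity function is known, the data are simulated, and the working model for the conditional expectation is a linear regression on the surrogate and the covariate. All function and variable names (`simulate`, `ipw_mean`, `aipw_mean`, `s`, `r`, `pi`) are hypothetical.

```python
import numpy as np


def simulate(n, rng):
    """Simulated data (illustrative assumption, not from the paper)."""
    x = rng.normal(size=n)                        # covariate
    y = 1.0 + 2.0 * x + rng.normal(size=n)        # outcome, true mean is 1.0
    s = y + rng.normal(scale=0.5, size=n)         # surrogate, always observed
    pi = 1.0 / (1.0 + np.exp(-(0.5 + 0.4 * s)))   # propensity of observing y
    r = rng.random(n) < pi                        # True where y is observed
    return x, s, y, r, pi


def ipw_mean(y, r, pi):
    """Solve sum_i (r_i / pi_i) * (y_i - beta) = 0 for beta."""
    w = r / pi
    y_obs = np.where(r, y, 0.0)                   # unobserved y never enters
    return np.sum(w * y_obs) / np.sum(w)


def aipw_mean(x, s, y, r, pi):
    """Augmented estimator using a working model for E[y | s, x]."""
    z = np.column_stack([np.ones_like(x), s, x])
    # Working model: least squares fit on the complete cases only.
    coef, *_ = np.linalg.lstsq(z[r], y[r], rcond=None)
    m = z @ coef                                  # fitted E[y | s, x] for all i
    y_obs = np.where(r, y, 0.0)
    return np.mean(r / pi * (y_obs - m) + m)


rng = np.random.default_rng(0)
x, s, y, r, pi = simulate(20000, rng)
print("complete-case mean:", y[r].mean())         # ignores the missingness
print("IPW estimate:      ", ipw_mean(y, r, pi))
print("augmented estimate:", aipw_mean(x, s, y, r, pi))
```

In this sketch the complete-case mean is biased because missingness depends on the surrogate, while both weighted estimators target the true mean. The efficiency of the augmented estimator depends on how well the working model approximates the conditional expectation, which is the sensitivity the abstract describes.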
empirical likelihood, estimating equations, missing values, surrogate outcome
Journal of the Royal Statistical Society: Series B: Statistical Methodology
Wiley: 12 months
CHEN, Song Xi; LEUNG, Denis H. Y.; and QIN, Jin.
Improving semiparametric estimation by using surrogate data. (2008). Journal of the Royal Statistical Society: Series B: Statistical Methodology. 70, (4), 803-823. Research Collection School Of Economics.
Available at: http://ink.library.smu.edu.sg/soe_research/1937
Copyright Owner and License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.