Publication Type
Journal Article
Version
publishedVersion
Publication Date
2011
Abstract
We propose an asymptotic likelihood-based LASSO approach for model selection in regression analysis when data are subject to validation sampling. The method makes use of an initial estimator of the regression coefficients and their asymptotic covariance matrix to form an asymptotic likelihood. This "working" objective function facilitates the formulation of the LASSO and the implementation of a fast algorithm. Our method circumvents the need to use a likelihood set-up that requires full distributional assumptions about the data. We show that the resulting estimator is consistent in model selection and that the method has lower prediction errors than a model that uses only the validation sample. Furthermore, we show that this formulation gives an optimal estimator in a certain sense. Extensive simulation studies are conducted for the linear regression model, the generalized linear regression model, and the Cox model. Our simulation results support our claims. The method is further applied to a dataset to illustrate its practical use.
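The idea in the abstract, penalizing a quadratic "working" objective built from an initial estimate and its covariance, can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names `soft_threshold` and `asymptotic_lasso`, the 1/2 scaling of the quadratic term, the optional per-coefficient `weights`, and the use of plain coordinate descent are all assumptions made for illustration. The sketch minimizes (1/2)(b - b0)' S^{-1} (b - b0) + lam * sum_j w_j |b_j|, where b0 is the initial estimator and S its estimated covariance matrix.

```python
import numpy as np


def soft_threshold(z, t):
    """Scalar soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def asymptotic_lasso(beta_init, cov, lam, weights=None, n_iter=200, tol=1e-10):
    """Hypothetical helper (not from the paper): coordinate descent for
        0.5 * (b - beta_init)' inv(cov) (b - beta_init) + lam * sum_j w_j * |b_j|.
    """
    beta_init = np.asarray(beta_init, dtype=float)
    precision = np.linalg.inv(np.asarray(cov, dtype=float))
    p = beta_init.size
    w = np.ones(p) if weights is None else np.asarray(weights, dtype=float)

    beta = beta_init.copy()  # start from the initial estimator
    for _ in range(n_iter):
        beta_old = beta.copy()
        for j in range(p):
            # Deviation of the other coordinates from the initial estimate.
            resid = beta - beta_init
            resid[j] = 0.0
            # Unpenalized stationarity condition for coordinate j, times A_jj.
            z = precision[j, j] * beta_init[j] - precision[j] @ resid
            beta[j] = soft_threshold(z, lam * w[j]) / precision[j, j]
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta


# Illustrative inputs only (made-up numbers, not data from the paper).
beta0 = np.array([2.0, 0.1, -1.5])
print(asymptotic_lasso(beta0, np.eye(3), lam=0.5))
```

With an identity covariance matrix the problem separates by coordinate, so the result is simply the soft-thresholded initial estimate, and the small middle coefficient is set exactly to zero. Only the initial estimate and its covariance enter the computation, which is why no full distributional model of the data is needed at this step.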
Keywords
Asymptotic likelihood-based LASSO, LASSO, least squares approximation, validation sampling
Discipline
Econometrics | Economics | Statistics and Probability
Research Areas
Econometrics
Publication
Statistica Sinica
Volume
21
First Page
659
Last Page
678
ISSN
1017-0405
Identifier
10.5705/ss.2011.029a
Citation
LENG, Chenlei and LEUNG, Denis H. Y.
Model Selection in Validation Sampling Data: An Asymptotic Likelihood-based LASSO Approach. (2011). Statistica Sinica. 21, 659-678.
Available at: https://ink.library.smu.edu.sg/soe_research/1333
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.5705/ss.2011.029a