Publication Type

Working Paper

Publication Date

3-2016

Abstract

In this paper we propose a smoothed Q-learning algorithm for estimating optimal dynamic treatment regimes. In contrast to the Q-learning algorithm, in which non-regular inference is involved, we show that, under the assumptions adopted in this paper, the proposed smoothed Q-learning estimator is asymptotically normally distributed even when the Q-learning estimator is not, and that its asymptotic variance can be consistently estimated. As a result, inference based on the smoothed Q-learning estimator is standard. We derive the optimal smoothing parameter and propose a data-driven method for estimating it. The finite sample properties of the smoothed Q-learning estimator are studied and compared with those of several existing estimators, including the Q-learning estimator, via an extensive simulation study. We illustrate the new method by analyzing data from the Clinical Antipsychotic Trials of Intervention Effectiveness-Alzheimer’s Disease (CATIE-AD) study.

Keywords

Asymptotic normality; Exceptional law; Optimal smoothing parameter; Sequential randomization; Wald-type inference

Discipline

Econometrics

Research Areas

Econometrics

First Page

1

Last Page

45

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Additional URL

http://www.colorado.edu/economics/seminars/SeminarArchive/2015-16/Fan.pdf

Included in

Econometrics Commons
