solidot新版网站常见问题,请点击这里查看。

In Defense of the Indefensible: A Very Naive Approach to High-Dimensional Inference. (arXiv:1705.05543v1 [stat.ME])

来源于:arXiv
In recent years, a great deal of interest has focused on conducting inference on the parameters in a linear model in the high-dimensional setting. In this paper, we consider a simple and very na\"{i}ve two-step procedure for this task, in which we (i) fit a lasso model in order to obtain a subset of the variables; and (ii) fit a least squares model on the lasso-selected set. Conventional statistical wisdom tells us that we cannot make use of the standard statistical inference tools for the resulting least squares model (such as confidence intervals and $p$-values), since we peeked at the data twice: once in running the lasso, and again in fitting the least squares model. However, in this paper, we show that under a certain set of assumptions, with high probability, the set of variables selected by the lasso is deterministic. Consequently, the na\"{i}ve two-step approach can yield confidence intervals that have asymptotically correct coverage, as well as p-values with proper Type-I erro 查看全文>>