A New Theory for Sketching in Linear Regression. (arXiv:1810.06089v1 [math.ST])
Source: arXiv
Large datasets create opportunities as well as analytic challenges. A recent
development is to use random projection or sketching methods for dimension
reduction in statistics and machine learning. In this work, we study the
statistical performance of sketching algorithms for linear regression. Suppose
we randomly project the data matrix and the outcome using a random sketching
matrix, reducing the sample size, and then run linear regression on the
resulting data. How much do we lose compared to the original linear regression? The
existing theory does not give a precise enough answer, and this has been a
bottleneck for using random projections in practice.
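The setup described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's method: it uses a Gaussian sketching matrix (one common choice; the paper studies several sketch families), and the problem sizes `n`, `p`, and `m` are arbitrary assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Full data: n samples, p features, with n >> p.
n, p, m = 5000, 20, 500          # m = reduced sample size after sketching
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p)
y = X @ beta_true + 0.5 * rng.standard_normal(n)

# Ordinary least squares on the full data.
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)

# Gaussian sketch: project both X and y down to m rows, then regress.
S = rng.standard_normal((m, n)) / np.sqrt(m)
beta_sketch, *_ = np.linalg.lstsq(S @ X, S @ y, rcond=None)

# The sketched estimate approximates the full OLS solution, at the cost
# of extra variance; quantifying that loss as a function of the aspect
# ratios p/n and m/n is the question the paper addresses.
print(np.linalg.norm(beta_sketch - beta_full))
```

Running this shows the sketched coefficients staying close to the full-data fit even though the regression sees only a tenth of the rows; shrinking `m` further makes the gap grow.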
In this paper, we introduce a new mathematical approach to the problem,
relying on very recent results from asymptotic random matrix theory and free
probability theory. This is a perfect fit, as the sketching matrices are random
in practice. We allow the dimension and sample sizes to have an arbitrary
ratio. We study the most popular sketch …