On Nonconvex Decentralized Gradient Descent. (arXiv:1608.05766v3 [math.OC] UPDATED)

Consensus optimization has received considerable attention in recent years. A number of decentralized algorithms have been proposed for \emph{convex} consensus optimization. However, our understanding of the behavior of \emph{nonconvex} consensus optimization is more limited. When we lose convexity, we cannot hope that our algorithms always return global solutions, though they sometimes still do. Somewhat surprisingly, the decentralized consensus algorithms DGD and Prox-DGD retain most of the other properties that are known in the convex setting. In particular, when diminishing (or constant) step sizes are used, we can prove convergence to a (or a neighborhood of a) consensus stationary solution and have guaranteed rates of convergence. It is worth noting that Prox-DGD can handle nonconvex nonsmooth functions if their proximal operators can be computed. Such functions include SCAD and $\ell_q$ quasi-norms, $q\in[0,1)$. Similarly, Prox-DGD can handle a constraint to a nonconvex set with an ea...
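
The abstract does not spell out the DGD update, so the following is only a minimal sketch under stated assumptions. It assumes the standard decentralized gradient descent iteration $x_i^{k+1} = \sum_j w_{ij} x_j^k - \alpha_k \nabla f_i(x_i^k)$ with a symmetric, doubly stochastic mixing matrix $W$. The local objectives $f_i(x) = \frac{1}{2}(x-a_i)^2 + 2\cos(x)$, the four-agent ring, the data `A`, and the step-size rule $\alpha_k = 0.2/\sqrt{k+1}$ are all hypothetical choices made for illustration and are not taken from the paper.

```python
import math

# Hypothetical problem data: agent i holds f_i(x) = 0.5*(x - a_i)^2 + 2*cos(x),
# which is smooth but nonconvex (its second derivative 1 - 2*cos(x) changes sign).
A = [-1.0, 0.0, 1.0, 2.0]


def grad_local(i, x):
    """Gradient of the assumed local objective f_i at the scalar point x."""
    return (x - A[i]) - 2.0 * math.sin(x)


# Assumed mixing matrix: symmetric and doubly stochastic, for a 4-agent ring.
W = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]


def dgd(x0, num_iters=5000, alpha0=0.2):
    """Run DGD with the diminishing step size alpha_k = alpha0 / sqrt(k + 1)."""
    x = list(x0)
    n = len(x)
    for k in range(num_iters):
        alpha = alpha0 / math.sqrt(k + 1)
        # Consensus step: each agent averages its neighbors' current iterates.
        mixed = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Local step: each agent uses the gradient at its own previous iterate.
        x = [mixed[i] - alpha * grad_local(i, x[i]) for i in range(n)]
    return x


def average_gradient(x_bar):
    """Gradient of the average objective (1/n) * sum_i f_i at x_bar."""
    return sum(grad_local(i, x_bar) for i in range(len(A))) / len(A)


if __name__ == "__main__":
    x = dgd([3.0, -2.0, 0.5, 1.0])
    x_bar = sum(x) / len(x)
    print("consensus spread:", max(x) - min(x))
    print("gradient of average objective at mean:", average_gradient(x_bar))
```

In this toy setting the spread between agents shrinks as the step size diminishes, and the mean iterate settles near a stationary point of the average objective. That point need not be a global minimizer, which is consistent with the abstract's caveat about losing convexity.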
