In this paper we demonstrate a simple heuristic adaptive restart technique that can dramatically improve the convergence rate of accelerated gradient schemes. The analysis of the technique relies on the observation that these schemes exhibit two modes of behavior depending on how much momentum is applied. In what we refer to as the 'high momentum' regime the iterates generated by an accelerated gradient scheme exhibit a periodic behavior, where the period is proportional to the square root of the local condition number of the objective function. This suggests a restart technique whereby we reset the momentum whenever we observe periodic behavior. We provide analysis to show that in many cases adaptively restarting allows us to recover the optimal rate of convergence with no prior knowledge of function parameters.
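The idea can be sketched in a few lines of code. Below is a minimal, illustrative implementation of Nesterov-style accelerated gradient descent with one common form of the adaptive restart heuristic: the momentum is reset whenever the momentum step points against the current gradient, a proxy for the onset of the periodic (oscillatory) behavior described above. The function name, step-size choice, and the particular restart test are assumptions for illustration, not details taken from the abstract.

```python
import numpy as np

def accelerated_descent_with_restart(grad, x0, step, iters=500):
    """Accelerated gradient descent with a heuristic adaptive restart.

    grad  : callable returning the gradient of the objective at a point
    x0    : starting point
    step  : fixed step size (e.g. 1/L for an L-smooth objective)
    """
    x = np.asarray(x0, dtype=float)
    y = x.copy()
    theta = 1.0  # momentum parameter, reset to 1 on restart
    for _ in range(iters):
        g = grad(y)
        x_next = y - step * g
        # Restart test: if the step x_next - x moves against the
        # descent direction, the momentum has overshot -- reset it.
        if np.dot(g, x_next - x) > 0:
            theta = 1.0
            y = x_next
        else:
            # Standard Nesterov momentum update.
            theta_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * theta**2))
            y = x_next + ((theta - 1.0) / theta_next) * (x_next - x)
            theta = theta_next
        x = x_next
    return x
```

For example, on an ill-conditioned quadratic f(x) = 0.5 xᵀAx with A = diag(1, 100), calling `accelerated_descent_with_restart(lambda v: A @ v, np.array([1.0, 1.0]), step=0.01)` drives the iterate toward the minimizer at the origin, with the restarts suppressing the oscillations that plain accelerated descent would exhibit.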