2017-06-13
Despite the growing prominence of generative adversarial networks (GANs), optimization in GANs is still a poorly understood topic. In this paper, we analyze the "gradient descent" form of GAN optimization, i.e., the natural setting where we simultaneously take small gradient steps in both generator and discriminator parameters. We show that even though GAN optimization does not correspond to a convex-concave game (even for simple parameterizations), under proper conditions, equilibrium points of this optimization procedure are still locally asymptotically stable for the traditional GAN formulation. On the other hand, we show that the recently proposed Wasserstein GAN can have non-convergent limit cycles near equilibrium. Motivated by this stability analysis, we propose an additional regularization term for gradient descent GAN updates, which is able to guarantee local stability for both the WGAN and the traditional GAN, and also shows practical promise in speeding up convergence and addressing mode collapse.
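The instability can be seen on a one-dimensional toy game f(θ, ψ) = θψ (a "Dirac" WGAN with the true data at zero): simultaneous gradient steps spiral outward, while adding a term proportional to ‖∇_ψ f‖² to the generator's objective, in the spirit of the regularizer proposed above, restores local convergence. The sketch below is a minimal illustration under these toy assumptions, not the paper's experimental setup:

```python
def simulate(eta_reg, steps=2000, lr=0.05):
    """Simultaneous gradient descent/ascent on f(theta, psi) = theta * psi.

    The generator minimizes f + eta_reg * ||grad_psi f||^2 = theta*psi + eta_reg*theta^2;
    the discriminator maximizes f. eta_reg = 0 recovers the plain simultaneous updates.
    """
    theta, psi = 1.0, 1.0
    for _ in range(steps):
        grad_theta = psi + 2 * eta_reg * theta  # d/dtheta of the regularized objective
        grad_psi = theta                        # d/dpsi of f
        theta, psi = theta - lr * grad_theta, psi + lr * grad_psi
    return theta, psi

# With eta_reg = 0 the iterates spiral outward (each step multiplies the norm by
# sqrt(1 + lr^2)); with eta_reg > 0 they contract toward the equilibrium (0, 0).
```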

2018-02-16
Motivated by the pursuit of a systematic computational and algorithmic understanding of Generative Adversarial Networks (GANs), we present a simple yet unified non-asymptotic local convergence theory for smooth two-player games, which subsumes several discrete-time gradient-based saddle point dynamics. The analysis reveals the surprising nature of the off-diagonal interaction term as both a blessing and a curse. On the one hand, this interaction term explains the origin of the slowdown effect in the convergence of Simultaneous Gradient Ascent (SGA) to stable Nash equilibria. On the other hand, for the unstable equilibria, exponential convergence can be proved thanks to the interaction term, for four modified dynamics proposed to stabilize GAN training: Optimistic Mirror Descent (OMD), Consensus Optimization (CO), Implicit Updates (IU) and Predictive Method (PM). The analysis uncovers the intimate connections among these stabilizing techniques, and provides a detailed characterization of the choice of learning rate. As a by-product, we present a new analysis of the OMD method proposed in Daskalakis, Ilyas, Syrgkanis, and Zeng [2017] with improved rates.
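For intuition, OMD in its simplest Euclidean form (optimistic gradient descent-ascent) can be sketched on the bilinear game f(x, y) = xy, where plain SGA spirals outward but the optimistic correction converges. This is a minimal illustration of the dynamics studied above, with an arbitrary learning rate rather than the paper's prescribed choices:

```python
def ogda(steps=3000, lr=0.1):
    """Optimistic gradient descent-ascent on the bilinear game f(x, y) = x * y.

    Each player takes a gradient step with an extra "optimistic" correction that
    extrapolates using the previous gradient: x <- x - 2*lr*g + lr*g_prev.
    """
    x, y = 1.0, 1.0
    gx_prev, gy_prev = 0.0, 0.0
    for _ in range(steps):
        gx, gy = y, x   # grad_x f and grad_y f at the current point
        x, y = x - 2 * lr * gx + lr * gx_prev, y + 2 * lr * gy - lr * gy_prev
        gx_prev, gy_prev = gx, gy
    return x, y

# The iterates converge to the equilibrium (0, 0), unlike plain simultaneous
# gradient ascent, which diverges on this game for any constant step size.
```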
Given a non-convex twice differentiable cost function f, we prove that the set of initial conditions from which gradient descent converges to saddle points where the Hessian ∇²f has at least one strictly negative eigenvalue has (Lebesgue) measure zero, even for cost functions f with non-isolated critical points, answering an open question in [12]. Moreover, this result extends to forward-invariant convex subspaces, allowing for weak (non-globally Lipschitz) smoothness assumptions. Finally, we produce an upper bound on the allowable step-size.
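The measure-zero statement can be seen concretely on the strict saddle f(x, y) = x² − y² at the origin: gradient descent started anywhere off the x-axis escapes the saddle, while initializations exactly on the x-axis (a measure-zero set, the saddle's stable manifold) converge to it. A minimal sketch under these toy assumptions:

```python
def gradient_descent(x, y, lr=0.1, steps=500):
    """Gradient descent on f(x, y) = x^2 - y^2, which has a strict saddle at (0, 0)."""
    for _ in range(steps):
        gx, gy = 2 * x, -2 * y           # gradient of f
        x, y = x - lr * gx, y - lr * gy  # standard gradient step
    return x, y

# Starting on the x-axis (y = 0, the stable manifold) the iterates converge to
# the saddle; any perturbation in y, however tiny, is amplified and escapes.
```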

2018-06-18

The models surveyed include generalized Pólya urns, reinforced random walks, interacting urn models, and continuous reinforced processes. Emphasis is on methods and results, with sketches provided of some proofs. Applications are discussed in statistics, biology, economics and a number of other areas.
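The simplest model in this family, the classical Pólya urn, reinforces the drawn color by adding one ball of that color; the fraction of each color is a martingale whose limit, starting from one ball of each color, is uniform on [0, 1]. A minimal simulation sketch with illustrative parameters:

```python
import random

def polya_urn(steps=200, red=1, black=1, seed=0):
    """Run a Polya urn: draw a ball, return it plus one more of the same color."""
    rng = random.Random(seed)
    for _ in range(steps):
        if rng.random() < red / (red + black):
            red += 1    # drew red: reinforce red
        else:
            black += 1  # drew black: reinforce black
    return red / (red + black)

# Each run settles near a random limiting fraction; averaging over many runs
# recovers the martingale mean of 1/2.
```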

2019-05-11

The focus of this thesis is on solving a sequence of optimization problems that change over time in a structured manner. This type of problem arises naturally in contexts as diverse as channel estimation, target tracking, sequential machine learning, and repeated games. Due to the time-varying nature of these problems, new solutions must be determined as the problems change in order to maintain solution quality. Since the problems change in a structured manner, however, it is beneficial to exploit the solutions of previous optimization problems when solving the current one.

The first problem considered is sequentially solving minimization problems that change slowly, in the sense that the gap between successive minimizers is bounded in norm. The minimization problems are solved by sequentially applying a selected optimization algorithm, such as stochastic gradient descent (SGD), drawing enough samples to carry out a desired number of iterations. Two tracking criteria are introduced to evaluate the quality of an approximate minimizer: one based on accuracy with respect to the mean trajectory, and the other based on accuracy in high probability (IHP). Knowledge of the bound on how the minimizers change, combined with properties of the chosen optimization algorithm, is used to select the number of samples needed to meet the desired tracking criterion.

Next, the bound on how the minimizers change is no longer assumed known. A technique to estimate the change in minimizers is provided, along with analysis showing that the estimate eventually upper-bounds the change in minimizers. This estimate is combined with the previous analysis to give sample-size selection rules that ensure the mean or IHP tracking criterion is met. Simulations confirm that the estimation approach provides the desired tracking accuracy in practice.
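The flavor of the first problem can be sketched with a drifting scalar least-squares objective: the minimizer m_t moves by a bounded amount each round, and running enough SGD iterations per round keeps the iterate within a target radius of the current minimizer. The toy instance below is hypothetical; the drift, sample count, and step size are arbitrary illustrative choices, not the thesis's selection rules:

```python
import random

def track_moving_minimum(rounds=50, iters_per_round=200, lr=0.1, drift=0.01, seed=1):
    """Track the minimizer of f_t(x) = E[(x - target_t)^2] / 2 as target_t drifts."""
    rng = random.Random(seed)
    m = 0.0   # current minimizer (unknown to the algorithm)
    x = 5.0   # initial iterate, deliberately far from the minimizer
    errors = []
    for _ in range(rounds):
        m += drift  # minimizer moves slowly: |m_{t+1} - m_t| <= drift
        for _ in range(iters_per_round):
            sample = m + rng.gauss(0.0, 1.0)  # one noisy observation of the target
            x -= lr * (x - sample)            # stochastic gradient step on f_t
        errors.append(abs(x - m))             # tracking error after this round
    return errors

# With enough iterations per round, the tracking error stays small even though
# the minimizer never stops moving.
```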
We study whether a depth-two neural network can learn another depth-two network using gradient descent. Assuming a linear output node, we show that the question of whether gradient descent converges to the target function is equivalent to the following question in electrodynamics: Given k fixed protons in R^d, and k electrons, each moving due to the attractive force from the protons and repulsive force from the remaining electrons, whether at equilibrium all the electrons will be matched up with the protons, up to a permutation. Under the standard electrical force, this follows from the classic Earnshaw's theorem. In our setting, the force is determined by the activation function and the input distribution. Building on this equivalence, we prove the existence of an activation function such that gradient descent learns at least one of the hidden nodes in the target network. Iterating, we show that gradient descent can be used to learn the entire network one node at a time.
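The simplest instance of the equivalence, one proton and one electron with an attractive force linear in their displacement (a quadratic potential), is just gradient descent contracting to the proton's location. A toy sketch under these assumptions, with made-up positions:

```python
def electron_flow(proton=2.0, electron=-3.0, lr=0.05, steps=2000):
    """One electron on the real line attracted to a single fixed proton.

    With a force linear in the displacement, each update is exactly a gradient
    descent step on the potential (electron - proton)^2 / 2, so the electron
    converges to the proton's position.
    """
    for _ in range(steps):
        electron += lr * (proton - electron)  # move along the attractive force
    return electron
```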
This paper examines the convergence of payoffs and strategies in Erev and Roth's model of reinforcement learning. When all players use this rule, it eliminates iteratively dominated strategies, and in two-person constant-sum games average payoffs converge to the value of the game. Strategies converge in constant-sum games with unique equilibria if they are pure, or if they are mixed and the game is 2 × 2. The long-run behaviour of the learning rule is governed by equations related to Maynard Smith's version of the replicator dynamic. Properties of the learning rule against general opponents are also studied.
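The learning rule itself is simple: each player keeps a propensity per action, plays actions with probability proportional to propensities, and adds the realized payoff to the chosen action's propensity. A minimal sketch on matching pennies with payoffs in {0, 1} (a constant-sum game with value 1/2); the round count and initial propensities are illustrative choices, not taken from the paper:

```python
import random

def erev_roth_matching_pennies(rounds=20000, seed=0):
    """Basic Erev-Roth reinforcement learning in a 2x2 constant-sum game."""
    rng = random.Random(seed)
    q_row = [1.0, 1.0]  # row player's propensities over {heads, tails}
    q_col = [1.0, 1.0]  # column player's propensities
    total = 0.0
    for _ in range(rounds):
        a = 0 if rng.random() < q_row[0] / sum(q_row) else 1
        b = 0 if rng.random() < q_col[0] / sum(q_col) else 1
        payoff_row = 1.0 if a == b else 0.0  # row wins on a match
        payoff_col = 1.0 - payoff_row        # constant-sum complement
        q_row[a] += payoff_row               # reinforce the played action
        q_col[b] += payoff_col
        total += payoff_row
    return total / rounds  # row player's average payoff

# The average payoff drifts toward the value of the game, 1/2, consistent with
# the constant-sum convergence result described above.
```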