In this paper we survey the primary research, both theoretical and applied, in the area of Robust Optimization (RO). Our focus is on the computational attractiveness of RO approaches, as well as the modeling power and broad applicability of the methodology. In addition to surveying prominent theoretical results of RO, we also present some recent results linking RO to adaptable models for multi-stage decision-making problems. Finally, we highlight applications of RO across a wide spectrum of domains, including finance, statistics, learning, and various areas of engineering.
Stochastic programming can effectively describe many decision-making problems in uncertain environments. Unfortunately, such programs are often computationally demanding to solve. In addition, their solution can be misleading when there is ambiguity in the choice of a distribution for the random parameters. In this paper, we propose a model that describes uncertainty in both the distribution form (discrete, Gaussian, exponential, etc.) and moments (mean and covariance matrix). We demonstrate that for a wide range of cost functions the associated distributionally robust (or min-max) stochastic program can be solved efficiently. Furthermore, by deriving a new confidence region for the mean and the covariance matrix of a random vector, we provide probabilistic arguments for using our model in problems that rely heavily on historical data. These arguments are confirmed in a practical example of portfolio selection, where our framework leads to better-performing policies on the "true" distribution underlying the daily returns of financial assets.
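As a hedged illustration of the min-max idea (not the paper's full distribution-family model), the sketch below solves a simpler robust mean-variance problem in which only the mean is ambiguous, lying in an ellipsoid around its estimate; the worst-case expected return then subtracts a penalty proportional to ‖Σ^{1/2}w‖. Function names, data, and parameter values are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def robust_portfolio(mu_hat, Sigma_hat, kappa=0.5, lam=1.0):
    """Long-only robust mean-variance portfolio. The mean is assumed to
    lie in {mu_hat + Sigma^{1/2} u : ||u|| <= kappa}; the worst-case
    expected return is then mu_hat@w - kappa*||Sigma^{1/2} w||."""
    n = len(mu_hat)
    L = np.linalg.cholesky(Sigma_hat)                # Sigma = L L^T

    def neg_worst_case_utility(w):
        return -(mu_hat @ w
                 - kappa * np.linalg.norm(L.T @ w)   # robustness penalty
                 - lam * w @ Sigma_hat @ w)          # variance penalty

    res = minimize(neg_worst_case_utility, np.ones(n) / n,
                   bounds=[(0.0, 1.0)] * n,
                   constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
                   method="SLSQP")
    return res.x

# Illustrative data: three assets.
mu_hat = np.array([0.10, 0.12, 0.07])
Sigma_hat = np.array([[0.04, 0.01, 0.00],
                      [0.01, 0.05, 0.01],
                      [0.00, 0.01, 0.03]])
w = robust_portfolio(mu_hat, Sigma_hat)
```

Larger `kappa` widens the uncertainty set and pushes the weights toward lower-variance assets.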
The stochastic block model (SBM) is a popular framework for studying community detection in networks. This model is limited by the assumption that all nodes in the same community are statistically equivalent and have equal expected degrees. The degree-corrected stochastic block model (DCSBM) is a natural extension of SBM that allows for degree heterogeneity within communities. This paper proposes a convexified modularity maximization approach for estimating the hidden communities under DCSBM. Our approach is based on a convex programming relaxation of the classical (generalized) modularity maximization formulation, followed by a novel doubly-weighted 1-norm k-median procedure. We establish non-asymptotic theoretical guarantees for both approximate clustering and perfect clustering. Our approximate clustering results are insensitive to the minimum degree, and hold even in the sparse regime with bounded average degrees. In the special case of SBM, these theoretical results match the best-known performance guarantees of computationally feasible algorithms. Numerically, we provide an efficient implementation of our algorithm, which we apply to both synthetic and real-world networks. Experimental results show that our method enjoys performance competitive with the state of the art in the literature.
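The paper's algorithm is a convex relaxation followed by a doubly-weighted 1-norm k-median step; as a lightweight stand-in, the sketch below builds the degree-corrected modularity matrix B = A − ddᵀ/(2m) that the relaxation is based on and clusters its top eigenvectors with a small Lloyd loop. This spectral surrogate and all names are illustrative, not the paper's method.

```python
import numpy as np

def dc_modularity_clusters(A, k, n_iter=50):
    """Cluster the nodes of a symmetric 0/1 adjacency matrix A into k
    groups from the degree-corrected modularity matrix
    B = A - d d^T / (2m), via k-means on its top-k eigenvectors."""
    d = A.sum(axis=1)
    B = A - np.outer(d, d) / d.sum()
    _, vecs = np.linalg.eigh(B)
    X = vecs[:, -k:]                      # top-k eigenvectors
    # deterministic farthest-point init, then plain Lloyd iterations
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Demo: two disjoint 5-cliques are separated exactly.
A = np.zeros((10, 10))
A[:5, :5] = 1.0
A[5:, 5:] = 1.0
np.fill_diagonal(A, 0.0)
labels = dc_modularity_clusters(A, 2)
```

Subtracting ddᵀ/(2m) rather than a constant is what makes the statistic degree-corrected: high-degree nodes are not forced into the same cluster merely because they share many edges.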
This paper surveys recent theoretical advances in convex optimization approaches to community detection. We introduce some important theoretical techniques and results for establishing the consistency of convex community detection under various statistical models. In particular, we discuss the basic techniques based on primal and dual analyses. We also present results that demonstrate several distinctive advantages of convex community detection, including robustness to outlier nodes, consistency under weak assortativity, and adaptivity to heterogeneous degrees. This survey is not intended to be a complete overview of the vast literature on this fast-growing topic. Instead, we aim to provide a big picture of the remarkable recent developments in this area and to make the survey accessible to a broad audience. We hope this expository article can serve as an introductory guide for readers interested in using, designing, and analyzing convex relaxation methods in network analysis.
To address difficult optimization problems, convex relaxations based on semidefinite programming are now commonplace in many fields. Although solvable in polynomial time, large semidefinite programs tend to be computationally challenging. Over a decade ago, exploiting the fact that in many applications of interest the desired solutions are low rank, Burer and Monteiro proposed a heuristic to solve such semidefinite programs by restricting the search space to low-rank matrices. The accompanying theory does not explain the extent of the empirical success. We focus on Synchronization and Community Detection problems and provide theoretical guarantees shedding light on the remarkable efficiency of this heuristic.
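A minimal sketch of the Burer-Monteiro idea on a synchronization-type SDP, max ⟨C, X⟩ over PSD X with unit diagonal: factor X = YYᵀ with unit-norm rows and run row-wise coordinate ascent on Y. This is an illustrative local method under a chosen rank p, not the paper's analysis; names and parameters are made up.

```python
import numpy as np

def burer_monteiro_sync(C, p=3, n_sweeps=100, seed=0):
    """Heuristically solve max <C, X> s.t. X PSD, diag(X) = 1 via the
    Burer-Monteiro factorization X = Y Y^T with unit-norm rows y_i in R^p,
    using row-wise coordinate ascent: y_i <- normalize(sum_{j != i} C_ij y_j)."""
    n = C.shape[0]
    rng = np.random.default_rng(seed)
    Y = rng.standard_normal((n, p))
    Y /= np.linalg.norm(Y, axis=1, keepdims=True)
    for _ in range(n_sweeps):
        for i in range(n):
            g = C[i] @ Y - C[i, i] * Y[i]   # exclude the self term
            norm = np.linalg.norm(g)
            if norm > 1e-12:
                Y[i] = g / norm             # each update increases <C, Y Y^T>
    return Y

# Demo: Z2 synchronization, recovering signs z from a noisy rank-one matrix.
rng = np.random.default_rng(1)
z = rng.choice([-1.0, 1.0], size=30)
C = np.outer(z, z) + 0.3 * rng.standard_normal((30, 30))
C = (C + C.T) / 2
Y = burer_monteiro_sync(C)
u = np.linalg.svd(Y)[0][:, 0]               # round via the top direction of Y
z_hat = np.sign(u)
```

Each row update is a closed-form maximization of the objective in y_i, so the loop is monotone; convergence to the global SDP optimum is not guaranteed for small p, which is exactly the gap the paper's guarantees address.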
This paper considers the problem of clustering a partially observed unweighted graph, i.e., one where for some node pairs we know there is an edge between them, for some others we know there is no edge, and for the remaining we do not know whether or not there is an edge. We want to organize the nodes into disjoint clusters so that there is relatively dense (observed) connectivity within clusters, and sparse connectivity across clusters. We take a novel yet natural approach to this problem, by focusing on finding the clustering that minimizes the number of "disagreements", i.e., the sum of the number of (observed) missing edges within clusters and (observed) present edges across clusters. Our algorithm uses convex optimization; its basis is a reduction of disagreement minimization to the problem of recovering an (unknown) low-rank matrix and an (unknown) sparse matrix from their partially observed sum. We evaluate the performance of our algorithm on the classical Planted Partition/Stochastic Block Model. Our main theorem provides sufficient conditions for the success of our algorithm as a function of the minimum cluster size, edge density and observation probability; in particular, the results characterize the tradeoff between the observation probability and the edge density gap. When there are a constant number of clusters of equal size, our results are optimal up to logarithmic factors.
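The objective the reduction targets is easy to state in code. A small sketch of the disagreement count for a candidate clustering of a partially observed graph (illustrative only; the paper minimizes this via convex optimization rather than evaluating it):

```python
import numpy as np

def disagreements(A, observed, labels):
    """Number of clustering 'disagreements' on a partially observed
    graph: (observed) missing edges within clusters plus (observed)
    present edges across clusters."""
    n = A.shape[0]
    same = labels[:, None] == labels[None, :]
    iu = np.triu_indices(n, k=1)               # count each pair once
    obs = observed[iu].astype(bool)
    within_missing = (same[iu] & obs & (A[iu] == 0)).sum()
    across_present = (~same[iu] & obs & (A[iu] == 1)).sum()
    return int(within_missing + across_present)

# Demo: two fully observed 3-cliques.
A = np.zeros((6, 6))
A[:3, :3] = 1.0
A[3:, 3:] = 1.0
np.fill_diagonal(A, 0.0)
observed = np.ones((6, 6))
good = disagreements(A, observed, np.array([0, 0, 0, 1, 1, 1]))  # 0
bad = disagreements(A, observed, np.array([0, 0, 1, 1, 1, 1]))   # 3 within + 2 across
```

With the correct partition the count is zero; moving node 2 into the wrong cluster creates three missing within-cluster edges and two present across-cluster edges.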
We present a new algorithm for overcomplete independent component analysis (ICA), where the number of latent sources k exceeds the dimension p of the observed variables. Previous algorithms either have high computational complexity or make strong assumptions about the form of the mixing matrix. Our algorithm makes no sparsity assumption yet enjoys favorable computational and theoretical properties. It consists of two main steps: (a) estimating the Hessians of the cumulant generating function (as opposed to the fourth- and higher-order cumulants used by most algorithms) and (b) a new semidefinite programming (SDP) relaxation for recovering a column of the mixing matrix. We show that this relaxation can be solved efficiently with a projected accelerated gradient descent method, which makes the whole algorithm computationally practical. Moreover, we conjecture that the proposed program recovers a mixing component at the rate k < p²/4, and prove that a mixing component can be recovered with high probability when k < (2 − ε) p log p and the original components are sampled uniformly at random on the hypersphere. Experiments are provided on synthetic data and the CIFAR-10 dataset of real images.
We consider the problem of estimating the parameters of a Gaussian or binary distribution in such a way that the resulting undirected graphical model is sparse. Our approach is to solve a maximum likelihood problem with an added 1-norm penalty term. The problem as formulated is convex but the memory requirements and complexity of existing interior point methods are prohibitive for problems with more than tens of nodes. We present two new algorithms for solving problems with at least a thousand nodes in the Gaussian case. Our first algorithm uses block coordinate descent, and can be interpreted as recursive 1-norm penalized regression. Our second algorithm, based on Nesterov's first order method, yields a complexity estimate with a better dependence on problem size than existing interior point methods. Using a log determinant relaxation of the log partition function (Wainwright and Jordan, 2006), we show that these same algorithms can be used to solve an approximate sparse maximum likelihood problem for the binary case. We test our algorithms on synthetic data, as well as on gene expression and senate voting records data.
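As a hedged sketch of the penalized maximum-likelihood objective (not the paper's block coordinate descent or Nesterov-based methods), a plain proximal-gradient loop works on small instances. Here the 1-norm penalty is applied to all entries including the diagonal (a common variant), the step size is a fixed small constant, and a crude eigenvalue floor keeps the iterate positive definite; no convergence guarantee is claimed.

```python
import numpy as np

def soft_threshold(X, t):
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)

def sparse_inv_cov(S, rho=0.1, step=0.1, n_iter=500):
    """Proximal-gradient sketch for the 1-norm penalized ML problem
        max_Theta  logdet(Theta) - tr(S Theta) - rho * ||Theta||_1.
    Gradient of the smooth part is Theta^{-1} - S; the prox of the
    penalty is entrywise soft-thresholding."""
    p = S.shape[0]
    Theta = np.eye(p)
    for _ in range(n_iter):
        grad = np.linalg.inv(Theta) - S
        Theta = soft_threshold(Theta + step * grad, step * rho)
        Theta = (Theta + Theta.T) / 2
        # keep the iterate positive definite (crude eigenvalue floor)
        vals, vecs = np.linalg.eigh(Theta)
        Theta = vecs @ np.diag(np.maximum(vals, 1e-6)) @ vecs.T
    return Theta

S = np.array([[1.0, 0.5],
              [0.5, 1.0]])
Theta = sparse_inv_cov(S, rho=0.6)   # |S_01| <= rho, so the solution is diagonal
```

When the off-diagonal of S is no larger than rho, soft-thresholding zeroes the off-diagonal at every step, so the estimate is exactly sparse, which is the point of the 1-norm penalty.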
This paper is concerned with the optimal distributed control (ODC) problem for discrete-time deterministic and stochastic systems. The objective is to design a fixed-order distributed controller with a pre-specified structure that is globally optimal with respect to a quadratic cost functional. It is shown that this NP-hard problem has a quadratic formulation, which can be relaxed to a semidefinite program (SDP). If the SDP relaxation has a rank-1 solution, a globally optimal distributed controller can be recovered from this solution. By utilizing the notion of treewidth, it is proved that the nonlinearity of the ODC problem appears in such a sparse way that an SDP relaxation of this problem has a matrix solution with rank at most 3. Since the proposed SDP relaxation is computationally expensive for a large-scale system, a computationally cheap SDP relaxation is also developed with the property that its objective function indirectly penalizes the rank of the SDP solution. Various techniques are proposed to approximate a low-rank SDP solution with a rank-1 matrix, leading to recovering a near-global controller together with a bound on its optimality degree. The above results are developed for both finite-horizon and infinite-horizon ODC problems. While the finite-horizon ODC is investigated using a time-domain formulation, the infinite-horizon ODC problem for both deterministic and stochastic systems is studied via a Lyapunov formulation. The SDP relaxations developed in this work are exact for the design of a centralized controller, hence serving as an alternative for solving Riccati equations. The efficacy of the proposed SDP relaxations is elucidated in numerical examples.
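When an SDP returns a (near) rank-1 solution W ≈ wwᵀ, the underlying vector can be read off from the top eigenpair; a minimal illustrative sketch of that recovery step (the sign ambiguity is inherent, and the function name is hypothetical):

```python
import numpy as np

def rank1_recovery(W):
    """Read a vector w with w w^T ≈ W off a (near) rank-1 PSD matrix:
    top eigenvector scaled by the square root of the top eigenvalue."""
    vals, vecs = np.linalg.eigh(W)           # eigenvalues in ascending order
    return np.sqrt(max(vals[-1], 0.0)) * vecs[:, -1]

w_true = np.array([1.0, 2.0, -1.0])
w_hat = rank1_recovery(np.outer(w_true, w_true))   # recovers w up to sign
```

For solutions of rank 2 or 3, as the paper's treewidth result allows, this truncation is only an approximation, which is why the paper pairs it with a bound on the optimality degree.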
Graph clustering involves the task of dividing nodes into clusters, so that the edge density is higher within clusters as opposed to across clusters. A natural, classic and popular statistical setting for evaluating solutions to this problem is the stochastic block model, also referred to as the planted partition model. In this paper we present a new algorithm, a convexified version of maximum likelihood, for graph clustering. We show that, in the classic stochastic block model setting, it outperforms existing methods by polynomial factors when the cluster size is allowed to have general scalings. In fact, it is within logarithmic factors of known lower bounds for spectral methods, and there is evidence suggesting that no polynomial-time algorithm would do significantly better. We then show that this guarantee carries over to a more general extension of the stochastic block model. Our method can handle the settings of semi-random graphs, heterogeneous degree distributions, unequal cluster sizes, unaffiliated nodes, partially observed graphs, planted clique/coloring, etc. In particular, our results provide the best exact recovery guarantees to date for the planted partition, planted k-disjoint-cliques and planted noisy coloring models with general cluster sizes; in other settings, we match the best existing results up to logarithmic factors.
We perform a finite sample analysis of the detection levels for sparse principal components of a high-dimensional covariance matrix. Our minimax optimal test is based on a sparse eigenvalue statistic. Alas, computing this test is known to be NP-complete in general, and we describe a computationally efficient alternative test using convex relaxations. Our relaxation is also proved to detect sparse principal components at near optimal detection levels, and it performs well on simulated datasets. Moreover, using polynomial time reductions from theoretical computer science, we bring significant evidence that our results cannot be improved, thus revealing an inherent trade-off between statistical and computational performance.
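The k-sparse eigenvalue statistic behind the minimax test is simple to state but NP-complete to compute in general; for tiny p an exhaustive version is feasible and shows exactly what the convex relaxation must approximate (illustrative sketch, hypothetical names):

```python
import numpy as np
from itertools import combinations

def sparse_eigenvalue(S, k):
    """k-sparse largest eigenvalue: the maximum top eigenvalue over all
    k x k principal submatrices of S. Exhaustive and exponential in p,
    which is what motivates the convex relaxation."""
    best = -np.inf
    for idx in combinations(range(S.shape[0]), k):
        sub = S[np.ix_(idx, idx)]
        best = max(best, np.linalg.eigvalsh(sub)[-1])
    return best

S = np.eye(4)
S[0, 1] = S[1, 0] = 0.9        # a 2-sparse spike on coordinates {0, 1}
val = sparse_eigenvalue(S, 2)  # top eigenvalue of [[1, .9], [.9, 1]] is 1.9
```

The statistic exceeds 1 (the null value for an identity covariance) precisely when some small coordinate subset carries extra variance, which is the detection event being tested.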
We present a novel approach to the matching of subgraphs for object recognition in computer vision. Feature similarities between object model and scene graph are complemented with a regularization term that measures differences of the relational structure. For the resulting quadratic integer program, a mathematically tight relaxation is derived by exploiting the degrees of freedom of the embedding space of positive semidefinite matrices. We show that the global minimum of the relaxed convex problem can be interpreted as a probability distribution over the original space of matching matrices, providing a basis for efficiently sampling all close-to-optimal combinatorial matchings within the original solution space. As a result, the approach can even handle completely ambiguous situations, despite uniqueness of the relaxed convex problem. Exhaustive numerical experiments demonstrate the promising performance of the approach, which (up to a single inevitable regularization parameter that weights feature similarity against structural similarity) is free of any further tuning parameters.
This paper addresses the optimal control problem known as the Linear Quadratic Regulator in the case when the dynamics are unknown. We propose a multi-stage procedure, called Coarse-ID control, that estimates a model from a few experimental trials, estimates the error in that model with respect to the truth, and then designs a controller using both the model and uncertainty estimate. Our technique uses contemporary tools from random matrix theory to bound the error in the estimation procedure. We also employ a recently developed approach to control synthesis called System Level Synthesis that enables robust control design by solving a quasiconvex optimization problem. We provide end-to-end bounds on the relative error in control cost that are optimal in the number of parameters and that highlight salient properties of the system to be controlled such as closed-loop sensitivity and optimal control magnitude. We show experimentally that the Coarse-ID approach enables efficient computation of a stabilizing controller in regimes where simple control schemes that do not take the model uncertainty into account fail to stabilize the true system.
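The first stage of a Coarse-ID-style pipeline, estimating (A, B) by least squares from trajectory data, can be sketched in a few lines; the error quantification via random matrix theory and the System Level Synthesis step are not reproduced here, and the function name is hypothetical.

```python
import numpy as np

def estimate_dynamics(X, U):
    """Least-squares estimate of (A, B) from a trajectory of
    x_{t+1} = A x_t + B u_t (+ noise): regress next states on [x_t; u_t]."""
    Z = np.hstack([X[:-1], U])                    # (T, n + m) regressors
    Theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)
    n = X.shape[1]
    return Theta[:n].T, Theta[n:].T               # A_hat, B_hat

# Demo: noiseless data gives exact recovery.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[1.0], [0.5]])
rng = np.random.default_rng(0)
U = rng.standard_normal((20, 1))
X = np.zeros((21, 2))
X[0] = [1.0, -1.0]
for t in range(20):
    X[t + 1] = A @ X[t] + B @ U[t]
A_hat, B_hat = estimate_dynamics(X, U)
```

With process noise the estimate is no longer exact, and the point of Coarse-ID control is to bound that estimation error and design a controller that is robust to it.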
Community detection, which aims to cluster N nodes in a given graph into r distinct groups based on the observed undirected edges, is an important problem in network data analysis. In this paper, the popular stochastic block model (SBM) is extended to the generalized stochastic block model (GSBM) that allows for adversarial outlier nodes, which are connected with the other nodes in the graph in an arbitrary way. Under this model, we introduce a procedure using convex optimization followed by the k-means algorithm with k = r. Both theoretical and numerical properties of the method are analyzed. A theoretical guarantee is given for the procedure to accurately detect the communities with small misclassification rate under the setting where the number of clusters can grow with N. This theoretical result matches the best-known result in the literature on computationally feasible community detection in SBM without outliers. Numerical results show that our method is both computationally fast and robust to different kinds of outliers, while some popular computationally fast community detection algorithms, such as spectral clustering applied to adjacency matrices or graph Laplacians, may fail to retrieve the major clusters due to a small portion of outliers. We apply a slight modification of our method to a political blogs data set, showing that our method performs well in practice and is comparable to existing computationally feasible methods in the literature. To the best of the authors' knowledge, our result is the first in the literature in terms of clustering communities with fast growing numbers under the GSBM where a portion of arbitrary outlier nodes exist.
In this paper we consider the cluster estimation problem under the Stochastic Block Model. We show that the semidefinite programming (SDP) formulation for this problem achieves an error rate that decays exponentially in the signal-to-noise ratio. The error bound implies weak recovery in the sparse graph regime with bounded expected degrees, as well as exact recovery in the dense regime. An immediate corollary of our results yields error bounds under the Censored Block Model. Moreover, these error bounds are robust, continuing to hold under heterogeneous edge probabilities and a form of the so-called monotone attack. Significantly, this error rate is achieved by the SDP solution itself without any further pre- or post-processing, and improves upon existing polynomially-decaying error bounds proved using Grothendieck's inequality. Our analysis has two key ingredients: (i) showing that the graph has a well-behaved spectrum, even in the sparse regime, after discounting an exponentially small number of edges, and (ii) an order-statistics argument that governs the final error rate. Both arguments highlight the implicit regularization effect of the SDP formulation.
We study the computational complexity of approximating the 2-to-q norm of linear operators (defined as ‖A‖_{2→q} = max_{v≠0} ‖Av‖_q / ‖v‖_2) for q > 2, as well as connections between this question and issues arising in quantum information theory and the study of Khot's Unique Games Conjecture (UGC). We show the following: 1. For any constant even integer q ≥ 4, a graph G is a small-set expander if and only if the projector onto the span of the top eigenvectors of G's adjacency matrix has bounded 2→q norm. As a corollary, a good approximation to the 2→q norm would refute the Small-Set Expansion Conjecture, a close variant of the UGC. We also show that such a good approximation can be computed in exp(n^{2/q}) time, thus obtaining a different proof of the known subexponential algorithm for Small-Set Expansion. 2. Constant rounds of the "Sum of Squares" semidefinite programming hierarchy certify an upper bound on the 2→4 norm of the projector onto low-degree polynomials over the Boolean cube, as well as certify the unsatisfiability of the "noisy cube" and "short code" based instances of Unique Games considered by prior works. This improves on the previous upper bound of exp(log^{O(1)} n) rounds (for the "short code"), as well as separates the "Sum of Squares"/"Lasserre" hierarchy from weaker hierarchies that were known to require ω(1) rounds. 3. We show reductions between computing the 2→4 norm and computing the injective tensor norm of a tensor, a problem with connections to quantum information theory.
Three corollaries are: (i) the 2→4 norm is NP-hard to approximate to precision inverse-polynomial in the dimension, (ii) the 2→4 norm does not have a good approximation (in the sense above) unless 3-SAT can be solved in time exp(√n polylog(n)), and (iii) known algorithms for the quantum separability problem imply a non-trivial additive approximation for the 2→4 norm.
(* Microsoft Research New England. Much of the work was done while the author was an intern at Microsoft Research New England.)
The multireference alignment problem consists of estimating a signal from multiple noisy shifted observations. Inspired by existing Unique-Games approximation algorithms, we provide a semidefinite program (SDP) based relaxation which approximates the maximum likelihood estimator (MLE) for the multireference alignment problem. Although we show that the MLE problem is Unique-Games hard to approximate within any constant, we observe that our poly-time approximation algorithm for the MLE appears to perform quite well in typical instances, outperforming existing methods. In an attempt to explain this behavior we provide stability guarantees for our SDP under a random noise model on the observations. This case is more challenging to analyze than traditional semi-random instances of Unique-Games: the noise model is on vertices of a graph and translates into dependent noise on the edges. Interestingly, we show that if certain positivity constraints in the SDP are dropped, its solution becomes equivalent to performing phase correlation, a popular method used for pairwise alignment in imaging applications. Finally, we show how symmetry reduction techniques from matrix representation theory can simplify the analysis and computation of the SDP, greatly decreasing its computational cost.
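Phase correlation, which the abstract notes the SDP reduces to when certain positivity constraints are dropped, takes only a few lines with the FFT. A sketch for 1-D cyclic shifts (noiseless demo data; the function name is illustrative):

```python
import numpy as np

def phase_correlation_shift(x, y):
    """Estimate the cyclic shift s with y ≈ roll(x, s): normalize the
    cross-power spectrum to unit modulus and take the argmax of its
    inverse FFT."""
    X, Y = np.fft.fft(x), np.fft.fft(y)
    cross = Y * np.conj(X)
    cross /= np.maximum(np.abs(cross), 1e-12)   # keep only the phase
    return int(np.argmax(np.fft.ifft(cross).real))

rng = np.random.default_rng(0)
x = rng.standard_normal(32)
shift = phase_correlation_shift(x, np.roll(x, 5))   # estimates the shift 5
```

Because a cyclic shift only rotates the phase of each Fourier coefficient, the normalized cross-spectrum is a pure complex exponential whose inverse FFT peaks exactly at the shift.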
Motivated by the task of clustering either d variables or d points into K groups, we investigate efficient algorithms for solving the Peng-Wei (P-W) K-means semidefinite programming (SDP) relaxation. The P-W SDP has been shown in the literature to have good statistical properties in a variety of settings, but remains difficult to solve in practice. To this end we propose FORCE, a new algorithm for solving this SDP relaxation. Compared to a naive interior point method, our approach reduces the computational complexity of solving the SDP from Õ(d⁷ log ε⁻¹) to Õ(d⁶ K⁻² ε⁻¹) arithmetic operations for an ε-optimal solution. Our method combines a primal first-order method with a dual optimality certificate search which, when successful, allows early termination of the primal method. We show for certain variable clustering problems that, with high probability, FORCE finds the optimal solution to the SDP relaxation and provides a certificate of exact optimality. As verified by our numerical experiments, this allows FORCE to solve P-W SDPs with dimension in the hundreds within a few hundred seconds. For a variant of the P-W SDP in which K is not known a priori, a slight modification of FORCE also reduces the computational complexity of solving this problem: from Õ(d⁷ log ε⁻¹) using a standard SDP solver to Õ(d⁴ ε⁻¹).
A common assumption in supervised machine learning is that the training examples provided to the learning algorithm are statistically identical to the instances encountered later on, during the classification phase. This assumption is unrealistic in many real-world situations where machine learning techniques are used. We focus on the case where features of a binary classification problem, which were available during the training phase, are either deleted or become corrupted during the classification phase. We prepare for the worst by assuming that the subset of deleted and corrupted features is controlled by an adversary, and may vary from instance to instance. We design and analyze two novel learning algorithms that anticipate the actions of the adversary and account for them when training a classifier. Our first technique formulates the learning problem as a linear program. We discuss how the particular structure of this program can be exploited for computational efficiency and we prove statistical bounds on the risk of the resulting classifier. Our second technique addresses the robust learning problem by combining a modified version of the Perceptron algorithm with an online-to-batch conversion technique, and also comes with statistical generalization guarantees. We demonstrate the effectiveness of our approach with a set of experiments.
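The inner adversarial step for a linear classifier, deleting the k features that most help the classifier on a given instance, has a simple closed form. A small sketch of that worst-case margin (illustrative only, not the paper's linear program, which optimizes the classifier against this adversary):

```python
import numpy as np

def worst_case_margin(w, x, y, k):
    """Margin y * <w, x> after an adversary zeroes out the k features
    whose contributions y * w_j * x_j help the classifier most; this is
    the inner minimization that robust training must anticipate."""
    contrib = y * w * x
    drop = np.argsort(contrib)[-k:]      # indices of the k most helpful features
    return y * (w @ x) - contrib[drop].sum()

# With w = (1, 2, 3), x = (1, 1, 1), y = +1 and k = 1, the adversary
# deletes the third feature, reducing the margin from 6 to 3.
m = worst_case_margin(np.array([1.0, 2.0, 3.0]),
                      np.array([1.0, 1.0, 1.0]), 1, 1)
```

Training against this quantity discourages the classifier from concentrating its weight on a few features the adversary could remove.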
We present a general approach to rounding semidefinite programming relaxations obtained by the Sum-of-Squares method (Lasserre hierarchy). Our approach is based on using the connection between these relaxations and the Sum-of-Squares proof system to transform a combining algorithm (an algorithm that maps a distribution over solutions into a possibly weaker solution) into a rounding algorithm that maps a solution of the relaxation to a solution of the original problem. Using this approach, we obtain algorithms that yield improved results for natural variants of three well-known problems: 1. We give a quasipolynomial-time algorithm that approximates max_{‖x‖₂=1} P(x) within an additive factor of ε‖P‖_spectral, where ε > 0 is a constant, P is a degree d = O(1), n-variate polynomial with nonnegative coefficients, and ‖P‖_spectral is the spectral norm of a matrix corresponding to P's coefficients. Beyond being of interest in its own right, obtaining such an approximation for general polynomials (with possibly negative coefficients) is a long-standing open question in quantum information theory, and our techniques have already led to improved results in this area (Brandão and Harrow, STOC '13). 2. We give a polynomial-time algorithm that, given a subspace V ⊆ ℝⁿ of dimension d that (almost) contains the characteristic function of a set of size n/k, finds a vector v ∈ V that satisfies E_i[v_i⁴] ≥ Ω(d^{−1/3} k (E_i[v_i²])²). This is a natural analytical relaxation of the problem of finding the sparsest element in a subspace, and is also motivated by a connection to the Small Set Expansion problem shown by Barak et al. (STOC 2012). In particular our results yield an improvement of the previous best known algorithms for small set expansion in a certain range of parameters. 3. We use this notion of L⁴ vs. L² sparsity to obtain a polynomial-time algorithm with substantially improved guarantees for recovering a planted sparse vector v in a random d-dimensional subspace of ℝⁿ. If v has µn nonzero coordinates, we can recover it with high probability whenever µ ≤ O(min(1, n/d²)). In particular, when d ≤ √n, this recovers a planted vector with up to Ω(n) nonzero coordinates. When d ≤ n^{2/3}, our algorithm improves upon existing methods based on comparing the L¹ and L^∞ norms, which intrinsically require µ ≤ O(1/√d).