A fundamental challenge in networked systems is to detect and remove suspected malicious nodes. In practice, detection is always imperfect, and the decision about which potentially malicious nodes to remove must trade off false positives (erroneously removing benign nodes) and false negatives (erroneously keeping malicious nodes). However, in a network setting, this conventional tradeoff must now account for node connectivity. In particular, malicious nodes may exert malicious influence, so mistakenly leaving some of them in the network can cause damage to spread. On the other hand, removing benign nodes inflicts direct harm on those nodes, and indirect harm on their benign neighbors who wish to communicate with them. We formalize the problem of removing potentially malicious nodes from a network under uncertainty through an objective that takes connectivity into account. We show that optimally solving the resulting problem is NP-hard. We then propose a tractable solution based on a convex relaxation of the objective. Finally, we demonstrate experimentally that our method significantly outperforms both a simple baseline that ignores network structure and a state-of-the-art method for a related problem, on both synthetic and real-world datasets.
In this paper we survey the primary research, both theoretical and applied, in the area of Robust Optimization (RO). Our focus is on the computational attractiveness of RO approaches, as well as the modeling power and broad applicability of the methodology. In addition to surveying prominent theoretical results of RO, we also present some recent results linking RO to adaptable models for multi-stage decision-making problems. Finally, we highlight applications of RO across a wide spectrum of domains, including finance, statistics, learning, and various areas of engineering.
Stochastic programming can effectively describe many decision making problems in uncertain environments. Unfortunately, such programs are often computationally demanding to solve. In addition, their solution can be misleading when there is ambiguity in the choice of a distribution for the random parameters. In this paper, we propose a model that describes uncertainty in both the distribution form (discrete, Gaussian, exponential, etc.) and moments (mean and covariance matrix). We demonstrate that for a wide range of cost functions the associated distributionally robust (or min-max) stochastic program can be solved efficiently. Furthermore, by deriving a new confidence region for the mean and the covariance matrix of a random vector, we provide probabilistic arguments for using our model in problems that rely heavily on historical data. These arguments are confirmed in a practical example of portfolio selection, where our framework leads to better performing policies on the "true" distribution underlying the daily returns of financial assets.
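As a concrete illustration of the min-max idea (though not of the paper's full model, which also covers ambiguity in the distribution form), here is a hedged cvxpy sketch of a portfolio problem where only the mean is ambiguous, lying in an ellipsoid around its sample estimate; the radius kappa and the synthetic returns are assumptions.

```python
import cvxpy as cp
import numpy as np

# Hedged toy sketch: maximize the worst-case expected portfolio return
# when the true mean is only known to lie in the ellipsoid
# {mu_hat + Sigma^{1/2} u : ||u||_2 <= kappa}.
rng = np.random.default_rng(0)
returns = 0.001 + 0.02 * rng.standard_normal((500, 10))  # synthetic daily returns
mu_hat = returns.mean(axis=0)
Sigma = np.cov(returns.T)
kappa = 0.1                                # ambiguity radius (assumed)

L = np.linalg.cholesky(Sigma + 1e-8 * np.eye(10))
x = cp.Variable(10)
# The inner minimization over the ellipsoid has the closed form
#   mu_hat @ x - kappa * ||Sigma^{1/2} x||_2,
# so the min-max collapses to a single convex program.
worst_case_return = mu_hat @ x - kappa * cp.norm(L.T @ x, 2)
cp.Problem(cp.Maximize(worst_case_return),
           [cp.sum(x) == 1, x >= 0]).solve()
print(np.round(x.value, 3))
```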
The stochastic block model (SBM) is a popular framework for studying community detection in networks. This model is limited by the assumption that all nodes in the same community are statistically equivalent and have equal expected degrees. The degree-corrected stochastic block model (DCSBM) is a natural extension of SBM that allows for degree heterogeneity within communities. This paper proposes a convexified modularity maximization approach for estimating the hidden communities under DCSBM. Our approach is based on a convex programming relaxation of the classical (generalized) modularity maximization formulation, followed by a novel doubly-weighted ℓ1-norm k-median procedure. We establish non-asymptotic theoretical guarantees for both approximate clustering and perfect clustering. Our approximate clustering results are insensitive to the minimum degree, and hold even in the sparse regime with bounded average degrees. In the special case of SBM, these theoretical results match the best-known performance guarantees of computationally feasible algorithms. Numerically, we provide an efficient implementation of our algorithm, which is applied to both synthetic and real-world networks. Experiment results show that our method enjoys competitive performance compared to the state of the art in the literature.
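To make the first stage concrete, the following is a hedged cvxpy sketch of one common form of convexified modularity maximization; the paper's exact constraint set may differ, and its doubly-weighted ℓ1-norm k-median rounding step is not reproduced here.

```python
import cvxpy as cp
import networkx as nx
import numpy as np

# Hedged sketch: relax the cluster co-membership matrix to a PSD matrix
# with unit diagonal and nonnegative entries, and maximize <B, X> for
# the modularity matrix B.
G = nx.karate_club_graph()
A = nx.to_numpy_array(G)
d = A.sum(axis=1)
B = A - np.outer(d, d) / d.sum()          # modularity matrix

n = A.shape[0]
X = cp.Variable((n, n), PSD=True)         # relaxed co-membership matrix
cp.Problem(cp.Maximize(cp.trace(B @ X)),
           [cp.diag(X) == 1, X >= 0]).solve(solver=cp.SCS)
# In practice one would then cluster the rows of X.value, e.g. with the
# paper's doubly-weighted l1-norm k-median step (omitted here).
```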
This paper surveys recent theoretical advances in convex optimization approaches for community detection. We introduce some important theoretical techniques and results for establishing the consistency of convex community detection under various statistical models. In particular, we discuss the basic techniques based on primal and dual analysis. We also present results that demonstrate several distinctive advantages of convex community detection, including robustness against outlier nodes, consistency under weak assortativity, and adaptivity to heterogeneous degrees. This survey is not intended to be a complete overview of the vast literature on this fast-growing topic. Instead, we aim to provide a big picture of the remarkable recent developments in this area and to make the survey accessible to a broad audience. We hope that this expository article serves as an introductory guide for readers interested in using, designing, and analyzing convex relaxation methods in network analysis.
We consider the problem of estimating the parameters of a Gaussian or binary distribution in such a way that the resulting undirected graphical model is sparse. Our approach is to solve a maximum likelihood problem with an added ℓ1-norm penalty term. The problem as formulated is convex but the memory requirements and complexity of existing interior point methods are prohibitive for problems with more than tens of nodes. We present two new algorithms for solving problems with at least a thousand nodes in the Gaussian case. Our first algorithm uses block coordinate descent, and can be interpreted as recursive ℓ1-norm penalized regression. Our second algorithm, based on Nesterov's first order method, yields a complexity estimate with a better dependence on problem size than existing interior point methods. Using a log determinant relaxation of the log partition function (Wainwright and Jordan, 2006), we show that these same algorithms can be used to solve an approximate sparse maximum likelihood problem for the binary case. We test our algorithms on synthetic data, as well as on gene expression and senate voting records data.
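The optimization problem itself is easy to state; here is a minimal cvxpy sketch of the ℓ1-penalized maximum likelihood objective on synthetic data. A generic conic solver stands in for the paper's block coordinate descent and Nesterov-style algorithms, and the penalty weight is an assumption.

```python
import cvxpy as cp
import numpy as np

# Minimal sketch of the l1-penalized maximum likelihood problem for a
# sparse Gaussian graphical model.
rng = np.random.default_rng(0)
data = rng.standard_normal((200, 20))
S = np.cov(data.T)                         # empirical covariance
lam = 0.1                                  # penalty weight (assumed)

Theta = cp.Variable((20, 20), PSD=True)    # inverse covariance estimate
objective = cp.Maximize(cp.log_det(Theta)
                        - cp.trace(S @ Theta)
                        - lam * cp.sum(cp.abs(Theta)))
cp.Problem(objective).solve(solver=cp.SCS)
# Nonzero off-diagonal entries of Theta.value are the estimated edges.
```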
We present a novel algorithm for overcomplete independent component analysis (ICA), where the number of latent sources k exceeds the dimension p of observed variables. Previous algorithms either suffer from high computational complexity or make strong assumptions about the form of the mixing matrix. Our algorithm does not make any sparsity assumption yet enjoys favorable computational and theoretical properties. It consists of two main steps: (a) estimating the Hessians of the cumulant generating function (as opposed to the fourth- and higher-order cumulants used by most algorithms) and (b) a novel semidefinite programming (SDP) relaxation for recovering a mixing component. We show that this relaxation can be solved efficiently with a projected accelerated gradient descent method, which makes the whole algorithm computationally practical. Moreover, we conjecture that the proposed program recovers a mixing component at the rate k < p^2/4, and prove that a mixing component can be recovered with high probability when k < (2 − ε) p log p, provided the original components are sampled uniformly at random on the hypersphere. Experiments are provided on synthetic data and the CIFAR-10 dataset of real images.
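Step (a) has a clean empirical form: the Hessian of the cumulant generating function at a probe point u is the covariance of the data under exponential tilting by u, so it can be estimated by reweighting samples. A minimal numpy sketch, with the probe point and data purely illustrative (the SDP step (b) is not shown):

```python
import numpy as np

# Empirical Hessian of K(u) = log E[exp(u' x)]: equals the covariance
# of x under the exponentially tilted distribution.
def cgf_hessian(X, u):
    logits = X @ u
    logits -= logits.max()                 # numerical stability
    w = np.exp(logits)
    w /= w.sum()                           # tilted weights, sum to one
    mean = w @ X                           # tilted mean
    centered = X - mean
    return (centered * w[:, None]).T @ centered  # tilted covariance

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))         # rows are i.i.d. samples
H = cgf_hessian(X, 0.1 * rng.standard_normal(5))
```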
Graph clustering involves the task of dividing nodes into clusters, so that the edge density is higher within clusters as opposed to across clusters. A natural, classic and popular statistical setting for evaluating solutions to this problem is the stochastic block model, also referred to as the planted partition model. In this paper we present a new algorithm for graph clustering: a convexified version of Maximum Likelihood. We show that, in the classic stochastic block model setting, it outperforms existing methods by polynomial factors when the cluster size is allowed to have general scalings. In fact, it is within logarithmic factors of known lower bounds for spectral methods, and there is evidence suggesting that no polynomial time algorithm would do significantly better. We then show that this guarantee carries over to a more general extension of the stochastic block model. Our method can handle the settings of semi-random graphs, heterogeneous degree distributions, unequal cluster sizes, unaffiliated nodes, partially observed graphs, planted clique/coloring, etc. In particular, our results provide the best exact recovery guarantees to date for the planted partition, planted k-disjoint-cliques and planted noisy coloring models with general cluster sizes; in other settings, we match the best existing results up to logarithmic factors.
This paper considers the problem of clustering a partially observed unweighted graph, i.e., one where for some node pairs we know there is an edge between them, for some others we know there is no edge, and for the remaining we do not know whether or not there is an edge. We want to organize the nodes into disjoint clusters so that there is relatively dense (observed) connectivity within clusters, and sparse across clusters. We take a novel yet natural approach to this problem, by focusing on finding the clustering that minimizes the number of "disagreements", i.e., the sum of the number of (observed) missing edges within clusters, and (observed) present edges across clusters. Our algorithm uses convex optimization; its basis is a reduction of disagreement minimization to the problem of recovering an (unknown) low-rank matrix and an (unknown) sparse matrix from their partially observed sum. We evaluate the performance of our algorithm on the classical Planted Partition/Stochastic Block Model. Our main theorem provides sufficient conditions for the success of our algorithm as a function of the minimum cluster size, edge density and observation probability; in particular, the results characterize the tradeoff between the observation probability and the edge density gap. When there are a constant number of clusters of equal size, our results are optimal up to logarithmic factors.
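The reduction the abstract describes is a convex program; here is a minimal cvxpy sketch of a low-rank-plus-sparse split constrained on observed entries, with the weight lam and the synthetic data being assumptions rather than the paper's exact choices.

```python
import cvxpy as cp
import numpy as np

# Hedged sketch of the convex reduction: split the adjacency matrix
# into a low-rank cluster matrix L plus a sparse disagreement matrix S,
# matching the data only on observed pairs.
rng = np.random.default_rng(0)
n = 40
A = np.triu((rng.random((n, n)) < 0.3).astype(float), 1)
A = A + A.T                                 # symmetric 0/1 adjacency
mask = np.triu((rng.random((n, n)) < 0.7).astype(float), 1)
mask = mask + mask.T                        # 1 where the pair is observed

L = cp.Variable((n, n))
S = cp.Variable((n, n))
lam = 1.0 / np.sqrt(n)                      # standard RPCA-style weight
objective = cp.Minimize(cp.normNuc(L) + lam * cp.norm1(S))
constraints = [cp.multiply(mask, L + S) == cp.multiply(mask, A),
               L >= 0, L <= 1]
cp.Problem(objective, constraints).solve(solver=cp.SCS)
```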
This paper addresses the optimal control problem known as the Linear Quadratic Regulator in the case when the dynamics are unknown. We propose a multi-stage procedure, called Coarse-ID control, that estimates a model from a few experimental trials, estimates the error in that model with respect to the truth, and then designs a controller using both the model and uncertainty estimate. Our technique uses contemporary tools from random matrix theory to bound the error in the estimation procedure. We also employ a recently developed approach to control synthesis called System Level Synthesis that enables robust control design by solving a quasiconvex optimization problem. We provide end-to-end bounds on the relative error in control cost that are optimal in the number of parameters and that highlight salient properties of the system to be controlled such as closed-loop sensitivity and optimal control magnitude. We show experimentally that the Coarse-ID approach enables efficient computation of a stabilizing controller in regimes where simple control schemes that do not take the model uncertainty into account fail to stabilize the true system.
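The first stage of such a procedure, estimating a model from a few trials, reduces to ordinary least squares on stacked state-input pairs. A minimal numpy sketch under assumed toy dynamics (the error quantification and System Level Synthesis steps are not shown):

```python
import numpy as np

# Identification stage only: excite the unknown system
# x_{t+1} = A x_t + B u_t + w_t with random inputs over several short
# rollouts, then estimate (A, B) by least squares.
rng = np.random.default_rng(0)
n, m, T, rollouts = 3, 2, 6, 200
A_true = np.array([[1.01, 0.01, 0.00],
                   [0.01, 1.01, 0.01],
                   [0.00, 0.01, 1.01]])
B_true = 0.5 * rng.standard_normal((n, m))

Z, Y = [], []                               # regressors [x; u] and targets x_next
for _ in range(rollouts):
    x = np.zeros(n)
    for _ in range(T):
        u = rng.standard_normal(m)
        x_next = A_true @ x + B_true @ u + 1e-2 * rng.standard_normal(n)
        Z.append(np.concatenate([x, u]))
        Y.append(x_next)
        x = x_next
W, *_ = np.linalg.lstsq(np.array(Z), np.array(Y), rcond=None)
A_hat, B_hat = W.T[:, :n], W.T[:, n:]       # [A_hat  B_hat] = W'
```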
A common assumption in supervised machine learning is that the training examples provided to the learning algorithm are statistically identical to the instances encountered later on, during the classification phase. This assumption is unrealistic in many real-world situations where machine learning techniques are used. We focus on the case where features of a binary classification problem, which were available during the training phase, are either deleted or become corrupted during the classification phase. We prepare for the worst by assuming that the subset of deleted and corrupted features is controlled by an adversary, and may vary from instance to instance. We design and analyze two novel learning algorithms that anticipate the actions of the adversary and account for them when training a classifier. Our first technique formulates the learning problem as a linear program. We discuss how the particular structure of this program can be exploited for computational efficiency and we prove statistical bounds on the risk of the resulting classifier. Our second technique addresses the robust learning problem by combining a modified version of the Perceptron algorithm with an online-to-batch conversion technique, and also comes with statistical generalization guarantees. We demonstrate the effectiveness of our approach with a set of experiments.
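To make the worst-case training idea concrete, here is a hedged cvxpy sketch of a hinge-loss classifier that anticipates an adversary deleting up to K features per instance; the sum_largest term (which solvers handle through an LP reformulation) expresses the adversary's best deletion, though the paper's exact linear program differs in its details.

```python
import cvxpy as cp
import numpy as np

# Hedged sketch: train a linear classifier whose hinge loss is
# evaluated after an adversary zeroes out up to K features per
# instance, choosing those that hurt the margin most.
rng = np.random.default_rng(0)
N, d, K = 100, 20, 3
X = rng.standard_normal((N, d))
w_true = rng.standard_normal(d)
y = np.where(X @ w_true >= 0, 1.0, -1.0)

w, b = cp.Variable(d), cp.Variable()
margins = []
for i in range(N):
    contrib = cp.multiply(y[i] * X[i], w)   # per-feature margin contributions
    # Deleting feature j removes contrib[j]; the adversary removes the
    # K largest positive contributions.
    worst = y[i] * (X[i] @ w + b) - cp.sum_largest(cp.pos(contrib), K)
    margins.append(worst)
loss = cp.sum(cp.pos(1 - cp.hstack(margins))) / N
cp.Problem(cp.Minimize(loss + 0.1 * cp.norm1(w))).solve()
```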
We consider the problem of incorporating additional knowledge when estimating sparse Gaussian graphical models (sGGMs) from aggregated samples, a task arising frequently in bioinformatics and neuroimaging applications. Previous joint sGGM estimators either fail to use existing knowledge or cannot scale up to many tasks (large $K$) in the high-dimensional (large $p$) situation. In this paper, we propose a novel Joint Elementary Estimator incorporating additional Knowledge (JEEK) to infer multiple related sparse Gaussian graphical models from large-scale heterogeneous data. Using domain knowledge as weights, we design a novel hybrid norm as the minimization objective to enforce the superposition of two weighted sparsity constraints, one on the shared interactions and the other on the task-specific structural patterns. This enables JEEK to elegantly take into account various forms of existing knowledge based on the domain at hand, without the need to design knowledge-specific optimization. JEEK is solved through a fast and entry-wise parallelizable solution that largely improves the computational efficiency of the state of the art from $O(p^5K^4)$ to $O(p^2K^4)$. We conduct a rigorous statistical analysis showing that JEEK achieves the same convergence rate, $O(\log(Kp)/n_{tot})$, as the state-of-the-art estimators that are much harder to compute. Empirically, on multiple synthetic datasets and two real-world datasets, JEEK significantly outperforms the state of the art in speed while achieving the same level of prediction accuracy. Available as the R tool "jeek".
In data-driven inverse optimization, an observer aims to learn the preferences of an agent who solves a parametric optimization problem depending on an exogenous signal. Thus, the observer seeks the agent's objective function that best explains a historical sequence of signals and corresponding optimal actions. We focus here on situations where the observer has imperfect information, that is, where the agent's true objective function is not contained in the search space of candidate objectives, where the agent suffers from bounded rationality or implementation errors, or where the observed signal-response pairs are corrupted by measurement noise. We formalize this inverse optimization problem as a distributionally robust program minimizing the worst-case risk that the predicted decision (i.e., the decision implied by a particular candidate objective) differs from the agent's actual response to a random signal. We show that our framework offers rigorous out-of-sample guarantees for different loss functions used to measure prediction errors and that the emerging inverse optimization problems can be exactly reformulated as (or safely approximated by) tractable convex programs when a new suboptimality loss function is used. We show through extensive numerical tests that the proposed distributionally robust approach to inverse optimization often attains better out-of-sample performance than state-of-the-art approaches.
We present a Distributionally Robust Optimization (DRO) approach to estimate a robustified regression plane in a linear regression setting, when the observed samples are potentially contaminated with adversarially corrupted outliers. Our approach mitigates the impact of outliers by hedging against a family of probability distributions on the observed data, some of which assign very low probabilities to the outliers. The set of distributions under consideration are close to the empirical distribution in the sense of the Wasserstein metric. We show that this DRO formulation can be relaxed to a convex optimization problem which encompasses a class of models. By selecting proper norm spaces for the Wasserstein metric, we are able to recover several commonly used regularized regression models. We provide new insights into the regularization term and give guidance on the selection of the regularization coefficient from the standpoint of a confidence region. We establish two types of performance guarantees for the solution to our formulation under mild conditions. One is related to its out-of-sample behavior (prediction bias), and the other concerns the discrepancy between the estimated and true regression planes (estimation bias). Extensive numerical results demonstrate the superiority of our approach to a host of regression models, in terms of the prediction and estimation accuracies. We also consider the application of our robust learning procedure to outlier detection, and show that our approach achieves a much higher AUC (Area Under the ROC Curve) than M-estimation (Huber, 1964, 1973).
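A known consequence of this line of work is that, for an absolute-residual loss and a Wasserstein ball of a given radius, the DRO problem reduces to empirical ℓ1 loss plus the radius times a dual norm of the extended coefficient vector. A minimal cvxpy sketch under that specific choice, with the radius eps and the contaminated data assumed:

```python
import cvxpy as cp
import numpy as np

# Hedged sketch of the reformulation for one specific norm choice:
# absolute-residual loss plus eps times ||(beta, -1)||_2, i.e. a
# regularized regression equivalent to the Wasserstein DRO problem.
rng = np.random.default_rng(0)
N, d = 200, 5
X = rng.standard_normal((N, d))
y = X @ np.ones(d) + 0.1 * rng.standard_normal(N)
y[:10] += 10 * rng.standard_normal(10)     # a few gross outliers
eps = 0.05                                 # Wasserstein radius (assumed)

beta = cp.Variable(d)
obj = (cp.sum(cp.abs(y - X @ beta)) / N
       + eps * cp.norm(cp.hstack([beta, -np.ones(1)]), 2))
cp.Problem(cp.Minimize(obj)).solve()
```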
To address difficult optimization problems, convex relaxations based on semidefinite programming are now common place in many fields. Although solvable in polynomial time, large semidefinite programs tend to be computationally challenging. Over a decade ago, exploiting the fact that in many applications of interest the desired solutions are low rank, Burer and Monteiro proposed a heuristic to solve such semidefinite programs by restricting the search space to low-rank matrices. The accompanying theory does not explain the extent of the empirical success. We focus on Synchronization and Community Detection problems and provide theoretical guarantees shedding light on the remarkable efficiency of this heuristic.
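A minimal numpy sketch of the heuristic itself, applied to the max-cut-type SDP max ⟨C, X⟩ subject to diag(X) = 1, X ⪰ 0: substitute X = YYᵀ with few columns and unit-norm rows, and run projected gradient ascent (the rank k and step size are assumptions).

```python
import numpy as np

# Burer-Monteiro heuristic sketch: optimize over the low-rank factor Y
# directly, keeping rows on the unit sphere so that diag(Y Y') = 1.
rng = np.random.default_rng(0)
n, k = 50, 6
C = rng.standard_normal((n, n))
C = (C + C.T) / 2                           # symmetric cost matrix

Y = rng.standard_normal((n, k))
Y /= np.linalg.norm(Y, axis=1, keepdims=True)      # feasible start
for _ in range(500):
    Y = Y + 0.01 * (2 * C @ Y)                     # gradient of <C, Y Y'>
    Y /= np.linalg.norm(Y, axis=1, keepdims=True)  # retract to the constraint
value = np.trace(C @ Y @ Y.T)
```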
This paper is concerned with the optimal distributed control (ODC) problem for discrete-time deterministic and stochastic systems. The objective is to design a fixed-order distributed controller with a pre-specified structure that is globally optimal with respect to a quadratic cost functional. It is shown that this NP-hard problem has a quadratic formulation, which can be relaxed to a semidefinite program (SDP). If the SDP relaxation has a rank-1 solution, a globally optimal distributed controller can be recovered from this solution. By utilizing the notion of treewidth, it is proved that the nonlinearity of the ODC problem appears in such a sparse way that an SDP relaxation of this problem has a matrix solution with rank at most 3. Since the proposed SDP relaxation is computationally expensive for a large-scale system, a computationally cheap SDP relaxation is also developed with the property that its objective function indirectly penalizes the rank of the SDP solution. Various techniques are proposed to approximate a low-rank SDP solution with a rank-1 matrix, leading to recovering a near-global controller together with a bound on its optimality degree. The above results are developed for both finite-horizon and infinite-horizon ODC problems. While the finite-horizon ODC is investigated using a time-domain formulation, the infinite-horizon ODC problem for both deterministic and stochastic systems is studied via a Lyapunov formulation. The SDP relaxations developed in this work are exact for the design of a centralized controller, hence serving as an alternative for solving Riccati equations. The efficacy of the proposed SDP relaxations is elucidated in numerical examples.
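The generic rank-1 recovery step mentioned above can be sketched independently of the control details: take the leading eigenpair of the SDP solution and monitor the eigenvalue ratio as a proxy for how close the solution is to rank one. A hedged sketch, not the paper's specific procedure:

```python
import numpy as np

# Generic rank-1 extraction from a PSD solution W of an SDP relaxation.
def rank1_recover(W):
    vals, vecs = np.linalg.eigh(W)          # eigenvalues in ascending order
    lam, v = vals[-1], vecs[:, -1]
    ratio = vals[-2] / lam if lam > 0 else np.inf
    return np.sqrt(max(lam, 0.0)) * v, ratio

W = np.outer([1.0, 2.0, -1.0], [1.0, 2.0, -1.0]) + 0.01 * np.eye(3)
x_hat, ratio = rank1_recover(W)             # ratio near 0 => nearly rank-1
```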
When tracking user-specific online activities, each user's preference is revealed in the form of choices and comparisons. For example, a user's purchase history is a record of her choices, i.e. which item was chosen among a subset of offerings. A user's preferences can be observed either explicitly as in movie ratings or implicitly as in viewing times of news articles. Given such individualized ordinal data in the form of comparisons and choices, we address the problem of collaboratively learning representations of the users and the items. The learned features can be used to predict a user's preference of an unseen item to be used in recommendation systems. This also allows one to compute similarities among users and items to be used for categorization and search. Motivated by the empirical successes of the MultiNomial Logit (MNL) model in marketing and transportation, and also more recent successes in word embedding and crowdsourced image embedding, we pose this problem as learning the MNL model parameters that best explain the data. We propose a convex relaxation for learning the MNL model, and show that it is minimax optimal up to a logarithmic factor by comparing its performance to a fundamental lower bound. This characterizes the minimax sample complexity of the problem, and proves that the proposed estimator cannot be improved upon other than by a logarithmic factor. Further, the analysis identifies how the accuracy depends on the topology of sampling via the spectrum of the sampling graph. This provides a guideline for designing surveys when one can choose which items are to be compared. This is accompanied by numerical simulations on synthetic and real data sets, confirming our theoretical predictions.
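A hedged cvxpy sketch of the convex-relaxation idea for pairwise comparisons: fit a user-by-item score matrix by minimizing the pairwise MNL (logistic) negative log-likelihood plus a nuclear-norm penalty relaxing the low-rank constraint; the synthetic data and penalty weight are assumptions.

```python
import cvxpy as cp
import numpy as np

# Nuclear-norm-regularized MNL fit from (user, winner, loser) triples:
# user u prefers item i over j with probability
# sigma(Theta[u, i] - Theta[u, j]).
rng = np.random.default_rng(0)
n_users, n_items = 15, 20
truth = rng.standard_normal((n_users, 2)) @ rng.standard_normal((2, n_items))
comparisons = []                            # (user, winner, loser) triples
while len(comparisons) < 500:
    u, i, j = rng.integers(n_users), rng.integers(n_items), rng.integers(n_items)
    if i == j:
        continue
    p = 1.0 / (1.0 + np.exp(-(truth[u, i] - truth[u, j])))
    comparisons.append((u, i, j) if rng.random() < p else (u, j, i))

Theta = cp.Variable((n_users, n_items))
nll = cp.sum(cp.hstack([cp.logistic(-(Theta[u, i] - Theta[u, j]))
                        for u, i, j in comparisons])) / len(comparisons)
cp.Problem(cp.Minimize(nll + 0.01 * cp.normNuc(Theta))).solve(solver=cp.SCS)
```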
We propose a novel, computationally efficient mirror-descent based optimization framework for subgraph detection in graph-structured data. Our aim is to discover anomalous patterns present in a connected subgraph of a given graph. This problem arises in many applications such as detection of network intrusions, community detection, detection of anomalous events in surveillance videos or disease outbreaks. Since optimization over connected subgraphs is a combinatorial and computationally difficult problem, we propose a convex relaxation that offers a principled approach to incorporating connectivity and conductance constraints on candidate subgraphs. We develop a novel efficient algorithm to solve the relaxed problem, establish convergence guarantees and demonstrate its feasibility and performance with experiments on real and very large simulated networks.
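As background for the optimization machinery, here is a minimal numpy sketch of entropic mirror descent (exponentiated gradient) over the probability simplex, the kind of relaxed domain used for candidate-subgraph indicator vectors; the objective and step size are illustrative, not the paper's.

```python
import numpy as np

# Entropic mirror descent: multiplicative updates keep the iterate on
# the probability simplex without an explicit projection step.
def mirror_descent_simplex(grad_f, n, steps=200, eta=0.1):
    x = np.full(n, 1.0 / n)                 # start at the uniform point
    for _ in range(steps):
        x = x * np.exp(-eta * grad_f(x))    # multiplicative (entropic) update
        x /= x.sum()                        # renormalize onto the simplex
    return x

# Example: minimize ||x - c||^2 over the simplex.
c = np.array([0.7, 0.2, 0.05, 0.05])
x_star = mirror_descent_simplex(lambda x: 2 * (x - c), n=4)
```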
We present a general approach to rounding semidefinite programming relaxations obtained by the Sum-of-Squares method (Lasserre hierarchy). Our approach is based on using the connection between these relaxations and the Sum-of-Squares proof system to transform a combining algorithm (an algorithm that maps a distribution over solutions into a possibly weaker solution) into a rounding algorithm that maps a solution of the relaxation to a solution of the original problem. Using this approach, we obtain algorithms that yield improved results for natural variants of three well-known problems: 1. We give a quasipolynomial-time algorithm that approximates max_{||x||_2=1} P(x) within an additive factor of ε·||P||_spectral, where ε > 0 is a constant, P is a degree d = O(1), n-variate polynomial with nonnegative coefficients, and ||P||_spectral is the spectral norm of a matrix corresponding to P's coefficients. Beyond being of interest in its own right, obtaining such an approximation for general polynomials (with possibly negative coefficients) is a long-standing open question in quantum information theory, and our techniques have already led to improved results in this area (Brandão and Harrow, STOC '13). 2. We give a polynomial-time algorithm that, given a subspace V ⊆ ℝ^n of dimension d that (almost) contains the characteristic function of a set of size n/k, finds a vector v ∈ V that satisfies 𝔼_i v_i^4 ≥ Ω(d^{-1/3} k (𝔼_i v_i^2)^2). This is a natural analytical relaxation of the problem of finding the sparsest element in a subspace, and is also motivated by a connection to the Small Set Expansion problem shown by Barak et al. (STOC 2012). In particular our results yield an improvement of the previous best known algorithms for small set expansion in a certain range of parameters. 3. We use this notion of L_4 vs. L_2 sparsity to obtain a polynomial-time algorithm with substantially improved guarantees for recovering a planted sparse vector v in a random d-dimensional subspace of ℝ^n. If v has µn nonzero coordinates, we can recover it with high probability whenever µ ≤ O(min(1, n/d^2)). In particular, when d ≤ √n, this recovers a planted vector with up to Ω(n) nonzero coordinates. When d ≤ n^{2/3}, our algorithm improves upon existing methods based on comparing the L_1 and L_∞ norms, which intrinsically require µ ≤ O(1/√d).
We perform a finite sample analysis of the detection levels for sparse principal components of a high-dimensional covariance matrix. Our minimax optimal test is based on a sparse eigenvalue statistic. Alas, computing this test is known to be NP-complete in general, and we describe a computationally efficient alternative test using convex relaxations. Our relaxation is also proved to detect sparse principal components at near optimal detection levels, and it performs well on simulated datasets. Moreover, using polynomial time reductions from theoretical computer science, we bring significant evidence that our results cannot be improved, thus revealing an inherent tradeoff between statistical and computational performance.
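A minimal cvxpy sketch of the standard semidefinite relaxation of the sparse eigenvalue statistic, in the penalized form common in this literature; the planted data and penalty level rho are assumptions.

```python
import cvxpy as cp
import numpy as np

# SDP relaxation sketch: lift vv' to a PSD matrix Z with unit trace and
# penalize the entrywise l1 norm to promote sparsity.
rng = np.random.default_rng(0)
p, n, rho = 20, 100, 0.2
v = np.zeros(p)
v[:3] = 1.0 / np.sqrt(3)                    # planted sparse component
X = rng.standard_normal((n, p)) + 1.5 * rng.standard_normal((n, 1)) * v
S = np.cov(X.T)

Z = cp.Variable((p, p), PSD=True)
objective = cp.Maximize(cp.trace(S @ Z) - rho * cp.sum(cp.abs(Z)))
cp.Problem(objective, [cp.trace(Z) == 1]).solve(solver=cp.SCS)
# The leading eigenvector of Z.value estimates the sparse component;
# the optimal value serves as the detection statistic.
```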