Network-valued time series are a common form of network data today. However, the aggregate behavior of network sequences generated by network-valued stochastic processes has received relatively little study. Most existing work focuses on simple settings in which the networks are independent (or conditionally independent) across time and all edges are updated synchronously at each time step. In this paper, we study the concentration properties of the aggregated adjacency matrix and the corresponding Laplacian matrix associated with network sequences generated by lazy network-valued stochastic processes, in which edges update asynchronously and each edge follows its own lazy stochastic process, updating independently of the other edges. We demonstrate the usefulness of these concentration results by proving the consistency of standard estimators in community estimation and change-point estimation problems. We also conduct a simulation study to illustrate the effect of the laziness parameter, which controls the extent of temporal correlation, on the accuracy of community and change-point estimation.
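Below is a minimal Python sketch of the lazy edge dynamics described above: each edge keeps its previous state with probability q (the laziness parameter) and otherwise resamples. The Bernoulli(p) edge model, the parameter values, and the function name are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def simulate_lazy_network_series(n, T, p=0.3, q=0.8, seed=0):
    """Simulate T snapshots of a lazy edge process on n nodes.

    Each edge independently keeps its previous state with probability q
    (laziness) and otherwise resamples from Bernoulli(p).
    """
    rng = np.random.default_rng(seed)
    iu = np.triu_indices(n, k=1)            # undirected: upper triangle
    edges = rng.random(len(iu[0])) < p      # initial draw
    snapshots = []
    for _ in range(T):
        stay = rng.random(len(edges)) < q   # lazy: keep current value
        fresh = rng.random(len(edges)) < p  # otherwise resample
        edges = np.where(stay, edges, fresh)
        A = np.zeros((n, n))
        A[iu] = edges
        A += A.T
        snapshots.append(A)
    return snapshots

# Aggregated adjacency matrix, the object whose concentration is studied
A_bar = np.mean(simulate_lazy_network_series(n=50, T=100), axis=0)
```

Larger q means stronger temporal correlation between consecutive snapshots, which is the regime the simulation study above probes.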
Change-point analysis deals with the unsupervised detection and/or estimation of time points in time series data at which the data-generating distribution changes. In this paper, we consider \emph{offline} change-point detection in the context of large-scale text data. We build a specialized temporal topic model that allows for changes in the distribution of topic proportions. Since full likelihood inference under this model is computationally intractable, we develop a computationally tractable approximate inference procedure. More specifically, we use sample splitting to first estimate the topics, and then apply a likelihood ratio statistic together with a modified version of the wild binary segmentation algorithm of Fryzlewicz (2014). Our method facilitates the automated detection of structural changes in large corpora without manual processing by domain experts. Since change points under our model correspond to changes in topic structure, the estimated change points are often highly interpretable, marking the surge or decline in popularity of a fashionable topic. We apply our procedure to two large datasets: (i) a corpus of English literature from the period 1800-1922 (Underwood et al., 2015); (ii) abstracts from the high-energy physics arXiv repository (Clement et al., 2019). We recover some historically well-known change points and discover some new ones.
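As a rough illustration of the segmentation machinery, here is a generic wild binary segmentation skeleton in Python, with a plain mean-change CUSUM statistic standing in for the paper's topic-model likelihood ratio statistic; the threshold, interval count, and function names are illustrative assumptions.

```python
import numpy as np

def cusum_stat(x, s, e, b):
    """Standard CUSUM statistic for a mean change at b within x[s:e]."""
    n1, n2 = b - s, e - b
    scale = np.sqrt(n1 * n2 / (n1 + n2))
    return scale * abs(x[s:b].mean() - x[b:e].mean())

def wild_binary_segmentation(x, threshold, n_intervals=500, seed=0):
    """Skeleton of WBS (Fryzlewicz, 2014) using random sub-intervals."""
    rng = np.random.default_rng(seed)
    change_points = []

    def recurse(s, e):
        if e - s < 2:
            return
        best_stat, best_b = 0.0, None
        # scan random sub-intervals of [s, e) for the strongest change
        for _ in range(n_intervals):
            lo, hi = sorted(rng.integers(s, e, size=2))
            for b in range(lo + 1, hi):
                stat = cusum_stat(x, lo, hi, b)
                if stat > best_stat:
                    best_stat, best_b = stat, b
        if best_b is not None and best_stat > threshold:
            change_points.append(best_b)
            recurse(s, best_b)       # split and recurse on both sides
            recurse(best_b, e)

    recurse(0, len(x))
    return sorted(change_points)
```

In the paper's setting, the CUSUM statistic would be replaced by the likelihood ratio statistic computed on estimated topic proportions.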
Morphological neurons, that is morphological operators such as dilation and erosion with learnable structuring elements, have intrigued researchers for quite some time because of the power these operators bring to the table despite their simplicity. These operators are known to be powerful nonlinear tools, but for a given problem coming up with a sequence of operations and their structuring element is a non-trivial task. So, the existing works have mainly focused on this part of the problem without delving deep into their applicability as generic operators. A few works have tried to utilize morphological neurons as a part of classification (and regression) networks when the input is a feature vector. However, these methods mainly focus on a specific problem, without going into generic theoretical analysis. In this work, we have theoretically analyzed morphological neurons and have shown that these are far more powerful than previously anticipated. Our proposed morphological block, containing dilation and erosion followed by their linear combination, represents a sum of hinge functions. Existing works show that hinge functions perform quite well in classification and regression problems. Two morphological blocks can even approximate any continuous function. However, to facilitate the theoretical analysis that we have done in this paper, we have restricted ourselves to the 1D version of the operators, where the structuring element operates on the whole input. Experimental evaluations also indicate the effectiveness of networks built with morphological neurons, over similarly structured neural networks.
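As a concrete illustration of the block described above, here is a small PyTorch sketch: dilations compute max_i(x_i + s_i), erosions compute min_i(x_i - s_i), each structuring element spans the whole 1D input, and a linear layer combines the outputs, yielding a sum of hinge-like terms. The class name and layer sizes are our own choices for illustration.

```python
import torch
import torch.nn as nn

class MorphBlock1D(nn.Module):
    """1D morphological block: dilations and erosions followed by a
    linear combination, so the block represents a sum of hinge functions.
    """
    def __init__(self, in_features, n_dilations, n_erosions):
        super().__init__()
        self.S_dil = nn.Parameter(torch.randn(n_dilations, in_features))
        self.S_ero = nn.Parameter(torch.randn(n_erosions, in_features))
        self.linear = nn.Linear(n_dilations + n_erosions, 1)

    def forward(self, x):                       # x: (batch, in_features)
        dil = (x.unsqueeze(1) + self.S_dil).max(dim=-1).values  # max_i(x_i + s_i)
        ero = (x.unsqueeze(1) - self.S_ero).min(dim=-1).values  # min_i(x_i - s_i)
        return self.linear(torch.cat([dil, ero], dim=-1))

block = MorphBlock1D(in_features=16, n_dilations=4, n_erosions=4)
y = block(torch.randn(8, 16))                   # output shape: (8, 1)
```

The max/min over the whole input reflects the 1D restriction the abstract mentions, where the structuring element operates on the entire input vector.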
Stochastic gradient descent (SGD) is one of the core techniques behind the success of deep neural networks. The gradient provides information about the direction in which a function has the steepest rate of change. The main problem with basic SGD is that it changes all parameters with equal-sized steps regardless of gradient behavior. Hence, an efficient way of optimizing deep networks is to use an adaptive step size for each parameter. Recently, several attempts have been made to improve gradient descent methods, such as AdaGrad, AdaDelta, RMSProp, and Adam. These methods rely on the square root of exponential moving averages of squared past gradients and thus do not exploit the local change in gradients. In this paper, a novel optimizer (diffGrad) is proposed based on the difference between the present and the immediately preceding gradient. In the proposed diffGrad optimization technique, the step size is adjusted for each parameter in such a way that parameters with a larger change in gradient take a larger step, while parameters with a smaller change in gradient take a smaller step. The convergence analysis is done using the regret bound approach of the online learning framework. Rigorous analysis is carried out on three synthetic complex non-convex functions. Image classification experiments are also conducted on the CIFAR10 and CIFAR100 datasets to observe the performance of diffGrad relative to state-of-the-art optimizers such as SGDM, AdaGrad, AdaDelta, RMSProp, AMSGrad, and Adam. A residual unit (ResNet) based convolutional neural network (CNN) architecture is used in the experiments. The experiments show that diffGrad outperforms the other optimizers. Moreover, we show that diffGrad performs uniformly well for training CNNs with different activation functions. The source code is publicly available at https://github.com/shivram1987/diffgrad.
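The step-size adjustment can be sketched as an Adam-style update whose first moment is damped by a friction coefficient computed from the gradient difference. The sigmoid form of that coefficient reflects our reading of the diffGrad paper and should be treated as an assumption; the linked repository holds the reference implementation.

```python
import numpy as np

def diffgrad_step(theta, grad, state, lr=1e-3, betas=(0.9, 0.999), eps=1e-8):
    """One Adam-style step damped by a gradient-difference friction term.

    The friction coefficient xi = sigmoid(|g_prev - g|) shrinks the step
    for parameters whose gradient barely changes; its exact form here is
    our reading of diffGrad, not a verified reference implementation.
    """
    m, v, g_prev, t = state["m"], state["v"], state["g_prev"], state["t"] + 1
    b1, b2 = betas
    m = b1 * m + (1 - b1) * grad                        # first moment (Adam)
    v = b2 * v + (1 - b2) * grad**2                     # second moment (Adam)
    m_hat = m / (1 - b1**t)                             # bias correction
    v_hat = v / (1 - b2**t)
    xi = 1.0 / (1.0 + np.exp(-np.abs(g_prev - grad)))   # friction in (0.5, 1)
    theta = theta - lr * xi * m_hat / (np.sqrt(v_hat) + eps)
    state.update(m=m, v=v, g_prev=grad, t=t)
    return theta

state = dict(m=0.0, v=0.0, g_prev=0.0, t=0)
theta = np.array([1.0, -2.0])
theta = diffgrad_step(theta, grad=np.array([0.3, -0.1]), state=state)
```

Because the friction term stays below 1, diffGrad never takes a larger step than Adam for the same moments; it only moderates the step where gradients are locally stable.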
We consider the problem of constructing minimax rate-optimal estimators for a doubly robust nonparametric functional that has witnessed applications across the causal inference and conditional independence testing literature. Minimax rate-optimal estimators for such functionals are typically constructed through higher-order bias corrections of plug-in and one-step type estimators and, in turn, depend on estimators of nuisance functions. In this paper, we consider a parallel question of interest regarding the optimality and/or sub-optimality of plug-in and one-step bias-corrected estimators for the specific doubly robust functional of interest. Specifically, we verify that by using undersmoothing and sample splitting techniques when constructing nuisance function estimators, one can achieve minimax rates of convergence in all H\"older smoothness classes of the nuisance functions (i.e. the propensity score and outcome regression) provided that the marginal density of the covariates is sufficiently regular. Additionally, by establishing suitable lower bounds for these classes of estimators, we demonstrate the necessity of undersmoothing the nuisance function estimators to obtain minimax optimal rates of convergence.
Search and Rescue (SAR) missions in remote environments often employ autonomous multi-robot systems that learn, plan, and execute a combination of local single-robot control actions, group primitives, and global mission-oriented coordination and collaboration. Often, SAR coordination strategies are manually designed by human experts who can remotely control the multi-robot system and enable semi-autonomous operations. However, in remote environments where connectivity is limited and human intervention is often not possible, decentralized collaboration strategies are needed for fully-autonomous operations. Nevertheless, decentralized coordination may be ineffective in adversarial environments due to sensor noise, actuation faults, or manipulation of inter-agent communication data. In this paper, we propose an algorithmic approach based on adversarial multi-agent reinforcement learning (MARL) that allows robots to efficiently coordinate their strategies in the presence of adversarial inter-agent communications. In our setup, the objective of the multi-robot team is to discover targets strategically in an obstacle-strewn geographical area by minimizing the average time needed to find the targets. It is assumed that the robots have no prior knowledge of the target locations, and they can interact with only a subset of neighboring robots at any time. Based on the centralized training with decentralized execution (CTDE) paradigm in MARL, we utilize a hierarchical meta-learning framework to learn dynamic team-coordination modalities and discover emergent team behavior under complex cooperative-competitive scenarios. The effectiveness of our approach is demonstrated on a collection of prototype grid-world environments with different specifications of benign and adversarial agents, target locations, and agent rewards.
This paper presents a novel federated reinforcement learning (Fed-RL) methodology to enhance the cyber resiliency of networked microgrids. We formulate a resilient reinforcement learning (RL) training setup which (a) generates episodic trajectories injecting adversarial actions at primary control reference signals of the grid forming (GFM) inverters and (b) trains the RL agents (or controllers) to alleviate the impact of the injected adversaries. To circumvent data-sharing issues and concerns for proprietary privacy in multi-party-owned networked grids, we bring in the aspects of federated machine learning and propose a novel Fed-RL algorithm to train the RL agents. However, conventional horizontal Fed-RL approaches using decoupled independent environments fail to capture the coupled dynamics in a networked microgrid, which leads us to propose a multi-agent vertically federated variation of actor-critic algorithms, namely the federated soft actor-critic (FedSAC) algorithm. We created a customized simulation setup encapsulating microgrid dynamics in the GridLAB-D/HELICS co-simulation platform, compatible with the OpenAI Gym interface for training RL agents. Finally, the proposed methodology is validated with numerical examples of modified IEEE 123-bus benchmark test systems consisting of three coupled microgrids.
Many visual recognition models are evaluated only on their classification accuracy, a metric for which they obtain strong performance. In this paper, we investigate whether computer vision models can also provide correct rationales for their predictions. We propose a ``doubly right'' object recognition benchmark, where the metric requires the model to simultaneously produce both the right labels as well as the right rationales. We find that state-of-the-art visual models, such as CLIP, often provide incorrect rationales for their categorical predictions. However, by transferring the rationales from language models into visual representations through a tailored dataset, we show that we can learn a ``why prompt,'' which adapts large visual representations to produce correct rationales. Visualizations and empirical experiments show that our prompts significantly improve performance on doubly right object recognition, in addition to zero-shot transfer to unseen tasks and datasets.
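The "doubly right" metric itself is simple to state in code: a prediction scores only when both the label and the rationale are correct. The sketch below uses exact-match rationale comparison as an illustrative assumption; the benchmark's actual matching protocol may differ.

```python
def doubly_right_accuracy(pred_labels, true_labels,
                          pred_rationales, true_rationales):
    """Fraction of examples where BOTH the label and the rationale match.

    Exact-match comparison of rationales is an illustrative simplification.
    """
    hits = sum(
        pl == tl and pr == tr
        for pl, tl, pr, tr in zip(pred_labels, true_labels,
                                  pred_rationales, true_rationales)
    )
    return hits / len(true_labels)
```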
Dataset Distillation (DD), a newly emerging field, aims at generating much smaller and high-quality synthetic datasets from large ones. Existing DD methods based on gradient matching achieve leading performance; however, they are extremely computationally intensive, as they require continuously optimizing a dataset among thousands of randomly initialized models. In this paper, we assume that training the synthetic data with diverse models leads to better generalization performance. Thus we propose two \textbf{model augmentation} techniques, i.e., using \textbf{early-stage models} and \textbf{weight perturbation}, to learn an informative synthetic set with significantly reduced training cost. Extensive experiments demonstrate that our method achieves up to a 20$\times$ speedup with performance on par with state-of-the-art baseline methods.
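Of the two techniques, weight perturbation admits a particularly compact sketch: perturbing a trained model's weights yields a cheap "new" model for the matching objective. The noise model below (Gaussian, scaled per parameter tensor) and the sigma value are assumptions, since the abstract does not specify the exact scheme.

```python
import copy
import torch

def perturb_weights(model, sigma=0.01):
    """Return a copy of `model` with Gaussian noise added to every weight.

    A generic form of weight-perturbation model augmentation; the noise
    distribution and per-tensor scaling here are illustrative assumptions.
    """
    augmented = copy.deepcopy(model)
    with torch.no_grad():
        for p in augmented.parameters():
            p.add_(sigma * p.abs().mean() * torch.randn_like(p))
    return augmented
```

Combined with checkpoints from early training stages, this diversifies the pool of models seen during distillation without training thousands of networks from scratch.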
Recent years have seen rapid progress at the intersection between causality and machine learning. Motivated by scientific applications involving high-dimensional data, in particular in biomedicine, we propose a deep neural architecture for learning causal relationships between variables from a combination of empirical data and prior causal knowledge. We combine convolutional and graph neural networks within a causal risk framework to provide a flexible and scalable approach. Empirical results include linear and nonlinear simulations (where the underlying causal structures are known and can be directly compared against), as well as a real biological example where the models are applied to high-dimensional molecular data and their output compared against entirely unseen validation experiments. These results demonstrate the feasibility of using deep learning approaches to learn causal networks in large-scale problems spanning thousands of variables.