We introduce nonparametric graphical models for discrete node variables based on additive conditional independence. Additive conditional independence is a three-way statistical relation that shares similar properties with conditional independence by satisfying the semi-graphoid axioms. Based on this relation, we build an additive graphical model for discrete variables that does not suffer from the restrictions of parametric models such as the Ising model. We develop an estimator of the new graphical model via the penalized estimation of the discrete version of the additive precision operator, and establish the consistency of the estimator in the ultrahigh-dimensional setting. Along with these methodological developments, we also exploit the properties of discrete random variables to uncover a deeper relation between additive conditional independence and conditional independence. The new graphical model reduces to the conditional independence graphical model under certain sparsity conditions. We conduct simulation experiments and an analysis of an HIV antiretroviral therapy dataset to compare the new method with existing ones.
Translated by Google Translate
We consider the problem of estimating the difference between two functional undirected graphical models with shared structure. In many applications, data are naturally regarded as vectors of random functions rather than vectors of scalars. For example, electroencephalography (EEG) data are more appropriately viewed as functions of time. In such problems, not only can the number of functions measured per sample be large, but each function is itself an infinite-dimensional object, making estimation of model parameters challenging. This is further complicated by the fact that curves are usually observed only at discrete time points. We first define a functional differential graph that captures the difference between two functional graphical models, and formally characterize when the functional differential graph is well defined. We then propose a method, FuDGE, that directly estimates the functional differential graph without first estimating each individual graph. This is particularly beneficial when the individual graphs are dense but the differential graph is sparse. We show that FuDGE consistently estimates the functional differential graph even in a high-dimensional setting, for both fully observed and discretely observed function paths. We illustrate the finite-sample properties of our method through simulation studies. We also propose a competing method, the joint functional graphical lasso, which generalizes the joint graphical lasso to the functional setting. Finally, we apply our method to EEG data to uncover differences in functional brain connectivity between a group of individuals with alcohol use disorder and a control group.
The covariance structure of multivariate functional data can be highly complex, especially if the multivariate dimension is large, which makes extensions of statistical methods for standard multivariate data to the functional data setting challenging. For example, Gaussian graphical models have recently been extended to multivariate functional data by applying multivariate methods to the coefficients of truncated basis expansions. However, a key difficulty compared with multivariate data is that the covariance operator is compact and thus not invertible. The methodology in this paper addresses the general problem of covariance modeling for multivariate functional data, and functional Gaussian graphical models in particular. As a first step, a new notion of separability for the covariance operator of multivariate functional data is proposed, termed partial separability, leading to a novel Karhunen-Loève-type expansion for such data. Next, the partial separability structure is shown to be particularly useful in providing a well-defined functional Gaussian graphical model that can be identified with a sequence of finite-dimensional graphical models, each of the same fixed dimension. This motivates a simple and efficient estimation procedure through application of the joint graphical lasso. The empirical performance of the proposed graphical model estimation method is assessed through simulation and an analysis of functional brain connectivity during a motor task.
This survey aims to provide an introduction to linear models and the theory behind them. Our goal is to give a rigorous introduction to readers who have had prior exposure to ordinary least squares. In machine learning, the output is usually a nonlinear function of the input. Deep learning even aims to find nonlinear dependencies with many layers, which requires a large amount of computation. However, most of these algorithms build upon simple linear models. We therefore describe linear models from different views and develop the properties and theory behind them. The linear model is the main technique in regression problems, and its primary tool is the least squares approximation, which minimizes a sum of squared errors. This is a natural choice when we are interested in finding the regression function that minimizes the corresponding expected squared error. This survey is primarily a summary of the purpose and significance of the important theories behind linear models, e.g., distribution theory and minimum variance estimators. We first describe ordinary least squares from three different points of view, upon which we disturb the model with random noise and Gaussian noise. Through Gaussian noise, the model gives rise to a likelihood, so we introduce the maximum likelihood estimator, and we develop some distribution theory under this Gaussian disturbance. The distribution theory of least squares will help us answer various questions and introduce related applications. We then prove that least squares is the best unbiased linear estimator in the sense of mean squared error, and, most importantly, that it actually approaches the theoretical limit. We finally end our discussion of linear models with the Bayesian approach and beyond.
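Since the abstract above centers on least squares and its maximum-likelihood interpretation under Gaussian noise, a minimal numerical sketch may help; the synthetic data and all variable names below are my own:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear model: y = X beta + Gaussian noise
n, p = 200, 3
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, -1.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=n)

# OLS via the normal equations: minimizes the sum of squared errors.
# Under Gaussian noise the same minimizer is the maximum likelihood estimator.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Unbiased estimate of the noise variance from the residuals
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - p)
```

Here `beta_hat` solves the normal equations $X^\top X\beta = X^\top y$; the Gaussian likelihood view and the squared-error view coincide precisely because the log-likelihood is, up to constants, the negative sum of squared errors.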
We introduce a new framework for nonlinear sufficient dimension reduction in cases where both the predictor and the response are distributional data, modeled as members of a metric space. Our key step for achieving nonlinear sufficient dimension reduction is to construct universal kernels on the metric spaces, which leads to reproducing kernel Hilbert spaces of the predictor and the response that are rich enough to characterize the conditional independence that determines sufficient dimension reduction. For univariate distributions, we construct the universal kernel using the well-known quantile representation of the Wasserstein distance. For multivariate distributions, we resort to the recently developed sliced Wasserstein distance for this purpose. Because the sliced Wasserstein distance can be computed through the quantile representations of univariate Wasserstein distances, the computation of the multivariate Wasserstein distance is kept at a manageable level. The method is applied to several data sets, including fertility and mortality distribution data and Calgary temperature data.
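The quantile representation mentioned above is easy to make concrete: for equal-size samples, the squared 2-Wasserstein distance in one dimension is the mean squared difference of order statistics, and the sliced version averages this over random projection directions. The following sketch only computes these distances (the paper goes further and builds kernels from them); the sample sizes and slice count are my own choices:

```python
import numpy as np

rng = np.random.default_rng(1)

def wasserstein2_1d(a, b):
    """Squared 2-Wasserstein distance between two 1-D empirical
    distributions via the quantile (order-statistic) representation."""
    a, b = np.sort(a), np.sort(b)
    assert len(a) == len(b)  # equal sample sizes keep the formula exact
    return np.mean((a - b) ** 2)

def sliced_wasserstein2(X, Y, n_slices=200):
    """Squared sliced Wasserstein distance: average the 1-D squared
    distance over random projection directions on the unit sphere."""
    d = X.shape[1]
    total = 0.0
    for _ in range(n_slices):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)
        total += wasserstein2_1d(X @ u, Y @ u)
    return total / n_slices

X = rng.normal(size=(500, 2))
sw_same = sliced_wasserstein2(X, X)                       # identical samples
sw_shift = sliced_wasserstein2(X, X + np.array([1.0, 0.0]))  # unit mean shift
```

For a pure shift by a vector $m$, every projected pair differs by the constant $u^\top m$, so the sliced distance estimates $\mathbb{E}_u[(u^\top m)^2] = \lVert m\rVert^2/d$, here $0.5$.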
Interacting particle or agent systems that display a rich variety of swarming behaviours are ubiquitous in science and engineering. A fundamental and challenging goal is to understand the link between individual interaction rules and swarming. In this paper, we study the data-driven discovery of a second-order particle swarming model that describes the evolution of $N$ particles in $\mathbb{R}^d$ under radial interactions. We propose a learning approach that models the latent radial interaction function as a Gaussian process, which can simultaneously fulfill two inference goals: one is the nonparametric inference of the interaction function with pointwise uncertainty quantification, and the other is the inference of unknown scalar parameters in the non-collective friction forces of the system. We formulate the learning problem as a statistical inverse problem and provide a detailed analysis of recoverability conditions, establishing that a coercivity condition is sufficient for recoverability. Given data collected from $M$ i.i.d. trajectories with independent Gaussian observational noise, we provide a finite-sample analysis, showing that our posterior mean estimator converges in a reproducing kernel Hilbert space norm, at an optimal rate in $M$ equal to the one in classical one-dimensional kernel ridge regression. As a byproduct, we show that we can obtain a parametric learning rate in $M$ for the posterior marginal variance using the $L^{\infty}$ norm, and the rate could also involve $N$ and $L$ (the number of observation time instances for each trajectory), depending on the condition number of the inverse problem. Numerical results on systems that exhibit different swarming behaviors demonstrate efficient learning of our approach from scarce, noisy trajectory data.
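The posterior mean referenced above coincides, for a fixed kernel and noise level, with kernel ridge regression. A one-dimensional stand-in may make this concrete; the synthetic target function, RBF kernel, and all parameter values are my own choices, not the paper's interaction kernel:

```python
import numpy as np

rng = np.random.default_rng(2)

def rbf_kernel(s, t, length=0.15):
    """Squared-exponential (RBF) kernel matrix between two 1-D grids."""
    return np.exp(-(s[:, None] - t[None, :]) ** 2 / (2 * length ** 2))

# Noisy observations of a latent function (a stand-in for the unknown
# radial interaction function in the swarming model)
f = lambda x: np.sin(2 * np.pi * x)
x_train = np.linspace(0, 1, 40)
y_train = f(x_train) + 0.05 * rng.normal(size=x_train.size)

# GP posterior mean == kernel ridge regression:
# alpha = (K + sigma^2 I)^{-1} y,  f_hat(x) = k(x, X) @ alpha
K = rbf_kernel(x_train, x_train)
alpha = np.linalg.solve(K + 0.05 ** 2 * np.eye(len(x_train)), y_train)

x_test = np.linspace(0, 1, 100)
f_hat = rbf_kernel(x_test, x_train) @ alpha
```

The recovered curve tracks the truth closely; the paper's convergence analysis in $M$ mirrors exactly the rates known for this kind of ridge estimator.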
It is well known that many networked systems, such as power grids, the brain, and opinion dynamics on social networks, obey conservation laws. Examples of this phenomenon include Kirchhoff's laws in power grids and opinion consensus in social networks. Conservation laws in networked systems can be modeled as balance equations of the form $X = B^{*} Y$, where the sparsity pattern of $B^{*}$ captures the connectivity of the network, and $Y, X \in \mathbb{R}^p$ are, respectively, vectors of "potentials" and "injected flows" at the nodes. The node potentials $Y$ cause flows across the edges, and the flows $X$ injected at the nodes are external to the network dynamics. In several practical systems, the network structure is often unknown and needs to be estimated from data. To this end, one has access to samples of the node potentials $Y$, but only statistics of the node injections $X$. Motivated by this important problem, we study the estimation of the sparsity structure of $B^{*}$ from $n$ samples of $Y$, under the assumption that the node injections $X$ follow a Gaussian distribution with known covariance $\Sigma_X$. We propose a new $\ell_{1}$-regularized maximum likelihood estimator for this problem in the high-dimensional regime, where the size $p$ of the network may be larger than the sample size $n$. We show that this optimization problem is convex in the objective and admits a unique solution. Under a new mutual incoherence condition, we establish sufficient conditions on the triple $(n, p, d)$ for which exact sparsity recovery of $B^{*}$ is possible with high probability; $d$ is the degree of the graph. We also establish guarantees for recovering $B^{*}$ in the element-wise maximum, Frobenius, and operator norms. Finally, we complement these theoretical results with experimental validation of the performance of the proposed estimator on synthetic and real-world data.
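A small simulation of the balance-equation model may help fix ideas. This sketch only illustrates the generative model and the identity $\mathrm{Cov}(Y) = B^{-1}\Sigma_X B^{-\top}$ that makes $B^{*}$ identifiable from potentials alone; it is not the authors' $\ell_1$-regularized estimator, and the example network and weights are my own:

```python
import numpy as np

rng = np.random.default_rng(3)

# A small conservation-law network: B is a weighted graph Laplacian plus the
# identity, so x = B y balances injected flows against potential differences.
p = 4
W = np.array([[0, 1, 0, 1],
              [1, 0, 2, 0],
              [0, 2, 0, 1],
              [1, 0, 1, 0]], dtype=float)   # edge weights: sparsity of B
L = np.diag(W.sum(axis=1)) - W              # graph Laplacian
B = L + np.eye(p)                           # identity shift makes B invertible

# Node injections x ~ N(0, Sigma_x) with known Sigma_x = I;
# the potentials solve the balance equation, y = B^{-1} x.
n = 20000
X = rng.normal(size=(n, p))
Y = X @ np.linalg.inv(B).T

# The covariance of y identifies B: Cov(y) = B^{-1} Sigma_x B^{-T}
S_true = np.linalg.inv(B) @ np.linalg.inv(B).T
S_emp = Y.T @ Y / n
rel_err = np.linalg.norm(S_emp - S_true) / np.linalg.norm(S_true)
```

With only samples of $Y$ and knowledge of $\Sigma_X$, the empirical covariance concentrates around $B^{-1}\Sigma_X B^{-\top}$, which is the quantity the regularized likelihood works with.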
Testing the significance of a variable or group of variables $X$ for predicting a response $Y$, given additional covariates $Z$, is a ubiquitous task in statistics. A simple but common approach is to specify a linear model, and then test whether the regression coefficient for $X$ is non-zero. However, when the model is misspecified, the test may have poor power, for example when $X$ is involved in complex interactions, or lead to many false rejections. In this work we study the problem of testing the model-free null of conditional mean independence, i.e. that the conditional mean of $Y$ given $X$ and $Z$ does not depend on $X$. We propose a simple and general framework that can leverage flexible nonparametric or machine learning methods, such as additive models or random forests, to yield both robust error control and high power. The procedure involves using these methods to perform regressions, first to estimate a form of projection of $Y$ on $X$ and $Z$ using one half of the data, and then to estimate the expected conditional covariance between this projection and $Y$ on the remaining half of the data. While the approach is general, we show that a version of our procedure using spline regression achieves what we show is the minimax optimal rate in this nonparametric testing problem. Numerical experiments demonstrate the effectiveness of our approach both in terms of maintaining Type I error control, and power, compared to several existing approaches.
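The two-stage regression idea can be sketched with plain linear regressions standing in for the flexible machine-learning regressions, and without the sample splitting, in the style of a generalized covariance measure. Everything below is a simplified illustration under those substitutions, not the paper's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(4)

def resid(y, Z):
    """Residual of y after a linear regression on Z (with intercept)."""
    Z1 = np.column_stack([np.ones(len(Z)), Z])
    return y - Z1 @ np.linalg.lstsq(Z1, y, rcond=None)[0]

def gcm_stat(y, x, Z):
    """Studentized sample covariance of the two regression residuals;
    approximately N(0,1) under conditional mean independence of y and x
    given Z."""
    R = resid(y, Z) * resid(x, Z)
    return np.sqrt(len(R)) * R.mean() / R.std()

n = 500
Z = rng.normal(size=(n, 2))
x = Z @ np.array([1.0, -1.0]) + rng.normal(size=n)

y_null = Z @ np.array([0.5, 0.5]) + rng.normal(size=n)  # mean of Y free of X
y_alt = y_null + x                                      # Y depends on X
T_null, T_alt = gcm_stat(y_null, x, Z), gcm_stat(y_alt, x, Z)
```

Under the null the statistic behaves like a standard normal, so comparing it to Gaussian quantiles gives the robust Type I error control the abstract describes; a strong signal drives it far into the tail.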
When nodes have demographic attributes, the inference of community structure in probabilistic graphical models may not be consistent with fairness constraints: certain demographics may be over-represented in some detected communities and under-represented in others. This paper defines a novel $\ell_1$-regularized pseudo-likelihood approach for fair graphical model selection. In particular, we assume there is some community or clustering structure in the true underlying graph, and we seek to learn a sparse undirected graph and its communities from the data, such that demographic groups are fairly represented within the communities. Our optimization approach uses the demographic parity definition of fairness, but the framework easily extends to other definitions of fairness. We establish statistical consistency of the proposed method for both a Gaussian graphical model and an Ising model, for continuous and binary data respectively, proving that our method can recover the graphs and their fair communities with high probability.
Classical asymptotic theory for statistical inference usually involves calibrating a statistic by fixing the dimension $d$ while letting the sample size $n$ increase to infinity. Recently, much effort has been dedicated towards understanding how these methods behave in high-dimensional settings, where $d$ and $n$ both increase to infinity together. This often leads to different inference procedures, depending on the assumptions about the dimensionality, leaving the practitioner in a bind: given a dataset with 100 samples in 20 dimensions, should they calibrate by assuming $n \gg d$, or $d/n \approx 0.2$? This paper considers the goal of dimension-agnostic inference: developing methods whose validity does not depend on any assumption on $d$ versus $n$. We introduce an approach that uses variational representations of existing test statistics along with sample splitting and self-normalization to produce a new test statistic with a Gaussian limiting distribution, regardless of how $d$ scales with $n$. The resulting statistic can be viewed as a careful modification of degenerate U-statistics, dropping diagonal blocks and retaining off-diagonal blocks. We exemplify our technique for some classical problems including one-sample mean and covariance testing, and show that our tests have minimax rate-optimal power against appropriate local alternatives. In most settings, our cross U-statistic matches the high-dimensional power of the corresponding (degenerate) U-statistic up to a $\sqrt{2}$ factor.
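The construction described above (sample splitting plus self-normalization, keeping only the off-diagonal blocks) is short enough to sketch for one-sample mean testing. The code below is an illustrative reading of that recipe, with my own simulation settings, not the authors' exact implementation:

```python
import numpy as np

rng = np.random.default_rng(5)

def cross_mean_stat(X):
    """Dimension-agnostic one-sample mean test: split the sample, project
    the first half onto the mean of the second half, and studentize.
    Asymptotically N(0,1) under H0: E[X] = 0, for any d-vs-n regime."""
    n = len(X)
    X1, X2 = X[: n // 2], X[n // 2 :]
    h = X1 @ X2.mean(axis=0)   # cross (off-diagonal) blocks only
    return np.sqrt(len(h)) * h.mean() / h.std()

n, d = 400, 100   # d comparable to n: no d-vs-n assumption is needed
T_null = cross_mean_stat(rng.normal(size=(n, d)))          # mean zero
T_alt = cross_mean_stat(rng.normal(size=(n, d)) + 0.2)     # shifted mean
```

Because the second half only enters through its sample mean, conditioning on it makes the statistic a self-normalized sum of i.i.d. projections, which is what delivers the Gaussian limit independent of the dimension.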
We introduce and study the neighborhood lattice decomposition of a distribution, a compact, non-graphical representation of conditional independence that is valid even in the absence of a faithful graphical representation. The idea is to view the collection of neighborhoods of a variable as a subset lattice and to partition this lattice into convex sublattices, each of which directly encodes a collection of conditional independence relations. We show that this decomposition exists in any compositional graphoid and can be computed efficiently and consistently in high dimensions. In particular, this gives a way to encode all of the independence relations implied by a distribution that satisfies the composition axiom, which is strictly weaker than the faithfulness assumption typically assumed by graphical approaches. We also discuss various special cases, such as graphical models and projection lattices, each of which has an intuitive interpretation. Along the way, we see how this problem is closely related to neighborhood regression, which has been extensively studied in the context of graphical models and structural equations.
Integrative analysis of data from multiple sources is critical to making generalizable discoveries. Associations that are consistently observed across multiple source populations are more likely to be generalized to target populations with possible distributional shifts. In this paper, we model the heterogeneous multi-source data with multiple high-dimensional regressions and make inferences for the maximin effect (Meinshausen, B{\"u}hlmann, AoS, 43(4), 1801--1830). The maximin effect provides a measure of stable associations across multi-source data. A significant maximin effect indicates that a variable has commonly shared effects across multiple source populations, and these shared effects may be generalized to a broader set of target populations. There are challenges associated with inferring maximin effects because the corresponding point estimator can have a non-standard limiting distribution. We devise a novel sampling method to construct valid confidence intervals for maximin effects. The proposed confidence interval attains a parametric length. This sampling procedure and the related theoretical analysis are of independent interest for solving other non-standard inference problems. Using genetic data on yeast growth in multiple environments, we demonstrate that the genetic variants with significant maximin effects have generalizable effects under new environments.
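For a single covariate with unit variance, the maximin effect has a simple closed form that a grid search recovers: it is zero when the group effects conflict in sign, and otherwise equals the effect closest to zero. A small sketch of this scalar case (the worst-case explained-variance definition follows Meinshausen and Bühlmann; the grid-search implementation and example effect values are my own):

```python
import numpy as np

def maximin_effect(betas, grid=np.linspace(-3, 3, 6001)):
    """Scalar maximin effect: maximize over b the worst-case explained
    variance min_g (2 * b * beta_g - b^2) across the source groups g,
    assuming a unit-variance covariate."""
    betas = np.asarray(betas)
    gains = 2 * grid[:, None] * betas[None, :] - grid[:, None] ** 2
    return grid[np.argmax(gains.min(axis=1))]

# Effects agreeing in sign: maximin picks the one closest to zero.
b_same = maximin_effect([1.0, 2.0, 0.5])
# Effects with conflicting signs: the maximin effect is zero.
b_mixed = maximin_effect([1.0, -0.5])
```

The conflicting-sign case is exactly why the point estimator has a non-standard limiting distribution: the maximum sits at a kink of the worst-case objective.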
We consider the problem of estimating a multivariate function $f_0$ of bounded variation (BV), from noisy observations $y_i = f_0(x_i) + z_i$ made at random design points $x_i \in \mathbb{R}^d$, $i=1,\ldots,n$. We study an estimator that forms the Voronoi diagram of the design points, and then solves an optimization problem that regularizes according to a certain discrete notion of total variation (TV): the sum of weighted absolute differences of parameters $\theta_i,\theta_j$ (which estimate the function values $f_0(x_i),f_0(x_j)$) at all neighboring cells $i,j$ in the Voronoi diagram. This is seen to be equivalent to a variational optimization problem that regularizes according to the usual continuum (measure-theoretic) notion of TV, once we restrict the domain to functions that are piecewise constant over the Voronoi diagram. The regression estimator under consideration hence performs (shrunken) local averaging over adaptively formed unions of Voronoi cells, and we refer to it as the Voronoigram, following the ideas in Koenker (2005), and drawing inspiration from Tukey's regressogram (Tukey, 1961). Our contributions in this paper span both the conceptual and theoretical frontiers: we discuss some of the unique properties of the Voronoigram in comparison to TV-regularized estimators that use other graph-based discretizations; we derive the asymptotic limit of the Voronoi TV functional; and we prove that the Voronoigram is minimax rate optimal (up to log factors) for estimating BV functions that are essentially bounded.
Many modern datasets, from areas such as neuroimaging and geostatistics, come in the form of random samples of tensor data, which can be understood as noisy observations of smooth multidimensional random functions. Most traditional techniques from functional data analysis are plagued by the curse of dimensionality and quickly become intractable as the dimension of the domain increases. In this paper, we propose a framework for learning continuous representations from a sample of multidimensional functional data that is immune to several manifestations of the curse. These representations are constructed using a set of separable basis functions that are defined to optimally adapt to the data. We show that the resulting estimation problem can be solved efficiently by the tensor decomposition of a carefully defined reduction transformation of the observed data. Roughness regularization is incorporated using a class of differential operator-based penalties. Relevant theoretical properties are also established. The advantages of our method over competing methods are demonstrated in a simulation study. We conclude with a real data application in neuroimaging.
As a tool for estimating high-dimensional networks, graphical models are commonly applied to calcium imaging data to estimate functional neuronal connectivity, i.e., relationships between the activities of neurons. However, in many calcium imaging datasets, the full population of neurons is not recorded simultaneously, but rather in partially overlapping blocks. As first introduced by (Vinci et al. 2019), this leads to the graph quilting problem, in which the goal is to infer the structure of the full graph when only subsets of the features are observed jointly. In this paper, we study a novel two-step approach to graph quilting, which first imputes the full covariance matrix using low-rank covariance completion techniques before estimating the graph structure. We introduce three approaches to solve this problem: blockwise singular value decomposition, nuclear norm penalization, and nonconvex low-rank factorization. While prior work has studied low-rank matrix completion, we address the challenges brought by the block-wise missingness and are the first to investigate the problem in the context of graph learning. We discuss the theoretical properties of the two-step procedure, showing graph selection consistency of one proposed approach by proving novel $\ell_\infty$-norm error bounds for matrix completion with block missingness. We then investigate the empirical performance of the proposed methods on simulations and on real-world data examples, through which we show the efficacy of these methods for estimating functional connectivity from calcium imaging data.
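The blockwise low-rank completion idea can be illustrated with the simplest identity of this kind: when the covariance has rank $r$ and the overlap block also has rank $r$, the never-jointly-observed cross block is recovered exactly through the overlap. This toy example is my own and is far simpler than the three estimators studied in the paper:

```python
import numpy as np

rng = np.random.default_rng(6)

# Rank-2 covariance over 6 variables; two overlapping recording blocks
# {0,1,2,3} and {2,3,4,5} share the overlap S = {2,3}, while the cross
# entries between A = {0,1} and B = {4,5} are never observed jointly.
r = 2
U = rng.normal(size=(6, r))
Sigma = U @ U.T

A, S, B = [0, 1], [2, 3], [4, 5]
# When rank(Sigma_SS) equals the rank of Sigma, the missing block is
# recovered exactly as Sigma_AS @ pinv(Sigma_SS) @ Sigma_SB.
Sigma_AB_hat = (Sigma[np.ix_(A, S)]
                @ np.linalg.pinv(Sigma[np.ix_(S, S)])
                @ Sigma[np.ix_(S, B)])
```

With noisy, partially observed covariances the exact identity no longer holds, which is where the SVD, nuclear-norm, and factorization estimators in the paper come in.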
The problem of inferring the direct causal parents of a response variable among a large set of explanatory variables is of high practical importance in many disciplines. However, established approaches often scale at least exponentially with the number of explanatory variables, are difficult to extend to nonlinear relationships, and are difficult to extend to cyclic data. Inspired by debiased machine learning methods, we study a one-vs.-the-rest feature selection approach to discover the direct causal parents of the response. We propose an algorithm that works for purely observational data, while also offering theoretical guarantees, including the case of partially nonlinear relationships, possibly in the presence of cycles. As it requires only one estimation for each variable, our approach is applicable even to large graphs. We demonstrate significant improvements compared to established approaches.
We study a multi-factor block model for variable clustering and connect it to the regularized subspace clustering by formulating a distributionally robust version of the nodewise regression. To solve the latter problem, we derive a convex relaxation, provide guidance on selecting the size of the robust region, and hence the regularization weighting parameter, based on the data, and propose an ADMM algorithm for implementation. We validate our method in an extensive simulation study. Finally, we propose and apply a variant of our method to stock return data, obtain interpretable clusters that facilitate portfolio selection and compare its out-of-sample performance with other clustering methods in an empirical study.
In this paper, we propose a scheme of uniformly dithered one-bit quantization for high-dimensional statistical estimation. The scheme contains truncation, dithering, and quantization as typical steps. As canonical examples, the quantization scheme is applied to three estimation problems: sparse covariance matrix estimation, sparse linear regression, and matrix completion. We study both sub-Gaussian and heavy-tailed regimes, where the underlying distribution of heavy-tailed data is assumed to have bounded second or fourth moments. For each model, we propose new estimators based on the one-bit quantized data. In the sub-Gaussian regime, our estimators achieve near-minimax rates up to logarithmic factors, indicating that our quantization scheme costs very little. In the heavy-tailed regime, while the rates of our estimators become essentially slower, these results are either the first in such a one-bit quantized and heavy-tailed setting, or exhibit significant improvements over existing comparable results. Furthermore, we make substantial contributions to the problems of one-bit compressed sensing and one-bit matrix completion. Specifically, we extend one-bit compressed sensing to sub-Gaussian or even heavy-tailed sensing vectors via convex programming. For one-bit matrix completion, our method is essentially different from the standard likelihood approach and can handle pre-quantization random noise with unknown distribution. Experimental results on synthetic data are presented to support our theoretical analysis.
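The truncation-dithering-quantization pipeline is easy to sketch: with uniform dither on $[-D, D]$ and $|x| \le D$, a one-bit measurement satisfies $\mathbb{E}[D\,\mathrm{sign}(x+u)] = x$, so simple averages survive quantization. Below is a minimal illustration on a scalar mean of my own devising, not one of the paper's estimators:

```python
import numpy as np

rng = np.random.default_rng(7)

def one_bit(x, D):
    """Uniformly dithered one-bit quantization: truncate to [-D, D],
    add uniform dither on [-D, D], keep only the sign."""
    x = np.clip(x, -D, D)                  # truncation step
    u = rng.uniform(-D, D, size=x.shape)   # dithering step
    return np.sign(x + u)                  # quantization step

# Each sample is reduced to a single bit, yet the rescaled average of
# those bits remains a nearly unbiased estimator of the mean, because
# for |x| <= D we have E[D * sign(x + u)] = x.
n, D = 100_000, 4.0
x = 1.5 + rng.normal(size=n)               # true mean 1.5
mean_hat = D * one_bit(x, D).mean()
```

The truncation level trades a small clipping bias against the variance inflation factor $D$; the paper tunes this trade-off to get near-minimax rates.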
Network data are ubiquitous in modern machine learning, with tasks of interest including node classification, node clustering and link prediction. A frequent approach begins by learning a Euclidean embedding of the network, to which algorithms developed for vector-valued data are applied. For large networks, embeddings are learned using stochastic gradient methods where the sub-sampling scheme can be freely chosen. Despite the strong empirical performance of such methods, they are not well understood theoretically. Our work encapsulates representation methods using a subsampling approach, such as node2vec, into a single unifying framework. We prove, under the assumption that the graph is exchangeable, that the distribution of the learned embedding vectors asymptotically decouples. Moreover, we characterize the asymptotic distribution and provide rates of convergence, in terms of the latent parameters, which include the choice of loss function and the embedding dimension. This provides a theoretical foundation to understand what the embedding vectors represent and how well these methods perform on downstream tasks. Notably, we observe that typically used loss functions may lead to shortcomings, such as a lack of Fisher consistency.
This paper considers the problem of estimating high-dimensional Laplacian-constrained precision matrices by minimizing Stein's loss. We obtain a necessary and sufficient condition for the existence of this estimator, which boils down to checking whether a certain data-dependent graph is connected. We also prove consistency in the high-dimensional setting under the symmetrized Stein loss. We show that the error rate does not depend on the graph sparsity, or other types of structure, and that the Laplacian constraints alone are sufficient to achieve high-dimensional consistency. Our proofs exploit properties of graph Laplacians, together with a characterization of the proposed estimator based on effective graph resistances. We validate our theoretical claims with numerical experiments.
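The existence condition above reduces to a graph-connectivity check. The specific data-dependent graph is defined in the paper; the check itself is a standard breadth-first search, sketched here on a toy adjacency matrix of my own:

```python
import numpy as np
from collections import deque

def is_connected(adj):
    """Breadth-first search check that an undirected graph, given as a
    boolean adjacency matrix, is connected."""
    p = len(adj)
    seen, queue = {0}, deque([0])
    while queue:
        i = queue.popleft()
        for j in range(p):
            if adj[i][j] and j not in seen:
                seen.add(j)
                queue.append(j)
    return len(seen) == p

# Example: a path graph on 4 nodes is connected ...
path = np.zeros((4, 4), dtype=bool)
for i in range(3):
    path[i, i + 1] = path[i + 1, i] = True
# ... while removing the middle edge disconnects it.
broken = path.copy()
broken[1, 2] = broken[2, 1] = False
```

In the estimation setting, running such a check on the data-dependent graph before solving the optimization tells the practitioner whether the Laplacian-constrained estimator exists at all.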