Smart Paper Notes

Error analysis for denoising smooth modulo signals on a graph

Hemant Tyagi

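Category: Machine Learning (Statistics)

2020-09-10

In many applications, we are given access to noisy modulo samples of a smooth function, with the goal being to robustly unwrap the samples, i.e., to estimate the original samples of the function. In a recent work, Cucuringu and Tyagi proposed denoising the modulo samples by first representing them on the unit complex circle and then solving a smoothness-regularized least squares problem (the smoothness measured w.r.t. the Laplacian of a suitable proximity graph $G$) on the product manifold of unit circles. This problem is a quadratically constrained quadratic program (QCQP), which is nonconvex; hence they proposed solving its sphere relaxation, leading to a trust region subproblem (TRS). In terms of theoretical guarantees, $\ell_2$ error bounds were derived for (TRS). These bounds are, however, weak in general and do not really demonstrate the denoising performed by (TRS). In this work, we analyse (TRS) as well as an unconstrained relaxation of (QCQP). For both these estimators we provide a refined analysis in the setting of Gaussian noise and derive noise regimes where they provably denoise the modulo observations w.r.t. the $\ell_2$ norm. The analysis is performed in a general setting where $G$ is any connected graph.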
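
The pipeline described in the abstract above (embed the modulo samples on the unit circle, smooth them with a graph Laplacian penalty, read the angles back) can be sketched as follows. This is only a minimal illustration under assumptions that are not in the abstract: samples are taken modulo 1, $G$ is a path graph, and the unconstrained relaxation is taken to be $\min_g \|g - z\|_2^2 + \lambda\, g^* L g$, whose minimizer is $(I + \lambda L)^{-1} z$. The function names and the value of `lam` are hypothetical.

```python
import numpy as np

def path_laplacian(n):
    """Combinatorial Laplacian of a path graph on n nodes (an assumed choice of G)."""
    L = np.zeros((n, n))
    i = np.arange(n - 1)
    L[i, i] += 1.0
    L[i + 1, i + 1] += 1.0
    L[i, i + 1] -= 1.0
    L[i + 1, i] -= 1.0
    return L

def denoise_mod1(y, L, lam):
    """Denoise mod-1 samples y via an unconstrained Laplacian-regularized least squares."""
    z = np.exp(2j * np.pi * np.asarray(y, dtype=float))  # embed on the unit circle
    g = np.linalg.solve(np.eye(len(z)) + lam * L, z)      # minimizer of ||g - z||^2 + lam * g^* L g
    return (np.angle(g) / (2 * np.pi)) % 1.0              # angles back to mod-1 values

def wrap_dist(a, b):
    """Wrap-around distance on [0, 1)."""
    d = (np.asarray(a) - np.asarray(b)) % 1.0
    return np.minimum(d, 1.0 - d)
```

Comparing `wrap_dist(denoise_mod1(y, L, lam), f % 1)` with `wrap_dist(y, f % 1)` on synthetic data gives a quick check of whether a given `lam` actually reduces the error.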

translated by Google Translate

Dynamic Ranking and Translation Synchronization

Ernesto Araya , Eglantine Karlé , Hemant Tyagi

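Category: Machine Learning (Statistics)

2022-07-04

In many applications, such as sports tournaments or recommendation systems, we have at our disposal data consisting of pairwise comparisons between a set of $n$ items (or players). The objective is to use this data to infer the latent strength of each item and/or their ranking. Existing results for this problem predominantly focus on the setting consisting of a single comparison graph $G$. However, there exist scenarios (e.g., sports tournaments) where the pairwise comparison data evolves with time. Theoretical results for this dynamic setting are relatively limited, and it is the focus of this paper. We study an extension of the \emph{translation synchronization} problem to the dynamic setting. In this setup, we are given a sequence of comparison graphs $(G_t)_{t \in \mathcal{T}}$, where $\mathcal{T} \subset [0,1]$ is a grid representing the time domain, and for each item $i$ and time $t \in \mathcal{T}$ there is an associated unknown strength parameter $z^*_{t,i} \in \mathbb{R}$. We aim to recover, for $t \in \mathcal{T}$, the strength vector $z^*_t = (z^*_{t,1}, \cdots, z^*_{t,n})$ from noisy measurements of $z^*_{t,i} - z^*_{t,j}$, where $\{i,j\}$ is an edge in $G_t$. Assuming that $z^*_t$ evolves smoothly in $t$, we propose two estimators: one based on a smoothness-penalized least squares approach and the other based on projection onto the low-frequency eigenspace of a suitable smoothness operator. For both estimators, we provide finite sample bounds for the $\ell_2$ estimation error under the assumption that $G_t$ is connected for all $t \in \mathcal{T}$, thus proving the consistency of the proposed methods in terms of the grid size $|\mathcal{T}|$. We complement our theoretical findings with experiments on synthetic and real data.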
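
A smoothness-penalized least squares estimator of the kind mentioned in the abstract above can be sketched as below. The exact penalty used in the paper is not stated in the abstract, so this sketch assumes a simple first-difference penalty $\lambda \sum_t \|z_{t+1} - z_t\|_2^2$ and fixes the shift ambiguity by centering each $z_t$; the function names are hypothetical.

```python
import numpy as np

def incidence(n, edges):
    """Edge-incidence matrix: the row for edge (i, j) measures z_i - z_j."""
    B = np.zeros((len(edges), n))
    for r, (i, j) in enumerate(edges):
        B[r, i], B[r, j] = 1.0, -1.0
    return B

def smooth_sync(n, edge_lists, ys, lam):
    """Jointly estimate z_1..z_T from noisy differences, penalizing changes over time."""
    T = len(edge_lists)
    rows, rhs = [], []
    for t, (edges, y) in enumerate(zip(edge_lists, ys)):
        blk = np.zeros((len(edges), T * n))
        blk[:, t * n:(t + 1) * n] = incidence(n, edges)   # data-fit rows for time t
        rows.append(blk)
        rhs.append(np.asarray(y, dtype=float))
    for t in range(T - 1):
        blk = np.zeros((n, T * n))
        blk[:, t * n:(t + 1) * n] = -np.sqrt(lam) * np.eye(n)
        blk[:, (t + 1) * n:(t + 2) * n] = np.sqrt(lam) * np.eye(n)
        rows.append(blk)                                  # assumed penalty: lam * ||z_{t+1} - z_t||^2
        rhs.append(np.zeros(n))
    A, b = np.vstack(rows), np.concatenate(rhs)
    z = np.linalg.lstsq(A, b, rcond=None)[0].reshape(T, n)
    return z - z.mean(axis=1, keepdims=True)              # strengths are identifiable only up to a shift
```

Stacking everything into one dense least squares system keeps the sketch short; for large $n$ or $|\mathcal{T}|$ a sparse solver would be the more natural choice.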

Minimax Estimation of Linear Functions of Eigenvectors in the Face of Small Eigen-Gaps

Gen Li , Changxiao Cai , H. Vincent Poor , Yuxin Chen

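Category: Machine Learning | Machine Learning (Statistics)

2021-04-07

Eigenvector perturbation analysis plays a vital role in various data science applications. A large body of prior works, however, focused on establishing $\ell_{2}$ eigenvector perturbation bounds, which are often highly inadequate in addressing tasks that rely on the fine-grained behavior of an eigenvector. This paper makes progress on this by studying the perturbation of linear functions of an unknown eigenvector. Focusing on two fundamental problems, matrix denoising and principal component analysis, in the presence of Gaussian noise, we develop a suite of statistical theory that characterizes the perturbation of arbitrary linear functions of an unknown eigenvector. In order to mitigate a non-negligible bias issue inherent to the natural "plug-in" estimator, we develop de-biased estimators that (1) achieve minimax lower bounds for a family of scenarios (modulo some logarithmic factor), and (2) can be computed in a data-driven manner without sample splitting. Notably, the proposed estimators are nearly minimax optimal even when the associated eigen-gap is {\em substantially smaller} than what is required in prior statistical theory.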

Recovering Hölder smooth functions from noisy modulo samples

Michaël Fanuel , Hemant Tyagi

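Category: Machine Learning (Statistics)

2021-12-02

In signal processing, several applications involve the recovery of a function given noisy modulo samples. The setting considered in this paper is that the samples, corrupted by additive Gaussian noise, are wrapped due to the modulo operation. Typical examples of this problem arise in phase unwrapping problems or in the context of self-reset analog-to-digital converters. We consider a fixed design setting where the modulo samples are given on a regular grid. Then, a three-stage recovery strategy is proposed to recover the ground truth signal up to a global integer shift. The first stage denoises the modulo samples by using local polynomial estimators. In the second stage, an unwrapping algorithm is applied to the denoised modulo samples on the grid. Finally, a spline-based quasi-interpolant operator is used to yield an estimate of the ground truth function up to a global integer shift. For a function in a Hölder class, uniform error rates are given for the recovery performance with high probability. This extends recent results obtained by Fanuel and Tyagi for Lipschitz smooth functions, wherein $k$NN regression was used in the denoising step.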
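
The second (unwrapping) stage can be illustrated with a simple sequential scheme. The abstract does not specify the unwrapping algorithm, so the sketch below is an assumed stand-in: it takes samples modulo 1 on a one-dimensional grid and assumes consecutive true values differ by less than 1/2. The name `unwrap_mod1` is hypothetical.

```python
import numpy as np

def unwrap_mod1(y):
    """Sequentially unwrap mod-1 samples on a 1D grid.

    Assumes neighbouring true values differ by less than 1/2, so each wrapped
    increment can be mapped back into [-1/2, 1/2] by subtracting its nearest integer.
    """
    y = np.asarray(y, dtype=float)
    d = np.diff(y)
    d -= np.round(d)                      # undo jumps caused by the modulo operation
    return np.concatenate(([y[0]], y[0] + np.cumsum(d)))
```

The output agrees with the underlying signal only up to a global integer shift, which matches the kind of guarantee stated in the abstract.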

Minimax Optimal Regression over Sobolev Spaces via Laplacian Eigenmaps on Neighborhood Graphs

Alden Green , Sivaraman Balakrishnan , Ryan J. Tibshirani

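Category: Machine Learning (Statistics)

2021-11-14

In this paper we study the statistical properties of Principal Components Regression with Laplacian Eigenmaps (PCR-LE), a method for nonparametric regression based on Laplacian Eigenmaps (LE). PCR-LE works by projecting a vector of observed responses ${\bf y} = (y_1, \ldots, y_n)$ onto a subspace spanned by certain eigenvectors of a neighborhood graph Laplacian. We show that PCR-LE achieves minimax rates of convergence for random design regression over Sobolev spaces. Under sufficient smoothness conditions on the design density $p$, PCR-LE achieves the optimal rates for both estimation (where the optimal rate in squared $L^2$ norm is known to be $n^{-2s/(2s+d)}$) and goodness-of-fit testing ($n^{-4s/(4s+d)}$). We also show that PCR-LE is \emph{manifold adaptive}: that is, we consider the situation where the design is supported on a manifold of small intrinsic dimension $m$, and give upper bounds establishing that PCR-LE achieves the faster minimax estimation ($n^{-2s/(2s+m)}$) and testing ($n^{-4s/(4s+m)}$) rates of convergence. Interestingly, these rates are almost always much faster than the known rates of convergence of graph Laplacian eigenvectors; in other words, for this problem regression with estimated features appears to be much easier, statistically speaking, than estimating the features themselves. We support these theoretical results with empirical evidence.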
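
The projection step described in the abstract above can be sketched in a few lines. The graph construction here (an unweighted $\varepsilon$-neighborhood graph with the unnormalized Laplacian) and the choice of the $K$ eigenvectors with smallest eigenvalues are assumptions made for illustration; `pcr_le`, `radius`, and `K` are hypothetical names.

```python
import numpy as np

def pcr_le(X, y, radius, K):
    """Project responses y onto the K lowest-frequency Laplacian eigenvectors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)  # squared pairwise distances
    W = (D2 <= radius ** 2).astype(float)                      # epsilon-neighborhood adjacency
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W                             # unnormalized graph Laplacian
    _, V = np.linalg.eigh(L)                                   # eigenvalues in ascending order
    Vk = V[:, :K]
    return Vk @ (Vk.T @ y)                                     # fitted values at the design points
```

With `K = 1` on a connected graph the fit collapses to the sample mean, and with `K = n` it reproduces `y` exactly, so `K` plays the role of a smoothing parameter.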

A Non-Asymptotic Framework for Approximate Message Passing in Spiked Models

Gen Li , Yuting Wei

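Category: Machine Learning | Machine Learning (Statistics)

2022-08-05

Approximate message passing (AMP) emerges as an effective iterative paradigm for solving high-dimensional statistical problems. However, prior AMP theory, which focused mostly on high-dimensional asymptotics, fell short of predicting the AMP dynamics when the number of iterations surpasses $o\big(\frac{\log n}{\log\log n}\big)$ (with $n$ the problem dimension). To address this inadequacy, this paper develops a non-asymptotic framework for understanding AMP in spiked matrix estimation. Built upon a new decomposition of AMP updates and controllable residual terms, we lay out an analysis recipe to characterize the finite-sample behavior of AMP in the presence of an independent initialization, which is further generalized to allow for spectral initialization. As two concrete consequences of the proposed analysis recipe: (i) when solving $\mathbb{Z}_2$ synchronization, we predict the behavior of spectrally initialized AMP for up to $O\big(\frac{n}{\mathrm{poly}\log n}\big)$ iterations, showing that the algorithm succeeds without the need for a subsequent refinement stage (as conjectured recently by \citet{celentano2021local}); (ii) we characterize the non-asymptotic behavior of AMP in sparse PCA (in the spiked Wigner model) for a broad range of signal-to-noise ratios.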

The Voronoigram: Minimax Estimation of Bounded Variation Functions From Scattered Data

Addison J. Hu , Alden Green , Ryan J. Tibshirani

Category: Machine Learning (Statistics) | Machine Learning

2022-12-30

We consider the problem of estimating a multivariate function $f_0$ of bounded variation (BV), from noisy observations $y_i = f_0(x_i) + z_i$ made at random design points $x_i \in \mathbb{R}^d$, $i=1,\ldots,n$. We study an estimator that forms the Voronoi diagram of the design points, and then solves an optimization problem that regularizes according to a certain discrete notion of total variation (TV): the sum of weighted absolute differences of parameters $\theta_i,\theta_j$ (which estimate the function values $f_0(x_i),f_0(x_j)$) at all neighboring cells $i,j$ in the Voronoi diagram. This is seen to be equivalent to a variational optimization problem that regularizes according to the usual continuum (measure-theoretic) notion of TV, once we restrict the domain to functions that are piecewise constant over the Voronoi diagram. The regression estimator under consideration hence performs (shrunken) local averaging over adaptively formed unions of Voronoi cells, and we refer to it as the Voronoigram, following the ideas in Koenker (2005), and drawing inspiration from Tukey's regressogram (Tukey, 1961). Our contributions in this paper span both the conceptual and theoretical frontiers: we discuss some of the unique properties of the Voronoigram in comparison to TV-regularized estimators that use other graph-based discretizations; we derive the asymptotic limit of the Voronoi TV functional; and we prove that the Voronoigram is minimax rate optimal (up to log factors) for estimating BV functions that are essentially bounded.

A Cross Validation framework for Signal Denoising with Applications to Trend Filtering, Dyadic CART and Beyond

Anamitra Chaudhuri , Sabyasachi Chatterjee

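Category: Machine Learning (Statistics)

2022-01-07

This paper formulates a general cross validation framework for signal denoising. The general framework is then applied to nonparametric regression methods such as Trend Filtering and Dyadic CART. The resulting cross validated versions are then shown to attain nearly the same rates of convergence as are known for the optimally tuned analogues. There did not exist any previous theoretical analyses of cross validated versions of Trend Filtering or Dyadic CART. To illustrate the generality of the framework, we also propose and study cross validated versions of two fundamental estimators: the lasso for high-dimensional linear regression and singular value thresholding for matrix estimation. Our general framework is inspired by the ideas in Chatterjee and Jafarov (2015) and is potentially applicable to a wide range of estimation methods which use tuning parameters.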

Multivariate Trend Filtering for Lattice Data

Veeranjaneyulu Sadhanala , Yu-Xiang Wang , Addison J. Hu , Ryan J. Tibshirani

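Category: Machine Learning (Statistics) | Machine Learning

2021-12-29

We study a multivariate version of trend filtering, called Kronecker trend filtering or KTF, for the case in which the design points form a lattice in $d$ dimensions. KTF is a natural extension of univariate trend filtering (Steidl et al., 2006; Kim et al., 2009; Tibshirani, 2014), and is defined by minimizing a penalized least squares problem whose penalty term sums the absolute (higher-order) differences of the parameter to be estimated along each of the coordinate directions. The corresponding penalty operator can be written in terms of Kronecker products of univariate trend filtering penalty operators, hence the name Kronecker trend filtering. Equivalently, one can view KTF in terms of an $\ell_1$-penalized basis regression problem where the basis functions are tensor products of falling factorial functions, a piecewise polynomial (discrete spline) basis that underlies univariate trend filtering. This paper is a unification and extension of the results in Sadhanala et al. (2016, 2017). We develop a complete set of theoretical results that describe the behavior of $k^{\mathrm{th}}$ order Kronecker trend filtering in $d$ dimensions, for every $k \geq 0$ and $d \geq 1$. This reveals a number of interesting phenomena, including the dominance of KTF over linear smoothers in estimating heterogeneously smooth functions, and a phase transition at $d = 2(k+1)$, a boundary past which (on the high dimension-to-smoothness side) linear smoothers fail to be consistent entirely. We also leverage recent results on discrete splines from Tibshirani (2020), in particular discrete spline interpolation results that enable us to extend the KTF estimate to any off-lattice location in constant time (independent of the size of the lattice $n$).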
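
To make the Kronecker structure of the penalty concrete, here is a small sketch for $d = 2$. It assumes the parameter array of shape `(n1, n2)` is flattened in row-major order and that the univariate operator is the plain $(k+1)$-th order difference matrix on an evenly spaced grid; the function names are hypothetical.

```python
import numpy as np

def diff_op(n, order):
    """Univariate difference operator of the given order, shape (n - order, n)."""
    D = np.eye(n)
    for _ in range(order):
        D = np.diff(D, axis=0)
    return D

def ktf_penalty_2d(n1, n2, k):
    """Stacked k-th order KTF penalty operator on an n1 x n2 lattice (row-major vectorization)."""
    D1 = diff_op(n1, k + 1)
    D2 = diff_op(n2, k + 1)
    return np.vstack([
        np.kron(D1, np.eye(n2)),   # differences along the first coordinate
        np.kron(np.eye(n1), D2),   # differences along the second coordinate
    ])
```

The KTF penalty for a parameter array `theta` is then `np.abs(ktf_penalty_2d(n1, n2, k) @ theta.ravel()).sum()`, which vanishes for functions that are polynomials of degree at most $k$ in each coordinate separately.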

Tight bounds for minimum l1-norm interpolation of noisy data

Guillaume Wang , Konstantin Donhauser , Fanny Yang

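Category: Machine Learning | Machine Learning (Statistics)

2021-11-10

We provide matching upper and lower bounds of order $\sigma^2/\log(d/n)$ for the prediction error of the minimum $\ell_1$-norm interpolator, a.k.a. basis pursuit. Our result is tight up to negligible terms, and is the first to imply asymptotic consistency of noisy minimum-norm interpolation for isotropic features and sparse ground truths. Our work complements the literature on "benign overfitting" for minimum $\ell_2$-norm interpolation, where asymptotic consistency can be achieved only when the features are effectively low-dimensional.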

Tight bounds for maximum $\ell_1$-margin classifiers

Stefan Stojanovic , Konstantin Donhauser , Fanny Yang

Category: Machine Learning (Statistics) | Machine Learning

2022-12-07

Popular iterative algorithms such as boosting methods and coordinate descent on linear models converge to the maximum $\ell_1$-margin classifier, a.k.a. sparse hard-margin SVM, in high dimensional regimes where the data is linearly separable. Previous works consistently show that many estimators relying on the $\ell_1$-norm achieve improved statistical rates for hard sparse ground truths. We show that surprisingly, this adaptivity does not apply to the maximum $\ell_1$-margin classifier for a standard discriminative setting. In particular, for the noiseless setting, we prove tight upper and lower bounds for the prediction error that match existing rates of order $\frac{\|\wgt\|_1^{2/3}}{n^{1/3}}$ for general ground truths. To complete the picture, we show that when interpolating noisy observations, the error vanishes at a rate of order $\frac{1}{\sqrt{\log(d/n)}}$. We are therefore first to show benign overfitting for the maximum $\ell_1$-margin classifier.

Bless and curse of smoothness and phase transitions in nonparametric regressions: a nonasymptotic perspective

Ying Zhu

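Category: Machine Learning

2021-12-07

When the regression function belongs to the standard smooth classes consisting of univariate functions with derivatives up to the $(\gamma+1)$th order bounded by a common constant everywhere or a.e., it is well known that the minimax optimal rate of convergence in mean squared error (MSE) is $\left(\frac{\sigma^{2}}{n}\right)^{\frac{2\gamma+2}{2\gamma+3}}$ when $\gamma$ is finite and the sample size $n \rightarrow \infty$. From a nonasymptotic viewpoint that considers finite $n$, this paper shows that, for the standard Hölder and Sobolev classes, the minimax optimal rate is $\frac{\sigma^{2}\left(\gamma \vee 1\right)}{n}$ when $\frac{n}{\sigma^{2}} \precsim \left(\gamma \vee 1\right)^{2\gamma+3}$ and $\left(\frac{\sigma^{2}}{n}\right)^{\frac{2\gamma+2}{2\gamma+3}}$ when $\frac{n}{\sigma^{2}} \succsim \left(\gamma \vee 1\right)^{2\gamma+3}$. To establish these results, we derive upper and lower bounds on the covering and packing numbers for the generalized Hölder class where the $k$th ($k = 0, \ldots, \gamma$) derivative is bounded from above by a parameter $R_{k}$ and the $\gamma$th derivative is $R_{\gamma+1}$-Lipschitz (and also for the generalized ellipsoid class of smooth functions). Our bounds sharpen the classical metric entropy results for the standard classes, and give the general dependence on $\gamma$ and $R_{k}$. By deriving the minimax optimal MSE rates under $R_{k} = 1$, $R_{k} \leq \left(k-1\right)!$ and $R_{k} = k!$ (with the latter two cases being motivated in our introduction) with the help of our new entropy bounds, we show a couple of interesting results that cannot be shown with the existing entropy bounds in the literature. For the Hölder class of $d$-variate functions, our result suggests that the classical asymptotic rate $\left(\frac{\sigma^{2}}{n}\right)^{\frac{2\gamma+2}{2\gamma+2+d}}$ could be an underestimate of the MSE in finite samples.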

Adversarial Sign-Corrupted Isotonic Regression

Shamindra Shrotriya , Matey Neykov

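Category: Machine Learning (Statistics)

2022-07-14

Classical univariate isotonic regression involves nonparametric estimation under a monotonicity constraint on the true signal. We consider a variation of this generating process, which we term adversarial sign-corrupted isotonic (\texttt{ASCI}) regression. Under this \texttt{ASCI} setting, the adversary has full access to the true isotonic responses and is free to sign-corrupt them. Estimating the true monotonic signal given these sign-corrupted responses is a highly challenging task. Notably, the sign-corruptions are designed to violate monotonicity, and possibly induce heavy dependence between the corrupted response terms. In this sense, \texttt{ASCI} regression may be viewed as an adversarial stress test for isotonic regression. Our motivation is driven by understanding whether efficient robust estimation of the monotone signal is feasible under this adversarial setting. We develop \texttt{ASCIFIT}, a three-step estimation procedure under the \texttt{ASCI} setting. The \texttt{ASCIFIT} procedure is conceptually simple, easy to implement with existing software, and consists of applying the \texttt{PAVA} with crucial pre- and post-processing corrections. We formalize this procedure, and demonstrate its theoretical guarantees in the form of sharp high-probability upper bounds and minimax lower bounds. We illustrate our findings with detailed simulations.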
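
For reference, the PAVA building block named in the abstract above (the pool adjacent violators algorithm for least squares isotonic regression) can be sketched as follows. This covers only the standard, unweighted PAVA step; the pre- and post-processing corrections that make up the rest of the ASCIFIT procedure are not described in the abstract and are deliberately not reproduced here. The function name `pava` is hypothetical.

```python
def pava(y):
    """Least squares non-decreasing fit to the sequence y via pool adjacent violators."""
    blocks = []  # each block is [sum of values, number of values]
    for v in y:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block's mean exceeds the current one.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)  # every point in a block gets the block mean
    return fit
```

For example, `pava([1, 3, 2, 4])` pools the violating pair `3, 2` into their mean and returns `[1.0, 2.5, 2.5, 4.0]`.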

Optimal transport map estimation in general function spaces

Vincent Divol , Jonathan Niles-Weed , Aram-Alexandre Pooladian

Category: Machine Learning (Statistics)

2022-12-07

We consider the problem of estimating the optimal transport map between a (fixed) source distribution $P$ and an unknown target distribution $Q$, based on samples from $Q$. The estimation of such optimal transport maps has become increasingly relevant in modern statistical applications, such as generative modeling. At present, estimation rates are only known in a few settings (e.g. when $P$ and $Q$ have densities bounded above and below and when the transport map lies in a H\"older class), which are often not reflected in practice. We present a unified methodology for obtaining rates of estimation of optimal transport maps in general function spaces. Our assumptions are significantly weaker than those appearing in the literature: we require only that the source measure $P$ satisfies a Poincar\'e inequality and that the optimal map be the gradient of a smooth convex function that lies in a space whose metric entropy can be controlled. As a special case, we recover known estimation rates for bounded densities and H\"older transport maps, but also obtain nearly sharp results in many settings not covered by prior work. For example, we provide the first statistical rates of estimation when $P$ is the normal distribution and the transport map is given by an infinite-width shallow neural network.

Off-policy estimation of linear functionals: Non-asymptotic theory for semi-parametric efficiency

Wenlong Mou , Martin J. Wainwright , Peter L. Bartlett

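Category: Machine Learning (Statistics)

2022-09-26

The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures. We analyze a broad class of two-stage procedures that first estimate the treatment effect function, and then use this quantity to estimate the linear functional. We prove non-asymptotic upper bounds on the mean-squared error of such procedures: these bounds reveal that in order to obtain non-asymptotically optimal procedures, the error in estimating the treatment effect should be minimized in a certain weighted $L^2$-norm. We analyze a two-stage procedure based on constrained regression in this weighted norm, and establish its instance-dependent optimality in finite samples via matching non-asymptotic local minimax lower bounds. These results show that the optimal non-asymptotic risk, in addition to depending on the asymptotically efficient variance, depends on the weighted norm distance between the true outcome function and its approximation by the richest function class supported by the sample size.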

Perturbation Analysis of Randomized SVD and its Applications to High-dimensional Statistics

Yichi Zhang , Minh Tang

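Category: Machine Learning (Statistics)

2022-03-19

Randomized singular value decomposition (RSVD) is a class of computationally efficient algorithms for computing the truncated SVD of large data matrices. Given an $n \times n$ symmetric matrix $\mathbf{M}$, the prototypical RSVD algorithm outputs an approximation of the $k$ leading singular vectors of $\mathbf{M}$ by computing the SVD of $\mathbf{M}^{g} \mathbf{G}$; here $g \geq 1$ is an integer and $\mathbf{G} \in \mathbb{R}^{n \times k}$ is a random Gaussian sketching matrix. In this paper we study the statistical properties of RSVD under a general "signal-plus-noise" framework, i.e., the observed matrix $\hat{\mathbf{M}}$ is assumed to be an additive perturbation of some true but unknown signal matrix $\mathbf{M}$. We first derive upper bounds for the $\ell_2$ (spectral norm) and $\ell_{2 \to \infty}$ (maximum row-wise $\ell_2$ norm) distances between the approximate singular vectors of $\hat{\mathbf{M}}$ and the true singular vectors of the signal matrix $\mathbf{M}$. These upper bounds depend on the signal-to-noise ratio (SNR) and the number of power iterations $g$. A phase transition phenomenon is observed in which a smaller SNR requires larger values of $g$ to guarantee convergence of the $\ell_2$ and $\ell_{2 \to \infty}$ distances. We also show that the thresholds for $g$ where these phase transitions occur are sharp whenever the noise matrices satisfy a certain trace growth condition. Finally, we derive normal approximations for the row-wise fluctuations of the approximate singular vectors and the entrywise fluctuations of the approximate matrix. We illustrate our theoretical results by deriving nearly optimal performance guarantees for RSVD when applied to three statistical inference problems, namely community detection, matrix completion, and principal component analysis with missing data.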
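
The prototypical algorithm described in the abstract above is short enough to write out. The sketch below follows that description (left singular vectors of $\mathbf{M}^{g}\mathbf{G}$ with a Gaussian $\mathbf{G}$); the function name and the use of an explicit matrix power are illustrative choices, and a practical implementation would typically re-orthonormalize between multiplications for numerical stability.

```python
import numpy as np

def rsvd_vectors(M, k, g, rng):
    """Approximate the k leading singular vectors of a symmetric matrix M.

    Follows the prototypical scheme: sketch with a Gaussian matrix G, apply g
    powers of M, and take the left singular vectors of M^g G.
    """
    n = M.shape[0]
    G = rng.standard_normal((n, k))              # random Gaussian sketching matrix
    Y = np.linalg.matrix_power(M, g) @ G         # power iterations applied to the sketch
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    return U                                     # n x k matrix with orthonormal columns
```

When `M` has exact rank `k`, the returned columns span the same subspace as the true leading singular vectors; with noise, the abstract's results describe how large `g` must be for the approximation to remain accurate.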

Tractability from overparametrization: The example of the negative perceptron

Andrea Montanari , Yiqiao Zhong , Kangjie Zhou

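Category: Machine Learning

2021-10-28

In the negative perceptron problem we are given $n$ data points $({\boldsymbol x}_i, y_i)$, where ${\boldsymbol x}_i$ is a $d$-dimensional vector and $y_i \in \{+1, -1\}$ is a binary label. The data are not linearly separable, and hence we content ourselves with finding a linear classifier with the largest possible \emph{negative} margin. In other words, we want to find a unit norm vector ${\boldsymbol \theta}$ that maximizes $\min_{i \le n} y_i \langle {\boldsymbol \theta}, {\boldsymbol x}_i \rangle$. This is a non-convex optimization problem (it is equivalent to finding a maximum norm vector in a polytope), and we study its typical properties under two random models for the data. We consider the proportional asymptotics in which $n, d \to \infty$ with $n/d \to \delta$, and prove upper and lower bounds on the maximum margin $\kappa_{\text{s}}(\delta)$ or, equivalently, on its inverse function $\delta_{\text{s}}(\kappa)$. In other words, $\delta_{\text{s}}(\kappa)$ is the overparametrization threshold: for $n/d \le \delta_{\text{s}}(\kappa) - \varepsilon$ a classifier achieving vanishing training error exists with high probability, while for $n/d \ge \delta_{\text{s}}(\kappa) + \varepsilon$ it does not. Our bounds on $\delta_{\text{s}}(\kappa)$ match to the leading order as $\kappa \to -\infty$. We then analyze a linear programming algorithm to find a solution, and characterize the corresponding threshold $\delta_{\text{lin}}(\kappa)$. We observe a gap between the interpolation threshold $\delta_{\text{s}}(\kappa)$ and the linear programming threshold $\delta_{\text{lin}}(\kappa)$, raising the question of the behavior of other algorithms.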

Distributed Sparse Regression via Penalization

Yao Ji , Gesualdo Scutari , Ying Sun , Harsha Honnappa

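Category: Machine Learning

2021-11-12

We study sparse linear regression over a network of agents, modeled as an undirected graph (with no centralized node). The estimation problem is formulated as the minimization of the sum of the local LASSO loss functions plus a quadratic penalty of the consensus constraint, the latter being instrumental to obtain distributed solution methods. While penalty-based consensus methods have been extensively studied in the optimization literature, their statistical and computational guarantees in the high-dimensional setting remain unclear. This work provides an answer to this open problem. Our contribution is two-fold. First, we establish statistical consistency of the estimator: under a suitable choice of the penalty parameter, the optimal solution of the penalized problem achieves the near-optimal minimax rate $\mathcal{O}(s \log d/N)$ in $\ell_2$-loss, where $s$ is the sparsity value, $d$ is the ambient dimension, and $N$ is the total sample size in the network; this matches centralized sample rates. Second, we show that the proximal-gradient algorithm applied to the penalized problem, which naturally leads to distributed implementations, converges linearly up to a tolerance of the order of the centralized statistical error; the rate scales as $\mathcal{O}(d)$, revealing an unavoidable speed-accuracy dilemma. Numerical results demonstrate the tightness of the derived sample rate and convergence rate scalings.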

Asymptotics of Network Embeddings Learned via Subsampling

Andrew Davison , Morgane Austern

Category: Machine Learning (Statistics) | Machine Learning

2021-07-06

Network data are ubiquitous in modern machine learning, with tasks of interest including node classification, node clustering and link prediction. A frequent approach begins by learning a Euclidean embedding of the network, to which algorithms developed for vector-valued data are applied. For large networks, embeddings are learned using stochastic gradient methods where the sub-sampling scheme can be freely chosen. Despite the strong empirical performance of such methods, they are not well understood theoretically. Our work encapsulates representation methods using a subsampling approach, such as node2vec, into a single unifying framework. We prove, under the assumption that the graph is exchangeable, that the distribution of the learned embedding vectors asymptotically decouples. Moreover, we characterize the asymptotic distribution and provide rates of convergence, in terms of the latent parameters, which includes the choice of loss function and the embedding dimension. This provides a theoretical foundation to understand what the embedding vectors represent and how well these methods perform on downstream tasks. Notably, we observe that typically used loss functions may lead to shortcomings, such as a lack of Fisher consistency.

The Lasso with general Gaussian designs with applications to hypothesis testing

Michael Celentano , Andrea Montanari , Yuting Wei

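Category: Machine Learning | Machine Learning (Statistics)

2020-07-27

The Lasso is a method for high-dimensional regression, which is now commonly used when the number of covariates $p$ is of the same order as or larger than the number of observations $n$. Classical asymptotic normality theory does not apply to this model for two fundamental reasons: $(1)$ the regularized risk is non-smooth; $(2)$ the distance between the estimator $\widehat{\boldsymbol{\theta}}$ and the true parameter vector $\boldsymbol{\theta}^*$ cannot be neglected. As a consequence, standard perturbative arguments that are the traditional basis for asymptotic normality fail. On the other hand, the Lasso estimator can be precisely characterized in the regime in which both $n$ and $p$ are large and $n/p$ is of order one. This characterization was first obtained in the case of Gaussian designs with i.i.d. covariates; here we generalize it to Gaussian correlated designs with non-singular covariance structure. This is expressed in terms of a simpler "fixed-design" model. We establish non-asymptotic bounds on the distance between the distribution of various quantities in the two models, which hold uniformly over signals $\boldsymbol{\theta}^*$ in a suitable sparsity class. As an application, we study the distribution of the debiased Lasso and show that a degrees-of-freedom correction is necessary for computing valid confidence intervals.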
