Smart Paper Notes

Kernel Autocovariance Operators of Stationary Processes: Estimation and Convergence

Mattes Mollenhauer , Stefan Klus , Christof Schütte , Péter Koltai

Categories: Machine Learning | Machine Learning (Statistics)

2020-04-02

We consider autocovariance operators of a stationary stochastic process on a Polish space that is embedded into a reproducing kernel Hilbert space. We investigate how empirical estimates of these operators converge along realizations of the process under various conditions. In particular, we examine ergodic and strongly mixing processes and obtain several asymptotic results as well as finite sample error bounds. We provide applications of our theory in terms of consistency results for kernel PCA with dependent data and the conditional mean embedding of transition probabilities. Finally, we use our approach to examine the nonparametric estimation of Markov transition operators and highlight how our theory can give a consistency analysis for a large family of spectral analysis methods including kernel-based dynamic mode decomposition.
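
As a hedged illustration of the objects this abstract refers to, the display below sketches one common form of a lagged kernel autocovariance operator and its empirical estimate along a single realization. The feature map $\varphi$, the lag symbol $\eta$, the sample length $n$, and the uncentered convention are assumptions introduced here for illustration; the listing itself does not fix any notation.

```latex
% Assumed notation: \varphi(x) = k(x, \cdot) is the canonical feature map
% of the RKHS, and X_1, \dots, X_{n+\eta} is one observed trajectory.
C(\eta) = \mathbb{E}\big[\varphi(X_0) \otimes \varphi(X_\eta)\big],
\qquad
\widehat{C}_n(\eta) = \frac{1}{n} \sum_{t=1}^{n} \varphi(X_t) \otimes \varphi(X_{t+\eta}).
```

Under this reading, the convergence results concern how $\widehat{C}_n(\eta)$ approaches $C(\eta)$ in a suitable operator norm as $n$ grows, when the summands are dependent rather than i.i.d.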

Optimal Rates for Regularized Conditional Mean Embedding Learning

Zhu Li , Dimitri Meunier , Mattes Mollenhauer , Arthur Gretton

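Categories: Machine Learning (Statistics) | Machine Learning

2022-08-02

We address the consistency of a kernel ridge regression estimate of the conditional mean embedding (CME), which is an embedding of the conditional distribution of $Y$ given $X$ into a target reproducing kernel Hilbert space $\mathcal{H}_Y$. The CME allows us to take conditional expectations of target RKHS functions, and has been employed in nonparametric causal and Bayesian inference. We address the misspecified setting, where the target CME lies in the space of Hilbert-Schmidt operators acting from an input interpolation space between $\mathcal{H}_X$ and $L_2$ to $\mathcal{H}_Y$. This space of operators is shown to be isomorphic to a newly defined vector-valued interpolation space. Using this isomorphism, we derive a novel and adaptive statistical learning rate for the empirical CME estimator under the misspecified setting. Our analysis reveals that our rates match the optimal $O(\log n / n)$ rates without assuming $\mathcal{H}_Y$ to be finite dimensional. We further establish a lower bound on the learning rate, which shows that the obtained upper bound is optimal.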
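
For orientation, the standard closed form of a kernel ridge regression estimate of the CME can be sketched as follows. The symbols $\phi$, $\psi$, $K_X$, $k_X(x)$, $\beta(x)$, and the regularization parameter $\lambda$ are assumptions chosen here for illustration; the abstract does not specify notation, and the scaling convention $n\lambda$ varies between references.

```latex
% Assumed notation: \phi and \psi are the feature maps of \mathcal{H}_X and \mathcal{H}_Y,
% (x_i, y_i)_{i=1}^n is the sample, K_X = [k_X(x_i, x_j)]_{i,j} is the input Gram matrix,
% and k_X(x) = (k_X(x_1, x), \dots, k_X(x_n, x))^\top.
\widehat{C}_\lambda
  = \operatorname*{arg\,min}_{C}\;
    \frac{1}{n} \sum_{i=1}^{n} \big\| \psi(y_i) - C\,\phi(x_i) \big\|_{\mathcal{H}_Y}^2
    + \lambda \,\| C \|_{\mathrm{HS}}^2,
\qquad
\widehat{\mu}_{Y \mid X = x}
  = \widehat{C}_\lambda\,\phi(x)
  = \sum_{i=1}^{n} \beta_i(x)\, \psi(y_i),
\quad
\beta(x) = (K_X + n \lambda I)^{-1} k_X(x).
```

With this form, a conditional expectation of a target RKHS function $f \in \mathcal{H}_Y$ is approximated by $\langle f, \widehat{\mu}_{Y \mid X = x} \rangle_{\mathcal{H}_Y} = \sum_{i} \beta_i(x) f(y_i)$.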

Learning Dynamical Systems via Koopman Operator Regression in Reproducing Kernel Hilbert Spaces

Vladimir Kostic , Pietro Novelli , Andreas Maurer , Carlo Ciliberto , Lorenzo Rosasco , Massimiliano Pontil

Categories: Machine Learning

2022-05-27

We study a class of dynamical systems modelled as Markov chains that admit an invariant distribution via the corresponding transfer, or Koopman, operator. While data-driven algorithms to reconstruct such operators are well known, their relationship with statistical learning is largely unexplored. We formalize a framework to learn the Koopman operator from finite data trajectories of the dynamical system. We consider the restriction of this operator to a reproducing kernel Hilbert space and introduce a notion of risk, from which different estimators naturally arise. We link the risk with the estimation of the spectral decomposition of the Koopman operator. These observations motivate a reduced-rank operator regression (RRR) estimator. We derive learning bounds for the proposed estimator, holding both in i.i.d. and non i.i.d. settings, the latter in terms of mixing coefficients. Our results suggest RRR might be beneficial over other widely used estimators as confirmed in numerical experiments both for forecasting and mode decomposition.

Is completeness necessary? Estimation in nonidentified linear models

Andrii Babii , Jean-Pierre Florens

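Categories: Machine Learning (Statistics)

2017-09-11

We show that estimators based on spectral regularization converge to the best approximation of a structural parameter in a class of nonidentified linear ill-posed inverse models. Importantly, this convergence holds in the uniform and Hilbert space norms. We describe several circumstances in which the best approximation coincides with the structural parameter, or at least reasonably approximates it, and discuss how our results can be useful in the partial identification setting. Lastly, we document that identification failures have important implications for the asymptotic distribution of a linear functional of regularized estimators, which can have a weighted chi-squared component. The theory is illustrated for various high-dimensional and nonparametric IV regressions.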

Optimal and instance-dependent guarantees for Markovian linear stochastic approximation

Wenlong Mou , Ashwin Pananjady , Martin J. Wainwright , Peter L. Bartlett

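Categories: Machine Learning | Machine Learning (Statistics)

2021-12-23

We study stochastic approximation procedures for approximately solving a $d$-dimensional linear fixed point equation based on observing a trajectory of length $n$ from an ergodic Markov chain. We first exhibit a non-asymptotic bound of order $t_{\mathrm{mix}} \tfrac{d}{n}$, where $t_{\mathrm{mix}}$ is a mixing time. We then prove a non-asymptotic instance-dependent bound on a suitably averaged sequence of iterates, with a leading term that matches the local asymptotic minimax limit, including sharp dependence on the parameters $(d, t_{\mathrm{mix}})$ in the higher order terms. We complement these upper bounds with a non-asymptotic minimax lower bound that establishes the instance-optimality of the averaged SA estimator. We derive corollaries of these results for policy evaluation with Markov noise, covering the TD($\lambda$) family of algorithms for all $\lambda \in [0, 1)$, and for linear autoregressive models. Our instance-dependent characterizations open the door to the design of fine-grained model selection procedures for hyperparameter tuning (e.g., choosing the value of $\lambda$ when running the TD($\lambda$) algorithm).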

Convergence Rates for Learning Linear Operators from Noisy Data

Maarten V. de Hoop , Nikola B. Kovachki , Nicholas H. Nelsen , Andrew M. Stuart

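Categories: Machine Learning | Machine Learning (Statistics)

2021-08-27

This paper studies the learning of linear operators between infinite-dimensional Hilbert spaces. The training data comprises pairs of random input vectors in a Hilbert space and their noisy images under an unknown self-adjoint linear operator. Assuming that the operator is diagonalizable in a known basis, this work solves the equivalent inverse problem of estimating the operator's eigenvalues given the data. Adopting a Bayesian approach, the theoretical analysis establishes posterior contraction rates in the infinite data limit with Gaussian priors that are not directly linked to the forward map of the inverse problem. The main results also include learning-theoretic generalization error guarantees for a wide range of distribution shifts. These convergence rates quantify the effects of data smoothness and true eigenvalue decay or growth, for compact or unbounded operators, respectively, on sample complexity. Numerical evidence supports the theory in diagonal and non-diagonal settings.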

Three rates of convergence or separation via U-statistics in a dependent framework

Quentin Duchemin , Yohann De Castro , Claire Lacour

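Categories: Machine Learning (Statistics)

2021-06-24

Despite the ubiquity of U-statistics in modern probability and statistics, their non-asymptotic analysis in a dependent framework may have been overlooked. In a recent work, a new concentration inequality for U-statistics of order two for uniformly ergodic Markov chains was proved. In this paper, we put this theoretical breakthrough into action by pushing further the current state of knowledge in three different active fields of research. First, we establish a new exponential inequality for the estimation of spectra of trace class integral operators with MCMC methods. The novelty is that this result holds for kernels with positive and negative eigenvalues, which is new as far as we know. In addition, we investigate the generalization performance of online algorithms working with pairwise loss functions and Markov chain samples. We provide an online-to-batch conversion result by showing how to extract a low risk hypothesis from the sequence of hypotheses generated by any online learner. Finally, we give a non-asymptotic analysis of a goodness-of-fit test on the density of the invariant measure of a Markov chain. We identify some classes of alternatives over which our test based on the $L_2$ distance has a prescribed power.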

A general framework for the analysis of kernel-based tests

Tamara Fernández , Nicolás Rivera

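Categories: Machine Learning (Statistics)

2022-08-31

Kernel-based tests provide a simple yet effective framework that uses the theory of reproducing kernel Hilbert spaces to design non-parametric testing procedures. In this paper, we propose new theoretical tools that can be used to study the asymptotic behaviour of kernel-based tests in several data scenarios and in many different testing problems. Unlike current approaches, our methods avoid the lengthy $U$- and $V$-statistic expansions and limit theorems that commonly appear in the literature, and work directly with random functionals on Hilbert spaces. Our framework therefore leads to a much simpler and cleaner analysis of kernel tests, requiring only mild regularity conditions. Furthermore, we show that, in general, our analysis cannot be improved, by proving that the regularity conditions required by our methods are both sufficient and necessary. To illustrate the effectiveness of our approach, we present a new kernel test for the conditional independence testing problem, as well as new analyses of already known kernel-based tests.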
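
To make the notion of a kernel-based test statistic concrete, here is a minimal sketch of one well-known example, the biased ($V$-statistic) estimate of the squared maximum mean discrepancy (MMD) between two samples. It is not the method of the paper above; the Gaussian kernel, the bandwidth, the function names, and the toy samples are all assumptions chosen for illustration.

```python
import math


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two real numbers."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y), including x == y pairs."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )


# Hypothetical toy samples: sample_b is shifted away from sample_a.
sample_a = [0.1, 0.4, 0.5, 0.9, 1.2]
sample_b = [2.0, 2.3, 2.9, 3.1, 3.4]

same = mmd_squared(sample_a, sample_a)
different = mmd_squared(sample_a, sample_b)
print("MMD^2 (same sample):", same)
print("MMD^2 (shifted sample):", different)
```

A test would compare such a statistic against a threshold obtained from its null distribution (for example by permutation); that calibration step is omitted from this sketch.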

Signature moments to characterize laws of stochastic processes

Ilya Chevyrev , Harald Oberhauser

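Categories: Machine Learning (Statistics)

2018-10-25

The sequence of moments of a vector-valued random variable can characterize its law. We study the analogous problem for path-valued random variables, that is, stochastic processes, by using so-called robust signature moments. This allows us to derive a metric of maximum mean discrepancy type for laws of stochastic processes and to study the topology it induces on the space of laws of stochastic processes. This metric can be kernelized using the signature kernel, which allows it to be computed efficiently. As an application, we provide a non-parametric two-sample hypothesis test for laws of stochastic processes.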

Regularized ERM on random subspaces

Andrea Della Vecchia , Jaouad Mourtada , Ernesto De Vito , Lorenzo Rosasco

Categories: Machine Learning (Statistics) | Machine Learning

2022-12-04

We study a natural extension of classical empirical risk minimization, where the hypothesis space is a random subspace of a given space. In particular, we consider possibly data dependent subspaces spanned by a random subset of the data, recovering as a special case Nystrom approaches for kernel methods. Considering random subspaces naturally leads to computational savings, but the question is whether the corresponding learning accuracy is degraded. These statistical-computational tradeoffs have been recently explored for the least squares loss and self-concordant loss functions, such as the logistic loss. Here, we work to extend these results to convex Lipschitz loss functions, that might not be smooth, such as the hinge loss used in support vector machines. This unified analysis requires developing new proofs, that use different technical tools, such as sub-gaussian inputs, to achieve fast rates. Our main results show the existence of different settings, depending on how hard the learning problem is, for which computational efficiency can be improved with no loss in performance.

A Spectral Representation of Kernel Stein Discrepancy with Application to Goodness-of-Fit Tests for Measures on Infinite Dimensional Hilbert Spaces

George Wynne , Mikołaj Kasprzak , Andrew B. Duncan

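Categories: Machine Learning (Statistics)

2022-06-09

Kernel Stein discrepancy (KSD) is a widely used kernel-based non-parametric measure of discrepancy between probability measures. It is often employed in the scenario where a user has a collection of samples from a candidate probability measure and wishes to compare them against a specified target probability measure. A useful property of KSD is that it may be calculated with samples from only the candidate measure and without knowledge of the normalising constant of the target measure. KSD has been employed in a range of settings including goodness-of-fit testing, parametric inference, MCMC output assessment and generative modelling. Two main issues with current KSD methodology are (i) the lack of applicability beyond the finite dimensional Euclidean setting and (ii) a lack of clarity on what influences KSD performance. This paper provides a novel spectral representation of KSD which remedies both of these, making KSD applicable to Hilbert-valued data and revealing the impact of kernel and Stein operator choice on the KSD. We demonstrate the efficacy of the proposed methodology by performing goodness-of-fit tests for various Gaussian and non-Gaussian functional models in a number of synthetic data experiments.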

Approximate Kernel PCA Using Random Features: Computational vs. Statistical Trade-off

Bharath Sriperumbudur , Nicholas Sterge

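Categories: Machine Learning (Statistics)

2017-06-20

Kernel methods are powerful learning methodologies that make it possible to perform non-linear data analysis. Despite their popularity, they suffer from poor scalability in big data scenarios. Various approximation methods, including random feature approximation, have been proposed to alleviate the problem. However, the statistical consistency of most of these approximate kernel methods is not well understood, except for kernel ridge regression, where it has been shown that the random feature approximation is not only computationally efficient but also statistically consistent with a minimax optimal rate of convergence. In this paper, we investigate the efficacy of random feature approximation in the context of kernel principal component analysis (KPCA) by studying the trade-off between the computational and statistical behaviors of approximate KPCA. We show that approximate KPCA is both computationally and statistically efficient compared to KPCA in terms of the error associated with reconstructing a kernel function based on its projection onto the corresponding eigenspaces. The analysis hinges on Bernstein-type inequalities for the operator and Hilbert-Schmidt norms of self-adjoint Hilbert-Schmidt operator-valued U-statistics, which are of independent interest.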

Proximal Causal Learning with Kernels: Two-Stage Estimation and Moment Restriction

Afsaneh Mastouri , Yuchen Zhu , Limor Gultchin , Anna Korba , Ricardo Silva , Matt J. Kusner , Arthur Gretton , Krikamol Muandet

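Categories: Machine Learning

2021-05-10

We address the problem of causal effect estimation in the presence of unobserved confounding, where proxies for the latent confounders are observed. We propose two kernel-based methods for nonlinear causal effect estimation in this setting: (a) a two-stage regression approach, and (b) a maximum moment restriction approach. We focus on the proximal causal learning setting, but our methods can be used to solve a wider class of inverse problems characterised by a Fredholm integral equation. In particular, we provide a unifying view of two-stage and moment restriction approaches for solving this problem in a nonlinear setting. We provide consistency guarantees for each algorithm, and we demonstrate that these approaches achieve competitive results on synthetic data and on data simulating a real-world task. In particular, our approach outperforms earlier methods that are not suited to leveraging proxy variables.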

Debiased Inference on Identified Linear Functionals of Underidentified Nuisances via Penalized Minimax Estimation

Nathan Kallus , Xiaojie Mao

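Categories: Machine Learning (Statistics)

2022-08-17

We study generic inference on identified linear functionals of nonunique nuisances defined as solutions to underidentified conditional moment restrictions. This problem appears in a variety of applications, including nonparametric instrumental variable models, proximal causal inference under unmeasured confounding, and missing-not-at-random data with shadow variables. Although the linear functionals of interest, such as the average treatment effect, are identifiable under suitable conditions, the nonuniqueness of nuisances poses serious challenges to statistical inference, since in this setting common nuisance estimators can be unstable and lack fixed limits. In this paper, we propose penalized minimax estimators for the nuisance functions and show that they enable valid inference in this challenging setting. The proposed nuisance estimators can accommodate flexible function classes and, importantly, they can converge to fixed limits determined by the penalization, regardless of whether the nuisances are unique. We use the penalized nuisance estimators to form a debiased estimator for the linear functional of interest and prove its asymptotic normality under generic high-level conditions, which provides asymptotically valid confidence intervals.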

Policy evaluation from a single path: Multi-step methods, mixing and mis-specification

Yaqi Duan , Martin J. Wainwright

Categories: Machine Learning (Statistics) | Machine Learning

2022-11-07

We study non-parametric estimation of the value function of an infinite-horizon $\gamma$-discounted Markov reward process (MRP) using observations from a single trajectory. We provide non-asymptotic guarantees for a general family of kernel-based multi-step temporal difference (TD) estimates, including canonical $K$-step look-ahead TD for $K = 1, 2, \ldots$ and the TD$(\lambda)$ family for $\lambda \in [0,1)$ as special cases. Our bounds capture its dependence on Bellman fluctuations, mixing time of the Markov chain, any mis-specification in the model, as well as the choice of weight function defining the estimator itself, and reveal some delicate interactions between mixing time and model mis-specification. For a given TD method applied to a well-specified model, its statistical error under trajectory data is similar to that of i.i.d. sample transition pairs, whereas under mis-specification, temporal dependence in data inflates the statistical error. However, any such deterioration can be mitigated by increased look-ahead. We complement our upper bounds by proving minimax lower bounds that establish optimality of TD-based methods with appropriately chosen look-ahead and weighting, and reveal some fundamental differences between value function estimation and ordinary non-parametric regression.
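
The effect of look-ahead mentioned in this abstract can be illustrated at the population level. The sketch below applies the $K$-step Bellman operator of a tiny Markov reward process, $T^K V = \sum_{j<K} \gamma^j P^j r + \gamma^K P^K V$, and compares how far one application gets from a zero initial guess for $K = 1$ and $K = 3$. This is not the kernel-based estimator analyzed in the paper: the two-state chain, the rewards, the discount factor, and all function names are assumptions made for illustration.

```python
def mat_vec(matrix, vec):
    """Multiply a row-major matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def k_step_bellman(values, transition, rewards, gamma, k):
    """Apply the K-step Bellman operator by composing the one-step operator k times."""
    result = list(values)
    for _ in range(k):
        # One-step operator: (T V)(s) = r(s) + gamma * sum_s' P(s, s') V(s').
        expected_next = mat_vec(transition, result)
        result = [r + gamma * x for r, x in zip(rewards, expected_next)]
    return result


# Hypothetical two-state Markov reward process.
P = [[0.9, 0.1], [0.5, 0.5]]
r = [1.0, 0.0]
gamma = 0.9

# Approximate the true value function by iterating the one-step operator.
v_star = [0.0, 0.0]
for _ in range(2000):
    v_star = k_step_bellman(v_star, P, r, gamma, 1)


def sup_error(values):
    """Sup-norm distance to the (numerically converged) value function."""
    return max(abs(a - b) for a, b in zip(values, v_star))


start = [0.0, 0.0]
err_k1 = sup_error(k_step_bellman(start, P, r, gamma, 1))
err_k3 = sup_error(k_step_bellman(start, P, r, gamma, 3))
print("error after one 1-step update:", err_k1)
print("error after one 3-step update:", err_k3)
```

Because $T^K$ is a $\gamma^K$-contraction in the sup norm, a larger $K$ shrinks the error faster per application. The paper's contribution concerns the statistical side of this trade-off (sampling from a single trajectory, mixing, and mis-specification), which this deterministic sketch does not model.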

Reproducing kernel Hilbert C*-module and kernel mean embeddings

Yuka Hashimoto , Isao Ishikawa , Masahiro Ikeda , Fuyuta Komura , Takeshi Katsura , Yoshinobu Kawahara

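Categories: Machine Learning (Statistics) | Machine Learning

2021-01-27

Kernel methods are among the most popular techniques in machine learning, where learning tasks are solved using the properties of reproducing kernel Hilbert spaces (RKHS). In this paper, we propose a novel data analysis framework based on the reproducing kernel Hilbert $C^*$-module (RKHM) and the kernel mean embedding (KME) in RKHM. Since RKHM contains richer information than RKHS or vector-valued RKHS (vvRKHS), analysis with RKHM enables us to capture and extract structural properties of data such as functional data. We develop a branch of the theory of RKHM for application to data analysis, including a representer theorem and the injectivity and universality of the proposed KME. We also show that RKHM generalizes RKHS and vvRKHS. We then provide concrete procedures for applying RKHM and the proposed KME to data analysis.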

Ensemble forecasts in reproducing kernel Hilbert space family: dynamical systems in Wonderland

Bérenger Hug , Etienne Memin , Gilles Tissot

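Categories: Machine Learning

2022-07-29

A methodological framework for ensemble-based estimation and simulation of high-dimensional dynamical systems, such as oceanic or atmospheric flows, is proposed. To that end, the dynamical system is embedded in a family of reproducing kernel Hilbert spaces with kernel functions driven by the dynamics. This family is nicknamed Wonderland for its appealing properties. In Wonderland, the Koopman and Perron-Frobenius operators are unitary and uniformly continuous. This property guarantees that they can be expressed as exponential series of diagonalizable bounded infinitesimal generators. Lyapunov exponents and exact ensemble-based expressions of the tangent linear dynamics are directly available as well. Wonderland enables us to devise strikingly simple ensemble data assimilation methods for trajectory reconstruction in terms of constant-in-time linear combinations of trajectory samples. Such an embarrassingly simple strategy is made possible by a fully justified superposition principle that follows from several fundamental theorems.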

Coefficient-based Regularized Distribution Regression

Yuan Mao , Lei Shi , Zheng-Chu Guo

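Categories: Machine Learning (Statistics) | Machine Learning

2022-08-26

In this paper, we consider coefficient-based regularized distribution regression, which aims to regress from probability measures to real-valued responses over a reproducing kernel Hilbert space (RKHS), where the regularization is placed on the coefficients and the kernels are assumed to be indefinite. The algorithm involves two stages of sampling: the first-stage sample consists of distributions, and the second-stage sample is drawn from these distributions. The asymptotic behavior of the algorithm in different regularity ranges of the regression function is comprehensively studied, and learning rates are derived via integral operator techniques. We obtain optimal rates under some mild conditions, matching the minimax optimal rate for one-stage sampling. Compared with kernel methods for distribution regression in the literature, the algorithm under consideration does not require the kernel to be symmetric or positive semi-definite, and hence provides a simple paradigm for designing indefinite kernel methods, which enriches the theme of distribution regression. To the best of our knowledge, this is the first result for distribution regression with indefinite kernels, and our algorithm can improve the saturation effect.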

Optimal variance-reduced stochastic approximation in Banach spaces

Wenlong Mou , Koulik Khamaru , Martin J. Wainwright , Peter L. Bartlett , Michael I. Jordan

Categories: Machine Learning | Machine Learning (Statistics)

2022-01-21

We study the problem of estimating the fixed point of a contractive operator defined on a separable Banach space. Focusing on a stochastic query model that provides noisy evaluations of the operator, we analyze a variance-reduced stochastic approximation scheme, and establish non-asymptotic bounds for both the operator defect and the estimation error, measured in an arbitrary semi-norm. In contrast to worst-case guarantees, our bounds are instance-dependent, and achieve the local asymptotic minimax risk non-asymptotically. For linear operators, contractivity can be relaxed to multi-step contractivity, so that the theory can be applied to problems like average reward policy evaluation problem in reinforcement learning. We illustrate the theory via applications to stochastic shortest path problems, two-player zero-sum Markov games, as well as policy evaluation and $Q$-learning for tabular Markov decision processes.

Robust Generalised Bayesian Inference for Intractable Likelihoods

Takuo Matsubara , Jeremias Knoblauch , François-Xavier Briol , Chris. J. Oates

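Categories: Machine Learning (Statistics)

2021-04-15

Generalised Bayesian inference updates prior beliefs using a loss function, rather than a likelihood, and can therefore be used to confer robustness against possible mis-specification of the likelihood. Here we consider generalised Bayesian inference with a Stein discrepancy as a loss function, motivated by applications in which the likelihood contains an intractable normalisation constant. In this context, the Stein discrepancy circumvents evaluation of the normalisation constant and produces generalised posteriors that are either closed form or accessible using standard Markov chain Monte Carlo. On a theoretical level, we show consistency, asymptotic normality, and bias-robustness of the generalised posterior, highlighting how these properties are impacted by the choice of Stein discrepancy. Then, we provide numerical experiments on a range of intractable distributions, including applications to kernel-based exponential family models and non-Gaussian graphical models.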
