Smart Paper Notes

Linking Across Data Granularity: Fitting Multivariate Hawkes Processes to Partially Interval-Censored Data

Pio Calderon, Alexander Soen, Marian-Andrei Rizoiu

Category: Machine Learning

2021-11-03

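This work introduces a novel multivariate temporal point process, the Partial Mean Behavior Poisson (PMBP) process, which can be leveraged to fit the multivariate Hawkes process to partially interval-censored data: data consisting of a mix of event timestamps on a subset of dimensions and interval-censored event counts on the complementary dimensions. First, we define the PMBP process via its conditional intensity and derive the regularity conditions for subcriticality. We show that both the Hawkes process and the MBP process (Rizoiu et al.) are special cases of the PMBP process. Second, we provide numerical schemes that enable calculating the conditional intensity and sampling event histories of the PMBP process. Third, we demonstrate the applicability of the PMBP process using synthetic and real-world datasets: we test the capability of the PMBP process to recover multivariate Hawkes parameters given sample event histories of the Hawkes process. Next, we evaluate the PMBP process on the YouTube popularity prediction task and show that it outperforms the current state-of-the-art Hawkes Intensity process (Rizoiu et al. (2017b)). Lastly, on a curated dataset of COVID19 daily case counts and COVID19-related news articles for a sample of countries, we show that clustering on the PMBP-fitted parameters enables a categorization of countries with respect to the country-level interaction of cases and news reporting.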
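
As a point of reference for the conditional intensity mentioned in this abstract, the sketch below evaluates the intensity of an ordinary multivariate Hawkes process, which the abstract names as a special case of PMBP. It is not the PMBP intensity itself. The exponential kernel, the function names, and the parameter layout (`mu`, `alpha`, `beta`) are all assumptions made for illustration and are not taken from the paper.

```python
import math


def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity of a multivariate Hawkes process at time t.

    Assumed (hypothetical) parameterisation, not taken from the paper:
      events[j]   -- list of event times observed in dimension j
      mu[i]       -- baseline rate of dimension i
      alpha[i][j] -- expected number of dimension-i events triggered
                     by a single dimension-j event
      beta        -- decay rate of the exponential kernel
    """
    d = len(mu)
    lam = []
    for i in range(d):
        rate = mu[i]
        for j in range(d):
            for s in events[j]:
                if s < t:  # only strictly earlier events contribute
                    rate += alpha[i][j] * beta * math.exp(-beta * (t - s))
        lam.append(rate)
    return lam


def row_sums_below_one(alpha):
    """Sufficient (not necessary) subcriticality check for a Hawkes process.

    For a nonnegative matrix the largest row sum bounds the spectral
    radius from above, so a value below 1 implies subcriticality.
    """
    return max(sum(row) for row in alpha) < 1.0
```

For example, `hawkes_intensity(2.0, [[1.0], []], [0.5, 0.2], [[0.3, 0.1], [0.2, 0.4]], 1.0)` returns the two baseline rates, each raised by the decayed contribution of the single event at time 1.0.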

translated by Google Translate

Mutually exciting point process graphs for modelling dynamic networks

Francesco Sanna Passino, Nicholas A. Heard

Categories: Machine Learning | Machine Learning (Statistics)

2021-02-11

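A new class of models for dynamic networks is proposed, called mutually exciting point process graphs (MEG). MEG is a scalable network-wide statistical model for point processes with dyadic marks, which can be used for anomaly detection when assessing the significance of future events, including previously unobserved connections. The model combines mutually exciting point processes, to estimate dependencies between events, with latent space models, to infer relationships between the nodes. The intensity function for each network edge is characterised exclusively by node-specific parameters, which allows information to be shared across the network. This construction enables estimation of intensities even for unobserved edges, which is particularly important in real-world applications such as computer networks arising in cyber-security. A recursive form of the log-likelihood is obtained and used to derive fast inferential procedures via modern gradient ascent algorithms. An EM algorithm is also derived. The model is tested on simulated graphs and real-world datasets, demonstrating excellent performance.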


An Introduction to Modern Statistical Learning

Joseph G. Makin

Category: Machine Learning

2022-07-20

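This work in progress aims to provide a unified introduction to statistical learning, building up slowly from classical models like the GMM and HMM to modern neural networks like the VAE and diffusion models. There are today many internet resources that explain this or that new machine-learning algorithm in isolation, but they do not (and cannot, in so brief a space) connect these algorithms with each other or with the classical literature on statistical models, out of which the modern algorithms emerged. Also conspicuously lacking is a single notational system which, although unfazing to those already familiar with the material (like the authors of these posts), raises a significant barrier to the novice's entry. Likewise, I have aimed to assimilate the various models, wherever possible, to a single framework for inference and learning, showing how (and why) to change one model into another with minimal alteration (some of them novel, others from the literature). Some background is of course necessary. I have assumed the reader is familiar with basic multivariable calculus, probability and statistics, and linear algebra. The goal of this book is certainly not completeness, but rather to draw a more or less straight-line path from the basics to the extremely powerful new models of the last decade. The goal then is to complement, not replace, such comprehensive texts as Bishop's Pattern Recognition and Machine Learning, which is now 15 years old.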


Consistent and fast inference in compartmental models of epidemics using Poisson Approximate Likelihoods

Michael Whitehouse, Nick Whiteley, Lorenzo Rimella

Category: Machine Learning

2022-05-26

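Addressing the challenge of scaling up epidemiological inference to complex and heterogeneous models, we introduce Poisson Approximate Likelihood (PAL) methods. PALs are derived from approximate filtering equations for finite-population, stochastic compartmental models, and the large population limit drives the consistency of maximum PAL estimators. Our theoretical results appear to be the first likelihood-based parameter estimation consistency results that apply to a broad class of partially observed stochastic compartmental models and address the large population limit. Compared to simulation-based methods such as Approximate Bayesian Computation and Sequential Monte Carlo, PALs are simple to implement, involving only elementary arithmetic operations and no tuning parameters, and fast to evaluate, requiring no simulation from the model and having computational cost independent of population size. Through examples, we demonstrate how PALs can be: embedded within Delayed Acceptance Particle Markov Chain Monte Carlo to facilitate Bayesian inference; used to fit an age-structured model of influenza, taking advantage of automatic differentiation in Stan; and applied to calibrate a spatial meta-population model of measles.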


Adjusted chi-square test for degree-corrected block models

Linfan Zhang, Arash A. Amini

Category: Machine Learning (Statistics)

2020-12-30

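We propose a goodness-of-fit test for degree-corrected stochastic block models (DCSBM). The test is based on an adjusted chi-square statistic for measuring equality of means among groups of $n$ multinomial distributions with $d_1, \dots, d_n$ observations. In the context of network models, the number of multinomials, $n$, grows much faster than the number of observations, $d_i$, corresponding to the degree of node $i$, hence the setting deviates from classical asymptotics. We show that the statistic converges in distribution under the null as long as the harmonic mean of $\{d_i\}$ grows to infinity. When applied sequentially, the test can also be used to determine the number of communities. The test operates on a compressed version of the adjacency matrix, conditional on the degrees, and as a result is highly scalable to large sparse networks. We incorporate a novel idea of compressing the rows based on a $(K+1)$-community assignment when testing for $K$ communities. This approach increases the power in sequential applications without sacrificing computational efficiency, and we prove its consistency in recovering the number of communities. Since the test statistic does not rely on a specific alternative, its utility goes beyond sequential testing and can be used to simultaneously test against a wide range of alternatives outside the DCSBM family. In particular, we prove that the test is consistent against a general family of latent-variable network models with community structure.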


Maximum Likelihood from Incomplete Data Via the EM Algorithm

Category:



Opinion Market Model: Stemming Far-Right Opinion Spread using Positive Interventions

Pio Calderon, Rohit Ram, Marian-Andrei Rizoiu

Category: Machine Learning

2022-08-13

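Recent years have seen the rise of extremist views in the opinion ecosystem we call social media. Allowing online extremism to persist has dire societal consequences, and efforts to mitigate it are continuously explored. Positive interventions, controlled signals that add attention to the opinion ecosystem with the aim of boosting certain opinions, are one such pathway for mitigation. This work proposes a platform to test the effectiveness of positive interventions through the Opinion Market Model (OMM), a two-tier model of the online opinion ecosystem that jointly accounts for both inter-opinion interactions and the role of positive interventions. The first tier models the size of the opinion attention market using a multivariate discrete-time Hawkes process; the second tier leverages the market share attraction model to model opinions cooperating and competing for market share given limited attention. On a synthetic dataset, we show the convergence of our proposed estimation scheme. On a dataset of Facebook and Twitter discussions containing moderate and far-right opinions about bushfires and climate change, we show superior predictive performance over the state of the art and the ability to uncover latent opinion interactions. Lastly, we use OMM to demonstrate the effectiveness of mainstream media coverage as a positive intervention in suppressing far-right opinions.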


Strong identifiability and parameter learning in regression with heterogeneous response

Dat Do, Linh Do, XuanLong Nguyen

Category: Machine Learning (Statistics)

2022-12-08

Mixtures of regression are a powerful class of models for regression learning with respect to a highly uncertain and heterogeneous response variable of interest. In addition to being a rich predictive model for the response given some covariates, the parameters in this model class provide useful information about the heterogeneity in the data population, which is represented by the conditional distributions for the response given the covariates associated with a number of distinct but latent subpopulations. In this paper, we investigate conditions of strong identifiability, rates of convergence for conditional density and parameter estimation, and the Bayesian posterior contraction behavior arising in finite mixture of regression models, under exact-fitted and over-fitted settings and when the number of components is unknown. This theory is applicable to common choices of link functions and families of conditional distributions employed by practitioners. We provide simulation studies and data illustrations, which shed some light on the parameter learning behavior found in several popular regression mixture models reported in the literature.


Spatiotemporal Clustering with Neyman-Scott Processes via Connections to Bayesian Nonparametric Mixture Models

Yixin Wang, Anthony Degleris, Alex H. Williams, Scott W. Linderman

Categories: Machine Learning (Statistics) | Machine Learning

2022-01-13

Neyman-Scott processes (NSPs) are point process models that generate clusters of points in time or space. They are natural models for a wide range of phenomena, ranging from neural spike trains to document streams. The clustering property is achieved via a doubly stochastic formulation: first, a set of latent events is drawn from a Poisson process; then, each latent event generates a set of observed data points according to another Poisson process. This construction is similar to Bayesian nonparametric mixture models like the Dirichlet process mixture model (DPMM) in that the number of latent events (i.e. clusters) is a random variable, but the point process formulation makes the NSP especially well suited to modeling spatiotemporal data. While many specialized algorithms have been developed for DPMMs, comparatively fewer works have focused on inference in NSPs. Here, we present novel connections between NSPs and DPMMs, with the key link being a third class of Bayesian mixture models called mixture of finite mixture models (MFMMs). Leveraging this connection, we adapt the standard collapsed Gibbs sampling algorithm for DPMMs to enable scalable Bayesian inference on NSP models. We demonstrate the potential of Neyman-Scott processes on a variety of applications including sequence detection in neural spike trains and event detection in document streams.
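
The two-stage generative construction described in this abstract can be sketched in a few lines. The example below simulates a one-dimensional Neyman-Scott process on an interval `[0, T]`. The function names, the homogeneous latent rate, and the Gaussian offspring kernel are illustrative assumptions, not details taken from the paper.

```python
import random


def sample_poisson(rate, rng):
    """Draw a Poisson(rate) count by counting unit-rate exponential
    arrivals that fall inside the interval [0, rate)."""
    count = 0
    total = rng.expovariate(1.0)
    while total < rate:
        count += 1
        total += rng.expovariate(1.0)
    return count


def sample_neyman_scott(T, latent_rate, mean_children, spread, seed=0):
    """Simulate a 1-D Neyman-Scott process (hypothetical parameterisation).

    Stage 1: latent events come from a homogeneous Poisson process with
             rate `latent_rate` on [0, T].
    Stage 2: each latent event emits a Poisson(`mean_children`) number of
             observed points, scattered around it with Gaussian offsets
             of standard deviation `spread`.
    Returns (latents, points) where points is a sorted list of
    (time, latent_index) pairs; observed times may fall outside [0, T].
    """
    rng = random.Random(seed)
    n_latent = sample_poisson(latent_rate * T, rng)
    latents = [rng.uniform(0.0, T) for _ in range(n_latent)]
    points = []
    for k, centre in enumerate(latents):
        for _ in range(sample_poisson(mean_children, rng)):
            points.append((rng.gauss(centre, spread), k))
    points.sort()
    return latents, points
```

The latent index attached to each point plays the role of a cluster label. Inference, which is the subject of the paper, amounts to recovering those labels and the latent events from the observed times alone.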


A Framework for Machine Learning of Model Error in Dynamical Systems

Matthew E. Levine, Andrew M. Stuart

Categories: Machine Learning | Machine Learning (Statistics)

2021-07-14

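The development of data-informed predictive models for dynamical systems is of widespread interest in many disciplines. We present a unifying framework for blending mechanistic and machine-learning approaches to identify dynamical systems from noisily and partially observed data. We compare pure data-driven learning with hybrid models that incorporate imperfect domain knowledge. Our formulation is agnostic to the chosen machine learning model, is presented in both continuous- and discrete-time settings, and is compatible both with model errors that exhibit substantial memory and with errors that are memoryless. First, we study memoryless linear (w.r.t. parametric dependence) model error from a learning theory perspective, defining excess risk and generalization error. For ergodic continuous-time systems, we prove that both excess risk and generalization error are bounded above by terms that diminish with the square root of T, the time interval over which training data is specified. Second, we study scenarios that benefit from modeling with memory, proving universal approximation theorems for two classes of continuous-time recurrent neural networks (RNNs): both can learn memory-dependent model error. In addition, we connect one class of RNNs to reservoir computing, thereby relating the learning of memory-dependent error to recent work on supervised learning between Banach spaces using random features. Numerical results are presented (Lorenz '63, Lorenz '96 multiscale systems) to compare purely data-driven and hybrid approaches, finding hybrid methods less data-hungry and more parametrically efficient. Finally, we demonstrate numerically how data assimilation can be leveraged to learn hidden dynamics from noisy, partially observed data, and illustrate challenges in representing memory by this approach and in training such models.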


Estimating means of bounded random variables by betting

Ian Waudby-Smith, Aaditya Ramdas

Category: Machine Learning (Statistics)

2020-10-19

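This paper derives confidence intervals (CI) and time-uniform confidence sequences (CS) for the classical problem of estimating an unknown mean from bounded observations. We present a general approach for deriving concentration bounds that can be seen as a generalization (and improvement) of the celebrated Chernoff method. At its heart, it is based on deriving a new class of composite nonnegative martingales, with strong connections to testing by betting and the method of mixtures. We show how to extend these ideas to sampling without replacement, another heavily studied problem. In all cases, our bounds are adaptive to the unknown variance, and empirically vastly outperform existing approaches based on Hoeffding or empirical Bernstein inequalities and their recent supermartingale generalizations. In short, we establish a new state of the art for four fundamental problems: CSs and CIs for bounded means, when sampling with and without replacement.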


Rigorous data-driven computation of spectral properties of Koopman operators for dynamical systems

Matthew J. Colbrook, Alex Townsend

Category: Machine Learning

2021-11-29

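Koopman operators are infinite-dimensional operators that globally linearize nonlinear dynamical systems, making their spectral information useful for understanding dynamics. However, Koopman operators can have continuous spectra and infinite-dimensional invariant subspaces, making the computation of their spectral information a considerable challenge. This paper describes data-driven algorithms with rigorous convergence guarantees for computing spectral information of Koopman operators from trajectory data. We introduce residual dynamic mode decomposition (ResDMD), which provides the first scheme for computing the spectra and pseudospectra of general Koopman operators without spectral pollution. Using the resolvent operator and ResDMD, we also compute smoothed approximations of spectral measures associated with measure-preserving dynamical systems. We prove explicit convergence theorems for our algorithms, which can achieve high-order convergence even for chaotic systems when computing the density of the continuous spectrum and the discrete spectrum. We demonstrate our algorithms on the tent map, Gauss iterated map, nonlinear pendulum, double pendulum, Lorenz system, and an 11-dimensional extended Lorenz system. Finally, we provide kernelized variants of our algorithms for dynamical systems with a high-dimensional state space. This allows us to compute the spectral measure associated with the dynamics of a protein molecule that has a 20,046-dimensional state space, and to compute nonlinear Koopman modes with error bounds for turbulent flow past aerofoils with Reynolds number $>10^5$ that has a 295,122-dimensional state space.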


State-space deep Gaussian processes with applications

Zheng Zhao

Category: Machine Learning (Statistics)

2021-11-24

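This thesis is mainly concerned with state-space approaches for solving deep (temporal) Gaussian process (DGP) regression problems. More specifically, we represent DGPs as hierarchically composed systems of stochastic differential equations (SDEs), and we consequently solve the DGP regression problem by using state-space filtering and smoothing methods. The resulting state-space DGP (SS-DGP) models generate a rich class of priors compatible with modelling a number of irregular signals/functions. Moreover, due to their Markovian structure, SS-DGP regression problems can be solved efficiently by using Bayesian filtering and smoothing methods. The second contribution of this thesis is that we solve continuous-discrete Gaussian filtering and smoothing problems by using the Taylor moment expansion (TME) method. This induces a class of filters and smoothers that can be asymptotically exact in predicting the mean and covariance of the solutions of SDEs. Moreover, the TME method and the TME filters and smoothers are compatible with simulating SS-DGPs and solving their regression problems. Lastly, this thesis features a number of applications of state-space (deep) GPs. These applications mainly include (i) estimation of unknown drift functions of SDEs from partially observed trajectories and (ii) estimation of spectro-temporal features of signals.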


Iterated Block Particle Filter for High-dimensional Parameter Learning: Beating the Curse of Dimensionality

Ning Ning, Edward L. Ionides

Categories: Machine Learning (Statistics) | Machine Learning

2021-10-20

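Parameter learning for high-dimensional, partially observed, and nonlinear stochastic processes is a methodological challenge. Spatiotemporal disease transmission systems provide examples of such processes, giving rise to open inference problems. We propose the iterated block particle filter (IBPF) algorithm for learning high-dimensional parameters over graphical state space models with general state spaces, measures, transition densities, and graph structure. Theoretical performance guarantees are obtained on beating the curse of dimensionality (COD), algorithm convergence, and likelihood maximization. Experiments on a highly nonlinear and non-Gaussian spatiotemporal model for measles transmission reveal that the iterated ensemble Kalman filter algorithm (Li et al. (2020)) is ineffective and the iterated filtering algorithm (Ionides et al. (2015)) suffers from the COD, while our IBPF algorithm beats the COD consistently across various experiments with different metrics.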


Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions

Nathan Halko, Per-Gunnar Martinsson, Joel A. Tropp

Category:

2009-09-22

Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed, either explicitly or implicitly, to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, speed, and robustness. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an m × n matrix. (i) For a dense input matrix, randomized algorithms require O(mn log(k)) floating-point operations (flops) in contrast with O(mnk) for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multi-processor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to O(k) passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
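
The two-stage recipe in this abstract (random sampling to find a subspace, then a deterministic factorization of the compressed matrix) can be sketched with NumPy as follows. This is a minimal illustration, not the paper's reference implementation. The function name, the Gaussian test matrix, and the `oversample` default are assumptions. A dense Gaussian test matrix costs O(mnk) flops, so this sketch does not achieve the O(mn log(k)) count quoted above, which relies on a structured random matrix.

```python
import numpy as np


def randomized_svd(A, k, oversample=10, seed=0):
    """Approximate rank-k SVD of A via random range finding.

    Hypothetical helper for illustration; `oversample` extra random
    directions are drawn to make the captured subspace more reliable.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    ell = min(n, k + oversample)

    # Stage A: sample the range of A with a random test matrix and
    # orthonormalise the samples to get a basis Q for the subspace.
    Omega = rng.standard_normal((n, ell))
    Q, _ = np.linalg.qr(A @ Omega)

    # Stage B: compress A to the subspace and factor the small matrix
    # deterministically.
    B = Q.T @ A
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :k], s[:k], Vt[:k, :]
```

A quick check is to build a matrix of known low rank, for instance the product of a 100 × 5 and a 5 × 80 Gaussian matrix, and compare `(U * s) @ Vt` against it. For matrices whose singular values decay slowly, a few power iterations before the QR step are commonly added, which this sketch omits.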


Unifying Epidemic Models with Mixtures

Arnab Sarker, Ali Jadbabaie, Devavrat Shah

Categories: Machine Learning (Statistics) | Machine Learning

2022-01-07

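The COVID-19 pandemic has emphasized the need for a robust understanding of epidemic models. Current models of epidemics are classified as either mechanistic or non-mechanistic: mechanistic models make explicit assumptions on the dynamics of disease, whereas non-mechanistic models make assumptions on the form of observed time series. Here, we introduce a simple mixture-based model which bridges the two approaches while retaining benefits of both. The model represents time series of cases and fatalities as a mixture of Gaussian curves, providing a flexible function class to learn from data compared to traditional mechanistic models. Although the model is non-mechanistic, we show that it arises as the natural outcome of a stochastic process based on a networked SIR framework. This allows learned parameters to take on a more meaningful interpretation compared to similar non-mechanistic models, and we validate the interpretations using auxiliary mobility data collected during the COVID-19 pandemic. We provide a simple learning algorithm to identify model parameters and establish theoretical results which show the model can be efficiently learned from data. Empirically, we find the model to have low prediction error. The model is available at covidpredictions.mit.edu. Ultimately, this allows us to systematically understand the impacts of interventions on COVID-19, which is critical in developing data-driven solutions for controlling epidemics.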
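
To make the "mixture of Gaussian curves" representation concrete, the sketch below evaluates such a curve at a given day. The `(weight, centre, width)` parameterisation and the function name are assumptions chosen for illustration. The paper's own parameterisation and fitting algorithm are not reproduced here.

```python
import math


def gaussian_mixture_curve(t, components):
    """Expected daily count at time t under a mixture of Gaussian curves.

    components -- list of (weight, centre, width) tuples (hypothetical
                  layout): `weight` is the total count attributed to one
                  wave, `centre` its peak day, `width` its standard
                  deviation in days.
    """
    total = 0.0
    for weight, centre, width in components:
        z = (t - centre) / width
        total += weight * math.exp(-0.5 * z * z) / (width * math.sqrt(2.0 * math.pi))
    return total
```

With two waves such as `[(1000.0, 50.0, 7.0), (400.0, 120.0, 10.0)]`, summing the curve over days 0 to 200 recovers approximately the combined weight of 1400, because each component integrates to its own weight.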


The Voronoigram: Minimax Estimation of Bounded Variation Functions From Scattered Data

Addison J. Hu, Alden Green, Ryan J. Tibshirani

Categories: Machine Learning (Statistics) | Machine Learning

2022-12-30

We consider the problem of estimating a multivariate function $f_0$ of bounded variation (BV), from noisy observations $y_i = f_0(x_i) + z_i$ made at random design points $x_i \in \mathbb{R}^d$, $i=1,\ldots,n$. We study an estimator that forms the Voronoi diagram of the design points, and then solves an optimization problem that regularizes according to a certain discrete notion of total variation (TV): the sum of weighted absolute differences of parameters $\theta_i,\theta_j$ (which estimate the function values $f_0(x_i),f_0(x_j)$) at all neighboring cells $i,j$ in the Voronoi diagram. This is seen to be equivalent to a variational optimization problem that regularizes according to the usual continuum (measure-theoretic) notion of TV, once we restrict the domain to functions that are piecewise constant over the Voronoi diagram. The regression estimator under consideration hence performs (shrunken) local averaging over adaptively formed unions of Voronoi cells, and we refer to it as the Voronoigram, following the ideas in Koenker (2005), and drawing inspiration from Tukey's regressogram (Tukey, 1961). Our contributions in this paper span both the conceptual and theoretical frontiers: we discuss some of the unique properties of the Voronoigram in comparison to TV-regularized estimators that use other graph-based discretizations; we derive the asymptotic limit of the Voronoi TV functional; and we prove that the Voronoigram is minimax rate optimal (up to log factors) for estimating BV functions that are essentially bounded.


Cardinality-Regularized Hawkes-Granger Model

Tsuyoshi Idé, Georgios Kollias, Dzung T. Phan, Naoki Abe

Categories: Machine Learning | Artificial Intelligence

2022-08-23

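We propose a new sparse Granger-causal learning framework for temporal event data. We focus on a specific class of point processes called the Hawkes process. We begin by pointing out that most of the existing sparse causal learning algorithms for the Hawkes process suffer from a singularity in maximum likelihood estimation. As a result, their sparse solutions can appear only as numerical artifacts. In this paper, we propose a mathematically well-defined sparse causal learning framework based on a cardinality-regularized Hawkes process, which remedies the pathological issues of existing approaches. We leverage the proposed algorithm for the task of instance-wise causal event analysis, where sparsity plays a critical role. We validate the proposed framework with two real use cases, one from the power grid and the other from the cloud data center management domain.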


Generalised Bayesian Inference for Discrete Intractable Likelihood

Takuo Matsubara, Jeremias Knoblauch, François-Xavier Briol, Chris J. Oates

Category: Machine Learning (Statistics)

2022-06-16

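Discrete state spaces represent a major computational challenge to statistical inference, since the computation of normalisation constants requires summation over large or possibly infinite sets, which can be impractical. This paper addresses this computational challenge through the development of a novel generalised Bayesian inference procedure suitable for discrete intractable likelihood. Inspired by recent methodological advances for continuous data, the main idea is to update beliefs about model parameters using a discrete Fisher divergence, in lieu of the problematic intractable likelihood. The result is a generalised posterior that can be sampled using standard computational tools, such as Markov chain Monte Carlo, circumventing the intractable normalising constant. The statistical properties of the generalised posterior are analysed, with sufficient conditions for posterior consistency and asymptotic normality established. In addition, a novel and general approach to calibration of generalised posteriors is proposed. Applications are presented on lattice models for discrete spatial data and on multivariate models for count data, where in each case the methodology facilitates generalised Bayesian inference at low computational cost.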


An Empirical Study: Extensive Deep Temporal Point Process

Haitao Lin, Cheng Tan, Lirong Wu, Zhangyang Gao, Stan Z. Li

Category: Machine Learning

2021-10-19

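The temporal point process, a stochastic process on a continuous time domain, is commonly used to model asynchronous event sequences characterised by occurrence timestamps. Thanks to their strong expressivity, deep neural networks are emerging as a promising choice for capturing the patterns in asynchronous sequences in the context of temporal point processes. In this paper, we first review recent research emphases and difficulties in modeling asynchronous event sequences with deep temporal point processes, which can be grouped into four fields: encoding of the history sequence, formulation of the conditional intensity function, relational discovery of events, and learning approaches for optimization. We introduce recently proposed models by dismantling them into these four parts, and conduct experiments by re-modularizing the first three parts under the same learning strategy for a fair empirical evaluation. In addition, we extend the families of history encoders and conditional intensity functions, and propose a Granger causality discovery framework for exploiting the relations among multiple types of events. Because Granger causality can be represented by a Granger causality graph, discrete graph structure learning within a hierarchical inference framework is employed to reveal the latent structure of the graph. Further experiments show that the proposed framework with latent graph discovery can capture the relations and achieve improved fitting and predictive performance.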
