Smart Paper Notes

Deep Neyman-Scott Processes

Chengkuan Hong , Christian R. Shelton

Categories: Machine Learning (Statistics) | Machine Learning

2021-11-06

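A Neyman-Scott process is a special case of a Cox process in which both the latent and the observable stochastic processes are Poisson processes. In this paper we consider a deep Neyman-Scott process, in which every building block of the network is a Poisson process. We develop an efficient posterior sampling scheme based on Markov chain Monte Carlo and use it for likelihood-based inference. Our method opens up room for inference in sophisticated hierarchical point processes. Our experiments show that adding more hidden Poisson processes improves both likelihood fitting and event-type prediction. We also compare our method with state-of-the-art models on temporal real-world datasets and demonstrate competitive data fitting and prediction with fewer parameters.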


Spatiotemporal Clustering with Neyman-Scott Processes via Connections to Bayesian Nonparametric Mixture Models

Yixin Wang , Anthony Degleris , Alex H. Williams , Scott W. Linderman

Categories: Machine Learning (Statistics) | Machine Learning

2022-01-13

Neyman-Scott processes (NSPs) are point process models that generate clusters of points in time or space. They are natural models for a wide range of phenomena, ranging from neural spike trains to document streams. The clustering property is achieved via a doubly stochastic formulation: first, a set of latent events is drawn from a Poisson process; then, each latent event generates a set of observed data points according to another Poisson process. This construction is similar to Bayesian nonparametric mixture models like the Dirichlet process mixture model (DPMM) in that the number of latent events (i.e. clusters) is a random variable, but the point process formulation makes the NSP especially well suited to modeling spatiotemporal data. While many specialized algorithms have been developed for DPMMs, comparatively fewer works have focused on inference in NSPs. Here, we present novel connections between NSPs and DPMMs, with the key link being a third class of Bayesian mixture models called mixture of finite mixture models (MFMMs). Leveraging this connection, we adapt the standard collapsed Gibbs sampling algorithm for DPMMs to enable scalable Bayesian inference on NSP models. We demonstrate the potential of Neyman-Scott processes on a variety of applications including sequence detection in neural spike trains and event detection in document streams.
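The two-stage construction described in this abstract can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the authors' model or inference code: it works in one dimension on an interval `[0, t_max]`, uses a homogeneous latent process, and spreads each cluster with a Gaussian kernel. The names `poisson_count`, `sample_neyman_scott`, `latent_rate`, `mean_children`, and `spread` are hypothetical.

```python
import random


def poisson_count(mean, rng):
    """Sample a Poisson(mean) count by counting unit-rate exponential arrivals in [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def sample_neyman_scott(latent_rate, mean_children, spread, t_max, rng):
    """Draw one realization of a 1-D Neyman-Scott process (illustrative assumptions only).

    Step 1: latent events come from a homogeneous Poisson process on [0, t_max].
    Step 2: each latent event emits a Poisson(mean_children) number of observed
            points, placed with Gaussian jitter of scale `spread` around it.
    Observed points may fall slightly outside [0, t_max]; they are not clipped.
    """
    n_latent = poisson_count(latent_rate * t_max, rng)
    latent = sorted(rng.uniform(0.0, t_max) for _ in range(n_latent))

    observed = []  # (time, index of the generating latent event)
    for k, center in enumerate(latent):
        for _ in range(poisson_count(mean_children, rng)):
            observed.append((rng.gauss(center, spread), k))
    observed.sort()
    return latent, observed


if __name__ == "__main__":
    latent, observed = sample_neyman_scott(0.5, 4.0, 0.2, 20.0, random.Random(0))
    print(len(latent), "latent events generated", len(observed), "observed points")
```

The number of clusters is random because it is itself a Poisson draw, which is the property the abstract compares to Bayesian nonparametric mixture models. Inference reverses this simulation: given only the observed times, it recovers the latent events and the cluster labels.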


Linking Across Data Granularity: Fitting Multivariate Hawkes Processes to Partially Interval-Censored Data

Pio Calderon , Alexander Soen , Marian-Andrei Rizoiu

Categories: Machine Learning

2021-11-03

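This work introduces a novel multivariate temporal point process, the Partial Mean Behavior Poisson (PMBP) process, which can be leveraged to fit the multivariate Hawkes process to partially interval-censored data, that is, data consisting of a mix of event timestamps on a subset of dimensions and interval-censored event counts on the complementary dimensions. First, we define the PMBP process via its conditional intensity and derive the regularity conditions for subcriticality. We show that both the Hawkes process and the MBP process (Rizoiu et al.) are special cases of the PMBP process. Second, we provide numerical schemes for computing the conditional intensity and for sampling event histories of the PMBP process. Third, we demonstrate the applicability of the PMBP process on synthetic and real-world datasets: we test its ability to recover multivariate Hawkes parameters given sample event histories of the Hawkes process. Next, we evaluate the PMBP process on the YouTube popularity prediction task and show that it outperforms the current state-of-the-art Hawkes Intensity Process (Rizoiu et al. (2017b)). Lastly, on a curated dataset of COVID-19 daily case counts and COVID-19-related news articles for a sample of countries, we show that clustering on the PMBP-fitted parameters enables a categorization of countries with respect to the country-level interaction between cases and news reporting.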


Neural Point Process for Learning Spatiotemporal Event Dynamics

Zihao Zhou , Xingyi Yang , Ryan Rossi , Handong Zhao , Rose Yu

Categories: Machine Learning | Artificial Intelligence

2021-12-12

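Learning the dynamics of spatiotemporal events is a fundamental problem. Neural point processes enhance the expressivity of point process models with deep neural networks. However, most existing methods consider only temporal dynamics, without spatial modeling. We propose the Deep Spatiotemporal Point Process (DeepSTPP), a deep dynamics model that integrates spatiotemporal point processes. Our method is flexible and efficient, and it can accurately forecast irregularly sampled events over space and time. The key construction of our approach is a nonparametric space-time intensity function governed by a latent process. The intensity function admits closed-form integration for the density. The latent process captures the uncertainty of the event sequence. We use amortized variational inference to infer the latent process with deep networks. Using synthetic datasets, we validate that our model can accurately learn the true intensity function. On real-world benchmark datasets, our model demonstrates superior performance over state-of-the-art baselines.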


Scalable Variational Bayes methods for Hawkes processes

Deborah Sulem , Vincent Rivoirard , Judith Rousseau

Categories: Machine Learning (Statistics)

2022-12-01

Multivariate Hawkes processes are temporal point processes extensively applied to model event data with dependence on past occurrences and interaction phenomena. In the generalised nonlinear model, positive and negative interactions between the components of the process are allowed, therefore accounting for so-called excitation and inhibition effects. In the nonparametric setting, learning the temporal dependence structure of Hawkes processes is often a computationally expensive task, all the more so with Bayesian estimation methods. In general, the posterior distribution in the nonlinear Hawkes model is non-conjugate and doubly intractable. Moreover, existing Markov chain Monte Carlo methods are often slow and not scalable to high-dimensional processes in practice. Recently, efficient algorithms targeting a mean-field variational approximation of the posterior distribution have been proposed. In this work, we unify existing variational Bayes inference approaches under a general framework that we analyse theoretically under easily verifiable conditions on the prior, the variational class, and the model. We notably apply our theory to a novel spike-and-slab variational class, which can induce sparsity through the connectivity graph parameter of the multivariate Hawkes model. Then, in the context of the popular sigmoid Hawkes model, we leverage an existing data augmentation technique and design adaptive and sparsity-inducing mean-field variational methods. In particular, we propose a two-step algorithm based on a thresholding heuristic to select the graph parameter. Through an extensive set of numerical simulations, we demonstrate that our approach enjoys several benefits: it is computationally efficient, can reduce the dimensionality of the problem by selecting the graph parameter, and is able to adapt to the smoothness of the underlying parameter.
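To make the excitation and inhibition effects in this abstract concrete, the following is a minimal sketch of a sigmoid-link Hawkes intensity. It is an illustration under assumptions, not the paper's model specification: the exponential interaction kernel, the parameter layout, and the names `sigmoid_hawkes_intensity`, `background`, `weights`, `decay`, and `lam_max` are all hypothetical choices made for this example.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sigmoid_hawkes_intensity(t, history, k, background, weights, decay, lam_max):
    """Intensity of component k at time t, given strictly earlier events.

    history    : list of (time, component) pairs
    background : background[k] is the baseline drive of component k
    weights    : weights[l][k] is the signed effect of a past event in l on k
                 (positive = excitation, negative = inhibition, 0 = no edge)
    decay      : rate of the assumed exponential interaction kernel
    lam_max    : lam_max[k] is the upper bound of the intensity of component k
    """
    drive = background[k]
    for s, l in history:
        if s < t:
            drive += weights[l][k] * math.exp(-decay * (t - s))
    # The sigmoid link keeps the intensity in (0, lam_max[k]) even when the
    # summed drive is negative, which is what permits inhibition.
    return lam_max[k] * sigmoid(drive)


if __name__ == "__main__":
    history = [(1.0, 0)]                 # one event in component 0 at t = 1
    weights = [[1.5, -1.5], [0.0, 0.0]]  # 0 excites itself and inhibits 1
    for k in (0, 1):
        print(k, sigmoid_hawkes_intensity(1.5, history, k, [0.0, 0.0], weights, 1.0, [2.0, 2.0]))
```

In this parameterisation a zero entry in `weights` corresponds to a missing edge in the connectivity graph, which is the quantity a sparsity-inducing prior would target.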


Variational Inference: A Review for Statisticians

David M. Blei , Alp Kucukelbir , Jon D. McAuliffe

Categories:

2016-01-04

One of the core problems of modern statistics is to approximate difficult-to-compute probability densities. This problem is especially important in Bayesian statistics, which frames all inference about unknown quantities as a calculation involving the posterior density. In this paper, we review variational inference (VI), a method from machine learning that approximates probability densities through optimization. VI has been used in many applications and tends to be faster than classical methods, such as Markov chain Monte Carlo sampling. The idea behind VI is to first posit a family of densities and then to find the member of that family which is close to the target. Closeness is measured by Kullback-Leibler divergence. We review the ideas behind mean-field variational inference, discuss the special case of VI applied to exponential family models, present a full example with a Bayesian mixture of Gaussians, and derive a variant that uses stochastic optimization to scale up to massive data. We discuss modern research in VI and highlight important open problems. VI is powerful, but it is not yet well understood. Our hope in writing this paper is to catalyze statistical research on this class of algorithms.
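The optimization view summarized in this abstract rests on a standard identity, stated here as a brief sketch. The notation is an assumption for illustration: $x$ denotes the observations, $z$ the latent variables, and $q$ a member of the chosen variational family.

```latex
\begin{align}
\log p(x) &= \mathrm{ELBO}(q) + \mathrm{KL}\bigl(q(z)\,\|\,p(z \mid x)\bigr), \\
\mathrm{ELBO}(q) &= \mathbb{E}_{q}\bigl[\log p(x, z)\bigr] - \mathbb{E}_{q}\bigl[\log q(z)\bigr], \\
q(z) &= \prod_{j=1}^{m} q_j(z_j) \qquad \text{(mean-field family)}.
\end{align}
```

Because $\log p(x)$ does not depend on $q$, maximizing the ELBO over the family is equivalent to minimizing the KL divergence to the posterior, and it avoids computing the intractable normalizing constant $p(x)$.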


The Application of Zig-Zag Sampler in Sequential Markov Chain Monte Carlo

Yu Han , Kazuyuki Nakamura

Categories: Machine Learning | Machine Learning (Statistics)

2021-11-18

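Particle filtering methods are widely applied to sequential state estimation in nonlinear, non-Gaussian state-space models. However, traditional particle filters suffer from weight degeneracy in high-dimensional state-space models. Many methods now exist to improve the performance of particle filtering in this setting. Among them, a more advanced approach is to construct a Sequential Markov Chain Monte Carlo (SMCMC) framework by implementing a composite Metropolis-Hastings (MH) kernel. In this paper, we propose discretizing the Zig-Zag sampler and applying it in the refinement stage of the composite MH kernel within the SMCMC framework, where the joint draw stage is implemented with invertible particle flow. We evaluate the performance of the proposed method through numerical experiments on challenging, complex high-dimensional filtering examples. The numerical experiments show that, in high-dimensional state estimation examples, the proposed method improves estimation accuracy and increases the acceptance ratio compared with state-of-the-art filtering methods.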


$π$VAE: a stochastic process prior for Bayesian deep learning with MCMC

Swapnil Mishra , Seth Flaxman , Tresnia Berah , Harrison Zhu , Mikko Pakkanen , Samir Bhatt

Categories: Machine Learning | Machine Learning (Statistics)

2020-02-17

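Stochastic processes provide a mathematically elegant way to model complex data. In theory, they provide flexible priors over function classes that can encode a wide range of interesting assumptions. In practice, however, efficient inference by optimisation or marginalisation is difficult, a problem further exacerbated by big data and high-dimensional input spaces. We propose a novel variational autoencoder (VAE) called the prior encoding variational autoencoder ($\pi$VAE). The $\pi$VAE is finitely exchangeable and Kolmogorov consistent, and thus is a continuous stochastic process. We use $\pi$VAE to learn low-dimensional embeddings of function classes. We show that our framework can accurately learn expressive function classes such as Gaussian processes, as well as properties of functions that enable statistical inference (such as the integral of a log Gaussian process). For popular tasks such as spatial interpolation, $\pi$VAE achieves state-of-the-art performance in terms of both accuracy and computational efficiency. Perhaps most usefully, we demonstrate that the learned low-dimensional, independently distributed latent space representation provides an elegant and scalable means of performing Bayesian inference for stochastic processes within probabilistic programming languages such as Stan.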


Non-Gaussian Process Regression

Yaman Kındap , Simon Godsill

Categories: Machine Learning (Statistics) | Machine Learning

2022-09-07

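Standard Gaussian processes (GPs) offer a flexible modelling tool for well-behaved processes. However, deviations from Gaussianity are expected to appear in real-world datasets, where structural outliers and shocks are routinely observed. In these cases GPs can fail to model uncertainty adequately and may over-smooth inferences. Here we extend the GP framework to a new class of time-changed GPs that allow straightforward modelling of heavy-tailed non-Gaussian behaviour, while retaining a tractable conditional GP structure through a representation as an infinite mixture of non-homogeneous GPs. The conditional GP structure is obtained by conditioning the observations on a latent transformed input space, and the random evolution of the latent transformation is modelled using a Lévy process, which allows Bayesian inference for both the posterior predictive density and the latent transformation function. We present Markov chain Monte Carlo inference procedures for this model and demonstrate the potential benefits compared to a standard GP.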


An Empirical Study: Extensive Deep Temporal Point Process

Haitao Lin , Cheng Tan , Lirong Wu , Zhangyang Gao , Stan Z. Li

Categories: Machine Learning

2021-10-19

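The temporal point process, a stochastic process on a continuous time domain, is commonly used to model asynchronous event sequences marked by their occurrence timestamps. Thanks to their strong expressivity, deep neural networks are a promising choice for capturing the patterns in asynchronous sequences in the context of temporal point processes. In this paper, we first review recent research emphases and difficulties in modeling asynchronous event sequences with deep temporal point processes, which can be grouped into four areas: encoding of the history sequence, formulation of the conditional intensity function, relational discovery among events, and learning approaches for optimization. We introduce recently proposed models by dismantling them into these four parts, and we conduct experiments by recombining the first three parts under the same learning strategy for a fair empirical evaluation. In addition, we extend the families of history encoders and conditional intensity functions, and propose a Granger causality discovery framework for exploiting the relations among multiple types of events. Because Granger causality can be represented by a Granger causality graph, discrete graph structure learning within a variational inference framework is employed to reveal the latent structure of the graph. Further experiments show that the proposed framework with latent graph discovery can capture the relations and achieve improved fitting and predictive performance.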


Approximate Gibbs Sampler for Efficient Inference of Hierarchical Bayesian Models for Grouped Count Data

Jin-Zhu Yu , Hiba Baroud

Categories: Machine Learning

2022-11-28

Hierarchical Bayesian Poisson regression models (HBPRMs) provide a flexible approach to modeling the relationship between predictors and count response variables. Applying HBPRMs to large-scale datasets requires efficient inference algorithms because of the high computational cost of inferring many model parameters by random sampling. Although Markov chain Monte Carlo (MCMC) algorithms have been widely used for Bayesian inference, sampling with this class of algorithms is time-consuming for applications with large-scale data and time-sensitive decision-making, partially due to the non-conjugacy of many models. To overcome this limitation, this research develops an approximate Gibbs sampler (AGS) to efficiently learn HBPRMs while maintaining inference accuracy. In the proposed sampler, the data likelihood is approximated with a Gaussian distribution such that the conditional posterior of the coefficients has a closed-form solution. Numerical experiments using real and synthetic datasets with small and large counts demonstrate the superior performance of AGS in comparison to the state-of-the-art sampling algorithm, especially for large datasets.


Faster MCMC for Gaussian Latent Position Network Models

Neil A. Spencer , Brian Junker , Tracy M. Sweet

Categories: Machine Learning (Statistics)

2020-06-13

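Latent position network models are a versatile tool in network science; applications include clustering entities, controlling for causal confounders, and defining priors over unobserved graphs. Estimating each node's latent position is typically framed as a Bayesian inference problem, with Metropolis-within-Gibbs being the most popular tool for approximating the posterior distribution. However, it is well known that Metropolis-within-Gibbs is inefficient for large networks: the acceptance ratios are expensive to compute, and the resulting posterior draws are highly correlated. In this article, we propose an alternative Markov chain Monte Carlo strategy, defined using a combination of split Hamiltonian Monte Carlo and Firefly Monte Carlo, that leverages the functional form of the posterior distribution for more efficient posterior computation. We demonstrate that these strategies outperform Metropolis-within-Gibbs and other algorithms on synthetic networks, as well as on real information-sharing networks of teachers and staff in a school district.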


Joint Non-parametric Point Process model for Treatments and Outcomes: Counterfactual Time-series Prediction Under Policy Interventions

Çağlar Hızlı , ST John , Anne Juuti , Tuure Saarinen , Kirsi Pietiläinen , Pekka Marttinen

Categories: Machine Learning

2022-09-09

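Policy makers need to predict the progression of an outcome before adopting a new treatment policy, which defines when and how a sequence of treatments affecting the outcome occurs in continuous time. Commonly, algorithms that predict interventional future outcome trajectories take a fixed sequence of future treatments as input. This either neglects the dependence of future treatments on the outcomes preceding them or implicitly assumes that the treatment policy is known, and hence excludes scenarios where the policy is unknown or a counterfactual analysis is needed. To address these limitations, we develop a joint model for treatments and outcomes, which allows the estimation of treatment policies and effects from sequential treatment-outcome data. It can answer interventional and counterfactual queries about interventions on treatment policies, as we show with real-world data on blood glucose progression and a simulation study built on top of it.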


Variational Gibbs inference for statistical model estimation from incomplete data

Vaidotas Simkus , Benjamin Rhodes , Michael U. Gutmann

Categories: Machine Learning | Machine Learning (Statistics)

2021-11-25

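Statistical models are central to machine learning, with broad applicability across a range of downstream tasks. The models are typically controlled by free parameters that are estimated from data by maximum-likelihood estimation. However, when faced with real-world datasets, many of these models run into a critical issue: they are formulated in terms of fully observed data, whereas in practice datasets are plagued by missing data. The theory of statistical model estimation from incomplete data is conceptually similar to the estimation of latent-variable models, for which powerful tools such as variational inference (VI) exist. However, in contrast to standard latent-variable models, parameter estimation with incomplete data often requires estimating exponentially many conditional distributions of the missing variables, which makes standard VI methods intractable. We address this gap by introducing variational Gibbs inference (VGI), a new general-purpose method for estimating the parameters of statistical models from incomplete data. We validate VGI on a set of synthetic and real-world estimation tasks, estimating important machine learning models, VAEs and normalising flows, from incomplete data. The proposed method, whilst general-purpose, achieves performance competitive with or better than existing model-specific estimation methods.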


Cardinality-Regularized Hawkes-Granger Model

Tsuyoshi Idé , Georgios Kollias , Dzung T. Phan , Naoki Abe

Categories: Machine Learning | Artificial Intelligence

2022-08-23

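We propose a new sparse Granger-causal learning framework for temporal event data. We focus on a specific class of point processes called the Hawkes process. We begin by pointing out that most existing sparse causal learning algorithms for the Hawkes process suffer from a singularity in maximum likelihood estimation. As a result, their sparse solutions can appear only as numerical artifacts. In this paper, we propose a mathematically well-defined sparse causal learning framework based on a cardinality-regularized Hawkes process, which remedies the pathological issues of existing approaches. We apply the proposed algorithm to the task of instance-wise causal event analysis, where sparsity plays a critical role. We validate the proposed framework with two real use cases, one from the power grid and the other from the cloud data center management domain.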


Stochastic Variational Inference

Matt Hoffman , David M. Blei , Chong Wang , John Paisley

Categories:

2012-06-29

We develop stochastic variational inference, a scalable algorithm for approximating posterior distributions. We develop this technique for a large class of probabilistic models and we demonstrate it with two probabilistic topic models, latent Dirichlet allocation and the hierarchical Dirichlet process topic model. Using stochastic variational inference, we analyze several large collections of documents: 300K articles from Nature, 1.8M articles from The New York Times, and 3.8M articles from Wikipedia. Stochastic inference can easily handle data sets of this size and outperforms traditional variational inference, which can only handle a smaller subset. (We also show that the Bayesian nonparametric topic model outperforms its parametric counterpart.) Stochastic variational inference lets us apply complex Bayesian models to massive data sets.
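As a brief sketch of what makes the algorithm scalable, the global variational parameter is updated from one sampled data point (or minibatch) at a time with a decreasing step size. The symbols are assumptions for illustration and do not appear in this abstract: $\lambda^{(t)}$ is the global variational parameter after step $t$, $\hat{\lambda}_t$ is the estimate that would be optimal if the sampled data point were replicated across the whole dataset, and $\tau \ge 0$ (delay) and $\kappa$ (forgetting rate) control the schedule.

```latex
\begin{align}
\rho_t &= (t + \tau)^{-\kappa}, \qquad \kappa \in (0.5,\, 1], \\
\lambda^{(t)} &= (1 - \rho_t)\,\lambda^{(t-1)} + \rho_t\,\hat{\lambda}_t, \\
\sum_{t} \rho_t &= \infty, \qquad \sum_{t} \rho_t^{2} < \infty .
\end{align}
```

The last line gives the Robbins-Monro conditions, which the schedule in the first line satisfies. Each step touches only the sampled data, so the per-iteration cost does not grow with the size of the collection.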


Nonparametric Embeddings of Sparse High-Order Interaction Events

Zheng Wang , Yiming Xu , Conor Tillinghast , Shibo Li , Akil Narayan , Shandian Zhe

Categories: Machine Learning | Machine Learning (Statistics)

2022-07-08

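High-order interaction events are common in real-world applications. Learning embeddings that encode the complex relationships of the participants from these events is of great importance in knowledge mining and predictive tasks. Despite the success of existing approaches such as Poisson tensor factorization, they ignore the sparse structure underlying the data: the interactions that actually occur are far fewer than all possible interactions among the participants. In this paper, we propose Nonparametric Embeddings of Sparse High-order interaction events (NESH). We hybridize a sparse hypergraph (tensor) process and a matrix Gaussian process to capture both the asymptotic structural sparsity within the interactions and the nonlinear temporal relationships between the participants. We prove strong asymptotic bounds (both a lower and an upper bound) on the sparsity ratio, which reveal the asymptotic properties of the sampled structure. We use batch normalization, a stick-breaking construction, and sparse variational GP approximations to develop an efficient, scalable model inference algorithm. We demonstrate the advantage of our approach in several real-world applications.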


Sparse Gaussian Process Hyperparameters: Optimize or Integrate?

Vidhi Lalchand , Wessel P. Bruinsma , David R. Burt , Carl E. Rasmussen

Categories: Machine Learning (Statistics) | Machine Learning

2022-11-04

The kernel function and its hyperparameters are the central model selection choice in a Gaussian process (Rasmussen and Williams, 2006). Typically, the hyperparameters of the kernel are chosen by maximising the marginal likelihood, an approach known as Type-II maximum likelihood (ML-II). However, ML-II does not account for hyperparameter uncertainty, and it is well known that this can lead to severely biased estimates and an underestimation of predictive uncertainty. While several works employ a fully Bayesian characterisation of GPs, relatively few propose such approaches for the sparse GP paradigm. In this work we propose an algorithm for sparse Gaussian process regression which leverages MCMC to sample from the hyperparameter posterior within the variational inducing point framework of Titsias (2009). This work is closely related to Hensman et al. (2015b) but side-steps the need to sample the inducing points, thereby significantly improving sampling efficiency in the Gaussian likelihood case. We compare this scheme against natural baselines in the literature, as well as stochastic variational GPs (SVGPs), and provide an extensive computational analysis.


A Unified Approach to Variational Autoencoders and Stochastic Normalizing Flows via Markov Chains

Johannes Hertrich , Paul Hagemann , Gabriele Steidl

Categories: Machine Learning

2021-11-24

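Normalizing flows, diffusion normalizing flows, and variational autoencoders are powerful generative models. In this paper, we provide a unified framework for handling these approaches via Markov chains. Specifically, we consider stochastic normalizing flows as a pair of Markov chains fulfilling certain properties and show that many state-of-the-art models for data generation fit into this framework. The Markov chain point of view enables us to couple deterministic layers, such as invertible neural networks, with stochastic layers, such as Metropolis-Hastings layers, Langevin layers, and variational autoencoders, in a mathematically sound way. Besides layers with densities, such as Langevin layers, diffusion layers, or variational autoencoders, layers without densities, such as deterministic layers or Metropolis-Hastings layers, can also be handled. Hence our framework establishes a useful mathematical tool for combining the various approaches.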


Conjugate priors for count and rounded data regression

Daniel R. Kowal

Categories: Machine Learning (Statistics)

2021-10-23

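Discrete data are abundant and often arise as counts or rounded data. Yet even for linear regression models, conjugate priors and closed-form posteriors are typically unavailable, which necessitates approximations such as MCMC for posterior inference. For a broad class of count and rounded data regression models, we introduce conjugate priors that enable closed-form posterior inference. Key posterior and predictive functionals are computable via direct Monte Carlo simulation. Crucially, the predictive distributions are discrete, matching the support of the data, and can be evaluated or simulated jointly across multiple covariate values. These tools are broadly useful for linear regression, nonlinear models via basis expansions, and model and variable selection. Multiple simulation studies demonstrate significant advantages in computing, predictive modeling, and selection relative to existing alternatives.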
