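Hierarchical forecasting problems arise when time series have a natural group structure and predictions are needed at multiple levels of aggregation and disaggregation across the groups. In such problems, it is often desirable to satisfy the aggregation constraints of a given hierarchy, referred to in the literature as hierarchical coherence. Maintaining hierarchical coherence while producing accurate forecasts can be challenging, especially for probabilistic forecasting. We present a novel method capable of accurate and coherent probabilistic forecasts for hierarchical time series. We call it the Deep Poisson Mixture Network (DPMN). It relies on the combination of neural networks and a statistical model for the joint distribution of the hierarchical multivariate time series structure. By construction, the model guarantees hierarchical coherence and provides simple rules for aggregating and disaggregating the predictive distributions. We perform an extensive empirical evaluation comparing DPMN to other state-of-the-art methods that produce hierarchically coherent probabilistic forecasts on multiple public datasets. Compared to existing coherent probabilistic models, we obtain a relative improvement in the overall continuous ranked probability score (CRPS) on Australian domestic tourism data, along with improvements of 24.2% on the Favorita grocery sales dataset and 6.9% on the San Francisco Bay Area highway traffic dataset.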
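The CRPS reported in this abstract can be estimated directly from forecast samples. The following is a minimal sketch using the standard sample-based estimator E|X - y| - 0.5 E|X - X'|; the function name `crps_from_samples` is hypothetical and the code is not taken from the DPMN paper.

```python
import numpy as np

def crps_from_samples(samples, y):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|.

    samples: 1-D array of draws from the predictive distribution.
    y: the observed value. Lower scores are better.
    """
    samples = np.asarray(samples, dtype=float)
    # Average distance between the forecast draws and the observation.
    term_obs = np.mean(np.abs(samples - y))
    # Average distance between all pairs of draws (forecast spread).
    term_spread = np.mean(np.abs(samples[:, None] - samples[None, :]))
    return term_obs - 0.5 * term_spread
```

For a point forecast (all samples identical) the estimate reduces to the absolute error, which is one reason CRPS is commonly used to compare probabilistic and point forecasts on the same scale.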
Multivariate time series forecasting with hierarchical structure is pervasive in real-world applications, demanding not only predicting each level of the hierarchy, but also reconciling all forecasts to ensure coherency, i.e., the forecasts should satisfy the hierarchical aggregation constraints. Moreover, the disparities of statistical characteristics between levels can be huge, worsened by non-Gaussian distributions and non-linear correlations. To this end, we propose a novel end-to-end hierarchical time series forecasting model, based on conditioned normalizing flow-based autoregressive transformer reconciliation, to represent complex data distributions while simultaneously reconciling the forecasts to ensure coherency. Unlike other state-of-the-art methods, we achieve forecasting and reconciliation simultaneously without requiring any explicit post-processing step. In addition, by harnessing the power of deep models, we do not rely on any assumption such as unbiased estimates or Gaussian distributions. Our evaluation experiments are conducted on four real-world hierarchical datasets from different industrial domains (three public ones and a dataset from the application servers of Alipay's data center), and the preliminary results demonstrate the efficacy of our proposed method.
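Deep learning based forecasting methods have become the methods of choice in many applications of time series prediction or forecasting, often outperforming other approaches. Consequently, over the last few years, these methods have become ubiquitous in large-scale industrial forecasting applications and have consistently ranked among the best entries in forecasting competitions (e.g., M4 and M5). This practical success has further increased academic interest in understanding and improving deep forecasting methods. In this article we provide an introduction to and overview of the field: we present important building blocks for deep forecasting in some depth; using these building blocks, we then survey the breadth of the recent deep forecasting literature.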
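Large collections of time series data are commonly organized into cross-sectional structures with different levels of aggregation; examples include product and geographical groupings. A necessary condition for coherent decision-making and planning with such datasets is that the forecasts of the disaggregated series add up exactly to the forecasts of the aggregated series, which motivates the creation of novel hierarchical forecasting algorithms. The growing interest of the machine learning community in cross-sectional hierarchical forecasting systems means that we are at a propitious moment to ensure that scientific endeavors are grounded on sound baselines. For this reason, we put forward the HierarchicalForecast library, which contains preprocessed publicly available datasets, evaluation metrics, and a compiled set of statistical baseline models. Our Python-based framework aims to bridge the gap between statistical and econometric modeling and machine learning forecasting research. Code and documentation are available at https://github.com/nixtla/hierarchicalforecast.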
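The coherence condition described in this abstract (disaggregated forecasts adding up exactly to the aggregated forecasts) can be illustrated with a summing matrix. The sketch below uses plain NumPy on a hypothetical two-series toy hierarchy; it does not use the HierarchicalForecast API, and the names `S`, `bottom_up`, and `is_coherent` are illustrative assumptions.

```python
import numpy as np

# Toy hierarchy (hypothetical): total = A + B, with A and B the bottom series.
# Rows of S are ordered [total, A, B]; columns are the bottom series [A, B].
S = np.array([[1, 1],
              [1, 0],
              [0, 1]])

def bottom_up(bottom_forecasts):
    """Map bottom-level forecasts (n_bottom, horizon) to all levels."""
    return S @ np.asarray(bottom_forecasts, dtype=float)

def is_coherent(all_forecasts, tol=1e-8):
    """Check that every row equals the aggregation of the bottom rows."""
    all_forecasts = np.asarray(all_forecasts, dtype=float)
    n_bottom = S.shape[1]
    bottom = all_forecasts[-n_bottom:]  # bottom series are the last rows
    return bool(np.allclose(all_forecasts, S @ bottom, atol=tol))
```

Bottom-up forecasts are coherent by construction, whereas independently produced forecasts for each level generally are not, which is what reconciliation methods address.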
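Forecast combinations have flourished remarkably in the forecasting community and, in recent years, have become part of the mainstream of forecasting research and activities. Combining multiple forecasts produced for a single (target) series is now widely used to improve accuracy by integrating information gleaned from different sources, thereby mitigating the risk of identifying a single "best" forecast. Combination schemes have evolved from simple combination methods without estimation to sophisticated methods involving time-varying weights, nonlinear combinations, correlations among components, and cross-learning. They include combining point forecasts and combining probabilistic forecasts. This paper provides an up-to-date review of the extensive literature on forecast combinations, together with references to available open-source software implementations. We discuss the potential and limitations of various methods and highlight how these ideas have developed over time. Some important issues concerning the utility of forecast combinations are also surveyed. Finally, we conclude with current research gaps and potential insights for future research.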
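Two of the simplest schemes in this family can be sketched in a few lines: an equal-weight average (no estimation) and weights proportional to the inverse of each model's past mean squared error. This is an illustrative sketch only; the function name `combine_forecasts` and its arguments are assumptions, and it assumes every model has a nonzero past MSE.

```python
import numpy as np

def combine_forecasts(forecasts, past_errors=None):
    """Combine point forecasts of shape (n_models, horizon).

    Without past_errors, use equal weights. With past_errors of shape
    (n_models, n_obs), use weights proportional to inverse MSE.
    Returns the combined forecast and the weights used.
    """
    forecasts = np.asarray(forecasts, dtype=float)
    n_models = forecasts.shape[0]
    if past_errors is None:
        weights = np.full(n_models, 1.0 / n_models)
    else:
        mse = np.mean(np.asarray(past_errors, dtype=float) ** 2, axis=1)
        inverse_mse = 1.0 / mse  # assumes no model has zero past error
        weights = inverse_mse / inverse_mse.sum()
    return weights @ forecasts, weights
```

The more sophisticated schemes the review covers (time-varying weights, nonlinear combinations, cross-learning) replace the weight computation while keeping the same weighted-sum structure.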
Probabilistic forecasting, i.e. estimating the probability distribution of a time series' future given its past, is a key enabler for optimizing business processes. In retail businesses, for example, forecasting demand is crucial for having the right inventory available at the right time at the right place. In this paper we propose DeepAR, a methodology for producing accurate probabilistic forecasts, based on training an auto-regressive recurrent network model on a large number of related time series. We demonstrate how by applying deep learning techniques to forecasting, one can overcome many of the challenges faced by widely-used classical approaches to the problem. We show through extensive empirical evaluation on several real-world forecasting data sets accuracy improvements of around 15% compared to state-of-the-art methods.
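We introduce a mixture of heterogeneous experts framework called \texttt{MECATS}, which simultaneously forecasts the values of a set of time series that are related through an aggregation hierarchy. Different types of forecasting models can be employed as individual experts, so that the form of each model can be tailored to the nature of the corresponding time series. \texttt{MECATS} learns the hierarchical relationships during the training stage to help generalize better across all the time series being modeled, and it also mitigates the coherency issues that arise from the constraints imposed by the hierarchy. We further build multiple quantile estimators on top of the point forecasts. The resulting probabilistic forecasts are nearly coherent, distribution-free, and independent of the choice of forecasting models. We conduct a comprehensive evaluation of both point and probabilistic forecasts and also formulate an extension for situations where change points exist in sequential data. In general, our method is robust, adapts to datasets with different properties, and is highly configurable and efficient for large-scale forecasting pipelines.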
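The estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance. However, the practical utility of such estimates is limited by how accurately they quantify predictive uncertainty. In this work, we address the problem of estimating the joint predictive distribution of high-dimensional multivariate time series. We propose a versatile method, based on the transformer architecture, that estimates joint distributions using an attention-based decoder that can learn to mimic the properties of non-parametric copulas. The resulting model has several desirable properties: it can scale to hundreds of time series, supports both forecasting and interpolation, can handle irregularly and non-uniformly sampled data, and can seamlessly adapt to missing data during training. We demonstrate these properties empirically and show that our model produces state-of-the-art predictions on multiple real-world datasets.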
Hierarchical time series are common in several applied fields. Forecasts are required to be coherent, that is, to satisfy the constraints given by the hierarchy. The most popular technique to enforce coherence is called reconciliation, which adjusts the base forecasts computed for each time series. However, recent works on probabilistic reconciliation present several limitations. In this paper, we propose a new approach based on conditioning to reconcile any type of forecast distribution. We then introduce a new algorithm, called Bottom-Up Importance Sampling, to efficiently sample from the reconciled distribution. It can be used for any base forecast distribution: discrete, continuous, or in the form of samples, providing a major speedup compared to the current methods. Experiments on several temporal hierarchies show a significant improvement over base probabilistic forecasts.
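The idea of reconciling by conditioning and then sampling with importance weights can be sketched for the simplest possible case: one upper series equal to the sum of the bottom series. This is a simplified illustration under stated assumptions, not the authors' published algorithm: the upper base forecast is assumed Gaussian, the bottom base forecast is given as samples, and the function name `reconcile_by_importance_sampling` is hypothetical.

```python
import numpy as np

def reconcile_by_importance_sampling(bottom_samples, upper_mean, upper_std, rng):
    """Resample bottom draws so that their sum agrees with the upper forecast.

    bottom_samples: (n_samples, n_bottom) draws from the base bottom forecast.
    upper_mean, upper_std: parameters of an assumed Gaussian base forecast
    for the single upper series (the sum of the bottom series).
    """
    bottom_samples = np.asarray(bottom_samples, dtype=float)
    totals = bottom_samples.sum(axis=1)
    # Unnormalized log-weight: upper base log-density at the implied total.
    log_w = -0.5 * ((totals - upper_mean) / upper_std) ** 2
    w = np.exp(log_w - log_w.max())  # subtract the max for stability
    w /= w.sum()
    # Resample with replacement according to the importance weights.
    idx = rng.choice(len(bottom_samples), size=len(bottom_samples), p=w)
    return bottom_samples[idx]
```

Because only the bottom series are sampled and the upper value is obtained by summation, every reconciled draw is coherent by construction; the weights shift the sample toward bottom configurations that the upper forecast considers plausible.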
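We propose a principled method for the reconciliation of any probabilistic base forecasts. We show how probabilistic reconciliation can be obtained by merging, via Bayes' rule, the information contained in the base forecasts for the bottom and the upper time series. We illustrate our method on a toy hierarchy, showing how our framework allows the probabilistic reconciliation of any base forecast. We perform experiments on the reconciliation of temporal hierarchies of count time series, obtaining major improvements compared to probabilistic reconciliation based on the Gaussian or the truncated Gaussian distribution.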
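Probabilistic hierarchical time series forecasting is an important variant of time series forecasting, where the goal is to model and forecast multivariate time series that have underlying hierarchical relations. Most methods focus on point predictions and do not provide well-calibrated probabilistic forecast distributions. Recent state-of-the-art probabilistic forecasting methods also impose hierarchical relations only on point predictions and on samples of the distribution, which does not account for the coherency of the forecast distributions. Previous works also silently assume that datasets are always consistent with the given hierarchical relations and do not adapt to real-world datasets that deviate from this assumption. We close both of these gaps and propose PROFHIT, a fully probabilistic hierarchical forecasting model that jointly models the forecast distribution of the entire hierarchy. PROFHIT uses a flexible probabilistic Bayesian approach and introduces a novel distributional coherency regularization to learn from hierarchical relations over the entire forecast distribution, which enables robust and calibrated forecasts as well as adaptation to datasets of varying hierarchical consistency. On evaluating PROFHIT over a wide range of datasets, we observe 41-88% better performance in accuracy and calibration. Because coherency is modeled over the full distribution, we observe that PROFHIT can robustly provide reliable forecasts even when up to 10% of the input time series data is missing, whereas the performance of other methods degrades severely, by more than 70%.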
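Recent progress in neural forecasting has accelerated improvements in the performance of large-scale forecasting systems. Yet long-horizon forecasting remains a very difficult task. Two common challenges afflicting the task are the volatility of the predictions and their computational complexity. We introduce N-HITS, a model that addresses both challenges by incorporating novel hierarchical interpolation and multi-rate data sampling techniques. These techniques enable the proposed method to assemble its predictions sequentially, emphasizing components with different frequencies and scales while decomposing the input signal and synthesizing the forecast. We prove that the hierarchical interpolation technique can efficiently approximate arbitrarily long horizons in the presence of smoothness. Additionally, we conduct extensive experiments on large-scale datasets from the long-horizon forecasting literature, demonstrating the advantages of our method over state-of-the-art methods: N-HITS provides an average accuracy improvement of 16% over the latest Transformer architectures while reducing computation time (by a factor of 50). Our code is available at https://bit.ly/3jlibp8.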
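The hierarchical interpolation idea can be sketched as follows: each block outputs only a few values (knots) that are interpolated onto the full horizon, and the block outputs are summed, so blocks with few knots contribute slow components and blocks with many knots contribute fast ones. This is a simplified illustration, not the N-HITS implementation: the knots are given by hand rather than predicted by a network, linear interpolation is an assumption, and the function names are hypothetical.

```python
import numpy as np

def interpolate_to_horizon(knots, horizon):
    """Linearly interpolate a few knots onto the full forecast horizon."""
    knots = np.asarray(knots, dtype=float)
    knot_pos = np.linspace(0.0, 1.0, num=len(knots))
    out_pos = np.linspace(0.0, 1.0, num=horizon)
    return np.interp(out_pos, knot_pos, knots)

def assemble_forecast(knots_per_block, horizon):
    """Sum the interpolated outputs of all blocks into one forecast."""
    return sum(interpolate_to_horizon(k, horizon) for k in knots_per_block)
```

The practical benefit is that the number of values each block must predict stays small even when the horizon is long, which is the source of the reduced computation the abstract describes.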
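Hierarchical forecasting with intermittent time series is a challenge in both research and empirical studies. Extensive research focuses on improving the accuracy of each level of the hierarchy, especially the intermittent time series at the bottom levels. The forecasts at each level are then reconciled to further improve overall performance. In this paper, we present a hierarchical-forecasting-with-alignment approach that treats the bottom-level forecasts as mutable in order to ensure higher forecasting accuracy at the upper levels of the hierarchy. We employ N-BEATS, a pure deep learning forecasting approach, for the continuous time series at the top levels, and LightGBM, a widely used tree-based algorithm, for the intermittent time series at the bottom level. The hierarchical-forecasting-with-alignment approach is a simple yet effective variant of the bottom-up method that accounts for biases that are difficult to observe at the bottom level. It allows suboptimal forecasts at the lower level in exchange for higher overall performance. The approach in this empirical study was developed by the first author during the M5 Forecasting Accuracy competition, where it ranked second. The method is also business-oriented and could benefit strategic business planning.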
Generalizability of time series forecasting models depends on the quality of model selection. Temporal cross validation (TCV) is a standard technique to perform model selection in forecasting tasks. TCV sequentially partitions the training time series into train and validation windows, and performs hyperparameter optimization (HPO) of the forecast model to select the model with the best validation performance. Model selection with TCV often leads to poor test performance when the test data distribution differs from that of the validation data. We propose a novel model selection method, H-Pro, that exploits the data hierarchy often associated with a time series dataset. Generally, the aggregated data at the higher levels of the hierarchy show better predictability and more consistency compared to the bottom-level data, which is more sparse and (sometimes) intermittent. H-Pro performs the HPO of the lowest-level student model based on the test proxy forecasts obtained from a set of teacher models at higher levels in the hierarchy. The consistency of the teachers' proxy forecasts helps select better student models at the lowest level. We perform extensive empirical studies on multiple datasets to validate the efficacy of the proposed method. H-Pro along with off-the-shelf forecasting models outperforms existing state-of-the-art forecasting methods, including the winning models of the M5 point-forecasting competition.
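The sequential partitioning that temporal cross validation performs can be sketched as a generator of train and validation index ranges. This is one common variant (expanding training window, non-overlapping validation windows at the end of the series); the abstract does not specify the exact scheme, and the function name `temporal_cv_splits` and its parameters are assumptions.

```python
def temporal_cv_splits(n_obs, n_windows, val_size):
    """Yield (train_idx, val_idx) ranges for rolling-origin validation.

    Each validation window immediately follows its training window, so
    the model is never validated on data that precedes its training data.
    """
    for k in range(n_windows, 0, -1):
        val_end = n_obs - (k - 1) * val_size
        val_start = val_end - val_size
        if val_start <= 0:
            raise ValueError("not enough observations for the requested windows")
        yield range(0, val_start), range(val_start, val_end)
```

For example, `temporal_cv_splits(10, 2, 2)` validates first on observations 6 to 7 after training on 0 to 5, and then on observations 8 to 9 after training on 0 to 7.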
We propose Multivariate Quantile Function Forecaster (MQF$^2$), a global probabilistic forecasting method constructed using a multivariate quantile function, and investigate its application to multi-horizon forecasting. Prior approaches are either autoregressive, implicitly capturing the dependency structure across time but exhibiting error accumulation with increasing forecast horizons, or multi-horizon sequence-to-sequence models, which do not exhibit error accumulation but typically do not model the dependency structure across time steps. MQF$^2$ combines the benefits of both approaches by directly making predictions in the form of a multivariate quantile function, defined as the gradient of a convex function which we parametrize using input-convex neural networks. By design, the quantile function is monotone with respect to the input quantile levels and hence avoids quantile crossing. We provide two options to train MQF$^2$: with energy score or with maximum likelihood. Experimental results on real-world and synthetic datasets show that our model has comparable performance with state-of-the-art methods in terms of single time step metrics while capturing the time dependency structure.
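Probabilistic forecasting of time series is an important matter in many applications and research fields. In order to draw conclusions from a probabilistic forecast, we must ensure that the model class used to approximate the true forecasting distribution is expressive enough. Yet characteristics of the model itself, such as its uncertainty or its feature-outcome relationship, are no less important. This paper proposes Autoregressive Transformation Models (ATMs), a model class inspired by various research directions that unites expressive distributional forecasts, using a semi-parametric distribution assumption, with an interpretable model specification. We demonstrate the properties of ATMs both theoretically and through empirical evaluation on several simulated and real-world forecasting datasets.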
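Probabilistic time series forecasting is crucial in many application domains such as retail, e-commerce, finance, and biology. With the increasing availability of large volumes of data, a number of neural architectures have been proposed for this problem. In particular, Transformer-based methods achieve state-of-the-art performance on real-world benchmarks. However, these methods require a large number of parameters to be learned, which imposes high memory requirements on the computational resources used to train such models. To address this problem, we introduce a novel Bidirectional Temporal Convolutional Network (BiTCN), which requires an order of magnitude fewer parameters than a common Transformer-based approach. Our model combines two Temporal Convolutional Networks (TCNs): the first network encodes future covariates of the time series, whereas the second network encodes past observations and covariates. We jointly estimate the parameters of an output distribution via these two networks. Experiments on four real-world datasets show that our method performs on par with four state-of-the-art probabilistic forecasting methods, including a Transformer-based approach and WaveNet, on two point metrics (sMAPE, NRMSE) as well as on a set of range metrics (quantile loss percentiles) in the majority of cases. Secondly, we demonstrate that our method requires significantly fewer parameters than Transformer-based methods, which means the model can be trained faster with significantly lower memory requirements, and which consequently reduces the infrastructure cost of deploying these models.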
We consider the problem of dynamic pricing of a product in the presence of feature-dependent price sensitivity. Developing practical algorithms that can estimate price elasticities robustly, especially when information about no purchases (losses) is not available, to drive such automated pricing systems is a challenge faced by many industries. Based on the Poisson semi-parametric approach, we construct a flexible yet interpretable demand model where the price-related part is parametric while the remaining (nuisance) part of the model is non-parametric and can be modeled via sophisticated machine learning (ML) techniques. The estimation of price-sensitivity parameters of this model via direct one-stage regression techniques may lead to biased estimates due to regularization. To address this concern, we propose a two-stage estimation methodology which makes the estimation of the price-sensitivity parameters robust to biases in the estimators of the nuisance parameters of the model. In the first stage we construct estimators of observed purchases and prices given the feature vector using sophisticated ML estimators such as deep neural networks. Utilizing the estimators from the first stage, in the second stage we leverage a Bayesian dynamic generalized linear model to estimate the price-sensitivity parameters. We test the performance of the proposed estimation schemes on simulated and real sales transaction data from the airline industry. Our numerical studies demonstrate that our proposed two-stage approach reduces the estimation error in price-sensitivity parameters from 25% to 4% in realistic simulation settings. The two-stage estimation techniques proposed in this work allow practitioners to leverage modern ML techniques to robustly estimate price sensitivities while still maintaining interpretability and allowing ease of validation of their various constituent parts.
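Item Response Theory (IRT) is a ubiquitous model for understanding human behaviors and attitudes based on their responses to questions. Large modern datasets offer opportunities to capture more nuances in human behavior, potentially improving psychometric modeling and leading to improved scientific understanding and public policy. However, while larger datasets allow for more flexible approaches, many contemporary algorithms for fitting IRT models also have massive computational demands that can prohibit real-world application. To address this bottleneck, we introduce a variational Bayesian inference algorithm for IRT and show that it is fast and scalable without sacrificing accuracy. Applying this method to five large-scale item response datasets from cognitive science and education yields higher log-likelihoods and higher accuracy than alternative inference algorithms. Using this new inference approach, we then generalize IRT with expressive Bayesian models of responses, leveraging recent advances in deep learning to capture nonlinear item characteristic curves (ICCs) with neural networks. Using a grade-specific mathematics test from TIMSS, we show that our nonlinear IRT models can capture interesting asymmetric ICCs. The algorithm implementation is open-source and easy to use.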
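Long-range forecasts are the starting point of many decision support systems that need to draw inferences from high-level aggregate patterns in the forecasted values. State-of-the-art time series forecasting methods are either subject to concept drift on long-horizon forecasts or fail to predict coherent and accurate high-level aggregates. In this work, we present a novel probabilistic forecasting method that produces forecasts that are coherent between the base level and the predicted aggregate statistics. We achieve this coherency between predicted base-level and aggregate statistics using a novel inference method. Our inference method is based on KL divergence and can be solved efficiently in closed form. We show that, after inference, our method improves forecast performance at both the base level and on unseen aggregates on real datasets from three diverse domains.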