Platelet products are expensive and have very short shelf lives. Because usage rates for platelets are highly variable, effective management of platelet demand and supply is important yet challenging. The primary goal of this paper is to present an efficient forecasting model for platelet demand at Canadian Blood Services (CBS). To accomplish this goal, four demand forecasting methods are utilized and evaluated: ARIMA (Autoregressive Integrated Moving Average), Prophet, lasso regression (least absolute shrinkage and selection operator), and LSTM (Long Short-Term Memory) networks. We use a large clinical dataset from a centralized blood distribution centre serving four hospitals in Hamilton, Ontario, spanning 2010 to 2018 and consisting of daily platelet transfusions along with information such as the product specifications, the recipients' characteristics, and the recipients' laboratory test results. This study is the first to apply methods ranging from statistical time series models to data-driven regression and a machine learning technique to platelet transfusion forecasting, using clinical predictors and different amounts of data. We find that the multivariate approaches have the highest accuracy in general; however, if sufficient data are available, a simpler time series approach such as ARIMA appears to be sufficient. We also comment on the approach to choosing clinical indicators (inputs) for the multivariate models.
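Forecasts help businesses allocate resources and achieve objectives. At LinkedIn, product owners use forecasts to set business targets, track outlook, and monitor health. Engineers use forecasts to provision hardware efficiently. Developing a forecasting solution to meet these needs requires accurate and interpretable forecasts on diverse time series with sub-hourly to quarterly frequencies. We present Greykite, an open-source Python library for forecasting that has been deployed on over twenty use cases at LinkedIn. Its flagship algorithm, Silverkite, provides interpretable, fast, and highly flexible univariate forecasts that capture effects such as time-varying growth and seasonality, autocorrelation, holidays, and regressors. The library enables self-serve accuracy and trust by facilitating data exploration, model configuration, execution, and interpretation. Our benchmark results demonstrate out-of-the-box speed and accuracy on datasets from a variety of domains. Over the past two years, Greykite forecasts have been trusted by Finance, Engineering, and Product teams for resource planning and allocation, target setting and progress tracking, and anomaly detection and root cause analysis. We expect Greykite to be useful to forecast practitioners with similar applications who need accurate, interpretable forecasts that capture the complex dynamics common to time series related to human activity.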
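Operational networks commonly rely on machine learning models for many tasks, including detecting anomalies, inferring application performance, and forecasting demand. Unfortunately, model accuracy can degrade due to concept drift, whereby the relationship between the features and the target prediction changes for reasons ranging from software upgrades to seasonality to changes in user behavior. Mitigating concept drift is thus an essential part of operationalizing machine learning models, yet despite its importance, concept drift has not been extensively explored in the context of networking, or of regression models in general. As a result, it is not well understood how to detect or mitigate it for many common network management tasks that currently rely on machine learning models. Unfortunately, as we show, concept drift cannot be sufficiently mitigated by frequently retraining models on newly available data, and doing so can even degrade model accuracy further. In this paper, we characterize concept drift in a large cellular network for a major metropolitan area in the United States. We find that concept drift occurs across many important key performance indicators (KPIs), independently of the model, training set size, and time interval, which necessitates practical approaches to detect, explain, and mitigate it. To this end, we develop Local Error Approximation of Features (LEAF). LEAF detects drift; explains the features and time intervals that contribute most to drift; and mitigates drift using forgetting and over-sampling. We evaluate LEAF against industry-standard mitigation approaches using more than four years of cellular KPI data. Our initial tests with a major cellular provider in the US show that LEAF is effective on a variety of KPIs and models. LEAF consistently outperforms periodic and triggered retraining while also reducing costly retraining operations.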
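Short-term load forecasting (STLF) is challenging because the underlying time series (TS) are complex, expressing three seasonal patterns and a nonlinear trend. This paper proposes a novel hybrid hierarchical deep learning model that deals with multiple seasonality and produces both point forecasts and predictive intervals (PIs). It combines exponential smoothing (ES) and a recurrent neural network (RNN). ES dynamically extracts the main components of each individual TS and enables on-the-fly deseasonalization, which is particularly useful when operating on a relatively small dataset. A multi-layer RNN is equipped with a new type of dilated recurrent cell designed to efficiently model both short-term and long-term dependencies in TS. To improve the internal TS representation, and thus the model's performance, the RNN simultaneously learns both the ES parameters and the main mapping function that transforms inputs into forecasts. We compare our approach against several baseline methods, including classical statistical methods and machine learning (ML) approaches, on STLF problems for 35 European countries. The empirical study clearly shows that the proposed model has high expressive power for solving nonlinear stochastic forecasting problems with TS that include multiple seasonality and significant random fluctuations. In fact, it outperforms both statistical and state-of-the-art ML models in terms of accuracy.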
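Emergency Departments (EDs) are a fundamental element of the Portuguese National Health Service, serving as an entry point for users with diverse and very serious medical problems. Owing to the inherent characteristics of the ED, forecasting the number of patients using the service is particularly challenging. A mismatch between patient inflow and the number of medical professionals can lower the quality of the services provided and create problems that have repercussions for the entire hospital, such as the requisition of health care workers from other departments and the postponement of surgeries. ED overcrowding is driven in part by non-urgent patients, who resort to emergency services despite not having a medical emergency and who represent almost half of the total number of daily patients. This paper describes a novel deep learning architecture, the Temporal Fusion Transformer, that uses calendar and time-series covariates to forecast prediction intervals and point predictions for a 4-week period. We conclude that patient volume for Portugal's Health Regional Areas (HRA) can be forecast, as evaluated by Mean Absolute Percentage Error (MAPE) and Root Mean Squared Error (RMSE), with an RMSE of 84.4102 people/day. The paper provides empirical evidence supporting the use of a multivariate approach with static and time-series covariates, which surpasses other models commonly found in the literature.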
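With COVID-19 affecting every country globally and changing everyday life, the ability to forecast the spread of the disease is more important than in any previous epidemic. The conventional methods of disease-spread modeling, compartmental models, are based on the assumption of spatiotemporal homogeneity in the spread of the virus, which may cause forecasts to underperform, especially at high spatial resolutions. This paper approaches the forecasting task with an alternative technique: spatiotemporal machine learning. We present COVID-LSTM, a data-driven model based on a Long Short-Term Memory deep learning architecture for forecasting COVID-19 incidence at the county level in the US. We use the weekly number of new positive cases as temporal input, and hand-engineered spatial features from Facebook movement and connectedness datasets, to capture the spread of the disease in time and space. COVID-LSTM outperforms the COVID-19 Forecast Hub's ensemble model (COVIDhub-ensemble) over our 17-week evaluation period, making it the first model to be more accurate than the COVIDhub-ensemble over one or more forecast periods. Over the 4-week forecast horizon, our model is on average 50 cases per county more accurate than the COVIDhub-ensemble. We highlight that the underutilization of data-driven forecasting of disease spread prior to COVID-19 is likely due to the lack of sufficient data for previous diseases, in addition to the recency of advances in machine learning methods for spatiotemporal forecasting. We discuss the impediments to wider uptake of data-driven forecasting and whether more deep learning-based models are likely to be used in the future.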
A well-performing prediction model is vital for a recommendation system that suggests actions for energy-efficient consumer behavior. However, reliable and accurate predictions depend on informative features and a suitable model design to perform well and robustly across different households and appliances. Moreover, customers' unjustifiably high expectations of accurate predictions may discourage them from using the system in the long term. In this paper, we design a three-step forecasting framework that covers predictability assessment, feature engineering, and deep learning architectures for forecasting 24 hourly load values. First, our predictability analysis provides a tool for expectation management to temper customers' anticipations. Second, we design several new weather-, time-, and appliance-related parameters for the modeling procedure and test their contribution to the model's prediction performance. Third, we examine six deep learning techniques and compare them to tree-based and support vector regression benchmarks. We develop a robust and accurate model for appliance-level load prediction based on four datasets from four different regions (US, UK, Austria, and Canada) with an equal set of appliances. The empirical results show that cyclical encoding of time features and weather indicators, combined with a long short-term memory (LSTM) model, offers the best performance.
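The cyclical encoding of time features mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact formulation, so the sine/cosine mapping below is the common textbook variant, and the function names `cyclical_encode` and `euclidean_distance` are hypothetical names chosen for illustration.

```python
import math


def cyclical_encode(value, period):
    """Map a periodic value (e.g. hour 0..23 with period 24) onto the unit circle.

    Assumption: the standard sine/cosine encoding; the paper may use a variant.
    """
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def euclidean_distance(p, q):
    """Straight-line distance between two encoded points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Encode every hour of the day as a (sin, cos) pair.
hour_features = [cyclical_encode(h, 24) for h in range(24)]

# With a raw integer feature, hour 23 and hour 0 are 23 units apart.
# After encoding, they are as close as any other pair of adjacent hours.
gap_midnight = euclidean_distance(hour_features[23], hour_features[0])
gap_morning = euclidean_distance(hour_features[0], hour_features[1])
```

The point of the encoding is that a model sees 23:00 and 00:00 as neighbours, which a plain integer hour feature does not convey.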
In this paper, we propose a new short-term load forecasting (STLF) model based on a contextually enhanced hybrid and hierarchical architecture combining exponential smoothing (ES) and a recurrent neural network (RNN). The model is composed of two simultaneously trained tracks: the context track and the main track. The context track introduces additional information to the main track. It is extracted from representative series and dynamically modulated to adjust to the individual series forecasted by the main track. The RNN architecture consists of multiple recurrent layers stacked with hierarchical dilations and equipped with recently proposed attentive dilated recurrent cells. These cells enable the model to capture short-term, long-term, and seasonal dependencies across time series and to weight the input information dynamically. The model produces both point forecasts and predictive intervals. The experimental part of the work, performed on 35 forecasting problems, shows that the proposed model is more accurate than its predecessor as well as standard statistical models and state-of-the-art machine learning models.
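Groundwater level prediction is an applied time series forecasting task with important social impacts, both for optimizing water management and for preventing certain natural disasters such as floods or severe droughts. Machine learning methods have been reported in the literature for this task, but they focus only on forecasting the groundwater level at a single location. A global forecasting method aims to exploit groundwater level time series from a wide range of locations to produce predictions at a single place or at several places at a time. Given the recent success of global forecasting methods in prestigious competitions, it is meaningful to assess them on groundwater level prediction and see how they compare to local methods. In this work, we created a dataset of 1026 groundwater level time series. Each time series consists of daily measurements of groundwater levels and two exogenous variables, rainfall and evapotranspiration. The dataset is made available to the community for reproducibility and further evaluation. To identify the best configuration for effectively predicting groundwater levels across the complete set of time series, we compared different predictors, including local and global time series forecasting methods. We also assessed the impact of the exogenous variables. Our analysis of the results shows that the best predictions are obtained by training a global method on past groundwater levels and rainfall data.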
Wind power forecasting supports power system planning by raising the level of certainty in decision-making. Due to the randomness inherent to meteorological events (e.g., wind speeds), making highly accurate long-term predictions for wind power can be extremely difficult. One approach to this challenge is to utilize weather information from multiple points across a geographical grid to obtain a holistic view of the wind patterns, along with temporal information from the previous power outputs of the wind farms. Our proposed CNN-RNN architecture combines convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to extract spatial and temporal information from multi-dimensional input data to make day-ahead predictions. In this regard, our method incorporates an ultra-wide learning view, combining data from multiple numerical weather prediction models, wind farms, and geographical locations. Additionally, we experiment with global forecasting approaches to understand the impact of training the same model over the datasets obtained from multiple different wind farms, and we employ a method where spatial information extracted from convolutional layers is passed to a tree ensemble (e.g., Light Gradient Boosting Machine (LGBM)) instead of fully connected layers. The results show that our proposed CNN-RNN architecture outperforms other models such as LGBM, Extra Tree regressor, and linear regression when trained globally, but fails to replicate such performance when trained individually on each farm. We also observe that passing the spatial information from the CNN to LGBM improves its performance, providing further evidence of the CNN's spatial feature extraction capabilities.
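Forecasting fund performance is beneficial to both investors and fund managers, but it is a challenging task. In this paper, we test whether deep learning models can forecast fund performance more accurately than traditional statistical techniques. Fund performance is typically evaluated by the Sharpe ratio, which represents risk-adjusted performance and ensures meaningful comparability across funds. We calculated annualized Sharpe ratios based on monthly return time series data for more than 600 open-end mutual funds investing in listed large-cap equities in the United States. We find that long short-term memory (LSTM) and gated recurrent unit (GRU) deep learning methods, both trained with modern Bayesian optimization, forecast funds' Sharpe ratios more accurately than traditional statistical methods. An ensemble method that combines the forecasts from LSTM and GRU achieves the best performance of all models. The evidence suggests that deep learning and ensembling offer promising solutions to the challenge of fund performance forecasting.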
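Researchers have long been working to develop reliable and accurate predictive models for stock price prediction. According to the literature, if predictive models are correctly designed and refined, they can estimate future stock values carefully and faithfully. This paper demonstrates a set of time series, econometric, and learning-based models for stock price prediction. Data for Infosys, ICICI, and Sun Pharma from January 2004 to December 2019 were used to train and test the models in order to determine which model performs best in which sector. One time series model (Holt-Winters exponential smoothing), one econometric model (ARIMA), two machine learning models (random forest and MARS), and two deep learning-based models (simple RNN and LSTM) are included in this paper. MARS proved to be the best-performing machine learning model, while LSTM proved to be the best-performing deep learning model. Overall, however, for all three sectors, IT (on Infosys data), banking (on ICICI data), and health (on Sun Pharma data), MARS proved to be the best-performing model in sales forecasting.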
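The electricity industry is heavily implementing smart grid technologies to improve reliability, availability, security, and efficiency. This implementation requires technological advancements, the development of standards and regulations, and testing and planning. Smart grid load forecasting and management are critical for reducing demand volatility and improving the market mechanism that connects generators, distributors, and retailers. During policy implementations or external interventions, it is necessary to analyze the uncertainty of their impact on electricity demand so that the system can respond more accurately to fluctuations in demand. This paper analyzes the uncertainty of the impact of external interventions on electricity demand. It implements a framework that combines probabilistic and global forecasting models, using a deep learning approach to estimate the distribution of the causal impact of an intervention. The causal effect is assessed by predicting the counterfactual distribution of outcomes for the affected instances and then contrasting it with the actual outcomes. We consider the impact of COVID-19 lockdowns on energy usage as a case study for evaluating the non-uniform effect of such an intervention on the distribution of electricity demand. We show that during the initial lockdowns in Australia and some European countries, the troughs generally dropped more than the peaks, while the mean remained almost unaffected.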
As power systems evolve into more intelligent, interactive, and flexible systems with a larger penetration of renewable energy sources, short-term demand prediction will inevitably become more and more crucial in designing and managing the future grid, especially at the level of an individual household. Projecting the demand for electricity for a single energy user, as opposed to the aggregated power consumption of residential load on a wide scale, is difficult because of a considerable number of volatile and uncertain factors. This paper proposes customized GRU (Gated Recurrent Unit) and LSTM (Long Short-Term Memory) architectures to address this challenging problem. LSTM and GRU are comparatively new and among the most widely adopted deep learning approaches. The electricity consumption datasets were obtained from individual household smart meters. The comparison shows that the LSTM model performs better for home-level forecasting than the alternative prediction technique, GRU in this case. To compare the NN-based models with a conventional statistical technique, an ARIMA-based model was also developed and benchmarked against the LSTM and GRU model outcomes, showing the performance of the proposed model on the collected time series data.
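Hybrid methods have been shown to outperform pure statistical and pure deep learning methods at forecasting tasks and at quantifying the uncertainty associated with those forecasts (prediction intervals). One example is the Exponential Smoothing Recurrent Neural Network (ES-RNN), a hybrid between a statistical forecasting model and a recurrent neural network variant. ES-RNN achieves a 9.4% improvement in absolute error in the Makridakis-4 Forecasting Competition. This improvement, and similar outperformance by other hybrid models, has primarily been demonstrated only on univariate datasets. Difficulties with applying hybrid forecasting methods to multivariate data include (i) the high computational cost involved in hyperparameter tuning, (ii) challenges associated with the autocorrelation inherent in the data, and (iii) complex dependencies (cross-correlation) between the covariates that may be hard to capture. This paper presents Multivariate Exponential Smoothing Long Short-Term Memory (MES-LSTM), a generalized multivariate extension to ES-RNN that overcomes these challenges. MES-LSTM utilizes a vectorized implementation. We test MES-LSTM on several aggregated coronavirus disease 2019 (COVID-19) morbidity datasets and find that our hybrid approach shows consistent, significant improvement over pure statistical and deep learning methods in forecast accuracy and prediction interval construction.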
As ride-hailing services become increasingly popular, being able to accurately predict demand for such services can help operators efficiently allocate drivers to customers, reduce idle time, ease congestion, and enhance the passenger experience. This paper proposes UberNet, a deep learning Convolutional Neural Network for short-term prediction of demand for ride-hailing services. UberNet employs a multivariate framework that utilises a number of temporal and spatial features that have been found in the literature to explain demand for ride-hailing services. The proposed model includes two sub-networks that aim to encode the source series of various features and decode the predicting series, respectively. To assess the performance and effectiveness of UberNet, we use 9 months of Uber pickup data in 2014 and 28 spatial and temporal features from New York City. By comparing the performance of UberNet with several other approaches, we show that the prediction quality of the model is highly competitive. Further, UberNet's prediction performance is better when using economic, social, and built environment features. This suggests that UberNet is more naturally suited to including complex motivators in making real-time passenger demand predictions for ride-hailing services.
Algorithms that involve both forecasting and optimization are at the core of solutions to many difficult real-world problems, such as in supply chains (inventory optimization), traffic, and in the transition towards carbon-free energy generation in battery/load/production scheduling in sustainable energy systems. Typically, in these scenarios we want to solve an optimization problem that depends on unknown future values, which therefore need to be forecast. As both forecasting and optimization are difficult problems in their own right, relatively little research has been done in this area. This paper presents the findings of the "IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling," held in 2021. We present a comparison and evaluation of the seven highest-ranked solutions in the competition, to provide researchers with a benchmark problem and to establish the state of the art for this benchmark, with the aim of fostering and facilitating research in this area. The competition used data from the Monash Microgrid, as well as weather data and energy market data. It then focused on two main challenges: forecasting renewable energy production and demand, and obtaining an optimal schedule for the activities (lectures) and on-site batteries that leads to the lowest cost of energy. The most accurate forecasts were obtained by gradient-boosted tree and random forest models, and optimization was mostly performed using mixed integer linear and quadratic programming. The winning method predicted different scenarios and optimized over all scenarios jointly using a sample average approximation method.
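Developing an accurate tourism forecasting model is essential for making sound policy decisions in tourism management. Early studies on tourism management focused on discovering external factors related to tourism demand. Recent studies apply deep learning to demand forecasting along with these external factors. They mainly build their frameworks on recurrent neural network models such as LSTM and RNN. However, these models are not well suited to forecasting tourism demand, because tourism demand is strongly affected by changes in various external factors and recurrent neural network models have limitations in handling such multivariate inputs. We propose a multi-head attention CNN model (MHAC) to address these limitations. MHAC uses a 1D convolutional neural network to analyze temporal patterns and an attention mechanism to reflect correlations between input variables. The model can extract spatial features from time series data of various variables. We apply our forecasting framework to predict changes in inbound tourism to South Korea, considering external factors such as politics, disease, season, and the attraction of Korean culture. The results of extensive experiments show that our method outperforms other deep learning-based prediction frameworks in South Korean tourism forecasting.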
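Predicting the economy's short-term dynamics, a vital input to economic agents' decision-making, often relies on lagged indicators in linear models. This is typically sufficient during normal times but may prove inadequate during crisis periods. This paper aims to demonstrate that non-traditional and timely data such as retail and wholesale payments, with the aid of nonlinear machine learning approaches, can provide policymakers with sophisticated models to accurately estimate key macroeconomic indicators in near real time. Moreover, we provide a set of econometric tools to mitigate overfitting and interpretability challenges in machine learning models, improving their effectiveness for policy use. Our models with payments data, nonlinear methods, and tailored cross-validation approaches help improve macroeconomic nowcasting accuracy by up to 40%, with higher gains during the COVID-19 period. We observe that the contribution of payments data to economic predictions is small and linear during low and normal growth periods. During strong negative or positive growth periods, however, the contribution of payments data is large, asymmetric, and nonlinear.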
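Forecast combinations have flourished in the forecasting community and, in recent years, have become part of the mainstream of forecasting research and activities. Combining multiple forecasts produced for a single (target) series is now widely used to improve accuracy by integrating information gathered from different sources, thereby mitigating the risk of having to identify a single "best" forecast. Combination schemes have evolved from simple combination methods without estimation to sophisticated methods involving time-varying weights, nonlinear combinations, correlations among components, and cross-learning. They include combining point forecasts and combining probabilistic forecasts. This paper provides an up-to-date review of the extensive literature on forecast combinations, together with references to available open-source software implementations. We discuss the potential and limitations of various methods and highlight how these ideas have developed over time. Some important issues concerning the utility of forecast combinations are also surveyed. Finally, we conclude with current research gaps and potential insights for future research.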