KDD CUP 2022提出了有关空间动态风能数据集的时间序列预测任务,其中要求参与者预测未来一代给定历史上下文因素。评估指标包含RMSE和MAE。本文介绍了团队88VIP的解决方案,该解决方案主要包括两种模型:梯度增强决策树,以记住基本数据模式和复发性神经网络,以捕获深层和潜在的概率过渡。结合这些模型有助于应对风力的波动,以及训练子模型对预测的异质时间尺度(从几分钟到几天)的杰出特性的目标。此外,还详细介绍了功能工程,插补技术和离线评估的设计。拟议的解决方案在第3阶段的总体在线得分为-45.213。
translated by 谷歌翻译
Wind power forecasting helps with the planning for the power systems by contributing to having a higher level of certainty in decision-making. Due to the randomness inherent to meteorological events (e.g., wind speeds), making highly accurate long-term predictions for wind power can be extremely difficult. One approach to remedy this challenge is to utilize weather information from multiple points across a geographical grid to obtain a holistic view of the wind patterns, along with temporal information from the previous power outputs of the wind farms. Our proposed CNN-RNN architecture combines convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to extract spatial and temporal information from multi-dimensional input data to make day-ahead predictions. In this regard, our method incorporates an ultra-wide learning view, combining data from multiple numerical weather prediction models, wind farms, and geographical locations. Additionally, we experiment with global forecasting approaches to understand the impact of training the same model over the datasets obtained from multiple different wind farms, and we employ a method where spatial information extracted from convolutional layers is passed to a tree ensemble (e.g., Light Gradient Boosting Machine (LGBM)) instead of fully connected layers. The results show that our proposed CNN-RNN architecture outperforms other models such as LGBM, Extra Tree regressor and linear regression when trained globally, but fails to replicate such performance when trained individually on each farm. We also observe that passing the spatial information from CNN to LGBM improves its performance, providing further evidence of CNN's spatial feature extraction capabilities.
translated by 谷歌翻译
风能供应的可变性可能会给将风力发电纳入网格系统带来重大挑战。因此,风力预测(WPF)已被广泛认为是风能整合和操作中最关键的问题之一。在过去的几十年中,关于风能预测问题的研究爆炸了。然而,如何很好地处理WPF问题仍然具有挑战性,因为始终要求高预测准确性以确保电网稳定性和供应的安全性。我们提出了独特的空间动态风能预测数据集:SDWPF,其中包括风力涡轮机的空间分布以及动态上下文因素。鉴于,大多数现有数据集只有少量的风力涡轮机,而无需以细粒度的时间尺度了解风力涡轮机的位置和上下文信息。相比之下,SDWPF提供了半年多的风力涡轮机的风能数据,其相对位置和内部地位。我们使用此数据集启动BAIDU KDD杯2022来检查当前WPF解决方案的极限。该数据集在https://aistudio.baidu.com/aistudio/competition/detail/152/0/datasets上发布。
translated by 谷歌翻译
With the evolution of power systems as it is becoming more intelligent and interactive system while increasing in flexibility with a larger penetration of renewable energy sources, demand prediction on a short-term resolution will inevitably become more and more crucial in designing and managing the future grid, especially when it comes to an individual household level. Projecting the demand for electricity for a single energy user, as opposed to the aggregated power consumption of residential load on a wide scale, is difficult because of a considerable number of volatile and uncertain factors. This paper proposes a customized GRU (Gated Recurrent Unit) and Long Short-Term Memory (LSTM) architecture to address this challenging problem. LSTM and GRU are comparatively newer and among the most well-adopted deep learning approaches. The electricity consumption datasets were obtained from individual household smart meters. The comparison shows that the LSTM model performs better for home-level forecasting than alternative prediction techniques-GRU in this case. To compare the NN-based models with contrast to the conventional statistical technique-based model, ARIMA based model was also developed and benchmarked with LSTM and GRU model outcomes in this study to show the performance of the proposed model on the collected time series data.
translated by 谷歌翻译
In this paper, we propose a new short-term load forecasting (STLF) model based on contextually enhanced hybrid and hierarchical architecture combining exponential smoothing (ES) and a recurrent neural network (RNN). The model is composed of two simultaneously trained tracks: the context track and the main track. The context track introduces additional information to the main track. It is extracted from representative series and dynamically modulated to adjust to the individual series forecasted by the main track. The RNN architecture consists of multiple recurrent layers stacked with hierarchical dilations and equipped with recently proposed attentive dilated recurrent cells. These cells enable the model to capture short-term, long-term and seasonal dependencies across time series as well as to weight dynamically the input information. The model produces both point forecasts and predictive intervals. The experimental part of the work performed on 35 forecasting problems shows that the proposed model outperforms in terms of accuracy its predecessor as well as standard statistical models and state-of-the-art machine learning models.
translated by 谷歌翻译
Forecasts by the European Centre for Medium-Range Weather Forecasts (ECMWF; EC for short) can provide a basis for the establishment of maritime-disaster warning systems, but they contain some systematic biases.The fifth-generation EC atmospheric reanalysis (ERA5) data have high accuracy, but are delayed by about 5 days. To overcome this issue, a spatiotemporal deep-learning method could be used for nonlinear mapping between EC and ERA5 data, which would improve the quality of EC wind forecast data in real time. In this study, we developed the Multi-Task-Double Encoder Trajectory Gated Recurrent Unit (MT-DETrajGRU) model, which uses an improved double-encoder forecaster architecture to model the spatiotemporal sequence of the U and V components of the wind field; we designed a multi-task learning loss function to correct wind speed and wind direction simultaneously using only one model. The study area was the western North Pacific (WNP), and real-time rolling bias corrections were made for 10-day wind-field forecasts released by the EC between December 2020 and November 2021, divided into four seasons. Compared with the original EC forecasts, after correction using the MT-DETrajGRU model the wind speed and wind direction biases in the four seasons were reduced by 8-11% and 9-14%, respectively. In addition, the proposed method modelled the data uniformly under different weather conditions. The correction performance under normal and typhoon conditions was comparable, indicating that the data-driven mode constructed here is robust and generalizable.
translated by 谷歌翻译
我们在在线环境中研究了非线性预测,并引入了混合模型,该模型通过端到端体系结构有效地减轻了对手工设计的功能的需求和传统非线性预测/回归方法的手动模型选择问题。特别是,我们使用递归结构从顺序信号中提取特征,同时保留状态信息,即历史记录和增强决策树以产生最终输出。该连接是以端到端方式的,我们使用随机梯度下降共同优化整个体系结构,我们还为此提供了向后的通过更新方程。特别是,我们采用了一个经常性的神经网络(LSTM)来从顺序数据中提取自适应特征,并提取梯度增强机械(Soft GBDT),以进行有效的监督回归。我们的框架是通用的,因此可以使用其他深度学习体系结构进行特征提取(例如RNN和GRU)和机器学习算法进行决策,只要它们是可区分的。我们证明了算法对合成数据的学习行为以及各种现实生活数据集对常规方法的显着性能改进。此外,我们公开分享提出的方法的源代码,以促进进一步的研究。
translated by 谷歌翻译
A well-performing prediction model is vital for a recommendation system suggesting actions for energy-efficient consumer behavior. However, reliable and accurate predictions depend on informative features and a suitable model design to perform well and robustly across different households and appliances. Moreover, customers' unjustifiably high expectations of accurate predictions may discourage them from using the system in the long term. In this paper, we design a three-step forecasting framework to assess predictability, engineering features, and deep learning architectures to forecast 24 hourly load values. First, our predictability analysis provides a tool for expectation management to cushion customers' anticipations. Second, we design several new weather-, time- and appliance-related parameters for the modeling procedure and test their contribution to the model's prediction performance. Third, we examine six deep learning techniques and compare them to tree- and support vector regression benchmarks. We develop a robust and accurate model for the appliance-level load prediction based on four datasets from four different regions (US, UK, Austria, and Canada) with an equal set of appliances. The empirical results show that cyclical encoding of time features and weather indicators alongside a long-short term memory (LSTM) model offer the optimal performance.
translated by 谷歌翻译
21世纪的现代旅游面临着许多挑战。这些挑战之一是太空有限地区的游客数量迅速增长,例如历史城市中心,博物馆或地理瓶颈,例如狭窄的山谷。在这种情况下,对特定领域内的旅游量和旅游流程的正确准确预测对于游客管理任务,例如游客流量控制和预防人满为患至关重要。静态流量控制方法,例如限制对热点或使用常规低级控制器的访问,无法解决问题。在本文中,我们通过使用旅游区提供的可用粒状数据,并将结果与​​Arima进行比较,并将结果与​​Arima进行比较经典统计方法。我们的结果表明,与Arima方法相比,深度学习模型可以产生更好的预测,同时具有更快的推理时间和能够结合其他输入功能。
translated by 谷歌翻译
干扰风暴时间(DST)指数已被广泛用作环电流强度的代理,因此被用作地磁活性的量度。它是由地磁赤道区域中四个地面磁力计测量得出的。我们提出了一种新的模型,用于预测$ DST $,在1到6个小时之间提前时间。该模型首先是使用封闭式复发单元(GRU)网络开发的,该网络是使用太阳风参数训练的。然后,使用Ackrue方法估算$ DST $模型的不确定性[Camporeale等。2021]。最后,开发了一种多保真提升方法,以提高模型的准确性并降低其相关的不确定性。结果表明,开发的模型可以通过13.54 $ \ mathrm {nt} $的根平方(RMSE)(RMSE)预测$ DST $ 6小时。这比持久性模型和简单的GRU模型要好得多。
translated by 谷歌翻译
Spatio-temporal modeling as a canonical task of multivariate time series forecasting has been a significant research topic in AI community. To address the underlying heterogeneity and non-stationarity implied in the graph streams, in this study, we propose Spatio-Temporal Meta-Graph Learning as a novel Graph Structure Learning mechanism on spatio-temporal data. Specifically, we implement this idea into Meta-Graph Convolutional Recurrent Network (MegaCRN) by plugging the Meta-Graph Learner powered by a Meta-Node Bank into GCRN encoder-decoder. We conduct a comprehensive evaluation on two benchmark datasets (METR-LA and PEMS-BAY) and a large-scale spatio-temporal dataset that contains a variaty of non-stationary phenomena. Our model outperformed the state-of-the-arts to a large degree on all three datasets (over 27% MAE and 34% RMSE). Besides, through a series of qualitative evaluations, we demonstrate that our model can explicitly disentangle locations and time slots with different patterns and be robustly adaptive to different anomalous situations. Codes and datasets are available at https://github.com/deepkashiwa20/MegaCRN.
translated by 谷歌翻译
评估能源转型和能源市场自由化对资源充足性的影响是一种越来越重要和苛刻的任务。能量系统的上升复杂性需要足够的能量系统建模方法,从而提高计算要求。此外,随着复杂性,同样调用概率评估和场景分析同样增加不确定性。为了充分和高效地解决这些各种要求,需要来自数据科学领域的新方法来加速当前方法。通过我们的系统文献综述,我们希望缩小三个学科之间的差距(1)电力供应安全性评估,(2)人工智能和(3)实验设计。为此,我们对所选应用领域进行大规模的定量审查,并制作彼此不同学科的合成。在其他发现之外,我们使用基于AI的方法和应用程序的AI方法和应用来确定电力供应模型的复杂安全性的元素,并作为未充分涵盖的应用领域的储存调度和(非)可用性。我们结束了推出了一种新的方法管道,以便在评估电力供应安全评估时充分有效地解决当前和即将到来的挑战。
translated by 谷歌翻译
近年来,随着传感器和智能设备的广泛传播,物联网(IoT)系统的数据生成速度已大大增加。在物联网系统中,必须经常处理,转换和分析大量数据,以实现各种物联网服务和功能。机器学习(ML)方法已显示出其物联网数据分析的能力。但是,将ML模型应用于物联网数据分析任务仍然面临许多困难和挑战,特别是有效的模型选择,设计/调整和更新,这给经验丰富的数据科学家带来了巨大的需求。此外,物联网数据的动态性质可能引入概念漂移问题,从而导致模型性能降解。为了减少人类的努力,自动化机器学习(AUTOML)已成为一个流行的领域,旨在自动选择,构建,调整和更新机器学习模型,以在指定任务上实现最佳性能。在本文中,我们对Automl区域中模型选择,调整和更新过程中的现有方法进行了审查,以识别和总结将ML算法应用于IoT数据分析的每个步骤的最佳解决方案。为了证明我们的发现并帮助工业用户和研究人员更好地实施汽车方法,在这项工作中提出了将汽车应用于IoT异常检测问题的案例研究。最后,我们讨论并分类了该领域的挑战和研究方向。
translated by 谷歌翻译
PV power forecasting models are predominantly based on machine learning algorithms which do not provide any insight into or explanation about their predictions (black boxes). Therefore, their direct implementation in environments where transparency is required, and the trust associated with their predictions may be questioned. To this end, we propose a two stage probabilistic forecasting framework able to generate highly accurate, reliable, and sharp forecasts yet offering full transparency on both the point forecasts and the prediction intervals (PIs). In the first stage, we exploit natural gradient boosting (NGBoost) for yielding probabilistic forecasts, while in the second stage, we calculate the Shapley additive explanation (SHAP) values in order to fully comprehend why a prediction was made. To highlight the performance and the applicability of the proposed framework, real data from two PV parks located in Southern Germany are employed. Comparative results with two state-of-the-art algorithms, namely Gaussian process and lower upper bound estimation, manifest a significant increase in the point forecast accuracy and in the overall probabilistic performance. Most importantly, a detailed analysis of the model's complex nonlinear relationships and interaction effects between the various features is presented. This allows interpreting the model, identifying some learned physical properties, explaining individual predictions, reducing the computational requirements for the training without jeopardizing the model accuracy, detecting possible bugs, and gaining trust in the model. Finally, we conclude that the model was able to develop complex nonlinear relationships which follow known physical properties as well as human logic and intuition.
translated by 谷歌翻译
COVID-19大流行对全球医疗保健系统造成了沉重的负担,并造成了巨大的社会破坏和经济损失。已经提出了许多深度学习模型来执行临床预测任务,例如使用电子健康记录(EHR)数据在重症监护病房中为Covid-19患者的死亡率预测。尽管在某些临床应用中取得了最初的成功,但目前缺乏基准测试结果来获得公平的比较,因此我们可以选择最佳模型以供临床使用。此外,传统预测任务的制定与重症监护现实世界的临床实践之间存在差异。为了填补这些空白,我们提出了两项​​临床预测任务,特定于结局的预测和重症监护病房中的COVID-19患者的早期死亡率预测。这两个任务是根据幼稚的停车时间和死亡率预测任务的改编,以适应COVID-19患者的临床实践。我们提出了公平,详细的开源数据预处管道,并评估了两项任务的17个最先进的预测模型,包括5个机器学习模型,6种基本的深度学习模型和6种专门为EHR设计的深度学习预测模型数据。我们使用来自两个现实世界Covid-19 EHR数据集的数据提供基准测试结果。这两个数据集都可以公开可用,而无需任何查询,并且可以根据要求访问一个数据集。我们为两项任务提供公平,可重复的基准测试结果。我们在在线平台上部署所有实验结果和模型。我们还允许临床医生和研究人员将其数据上传到平台上,并使用训练有素的模型快速获得预测结果。我们希望我们的努力能够进一步促进Covid-19预测建模的深度学习和机器学习研究。
translated by 谷歌翻译
时间序列预测在城市生活中广泛应用,从空气质量监测到交通分析。但是,准确的时间序列预测是具有挑战性的,因为现实世界中的时间序列遇到了分配转移问题,在该问题中,它们的统计属性会随着时间而变化。尽管对域适应或概括的分布变化的广泛解决方案,但它们在未知的,不断变化的分布变化中无法有效发挥作用,这在时间序列中很常见。在本文中,我们提出了超时性预测(HTSF),这是一个基于超网络的框架,用于在分配变化下预测准确的时间序列。 HTSF以端到端的方式共同学习时间变化的分布和相应的预测模型。具体而言,HTSF利用超层来学习分布移位的最佳表征,从而为主层生成模型参数以进行准确的预测。我们将HTSF实施为可扩展的框架,可以结合不同的时间序列预测模型,例如RNN和Transformers。对9个基准测试的广泛实验表明,HTSF达到了最先进的表现。
translated by 谷歌翻译
预测基金绩效对投资者和基金经理都是有益的,但这是一项艰巨的任务。在本文中,我们测试了深度学习模型是否比传统统计技术更准确地预测基金绩效。基金绩效通常通过Sharpe比率进行评估,该比例代表了风险调整的绩效,以确保基金之间有意义的可比性。我们根据每月收益率数据序列数据计算了年度夏普比率,该数据的时间序列数据为600多个投资于美国上市大型股票的开放式共同基金投资。我们发现,经过现代贝叶斯优化训练的长期短期记忆(LSTM)和封闭式复发单元(GRUS)深度学习方法比传统统计量相比,预测基金的Sharpe比率更高。结合了LSTM和GRU的预测的合奏方法,可以实现所有模型的最佳性能。有证据表明,深度学习和结合能提供有希望的解决方案,以应对基金绩效预测的挑战。
translated by 谷歌翻译
电力是一种波动的电源,需要短期和长期的精力计划和资源管理。更具体地说,在短期,准确的即时能源消耗中,预测极大地提高了建筑物的效率,为采用可再生能源提供了新的途径。在这方面,数据驱动的方法,即基于机器学习的方法,开始优先于更传统的方法,因为它们不仅提供了更简化的部署方式,而且还提供了最新的结果。从这个意义上讲,这项工作应用和比较了几种深度学习算法,LSTM,CNN,CNN-LSTM和TCN的性能,在制造业内的一个真实测试中。实验结果表明,TCN是预测短期即时能源消耗的最可靠方法。
translated by 谷歌翻译
信息爆炸的时代促使累积巨大的时间序列数据,包括静止和非静止时间序列数据。最先进的算法在处理静止时间数据方面取得了体面的性能。然而,解决静止​​时间系列的传统算法不适用于外汇交易的非静止系列。本文调查了适用的模型,可以提高预测未来非静止时间序列序列趋势的准确性。特别是,我们专注于识别潜在模型,并调查识别模式从历史数据的影响。我们提出了基于RNN的\ Rebuttal {The} SEQ2Seq模型的组合,以及通过动态时间翘曲和Zigzag峰谷指示器提取的注重机制和富集的集合特征。定制损失函数和评估指标旨在更加关注预测序列的峰值和谷点。我们的研究结果表明,我们的模型可以在外汇数据集中预测高精度的4小时未来趋势,这在逼真的情况下至关重要,以协助外汇交易决策。我们进一步提供了对各种损失函数,评估指标,模型变体和组件对模型性能的影响的评估。
translated by 谷歌翻译
Algorithms that involve both forecasting and optimization are at the core of solutions to many difficult real-world problems, such as in supply chains (inventory optimization), traffic, and in the transition towards carbon-free energy generation in battery/load/production scheduling in sustainable energy systems. Typically, in these scenarios we want to solve an optimization problem that depends on unknown future values, which therefore need to be forecast. As both forecasting and optimization are difficult problems in their own right, relatively few research has been done in this area. This paper presents the findings of the ``IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling," held in 2021. We present a comparison and evaluation of the seven highest-ranked solutions in the competition, to provide researchers with a benchmark problem and to establish the state of the art for this benchmark, with the aim to foster and facilitate research in this area. The competition used data from the Monash Microgrid, as well as weather data and energy market data. It then focused on two main challenges: forecasting renewable energy production and demand, and obtaining an optimal schedule for the activities (lectures) and on-site batteries that lead to the lowest cost of energy. The most accurate forecasts were obtained by gradient-boosted tree and random forest models, and optimization was mostly performed using mixed integer linear and quadratic programming. The winning method predicted different scenarios and optimized over all scenarios jointly using a sample average approximation method.
translated by 谷歌翻译