Smart Paper Notes

An Attention Free Long Short-Term Memory for Time Series Forecasting

Hugo Inzirillo , Ludovic De Villelongue

Categories: Machine Learning

2022-09-20

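Deep learning plays an increasingly important role in time series analysis. We focus on time series forecasting with an attention-free mechanism, a more efficient framework, and propose a new architecture for time series prediction in cases where the time dependence appears hard to capture. The architecture is built from attention-free LSTM layers and outperforms linear models for conditional variance prediction. Our findings confirm the validity of the model, which also improves the predictive capacity of an LSTM while making the learning task more efficient.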

translated by Google Translate

Trading with the Momentum Transformer: An Intelligent and Interpretable Architecture

Kieran Wood , Sven Giegerich , Stephen Roberts , Stefan Zohren

Categories: Machine Learning | Machine Learning (Statistics)

2021-12-16

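Deep learning architectures, specifically Deep Momentum Networks (DMNs) [1904.04912], have been found to be an effective approach to momentum and mean-reversion trading. However, some of the key challenges in recent years involve learning long-term dependencies, degradation of performance when considering returns net of transaction costs, and adapting to new market regimes, notably during the SARS-CoV-2 crisis. Attention mechanisms, or Transformer-based architectures, are a solution to such challenges because they allow the network to focus on significant time steps in the past and on longer-term patterns. We introduce the Momentum Transformer, an attention-based architecture that outperforms the benchmarks and is inherently interpretable, giving us greater insight into our deep learning trading strategy. Our model is an extension of the LSTM-based DMN, which directly outputs position sizing by optimising the network on a risk-adjusted performance metric such as the Sharpe ratio. We find that an attention-LSTM hybrid, decoder-only Temporal Fusion Transformer (TFT) style architecture is the best-performing model. In terms of interpretability, we observe remarkable structure in the attention patterns, with significant peaks of importance at momentum turning points. The time series is thus segmented into regimes, and the model tends to focus on previous time steps in similar regimes. We find that changepoint detection (CPD) [2105.13727], another technique for responding to regime change, can complement multi-headed attention, especially when we run CPD at multiple timescales. By adding an interpretable variable selection network, we observe how CPD helps our model move away from trading predominantly on daily returns data. We note that the model can intelligently switch between, and blend, classical strategies, basing its decision on patterns in the data.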

Predicting Performances of Mutual Funds using Deep Learning and Ensemble Techniques

Nghia Chu , Binh Dao , Nga Pham , Huy Nguyen , Hien Tran

Categories: Machine Learning

2022-09-18

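Predicting fund performance is beneficial to both investors and fund managers, yet it is a challenging task. In this paper, we test whether deep learning models can predict fund performance more accurately than traditional statistical techniques. Fund performance is typically evaluated by the Sharpe ratio, which represents risk-adjusted performance and ensures meaningful comparability across funds. We calculate annualised Sharpe ratios from monthly return time series for more than 600 open-end mutual funds investing in listed large-cap equities in the United States. We find that long short-term memory (LSTM) and gated recurrent unit (GRU) deep learning methods, both trained with modern Bayesian optimization, forecast funds' Sharpe ratios more accurately than traditional statistical methods. An ensemble method that combines the forecasts of the LSTM and GRU models achieves the best performance of all models. There is evidence that deep learning and ensembling offer promising solutions to the challenge of fund performance forecasting.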
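As a rough illustration of the target quantity in this abstract, the sketch below computes an annualised Sharpe ratio from monthly returns. The abstract does not state the exact convention used, so the square-root-of-12 scaling, the sample standard deviation, and the function name `annualized_sharpe` are all assumptions made here for illustration.

```python
import math
import statistics


def annualized_sharpe(monthly_returns, monthly_risk_free=0.0):
    """Annualised Sharpe ratio from monthly returns.

    Assumption: annualise by multiplying the monthly ratio by sqrt(12),
    a common convention that the abstract does not confirm.
    """
    # Excess return over the (assumed constant) monthly risk-free rate.
    excess = [r - monthly_risk_free for r in monthly_returns]
    mean_excess = statistics.mean(excess)
    std_excess = statistics.stdev(excess)  # sample standard deviation
    if std_excess == 0:
        raise ValueError("returns have zero variance; Sharpe ratio is undefined")
    return math.sqrt(12) * mean_excess / std_excess


if __name__ == "__main__":
    # Hypothetical monthly returns, for illustration only.
    print(annualized_sharpe([0.01, 0.03, -0.01, 0.02]))
```

A forecasting model in this setting would take a window of past values of this ratio (or of the underlying returns) as input and predict its future value.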

Short-term Prediction of Household Electricity Consumption Using Customized LSTM and GRU Models

Saad Emshagin , Wayes Koroni Halim , Rasha Kashef

Categories: Machine Learning | Neural and Evolutionary Computing

2022-12-16

As power systems become more intelligent, interactive, and flexible, with a larger penetration of renewable energy sources, short-term demand prediction will inevitably become more and more crucial in designing and managing the future grid, especially at the individual household level. Projecting the demand for electricity for a single energy user, as opposed to the aggregated power consumption of residential load on a wide scale, is difficult because of a considerable number of volatile and uncertain factors. This paper proposes a customized GRU (Gated Recurrent Unit) and Long Short-Term Memory (LSTM) architecture to address this challenging problem. LSTM and GRU are comparatively newer and among the most well-adopted deep learning approaches. The electricity consumption datasets were obtained from individual household smart meters. The comparison shows that the LSTM model performs better for home-level forecasting than the alternative prediction technique (GRU in this case). To compare the NN-based models with a conventional statistical technique, an ARIMA-based model was also developed and benchmarked against the LSTM and GRU model outcomes in this study, to show the performance of the proposed model on the collected time series data.

Deep Transformer Model with Pre-Layer Normalization for COVID-19 Growth Prediction

Rizki Ramadhan Fitra , Novanto Yudistira , Wayan Firdaus Mahmudy

Categories: Machine Learning | Artificial Intelligence

2022-07-10

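Coronavirus disease, or COVID-19, is an infectious disease caused by the SARS-CoV-2 virus. The first confirmed case caused by this virus was found at the end of December 2019 in Wuhan, China. Cases then spread around the world, including Indonesia. As a result, COVID-19 was designated a global pandemic by the WHO. The growth of COVID-19 cases, especially in Indonesia, can be predicted using several approaches, such as deep neural networks (DNNs). One DNN model that can be used is the Deep Transformer, which can predict time series. The model is trained under several test scenarios to obtain the best model. The evaluation seeks the best hyperparameters. Further evaluation is then carried out with the best hyperparameter settings, covering the number of prediction days, the optimizer, the number of features, and a comparison with the earlier long short-term memory (LSTM) and recurrent neural network (RNN) models. All evaluations use the mean absolute percentage error (MAPE) metric. Based on the results, the Deep Transformer produces the best results when using Pre-Layer Normalization and predicting one day ahead, with a MAPE value of 18.83. Furthermore, the model trained with the Adamax optimizer obtains the best performance among the tested optimizers. The Deep Transformer also outperforms the other tested models, LSTM and RNN.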
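For readers unfamiliar with the evaluation metric named in this abstract, here is a minimal sketch of the mean absolute percentage error (MAPE) in its textbook form. The function name `mape` and the sample numbers are illustrative assumptions; the paper may handle zero-valued targets or scaling differently.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (textbook definition).

    Assumption: no actual value is zero; the source does not say how
    zero targets are treated.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must not be empty")
    # Absolute relative error of each prediction.
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(terms) / len(terms)


if __name__ == "__main__":
    # Hypothetical daily case counts and forecasts, for illustration only.
    print(mape([100, 200], [110, 180]))
```

Because each error is divided by the actual value, MAPE is scale-free, which makes it convenient for comparing forecasts across series of different magnitude.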

Traffic Flow Prediction via Variational Bayesian Inference-based Encoder-Decoder Framework

Jianlei Kong , Xiaomeng Fan , Xue-Bo Jin , Min Zuo

Categories: Machine Learning

2022-12-14

Accurate traffic flow prediction, a hotspot for intelligent transportation research, is the prerequisite for mastering traffic and making travel plans. The speed of traffic flow can be affected by road conditions, weather, holidays, etc. Furthermore, the sensors that capture information about traffic flow are subject to interference from environmental factors such as illumination, collection time, occlusion, etc. Therefore, the traffic flow in a practical transportation system is complicated, uncertain, and challenging to predict accurately. This paper proposes a deep encoder-decoder prediction framework based on variational Bayesian inference. A Bayesian neural network is constructed by combining variational inference with gated recurrent units (GRU) and used as the deep neural network unit of the encoder-decoder framework to mine the intrinsic dynamics of traffic flow. Then, variational inference is introduced into the multi-head attention mechanism to avoid noise-induced deterioration of prediction accuracy. The proposed model achieves superior prediction performance on the Guangzhou urban traffic flow dataset over the benchmarks, particularly for long-term prediction.

Bi-LSTM Price Prediction based on Attention Mechanism

Jiashu Lou , Leyi Cui , Ye Li

Categories: Machine Learning

2022-12-07

With the increasing enrichment and development of the financial derivatives market, transactions are becoming ever more frequent. Due to human limitations, algorithmic and automatic trading have recently become the focus of discussion. In this paper, we propose a bidirectional LSTM neural network based on an attention mechanism and apply it to two popular assets, gold and bitcoin. In terms of feature engineering, we add traditional technical factors and also combine time series models to develop factors. In the selection of model parameters, we finally chose a two-layer deep learning network. According to the AUC measurement, the accuracy for bitcoin and gold is 71.94% and 73.03% respectively. Using the forecast results, we achieved a return of 1089.34% in two years. We also compare the attention Bi-LSTM model proposed in this paper with the traditional model, and the results show that our model has the best performance on this data set. Finally, we discuss the significance of the model and the experimental results, as well as possible directions for future improvement.

Deep learning for laboratory earthquake prediction and autoregressive forecasting of fault zone stress

Laura Laurenti , Elisa Tinti , Fabio Galasso , Luca Franco , Chris Marone

Categories: Computer Vision

2022-03-24

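Earthquake forecasting and prediction have long, and in some cases sordid, histories, but recent work has rekindled interest based on advances in early warning, hazard assessment for induced seismicity, and successful prediction of laboratory earthquakes. In the lab, frictional stick-slip events provide an analog for earthquakes and the seismic cycle. Labquakes are ideal targets for machine learning (ML) because they can be produced in long sequences under controlled conditions. Recent works show that ML can predict several aspects of labquakes using fault zone acoustic emissions. Here, we generalize these results and explore deep learning (DL) methods for labquake prediction and autoregressive (AR) forecasting. DL improves existing methods of labquake prediction. AR methods allow forecasting at future horizons via iterative predictions. We demonstrate that DL models based on long short-term memory (LSTM) and convolutional neural networks can predict labquakes under several conditions, and that fault zone stress can be predicted with fidelity, confirming that acoustic energy is a fingerprint of fault zone stress. We also predict the time to start of failure (TTsF) and the time to end of failure (TTeF) for labquakes. Interestingly, TTeF is successfully predicted in all seismic cycles, while the TTsF prediction varies with the amount of preseismic fault creep. We report AR methods to forecast the evolution of fault stress using three sequence modeling frameworks: LSTM, Temporal Convolution Network, and Transformer Network. AR forecasting is distinct from existing predictive models, which predict a target variable only at a specific time. The results for forecasting beyond a single seismic cycle are limited but encouraging. Our ML/DL models outperform the state of the art, and our autoregressive model represents a novel framework that could enhance current methods of earthquake forecasting.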
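The phrase "forecasting at future horizons via iterative predictions" can be sketched generically as a rollout loop: a one-step model predicts the next value, which is then fed back as input for the following step. The function `autoregressive_forecast`, its parameters, and the toy one-step model are hypothetical illustrations; the paper's own models are LSTM, TCN, and Transformer networks, which are not reproduced here.

```python
def autoregressive_forecast(one_step_model, history, horizon, window):
    """Roll a one-step predictor forward `horizon` steps.

    `one_step_model` is any callable mapping a list of recent values to
    the next value (a stand-in for a trained network).
    """
    # Start from the most recent `window` observations.
    context = list(history[-window:])
    forecasts = []
    for _ in range(horizon):
        next_value = one_step_model(context)
        forecasts.append(next_value)
        # Slide the window: drop the oldest value, append the prediction.
        context = context[1:] + [next_value]
    return forecasts


if __name__ == "__main__":
    # Toy stand-in model: continue the last observed linear trend.
    def trend_model(ctx):
        return ctx[-1] + (ctx[-1] - ctx[-2])

    print(autoregressive_forecast(trend_model, [1.0, 2.0, 3.0], horizon=3, window=2))
```

Since each prediction becomes an input to the next, errors can accumulate with the horizon, which is consistent with the abstract's remark that results beyond a single seismic cycle are limited.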

A Hybrid Framework for Sequential Data Prediction with End-to-End Optimization

Mustafa E. Aydın , Suleyman S. Kozat

Categories: Machine Learning (Statistics) | Machine Learning

2022-03-25

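We investigate nonlinear prediction in an online setting and introduce a hybrid model that, through an end-to-end architecture, effectively mitigates the need for hand-designed features and the manual model selection issues of conventional nonlinear prediction/regression methods. In particular, we use recursive structures to extract features from sequential signals while preserving the state information, i.e., the history, and boosted decision trees to produce the final output. The two are connected in an end-to-end fashion, and we jointly optimize the whole architecture using stochastic gradient descent, for which we also provide the backward pass update equations. Specifically, we employ a recurrent neural network (LSTM) for adaptive feature extraction from sequential data and a gradient boosting machinery (soft GBDT) for effective supervised regression. Our framework is generic, so other deep learning architectures (such as RNNs and GRUs) can be used for feature extraction and other machine learning algorithms for decision making, as long as they are differentiable. We demonstrate the learning behavior of our algorithm on synthetic data and significant performance improvements over conventional methods on various real-life datasets. Furthermore, we openly share the source code of the proposed method to facilitate further research.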

CRU: A Novel Neural Architecture for Improving the Predictive Performance of Time-Series Data

Sunghyun Sim , Dohee Kim , Hyerim Bae

Categories: Machine Learning | Artificial Intelligence

2022-11-30

The time-series forecasting (TSF) problem is a traditional problem in the field of artificial intelligence. Models such as the Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) have contributed to improving the predictive accuracy of TSF. Furthermore, model structures have been proposed that incorporate time-series decomposition methods, such as seasonal-trend decomposition using Loess (STL), to improve predictive accuracy. However, because this approach trains an independent model for each component, it cannot learn the relationships between time-series components. In this study, we propose a new neural architecture called a correlation recurrent unit (CRU) that can perform time series decomposition within a neural cell and learn correlations (autocorrelation and correlation) between the decomposition components. The proposed neural architecture was evaluated through comparative experiments with previous studies using five univariate time-series datasets and four multivariate time-series datasets. The results showed that long- and short-term predictive performance was improved by more than 10%. The experimental results show that the proposed CRU is an excellent method for TSF problems compared to other neural architectures.

EgPDE-Net: Building Continuous Neural Networks for Time Series Prediction with Exogenous Variables

Penglei Gao , Xi Yang , Kaizhu Huang , Rui Zhang , Ping Guo , John Y. Goulermas

Categories: Machine Learning

2022-08-03

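While exogenous variables have a major impact on performance improvement in time series analysis, the inter-series correlation and time dependence among them are rarely considered in current continuous methods. The dynamical systems of multivariate time series can be modelled with complex unknown partial differential equations (PDEs), which play a prominent role in many disciplines of science and engineering. In this paper, we propose a continuous-time model for arbitrary-step prediction that learns an unknown PDE system in multivariate time series, whose governing equations are parameterised by self-attention and gated recurrent neural networks. The proposed model takes into account the exogenous variables and their effects on the target series. Importantly, with specially designed regularisation guidance, the model can be reduced to a regularised ordinary differential equation (ODE) problem, which makes the PDE problem tractable for numerical solution and makes it feasible to predict multiple future values of the target series. Extensive experiments demonstrate that the proposed model achieves competitive accuracy against strong baselines: on average, it outperforms the best baseline by reducing RMSE by 9.85% and MAE by 13.98% for arbitrary-step prediction.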

ARMA Cell: A Modular and Effective Approach for Neural Autoregressive Modeling

Philipp Schiele , Christoph Berninger , David Rügamer

Categories: Machine Learning | Neural and Evolutionary Computing | Machine Learning (Statistics)

2022-08-31

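The autoregressive moving average (ARMA) model is a classical, and arguably one of the most studied, approaches to modelling time series data. It has compelling theoretical properties and is widely used among practitioners. More recent deep learning approaches have popularized recurrent neural networks (RNNs) and, in particular, long short-term memory (LSTM) cells, which have become one of the best-performing and most common building blocks in neural time series modeling. While advantageous for time series data or sequences with long-term effects, complex RNN cells are not always a must and can sometimes even be inferior to simpler recurrent approaches. In this work, we introduce the ARMA cell, a simpler, modular, and effective approach for time series modeling in neural networks. This cell can be used in any neural network architecture where recurrent structures are present, and it naturally handles multivariate time series using vector autoregression. We also introduce the ConvARMA cell as a natural successor for spatially correlated time series. Our experiments show that the proposed methodology is competitive with popular alternatives in terms of performance while being more robust and compelling due to its simplicity.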

Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting

Haoyi Zhou , Shanghang Zhang , Jieqi Peng , Shuai Zhang , Jianxin Li , Hui Xiong , Wancai Zhang

Categories:

2020-12-14

Many real-world applications require the prediction of long sequence time-series, such as electricity consumption planning. Long sequence time-series forecasting (LSTF) demands a high prediction capacity of the model, which is the ability to capture precise long-range dependency coupling between output and input efficiently. Recent studies have shown the potential of Transformer to increase the prediction capacity. However, there are several severe issues with Transformer that prevent it from being directly applicable to LSTF, including quadratic time complexity, high memory usage, and inherent limitation of the encoder-decoder architecture. To address these issues, we design an efficient transformer-based model for LSTF, named Informer, with three distinctive characteristics: (i) a ProbSparse self-attention mechanism, which achieves O(L log L) in time complexity and memory usage, and has comparable performance on sequences' dependency alignment. (ii) the self-attention distilling highlights dominating attention by halving cascading layer input, and efficiently handles extreme long input sequences. (iii) the generative style decoder, while conceptually simple, predicts the long time-series sequences at one forward operation rather than a step-by-step way, which drastically improves the inference speed of long-sequence predictions. Extensive experiments on four large-scale datasets demonstrate that Informer significantly outperforms existing methods and provides a new solution to the LSTF problem.

Contextually Enhanced ES-dRNN with Dynamic Attention for Short-Term Load Forecasting

Slawek Smyl , Grzegorz Dudek , Paweł Pełka

Categories: Machine Learning | Artificial Intelligence | Neural and Evolutionary Computing

2022-12-18

In this paper, we propose a new short-term load forecasting (STLF) model based on contextually enhanced hybrid and hierarchical architecture combining exponential smoothing (ES) and a recurrent neural network (RNN). The model is composed of two simultaneously trained tracks: the context track and the main track. The context track introduces additional information to the main track. It is extracted from representative series and dynamically modulated to adjust to the individual series forecasted by the main track. The RNN architecture consists of multiple recurrent layers stacked with hierarchical dilations and equipped with recently proposed attentive dilated recurrent cells. These cells enable the model to capture short-term, long-term and seasonal dependencies across time series as well as to weight dynamically the input information. The model produces both point forecasts and predictive intervals. The experimental part of the work performed on 35 forecasting problems shows that the proposed model outperforms in terms of accuracy its predecessor as well as standard statistical models and state-of-the-art machine learning models.

Two ways towards combining Sequential Neural Network and Statistical Methods to Improve the Prediction of Time Series

Jingwei Li

Categories: Machine Learning

2021-09-30

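Statistical modeling and data-driven learning are two important fields that attract much attention. Statistical models aim to capture and interpret the relationships among variables, while data-based learning attempts to extract information directly from the data without pre-processing through complex models. Given the extensive studies in both fields, a subtle issue is how to properly integrate data-based methods with existing knowledge or models. In this paper, based on time series data, we propose two different directions for integrating the two: a decomposition-based method and a method that exploits the statistical extraction of data features. The first decomposes the data into linear stable, nonlinear stable, and unstable parts, where suitable statistical models are used for the linear stable and nonlinear stable parts while appropriate machine learning tools are used for the unstable parts. The second applies statistical models to extract statistical features of the data and feeds them as additional inputs into the machine learning platform for training. The most critical and challenging aspect is how to determine and extract valuable information from mathematical or statistical models to boost the performance of machine learning algorithms. We evaluate the proposal using time series data with varying degrees of stability. Performance results show that both methods can outperform existing schemes that use models and learning separately, and the improvements can exceed 60%. Both proposed methods are promising for bridging the gap between model-based and data-driven schemes and integrating the two to provide overall higher learning performance.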

On the universality of the volatility formation process: when machine learning and rough volatility agree

Mathieu Rosenbaum , Jianfei Zhang

Categories: Machine Learning | Machine Learning (Statistics)

2022-06-28

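We train an LSTM network on a pooled dataset made of hundreds of liquid stocks, aiming to forecast the next daily realized volatility for all stocks. Showing the consistent outperformance of this universal LSTM relative to other asset-specific parametric models, we uncover nonparametric evidence of a universal volatility formation mechanism that relates past market realizations, including daily returns and volatilities, to current volatilities. A parsimonious parametric forecasting device that combines the rough fractional stochastic volatility and quadratic rough Heston models with fixed parameters reaches the same level of performance as the universal LSTM, which confirms the universality of the volatility formation process from a parametric perspective.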

Paying Attention to Astronomical Transients: Introducing the Time-series Transformer for Photometric Classification

Tarek Allam Jr. , Jason D. McEwen

Categories: Machine Learning

2021-05-13

Future surveys such as the Legacy Survey of Space and Time (LSST) of the Vera C. Rubin Observatory will observe an order of magnitude more astrophysical transient events than any previous survey. With this deluge of photometric data, it will be impossible for all such events to be classified by humans alone. Recent efforts have sought to leverage machine learning methods to tackle the challenge of astronomical transient classification, with ever-improving success. Transformers are a recently developed deep learning architecture, first proposed for natural language processing, that have shown a great deal of recent success. In this work we develop a new transformer architecture, which uses multi-head self-attention at its core, for general multivariate time-series data. Furthermore, the proposed time-series transformer architecture supports the inclusion of an arbitrary number of additional features, while also offering interpretability. We apply the time-series transformer to the task of photometric classification, minimising the reliance on expert domain knowledge for feature selection, while achieving results comparable to state-of-the-art photometric classification methods. We achieve a logarithmic loss of 0.507 on imbalanced data in a representative setting using data from the Photometric LSST Astronomical Time-Series Classification Challenge (PLAsTiCC). Moreover, we achieve a micro-averaged receiver operating characteristic area under curve of 0.98 and a micro-averaged precision-recall area under curve of 0.87.
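To make the "logarithmic loss" figure concrete, the following is a minimal sketch of the plain, unweighted multi-class log-loss. This is an assumption for illustration: the abstract does not give the formula, and the challenge's own metric may weight classes differently. The function name `log_loss` and the clipping constant `eps` are illustrative choices.

```python
import math


def log_loss(true_labels, predicted_probs, eps=1e-15):
    """Unweighted multi-class logarithmic loss.

    `true_labels[i]` is the integer class index of sample i and
    `predicted_probs[i]` is its list of per-class probabilities.
    """
    if len(true_labels) != len(predicted_probs) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        # Clip to avoid log(0) when a class is assigned zero probability.
        p = min(max(probs[label], eps), 1.0 - eps)
        total += -math.log(p)
    return total / len(true_labels)


if __name__ == "__main__":
    # Hypothetical two-class predictions, for illustration only.
    print(log_loss([0, 1], [[0.9, 0.1], [0.2, 0.8]]))
```

Lower values are better; a classifier that always outputs a uniform distribution over K classes scores ln(K) under this definition.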

Seasonal Encoder-Decoder Architecture for Forecasting

Avinash Achar , Soumen Pachal

Categories: Machine Learning | Artificial Intelligence

2022-07-08

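Deep learning (DL) in general, and recurrent neural networks (RNNs) in particular, have seen high levels of success in sequence-based applications. This paper pertains to RNNs for time series modelling and forecasting. We propose a novel RNN architecture that captures (stochastic) seasonal correlations while being capable of accurate multi-step forecasting. It is motivated by the well-known encoder-decoder (ED) architecture and the multiplicative seasonal autoregressive model. It incorporates multi-step (multi-target) learning both in the presence and in the absence of exogenous inputs. It can be employed on single or multiple sequence data. For the multiple sequence case, we also propose a novel greedy recursive procedure to build (one or more) predictive models across sequences when per-sequence data is scarce. We demonstrate through extensive experiments the utility of the proposed architecture in both single-sequence and multiple-sequence scenarios.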

A CNN-BiLSTM Model with Attention Mechanism for Earthquake Prediction

Parisa Kavianpour , Mohammadreza Kavianpour , Ehsan Jahani , Amin Ramezani

Categories: Machine Learning

2021-12-26

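Earthquakes, as natural phenomena, have continuously caused damage and loss of human life throughout history. Earthquake prediction is an essential aspect of any society's planning and can increase public preparedness and reduce damage to a great extent. Nevertheless, due to the stochastic character of earthquakes and the challenge of achieving an efficient and dependable model for earthquake prediction, efforts so far have been insufficient, and new methods are required to solve this problem. Aware of these issues, this paper proposes a novel prediction method based on an attention mechanism (AM), a convolutional neural network (CNN), and a bidirectional long short-term memory (BiLSTM) model, which can predict the number and maximum magnitude of earthquakes in each area of mainland China based on the earthquake catalog of the region. The model takes advantage of LSTM and CNN with an attention mechanism to better focus on effective earthquake characteristics and produce more accurate predictions. First, the zero-order hold technique is applied as pre-processing on the earthquake data, making the model's input data more suitable. Second, to use spatial information effectively and reduce the dimensionality of the input data, the CNN is used to capture the spatial dependencies in the earthquake data. Third, the Bi-LSTM layer is employed to capture the temporal dependencies. Fourth, the AM layer is introduced to highlight important features and achieve better prediction performance. The results show that the proposed method has better performance and generalization ability than other prediction methods.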
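The first pre-processing step named in this abstract, zero-order hold, can be sketched as carrying the last observed value forward across gaps. How the paper applies it to the earthquake catalog is not described here, so the representation of missing entries as `None`, the `initial` fallback value, and the function name `zero_order_hold` are assumptions for illustration.

```python
def zero_order_hold(values, initial=0.0):
    """Fill gaps by holding the most recent observed value.

    Assumption: missing samples are marked as None, and `initial` is
    used until the first observation arrives.
    """
    held = []
    last = initial
    for v in values:
        if v is not None:
            last = v  # a new observation replaces the held value
        held.append(last)
    return held


if __name__ == "__main__":
    # Hypothetical sparse series of magnitudes, for illustration only.
    print(zero_order_hold([None, 2.0, None, None, 5.0, None]))
```

The result is a piecewise-constant series with no missing entries, which is the kind of regular input a CNN or LSTM layer typically expects.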

Learning Non-Stationary Time-Series with Dynamic Pattern Extractions

Xipei Wang , Haoyu Zhang , Yuanbo Zhang , Meng Wang , Jiarui Song , Tin Lai , Matloob Khushi

Categories: Machine Learning | Artificial Intelligence

2021-11-20

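The era of information explosion has prompted the accumulation of a tremendous amount of time-series data, including stationary and non-stationary time-series data. State-of-the-art algorithms have achieved decent performance in dealing with stationary temporal data. However, traditional algorithms that tackle stationary time series do not apply to non-stationary series such as Forex trading. This paper investigates applicable models that can improve the accuracy of forecasting future trends of non-stationary time-series sequences. In particular, we focus on identifying potential models and investigate the effects of recognizing patterns from historical data. We propose a combination of the RNN-based seq2seq model with an attention mechanism and an enriched set of features extracted via dynamic time warping and zigzag peak-valley indicators. Customized loss functions and evaluation metrics are designed to focus more on the peaks and valley points of the predicted sequence. Our results show that our model can predict 4-hour future trends with high accuracy on the Forex dataset, which is crucial in realistic scenarios to assist foreign exchange trading decisions. We further provide evaluations of the effects of various loss functions, evaluation metrics, model variants, and components on model performance.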
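Dynamic time warping, one of the feature extractors named in this abstract, can be illustrated with the classic dynamic-programming recurrence. This is the textbook algorithm with an absolute-difference local cost, not necessarily the variant or the feature construction used in the paper; the function name `dtw_distance` is an illustrative choice.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two sequences.

    Uses |x - y| as the local cost and allows match, insertion, and
    deletion steps (no window constraint).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


if __name__ == "__main__":
    # The second series is the first with one value repeated, so the
    # warping path can align them exactly.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Unlike a pointwise distance, this measure tolerates local stretching in time, which is why it is often used to compare a recent price window against historical patterns of different length.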
