Coronavirus disease, or COVID-19, is an infectious disease caused by the SARS-CoV-2 virus. The first confirmed case caused by this virus was found in Wuhan, China, at the end of December 2019. The disease then spread throughout the world, including Indonesia, and COVID-19 was declared a global pandemic by the WHO. The growth of COVID-19 cases, especially in Indonesia, can be predicted using several approaches such as Deep Neural Networks (DNN). One DNN model that can be used is the Deep Transformer, which can forecast time series. The model is trained with several test scenarios to obtain the best model, where the evaluation searches for the best hyperparameters. Further evaluation was then carried out with the best hyperparameter settings, varying the number of days predicted, the optimizer, and the number of features, and comparing against previous models, namely Long Short-Term Memory (LSTM) and Recurrent Neural Networks (RNN). All evaluations use the Mean Absolute Percentage Error (MAPE) metric. Based on the results, the Deep Transformer produces the best results when using pre-layer normalization and predicting one day ahead, with a MAPE value of 18.83. In addition, the model trained with the Adamax optimizer achieves the best performance among the tested optimizers. The Deep Transformer also outperforms the other tested models, namely LSTM and RNN.
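For reference, the MAPE metric used throughout this evaluation can be computed as in the short sketch below (a generic illustration with made-up numbers, not the paper's data or code).

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# A MAPE of 18.83 means the forecasts deviate from the actual
# case counts by 18.83% on average.
print(mape([100, 120, 130], [110, 115, 140]))  # roughly 7.3
```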
The outbreak of COVID-19 in late 2019 was the start of a health crisis that shook the world and took millions of lives in the ensuing years. Many governments and health officials failed to arrest the rapid circulation of infection in their communities. The long incubation period and the large proportion of asymptomatic cases made COVID-19 particularly elusive to track. However, wastewater monitoring soon became a promising data source in addition to conventional indicators such as confirmed daily cases, hospitalizations, and deaths. Despite the consensus on the effectiveness of wastewater viral load data, there is a lack of methodological approaches that leverage viral load to improve COVID-19 forecasting. This paper proposes using deep learning to automatically discover the relationship between daily confirmed cases and viral load data. We trained a Deep Temporal Convolutional Network (DeepTCN) and a Temporal Fusion Transformer (TFT) model to build global forecasting models. We supplement the daily confirmed cases with viral loads and other socio-economic factors as covariates to the models. Our results suggest that TFT outperforms DeepTCN and learns a better association between viral load and daily cases. We demonstrated that equipping the models with the viral load improves their forecasting performance significantly. Moreover, viral load is shown to be the second most predictive input, following the containment and health index. Our results reveal the feasibility of training a location-agnostic deep-learning model to capture the dynamics of infection diffusion when wastewater viral load data is provided.
Time series, sets of sequences in chronological order, are essential data in statistical research with many forecasting applications. Although recent performance in many Transformer-based models has been noticeable, long multi-horizon time series forecasting remains a very challenging task. Going beyond transformers in sequence translation and transduction research, we observe the effects of down- and up-sampling that can nudge temporal saliency patterns to emerge in time sequences. Motivated by this observation, in this paper we propose a novel architecture, Temporal Saliency Detection (TSD), on top of the attention mechanism and apply it to multi-horizon time series prediction. We renovate the traditional encoder-decoder architecture by making a series of deep convolutional blocks work in tandem with the multi-head self-attention. The proposed TSD approach facilitates the multiresolution of saliency patterns upon condensed multi-heads, thus progressively enhancing complex time series forecasting. Experimental results illustrate that our proposed approach has significantly outperformed existing state-of-the-art methods across multiple standard benchmark datasets in many far-horizon forecasting settings. Overall, TSD achieves 31% and 46% relative improvement over the current state-of-the-art models in multivariate and univariate time series forecasting scenarios on standard benchmarks. The Git repository is available at https://github.com/duongtrung/time-series-temporal-saliency-patterns.
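As a rough illustration of the pattern the abstract describes, here is a minimal PyTorch sketch of a convolutional block that condenses the sequence before multi-head self-attention. Module and parameter names are our own assumptions, not the authors' TSD implementation.

```python
import torch
import torch.nn as nn

class ConvAttentionBlock(nn.Module):
    """Down-samples the sequence with a strided convolution, then applies
    multi-head self-attention to the condensed representation."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=3, stride=2, padding=1)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):                                 # x: (batch, seq_len, d_model)
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)  # halve the sequence length
        out, _ = self.attn(h, h, h)
        return self.norm(h + out)

x = torch.randn(8, 96, 64)
print(ConvAttentionBlock()(x).shape)  # torch.Size([8, 48, 64])
```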
To improve the security and reliability of wind energy production, short-term forecasting has become of the utmost importance. This study focuses on multi-step spatio-temporal wind speed forecasting for the Norwegian continental shelf. A graph neural network (GNN) architecture is used to extract spatial dependencies, with different update functions to learn temporal correlations. These update functions are implemented using different neural network architectures. One such architecture, the Transformer, has become increasingly popular for sequence modelling in recent years. Various alterations of the original architecture have been proposed to better facilitate time series forecasting, of which this study focuses on the Informer, the LogSparse Transformer, and the Autoformer. This is the first time the LogSparse Transformer and Autoformer have been applied to wind forecasting, and the first time any of these or the Informer have been formulated in a spatio-temporal setting for wind forecasting. By comparing against spatio-temporal Long Short-Term Memory (LSTM) and Multi-Layer Perceptron (MLP) models, the study shows that models using the altered Transformer architectures as update functions in GNNs are able to outperform these. Furthermore, we propose the Fast Fourier Transformer (FFTransformer), a novel Transformer architecture based on signal decomposition that consists of two separate streams analysing the trend and periodic components separately. The FFTransformer and Autoformer were found to achieve superior results for 10-minute and 1-hour-ahead forecasting, while the FFTransformer significantly outperformed all other models for 4-hour-ahead forecasts. Finally, by varying the degree of connectivity of the graph representation, the study explicitly demonstrates how all models are able to leverage spatial dependencies to improve local short-term wind speed forecasting.
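The abstract does not give the FFTransformer's decomposition details; as a generic illustration of the two-stream idea (a trend component and a periodic remainder), here is a simple frequency-domain split with NumPy. The cutoff `n_low` is an arbitrary assumption.

```python
import numpy as np

def fft_split(x, n_low=3):
    """Split a 1-D series into a low-frequency (trend-like) part and the
    remaining higher-frequency (periodic) part via the real FFT.
    n_low: number of lowest-frequency bins kept in the trend stream."""
    spec = np.fft.rfft(x)
    low = np.zeros_like(spec)
    low[:n_low] = spec[:n_low]
    trend = np.fft.irfft(low, n=len(x))
    periodic = x - trend
    return trend, periodic

t = np.arange(240)
series = 0.05 * t + np.sin(2 * np.pi * t / 24) + 0.1 * np.random.randn(240)
trend, periodic = fft_split(series)
```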
Surgical activity recognition and prediction can help provide important context in many Robot-Assisted Surgery (RAS) applications, for example, surgical progress monitoring and estimation, surgical skill evaluation, and shared control strategies during teleoperation. Transformer models were first developed for Natural Language Processing (NLP) to model word sequences and soon the method gained popularity for general sequence modeling tasks. In this paper, we propose the novel use of a Transformer model for three tasks: gesture recognition, gesture prediction, and trajectory prediction during RAS. We modify the original Transformer architecture to be able to generate the current gesture sequence, future gesture sequence, and future trajectory sequence estimations using only the current kinematic data of the surgical robot end-effectors. We evaluate our proposed models on the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS) and use Leave-One-User-Out (LOUO) cross-validation to ensure the generalizability of our results. Our models achieve up to 89.3% gesture recognition accuracy, 84.6% gesture prediction accuracy (1 second ahead) and 2.71mm trajectory prediction error (1 second ahead). Our models are comparable to and able to outperform state-of-the-art methods while using only the kinematic data channel. This approach can enable near-real time surgical activity recognition and prediction.
Developing an accurate tourism forecasting model is essential for making desirable policy decisions in tourism management. Early studies on tourism management focused on discovering external factors related to tourism demand. Recent studies have utilized deep learning for demand forecasting along with these external factors. They mainly use recursive neural network models such as LSTM and RNN frameworks. However, these models are not suitable for forecasting tourism demand, because tourism demand is strongly affected by changes in various external factors and recursive neural network models have limitations in handling such multivariate inputs. We propose a multi-head attention CNN model (MHAC) to address these limitations. MHAC uses a 1D convolutional neural network to analyze temporal patterns and an attention mechanism to reflect the correlations between input variables. The model can extract spatial features from the time-series data of various variables. We apply our forecasting framework to predict inbound tourism changes in South Korea by considering external factors such as politics, disease, season, and the attractiveness of Korean culture. The performance results of extensive experiments show that our method outperforms other deep-learning-based forecasting frameworks for Korean tourism forecasting.
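A minimal PyTorch sketch of the general pattern described above: attention weights over the exogenous input variables combined with a 1-D convolution over time. The module and its sizes are illustrative assumptions, not the MHAC implementation.

```python
import torch
import torch.nn as nn

class VariableAttentionCNN(nn.Module):
    """Scores each exogenous variable with a softmax attention weight,
    then extracts temporal patterns with a 1-D convolution."""
    def __init__(self, n_vars, hidden=32):
        super().__init__()
        self.var_score = nn.Linear(n_vars, n_vars)   # per-variable relevance
        self.conv = nn.Conv1d(n_vars, hidden, kernel_size=3, padding=1)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                             # x: (batch, time, n_vars)
        weights = torch.softmax(self.var_score(x.mean(dim=1)), dim=-1)
        x = x * weights.unsqueeze(1)                  # re-weight each variable
        h = torch.relu(self.conv(x.transpose(1, 2)))  # (batch, hidden, time)
        return self.head(h.mean(dim=-1))              # one-step demand forecast

model = VariableAttentionCNN(n_vars=6)
print(model(torch.randn(4, 52, 6)).shape)  # torch.Size([4, 1])
```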
An AI chatbot provides impressive responses after learning from a trained dataset. In this decade, most research work has shown that deep neural models outperform any other type of model. RNN models are regularly used for sequence-related problems, such as questions and their answers, an approach familiar to everyone as seq2seq learning. In the seq2seq mechanism, there is an encoder and a decoder: the encoder embeds the input sequence, and the decoder embeds the output sequence. To strengthen seq2seq model performance, attention is added to the encoder and decoder. After that, the Transformer model was introduced as a high-performance model with multiple attention mechanisms for solving sequence-related problems. This model reduces training time compared with RNN-based models and also achieves state-of-the-art performance for sequence transduction. In this study, we applied the Transformer model to a Bengali general-knowledge chatbot based on a Bengali general-knowledge question answering (QA) dataset. It scored 85.0 BLEU on the applied QA data. To compare against the Transformer model's performance, we trained a seq2seq model with attention on our dataset, which scored 23.5 BLEU.
Many real-world applications require the prediction of long sequence time-series, such as electricity consumption planning. Long sequence time-series forecasting (LSTF) demands a high prediction capacity of the model, which is the ability to capture precise long-range dependency coupling between output and input efficiently. Recent studies have shown the potential of Transformer to increase the prediction capacity. However, there are several severe issues with Transformer that prevent it from being directly applicable to LSTF, including quadratic time complexity, high memory usage, and inherent limitation of the encoder-decoder architecture. To address these issues, we design an efficient transformer-based model for LSTF, named Informer, with three distinctive characteristics: (i) a ProbSparse self-attention mechanism, which achieves O(L log L) in time complexity and memory usage, and has comparable performance on sequences' dependency alignment. (ii) the self-attention distilling highlights dominating attention by halving cascading layer input, and efficiently handles extreme long input sequences. (iii) the generative style decoder, while conceptually simple, predicts the long time-series sequences at one forward operation rather than a step-by-step way, which drastically improves the inference speed of long-sequence predictions. Extensive experiments on four large-scale datasets demonstrate that Informer significantly outperforms existing methods and provides a new solution to the LSTF problem.
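The distilling operation described in (ii) halves the sequence length between attention layers. A minimal sketch of such a halving step (convolution followed by strided max-pooling) is shown below; it follows the abstract's description, not Informer's exact implementation.

```python
import torch
import torch.nn as nn

class DistillingLayer(nn.Module):
    """Halves the temporal dimension between attention blocks so that
    memory use shrinks as the stack gets deeper."""
    def __init__(self, d_model):
        super().__init__()
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=3, padding=1)
        self.pool = nn.MaxPool1d(kernel_size=3, stride=2, padding=1)

    def forward(self, x):                      # x: (batch, seq_len, d_model)
        h = torch.relu(self.conv(x.transpose(1, 2)))
        return self.pool(h).transpose(1, 2)    # (batch, seq_len // 2, d_model)

print(DistillingLayer(64)(torch.randn(2, 96, 64)).shape)  # torch.Size([2, 48, 64])
```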
Time series forecasting is an important problem with many real-world applications. Ensembles of deep neural networks have recently achieved impressive forecasting accuracy, but such large ensembles are impractical in many real-world settings. Transformer models have been successfully applied to a diverse set of challenging problems. We propose a novel adaptation of the original Transformer architecture focused on the task of time series forecasting, called Persistence Initialization. The model is initialized as a naive persistence model by using a multiplicative gating mechanism combined with a residual skip connection. We use a decoder Transformer with ReZero normalization and Rotary positional encodings, but the adaptation is applicable to any auto-regressive neural network model. We evaluate the proposed architecture on the challenging M4 dataset, achieving competitive performance compared to ensemble-based methods. We also compare against recently proposed Transformer models for time series forecasting, showing superior performance on the M4 dataset. Extensive ablation studies show that Persistence Initialization leads to better performance and faster convergence. As the size of the model increases, only the models with our proposed adaptation gain in performance. We also perform an additional ablation study to determine the importance of the choice of normalization and positional encoding, and find that the use of Rotary encodings and ReZero normalization is essential for good forecasting performance.
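A minimal sketch of the persistence-initialization idea as the abstract describes it: a gate initialized to zero (as in ReZero) around the forecaster's output, combined with a residual skip from the last observed value, so that the untrained model behaves as a naive persistence forecaster. The wrapper name and the toy forecaster are assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class PersistenceInit(nn.Module):
    """Wraps any autoregressive forecaster so that, at initialization,
    the prediction equals the last observed value (naive persistence)."""
    def __init__(self, forecaster):
        super().__init__()
        self.forecaster = forecaster
        self.alpha = nn.Parameter(torch.zeros(1))   # ReZero-style gate, starts at 0

    def forward(self, history):                     # history: (batch, seq_len)
        last = history[:, -1:]                      # naive persistence forecast
        return last + self.alpha * self.forecaster(history)

net = PersistenceInit(nn.Sequential(nn.Linear(48, 64), nn.ReLU(), nn.Linear(64, 1)))
print(net(torch.randn(16, 48)).shape)               # torch.Size([16, 1])
```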
Human activity recognition is an emerging and important area in computer vision which aims to determine the activity an individual or individuals are performing. Applications of this field range from generating highlight videos in sports to intelligent surveillance and gesture recognition. Most activity recognition systems rely on a combination of convolutional neural networks (CNNs) for feature extraction from the data and recurrent neural networks (RNNs) to determine the temporal dependencies of the data. This paper proposes and designs two Transformer neural networks for human activity recognition: a Recurrent Transformer (ReT), a specialized neural network for making predictions on sequences of data, and a Vision Transformer (ViT), a Transformer optimized for extracting salient features from images, in order to improve the speed and scalability of activity recognition. We provide an extensive comparison of the proposed Transformer neural networks with modern CNN- and RNN-based human activity recognition models in terms of speed and accuracy.
Forecasting fund performance is beneficial to both investors and fund managers, yet it is a challenging task. In this paper, we test whether deep learning models can forecast fund performance more accurately than traditional statistical techniques. Fund performance is typically evaluated by the Sharpe ratio, which represents risk-adjusted performance to ensure meaningful comparability across funds. We calculated the annualised Sharpe ratios based on the monthly return time-series data of more than 600 open-ended mutual funds investing in listed large-cap equities in the United States. We find that long short-term memory (LSTM) and gated recurrent unit (GRU) deep learning methods, trained with modern Bayesian optimization, achieve higher accuracy in forecasting funds' Sharpe ratios than traditional statistical approaches. An ensemble method that combines the forecasts of the LSTM and GRU achieves the best performance of all models. There is evidence that deep learning and ensembling offer promising solutions to the challenge of fund performance forecasting.
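For reference, the forecasting target can be computed from monthly returns as below. The square-root-of-12 annualization convention is a common assumption and may differ in detail from the paper.

```python
import numpy as np

def annualized_sharpe(monthly_returns, risk_free_monthly=0.0):
    """Annualized Sharpe ratio from a series of monthly returns,
    using the common sqrt(12) scaling convention."""
    excess = np.asarray(monthly_returns, dtype=float) - risk_free_monthly
    return np.sqrt(12) * excess.mean() / excess.std(ddof=1)

rng = np.random.default_rng(0)
fund = rng.normal(loc=0.008, scale=0.04, size=36)   # 3 years of toy monthly returns
print(round(annualized_sharpe(fund), 2))
```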
The time-series forecasting (TSF) problem is a traditional problem in the field of artificial intelligence. Models such as the Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) have contributed to improving the predictive accuracy of TSF. Furthermore, model structures have been proposed that combine time-series decomposition methods, such as seasonal-trend decomposition using Loess (STL), to ensure improved predictive accuracy. However, because this approach learns an independent model for each component, it cannot learn the relationships between the time-series components. In this study, we propose a new neural architecture called the correlation recurrent unit (CRU) that can perform time-series decomposition within a neural cell and learn the correlations (autocorrelation and correlation) among the decomposition components. The proposed neural architecture was evaluated through comparative experiments with previous studies using five univariate time-series datasets and four multivariate time-series datasets. The results showed that long- and short-term predictive performance was improved by more than 10%. The experimental results show that the proposed CRU is an excellent method for TSF problems compared to other neural architectures.
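For context, the STL decomposition the abstract refers to can be obtained with statsmodels as sketched below on a toy series; a conventional pipeline would then fit a separate model per component, whereas the CRU is designed to learn the components and their correlations jointly inside one cell. The series and period are illustrative assumptions.

```python
import numpy as np
from statsmodels.tsa.seasonal import STL

# Toy monthly series: trend + yearly seasonality + noise
t = np.arange(120)
y = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 12) + 0.3 * np.random.randn(120)

res = STL(y, period=12).fit()
trend, seasonal, resid = res.trend, res.seasonal, res.resid
# A conventional pipeline models each component independently;
# the CRU instead learns the relationships between components jointly.
```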
Future surveys such as the Legacy Survey of Space and Time (LSST) of the Vera C. Rubin Observatory will observe an order of magnitude more astrophysical transient events than any previous survey before. With this deluge of photometric data, it will be impossible for all such events to be classified by humans alone. Recent efforts have sought to leverage machine learning methods to tackle the challenge of astronomical transient classification, with ever improving success. Transformers are a recently developed deep learning architecture, first proposed for natural language processing, that have shown a great deal of recent success. In this work we develop a new transformer architecture, which uses multi-head self attention at its core, for general multi-variate time-series data. Furthermore, the proposed time-series transformer architecture supports the inclusion of an arbitrary number of additional features, while also offering interpretability. We apply the time-series transformer to the task of photometric classification, minimising the reliance on expert domain knowledge for feature selection, while achieving results comparable to state-of-the-art photometric classification methods. We achieve a logarithmic-loss of 0.507 on imbalanced data in a representative setting using data from the Photometric LSST Astronomical Time-Series Classification Challenge (PLAsTiCC). Moreover, we achieve a micro-averaged receiver operating characteristic area under curve of 0.98 and micro-averaged precision-recall area under curve of 0.87.
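For reference, the reported micro-averaged metrics can be reproduced with scikit-learn as in the toy sketch below (random arrays, not the PLAsTiCC data); average precision is used here as the usual stand-in for the area under the precision-recall curve.

```python
import numpy as np
from sklearn.metrics import log_loss, roc_auc_score, average_precision_score
from sklearn.preprocessing import label_binarize

classes = np.arange(5)
y_true = np.random.default_rng(0).integers(0, 5, size=200)                 # toy labels
y_prob = np.random.default_rng(1).dirichlet(np.ones(5), size=200)          # toy model outputs

y_bin = label_binarize(y_true, classes=classes)
print(log_loss(y_true, y_prob, labels=classes))                            # multi-class log-loss
print(roc_auc_score(y_bin, y_prob, average="micro"))                       # micro-averaged ROC AUC
print(average_precision_score(y_bin, y_prob, average="micro"))             # micro-averaged PR AUC
```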
Deep learning architectures, specifically Deep Momentum Networks (DMNs) [1904.04912], have been found to be an effective approach to momentum and mean-reversion trading. However, some key challenges in recent years involve learning long-term dependencies, degradation of performance when considering returns net of transaction costs, and adapting to new market regimes, notably during the SARS-CoV-2 crisis. Attention mechanisms, or Transformer-based architectures, are a solution to such challenges because they allow the network to focus on significant time steps in the past and on longer-term patterns. We introduce the Momentum Transformer, an attention-based architecture which outperforms the benchmarks and is inherently interpretable, providing us with greater insights into our deep learning trading strategy. Our model is an extension of the LSTM-based DMN, which directly outputs position sizes by optimizing the network on a risk-adjusted performance metric such as the Sharpe ratio. We find an attention-LSTM hybrid decoder-only Temporal Fusion Transformer (TFT)-style architecture to be the best-performing model. In terms of interpretability, we observe remarkable structure in the attention patterns, with significant importance attached to momentum turning points. The time series is thus segmented into regimes, and the model tends to focus on previous time steps in prior regimes. We find that changepoint detection (CPD) [2105.13727], another technique for responding to regime change, can complement multi-headed attention, especially when we run CPD at multiple timescales. Through the addition of an interpretable variable selection network, we observe how CPD helps our model move away from trading predominantly on daily-returns data. We note that the model can intelligently switch between and blend classical strategies, basing its decisions on the data.
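A minimal sketch of a Sharpe-ratio-style training objective of the kind the abstract describes, in which the network's position sizes are optimized directly on a risk-adjusted performance measure. This is a generic illustration, not the authors' loss; the tensor shapes and tanh squashing are assumptions.

```python
import torch

def negative_sharpe_loss(positions, returns, eps=1e-8):
    """positions, returns: (batch, time). Position sizes are multiplied by
    realized asset returns to get strategy returns; the loss is the negative
    (non-annualized) Sharpe ratio of those returns."""
    strategy_returns = positions * returns
    return -strategy_returns.mean() / (strategy_returns.std() + eps)

pos = torch.tanh(torch.randn(32, 63, requires_grad=True))   # positions in [-1, 1]
rets = 0.01 * torch.randn(32, 63)                            # toy daily returns
loss = negative_sharpe_loss(pos, rets)
loss.backward()                                               # gradients flow to the positions
```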
Time series forecasting is an important problem across many domains, including predictions of solar plant energy output, electricity consumption, and traffic jam situation. In this paper, we propose to tackle such forecasting problem with Transformer [1]. Although impressed by its performance in our preliminary study, we found its two major weaknesses: (1) locality-agnostic: the point-wise dot-product self-attention in the canonical Transformer architecture is insensitive to local context, which can make the model prone to anomalies in time series; (2) memory bottleneck: space complexity of canonical Transformer grows quadratically with sequence length L, making directly modeling long time series infeasible. In order to solve these two issues, we first propose convolutional self-attention by producing queries and keys with causal convolution so that local context can be better incorporated into the attention mechanism. Then, we propose LogSparse Transformer with only O(L (log L)^2) memory cost, improving forecasting accuracy for time series with fine granularity and strong long-term dependencies under constrained memory budget. Our experiments on both synthetic data and real-world datasets show that it compares favorably to the state-of-the-art.
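A minimal sketch of the convolutional self-attention idea: queries and keys produced by a causal (left-padded) convolution so that each attention score is computed from a small window of local context rather than a single point. Names and sizes are our own illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalQKProjection(nn.Module):
    """Causal Conv1d projections for queries and keys (kernel_size > 1 gives
    local context); values keep the usual pointwise projection."""
    def __init__(self, d_model, kernel_size=3):
        super().__init__()
        self.k = kernel_size
        self.q_conv = nn.Conv1d(d_model, d_model, kernel_size)
        self.k_conv = nn.Conv1d(d_model, d_model, kernel_size)
        self.v_proj = nn.Linear(d_model, d_model)

    def forward(self, x):                                   # x: (batch, seq_len, d_model)
        h = F.pad(x.transpose(1, 2), (self.k - 1, 0))       # left-pad => causal
        q = self.q_conv(h).transpose(1, 2)
        k = self.k_conv(h).transpose(1, 2)
        return q, k, self.v_proj(x)

q, k, v = CausalQKProjection(64)(torch.randn(2, 48, 64))
print(q.shape, k.shape, v.shape)   # all torch.Size([2, 48, 64])
```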
Reliable and efficient representation of multivariate time series is crucial in various downstream machine learning tasks. In multivariate time series forecasting, each variable depends on its historical values, and there are also interdependencies among variables. Models must be designed to capture both the intra- and inter-relationships among the time series. To move towards this goal, we propose the Time Series Attention Transformer (TSAT) for multivariate time series representation learning. With TSAT, we represent both the temporal information and the interdependencies of multivariate time series as edge-enhanced dynamic graphs, where the correlations within each series are captured by the node representations in the dynamic graph. A self-attention mechanism is modified to capture the inter-series correlations using a super-empirical mode decomposition (SMD) module. We apply the embedded dynamic graphs to time series forecasting problems, including two real-world datasets and two benchmark datasets. Extensive experiments show that TSAT clearly outperforms six state-of-the-art baseline methods across various forecasting horizons. We further visualize the embedded dynamic graphs to illustrate the graph representation power of TSAT. We share our code at https://github.com/radiantresearch/tsat.
Today's world is severely affected by the novel coronavirus (COVID-19). Identifying affected people using medical kits is very slow, and nobody knows what happens next. The world is facing an erratic problem and does not know what will happen in the near future. This paper attempts to make a prognosis of coronavirus recovered cases using LSTM (Long Short-Term Memory). This work exploits data from 258 regions, their latitudes and longitudes, and the number of deaths over 403 days, from 22-01-2020 to 27-02-2021. Specifically, the advanced deep-learning-based algorithm known as LSTM has had a great impact on extracting highly essential features for time-series data (TSD) analysis. Many approaches have already been used to analyze the spread of the pandemic. The main task of this paper ultimately lies in analyzing the worldwide spread of coronavirus recovered cases using an LSTM-based deep learning architecture.
The first known case of coronavirus disease 2019 (COVID-19) was identified in December 2019. It has spread worldwide, leading to an ongoing pandemic, imposed restrictions, and costs in many countries. Predicting the number of new cases and deaths during this period can be a useful step in predicting the costs and facilities required in the future. The purpose of this study is to predict new cases and death rates for the next 100 days, at horizons of three and seven days. The motivation for predicting every few days (instead of every day) is to investigate the possibility of reducing computational cost while still achieving reasonable performance; such a scenario can be encountered in real-time forecasting of time series. Six different deep learning methods are examined on data adopted from the WHO website. Three methods are LSTM, convolutional LSTM, and GRU; a bidirectional extension of each method is then considered to predict the new cases and new deaths of Australia and Iran. This study is novel in that it carries out a comprehensive evaluation of the aforementioned three deep learning methods and their bidirectional extensions for forecasting COVID-19 new-case and new-death time series. To the best of our knowledge, this is the first time that Bi-GRU and Bi-Conv-LSTM models have been used for forecasting COVID-19 new-case and new-death time series. The evaluation of the methods is presented in the form of graphs and Friedman statistical tests. The results show that the errors of the bidirectional models are lower than those of the other models. Several error evaluation metrics are presented to compare all the models, and the superiority of the bidirectional methods is established. This research can be useful for organizations working against COVID-19 and in determining their long-term plans.
The Transformer is widely used in natural language processing tasks. To train a Transformer however, one usually needs a carefully designed learning rate warm-up stage, which is shown to be crucial to the final performance but will slow down the optimization and bring more hyperparameter tunings. In this paper, we first study theoretically why the learning rate warm-up stage is essential and show that the location of layer normalization matters. Specifically, we prove with mean field theory that at initialization, for the original-designed Post-LN Transformer, which places the layer normalization between the residual blocks, the expected gradients of the parameters near the output layer are large. Therefore, using a large learning rate on those gradients makes the training unstable. The warm-up stage is practically helpful for avoiding this problem. On the other hand, our theory also shows that if the layer normalization is put inside the residual blocks (recently proposed as Pre-LN Transformer), the gradients are well-behaved at initialization. This motivates us to remove the warm-up stage for the training of Pre-LN Transformers. We show in our experiments that Pre-LN Transformers without the warm-up stage can reach comparable results with baselines while requiring significantly less training time and hyper-parameter tuning on a wide range of applications.
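A minimal sketch of the architectural difference discussed above, showing only the attention sub-layer of a Transformer block (the feed-forward sub-layer follows the same pattern); layer names and sizes are illustrative.

```python
import torch
import torch.nn as nn

class PostLNBlock(nn.Module):
    """Original design: LayerNorm applied after the residual addition."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):
        return self.norm(x + self.attn(x, x, x)[0])

class PreLNBlock(nn.Module):
    """Pre-LN: LayerNorm moved inside the residual branch, keeping gradients
    well-behaved at initialization and removing the need for warm-up."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):
        h = self.norm(x)
        return x + self.attn(h, h, h)[0]

x = torch.randn(2, 16, 64)
print(PostLNBlock()(x).shape, PreLNBlock()(x).shape)
```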
Deep learning utilizing Transformers has recently achieved a lot of success in many vital areas such as natural language processing, computer vision, anomaly detection, and recommendation systems, among many others. Among the several merits of Transformers, the ability to capture long-range temporal dependencies and interactions is desirable for time series forecasting, leading to its progress in various time series applications. In this paper, we build a Transformer model for non-stationary time series; the problem is challenging yet crucially important. We present a novel framework for univariate time series representation learning based on a wavelet-based Transformer encoder architecture and call it W-Transformer. The proposed W-Transformer applies a maximal overlap discrete wavelet transform (MODWT) to the time series data and builds local Transformers on the decomposed datasets to vividly capture the non-stationarity and long-range nonlinear dependencies in the time series. Evaluating our framework on several publicly available benchmark time series datasets from various domains and with diverse characteristics, we demonstrate that it performs, on average, significantly better than the baseline forecasters for short-term and long-term forecasting, even for datasets that consist of only a few hundred training samples.
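As an illustration of the decomposition step, the sketch below uses PyWavelets' stationary (undecimated) wavelet transform, a close relative of the MODWT named in the abstract; it is not the authors' code, and the wavelet, level, and toy series are assumptions.

```python
import numpy as np
import pywt

# Toy non-stationary series (for swt the length must be divisible by 2**level)
t = np.arange(256)
y = np.sin(2 * np.pi * t / 32) * (1 + 0.01 * t) + 0.2 * np.random.randn(256)

# Undecimated (stationary) wavelet transform: a list of (approx, detail) pairs
coeffs = pywt.swt(y, wavelet="haar", level=3)
# The W-Transformer idea is to fit a local forecaster on each decomposed
# component and then recombine the component forecasts.
```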