Smart Paper Notes

RePAD: Real-time Proactive Anomaly Detection for Time Series

Ming-Chang Lee, Jia-Chun Lin, Ernst Gunnar Gran

Categories: Machine Learning | Machine Learning (Statistics)

2020-01-24

During the past decade, many anomaly detection approaches have been introduced in different fields such as network monitoring, fraud detection, and intrusion detection. However, they require understanding of data pattern and often need a long off-line period to build a model or network for the target data. Providing real-time and proactive anomaly detection for streaming time series without human intervention and domain knowledge is highly valuable since it greatly reduces human effort and enables appropriate countermeasures to be undertaken before a disastrous damage, failure, or other harmful event occurs. However, this issue has not been well studied yet. To address it, this paper proposes RePAD, which is a Real-time Proactive Anomaly Detection algorithm for streaming time series based on Long Short-Term Memory (LSTM). RePAD utilizes short-term historic data points to predict and determine whether or not the upcoming data point is a sign that an anomaly is likely to happen in the near future. By dynamically adjusting the detection threshold over time, RePAD is able to tolerate minor pattern change in time series and detect anomalies either proactively or on time. Experiments based on two time series datasets collected from the Numenta Anomaly Benchmark demonstrate that RePAD is able to proactively detect anomalies and provide early warnings in real time without human intervention and domain knowledge.
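
The dynamically adjusted threshold described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the LSTM is replaced by a hypothetical naive_forecast stand-in, the error measure is an average absolute relative error over a short window, and the threshold is taken as the mean of past errors plus three standard deviations. The abstract does not spell out any of these details, and the retraining step a real detector would need is omitted.

```python
from statistics import mean, pstdev


def naive_forecast(history):
    """Stand-in for the LSTM predictor (assumption): repeat the last value."""
    return history[-1]


def aare(actual, predicted):
    """Average absolute relative error; assumes no actual value is zero."""
    return mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))


def detect(series, window=3):
    """Return indices whose recent prediction error exceeds a moving threshold."""
    predictions, past_errors, flagged = {}, [], []
    for t in range(window, len(series)):
        # Predict point t from the `window` points just before it.
        predictions[t] = naive_forecast(series[t - window:t])
        if t < 2 * window - 1:
            continue  # not enough predictions yet to average an error
        recent = range(t - window + 1, t + 1)
        error = aare([series[i] for i in recent], [predictions[i] for i in recent])
        if len(past_errors) >= 2:
            # Threshold adapts as more errors are observed (three-sigma rule, assumed).
            threshold = mean(past_errors) + 3 * pstdev(past_errors)
            if error > threshold:
                flagged.append(t)
        past_errors.append(error)
    return flagged
```

Because the threshold is recomputed from all errors seen so far, small drifts in the series raise it gradually, while a sudden jump in error stands out against it.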

ReRe: A Lightweight Real-time Ready-to-Go Anomaly Detection Approach for Time Series

Ming-Chang Lee, Jia-Chun Lin, Ernst Gunnar Gran

Categories: Machine Learning | Machine Learning (Statistics)

2020-04-05

Anomaly detection is an active research topic in many different fields such as intrusion detection, network monitoring, system health monitoring, IoT healthcare, etc. However, many existing anomaly detection approaches require either human intervention or domain knowledge, and may suffer from high computation complexity, consequently hindering their applicability in real-world scenarios. Therefore, a lightweight and ready-to-go approach that is able to detect anomalies in real-time is highly sought-after. Such an approach could be easily and immediately applied to perform time series anomaly detection on any commodity machine. The approach could provide timely anomaly alerts and by that enable appropriate countermeasures to be undertaken as early as possible. With these goals in mind, this paper introduces ReRe, which is a Real-time Ready-to-go proactive Anomaly Detection algorithm for streaming time series. ReRe employs two lightweight Long Short-Term Memory (LSTM) models to predict and jointly determine whether or not an upcoming data point is anomalous based on short-term historical data points and two long-term self-adaptive thresholds. Experiments based on real-world time-series datasets demonstrate the good performance of ReRe in real-time anomaly detection without requiring human intervention or domain knowledge.

How Far Should We Look Back to Achieve Effective Real-Time Time-Series Anomaly Detection?

Ming-Chang Lee, Jia-Chun Lin, Ernst Gunnar Gran

Categories: Machine Learning

2021-02-12

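Anomaly detection is the process of identifying unexpected events or abnormalities in data, and it has been applied in many different areas such as system monitoring, fraud detection, healthcare, and intrusion detection. Providing real-time, lightweight, and proactive anomaly detection for time series with neither human intervention nor domain knowledge is highly valuable, since it reduces human effort and enables appropriate countermeasures to be undertaken before a disastrous event occurs. To our knowledge, RePAD (Real-time Proactive Anomaly Detection algorithm) is a generic approach with all of the above-mentioned features. To achieve real-time and lightweight detection, RePAD uses Long Short-Term Memory (LSTM) to detect whether or not each upcoming data point is anomalous based on short-term historical data points. However, it is unclear how different amounts of historical data points affect the performance of RePAD. Therefore, in this paper, we investigate the impact of different amounts of historical data on RePAD by introducing a set of performance metrics that cover novel detection accuracy measures, time efficiency, readiness, and resource consumption. Experiments based on time series datasets are conducted to evaluate RePAD in different scenarios, and the experimental results are presented and discussed.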

Denoising Architecture for Unsupervised Anomaly Detection in Time-Series

Wadie Skaf, Tomáš Horváth

Categories: Machine Learning | Artificial Intelligence

2022-08-30

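Anomalies in time series provide insights into critical scenarios across a range of industries, from banking and aerospace to information technology, security, and medicine. However, identifying anomalies in time-series data is particularly challenging because of the imprecise definition of anomalies, the frequent absence of labels, and the enormously complex temporal correlations present in such data. The LSTM Autoencoder is an encoder-decoder scheme for anomaly detection based on Long Short-Term Memory networks; it learns to reconstruct time-series behavior and then uses the reconstruction error to identify anomalies. We introduce the Denoising Architecture as a complement to this LSTM encoder-decoder model and investigate its effect on real-world as well as artificially generated datasets. We demonstrate that the proposed architecture increases both the accuracy and the training speed, thereby making the LSTM Autoencoder more efficient for unsupervised anomaly detection tasks.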

A Comparative Study of Detecting Anomalies in Time Series Data Using LSTM and TCN Models

Saroj Gopali, Faranak Abri, Sima Siami-Namini, Akbar Siami Namin

Categories: Machine Learning

2021-12-17

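Several data-driven approaches exist for modeling time series data, including traditional regression-based approaches such as ARIMA. Recently, deep learning techniques have been introduced and explored in the context of time series analysis and prediction. A major research question is how the many variations of deep learning techniques perform in predicting time series data. This paper compares two prominent deep learning modeling techniques. The Recurrent Neural Network (RNN)-based Long Short-Term Memory (LSTM) and the Convolutional Neural Network (CNN)-based Temporal Convolutional Network (TCN) are compared, and their performance and training time are reported. According to our experimental results, both techniques perform comparably, with TCN-based models slightly outperforming LSTM. Moreover, the CNN-based TCN model builds a stable model faster than the RNN-based LSTM models.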

Deep Learning for Time Series Anomaly Detection: A Survey

Zahra Zamanzadeh Darban, Geoffrey I. Webb, Shirui Pan, Charu C. Aggarwal, Mahsa Salehi

Categories: Machine Learning | Artificial Intelligence

2022-11-09

Time series anomaly detection has applications in a wide range of research fields, including manufacturing and healthcare. The presence of anomalies can indicate novel or unexpected events, such as production faults, system defects, or heart fluttering, and is therefore of particular interest. The large size and complex patterns of time series have led researchers to develop specialised deep learning models for detecting anomalous patterns. This survey provides a structured and comprehensive overview of state-of-the-art deep learning models for time series anomaly detection. It provides a taxonomy based on the factors that divide anomaly detection models into different categories. Aside from describing the basic anomaly detection technique for each category, the advantages and limitations are also discussed. Furthermore, this study includes examples of deep anomaly detection in time series across various application domains in recent years. It finally summarises open issues in research and challenges faced while adopting deep anomaly detection models.

Anomaly Detection for Fraud in Cryptocurrency Time Series

Eran Kaufman, Andrey Iaremenko

Categories: Machine Learning

2022-07-23

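Since the inception of Bitcoin in 2009, the cryptocurrency market has grown beyond initial expectations, with daily trades exceeding $10 billion. As industries become automated, the need for an automated fraud detector becomes very apparent. Detecting anomalies in real time prevents potential accidents and economic losses. Anomaly detection in multivariate time series data poses a particular challenge because it requires simultaneous consideration of temporal dependencies and relationships between variables. Identifying an anomaly in real time is not an easy task, specifically because of the exact anomalous behavior being observed. Some points may present global or local anomalous behavior, while others may be anomalous because of their frequency or seasonal behavior, or because of a change in the trend. In this paper, we propose working on real-time series of Ethereum trades from specific accounts and survey a large variety of traditional and new algorithms. We categorize them according to the strategy and the anomalous behavior they search for, and show that, when bundled together, they can prove to be a good real-time detector with an alarm time of no more than a few seconds and very high confidence.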

Forecast-based Multi-aspect Framework for Multivariate Time-series Anomaly Detection

Lan Wang, Yusan Lin, Yuhang Wu, Huiyuan Chen, Fei Wang, Hao Yang

Categories: Machine Learning

2022-01-13

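Today's cyber-world is vastly multivariate. Metrics collected in extreme variety demand multivariate algorithms to properly detect anomalies. However, forecast-based algorithms, although widely proven, often perform sub-optimally or inconsistently across datasets. A key common issue is that they strive to be one-size-fits-all, whereas anomalies are distinctive in nature. We propose a method tailored to this distinction: FMUAD, a Forecast-based, Multi-aspect, Unsupervised Anomaly Detection framework. FMUAD explicitly and separately captures the signature traits of anomaly types (spatial change, temporal change, and correlation change) with independent modules. The modules then jointly learn an optimal feature representation, which is highly flexible and intuitive, unlike most other models in the category. Extensive experiments show that our FMUAD framework consistently outperforms other state-of-the-art forecast-based anomaly detectors.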

Ymir: A Supervised Ensemble Framework for Multivariate Time Series Anomaly Detection

Zhanxiang Zhao

Categories: Machine Learning

2021-12-09

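We propose Ymir, a multivariate time series anomaly detection framework that leverages ensemble learning and supervised learning techniques to efficiently learn and adapt to anomalies in real-world system applications. Ymir integrates several currently widely used unsupervised anomaly detection models through an ensemble learning method, and can thus provide robust anomaly detection results in unsupervised scenarios. In a supervised setting, domain experts and system users discuss and provide labels (anomalous or not) for the training data, which reflect their own anomaly detection criteria for the specific system. Ymir leverages the aforementioned unsupervised methods to extract rich and useful feature representations from the raw multivariate time series data, then combines the features and labels with a supervised classifier to perform anomaly detection. We evaluated Ymir on internal multivariate time series datasets from large monitoring systems and achieved good anomaly detection performance.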

RUAD: unsupervised anomaly detection in HPC systems

Martin Molan, Andrea Borghesi, Daniele Cesarini, Luca Benini, Andrea Bartolini

Categories: Machine Learning | Artificial Intelligence

2022-08-28

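The increasing complexity of modern high-performance computing (HPC) systems necessitates the introduction of automated, data-driven methodologies to support system administrators' efforts to increase system availability. Anomaly detection is an integral part of improving availability, as it eases the system administrator's burden and reduces the time between an anomaly and its resolution. However, current state-of-the-art (SoA) detection approaches are supervised or semi-supervised, so they require a human-labelled dataset with anomalies, which is often impractical to collect in production HPC systems. Unsupervised anomaly detection approaches based on clustering, aimed at alleviating the need for accurate anomaly data, have so far shown poor performance. In this work, we overcome these limitations by proposing RUAD, a novel unsupervised anomaly detection model. RUAD achieves better results than the current semi-supervised and unsupervised SoA approaches. This is achieved by considering temporal dependencies in the data and including long short-term memory cells in the model architecture. The proposed approach is assessed on the complete history of a Tier-0 system (Marconi100 from CINECA, with 980 nodes). RUAD achieves an area under the curve (AUC) of 0.763 in semi-supervised training and an AUC of 0.767 in unsupervised training, improving on the SoA approach, which achieves an AUC of 0.747 in semi-supervised training and an AUC of 0.734 in unsupervised training. It also vastly outperforms the current SoA unsupervised anomaly detection approach based on clustering, which achieves an AUC of 0.548.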
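
For readers unfamiliar with the AUC metric quoted above, the sketch below computes it from anomaly scores and binary labels using the rank interpretation: the probability that a randomly chosen anomalous sample receives a higher score than a randomly chosen normal one. The function name auc and the toy data are illustrative assumptions and have no connection to the paper's evaluation code.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for anomalous, 0 for normal. Ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: one anomaly (0.35) is scored below one normal point (0.4).
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

An AUC of 0.5 corresponds to random scoring, which puts the clustering baseline's 0.548 in context.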

Smart Meter Data Anomaly Detection using Variational Recurrent Autoencoders with Attention

Wenjing Dai, Xiufeng Liu, Alfred Heller, Per Sieverts Nielsen

Categories: Machine Learning

2022-06-08

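In the digitization of energy systems, sensors and smart meters are increasingly used to monitor production, operation, and demand. Anomaly detection based on smart meter data is crucial for identifying potential risks and unusual events at an early stage, which can serve as a reference for timely initiation of appropriate actions and for improving management. However, smart meter data from energy systems often lack labels and contain noise and various patterns without distinct periodicity. Meanwhile, the vague definition of anomalies in different energy scenarios and highly complex temporal correlations pose a great challenge for anomaly detection. Many traditional unsupervised anomaly detection algorithms, such as cluster-based or distance-based models, are not robust to noise and do not fully exploit the temporal dependency in a time series or other dependencies among multiple variables (sensors). This paper proposes an unsupervised anomaly detection method based on a variational recurrent autoencoder with an attention mechanism. Given "dirty" data from smart meters, our method pre-detects missing values and global anomalies to shrink their contribution during training. This paper makes a quantitative comparison with a VAE-based baseline approach and four other unsupervised learning methods, demonstrating its effectiveness and superiority. This paper further validates the proposed method through a real case study of detecting anomalies in the water supply temperature of an industrial heating plant.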

A Multi-View Framework for BGP Anomaly Detection via Graph Attention Network

Songtao Peng, Jiaqi Nie, Xincheng Shu, Zhongyuan Ruan, Lei Wang, Yunxuan Sheng, Qi Xuan

Categories: Machine Learning | Artificial Intelligence

2021-12-23

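As the Border Gateway Protocol (BGP) is the default protocol for exchanging routing reachability information on the Internet, abnormal behavior in its traffic is closely related to Internet anomaly events. A BGP anomaly detection model ensures stable routing services on the Internet through its real-time monitoring and alerting capabilities. Previous studies focused either on the feature selection problem or on the memory characteristics of the data, while ignoring the relationships between features and the precise time correlations within features (whether long-term or short-term dependencies). In this paper, we propose a multi-view model for capturing anomalous behavior from BGP update traffic, in which the Seasonal and Trend decomposition using Loess (STL) method is used to reduce the noise in the original time-series data, and a Graph Attention Network (GAT) is used to discover feature relationships and time correlations within features, respectively. Our results outperform state-of-the-art methods on the anomaly detection task, with average F1 scores of up to 96.3% and 93.2% on the balanced and imbalanced datasets, respectively. Meanwhile, our model can be extended to classify multiple anomalies and to detect unknown events.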

Generative Anomaly Detection for Time Series Datasets

Zhuangwei Kang, Ayan Mukhopadhyay, Aniruddha Gokhale, Shijie Wen, Abhishek Dubey

Categories: Machine Learning | Artificial Intelligence

2022-06-28

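Traffic congestion anomaly detection is of paramount importance in intelligent traffic systems. The goals of transportation agencies are two-fold: to monitor the general traffic conditions in the area of interest and to locate road segments under abnormal congestion states. Modeling congestion patterns can achieve these goals for citywide roadways, which amounts to learning the distribution of multivariate time series (MTS). However, existing works are either not scalable or unable to capture the spatial and temporal information in MTS simultaneously. To this end, we propose a principled and comprehensive framework consisting of a data-driven generative approach that can perform tractable density estimation for detecting traffic anomalies. Our approach first clusters segments in the feature space and then uses a conditional normalizing flow to identify anomalous temporal snapshots at the cluster level in an unsupervised setting. Then, we identify anomalies at the segment level by using a kernel density estimator on the anomalous cluster. Extensive experiments on synthetic datasets show that our approach significantly outperforms several state-of-the-art congestion anomaly detection and diagnosis methods in terms of recall and F1 score. We also use the generative model to sample labeled data, which can be used to train classifiers in a supervised setting, alleviating the lack of labeled data for anomaly detection in sparse settings.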
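
The segment-level step, scoring members of an anomalous cluster with a kernel density estimator, can be illustrated with a one-dimensional Gaussian KDE. This is only a sketch under assumptions: the feature (one speed reading per segment), the fixed bandwidth, and the helper names gaussian_kde and least_likely are hypothetical, and the conditional normalizing flow used at the cluster level is not shown.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function estimated from `samples` with a Gaussian kernel."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

    return density


def least_likely(values, bandwidth=1.0):
    """Index of the value lying in the lowest-density region of the sample."""
    density = gaussian_kde(values, bandwidth)
    return min(range(len(values)), key=lambda i: density(values[i]))


# Hypothetical average speeds (km/h) for six road segments in one cluster.
speeds = [50, 52, 51, 49, 12, 50]
print(least_likely(speeds))  # 4, the segment reporting 12 km/h
```

In practice the bandwidth would be chosen from the data, and a density cut-off rather than a single minimum would decide how many segments to report.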

Task-aware Similarity Learning for Event-triggered Time Series

Shaoyu Dou, Kai Yang, Yang Jiao, Chengbo Qiu, Kui Ren

Categories: Machine Learning | Artificial Intelligence

2022-07-17

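Time series analysis has achieved great success in diverse applications such as network security, environmental monitoring, and medical informatics. Learning similarities among different time series is a crucial problem, since it serves as the foundation for downstream analysis such as clustering and anomaly detection. Because of the complex temporal dynamics of time series generated by event-triggered sensing, which is common in applications including automated driving, interactive healthcare, and smart home automation, it often remains unclear what kind of distance metric is suitable for similarity learning. The overarching goal of this paper is to develop an unsupervised learning framework that is capable of learning task-aware similarities among unlabeled event-triggered time series. From the machine learning vantage point, the proposed framework harnesses the power of both hierarchical multi-scale sequence autoencoders and a Gaussian Mixture Model (GMM) to effectively learn low-dimensional representations of the time series. Finally, the obtained similarity measure can be easily visualized for interpretation. The proposed framework aspires to offer a stepping stone toward a systematic approach to modeling and learning similarities among a multitude of event-triggered time series. Extensive qualitative and quantitative experiments reveal that the proposed method considerably outperforms state-of-the-art methods.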

Smart Metering System Capable of Anomaly Detection by Bi-directional LSTM Autoencoder

Sangkeum Lee, Hojun Jin, Sarvar Hussain Nengroo, Yoonmee Doh, Chungho Lee, Taewook Heo, Dongsoo Har

Categories: Machine Learning

2021-12-06

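Anomaly detection is relevant to a wide range of applications such as fault detection, system monitoring, and event detection. Identifying anomalies in metering data obtained from smart metering systems is a critical task for enhancing the reliability, stability, and efficiency of the power system. This paper presents an anomaly detection process for finding outliers observed in a smart metering system. In the proposed approach, an autoencoder based on bi-directional long short-term memory (BiLSTM) is used to find anomalous data points. It calculates the reconstruction error through the autoencoder with non-anomalous data, and the outliers to be classified as anomalies are separated from the non-anomalous data by a predefined threshold. The anomaly detection method based on the BiLSTM autoencoder is tested with metering data for four types of energy (electricity, water, heating, and hot water) collected from 985 households.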
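
The thresholding step described above (reconstruction error compared with a predefined threshold) can be sketched independently of the network itself. In the sketch below, the BiLSTM autoencoder is replaced by a deliberately crude stand-in that "reconstructs" every window as the mean profile of the normal training windows; that stand-in, the function names, the toy readings, and the threshold value are all assumptions made for illustration.

```python
from statistics import mean


def fit_mean_profile(normal_windows):
    """Stand-in for a trained autoencoder (assumption): always output the
    element-wise mean of the non-anomalous training windows."""
    profile = [mean(column) for column in zip(*normal_windows)]
    return lambda window: profile


def reconstruction_errors(windows, reconstruct):
    """Mean squared error between each window and its reconstruction."""
    return [
        mean((x - r) ** 2 for x, r in zip(window, reconstruct(window)))
        for window in windows
    ]


def flag_anomalies(windows, reconstruct, threshold):
    """Indices of windows whose reconstruction error exceeds the threshold."""
    errors = reconstruction_errors(windows, reconstruct)
    return [i for i, e in enumerate(errors) if e > threshold]


# Train on normal readings only, then score new windows.
normal = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.2, 2.0, 2.8]]
model = fit_mean_profile(normal)
print(flag_anomalies([[1.0, 2.0, 3.0], [1.0, 9.0, 3.0]], model, threshold=1.0))  # [1]
```

Any model exposing the same window-in, window-out interface could be passed as reconstruct, which keeps the thresholding logic separate from the choice of network.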

Deep Baseline Network for Time Series Modeling and Anomaly Detection

Cheng Ge, Xi Chen, Ming Wang, Jin Wang

Categories: Machine Learning

2022-09-10

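Deep learning has seen increasing application to time series in recent years. In time series anomaly detection scenarios, such as finance, the Internet of Things, and data center operations, time series usually show very flexible baselines depending on various external factors. Anomalies reveal themselves by lying far away from the baseline. However, detection is not always easy because of several challenges, including baseline shifting, lack of labels, noise interference, real-time detection in streaming data, and interpretability. We develop a deep architecture that extracts the baseline from time series, namely the Deep Baseline Network (DBLN). By using this deep network, we can easily locate the baseline position and then provide reliable and interpretable anomaly detection results. Empirical evaluation on both synthetic and public real-world datasets shows that our purely unsupervised algorithm achieves superior performance compared with state-of-the-art methods and has good practical applicability.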

AER: Auto-Encoder with Regression for Time Series Anomaly Detection

Lawrence Wong, Dongyu Liu, Laure Berti-Equille, Sarah Alnegheimish, Kalyan Veeramachaneni

Categories: Machine Learning | Machine Learning (Statistics)

2022-12-27

Anomaly detection on time series data is increasingly common across various industrial domains that monitor metrics in order to prevent potential accidents and economic losses. However, a scarcity of labeled data and ambiguous definitions of anomalies can complicate these efforts. Recent unsupervised machine learning methods have made remarkable progress in tackling this problem using either single-timestamp predictions or time series reconstructions. While traditionally considered separately, these methods are not mutually exclusive and can offer complementary perspectives on anomaly detection. This paper first highlights the successes and limitations of prediction-based and reconstruction-based methods with visualized time series signals and anomaly scores. We then propose AER (Auto-encoder with Regression), a joint model that combines a vanilla auto-encoder and an LSTM regressor to incorporate the successes and address the limitations of each method. Our model can produce bi-directional predictions while simultaneously reconstructing the original time series by optimizing a joint objective function. Furthermore, we propose several ways of combining the prediction and reconstruction errors through a series of ablation studies. Finally, we compare the performance of the AER architecture against two prediction-based methods and three reconstruction-based methods on 12 well-known univariate time series datasets from NASA, Yahoo, Numenta, and UCR. The results show that AER has the highest averaged F1 score across all datasets (a 23.5% improvement compared to ARIMA) while retaining a runtime similar to its vanilla auto-encoder and regressor components. Our model is available in Orion, an open-source benchmarking tool for time series anomaly detection.
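
To make the idea of combining prediction and reconstruction errors concrete, the sketch below shows two simple options: a weighted sum and an element-wise product of min-max normalized error series. The abstract does not specify which combinations the authors study, so the function names, the normalization choice, and both modes are assumptions for illustration only.

```python
def min_max(values):
    """Scale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combine(pred_err, rec_err, mode="sum", weight=0.5):
    """Merge two per-timestamp error series into one anomaly score series."""
    p, r = min_max(pred_err), min_max(rec_err)
    if mode == "sum":
        # Convex combination: `weight` on prediction, the rest on reconstruction.
        return [weight * a + (1 - weight) * b for a, b in zip(p, r)]
    if mode == "product":
        # High only where both error types agree.
        return [a * b for a, b in zip(p, r)]
    raise ValueError(f"unknown mode: {mode}")


pred = [0, 1, 2, 4]
rec = [10, 10, 30, 20]
print(combine(pred, rec, mode="sum"))      # [0.0, 0.125, 0.75, 0.75]
print(combine(pred, rec, mode="product"))  # [0.0, 0.0, 0.5, 0.5]
```

The product suppresses timestamps where only one of the two signals is elevated, while the sum keeps them visible; which behaviour is preferable depends on the data.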

Time Series Anomaly Detection for Cyber-Physical Systems via Neural System Identification and Bayesian Filtering

Cheng Feng, Pengwei Tian

Categories: Machine Learning | Machine Learning (Statistics)

2021-06-15

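Recent advances in AIoT technologies have led to increasing popularity of machine learning algorithms for detecting operational failures in cyber-physical systems (CPS). In its basic form, an anomaly detection module monitors the sensor measurements and actuator states of the physical plant, and detects anomalies in these measurements to identify abnormal operating states. Nevertheless, building effective anomaly detection models is challenging, because the model has to detect anomalies accurately in the presence of highly complicated system dynamics and an unknown amount of sensor noise. In this work, we propose a novel time series anomaly detection method called Neural System Identification and Bayesian Filtering (NSIBF), in which a specially crafted neural network architecture is used for system identification, that is, capturing the dynamics of the CPS in a dynamical state-space model; a Bayesian filtering algorithm is then naturally applied on top of it by tracking the uncertainty of the hidden state of the system over time. We provide qualitative as well as quantitative experiments with the proposed method on a synthetic dataset and three real-world CPS datasets, showing that NSIBF compares favorably with state-of-the-art methods for anomaly detection in CPS.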
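
The general idea of filtering-based detection, tracking a hidden state and its uncertainty and flagging measurements that the filter finds implausible, can be shown with a scalar Kalman filter. This is a swapped-in, much simpler technique than the neural state-space model with Bayesian filtering described in the abstract: the random-walk state model, the noise variances, and the gating value are all assumptions chosen for illustration.

```python
def kalman_anomalies(measurements, process_var=1e-3, meas_var=0.1, gate=9.0):
    """Flag measurements whose normalized squared innovation exceeds `gate`.

    Assumes a scalar random-walk state observed directly with Gaussian noise.
    """
    x, p = measurements[0], 1.0  # initial state estimate and its variance
    flagged = []
    for t, z in enumerate(measurements[1:], start=1):
        p_pred = p + process_var      # predict: state unchanged, uncertainty grows
        innovation = z - x            # gap between measurement and prediction
        s = p_pred + meas_var         # variance of the innovation
        if innovation ** 2 / s > gate:
            flagged.append(t)
            p = p_pred                # skip the update so the outlier is ignored
            continue
        k = p_pred / s                # Kalman gain
        x = x + k * innovation
        p = (1 - k) * p_pred
    return flagged


readings = [1.0, 1.1, 0.9, 1.0, 1.05, 5.0, 1.0, 0.95]
print(kalman_anomalies(readings))  # [5]
```

A gate of 9.0 corresponds roughly to a three-sigma test on the innovation; skipping the update for flagged points keeps a single outlier from dragging the state estimate.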

A Comprehensive Survey of Graph-based Deep Learning Approaches for Anomaly Detection in Complex Distributed Systems

Armin Danesh Pazho, Ghazal Alinezhad Noghre, Arnab A Purkayastha, Jagannadh Vempati, Otto Martin, Hamed Tabkhi

Categories: Machine Learning

2022-06-08

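Anomaly detection is an important problem for complex distributed systems consisting of hardware and software components. A thorough understanding of the requirements and challenges of anomaly detection for such systems is pivotal to system security, especially for real-world deployment. While many diverse research areas and application domains deal with the problem, few have attempted to provide an in-depth look at such systems. Most anomaly detection techniques have been developed specifically for certain application domains, while others are more generic. In this survey, we explore the significant potential of graph-based algorithms to identify and mitigate different types of anomalies in complex distributed heterogeneous systems. Our main focus is to provide an in-depth look at graphs when applied to heterogeneous computing devices spread across complex distributed systems. This study analyzes, compares, and contrasts state-of-the-art research articles in the field. First, we describe the characteristics of real-world distributed systems and the specific challenges of anomaly detection in such complex networks, such as data and evaluation, the nature of anomalies, and real-world requirements. Next, we discuss why graphs can be leveraged in such systems and the benefits of using them. We then delve into the state-of-the-art approaches and highlight their strengths and weaknesses. Finally, we evaluate and compare these approaches and point out areas for possible improvement.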

Cloud Failure Prediction with Hierarchical Temporal Memory: An Empirical Assessment

Oliviero Riganelli, Paolo Saltarel, Alessandro Tundo, Marco Mobilio, Leonardo Mariani

Categories: Neural and Evolutionary Computing | Artificial Intelligence | Machine Learning

2021-10-06

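Hierarchical Temporal Memory (HTM) is an unsupervised learning algorithm inspired by the features of the neocortex that can be used to continuously process stream data and detect anomalies, without requiring a large amount of training data or labeled data. HTM is also able to learn continuously from samples, providing a model that is always up to date with respect to observations. These characteristics make HTM particularly suitable for supporting online failure prediction in cloud systems, which are systems with dynamically changing behavior that must be monitored to anticipate problems. This paper presents the first systematic study that assesses HTM in the context of failure prediction. The results obtained by considering 72 HTM configurations applied to 12 different types of faults introduced in the Clearwater cloud system show that HTM can help predict failures with sufficient effectiveness (F-measure = 0.76), representing an interesting practical alternative to (semi-)supervised algorithms.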
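
As a reminder of how an F-measure such as the one reported above is obtained, the sketch below computes it from counts of true positives, false positives, and false negatives. The counts in the example are made up and are not taken from the paper.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-beta score from raw counts; beta=1 gives the usual F1."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: 8 failures predicted correctly, 2 false alarms, 2 misses.
print(round(f_measure(tp=8, fp=2, fn=2), 2))  # 0.8
```

Because the score is a harmonic mean, it stays low unless precision and recall are both reasonably high.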
