Smart Paper Notes

Anomaly Detection for Fraud in Cryptocurrency Time Series

Eran Kaufman , Andrey Iaremenko

Categories: Machine Learning

2022-07-23

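Since the inception of Bitcoin in 2009, the cryptocurrency market has grown beyond initial expectations, with daily trading exceeding $10 billion. As the industry becomes automated, the need for an automated fraud detector has become very apparent. Detecting anomalies in real time prevents potential accidents and economic losses. Anomaly detection in multivariate time series data poses a particular challenge because it requires considering temporal dependencies and the relationships between variables simultaneously. Identifying an anomaly in real time is not an easy task, in particular because the exact anomalous behavior to look for varies. Some points may exhibit global or local anomalous behavior, while others may be anomalous because of their frequency, their seasonal behavior, or a change in trend. In this paper, we propose working on real-time series of Ethereum transactions from specific accounts and survey a wide variety of traditional and new algorithms. We categorize them according to the strategy and the anomalous behavior they search for, and show that when bundled together they can prove to be a good real-time detector, with an alarm time of no more than a few seconds and with very high confidence.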

Translated by Google Translate

Deep Learning for Time Series Anomaly Detection: A Survey

Zahra Zamanzadeh Darban , Geoffrey I. Webb , Shirui Pan , Charu C. Aggarwal , Mahsa Salehi

Categories: Machine Learning | Artificial Intelligence

2022-11-09

Time series anomaly detection has applications in a wide range of research fields, including manufacturing and healthcare. The presence of anomalies can indicate novel or unexpected events, such as production faults, system defects, or heart fluttering, and is therefore of particular interest. The large size and complex patterns of time series have led researchers to develop specialised deep learning models for detecting anomalous patterns. This survey provides a structured and comprehensive overview of state-of-the-art deep learning models for time series anomaly detection. It provides a taxonomy based on the factors that divide anomaly detection models into different categories. Aside from describing the basic anomaly detection technique for each category, the advantages and limitations are also discussed. Furthermore, this study includes examples of deep anomaly detection in time series across various application domains in recent years. Finally, it summarises open research issues and the challenges faced when adopting deep anomaly detection models.


Generative Anomaly Detection for Time Series Datasets

Zhuangwei Kang , Ayan Mukhopadhyay , Aniruddha Gokhale , Shijie Wen , Abhishek Dubey

Categories: Machine Learning | Artificial Intelligence

2022-06-28

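Traffic congestion anomaly detection is of paramount importance in intelligent transportation systems. The goals of transportation agencies are two-fold: to monitor the general traffic conditions in the area of interest and to locate road segments under abnormal congestion states. Modeling congestion patterns can achieve these goals for citywide roadways, which amounts to learning the distribution of multivariate time series (MTS). However, existing works are either not scalable or unable to capture the spatial information in MTS at the same time. To this end, we propose a principled and comprehensive framework consisting of a data-driven generative approach that can perform tractable density estimation for detecting traffic anomalies. Our approach first clusters segments in the feature space and then uses a conditional normalizing flow to identify anomalous temporal snapshots at the cluster level in an unsupervised setting. We then identify anomalies at the segment level by using a kernel density estimator on the anomalous cluster. Extensive experiments on synthetic datasets show that our approach significantly outperforms several state-of-the-art congestion anomaly detection and diagnosis methods in terms of recall and F1 score. We also use the generative model to sample labeled data, which can be used to train classifiers in a supervised setting, alleviating the lack of labeled data for anomaly detection in sparse settings.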
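The segment-level step described above (a kernel density estimator applied inside a cluster already flagged as anomalous) can be illustrated with a minimal sketch. This is not the authors' implementation: the one-dimensional Gaussian kernel, the fixed `bandwidth`, the helper names `gaussian_kde` and `lowest_density_indices`, and the `segment_scores` values are all assumptions made for illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at x from 1-D samples."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def lowest_density_indices(values, bandwidth, k):
    """Return the (sorted) indices of the k values with the lowest density."""
    density = gaussian_kde(values, bandwidth)
    ranked = sorted(range(len(values)), key=lambda i: density(values[i]))
    return sorted(ranked[:k])


# Hypothetical per-segment congestion scores inside one flagged cluster.
segment_scores = [0.9, 1.0, 1.1, 1.05, 0.95, 4.0, 1.0]
anomalous_segments = lowest_density_indices(segment_scores, bandwidth=0.3, k=1)
```

Here the segment at index 5 sits far from the other scores, so it receives the lowest estimated density. In practice the bandwidth and the number of segments to report would need tuning to the data.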


Denoising Architecture for Unsupervised Anomaly Detection in Time-Series

Wadie Skaf , Tomáš Horváth

Categories: Machine Learning | Artificial Intelligence

2022-08-30

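Anomalies in time series provide insight into critical scenarios across a range of industries, from banking and aerospace to information technology, security, and medicine. However, identifying anomalies in time-series data is particularly challenging because of the difficulty of defining anomalies, the frequent lack of labels, and the extremely complex temporal correlations present in such data. The LSTM autoencoder is an encoder-decoder scheme for anomaly detection based on Long Short-Term Memory networks; it learns to reconstruct time-series behavior and then uses the reconstruction error to identify anomalies. We introduce the Denoising Architecture as a complement to this LSTM encoder-decoder model and investigate its effect on real-world as well as artificially generated datasets. We demonstrate that the proposed architecture improves both accuracy and training speed, making the LSTM autoencoder more efficient for unsupervised anomaly detection tasks.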
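The idea of using reconstruction error to identify anomalies can be sketched without a neural network. In the hedged example below, a rolling median stands in for the LSTM autoencoder's reconstruction, and the `mean + k * std` threshold, the helper names, and the sample `series` are assumptions for illustration only; none of them come from the paper.

```python
import statistics


def rolling_median(series, window=5):
    """Stand-in 'reconstruction': median of a centered window around each point."""
    half = window // 2
    out = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        out.append(statistics.median(series[lo:hi]))
    return out


def reconstruction_errors(original, reconstructed):
    """Absolute pointwise error between the input and its reconstruction."""
    return [abs(o - r) for o, r in zip(original, reconstructed)]


def flag_anomalies(errors, k=3.0):
    """Indices whose error exceeds mean + k * (population) standard deviation."""
    threshold = statistics.fmean(errors) + k * statistics.pstdev(errors)
    return [i for i, e in enumerate(errors) if e > threshold]


# Hypothetical series with a single spike at index 5.
series = [1.0, 1.1, 0.9, 1.0, 1.05, 6.0, 1.0, 0.95, 1.1, 1.0, 0.9, 1.0]
errors = reconstruction_errors(series, rolling_median(series))
anomalies = flag_anomalies(errors, k=2.5)
```

With a trained autoencoder, `rolling_median(series)` would be replaced by the model's output for the same window. The thresholding step stays the same, although the choice of `k` is a judgment call.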


Unsupervised Anomaly Detection in Time-series: An Extensive Evaluation and Analysis of State-of-the-art Methods

Nesryne Mejri , Laura Lopez-Fuentes , Kankana Roy , Pavel Chernakov , Enjie Ghorbel , Djamila Aouada

Categories: Machine Learning

2022-12-06

Unsupervised anomaly detection in time-series has been extensively investigated in the literature. Notwithstanding the relevance of this topic in numerous application fields, a complete and extensive evaluation of recent state-of-the-art techniques is still missing. A few efforts have been made to compare existing unsupervised time-series anomaly detection methods rigorously. However, they usually consider only standard performance metrics, namely precision, recall, and F1-score. Aspects essential for assessing practical relevance are therefore neglected. This paper proposes an original and in-depth evaluation study of recent unsupervised anomaly detection techniques in time-series. Instead of relying solely on standard performance metrics, additional informative metrics and protocols are taken into account. In particular, (1) more elaborate performance metrics specifically tailored for time-series are used; (2) the model size and the model stability are studied; (3) an analysis of the tested approaches with respect to the anomaly type is provided; and (4) a clear and unique protocol is followed for all experiments. Overall, this extensive analysis aims to assess the maturity of state-of-the-art time-series anomaly detection, give insights regarding its applicability under real-world setups, and provide the community with a more complete evaluation protocol.


Exploring the Use of Data-Driven Approaches for Anomaly Detection in the Internet of Things (IoT) Environment

Eleonora Achiluzzi , Menglu Li , Md Fahd Al Georgy , Rasha Kashef

Categories: Machine Learning

2022-12-31

The Internet of Things (IoT) is a system that connects physical computing devices, sensors, software, and other technologies. Data can be collected, transferred, and exchanged with other devices over the network without requiring human interaction. One challenge facing the development of IoT is the presence of anomalous data in the network. Research on anomaly detection in the IoT environment has therefore become popular and necessary in recent years. This survey provides an overview of the current progress of different anomaly detection algorithms and how they can be applied in the context of the Internet of Things. We categorize the widely used machine learning and deep learning anomaly detection techniques in IoT into three types: clustering-based, classification-based, and deep-learning-based. For each category, we introduce some state-of-the-art anomaly detection methods and evaluate the advantages and limitations of each technique.


AER: Auto-Encoder with Regression for Time Series Anomaly Detection

Lawrence Wong , Dongyu Liu , Laure Berti-Equille , Sarah Alnegheimish , Kalyan Veeramachaneni

Categories: Machine Learning | Machine Learning (Statistics)

2022-12-27

Anomaly detection on time series data is increasingly common across various industrial domains that monitor metrics in order to prevent potential accidents and economic losses. However, a scarcity of labeled data and ambiguous definitions of anomalies can complicate these efforts. Recent unsupervised machine learning methods have made remarkable progress in tackling this problem using either single-timestamp predictions or time series reconstructions. While traditionally considered separately, these methods are not mutually exclusive and can offer complementary perspectives on anomaly detection. This paper first highlights the successes and limitations of prediction-based and reconstruction-based methods with visualized time series signals and anomaly scores. We then propose AER (Auto-encoder with Regression), a joint model that combines a vanilla auto-encoder and an LSTM regressor to incorporate the successes and address the limitations of each method. Our model can produce bi-directional predictions while simultaneously reconstructing the original time series by optimizing a joint objective function. Furthermore, we propose several ways of combining the prediction and reconstruction errors through a series of ablation studies. Finally, we compare the performance of the AER architecture against two prediction-based methods and three reconstruction-based methods on 12 well-known univariate time series datasets from NASA, Yahoo, Numenta, and UCR. The results show that AER has the highest averaged F1 score across all datasets (a 23.5% improvement compared to ARIMA) while retaining a runtime similar to its vanilla auto-encoder and regressor components. Our model is available in Orion, an open-source benchmarking tool for time series anomaly detection.
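To make the idea of combining prediction and reconstruction errors concrete, here is a small sketch of two possible combination rules. The abstract does not say which rules AER uses, so the min-max scaling, the weighted sum, the pointwise product, the function names, and the sample error values below are all assumptions for illustration.

```python
def min_max_scale(values):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combine_sum(pred_err, rec_err, alpha=0.5):
    """Weighted sum of the two scaled error series (alpha weights prediction)."""
    p, r = min_max_scale(pred_err), min_max_scale(rec_err)
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(p, r)]


def combine_product(pred_err, rec_err):
    """Pointwise product: high only where both error series are high."""
    p, r = min_max_scale(pred_err), min_max_scale(rec_err)
    return [a * b for a, b in zip(p, r)]


# Hypothetical per-timestamp errors from a regressor and an auto-encoder.
pred_err = [0.1, 0.2, 0.1, 2.1, 0.1]
rec_err = [1.0, 1.0, 3.0, 5.0, 1.0]
score_sum = combine_sum(pred_err, rec_err)
score_prod = combine_product(pred_err, rec_err)
```

The sum keeps a timestamp suspicious if either error is large, while the product suppresses timestamps that only one of the two models finds unusual (index 2 in this example).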


RePAD: Real-time Proactive Anomaly Detection for Time Series

Ming-Chang Lee , Jia-Chun Lin , Ernst Gunnar Gran

Categories: Machine Learning | Machine Learning (Statistics)

2020-01-24

During the past decade, many anomaly detection approaches have been introduced in different fields such as network monitoring, fraud detection, and intrusion detection. However, they require an understanding of data patterns and often need a long offline period to build a model or network for the target data. Providing real-time and proactive anomaly detection for streaming time series without human intervention and domain knowledge is highly valuable, since it greatly reduces human effort and enables appropriate countermeasures to be taken before disastrous damage, failure, or another harmful event occurs. However, this issue has not yet been well studied. To address it, this paper proposes RePAD, a Real-time Proactive Anomaly Detection algorithm for streaming time series based on Long Short-Term Memory (LSTM). RePAD uses short-term historical data points to predict and determine whether the upcoming data point is a sign that an anomaly is likely to happen in the near future. By dynamically adjusting the detection threshold over time, RePAD is able to tolerate minor pattern changes in time series and detect anomalies either proactively or on time. Experiments based on two time series datasets collected from the Numenta Anomaly Benchmark demonstrate that RePAD is able to proactively detect anomalies and provide early warnings in real time without human intervention and domain knowledge.
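The notion of a detection threshold that is adjusted dynamically over time can be sketched as follows. This is not RePAD itself: the naive last-value forecast stands in for the LSTM predictor, and the relative-error measure, the `mean + k * std` rule, the `warmup` parameter, and the sample `stream` are assumptions chosen for illustration.

```python
import statistics


def detect_stream(series, warmup=5, k=3.0):
    """Flag time steps whose prediction error exceeds a threshold that is
    recomputed at every step from all earlier errors (mean + k * std)."""
    errors, flagged = [], []
    for t in range(1, len(series)):
        predicted = series[t - 1]  # naive stand-in for an LSTM forecast
        denom = abs(series[t]) if series[t] != 0 else 1.0
        err = abs(series[t] - predicted) / denom  # relative error
        if len(errors) >= warmup:
            threshold = statistics.fmean(errors) + k * statistics.pstdev(errors)
            if err > threshold:
                flagged.append(t)
        errors.append(err)  # the threshold adapts as the history grows
    return flagged


# Hypothetical stream with a spike at index 8.
stream = [10.0, 10.2, 9.9, 10.1, 10.0, 10.2, 9.9, 10.1, 25.0, 10.0, 10.1]
alerts = detect_stream(stream)
```

Because the threshold is rebuilt from the error history at each step, small fluctuations early in the stream are tolerated while the spike is flagged. With a last-value forecast, the step back down after a spike tends to be flagged as well, which is one limitation a learned predictor is meant to reduce.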


Memory-free Online Change-point Detection: A Novel Neural Network Approach

Zahra Atashgahi , Decebal Constantin Mocanu , Raymond Veldhuis , Mykola Pechenizkiy

Categories: Machine Learning | Artificial Intelligence

2022-07-08

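Change-point detection (CPD), which detects abrupt changes in the data distribution, is recognized as one of the most important tasks in time series analysis. Despite the extensive literature on offline CPD, unsupervised online CPD still faces major challenges, including scalability, hyperparameter tuning, and learning constraints. To mitigate some of these challenges, in this paper we propose a novel deep learning approach for unsupervised online CPD from multi-dimensional time series, named Adaptive LSTM-Autoencoder Change-Point Detection (ALACPD). ALACPD uses an LSTM-autoencoder-based neural network to perform unsupervised online CPD. It continuously adapts to incoming samples without keeping previously received inputs, and is therefore memory-free. We perform an extensive evaluation on several real-world time series CPD benchmarks. We show that ALACPD, on average, ranks first among state-of-the-art CPD algorithms in terms of the quality of the time series segmentation, and that it is on par with the best performer in terms of the accuracy of the estimated change-points. The implementation of ALACPD is available online on GitHub at https://github.com/zahraatashgahi/alacpd.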


Deep Learning for Anomaly Detection in Log Data: A Survey

Max Landauer , Sebastian Onder , Florian Skopik , Markus Wurzenberger

Categories: Machine Learning

2022-07-08

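Automatic log file analysis enables early detection of relevant incidents such as system failures. In particular, self-learning anomaly detection techniques capture patterns in log data and subsequently report unexpected log event occurrences to system operators, without the need to provide or manually model anomalous scenarios in advance. Recently, an increasing number of approaches that leverage deep learning neural networks for this purpose have been proposed. These approaches have demonstrated superior detection performance compared with conventional machine learning techniques and at the same time resolve issues with unstable data formats. However, there are many different deep learning architectures, and encoding raw and unstructured log data so that it can be analyzed by neural networks is non-trivial. We therefore carry out a systematic literature review that provides an overview of deployed models, data pre-processing mechanisms, anomaly detection techniques, and evaluations. The survey does not quantitatively compare existing approaches; instead, it aims to help readers understand the relevant aspects of different model architectures and highlights open issues for future work.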


Applications of Signature Methods to Market Anomaly Detection

Erdinc Akyildirim , Matteo Gambara , Josef Teichmann , Syang Zhou

Categories: Machine Learning | Machine Learning (Statistics)

2022-01-07

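Anomaly detection is the process of identifying abnormal instances or events in data sets that deviate from the norm. In this study, we propose a signature-based machine learning algorithm to detect rare or unexpected items in a given data set. We present applications of the signature or randomized signature as feature extractors for anomaly detection algorithms; in addition, we provide a simple, representation-theoretic justification for the construction of randomized signatures. Our first application is based on synthetic data and aims to distinguish between real and fake trajectories of stock prices, which are indistinguishable by visual inspection. We also show a real-life application using transaction data from the cryptocurrency market. In this case, our unsupervised learning algorithm is able to identify pump-and-dump attempts organized on social networks, reaching up to 88%, and thus achieves results close to the state of the art in the field, which is based on supervised learning.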


Smart Meter Data Anomaly Detection using Variational Recurrent Autoencoders with Attention

Wenjing Dai , Xiufeng Liu , Alfred Heller , Per Sieverts Nielsen

Categories: Machine Learning

2022-06-08

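In the digitalization of energy systems, sensors and smart meters are increasingly used to monitor production, operation, and demand. Anomaly detection based on smart meter data is crucial for identifying potential risks and unusual events at an early stage, and can serve as a reference for the timely initiation of appropriate actions and for improving management. However, smart meter data from energy systems often lack labels and contain noise and a variety of patterns without distinct periodicity. At the same time, the vague definition of anomalies in different energy scenarios and highly complex temporal correlations pose a great challenge for anomaly detection. Many traditional unsupervised anomaly detection algorithms, such as cluster-based or distance-based models, are not robust to noise and do not fully exploit the temporal dependency within a time series or the other dependencies among multiple variables (sensors). This paper proposes an unsupervised anomaly detection method based on a variational recurrent autoencoder with an attention mechanism. Given "dirty" data from smart meters, our method pre-detects missing values and global anomalies in order to shrink their contribution during training. The paper makes a quantitative comparison with a VAE-based baseline approach and four other unsupervised learning methods, demonstrating the method's effectiveness and superiority. It further validates the proposed method through a real case study of detecting anomalies in the water supply temperature of an industrial heating plant.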


Outlier Detection using AI: A Survey

Md Nazmul Kabir Sikder , Feras A. Batarseh

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2021-12-01

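An outlier is an event or observation defined as an unusual activity, an intrusion, or a suspicious data point that lies at an irregular distance from a population. The definition of an outlier event, however, is subjective and depends on the application and the domain (energy, health, wireless networks, etc.). It is important to detect outlier events as carefully as possible to avoid infrastructure failures, because anomalous events can cause severe damage to infrastructure. For instance, an attack on a cyber-physical system such as a microgrid may initiate voltage or frequency instability, thereby damaging a smart inverter, which involves very expensive repairs. Unusual activities in microgrids can be mechanical faults, behavior changes in the system, human or instrument errors, or malicious attacks. Accordingly, and because of this variability, outlier detection (OD) is an ever-growing research field. In this chapter, we discuss the progress of OD methods that use AI techniques. To that end, the fundamental concepts of each OD model are introduced through multiple categories. The broad range of OD methods is grouped into six major categories: statistical-based, distance-based, density-based, clustering-based, learning-based, and ensemble methods. For each category, we discuss recent state-of-the-art approaches, their application areas, and their performance. After that, a brief discussion of the advantages, disadvantages, and challenges of each technique is provided, with recommendations on future research directions. This survey aims to guide readers toward a better understanding of recent progress in OD methods for the assurance of AI.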


Anomaly Detection in Power Markets and Systems

Ugur Halden , Umit Cali , Ferhat Ozgur Catak , Salvatore D'Arco , Francisco Bilendo

Categories: Machine Learning

2022-12-05

The widespread use of information and communication technology (ICT) over the last decades has been a primary catalyst behind the digitalization of power systems. Meanwhile, as the utilization rate of the Internet of Things (IoT) continues to rise along with recent advancements in ICT, the need for secure and computationally efficient monitoring of critical infrastructures such as the electrical grid, and of the agents that participate in it, is growing. A cyber-physical system such as the electrical grid may experience anomalies for a number of different reasons, including physical defects, mistakes in measurement and communication, cyberattacks, and other similar occurrences. The goal of this study is to highlight the most common incidents in power systems and to give an overview and classification of the most common ways to find problems, starting at the consumer/prosumer end and working up to the primary power producers. In addition, this article aims to discuss the methods and techniques, such as artificial intelligence (AI), that are used to identify anomalies in power systems and markets.


Artificial Intelligence and Design of Experiments for Assessing Security of Electricity Supply: A Review and Strategic Outlook

Jan Priesmann , Justin Münch , Elias Ridha , Thomas Spiegel , Marius Reich , Mario Adam , Lars Nolting , Aaron Praktiknjo

Categories: Artificial Intelligence

2021-12-07

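Assessing the effects of the energy transition and the liberalization of energy markets on resource adequacy is an increasingly important and demanding task. The rising complexity of energy systems requires adequate methods for energy system modeling, which leads to increased computational requirements. Furthermore, as complexity grows, so does uncertainty, which in turn calls for probabilistic assessments and scenario analyses. To address these various requirements adequately and efficiently, new methods from the field of data science are needed to accelerate current methods. With our systematic literature review, we want to close the gap between three disciplines: (1) assessment of the security of electricity supply, (2) artificial intelligence, and (3) design of experiments. To this end, we conduct a large-scale quantitative review of selected fields of application and produce a synthesis that relates the different disciplines to one another. Among other findings, we identify the metamodeling of complex security-of-electricity-supply models using AI methods, and the application of AI-based methods to storage dispatch and (non-)availabilities, as fields of application that have not yet been sufficiently covered. We conclude by deriving a new methodological pipeline for adequately and efficiently addressing present and upcoming challenges in the assessment of the security of electricity supply.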


A Comparative Study of Detecting Anomalies in Time Series Data Using LSTM and TCN Models

Saroj Gopali , Faranak Abri , Sima Siami-Namini , Akbar Siami Namin

Categories: Machine Learning

2021-12-17

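Several data-driven approaches exist that enable us to model time series data, including traditional regression-based modeling approaches (e.g., ARIMA). Recently, deep learning techniques have been introduced and explored in the context of time series analysis and prediction. A major research question is how these many variations of deep learning techniques perform in predicting time series data. This paper compares two prominent deep learning modeling techniques: the recurrent neural network (RNN)-based Long Short-Term Memory (LSTM) and the convolutional neural network (CNN)-based Temporal Convolutional Network (TCN). Their performance and training time are reported. According to our experimental results, both modeling techniques perform comparably, with TCN-based models slightly outperforming LSTM. Moreover, the CNN-based TCN model builds a stable model faster than the RNN-based LSTM models.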


Detecting Anomalous Cryptocurrency Transactions: an AML/CFT Application of Machine Learning-based Forensics

Nadia Pocher , Mirko Zichichi , Fabio Merizzi , Muhammad Zohaib Shafiq , Stefano Ferretti

Categories: Machine Learning

2022-06-07

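The rise of blockchain and distributed ledger technologies (DLTs) in the financial sector has generated a socio-economic shift that has triggered legal concerns and regulatory initiatives. While the anonymity of DLTs may safeguard the right to privacy, data protection, and other civil liberties, the lack of identification hinders accountability, investigation, and enforcement. The resulting challenges extend to the rules for combating money laundering and the financing of terrorism and proliferation (AML/CFT). As law enforcement agencies and analytics companies have begun to successfully apply forensics to track currency across blockchain ecosystems, in this paper we focus on the increasing relevance of these techniques. In particular, we offer insights into the application of machine learning, network analysis, and transaction graph analysis to the Internet of Money (IoM). After providing some background on the notion of anonymity in the IoM and on the interplay between AML/CFT and blockchain forensics, we focus on the anomaly detection approaches that lead to our experiments. Specifically, we analyze a real-world dataset of Bitcoin transactions with various machine learning techniques. Our claim is that the AML/CFT domain could benefit from novel graph analysis methods in machine learning. Indeed, our findings show that the Graph Convolutional Network (GCN) and Graph Attention Network (GAT) neural network types represent a promising solution for AML/CFT compliance.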


IoT Data Analytics in Dynamic Environments: From An Automated Machine Learning Perspective

Li Yang , Abdallah Shami

Categories: Machine Learning

2022-09-16

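In recent years, with the widespread adoption of sensors and smart devices, the data generation speed of Internet of Things (IoT) systems has increased dramatically. In IoT systems, massive volumes of data must be processed, transformed, and analyzed on a frequent basis to enable various IoT services and functionalities. Machine learning (ML) approaches have shown their capacity for IoT data analytics. However, applying ML models to IoT data analytics tasks still faces many difficulties and challenges, in particular effective model selection, design/tuning, and updating, which have created a massive demand for experienced data scientists. Additionally, the dynamic nature of IoT data may introduce concept drift issues, causing model performance degradation. To reduce human effort, Automated Machine Learning (AutoML) has become a popular field that aims to automatically select, construct, tune, and update machine learning models to achieve the best performance on specified tasks. In this paper, we review existing methods for the model selection, tuning, and updating procedures in the area of AutoML, in order to identify and summarize the best solutions for every step of applying ML algorithms to IoT data analytics. To justify our findings and help industrial users and researchers better implement AutoML approaches, a case study of applying AutoML to IoT anomaly detection problems is conducted in this work. Lastly, we discuss and classify the challenges and research directions for this domain.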


Review of Time Series Forecasting Methods and Their Applications to Particle Accelerators

Sichen Li , Andreas Adelmann

Categories: Machine Learning

2022-09-21

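Particle accelerators are complex facilities that produce large amounts of structured data and have clear optimization goals as well as precisely defined control requirements. They are therefore naturally amenable to data-driven research methodologies. The data from sensors and monitors inside the accelerator form multivariate time series. With fast pre-emptive approaches being highly preferred in accelerator control and diagnostics, the application of data-driven time series forecasting methods is particularly promising. This review formulates the time series forecasting problem and summarizes existing models together with their applications in various scientific areas. Several current and future attempts in the field of particle accelerators are introduced. The application of time series forecasting to particle accelerators has shown encouraging results and promise for broader use, and existing problems such as data consistency and compatibility have started to be addressed.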


A Comprehensive Survey of Graph-based Deep Learning Approaches for Anomaly Detection in Complex Distributed Systems

Armin Danesh Pazho , Ghazal Alinezhad Noghre , Arnab A Purkayastha , Jagannadh Vempati , Otto Martin , Hamed Tabkhi

Categories: Machine Learning

2022-06-08

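Anomaly detection is an important problem for complex distributed systems consisting of hardware and software components. A thorough understanding of the requirements and challenges of anomaly detection for such systems is pivotal to system security, especially for real-world deployment. Although many research areas and application domains deal with the problem, few have attempted to provide an in-depth look at such systems. Most anomaly detection techniques have been developed specifically for certain application domains, while others are more generic. In this survey, we explore the significant potential of graph-based algorithms to identify and mitigate different types of anomalies in complex distributed heterogeneous systems. Our main focus is to provide an in-depth look at graphs when applied to heterogeneous computing devices spread across complex distributed systems. This study analyzes, compares, and contrasts the state-of-the-art research articles in the field. First, we describe the characteristics of real-world distributed systems and the specific challenges of anomaly detection in such complex networks, such as data and evaluation, the nature of anomalies, and real-world requirements. Next, we discuss why graphs can be leveraged in such systems and the benefits of using them. We then delve into the state-of-the-art approaches and highlight their strengths and weaknesses. Finally, we evaluate and compare these approaches and point out areas for possible improvement.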
