Mechanical devices such as engines, vehicles, and aircraft are typically instrumented with numerous sensors to capture the behavior and health of the machine. However, there are often external factors or variables that are not captured by sensors, leading to time-series that are inherently unpredictable. For instance, manual controls and/or unmonitored environmental conditions or load may lead to inherently unpredictable time-series. Detecting anomalies in such scenarios is challenging with standard approaches based on mathematical models that rely on stationarity, or prediction models that use prediction errors to detect anomalies. We propose a Long Short Term Memory Networks based Encoder-Decoder scheme for Anomaly Detection (EncDec-AD) that learns to reconstruct 'normal' time-series behavior, and thereafter uses reconstruction error to detect anomalies. We experiment with three publicly available quasi-predictable time-series datasets (power demand, space shuttle, and ECG) and two real-world engine datasets with both predictable and unpredictable behavior. We show that EncDec-AD is robust and can detect anomalies in predictable, unpredictable, periodic, aperiodic, and quasi-periodic time-series. Further, we show that EncDec-AD is able to detect anomalies in short time-series (length as small as 30) as well as long time-series (length as large as 500).
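The reconstruction-error scoring idea can be sketched as follows. This is a minimal illustration that assumes reconstruction errors are already available from some trained model; the LSTM encoder-decoder itself is not shown. The function names `fit_error_model` and `anomaly_scores`, the Gaussian (Mahalanobis-style) treatment of the error vectors, the synthetic data, and the 99th-percentile threshold are all assumptions made for illustration, not details stated in the abstract.

```python
import numpy as np

def fit_error_model(errors):
    """Estimate mean and inverse covariance of error vectors from normal data.

    errors: array of shape (n_points, n_dims), e.g. |x - x_reconstructed|.
    """
    mu = errors.mean(axis=0)
    # A small ridge keeps the covariance invertible (hypothetical choice).
    cov = np.cov(errors, rowvar=False) + 1e-6 * np.eye(errors.shape[1])
    return mu, np.linalg.inv(cov)

def anomaly_scores(errors, mu, cov_inv):
    """Squared Mahalanobis distance of each error vector from the normal-error model."""
    d = errors - mu
    return np.einsum("ij,jk,ik->i", d, cov_inv, d)

# Synthetic stand-ins for reconstruction errors (assumption: not real sensor data).
rng = np.random.default_rng(0)
normal_err = np.abs(rng.normal(0.0, 0.1, size=(500, 2)))
test_err = np.vstack([np.abs(rng.normal(0.0, 0.1, size=(5, 2))),
                      [[1.5, 1.2]]])  # last row mimics a badly reconstructed point

mu, cov_inv = fit_error_model(normal_err)
scores = anomaly_scores(test_err, mu, cov_inv)
# Threshold taken from the normal data itself (hypothetical 99th percentile).
threshold = np.percentile(anomaly_scores(normal_err, mu, cov_inv), 99)
flags = scores > threshold
```

In this sketch the point with the large error receives the highest score; how the threshold is chosen in practice depends on the tolerated false-alarm rate.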
translated by 谷歌翻译
Videos represent the primary source of information for surveillance applications. Video material is often available in large quantities but in most cases it contains little or no annotation for supervised learning. This article reviews the state-of-the-art deep learning based methods for video anomaly detection and categorizes them based on the type of model and criteria of detection. We also perform simple studies to understand the different approaches and provide the criteria of evaluation for spatio-temporal anomaly detection.
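Nowadays, multivariate time series data are increasingly collected in various real-world systems, such as power plants and wearable devices. Anomaly detection and diagnosis in multivariate time series refers to identifying abnormal status at certain time steps and pinpointing the root causes. Building such a system is challenging, however, because it must not only capture the temporal dependency within each time series but also encode the correlations between different pairs of time series. In addition, the system should be robust to noise and provide operators with different levels of anomaly scores based on the severity of different incidents. Although many unsupervised anomaly detection algorithms have been developed, few of them can jointly address these challenges. In this paper, we propose a Multi-Scale Convolutional Recurrent Encoder-Decoder (MSCRED) to perform anomaly detection and diagnosis in multivariate time series data. Specifically, MSCRED first constructs multi-scale (resolution) signature matrices to characterize multiple levels of the system status at different time steps. Subsequently, given the signature matrices, a convolutional encoder is employed to encode the inter-sensor (time series) correlations, and an attention-based Convolutional Long Short-Term Memory (ConvLSTM) network is developed to capture the temporal patterns. Finally, based on the feature maps that encode the inter-sensor correlations and temporal information, a convolutional decoder is used to reconstruct the input signature matrices, and the residual signature matrices are further used to detect and diagnose anomalies. Extensive empirical studies on a synthetic dataset and a real power plant dataset demonstrate that MSCRED can outperform state-of-the-art baseline methods.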
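The construction of multi-scale signature matrices can be sketched as below. This is an illustrative sketch only: it assumes each signature matrix holds the pairwise inner products between sensor series over a trailing window, scaled by the window length, and the window lengths (10, 30, 60), the function names, and the synthetic data are hypothetical choices rather than details given in the abstract.

```python
import numpy as np

def signature_matrix(series, t, window):
    """Pairwise inner products of the last `window` samples ending at step t.

    series: array of shape (n_sensors, n_steps).
    Returns an (n_sensors, n_sensors) matrix.
    """
    seg = series[:, t - window + 1 : t + 1]
    return seg @ seg.T / window

def multi_scale_signatures(series, t, windows=(10, 30, 60)):
    """Stack one signature matrix per window length (one per scale)."""
    return np.stack([signature_matrix(series, t, w) for w in windows])

rng = np.random.default_rng(0)
series = rng.normal(size=(4, 200))  # 4 sensors, 200 time steps (synthetic)
sig = multi_scale_signatures(series, t=199)
```

The result has one channel per scale, so it can be fed to a convolutional encoder in the same way as a multi-channel image; the residual between such an input and its reconstruction is what the abstract describes using for detection and diagnosis.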
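The need for autonomous and general-purpose anomaly detection systems has grown with the large volume of data being collected. However, developing a standalone, general-purpose anomaly detection system that is both accurate and fast remains a challenge. In this paper, we propose combining two conventional time-series analysis approaches, the Seasonal Autoregressive Integrated Moving Average (SARIMA) model and Seasonal-Trend decomposition using Loess (STL), to detect complex and varied anomalies. SARIMA and STL are usually applied only to stationary and periodic time series, but we show that, when combined, they can detect anomalies with high accuracy even in noisy and non-periodic data. We compare the algorithm with Long Short-Term Memory (LSTM), a deep-learning-based algorithm used in anomaly detection systems. In total, we use seven real-world datasets and four artificial datasets with different time-series properties to verify the performance of the proposed algorithm.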
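Residual-based detection after a seasonal decomposition can be sketched as follows. Note that this is not STL or SARIMA: as a stand-in, the sketch removes a per-phase median (a much cruder seasonal estimate than Loess) and flags residuals with a robust z-score. The function names, the period of 24, the cutoff `k=5.0`, and the synthetic series are assumptions for illustration.

```python
import numpy as np

def seasonal_residuals(x, period):
    """Subtract a per-phase median seasonal profile (simplified stand-in for STL)."""
    x = np.asarray(x, dtype=float)
    phase = np.arange(len(x)) % period
    seasonal = np.array([np.median(x[phase == p]) for p in range(period)])
    return x - seasonal[phase]

def flag_anomalies(resid, k=5.0):
    """Flag residuals whose robust z-score (median/MAD based) exceeds k."""
    med = np.median(resid)
    mad = np.median(np.abs(resid - med)) + 1e-12  # guard against zero MAD
    return np.abs(resid - med) / (1.4826 * mad) > k

# Synthetic periodic series with one injected spike (assumption: illustrative data).
rng = np.random.default_rng(0)
t = np.arange(240)
x = np.sin(2 * np.pi * t / 24) + rng.normal(0.0, 0.05, size=t.size)
x[100] += 5.0

resid = seasonal_residuals(x, period=24)
flags = flag_anomalies(resid)
```

A production pipeline would more likely rely on a library implementation of STL and a fitted SARIMA model; the point here is only the decompose-then-score-residuals pattern.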
Surveillance videos are able to capture a variety of realistic anomalies. In this paper, we propose to learn anomalies by exploiting both normal and anomalous videos. To avoid annotating the anomalous segments or clips in training videos, which is very time consuming, we propose to learn anomalies through a deep multiple instance ranking framework by leveraging weakly labeled training videos, i.e. the training labels (anomalous or normal) are at video-level instead of clip-level. In our approach, we consider normal and anomalous videos as bags and video segments as instances in multiple instance learning (MIL), and automatically learn a deep anomaly ranking model that predicts high anomaly scores for anomalous video segments. Furthermore, we introduce sparsity and temporal smoothness constraints in the ranking loss function to better localize anomalies during training. We also introduce a new large-scale, first-of-its-kind dataset of 128 hours of videos. It consists of 1900 long and untrimmed real-world surveillance videos, with 13 realistic anomalies such as fighting, road accident, burglary, robbery, etc. as well as normal activities. This dataset can be used for two tasks: first, general anomaly detection, considering all anomalies in one group and all normal activities in another group; second, recognizing each of the 13 anomalous activities. Our experimental results show that our MIL method for anomaly detection achieves significant improvement in anomaly detection performance compared to the state-of-the-art approaches. We provide the results of several recent deep learning baselines on anomalous activity recognition. The low recognition performance of these baselines reveals that our dataset is very challenging and opens more opportunities for future work. The dataset is available at: http://crcv.ucf.edu/projects/real-world/
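A ranking loss with sparsity and temporal smoothness terms of the kind described above can be sketched as follows. This is a hedged sketch: it assumes a hinge-style ranking between the highest-scoring segment of an anomalous bag and the highest-scoring segment of a normal bag, and the function name, the margin of 1, and the weights `lam_smooth` and `lam_sparse` are hypothetical values, not taken from the abstract.

```python
import numpy as np

def mil_ranking_loss(anom_scores, norm_scores, lam_smooth=1e-4, lam_sparse=1e-4):
    """Bag-level ranking loss on per-segment anomaly scores in [0, 1].

    anom_scores: scores for the segments of one anomalous video (bag).
    norm_scores: scores for the segments of one normal video (bag).
    """
    anom = np.asarray(anom_scores, dtype=float)
    norm = np.asarray(norm_scores, dtype=float)
    # Top anomalous segment should outrank the top normal segment by a margin.
    hinge = max(0.0, 1.0 - anom.max() + norm.max())
    # Temporal smoothness: adjacent segments should have similar scores.
    smooth = np.sum(np.diff(anom) ** 2)
    # Sparsity: only a few segments of an anomalous video should score high.
    sparse = np.sum(anom)
    return hinge + lam_smooth * smooth + lam_sparse * sparse

# A well-ordered pair of bags versus a badly ordered pair.
good = mil_ranking_loss([0.1, 0.9, 0.1], [0.1, 0.2, 0.1])
bad = mil_ranking_loss([0.1, 0.2, 0.1], [0.1, 0.9, 0.1])
```

In training, the scores would come from a network evaluated on segment features and the loss would be averaged over many bag pairs; here the scores are hard-coded to show that the well-ordered pair yields the smaller loss.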
Reliable uncertainty estimation for time series prediction is critical in many fields, including physics, biology, and manufacturing. At Uber, probabilistic time series forecasting is used for robust prediction of number of trips during special events, driver incentive allocation, as well as real-time anomaly detection across millions of metrics. Classical time series models are often used in conjunction with a probabilistic formulation for uncertainty estimation. However, such models are hard to tune, scale, and add exogenous variables to. Motivated by the recent resurgence of Long Short Term Memory networks, we propose a novel end-to-end Bayesian deep model that provides time series prediction along with uncertainty estimation. We provide detailed experiments of the proposed solution on completed trips data, and successfully apply it to large-scale time series anomaly detection at Uber.
We present a novel unsupervised deep learning framework for anomalous event detection in complex video scenes. While most existing works merely use hand-crafted appearance and motion features, we propose Appearance and Motion DeepNet (AMDN) which utilizes deep neural networks to automatically learn feature representations. To exploit the complementary information of both appearance and motion patterns, we introduce a novel double fusion framework, combining both the benefits of traditional early fusion and late fusion strategies. Specifically, stacked denoising autoencoders are proposed to separately learn both appearance and motion features as well as a joint representation (early fusion). Based on the learned representations, multiple one-class SVM models are used to predict the anomaly scores of each input, which are then integrated with a late fusion strategy for final anomaly detection. We evaluate the proposed method on two publicly available video surveillance datasets, showing competitive performance with respect to state of the art approaches.
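Wireless sensor networks (WSNs) are fundamental to the Internet of Things (IoT), bridging the gap between the physical and the cyber worlds. Anomaly detection is a critical task in this context because it is responsible for identifying various events of interest, such as equipment faults and undiscovered phenomena. However, the task is challenging because of the elusive nature of anomalies and the volatility of the ambient environment. In a resource-scarce setting like a WSN, the challenge is further elevated, which weakens the suitability of many existing solutions. In this paper, for the first time, we introduce autoencoder neural networks into WSNs to solve the anomaly detection problem. We design a two-part algorithm that resides on the sensors and the IoT cloud respectively, such that (i) anomalies can be detected at the sensors in a fully distributed manner, without communicating with any other sensors or the cloud, and (ii) the relatively more computation-intensive learning task can be handled by the cloud at a much lower (and configurable) frequency. In addition to the minimal communication overhead, the computational load on the sensors is also very low (of polynomial complexity) and readily affordable by most COTS sensors. Using a real WSN indoor testbed and sensor data collected over 4 consecutive months, we demonstrate experimentally that the proposed autoencoder-based anomaly detection mechanism achieves high detection accuracy and a low false alarm rate. It is also able to adapt to unforeseen and new changes in a non-stationary environment, thanks to the unsupervised learning capability of the chosen autoencoder neural networks.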
We are seeing an enormous increase in the availability of streaming, time-series data. Largely driven by the rise of connected real-time data sources, this data presents technical challenges and opportunities. One fundamental capability for streaming analytics is to model each stream in an unsupervised fashion and detect unusual, anomalous behaviors in real-time. Early anomaly detection is valuable, yet it can be difficult to execute reliably in practice. Application constraints require systems to process data in real-time, not batches. Streaming data inherently exhibits concept drift, favoring algorithms that learn continuously. Furthermore, the massive number of independent streams in practice requires that anomaly detectors be fully automated. In this paper we propose a novel anomaly detection algorithm that meets these constraints. The technique is based on an online sequence memory algorithm called Hierarchical Temporal Memory (HTM). We also present results using the Numenta Anomaly Benchmark (NAB), a benchmark containing real-world data streams with labeled anomalies. The benchmark, the first of its kind, provides a controlled open-source environment for testing anomaly detection algorithms on streaming data. We present results and analysis for a wide range of algorithms on this benchmark, and discuss future challenges for the emerging field of streaming analytics.
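In this work, we investigate unsupervised representation learning on medical time series, which holds the promise of leveraging large amounts of existing unlabeled data to eventually assist clinical decision making. By evaluating on the prediction of clinically relevant outcomes, we show that, in a practical setting, unsupervised representation learning can offer clear performance benefits over end-to-end supervised architectures. We experiment with sequence-to-sequence (Seq2Seq) models used in two different ways, as an autoencoder and as a forecaster, and show that the best performance is achieved by a forecasting Seq2Seq model with an integrated attention mechanism, proposed here for the first time in the setting of unsupervised learning for medical time series.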
Anomaly detection in database management systems (DBMSs) is difficult because of the increasing number of statistics (stat) and event metrics in big data systems. In this paper, I propose an automatic DBMS diagnosis system that detects anomaly periods with abnormal DB stat metrics and finds causal events in those periods. Reconstruction error from a deep autoencoder and a statistical process control approach are applied to detect time periods with anomalies. Related events are found using time series similarity measures between events and abnormal stat metrics. After training the deep autoencoder with DBMS metric data, the efficacy of anomaly detection is investigated on other DBMSs containing anomalies. Experimental results show the effectiveness of the proposed model, especially its batch temporal normalization layer. The proposed model is used for publishing automatic DBMS diagnosis reports in order to determine DBMS configuration and SQL tuning.
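Classifying time series data with neural networks is a challenging problem when the length of the data varies. Video object trajectories, which are key to visual surveillance applications, are often found to be of varying length. If such trajectories are used to understand the behavior (normal or anomalous) of moving objects, they need to be represented correctly. In this paper, we propose video object trajectory classification and anomaly detection using a hybrid Convolutional Neural Network (CNN) and Variational Autoencoder (VAE) architecture. First, we introduce a high-level representation of object trajectories in color gradient form. In the next stage, a semi-supervised way of annotating moving object trajectories extracted using Temporal Unknown Incremental Clustering (TUIC) is applied for trajectory class labeling. Anomalous trajectories are separated using t-Distributed Stochastic Neighbor Embedding (t-SNE). Finally, a hybrid CNN-VAE architecture is used for trajectory classification and anomaly detection. Results obtained on public surveillance video datasets show that the proposed method can successfully identify some important traffic anomalies, such as vehicles not following lane driving, sudden speed variations, abrupt termination of vehicle movement, and vehicles moving in the wrong direction. The proposed method detects these anomalies with higher accuracy than existing anomaly detection methods.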
This paper introduces a generic and scalable framework for automated anomaly detection on large scale time-series data. Early detection of anomalies plays a key role in maintaining consistency of person's data and protects corporations against malicious attackers. Current state-of-the-art anomaly detection approaches suffer from scalability issues, use-case restrictions, difficulty of use and a large number of false positives. Our system at Yahoo, EGADS, uses a collection of anomaly detection and forecasting models with an anomaly filtering layer for accurate and scalable anomaly detection on time-series. We compare our approach against other anomaly detection systems on real and synthetic data with varying time-series characteristics. We found that our framework allows for 50-60% improvement in precision and recall for a variety of use-cases. Both the data and the framework are being open-sourced. The open-sourcing of the data, in particular, represents the first-of-its-kind effort to establish the standard benchmark for anomaly detection.
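Timely prediction of clinically critical events in the Intensive Care Unit (ICU) is important for improving care and survival rates. Most existing approaches apply various classification methods to statistical features explicitly extracted from vital signals. In this work, we propose to eliminate the high cost of engineering hand-crafted features from multivariate vital-signal time series by learning their representations with a sequence-to-sequence autoencoder. We then propose to analyze the learned representations using signal similarity evaluation in order to predict critical events. We apply this methodological framework to Acute Hypotensive Episodes (AHE) on a large and diverse dataset of vital signal recordings. Experiments demonstrate the ability of the proposed framework to accurately predict an upcoming AHE.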
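We formulate the abnormal event detection problem as an outlier detection task and propose a two-stage algorithm based on k-means clustering and one-class Support Vector Machines (SVM) to eliminate outliers. After extracting motion features from training videos containing only normal events, we apply k-means clustering to find clusters representing the different types of motion. In the first stage, we consider that clusters with fewer samples (with respect to a given threshold) contain only outliers, and we eliminate these clusters altogether. In the second stage, we shrink the borders of the remaining clusters by training a one-class SVM model on each cluster. To detect abnormal events in a test video, we analyze each test sample and consider the maximum normality score provided by the trained one-class SVM models, based on the intuition that a test sample can belong to only one cluster of normal motion. If the test sample does not fit well in any of the narrowed clusters, it is labeled as abnormal. We also combine our approach based on motion features with a recent approach based on deep appearance features extracted with pre-trained convolutional neural networks (CNN). We combine our two-stage algorithm with the deep framework using a late fusion strategy, keeping the pipelines of the two approaches independent. We compare our method with several state-of-the-art supervised and unsupervised methods on four benchmark datasets. The empirical results indicate that our abnormal event detection framework achieves better results in most cases, while processing the test video in real time at 32 frames per second on a CPU.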
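The prevalence of networked sensors and actuators in many real-world systems, such as smart buildings, factories, power plants, and data centers, generates substantial amounts of multivariate time series data for these systems. The rich sensor data can be continuously monitored to detect intrusion events. However, conventional threshold-based anomaly detection methods are inadequate because of the dynamic complexity of these systems, while supervised machine learning methods cannot exploit the large amounts of data because labeled data are lacking. On the other hand, current unsupervised machine learning approaches have not fully exploited the spatial-temporal correlations and other dependencies among the multiple variables (sensors/actuators) in the system for detecting anomalies. In this work, we propose an unsupervised multivariate anomaly detection method based on Generative Adversarial Networks (GANs). Instead of treating each data stream independently, our proposed MAD-GAN framework considers the entire variable set concurrently to capture the latent interactions among the variables. We also make full use of both the generator and the discriminator produced by the GAN, using an anomaly score called the DR-score to detect anomalies through discrimination and reconstruction. We tested the proposed MAD-GAN on two recent datasets collected from real-world cyber-physical systems (CPS): the Secure Water Treatment (SWaT) and the Water Distribution (WADI) datasets. Our experimental results show that the proposed MAD-GAN is effective in reporting anomalies caused by various cyber intrusions in these complex real-world systems.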
Due to the continued digitization of industrial and societal processes, including the deployment of networked sensors, we are witnessing a rapid proliferation of time-ordered observations, known as time series. For example, the behavior of drivers can be captured by GPS or accelerometer as a time series of speeds, directions, and accelerations. We propose a framework for outlier detection in time series that, for example, can be used for identifying dangerous driving behavior and hazardous road locations. Specifically, we first propose a method that generates statistical features to enrich the feature space of raw time series. Next, we utilize an autoencoder to reconstruct the enriched time series. The autoencoder performs dimensionality reduction to capture, using a small feature space, the most representative features of the enriched time series. As a result, the reconstructed time series only capture representative features, whereas outliers often have non-representative features. Therefore, deviations of the enriched time series from the reconstructed time series can be taken as indicators of outliers. We propose and study autoencoders based on convolutional neural networks and long short-term memory neural networks. In addition, we show that embedding contextual information into the framework has the potential to further improve the accuracy of identifying outliers. We report on empirical studies with multiple time series data sets, which offer insight into the design properties of the proposed framework and indicate that it is effective at detecting outliers.
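The feature-enrichment step can be sketched as follows. The abstract does not say which statistical features are generated, so the choice here (a rolling mean and a rolling standard deviation over a trailing window) is an assumption, as are the function name `enrich` and the window length; the autoencoder that would reconstruct the enriched series is not shown.

```python
import numpy as np

def enrich(series, window=5):
    """Append a rolling mean and rolling std to each raw observation.

    Returns an array of shape (len(series) - window + 1, 3) whose columns are
    the raw value, the mean and the population std of the trailing window.
    """
    x = np.asarray(series, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(x, window)
    return np.column_stack([x[window - 1:], windows.mean(axis=1), windows.std(axis=1)])

# Tiny deterministic example: the series 0, 1, ..., 9.
enriched = enrich(np.arange(10.0), window=5)
```

Each row of `enriched` would then be reconstructed by the autoencoder, and the deviation between a row and its reconstruction would serve as the outlier indicator.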
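Despite being inherently ill-defined, anomaly detection is a research endeavor of considerable interest in both machine learning and visual scene understanding. Most commonly, anomaly detection is regarded as the detection of outliers within a given data distribution based on some measure of normality. The most significant challenge in real-world anomaly detection problems is that the available data are highly imbalanced towards normality (i.e. non-anomalous) and contain at most a subset of all possible anomalous samples, which limits the use of well-established supervised learning methods. By contrast, we introduce an unsupervised anomaly detection model that is trained only on normal (non-anomalous, plentiful) samples in order to learn the normality distribution of the domain and hence detect abnormality based on deviation from this model. Our proposed approach employs an encoder-decoder convolutional neural network with skip connections to thoroughly capture the multi-scale distribution of the normal data in high-dimensional image space. Furthermore, using an adversarial training scheme for this architecture provides superior reconstruction both in high-dimensional image space and in a lower-dimensional latent vector space encoding. Minimizing the reconstruction error metric in both the image and the latent vector spaces during training helps the model learn the distribution of normality as required. Higher reconstruction metrics during subsequent testing and deployment therefore indicate a deviation from this normal distribution, and hence an anomaly. Experiments on established anomaly detection benchmarks and challenging real-world datasets, in the context of X-ray security screening, show the unique promise of the proposed approach.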
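This paper proposes a novel optimization principle and its implementation for unsupervised anomaly detection in sound (ADS) using an autoencoder (AE). The goal of unsupervised ADS is to detect unknown anomalous sounds without training data for anomalous sounds. Using an AE as a model of normal sound is a state-of-the-art technique for unsupervised ADS. To decrease the false positive rate (FPR), the AE is trained to minimize the reconstruction error of normal sounds, and the anomaly score is calculated as the reconstruction error of the observed sound. Unfortunately, because this training procedure does not take into account the anomaly score of anomalous sounds, the true positive rate (TPR) does not necessarily increase. In this study, we define an objective function based on the Neyman-Pearson lemma by treating ADS as a statistical hypothesis test. The proposed objective function trains the AE to maximize the TPR under an arbitrarily low FPR condition. To calculate the TPR in the objective function, we consider the set of anomalous sounds to be the complement of the set of normal sounds and simulate anomalous sounds using a rejection sampling algorithm. Through experiments on synthetic data, we found that the proposed method improved the performance measures of ADS under low FPR conditions. In addition, we confirmed that the proposed method can detect anomalous sounds in real environments.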
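The operating point implied by "maximize the TPR under a low FPR condition" can be sketched as below. The sketch covers evaluation only: it fixes the decision threshold so that roughly a target fraction of normal scores exceed it, then measures the TPR on simulated anomalous scores. It does not implement the proposed training objective or the rejection sampling; the function names, the Gaussian score distributions, and the 1% target are hypothetical.

```python
import numpy as np

def threshold_for_fpr(normal_scores, target_fpr):
    """Score threshold exceeded by about `target_fpr` of the normal samples."""
    return np.quantile(normal_scores, 1.0 - target_fpr)

def true_positive_rate(anomalous_scores, threshold):
    """Fraction of anomalous samples whose score exceeds the threshold."""
    return float(np.mean(np.asarray(anomalous_scores) > threshold))

# Synthetic anomaly scores (assumption: stand-ins for AE reconstruction errors).
rng = np.random.default_rng(0)
normal_scores = rng.normal(1.0, 0.3, size=10_000)
anomalous_scores = rng.normal(2.5, 0.5, size=1_000)

thr = threshold_for_fpr(normal_scores, target_fpr=0.01)
fpr = float(np.mean(normal_scores > thr))
tpr = true_positive_rate(anomalous_scores, thr)
```

Training that pushes anomalous scores above this threshold while keeping normal scores below it is what raises the TPR at the fixed FPR; a plain reconstruction-error objective only controls the normal side.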
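This paper presents a study on detecting cyber attacks on industrial control systems (ICS) using unsupervised deep neural networks, specifically convolutional neural networks. The study was performed on the Secure Water Treatment testbed (SWaT) dataset, which represents a scaled-down version of a real-world industrial water treatment plant. We propose an anomaly detection method based on measuring the statistical deviation of the predicted value from the observed value. We apply the proposed method using a variety of deep neural network architectures, including different variants of convolutional and recurrent networks. The test data from SWaT include 36 different cyber attacks. The proposed method successfully detects the vast majority of the attacks with a low false positive rate, improving on previous work based on this dataset. The results of the study show that 1D convolutional networks can be successfully applied to anomaly detection in industrial control systems and outperform more complex recurrent networks while being much smaller and faster to train.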