Smart Paper Notes

A Hybrid Deep Learning Anomaly Detection Framework for Intrusion Detection

Rahul Kale , Zhi Lu , Kar Wai Fok , Vrizlynn L. L. Thing

Categories: Artificial Intelligence | Machine Learning

2022-12-02

Cyber intrusion attacks that compromise users' critical and sensitive data are escalating in volume and intensity, especially with the growing connections between our daily life and the Internet. The large volume and high complexity of such intrusion attacks have impeded the effectiveness of most traditional defence techniques. At the same time, the remarkable performance of machine learning methods, especially deep learning, in computer vision has garnered research interest from the cyber security community in further enhancing and automating intrusion detection. However, expensive data labeling and the scarcity of anomalous data make it challenging to train an intrusion detector in a fully supervised manner. Therefore, intrusion detection based on unsupervised anomaly detection is also an important capability. In this paper, we propose a three-stage, deep-learning-based anomaly detection framework for network intrusion detection. The framework integrates unsupervised (K-means clustering), semi-supervised (GANomaly), and supervised (CNN) learning algorithms. We then evaluate the implemented framework on three benchmark datasets: NSL-KDD, CIC-IDS2018, and TON_IoT.

Exploring the Use of Data-Driven Approaches for Anomaly Detection in the Internet of Things (IoT) Environment

Eleonora Achiluzzi , Menglu Li , Md Fahd Al Georgy , Rasha Kashef

Categories: Machine Learning

2022-12-31

The Internet of Things (IoT) is a system that connects physical computing devices, sensors, software, and other technologies. Data can be collected, transferred, and exchanged with other devices over the network without requiring human interactions. One challenge the development of IoT faces is the existence of anomaly data in the network. Therefore, research on anomaly detection in the IoT environment has become popular and necessary in recent years. This survey provides an overview to understand the current progress of the different anomaly detection algorithms and how they can be applied in the context of the Internet of Things. In this survey, we categorize the widely used anomaly detection machine learning and deep learning techniques in IoT into three types: clustering-based, classification-based, and deep learning based. For each category, we introduce some state-of-the-art anomaly detection methods and evaluate the advantages and limitations of each technique.

Improving Multilayer-Perceptron(MLP)-based Network Anomaly Detection with Birch Clustering on CICIDS-2017 Dataset

Yuhua Yin , Julian Jang-Jaccard , Fariza Sabrina , Jin Kwak

Categories: Machine Learning

2022-08-20

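Machine learning algorithms, including the multilayer perceptron (MLP), have been widely used in intrusion detection systems. In this study, we propose a two-stage model that combines the BIRCH clustering algorithm with an MLP classifier to improve the performance of multi-class network anomaly classification. In our proposed method, we first apply BIRCH or K-means as an unsupervised clustering algorithm to the CICIDS-2017 dataset to pre-group the data. The generated pseudo-labels are then added as an additional feature for training the MLP-based classifier. Experimental results show that pre-grouping the data with BIRCH and K-means clustering can improve the performance of intrusion detection systems. Our method achieves 99.73% accuracy for multi-class classification using BIRCH clustering, which is better than similar studies that use a stand-alone MLP model.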
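
The pre-grouping step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain K-means written with the standard library rather than BIRCH, a deterministic farthest-first initialisation chosen only for reproducibility, and hypothetical names (`kmeans`, `add_pseudo_label`, `flows`). The MLP stage is omitted.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centroids(points, k):
    """Farthest-first initialisation (an assumption, chosen for determinism)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids

def kmeans(points, k, iters=20):
    """Plain Lloyd's K-means; returns (centroids, labels)."""
    centroids = init_centroids(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

def add_pseudo_label(points, labels):
    """Append the cluster id to each feature vector as one extra feature."""
    return [list(p) + [float(l)] for p, l in zip(points, labels)]

# Toy two-feature "flows": three near the origin, three near (5, 5).
flows = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
_, cluster_ids = kmeans(flows, k=2)
augmented = add_pseudo_label(flows, cluster_ids)
print(cluster_ids)   # [0, 0, 0, 1, 1, 1]
print(augmented[0])  # [0.1, 0.2, 0.0]
```

The `augmented` vectors are what a downstream classifier would be trained on; whether the cluster id is passed as a raw number or one-hot encoded is a design choice the abstract does not specify.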

GANomaly: Semi-Supervised Anomaly Detection via Adversarial Training

Samet Akcay , Amir Atapour-Abarghouei , Toby P. Breckon

Categories:

2018-05-17

Anomaly detection is a classical problem in computer vision, namely the determination of the normal from the abnormal when datasets are highly biased towards one class (normal) due to the insufficient sample size of the other class (abnormal). While this can be addressed as a supervised learning problem, a significantly more challenging problem is that of detecting the unknown/unseen anomaly case, which takes us instead into the space of a one-class, semi-supervised learning paradigm. We introduce such a novel anomaly detection model, using a conditional generative adversarial network that jointly learns the generation of high-dimensional image space and the inference of latent space. Employing encoder-decoder-encoder sub-networks in the generator network enables the model to map the input image to a lower-dimensional vector, which is then used to reconstruct the generated output image. The additional encoder network maps this generated image to its latent representation. Minimizing the distance between these images and the latent vectors during training aids in learning the data distribution of the normal samples. As a result, a larger distance from this learned data distribution at inference time is indicative of an outlier from that distribution, that is, an anomaly. Experimentation over several benchmark datasets, from varying domains, shows the model's efficacy and superiority over previous state-of-the-art approaches.
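
The inference-time scoring idea in this abstract can be sketched as follows. This is a toy illustration, not the GANomaly model: `encode` and `decode` are hypothetical hand-written stand-ins for trained sub-networks, the same toy encoder is reused for the second encoding, and the L1 distance between latent vectors is an assumption (the abstract only says "distance"). The decoder is clamped to the range [0, 1] to mimic a generator that has only learned to produce normal-looking data.

```python
def encode(x):
    """Toy encoder: average adjacent pairs (4-d input -> 2-d latent)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]

def decode(z):
    """Toy decoder: repeat each latent value twice, clamped to [0, 1]."""
    return [min(1.0, max(0.0, v)) for v in z for _ in range(2)]

def anomaly_score(x):
    """L1 distance between the latent of x and the latent of its reconstruction."""
    z = encode(x)                 # first encoder: input -> latent
    x_hat = decode(z)             # decoder: latent -> reconstruction
    z_hat = encode(x_hat)         # second encoder: reconstruction -> latent
    return sum(abs(a - b) for a, b in zip(z, z_hat))

normal = [0.2, 0.4, 0.8, 0.6]     # inside the range the toy decoder can reproduce
unusual = [5.0, 5.0, 0.0, 0.0]    # far outside that range
print(anomaly_score(normal))      # 0.0
print(anomaly_score(unusual))     # 4.0
```

A sample would be flagged as anomalous when its score exceeds a threshold chosen on held-out data; the threshold choice is outside the scope of this sketch.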

Comparative Study on Supervised versus Semi-supervised Machine Learning for Anomaly Detection of In-vehicle CAN Network

Yongqi Dong , Kejia Chen , Yinxuan Peng , Zhiyuan Ma

Categories: Machine Learning | Artificial Intelligence

2022-07-21

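As the central nerve of the intelligent vehicle control system, the in-vehicle network bus is crucial to vehicle driving safety. One of the best standards for in-vehicle networks is the Controller Area Network (CAN bus) protocol. However, owing to its lack of security mechanisms, the CAN bus is vulnerable by design to various attacks. To enhance the security of in-vehicle networks and to promote research in this area based on large volumes of CAN network traffic data and extracted valuable features, this study comprehensively compares fully supervised machine learning with semi-supervised machine learning methods for CAN message anomaly detection. Both traditional machine learning models (including single classifiers and ensemble models) and neural-network-based deep learning models are evaluated. Furthermore, this study proposes a deep-autoencoder-based semi-supervised learning method for CAN message anomaly detection and verifies its superiority over other semi-supervised methods. Extensive experiments show that fully supervised methods generally outperform semi-supervised ones because they use more information as input. Overall, the developed XGBoost-based model outperforms other methods reported in the literature, with the best accuracy (98.65%), precision (0.9853), and ROC AUC (0.9585).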

Semi-WTC: A Practical Semi-supervised Framework for Attack Categorization through Weight-Task Consistency

Zihan Li , Wentao Chen , Zhiqing Wei , Xingqi Luo , Bing Su

Categories: Machine Learning

2022-05-19

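Supervised learning has been widely used for attack categorization and requires high-quality data and labels. However, the data is often imbalanced, and sufficient annotations are hard to obtain. Moreover, supervised models are subject to real-world deployment issues, such as defending against unseen artificial attacks. To tackle these challenges, we propose a semi-supervised fine-grained attack categorization framework consisting of an encoder and a two-branch structure; the framework can be generalized to different supervised models. A multilayer perceptron with residual connections serves as the encoder to extract features and reduce complexity. A Recurrent Prototype Module (RPM) is proposed to train the encoder effectively in a semi-supervised manner. To alleviate the data imbalance problem, we introduce Weight-Task Consistency (WTC) into the iterative process of RPM by assigning larger weights to classes with fewer samples in the loss function. In addition, to handle new attacks in real-world deployment, we propose an Active Adaption Resampling (AAR) method, which can better discover the distribution of unseen sample data and adapt the parameters of the encoder. Experimental results show that our model outperforms state-of-the-art semi-supervised attack detection methods, with a 3% improvement in classification accuracy and a 90% reduction in training time.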
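
The idea of giving rarer classes larger weights in the loss can be illustrated with a generic sketch. This is not the paper's WTC scheme, whose exact weighting is not given in the abstract; it assumes simple inverse-frequency weights and a weighted cross-entropy, and the function names and toy labels are hypothetical.

```python
import math
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n / (k * count), so rarer classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}

def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class); probs[i] maps class -> probability."""
    total_weight = sum(weights[y] for y in labels)
    loss = -sum(weights[y] * math.log(p[y]) for p, y in zip(probs, labels))
    return loss / total_weight

# Toy imbalanced set: 8 benign samples, 2 attack samples.
labels = ["benign"] * 8 + ["dos"] * 2
weights = inverse_frequency_weights(labels)
print(weights)  # {'benign': 0.625, 'dos': 2.5}

# Hypothetical classifier outputs: confident on benign, weak on the rare class.
probs = [{"benign": 0.9, "dos": 0.1}] * 8 + [{"benign": 0.6, "dos": 0.4}] * 2
loss = weighted_cross_entropy(probs, labels, weights)
```

With these weights, each misclassified `dos` sample contributes four times as much to the loss as a benign one, which pushes training toward the minority class.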

RUAD: unsupervised anomaly detection in HPC systems

Martin Molan , Andrea Borghesi , Daniele Cesarini , Luca Benini , Andrea Bartolini

Categories: Machine Learning | Artificial Intelligence

2022-08-28

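The increasing complexity of modern high-performance computing (HPC) systems necessitates the introduction of automated, data-driven methodologies to support system administrators' efforts to increase system availability. Anomaly detection is an integral part of improving availability, as it eases the system administrators' burden and reduces the time between an anomaly and its resolution. However, current state-of-the-art (SoA) detection approaches are supervised and semi-supervised, so they require a human-labelled dataset with anomalies, which is often impractical to collect in production HPC systems. Unsupervised anomaly detection approaches based on clustering, which aim to alleviate the need for accurate anomaly data, have so far performed poorly. In this work, we overcome these limitations by proposing RUAD, a novel unsupervised anomaly detection model. RUAD achieves better results than the current semi-supervised and unsupervised SoA approaches. This is achieved by considering temporal dependencies in the data and including long short-term memory cells in the model architecture. The proposed approach is assessed on the complete history of a Tier-0 system (CINECA's Marconi100, with 980 nodes). RUAD achieves an area under the curve (AUC) of 0.763 in semi-supervised training and an AUC of 0.767 in unsupervised training, improving on the SoA approach, which achieves an AUC of 0.747 in semi-supervised training and 0.734 in unsupervised training. It also vastly outperforms the current SoA unsupervised anomaly detection approach based on clustering, which achieves an AUC of 0.548.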
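
Since the results above are reported as AUC, a short sketch of how that metric can be computed from anomaly scores may help. It uses the pairwise (Mann-Whitney) formulation: the AUC is the fraction of (anomalous, normal) pairs in which the anomalous sample receives the higher score, with ties counted as half. The function name and the toy scores are illustrative and unrelated to the paper's data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve; labels are 1 (anomaly) or 0 (normal)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count pairs where the anomaly outranks the normal sample (ties count 0.5).
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [0.1, 0.4, 0.35, 0.8]   # hypothetical anomaly scores
labels = [0, 0, 1, 1]            # ground truth
print(roc_auc(scores, labels))   # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the 0.548 versus 0.767 comparison in the abstract.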

A Comparative Study of Detecting Anomalies in Time Series Data Using LSTM and TCN Models

Saroj Gopali , Faranak Abri , Sima Siami-Namini , Akbar Siami Namin

Categories: Machine Learning

2021-12-17

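Several data-driven approaches exist for modeling time series data, including traditional regression-based modeling approaches (e.g., ARIMA). Recently, deep learning techniques have been introduced and explored in the context of time series analysis and forecasting. A major research question is how these variations of deep learning techniques perform in predicting time series data. This paper compares two prominent deep learning modeling techniques: the recurrent neural network (RNN)-based long short-term memory (LSTM) and the convolutional neural network (CNN)-based temporal convolutional network (TCN). Their performance and training time are reported. According to our experimental results, both modeling techniques perform comparably, with the TCN-based model slightly outperforming LSTM. Moreover, the CNN-based TCN model builds a stable model faster than the RNN-based LSTM model.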

DOC-NAD: A Hybrid Deep One-class Classifier for Network Anomaly Detection

Mohanad Sarhan , Gayan Kulatilleke , Wai Weng Lo , Siamak Layeghy , Marius Portmann

Categories: Machine Learning

2022-12-15

Machine Learning (ML) approaches have been used to enhance the detection capabilities of Network Intrusion Detection Systems (NIDSs). Recent work has achieved near-perfect performance by following binary- and multi-class network anomaly detection tasks. Such systems depend on the availability of both (benign and malicious) network data classes during the training phase. However, attack data samples are often challenging to collect in most organisations due to security controls preventing the penetration of known malicious traffic to their networks. Therefore, this paper proposes a Deep One-Class (DOC) classifier for network intrusion detection by only training on benign network data samples. The novel one-class classification architecture consists of a histogram-based deep feed-forward classifier to extract useful network data features and use efficient outlier detection. The DOC classifier has been extensively evaluated using two benchmark NIDS datasets. The results demonstrate its superiority over current state-of-the-art one-class classifiers in terms of detection and false positive rates.

ARCADE: Adversarially Regularized Convolutional Autoencoder for Network Anomaly Detection

Willian T. Lunardi , Martin Andreoni Lopez , Jean-Pierre Giacalone

Categories: Machine Learning

2022-05-03

As the number of heterogeneous IP-connected devices and traffic volume increase, so does the potential for security breaches. The undetected exploitation of these breaches can bring severe cybersecurity and privacy risks. Anomaly-based intrusion detection systems (IDSs) play an essential role in network security. In this paper, we present a practical unsupervised anomaly-based deep learning detection system called ARCADE (Adversarially Regularized Convolutional Autoencoder for unsupervised network anomaly DEtection). With a convolutional autoencoder (AE), ARCADE automatically builds a profile of the normal traffic using a subset of raw bytes of a few initial packets of network flows so that potential network anomalies and intrusions can be efficiently detected before they cause more damage to the network. ARCADE is trained exclusively on normal traffic. An adversarial training strategy is proposed to regularize and decrease the AE's capability to reconstruct network flows that are outside the normal distribution, thereby improving its anomaly detection capabilities. The proposed approach is more effective than state-of-the-art deep learning approaches for network anomaly detection. Even when examining only two initial packets of a network flow, ARCADE can effectively detect malware infection and network attacks. ARCADE has 20 times fewer parameters than baselines, achieving significantly faster detection speed and reaction time.
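
The input described in this abstract, a subset of raw bytes from the first few packets of a flow, can be illustrated with a small preprocessing sketch. This is an assumption about one plausible encoding, not ARCADE's actual pipeline: the function name `flow_to_input`, the zero-padding, the scaling to [0, 1], and the default sizes are all hypothetical.

```python
def flow_to_input(packets, n_packets=2, n_bytes=4):
    """Flatten the first n_bytes of the first n_packets into floats in [0, 1].

    Short or missing packets are zero-padded so every flow yields a vector
    of length n_packets * n_bytes.
    """
    vector = []
    for i in range(n_packets):
        payload = packets[i][:n_bytes] if i < len(packets) else b""
        padded = payload + bytes(n_bytes - len(payload))  # pad with zero bytes
        vector.extend(b / 255 for b in padded)            # scale each byte to [0, 1]
    return vector

# Hypothetical flow with two short packets.
flow = [b"\x00\xff\x10", b"\x80"]
x = flow_to_input(flow)
print(len(x))      # 8
print(x[:2])       # [0.0, 1.0]
```

Fixing the vector length in this way is what lets a convolutional autoencoder consume flows of varying size, and limiting the input to the first packets is what allows a decision early in a flow.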

Fake detection in imbalance dataset by Semi-supervised learning with GAN

Jinus Bordbar , Saman Ardalan , Mohammadreza Mohammadrezaie , Mohammad Ebrahim Shiri

Categories: Machine Learning | Artificial Intelligence

2022-12-02

As social media grows faster, harassment becomes more prevalent, which has made fake detection a fascinating field among researchers. The graph nature of the data, with its large number of nodes, causes several obstacles, including a considerable number of unrelated features in the matrices, high dispersion, and imbalanced classes in the dataset. To deal with these issues, auto-encoders and a combination of semi-supervised learning and the GAN algorithm, called SGAN, were used. This paper uses a small number of labels and applies SGAN as a classifier. The results showed that accuracy reached 91% in detecting fake accounts using only 100 labeled samples.

Deep Learning for Time Series Anomaly Detection: A Survey

Zahra Zamanzadeh Darban , Geoffrey I. Webb , Shirui Pan , Charu C. Aggarwal , Mahsa Salehi

Categories: Machine Learning | Artificial Intelligence

2022-11-09

Time series anomaly detection has applications in a wide range of research fields, including manufacturing and healthcare. The presence of anomalies can indicate novel or unexpected events, such as production faults, system defects, or heart fluttering, and is therefore of particular interest. The large size and complex patterns of time series have led researchers to develop specialised deep learning models for detecting anomalous patterns. This survey provides a structured and comprehensive overview of state-of-the-art deep learning models for time series anomaly detection. It provides a taxonomy based on the factors that divide anomaly detection models into different categories. Aside from describing the basic anomaly detection technique for each category, the advantages and limitations are also discussed. Furthermore, this study includes examples of deep anomaly detection in time series across various application domains in recent years. Finally, it summarises open research issues and the challenges faced when adopting deep anomaly detection models.

Deep Learning for Anomaly Detection in Log Data: A Survey

Max Landauer , Sebastian Onder , Florian Skopik , Markus Wurzenberger

Categories: Machine Learning

2022-07-08

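Automatic log file analysis enables early detection of relevant incidents such as system failures. In particular, self-learning anomaly detection techniques capture patterns in log data and subsequently report unexpected log event occurrences to system operators, without the need to provide or manually model anomalous scenarios in advance. Recently, an increasing number of approaches leveraging deep learning neural networks for this purpose have been proposed. These approaches have demonstrated superior detection performance compared with conventional machine learning techniques and simultaneously resolve issues with unstable data formats. However, there are many different deep learning architectures, and encoding raw and unstructured log data for analysis by neural networks is non-trivial. We therefore carry out a systematic literature review that provides an overview of deployed models, data pre-processing mechanisms, anomaly detection techniques, and evaluations. The survey does not quantitatively compare existing approaches; instead, it aims to help readers understand relevant aspects of different model architectures and emphasizes open issues for future work.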

Federated PCA on Grassmann Manifold for Anomaly Detection in IoT Networks

Tung-Anh Nguyen , Jiayu He , Long Tan Le , Wei Bao , Nguyen H. Tran

Categories: Machine Learning

2022-12-23

In the era of the Internet of Things (IoT), network-wide anomaly detection is a crucial part of monitoring IoT networks due to the inherent security vulnerabilities of most IoT devices. Principal Components Analysis (PCA) has been proposed to separate network traffic into two disjoint subspaces corresponding to normal and malicious behaviors for anomaly detection. However, privacy concerns and the limited computing resources of devices compromise the practical effectiveness of PCA. We propose a federated PCA-based Grassmannian optimization framework that coordinates IoT devices to aggregate a joint profile of normal network behaviors for anomaly detection. First, we introduce a privacy-preserving federated PCA framework to simultaneously capture the profile of various IoT devices' traffic. Then, we investigate the alternating direction method of multipliers gradient-based learning on the Grassmann manifold to guarantee fast training and the absence of detection latency using limited computational resources. Empirical results on the NSL-KDD dataset demonstrate that our method outperforms baseline approaches. Finally, we show that the Grassmann manifold algorithm is well suited to IoT anomaly detection and drastically reduces the analysis time of the system. To the best of our knowledge, this is the first federated PCA algorithm for anomaly detection that meets the requirements of IoT networks.

Computer Vision on X-ray Data in Industrial Production and Security Applications: A survey

Mehdi Rafiei , Jenni Raitoharju , Alexandros Iosifidis

Categories: Computer Vision

2022-11-10

X-ray imaging technology has been used for decades in clinical tasks to reveal the internal condition of different organs, and in recent years, it has become more common in other areas such as industry, security, and geography. The recent development of computer vision and machine learning techniques has also made it easier to automatically process X-ray images and several machine learning-based object (anomaly) detection, classification, and segmentation methods have been recently employed in X-ray image analysis. Due to the high potential of deep learning in related image processing applications, it has been used in most of the studies. This survey reviews the recent research on using computer vision and machine learning for X-ray analysis in industrial production and security applications and covers the applications, techniques, evaluation metrics, datasets, and performance comparison of those techniques on publicly available datasets. We also highlight some drawbacks in the published research and give recommendations for future research in computer vision-based X-ray analysis.

A Survey on Unsupervised Visual Industrial Anomaly Detection Algorithms

Yajie Cui , Zhaoxiang Liu , Shiguo Lian

Categories: Computer Vision

2022-04-24

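In line with the development of Industry 4.0, the field of surface defect detection is attracting increasing attention. Improving efficiency and saving labor costs have steadily become matters of great concern in industry, and in recent years deep-learning-based algorithms have performed better than traditional visual inspection methods. Existing deep-learning-based algorithms are biased toward supervised learning, which not only requires large amounts of labeled data and considerable labor but is also inefficient and has certain limitations. In contrast, recent research shows that unsupervised learning has great potential for addressing these disadvantages in visual industrial anomaly detection. In this survey, we summarize current challenges and provide a detailed overview of recently proposed unsupervised algorithms for visual industrial anomaly detection, covering five categories, whose innovations and frameworks are described in detail. Information on publicly available datasets containing surface image samples is also provided. By comparing different classes of methods, the advantages and disadvantages of anomaly detection algorithms are summarized. The survey is expected to help the research community and industry develop a broader, cross-domain perspective.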

Intrusion Detection using Spatial-Temporal features based on Riemannian Manifold

Amardeep Singh , Julian Jang-Jaccard

Categories: Machine Learning

2021-10-31

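Network traffic data is a combination of packets of different byte sizes under different network protocols. These traffic packets have complex time-varying nonlinear relationships. Existing state-of-the-art methods address this challenge by fusing features into multiple subsets based on correlations and by using hybrid classification techniques that extract spatial and temporal features. This often requires high computational cost and manual support, which limits their use for real-time processing of network traffic. To address this, we propose a novel feature extraction method based on covariance matrices that extracts spatial-temporal features of network traffic data to detect malicious network traffic behavior. The covariance matrices in our proposed method not only naturally encode the mutual relationships between different network traffic values but also have a well-defined geometry, in that they lie on a Riemannian manifold. The Riemannian manifold is embedded with distance metrics that facilitate extracting discriminative features for detecting malicious network traffic. We evaluated our model on the NSL-KDD and UNSW-NB15 datasets and showed that our proposed method significantly outperforms conventional methods and other existing studies on these datasets.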

Multivariate Time Series Anomaly Detection with Few Positive Samples

Feng Xue , Weizhong Yan

Categories: Machine Learning | Artificial Intelligence | Neural and Evolutionary Computing

2022-07-02

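Given the scarcity of anomalies in real-world applications, most of the literature has focused on modeling normality. The learned representations enable anomaly detection because the normality model is trained to capture certain key underlying data regularities under normal conditions. In practical settings, particularly industrial time series anomaly detection, we often encounter situations where a large amount of normal operation data is available along with a small number of anomalous events collected over time. This practical situation calls for methodologies that leverage these few anomalous events to create a better anomaly detector. In this paper, we introduce two methodologies to address this practical situation and compare them with recently developed state-of-the-art techniques. Our proposed methods are anchored on representation learning of normal operation with autoregressive (AR) models, together with loss components that encourage representations that separate normal examples from the few positive examples. We apply the proposed methods to two industrial anomaly detection datasets and demonstrate effective performance compared with approaches from the literature. Our study also points out additional challenges in adopting such methods in practical applications.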

Deep Transfer Learning: A Novel Collaborative Learning Model for Cyberattack Detection Systems in IoT Networks

Tran Viet Khoa , Dinh Thai Hoang , Nguyen Linh Trung , Cong T. Nguyen , Tran Thi Thuy Quynh , Diep N. Nguyen , Nguyen Viet Ha , Eryk Dutkiewicz

Categories: Machine Learning

2021-12-02

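Federated learning (FL) has recently become an effective approach for cyberattack detection systems, especially in Internet of Things (IoT) networks. By distributing the learning process across IoT gateways, FL can improve learning efficiency, reduce communication overhead, and enhance privacy for cyberattack detection systems. Challenges in implementing FL in such systems include the unavailability of labeled data and the dissimilarity of data features across different IoT networks. In this paper, we propose a novel collaborative learning framework that leverages transfer learning (TL) to overcome these challenges. In particular, we develop a novel collaborative learning approach that enables a target network to learn effectively and quickly from the knowledge of a source network that has abundant labeled data. Importantly, state-of-the-art studies require the participating datasets of the networks to have the same features, which limits the efficiency, flexibility, and scalability of intrusion detection systems. Our proposed framework, however, can address these problems by exchanging learned knowledge among various deep learning models, even when their datasets have different features. Extensive experiments on recent real-world cybersecurity datasets show that the proposed framework achieves an improvement of more than 40% compared with state-of-the-art deep-learning-based approaches.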

Utilizing XAI technique to improve autoencoder based model for computer network anomaly detection with shapley additive explanation(SHAP)

Khushnaseeb Roshan , Aasim Zafar

Categories: Machine Learning | Artificial Intelligence

2021-12-14

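Machine learning (ML) and deep learning (DL) methods are being adopted rapidly, especially in computer network security, for tasks such as fraud detection, network anomaly detection, and intrusion detection. However, the lack of transparency of ML- and DL-based models is a major obstacle to their implementation, and they are criticized for their black-box nature even when they produce such tremendous results. Explainable artificial intelligence (XAI) is a promising area that can improve the trustworthiness of these models by explaining and interpreting their outputs. If the internal workings of ML- and DL-based models are understandable, this can further help improve their performance. The objective of this paper is to show how XAI can be used to interpret the results of a DL model, in this case an autoencoder, and, based on the interpretation, to improve its performance for computer network anomaly detection. The Kernel SHAP method, which is based on Shapley values, is used as a novel feature selection technique. This method is used to identify only those features that actually cause the anomalous behavior of the set of attack/anomaly instances. These feature sets are then used to train and validate the autoencoder, but on benign data only. Finally, the resulting SHAP_Model outperformed the other two models proposed based on the feature selection method. The whole experiment is conducted on a subset of the latest CICIDS2017 network dataset. The overall accuracy and AUC of SHAP_Model are 94% and 0.969, respectively.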
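
The Shapley-value attribution behind this feature selection can be sketched on a toy example. This is not Kernel SHAP, which approximates these values by sampling and weighted regression, and it does not use the `shap` library: it enumerates all feature subsets exactly, which is feasible only for a handful of features. The value function `anomaly_value`, the instance, and the baseline are hypothetical stand-ins for an autoencoder's reconstruction error.

```python
from itertools import combinations
from math import factorial

def shapley_values(value, n_features):
    """Exact Shapley values for a set function value(frozenset) -> float."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Standard Shapley weight for coalitions of this size.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

instance = [9.0, 1.0, 1.2]   # hypothetical anomalous flow features
baseline = [1.0, 1.0, 1.0]   # hypothetical benign reference values

def anomaly_value(present):
    """Toy anomaly score: features in `present` take the instance's value,
    the rest take the baseline; the score is the squared deviation."""
    return sum((instance[i] - baseline[i]) ** 2 for i in present)

phi = shapley_values(anomaly_value, n_features=3)
ranked = sorted(range(3), key=lambda i: phi[i], reverse=True)
print(ranked)  # [0, 2, 1]
```

Features would then be kept in order of `ranked` until a chosen budget is reached; how many features the paper retains is not stated in the abstract.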
