Smart Paper Notes

Known by the company we keep: 'Triadic influence' as a proxy for compatibility in social relationships

Miguel Ruíz-García , Juan Ozaita , María Pereda , Antonio Alfonso , Pablo Brañas-Garza , Jose A. Cuesta , Ángel Sánchez

Category: Machine Learning (Statistics)

2022-09-08

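Networks of social interactions are the substrate upon which civilizations are built. Often, we create new bonds with people we like, or feel that our relationships have been damaged through the intervention of third parties. Despite their importance and the huge impact these processes have on our lives, quantitative scientific understanding of them is still in its infancy, mainly because it is difficult to collect large social network datasets that include individual attributes. In this work, we present a thorough study of the real social networks of 13 schools, with more than 3,000 students and 60,000 declared positive and negative relationships, including tests of the personal traits of all the students. We introduce a metric, the "triadic influence", that measures the influence of nearest neighbors on the relationships of their contacts. We use neural networks to predict the relationships and to extract the probability that two students are friends or enemies depending on their personal attributes or on the triadic influence. Alternatively, we use a high-dimensional embedding of the network structure to predict the relationships. Remarkably, the triadic influence (a simple one-dimensional metric) achieves the highest accuracy in predicting the relationship between two students. We postulate that the probabilities extracted from the neural networks (functions of the triadic influence and of the students' personalities) control the evolution of real social networks, opening a new avenue for the quantitative study of these systems.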

Translated by Google Translate

Adversarial attacks and defenses on ML- and hardware-based IoT device fingerprinting and identification

Pedro Miguel Sánchez Sánchez , Alberto Huertas Celdrán , Gérôme Bovet , Gregorio Martínez Pérez

Category: Artificial Intelligence

2022-12-30

In recent years, the number of deployed IoT devices has grown explosively, reaching the scale of billions. This growth has brought new cybersecurity issues, including the deployment of unauthorized devices, malicious code modification, malware deployment, and vulnerability exploitation. These issues have motivated the need for new device identification mechanisms based on behavior monitoring. Such solutions have recently leveraged Machine Learning (ML) and Deep Learning (DL) techniques, thanks to advances in the field and increased processing capabilities. Attackers, however, have not stood still: they have developed adversarial attacks focused on context modification and ML/DL evaluation evasion applied to IoT device identification solutions. This work explores the performance of hardware behavior-based individual device identification, how it is affected by possible context- and ML/DL-focused attacks, and how its resilience can be improved using defense techniques. It proposes an LSTM-CNN architecture based on hardware performance behavior for individual device identification. Previous techniques are then compared with the proposed architecture using a hardware performance dataset collected from 45 Raspberry Pi devices running identical software. The LSTM-CNN improves on previous solutions, achieving a +0.96 average F1-score and a minimum TPR of 0.8 across all devices. Afterward, context- and ML/DL-focused adversarial attacks were applied against the model to test its robustness. A temperature-based context attack was not able to disrupt the identification. However, some state-of-the-art ML/DL evasion attacks were successful. Finally, adversarial training and model distillation are selected as defense techniques to improve the model's resilience to evasion attacks without degrading its performance.


RL and Fingerprinting to Select Moving Target Defense Mechanisms for Zero-day Attacks in IoT

Alberto Huertas Celdrán , Pedro Miguel Sánchez Sánchez , Jan von der Assen , Timo Schenk , Gérôme Bovet , Gregorio Martínez Pérez , Burkhard Stiller

Category: Artificial Intelligence

2022-12-30

Cybercriminals are moving towards zero-day attacks affecting resource-constrained devices such as single-board computers (SBCs). Assuming that perfect security is unrealistic, Moving Target Defense (MTD) is a promising approach to mitigate attacks by dynamically altering target attack surfaces. Still, selecting suitable MTD techniques for zero-day attacks is an open challenge. Reinforcement Learning (RL) could be an effective approach to optimize MTD selection through trial and error, but the literature falls short in i) evaluating the performance of RL and MTD solutions in real-world scenarios, ii) studying whether behavioral fingerprinting is suitable for representing SBC states, and iii) calculating the resource consumption on SBCs. To address these limitations, this work proposes an online RL-based framework that learns the correct MTD mechanisms to mitigate heterogeneous zero-day attacks on SBCs. The framework uses behavioral fingerprinting to represent SBC states and RL to learn the MTD techniques that mitigate each malicious state. It has been deployed in a real IoT crowdsensing scenario with a Raspberry Pi acting as a spectrum sensor. In more detail, the Raspberry Pi was infected with different samples of command-and-control malware, rootkits, and ransomware, and the framework then selected among four existing MTD techniques. A set of experiments demonstrated that the framework can learn proper MTD techniques that mitigate all attacks (except one harmful rootkit) while consuming <1 MB of storage and utilizing <55% CPU and <80% RAM.


A Snapshot of the Frontiers of Client Selection in Federated Learning

Gergely Dániel Németh , Miguel Ángel Lozano , Novi Quadrianto , Nuria Oliver

Categories: Artificial Intelligence | Machine Learning

2022-09-27

Federated learning (FL) has been proposed as a privacy-preserving approach to distributed machine learning. A federated learning architecture consists of a central server and a number of clients that have access to private, potentially sensitive data. Clients keep their data on their local machines and share only the parameters of their locally trained models with a central server that manages the collaborative learning process. FL has delivered promising results in real-life scenarios such as healthcare, energy, and finance. However, when the number of participating clients is large, the overhead of managing the clients slows down the learning. Thus, client selection has been introduced as a strategy to limit the number of communicating parties at every step of the process. Since the early naïve random selection of clients, several client selection methods have been proposed in the literature. Unfortunately, given that this is an emerging field, there is no taxonomy of client selection methods, which makes it hard to compare approaches. In this paper, we propose a taxonomy of client selection in federated learning that enables us to shed light on current progress in the field and identify potential areas of future research in this promising area of machine learning.
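As a rough illustration of the baseline mentioned in the abstract, the sketch below shows naive random client selection for one training round, followed by a sample-weighted average of the selected clients' parameters. The function names, the `fraction` parameter, and the weighted-averaging step are assumptions made for this example; the abstract does not specify an aggregation rule.

```python
import random


def select_clients(client_ids, fraction, rng):
    """Uniformly sample a fraction of the clients (at least one) for a round."""
    k = max(1, int(len(client_ids) * fraction))
    return rng.sample(client_ids, k)


def federated_average(updates):
    """Average client parameter vectors, weighted by local sample counts.

    updates: list of (num_samples, params) pairs, where params is a list
    of floats of equal length for every client (an assumed format).
    """
    total = sum(n for n, _ in updates)
    dim = len(updates[0][1])
    return [sum(n * params[i] for n, params in updates) / total
            for i in range(dim)]


if __name__ == "__main__":
    rng = random.Random(0)
    chosen = select_clients(list(range(10)), fraction=0.3, rng=rng)
    print("clients selected this round:", chosen)
    # Two hypothetical client updates: (local sample count, parameters).
    print(federated_average([(1, [0.0, 2.0]), (3, [4.0, 2.0])]))
```

More elaborate selection methods would replace `select_clients` with a policy that scores clients, for example by data quality or resource availability, while the rest of the round stays the same.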


On the Optimal Combination of Cross-Entropy and Soft Dice Losses for Lesion Segmentation with Out-of-Distribution Robustness

Adrian Galdran , Gustavo Carneiro , Miguel Ángel González Ballester

Category: Computer Vision

2022-09-13

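We study the impact of different loss functions on lesion segmentation from medical images. Although the Cross-Entropy (CE) loss is the most popular choice when dealing with natural images, the soft Dice loss is often preferred for biomedical image segmentation because of its ability to handle imbalanced scenarios. On the other hand, the combination of both functions has also been applied successfully to this kind of task. A much less studied problem is the generalization ability of all these losses in the presence of Out-of-Distribution (OoD) data, that is, samples appearing at test time that are drawn from a different distribution than the training images. In our case, we train our models on images that always contain lesions, but at test time we also have lesion-free samples. Through comprehensive experiments on polyp segmentation from endoscopic images and ulcer segmentation from diabetic foot images, we analyze the impact of minimizing different loss functions on in-distribution performance and on the ability to generalize to OoD data. Our findings are surprising: CE-Dice loss combinations that excel at segmenting in-distribution images perform poorly when dealing with OoD data, which leads us to recommend adopting the CE loss for this kind of problem because of its robustness and its ability to generalize to OoD samples. The code associated with our experiments can be found at https://github.com/agaldran/lesion_losses_ood.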
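To make the loss functions discussed above concrete, here is a minimal sketch of binary cross-entropy, a soft Dice loss, and a weighted combination of the two, computed over flattened per-pixel foreground probabilities. It is not the authors' implementation (see their repository for that); the smoothing constant `smooth` and the mixing weight `alpha` are assumptions chosen for illustration.

```python
import math


def cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def soft_dice_loss(probs, targets, smooth=1.0):
    """One minus the soft Dice coefficient, with additive smoothing."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def combined_loss(probs, targets, alpha=0.5):
    """Convex combination: alpha * CE + (1 - alpha) * soft Dice."""
    return (alpha * cross_entropy(probs, targets)
            + (1.0 - alpha) * soft_dice_loss(probs, targets))


if __name__ == "__main__":
    lesion = [1, 1, 0, 0]
    lesion_free = [0, 0, 0, 0]  # OoD case: no foreground pixels at all
    prediction = [0.9, 0.8, 0.2, 0.1]
    print(combined_loss(prediction, lesion))
    print(soft_dice_loss(prediction, lesion_free))
```

On a lesion-free target the Dice intersection term is zero, so the Dice part of the loss depends only on the total predicted foreground mass and the smoothing constant. This is one reason Dice-based losses can behave differently from CE on images without lesions.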


Cross-Lingual and Cross-Domain Crisis Classification for Low-Resource Scenarios

Cinthia Sánchez , Hernan Sarmiento , Jorge Pérez , Andres Abeliuk , Barbara Poblete

Category: Natural Language Processing

2022-09-05

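Social media data has emerged as a useful source of timely information about real-world crisis events. One of the main tasks related to the use of social media for disaster management is the automatic identification of crisis-related messages. Most studies on this topic have focused on analyzing data for a particular type of event in a specific language. This limits the possibility of generalizing existing approaches, because models cannot be applied directly to new types of events or to other languages. In this work, we study the task of automatically classifying messages related to crisis events by leveraging cross-lingual and cross-domain labeled data. Our goal is to use labeled data from high-resource languages to classify messages in other (low-resource) languages and/or from new (previously unseen) types of crisis situations. For our study, we consolidated from the literature a large unified dataset containing multiple crisis events and languages. Our empirical findings show that it is indeed possible to leverage data from crisis events in English to classify the same type of event in other languages, such as Spanish and Italian (80.0% F1-score). Furthermore, we achieve good performance on the cross-domain task (80.0% F1-score) in a cross-lingual setting. Overall, our work contributes to alleviating the data scarcity problem that is so important for multilingual crisis classification. In particular, it can mitigate cold-start situations in emergency events, when time is of the essence.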


Dynamics of Fourier Modes in Torus Generative Adversarial Networks

Ángel González-Prieto , Alberto Mozo , Edgar Talavera , Sandra Gómez-Canaval

Categories: Machine Learning | Machine Learning (Statistics)

2022-09-05

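Generative Adversarial Networks (GANs) are powerful machine learning models capable of generating fully synthetic, high-resolution samples of a desired phenomenon. Despite their success, the training process of a GAN is highly unstable, and it is typically necessary to add several accessory heuristics to the networks to reach acceptable convergence of the model. In this paper, we introduce a novel method to analyze the convergence and stability of GAN training. For this purpose, we propose to decompose the objective function of the adversarial min-max game that defines a periodic GAN into its Fourier series. By studying the dynamics of the truncated Fourier series under the continuous alternating gradient descent algorithm, we are able to approximate the real flow and identify the main features of GAN convergence. This approach is confirmed empirically by studying the training flow of a $2$-parameter GAN that aims to generate an unknown exponential distribution. As a byproduct, we show that convergent orbits in GANs are small perturbations of periodic orbits, so the Nash equilibria are spiral attractors. This theoretically justifies the slow and unstable training observed in GANs.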


Adaptive QoS of WebRTC for Vehicular Media Communications

Ángel Martín , Daniel Mejías , Zaloa Fernández , Roberto Viola , Josu Pérez , Mikel García , Gorka Velez , Jon Montalbán , Pablo Angueira

Category: Computer Vision

2022-08-24

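Vehicles that carry sensors for their onboard systems are gaining connectivity. This enables information sharing to achieve a more comprehensive understanding of the environment. However, peer communication through public cellular networks brings multiple networking hurdles to address, and it requires in-network systems to relay communications and connect parties that cannot connect directly. Web Real-Time Communication (WebRTC) is a good candidate for media streaming across vehicles because it enables low-latency communication while providing standard protocols for the security handshake, for discovering public IPs, and for traversing Network Address Translation (NAT) systems. However, end-to-end Quality of Service (QoS) adaptation in an infrastructure where transmission and reception are decoupled by a relay needs a mechanism to adapt the video stream to the network capacity efficiently. To this end, this paper investigates a mechanism that applies changes to resolution, frame rate, and bitrate by exploiting Real-Time Transport Control Protocol (RTCP) metrics such as bandwidth and round-trip time. The solution aims to ensure that the receiving onboard system gets relevant information in time. The impact on end-to-end throughput efficiency and reaction time of applying different QoS adaptation approaches is analyzed in a real 5G testbed.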
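The adaptation idea described above can be sketched as a simple rule-based controller that maps RTCP-derived bandwidth and round-trip time to a video profile. This is only an illustrative sketch and not the paper's mechanism: the quality ladder, the `headroom` factor, the RTT threshold, and all names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoProfile:
    width: int
    height: int
    fps: int
    bitrate_kbps: int


# Hypothetical quality ladder, ordered from highest to lowest quality.
LADDER = [
    VideoProfile(1920, 1080, 30, 4000),
    VideoProfile(1280, 720, 30, 2000),
    VideoProfile(854, 480, 15, 800),
    VideoProfile(640, 360, 10, 300),
]


def pick_profile(bandwidth_kbps, rtt_ms, headroom=0.8, rtt_limit_ms=200.0):
    """Choose the best profile that fits the estimated network capacity.

    bandwidth_kbps and rtt_ms stand for values derived from RTCP reports.
    A high RTT is treated as a congestion hint and halves the budget
    (an assumed heuristic).
    """
    budget = bandwidth_kbps * headroom
    if rtt_ms > rtt_limit_ms:
        budget *= 0.5
    for profile in LADDER:
        if profile.bitrate_kbps <= budget:
            return profile
    return LADDER[-1]  # fall back to the lowest profile


if __name__ == "__main__":
    print(pick_profile(bandwidth_kbps=3000, rtt_ms=50))
    print(pick_profile(bandwidth_kbps=3000, rtt_ms=300))
```

In a relayed setup such as the one described in the abstract, a controller of this kind would run wherever the RTCP feedback for the relevant leg of the path is available.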



Machine learning algorithms for three-dimensional mean-curvature computation in the level-set method

Luis Ángel Larios-Cárdenas , Frédéric Gibou

Category: Machine Learning

2022-08-18

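We propose a data-driven mean-curvature solver for the level-set method. This work is the natural extension to $\mathbb{R}^3$ of our two-dimensional strategies in [arXiv:2201.12342][1] and [doi:10.1016/j.jcp.2022.1111291][2]. However, in contrast to [1,2], which built resolution-dependent neural network dictionaries, here we develop a pair of models in $\mathbb{R}^3$ that work regardless of the mesh size. Our feedforward networks ingest transformed level-set, gradient, and curvature data to fix numerical mean-curvature approximations at interface nodes. To reduce the complexity of the problem, we use the Gaussian curvature to classify stencils and fit our models separately to non-saddle and saddle patterns. Non-saddle stencils are easier to handle because they exhibit a curvature error distribution characterized by monotonicity and symmetry. While the latter allows us to train on only half of the mean-curvature spectrum, the former helps us blend the data-driven and baseline estimations seamlessly near flat regions. The error structure of saddle patterns, on the other hand, is less clear, so we exploit no latent information beyond what is known. In this regard, we train our models not only on spherical patches but also on sinusoidal and hyperbolic paraboloidal patches. Our approach to building their datasets is systematic but gathers samples randomly while ensuring well-balancedness. We also resort to standardization and dimensionality reduction as preprocessing steps, and we integrate regularization to minimize outliers. In addition, we leverage the rotation and reflection invariance of curvature to improve precision at inference time. Several experiments confirm that our proposed system can yield more accurate mean-curvature estimates than modern particle-based interface reconstruction and level-set schemes.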
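For context, the kind of baseline numerical estimate that such a data-driven solver corrects can be sketched with central finite differences. The example below evaluates $\kappa = \nabla \cdot (\nabla\phi / |\nabla\phi|)$ at a point of a level-set function given as a Python callable. It assumes the convention in which a sphere of radius $r$ has $\kappa = 2/r$ (the paper may instead use half of that value), and the function name and step size `h` are illustrative choices, not taken from the paper.

```python
import math


def mean_curvature(phi, x, y, z, h=1e-3):
    """Finite-difference estimate of div(grad(phi) / |grad(phi)|) at (x, y, z)."""
    def at(i, j, k):
        # Sample phi at an offset of (i, j, k) grid steps from the point.
        return phi(x + i * h, y + j * h, z + k * h)

    c = at(0, 0, 0)
    # First derivatives (central differences).
    px = (at(1, 0, 0) - at(-1, 0, 0)) / (2 * h)
    py = (at(0, 1, 0) - at(0, -1, 0)) / (2 * h)
    pz = (at(0, 0, 1) - at(0, 0, -1)) / (2 * h)
    # Pure second derivatives.
    pxx = (at(1, 0, 0) - 2 * c + at(-1, 0, 0)) / h ** 2
    pyy = (at(0, 1, 0) - 2 * c + at(0, -1, 0)) / h ** 2
    pzz = (at(0, 0, 1) - 2 * c + at(0, 0, -1)) / h ** 2
    # Mixed second derivatives.
    pxy = (at(1, 1, 0) - at(1, -1, 0) - at(-1, 1, 0) + at(-1, -1, 0)) / (4 * h ** 2)
    pxz = (at(1, 0, 1) - at(1, 0, -1) - at(-1, 0, 1) + at(-1, 0, -1)) / (4 * h ** 2)
    pyz = (at(0, 1, 1) - at(0, 1, -1) - at(0, -1, 1) + at(0, -1, -1)) / (4 * h ** 2)

    grad_sq = px * px + py * py + pz * pz
    numerator = ((pyy + pzz) * px * px + (pxx + pzz) * py * py
                 + (pxx + pyy) * pz * pz
                 - 2.0 * (px * py * pxy + px * pz * pxz + py * pz * pyz))
    return numerator / grad_sq ** 1.5


def sphere(radius):
    """Signed-distance level-set function of a sphere centered at the origin."""
    return lambda x, y, z: math.sqrt(x * x + y * y + z * z) - radius


if __name__ == "__main__":
    # The point (0.3, 0.4, 0.0) lies on the sphere of radius 0.5.
    print(mean_curvature(sphere(0.5), 0.3, 0.4, 0.0))
```

For the sphere of radius 0.5 the estimate should be close to the analytic value 2 / 0.5 = 4 under the stated convention.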


Robust Self-Tuning Data Association for Geo-Referencing Using Lane Markings

Miguel Ángel Muñoz-Bañón , Jan-Hendrik Pauls , Haohao Hu , Christoph Stiller , Francisco A. Candelas , Fernando Torres

Categories: Robotics | Computer Vision

2022-07-28

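Localization in aerial imagery-based maps offers many advantages, such as global consistency, geo-referenced maps, and the availability of publicly accessible data. However, the set of landmarks that can be observed from both aerial imagery and on-board sensors is limited. This leads to ambiguities or aliasing during data association. Building on a highly informative representation that allows efficient data association, this paper presents a complete pipeline for resolving these ambiguities. Its core is a robust self-tuning data association that adapts the search area according to the entropy of the measurements. Additionally, to smooth the final result, we adjust the information matrix of the associated data as a function of the relative transform produced by the data association process. We evaluate our method on real data from urban and rural scenarios around the city of Karlsruhe, Germany. We compare state-of-the-art outlier mitigation methods with our self-tuning approach and show a considerable improvement, especially in outer-urban scenarios.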
