Smart Paper Notes

NN2Poly: A polynomial representation for deep feed-forward artificial neural networks

Pablo Morala, Jenny Alexandra Cifuentes, Rosa E. Lillo, Iñaki Ucar

Categories: Machine Learning (Statistics) | Machine Learning

2021-12-21

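The interpretability of neural networks and their underlying theoretical behavior remains an open field of study, even after the great success of their practical applications, particularly since the emergence of deep learning. This work proposes NN2Poly, a theoretical approach for obtaining polynomials that provide an alternative representation of an already trained deep neural network. It extends the idea previously proposed in arXiv:2102.03865, which was limited to neural networks with a single hidden layer, to feed-forward neural networks of arbitrary depth in both regression and classification tasks. This is achieved by applying a Taylor expansion to the activation function at each layer and then using several combinatorial properties that allow the coefficients of the desired polynomials to be identified. The main computational limitations of implementing this theoretical method are discussed, and an example is given of the constraints on the neural network weights that are necessary for NN2Poly to work. Finally, simulations are presented which conclude that NN2Poly can obtain a representation of a given neural network with low error between the obtained predictions.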
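
As a rough illustration of the Taylor-expansion idea in this abstract, the sketch below replaces the tanh activation of a single neuron with its fifth-order Taylor polynomial around zero. It is not the NN2Poly algorithm: it does not expand the result into explicit coefficients of the inputs, it handles one neuron rather than a deep network, and the weights, bias, and input are hypothetical values chosen only to show why constraints on the weights matter.

```python
import math

# Taylor coefficients of tanh(u) around u = 0, up to order 5:
# tanh(u) ~ u - u**3 / 3 + 2 * u**5 / 15
TANH_TAYLOR = [0.0, 1.0, 0.0, -1.0 / 3.0, 0.0, 2.0 / 15.0]


def pre_activation(x, weights, bias):
    """Linear part of the neuron: bias + sum_i w_i * x_i."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def neuron(x, weights, bias):
    """Exact output of one tanh neuron."""
    return math.tanh(pre_activation(x, weights, bias))


def neuron_poly(x, weights, bias, coeffs=TANH_TAYLOR):
    """Polynomial surrogate: the truncated Taylor series at the pre-activation."""
    u = pre_activation(x, weights, bias)
    return sum(c * u**k for k, c in enumerate(coeffs))


x = [0.5, -0.2]                      # hypothetical input
small_w, small_b = [0.3, 0.4], 0.1   # hypothetical small weights
large_w, large_b = [3.0, 4.0], 1.0   # hypothetical large weights

err_small = abs(neuron(x, small_w, small_b) - neuron_poly(x, small_w, small_b))
err_large = abs(neuron(x, large_w, large_b) - neuron_poly(x, large_w, large_b))
print(err_small, err_large)
```

With small weights the pre-activation stays close to the expansion point and the polynomial tracks the neuron closely; with large weights the truncated series diverges from tanh, which is one way to read the abstract's remark about weight constraints.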

Learning crop type mapping from regional label proportions in large-scale SAR and optical imagery

Laura E. C. La Rosa, Dario A. B. Oliveira, Pedram Ghamisi

Categories: Computer Vision

2022-08-24

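In recent years, the application of deep learning algorithms to Earth observation (EO) has enabled substantial progress in fields that rely on remotely sensed data. However, given the scale of data in EO, creating large datasets with pixel-level annotations by experts is expensive and time-consuming. In this context, priors are seen as an attractive way to alleviate the burden of manual labeling when training deep learning methods for EO. For some applications, these priors are readily available. Motivated by the great success of contrastive learning methods for self-supervised feature representation learning in many computer vision tasks, this study proposes an online deep clustering method that uses crop label proportions as priors to learn at the sample level from government crop-proportion data for an entire agricultural region. We evaluate the method using two large datasets from two different agricultural regions in Brazil. Extensive experiments show that the method is robust to different data types (synthetic aperture radar and optical imagery) and reports higher accuracy values for the major crop types in the target regions. It can therefore alleviate the burden of large-scale image annotation in EO applications.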

Multi-modal volumetric concept activation to explain detection and classification of metastatic prostate cancer on PSMA-PET/CT

Rosa C. J. Kraaijveld, Marielle E. P. Philippens, Wietse S. C. Eppinga, Ina M. Jürgenliemk-Schulz, Kenneth G. A. Gilhuijs, Petra S. Kroon, Bas H. M. van der Velden

Categories: Computer Vision

2022-08-04

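Explainable artificial intelligence (XAI) is increasingly used to analyze the behavior of neural networks. Concept activation uses human-interpretable concepts to explain neural network behavior. This study aimed to assess the feasibility of regression concept activation for explaining detection and classification on multi-modal volumetric data. Proof of concept was demonstrated in patients with metastatic prostate cancer imaged with positron emission tomography/computed tomography (PET/CT). Multi-modal volumetric concept activation was used to provide global and local explanations. Sensitivity was 80% at 1.78 false positives per patient. Global explanations showed that detection relied on CT for anatomical location and on PET for confidence in the detection. Local explanations showed promise in helping to distinguish true positives from false positives. This study therefore demonstrates the feasibility of using regression concept activation to explain detection and classification on multi-modal volumetric data.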

Panoptic Segmentation Meets Remote Sensing

Osmar Luiz Ferreira de Carvalho, Osmar Abílio de Carvalho Júnior, Cristiano Rosa e Silva, Anesmar Olino de Albuquerque, Nickolas Castro Santana, Dibio Leandro Borges, Roberto Arnaldo Trancoso Gomes, Renato Fontes Guimarães

Categories: Computer Vision | Artificial Intelligence

2021-11-23

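Panoptic segmentation combines instance and semantic predictions, allowing "things" and "stuff" to be detected simultaneously. Approaching panoptic segmentation effectively in remotely sensed data could be valuable for many challenging problems, since it allows continuous mapping and counting of specific targets. Several difficulties have prevented the growth of this task in remote sensing: (a) most algorithms are designed for traditional images, (b) image labels must cover both "things" and "stuff" classes, and (c) the annotation format is complex. To address these difficulties and improve the operability of panoptic segmentation in remote sensing, this study has five objectives: (1) create a new data preparation pipeline for panoptic segmentation; (2) propose annotation conversion software that generates panoptic annotations; (3) propose a novel dataset of urban areas; (4) modify Detectron2 for the task; and (5) evaluate the difficulties of this task in an urban setting. We used aerial imagery with a spatial resolution of 0.24 m and considered 14 classes. Our pipeline takes three image inputs, and the proposed software uses point shapefiles to create samples in the COCO format. Our study generated 3,400 samples of 512x512 pixels. We used Panoptic-FPN with two backbones (ResNet-50 and ResNet-101), and the model evaluation considered semantic, instance, and panoptic metrics. We obtained 93.9, 47.7, and 64.9 for mean IoU, box AP, and PQ, respectively. Our study presents a first effective pipeline for panoptic segmentation, together with an extensive database that other researchers can use for other data or for related problems that require thorough understanding.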
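
The abstract reports a PQ (panoptic quality) score without defining it. The sketch below assumes the standard definition: predicted and ground-truth segments are matched when their IoU exceeds 0.5, and PQ is the sum of the matched IoUs divided by TP + FP/2 + FN/2. The function name and the sample numbers are hypothetical and are not taken from the paper.

```python
def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """Panoptic quality for one class under the standard definition.

    matched_ious: IoU of every matched (true-positive) segment pair,
                  where a match requires IoU > 0.5.
    """
    if any(iou <= 0.5 for iou in matched_ious):
        raise ValueError("matched segment pairs must have IoU > 0.5")
    num_true_positives = len(matched_ious)
    denominator = (
        num_true_positives
        + 0.5 * num_false_positives
        + 0.5 * num_false_negatives
    )
    if denominator == 0:
        return 0.0  # no segments at all for this class
    return sum(matched_ious) / denominator


# Hypothetical example: two matched segments, one spurious prediction,
# one missed ground-truth segment.
pq = panoptic_quality([0.8, 0.6], num_false_positives=1, num_false_negatives=1)
print(pq)
```

A dataset-level score would typically average this value over classes; that step is omitted here.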

Logic Mill -- A Knowledge Navigation System

Sebastian Erhardt, Mainak Ghosh, Erik Buunk, Michael E. Rose, Dietmar Harhoff

Categories: Natural Language Processing

2022-12-31

Logic Mill is a scalable and openly accessible software system that identifies semantically similar documents within either one domain-specific corpus or multi-domain corpora. It uses advanced Natural Language Processing (NLP) techniques to generate numerical representations of documents. Currently it leverages a large pre-trained language model to generate these document representations. The system focuses on scientific publications and patent documents and contains more than 200 million documents. It is easily accessible via a simple Application Programming Interface (API) or via a web interface. Moreover, it is continuously being updated and can be extended to text corpora from other domains. We see this system as a general-purpose tool for future research applications in the social sciences and other domains.
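
To make the idea of finding semantically similar documents from numerical representations concrete, here is a minimal sketch that ranks documents by cosine similarity between vectors. The abstract does not say which similarity measure or API Logic Mill uses, so cosine similarity, the function names, and the three-dimensional toy vectors are all assumptions for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query_vector, corpus, top_k=2):
    """Return the top_k (doc_id, score) pairs, most similar first."""
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in corpus.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical document embeddings; a real system would obtain these
# from a pre-trained language model.
corpus = {
    "paper_a": [0.9, 0.1, 0.0],
    "patent_b": [0.8, 0.2, 0.1],
    "paper_c": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]

ranking = most_similar(query, corpus)
print(ranking)
```

At the scale mentioned in the abstract, an exhaustive scan like this would be replaced by an approximate nearest-neighbor index; the ranking logic stays the same.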

Observer-based Controller for VTOL-UAVs Tracking using Direct Vision-Aided Inertial Navigation Measurements

Hashim A. Hashim, Abdelrahman E. E. Eltoukhy, Akos Odry

Categories: Robotics

2022-12-30

This paper proposes a novel observer-based controller for a Vertical Take-Off and Landing (VTOL) Unmanned Aerial Vehicle (UAV) designed to directly receive measurements from a Vision-Aided Inertial Navigation System (VA-INS) and produce the required thrust and rotational torque inputs. The VA-INS is composed of a vision unit (monocular or stereo camera) and a typical low-cost 6-axis Inertial Measurement Unit (IMU) equipped with an accelerometer and a gyroscope. A major benefit of this approach is its applicability to environments where the Global Positioning System (GPS) is inaccessible. The proposed VTOL-UAV observer utilizes IMU and feature measurements to accurately estimate attitude (orientation), gyroscope bias, position, and linear velocity. The ability to use VA-INS measurements directly makes the proposed observer design more computationally efficient, as it obviates the need for attitude and position reconstruction. Once the motion components are estimated, the observer-based controller is used to control the VTOL-UAV attitude, angular velocity, position, and linear velocity, guiding the vehicle along the desired trajectory in six degrees of freedom (6 DoF). The closed-loop estimation and the control errors of the observer-based controller are proven to be exponentially stable starting from almost any initial condition. To achieve global and unique VTOL-UAV representation in 6 DoF, the proposed approach is posed on the Lie Group and the design in unit-quaternion is presented. Although the proposed approach is described in a continuous form, the discrete version is provided and tested. Keywords: Vision-aided inertial navigation system, unmanned aerial vehicle, vertical take-off and landing, stochastic, noise, Robotics, control systems, air mobility, observer-based controller algorithm, landmark measurement, exponential stability.

Joint Action is a Framework for Understanding Partnerships Between Humans and Upper Limb Prostheses

Michael R. Dawson, Adam S. R. Parker, Heather E. Williams, Ahmed W. Shehata, Jacqueline S. Hebert, Craig S. Chapman, Patrick M. Pilarski

Categories: Artificial Intelligence | Robotics

2022-12-28

Recent advances in upper limb prostheses have led to significant improvements in the number of movements provided by the robotic limb. However, the method for controlling multiple degrees of freedom via user-generated signals remains challenging. To address this issue, various machine learning controllers have been developed to better predict movement intent. As these controllers become more intelligent and take on more autonomy in the system, the traditional approach of representing the human-machine interface as a human controlling a tool becomes limiting. One possible approach to improve the understanding of these interfaces is to model them as collaborative, multi-agent systems through the lens of joint action. The field of joint action has been commonly applied to two human partners who are trying to work jointly together to achieve a task, such as singing or moving a table together, by effecting coordinated change in their shared environment. In this work, we compare different prosthesis controllers (proportional electromyography with sequential switching, pattern recognition, and adaptive switching) in terms of how they present the hallmarks of joint action. The results of the comparison lead to a new perspective for understanding how existing myoelectric systems relate to each other, along with recommendations for how to improve these systems by increasing the collaborative communication between each partner.

MyI-Net: Fully Automatic Detection and Quantification of Myocardial Infarction from Cardiovascular MRI Images

Shuihua Wang, Ahmed M. S. E. K Abdelaty, Kelly Parke, J Ranjit Arnold, Gerry P McCann, Ivan Y Tyukin

Categories: Computer Vision | Machine Learning

2022-12-28

A "heart attack", or myocardial infarction (MI), occurs when an artery supplying blood to the heart is abruptly occluded. The "gold standard" method for imaging MI is Cardiovascular Magnetic Resonance Imaging (MRI), with intravenously administered gadolinium-based contrast (late gadolinium enhancement). However, no "gold standard" fully automated method for the quantification of MI exists. In this work, we propose an end-to-end fully automatic system (MyI-Net) for the detection and quantification of MI in MRI images. This has the potential to reduce the uncertainty due to the technical variability across labs and inherent problems of the data and labels. Our system consists of four processing stages designed to maintain the flow of information across scales. First, features from raw MRI images are generated using feature extractors built on ResNet and MobileNet architectures. This is followed by Atrous Spatial Pyramid Pooling (ASPP), which produces spatial information at different scales to preserve more image context. High-level features from ASPP and initial low-level features are concatenated at the third stage and then passed to the fourth stage, where spatial information is recovered via up-sampling to produce the final image segmentation output, divided into: i) background, ii) heart muscle, iii) blood and iv) scar areas. New models were compared with state-of-the-art models and manual quantification. Our models showed favorable performance in global segmentation and scar tissue detection relative to state-of-the-art work, including a four-fold better performance in matching scar pixels to contours produced by clinicians.

Thermal Heating in ReRAM Crossbar Arrays: Challenges and Solutions

Kamilya Smagulova, Mohammed E. Fouda, Ahmed Eltawil

Categories: Machine Learning

2022-12-28

The increasing popularity of deep-learning-powered applications raises the issue of the vulnerability of neural networks to adversarial attacks. In other words, hardly perceptible changes in input data lead to output errors in neural networks, hindering their use in applications that involve decisions with security risks. A number of previous works have already thoroughly evaluated the most commonly used configuration, Convolutional Neural Networks (CNNs), against different types of adversarial attacks. Moreover, recent works demonstrated the transferability of some adversarial examples across different neural network models. This paper studied the robustness of newly emerging models such as SpinalNet-based neural networks and Compact Convolutional Transformers (CCT) on the image classification problem of the CIFAR-10 dataset. Each architecture was tested against four white-box attacks and three black-box attacks. Unlike the VGG and SpinalNet models, the attention-based CCT configuration demonstrated a large span between strong robustness and vulnerability to adversarial examples. Finally, a study of transferability between VGG, VGG-inspired SpinalNet and pretrained CCT 7/3x1 models was conducted. It was shown that the high effectiveness of an attack on a certain individual model does not guarantee its transferability to other models.

Sensing-Throughput Tradeoffs with Generative Adversarial Networks for NextG Spectrum Sharing

Yi Shi, Yalin E. Sagduyu

Categories: Machine Learning

2022-12-27

Spectrum coexistence is essential for next generation (NextG) systems to share the spectrum with incumbent (primary) users and meet the growing demand for bandwidth. One example is the 3.5 GHz Citizens Broadband Radio Service (CBRS) band, where the 5G and beyond communication systems need to sense the spectrum and then access the channel in an opportunistic manner when the incumbent user (e.g., radar) is not transmitting. To that end, a high-fidelity classifier based on a deep neural network is needed for low misdetection (to protect incumbent users) and low false alarm (to achieve high throughput for NextG). In a dynamic wireless environment, the classifier can only be used for a limited period of time, i.e., coherence time. A portion of this period is used for learning to collect sensing results and train a classifier, and the rest is used for transmissions. In spectrum sharing systems, there is a well-known tradeoff between the sensing time and the transmission time. While increasing the sensing time can increase the spectrum sensing accuracy, there is less time left for data transmissions. In this paper, we present a generative adversarial network (GAN) approach to generate synthetic sensing results to augment the training data for the deep learning classifier so that the sensing time can be reduced (and thus the transmission time can be increased) while keeping high accuracy of the classifier. We consider both additive white Gaussian noise (AWGN) and Rayleigh channels, and show that this GAN-based approach can significantly improve both the protection of the high-priority user and the throughput of the NextG user (more in Rayleigh channels than AWGN channels).
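
The sensing-throughput tradeoff described in this abstract can be illustrated with a toy model: within a fixed coherence period, each extra sensing slot raises classifier accuracy but leaves one slot less for transmission. The saturating accuracy curve, the slot counts, and the product used as "effective throughput" below are all hypothetical assumptions chosen for illustration; they are not the paper's model and do not include the GAN-based augmentation.

```python
import math


def accuracy(sensing_slots, scale=20.0):
    """Hypothetical classifier accuracy that saturates with more sensing data."""
    return 1.0 - 0.5 * math.exp(-sensing_slots / scale)


def effective_throughput(sensing_slots, coherence_slots=100):
    """Fraction of the coherence period left for transmission, weighted by accuracy."""
    transmit_fraction = (coherence_slots - sensing_slots) / coherence_slots
    return transmit_fraction * accuracy(sensing_slots)


# Exhaustive search over every possible split of the coherence period.
best_sensing_slots = max(range(0, 101), key=effective_throughput)
print(best_sensing_slots, effective_throughput(best_sensing_slots))
```

In this toy setting, anything that reaches the same accuracy with fewer real sensing slots, which is the role the abstract assigns to synthetic GAN samples, would shift the optimum toward more transmission time.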
