Smart Paper Notes

MEDS-Net: Self-Distilled Multi-Encoders Network with Bi-Direction Maximum Intensity projections for Lung Nodule Detection

Muhammad Usman , Azka Rehman , Abdullah Shahid , Siddique Latif , Shi Sub Byon , Byoung Dai Lee , Sung Hyun Kim , Byung il Lee , Yeong Gil Shin

Category: Computer Vision

2022-10-30

In this study, we propose a lung nodule detection scheme that fully incorporates the clinical workflow of radiologists. In particular, we exploit bi-directional maximum intensity projection (MIP) images of various thicknesses (i.e., 3, 5, and 10 mm), along with a 3D patch of the CT scan consisting of 10 adjacent slices, as input to a self-distillation-based Multi-Encoders Network (MEDS-Net). The proposed architecture first condenses the 3D patch input to three channels using a dense block, whose dense units effectively examine nodule presence from 2D axial slices. This condensed information, along with the forward and backward MIP images, is fed to three different encoders to learn the most meaningful representation, which is forwarded into the decoder block at various levels. At the decoder block, we employ a self-distillation mechanism by connecting a distillation block that contains five lung nodule detectors. This helps to expedite convergence and improves the learning ability of the proposed architecture. Finally, the proposed scheme reduces false positives by complementing the main detector with auxiliary detectors. The proposed scheme has been rigorously evaluated on 888 scans of the LUNA16 dataset and obtained a CPM score of 93.6%. The results demonstrate that incorporating bi-directional MIP images enables MEDS-Net to effectively distinguish nodules from their surroundings, which helps to achieve sensitivities of 91.5% and 92.8% at false positive rates of 0.25 and 0.5 per scan, respectively.
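The forward and backward MIP inputs mentioned in the abstract can be illustrated with a minimal sketch. It assumes a 1 mm slice spacing (so a 3, 5, or 10 mm slab equals 3, 5, or 10 slices), treats "forward" as the slab starting at the centre slice and "backward" as the slab ending at it, and clips slabs at the volume boundary. The function names `slab_mip` and `bidirectional_mips` are hypothetical, and the exact slab definition used by MEDS-Net may differ.

```python
def slab_mip(volume, start, stop):
    """Pixel-wise maximum over the axial slices volume[start:stop].

    `volume` is a nested list indexed as volume[slice][row][col].
    """
    slab = volume[start:stop]
    if not slab:
        raise ValueError("empty slab")
    rows, cols = len(slab[0]), len(slab[0][0])
    return [[max(s[r][c] for s in slab) for c in range(cols)]
            for r in range(rows)]


def bidirectional_mips(volume, center, thicknesses=(3, 5, 10)):
    """Return {thickness: (forward_mip, backward_mip)} around slice `center`.

    Assumption: 1 mm slice spacing, so thickness in mm equals slice count.
    Slabs are clipped where they would run past either end of the volume.
    """
    out = {}
    for t in thicknesses:
        forward = slab_mip(volume, center, min(center + t, len(volume)))
        backward = slab_mip(volume, max(center - t + 1, 0), center + 1)
        out[t] = (forward, backward)
    return out
```

Each projection keeps the brightest voxel along the slab, which is why dense structures such as nodules stand out against surrounding tissue in MIP images.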


Tracking Passengers and Baggage Items using Multiple Overhead Cameras at Security Checkpoints

Abubakar Siddique , Henry Medeiros

Category: Computer Vision

2022-12-31

We introduce a novel framework to track multiple objects in overhead camera videos for airport checkpoint security scenarios where targets correspond to passengers and their baggage items. We propose a Self-Supervised Learning (SSL) technique to provide the model information about instance segmentation uncertainty from overhead images. Our SSL approach improves object detection by employing a test-time data augmentation and a regression-based, rotation-invariant pseudo-label refinement technique. Our pseudo-label generation method provides multiple geometrically-transformed images as inputs to a Convolutional Neural Network (CNN), regresses the augmented detections generated by the network to reduce localization errors, and then clusters them using the mean-shift algorithm. The self-supervised detector model is used in a single-camera tracking algorithm to generate temporal identifiers for the targets. Our method also incorporates a multi-view trajectory association mechanism to maintain consistent temporal identifiers as passengers travel across camera views. An evaluation of detection, tracking, and association performances on videos obtained from multiple overhead cameras in a realistic airport checkpoint environment demonstrates the effectiveness of the proposed approach. Our results show that self-supervision improves object detection accuracy by up to 42% without increasing the inference time of the model. Our multi-camera association method achieves up to 89% multi-object tracking accuracy with an average computation time of less than 15 ms.


Robust Recurrent Neural Network to Identify Ship Motion in Open Water with Performance Guarantees -- Technical Report

Daniel Frank , Decky Aspandi Latif , Michael Muehlebach , Steffen Staab

Category: Machine Learning

2022-12-12

Recurrent neural networks are capable of learning the dynamics of an unknown nonlinear system purely from input-output measurements. However, the resulting models do not provide any stability guarantees on the input-output mapping. In this work, we represent a recurrent neural network as a linear time-invariant system with nonlinear disturbances. By introducing constraints on the parameters, we can guarantee finite gain stability and incremental finite gain stability. We apply this identification method to learn the motion of a four-degrees-of-freedom ship that is moving in open water and compare it against other purely learning-based approaches with unconstrained parameters. Our analysis shows that the constrained recurrent neural network has a lower prediction accuracy on the test set, but it achieves comparable results on an out-of-distribution set and respects stability conditions.


PTSD in the Wild: A Video Database for Studying Post-Traumatic Stress Disorder Recognition in Unconstrained Environments

Moctar Abdoul Latif Sawadogo , Furkan Pala , Gurkirat Singh , Imen Selmi , Pauline Puteaux , Alice Othmani

Category: Computer Vision | Machine Learning

2022-09-28

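Post-traumatic stress disorder (PTSD) is a chronic and debilitating mental condition that develops in response to catastrophic life events, such as military combat, sexual assault, and natural disasters. PTSD is characterized by flashbacks of past traumatic events, intrusive thoughts, nightmares, hypervigilance, and sleep disturbance, all of which affect a person's life and lead to considerable social, occupational, and interpersonal dysfunction. PTSD is diagnosed by medical professionals using a self-assessment questionnaire of PTSD symptoms as defined in the Diagnostic and Statistical Manual of Mental Disorders (DSM). In this paper, and for the first time, we collected, annotated, and prepared for public distribution a new video database for automatic PTSD diagnosis, called the PTSD in the Wild dataset. The database exhibits "natural" and large variability in acquisition conditions, with differences in facial expression, lighting, focus, resolution, age, gender, race, occlusion, and background. In addition to describing the details of the dataset collection, we provide a benchmark for evaluating computer vision and machine learning based approaches on the PTSD in the Wild dataset. Furthermore, we propose and evaluate a deep learning based approach for PTSD detection. The proposed approach shows very promising results. Interested researchers can download a copy of the PTSD-in-the-wild dataset from: http://www.lissi.fr/ptsd-dataset/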


Globally Optimal Event-Based Divergence Estimation for Ventral Landing

Sofia McLeod , Gabriele Meoni , Dario Izzo , Anne Mergy , Daqi Liu , Yasir Latif , Ian Reid , Tat-Jun Chin

Category: Computer Vision

2022-09-27

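Event sensing is a major component in bio-inspired flight guidance and control systems. We explore the use of event cameras for predicting time-to-contact (TTC) with the surface during ventral landing. This is achieved by estimating divergence (inverse TTC), which is the rate of radial optic flow, from the event stream generated during landing. Our core contributions are a novel contrast maximisation formulation for event-based divergence estimation, and a branch-and-bound algorithm that maximises contrast exactly and finds the optimal divergence value. GPU acceleration is used to speed up the global algorithm. Another contribution is a new dataset containing real event streams from ventral landing, which was used to test and benchmark our method. Owing to global optimisation, our algorithm is much more capable of recovering the true divergence than other heuristic divergence estimators or event-based optic flow methods. With GPU acceleration, our method also achieves competitive runtimes.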
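A toy version of contrast maximisation for divergence estimation may make the idea concrete. This is a sketch under stated assumptions, not the paper's formulation: it assumes pure radial flow about the origin, so a point moves as p(t) = p(0)·exp(D·t); it scores a candidate D by the sum of squared per-pixel event counts after warping events back to t = 0 (which ranks candidates the same way as image variance when the pixel grid and event count are fixed); and it replaces the paper's branch-and-bound with a plain exhaustive search over a candidate list. The names `contrast` and `estimate_divergence` are hypothetical.

```python
import math
from collections import Counter


def contrast(events, divergence):
    """Sum of squared per-pixel counts after warping events back to t = 0.

    `events` is an iterable of (x, y, t) with coordinates measured from the
    focus of expansion (assumed to be the origin).
    """
    counts = Counter()
    for x, y, t in events:
        s = math.exp(-divergence * t)  # undo the radial expansion
        counts[(round(x * s), round(y * s))] += 1
    return sum(c * c for c in counts.values())


def estimate_divergence(events, candidates):
    """Exhaustive search: return the candidate with the highest contrast."""
    return max(candidates, key=lambda d: contrast(events, d))
```

When the candidate matches the true divergence, events triggered by the same scene point collapse onto one pixel and the score peaks; the time-to-contact is then the reciprocal of the estimate.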


Device-friendly Guava fruit and leaf disease detection using deep learning

Rabindra Nath Nandi , Aminul Haque Palash , Nazmul Siddique , Mohammed Golam Zilani

Category: Computer Vision

2022-09-26

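This work presents a deep learning-based plant disease diagnostic system using images of fruits and leaves. Five state-of-the-art convolutional neural networks (CNNs) have been employed to implement the system. Until now, model accuracy has been the focus of such applications, and model optimization for deployment on end-user devices has not been considered. Two model quantization techniques, float16 and dynamic range quantization, have been applied to the five state-of-the-art CNN architectures. The study shows that the quantized GoogleNet model achieved a size of 0.143 MB with an accuracy of 97%, making it the best candidate under the size criterion. The EfficientNet model achieved a size of 4.2 MB with an accuracy of 99%, making it the best model under the performance criterion. The source code is available at https://github.com/compostieai/guava-disease-detection.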
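The numerical effect of the two quantization techniques can be sketched on a small list of weights. This is an illustration only, using the Python standard library rather than any deep learning toolkit: float16 rounding is done with the `struct` half-precision format, and dynamic range quantization is approximated here by symmetric per-tensor int8 quantization of the weights. The abstract does not state which toolkit or exact scheme was used, and the helper names are hypothetical.

```python
import struct


def to_float16(values):
    """Round each value to IEEE 754 half precision and back to a Python float."""
    return [struct.unpack("<e", struct.pack("<e", v))[0] for v in values]


def quantize_int8(values):
    """Symmetric per-tensor int8 quantization: returns (integer codes, scale)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map int8 codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.003, 0.9999]
half = to_float16(weights)               # 2 bytes per weight instead of 4
codes, scale = quantize_int8(weights)    # 1 byte per weight plus one scale
restored = dequantize_int8(codes, scale)
```

Storing weights as float16 halves their size relative to float32, and int8 storage quarters it, at the cost of the rounding error visible when comparing `weights` with `half` and `restored`.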


Towards Bridging the Space Domain Gap for Satellite Pose Estimation using Event Sensing

Mohsi Jawaid , Ethan Elms , Yasir Latif , Tat-Jun Chin

Category: Computer Vision | Robotics

2022-09-24

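Deep models trained using synthetic data require domain adaptation to bridge the gap between the simulation and target environments. State-of-the-art domain adaptation methods often demand sufficient amounts of (unlabelled) data from the target domain. However, this need is difficult to fulfil when the target domain is an extreme environment, such as space. In this paper, our target problem is close-proximity satellite pose estimation, where it is costly to obtain images of satellites from actual rendezvous missions. We demonstrate that event sensing offers a promising solution for generalising from the simulation to the target domain under stark illumination differences. Our main contribution is an event-based satellite pose estimation technique, trained purely on synthetic event data with basic data augmentation to improve robustness against practical (noisy) event sensors. Underpinning our method is a novel dataset with carefully calibrated ground truth, comprising real event data obtained by emulating satellite rendezvous scenarios in the lab under drastic lighting conditions. Results on the dataset show that our event-based satellite pose estimation method, trained only on synthetic data without adaptation, can generalise effectively to the target domain.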


Automated ischemic stroke lesion segmentation from 3D MRI

Md Mahfuzur Rahman Siddique , Dong Yang , Yufan He , Daguang Xu , Andriy Myronenko

Category: Computer Vision

2022-09-20

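The Ischemic Stroke Lesion Segmentation challenge (ISLES 2022) offers a platform for researchers to compare their solutions for segmenting ischemic stroke regions from 3D MRI. In this work, we describe our solution to the ISLES 2022 segmentation task. We re-sample all images to a common resolution, use two input MRI modalities (DWI and ADC), and train the SegResNet semantic segmentation network from MONAI. The final submission is an ensemble of 15 models (from 3 runs of 5-fold cross-validation). Our solution (team name NVAUTO) achieves the top place in terms of the Dice metric (0.824) and an overall rank of 2 (based on the combined metric ranking).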
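The Dice metric and a simple form of model ensembling can be sketched as follows. The Dice formula is standard; averaging per-voxel probabilities and thresholding at 0.5 is an assumption, since the abstract does not say how the 15 models are combined. Masks are flattened to plain lists for brevity, and the function names are hypothetical.

```python
def dice(pred, target):
    """Dice = 2|P ∩ T| / (|P| + |T|) for flat binary masks.

    Returns 1.0 when both masks are empty (a common convention).
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 if total == 0 else 2.0 * intersection / total


def ensemble_mask(prob_maps, threshold=0.5):
    """Average per-voxel probabilities across models, then threshold.

    `prob_maps` holds one flat list of foreground probabilities per model.
    """
    n = len(prob_maps)
    mean = [sum(vals) / n for vals in zip(*prob_maps)]
    return [1 if m >= threshold else 0 for m in mean]
```

For instance, three models predicting `[0.9, 0.2, 0.6, 0.1]`, `[0.8, 0.4, 0.3, 0.2]`, and `[0.7, 0.3, 0.7, 0.0]` average to a mask of `[1, 0, 1, 0]`; against a reference mask `[1, 1, 1, 0]` this gives a Dice score of 2·2 / (2 + 3) = 0.8.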


Self-supervised Learning for Panoptic Segmentation of Multiple Fruit Flower Species

Abubakar Siddique , Amy Tabb , Henry Medeiros

Category: Computer Vision

2022-09-10

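Convolutional neural networks trained using manually generated labels are commonly used for semantic or instance segmentation. In precision agriculture, automated flower detection methods use supervised models and post-processing techniques that may not perform consistently as the appearance of the flowers and the data acquisition conditions vary. We propose a self-supervised learning strategy to enhance the sensitivity of segmentation models to different flower species using automatically generated pseudo-labels. We employ a data augmentation and refinement approach to improve the accuracy of the model predictions. The augmented semantic predictions are then converted to panoptic pseudo-labels to iteratively train a multi-task model. The self-supervised model predictions can be refined with existing post-processing approaches to further improve their accuracy. An evaluation on a multi-species fruit tree flower dataset demonstrates that our method outperforms state-of-the-art models without computationally expensive post-processing steps, providing a new baseline for flower detection applications.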


Multi-Robot Synergistic Localization in Dynamic Environments

Ehsan Latif , Ramviyas Parasuraman

Category: Robotics

2022-06-07

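A mobile robot's precise location information is critical for navigation and task processing, especially for a multi-robot system (MRS) that must collaborate and collect valuable data from the field. However, a robot without access to GPS signals, such as in an environmentally controlled, indoor, or underground environment, finds it difficult to localize itself using its own sensors alone. As a result, robots share their local information to improve their localization estimates, benefiting the entire MRS team. There have been several attempts at model-based multi-robot localization using the radio signal strength indicator (RSSI) as a source for calculating bearing information. We also utilize the wireless network generated through communication among the multiple robots in the system, and aim to localize agents with high accuracy and efficiency in a dynamic environment, using shared information fusion to refine the localization estimates. This estimator structure reduces one source of measurement correlation while appropriately incorporating others. This paper proposes a decentralized Multi-Robot Synergistic Localization system (MRSL) for dense and dynamic environments. Robots update their position estimates whenever new information is received from their neighbors. When the system senses the presence of other robots in the region, it exchanges position estimates and merges the received data to improve its localization accuracy. Our approach uses Bayesian rule-based integration, which has been shown to be computationally efficient and applicable to asynchronous robot communication. We have performed extensive simulation experiments with varying numbers of robots to analyze the algorithm. MRSL's localization accuracy with RSSI outperforms other algorithms in the literature, showing significant promise for future development.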
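The step in which a robot merges a neighbor's position estimate with its own can be illustrated with a textbook Bayesian fusion of two Gaussian estimates. This is a minimal sketch, not the MRSL update rule: it assumes both estimates are Gaussian, statistically independent, and uncorrelated across the x and y axes, and the function names are hypothetical.

```python
def fuse(mean_a, var_a, mean_b, var_b):
    """Product of two independent Gaussian estimates of the same quantity.

    The fused variance is smaller than either input variance, and the fused
    mean is the inverse-variance weighted average of the two means.
    """
    var = 1.0 / (1.0 / var_a + 1.0 / var_b)
    mean = var * (mean_a / var_a + mean_b / var_b)
    return mean, var


def fuse_position(own, received):
    """Fuse two 2D estimates axis by axis.

    Each estimate is ((x, y), (var_x, var_y)).
    """
    (ox, oy), (ovx, ovy) = own
    (rx, ry), (rvx, rvy) = received
    x, vx = fuse(ox, ovx, rx, rvx)
    y, vy = fuse(oy, ovy, ry, rvy)
    return (x, y), (vx, vy)
```

Because each update needs only the current estimate and the newly received one, it can be applied whenever a message arrives, which fits the asynchronous communication described in the abstract.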
