Smart Paper Notes

covEcho Resource constrained lung ultrasound image analysis tool for faster triaging and active learning

Jinu Joseph , Mahesh Raveendranatha Panicker , Yale Tung Chen , Kesavadas Chandrasekharan , Vimal Chacko Mondy , Anoop Ayyappan , Jineesh Valakkada , Kiran Vishnu Narayan

Category: Computer Vision

2022-06-21

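Lung ultrasound (LUS) is possibly the only medical imaging modality that can be used for continuous and periodic monitoring of the lung. This is extremely useful for tracking lung manifestations during the onset of a lung infection, or for tracking the effect of vaccination on the lung, as in COVID-19. There have been many attempts to classify lung severity into various classes or to automatically segment various LUS landmarks and manifestations. However, all of these approaches rely on training static machine learning models, which require large clinically annotated datasets, are computationally heavy, and are non-real-time most of the time. In this work, a real-time, lightweight, active-learning-based approach is presented for faster triaging of COVID-19 subjects in resource-constrained settings. The tool, based on the You Only Look Once (YOLO) network, can assess image quality by identifying various LUS landmarks, artefacts, and manifestations; predict the severity of lung infection; support active learning driven by clinician feedback or by image quality; and summarize the significant frames with high infection severity for further analysis. The results show that, for the prediction of LUS landmarks, the proposed tool achieves a mean average precision (mAP) of 66% at the Intersection over Union (IoU) threshold used. The 14 MB lightweight YOLOv5s network achieves 123 FPS when running on a Quadro P4000 GPU. The tool is available for use and analysis upon request from the authors.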
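
The mAP figure above depends on how a predicted landmark box is matched to an annotated one. The following is a minimal sketch of the Intersection over Union (IoU) test that such a match typically relies on. It is not the authors' evaluation code: the box format `(x_min, y_min, x_max, y_max)`, the function names, and the threshold passed in are all assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this format is an assumption.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold):
    """A prediction counts as a true positive when its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold
```

Average precision is then computed per class from the ranked list of matched and unmatched predictions, and mAP is the mean over classes.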

A Comparative Study on Unsupervised Anomaly Detection for Time Series: Experiments and Analysis

Yan Zhao , Liwei Deng , Xuanhao Chen , Chenjuan Guo , Bin Yang , Tung Kieu , Feiteng Huang , Torben Bach Pedersen , Kai Zheng , Christian S. Jensen

Categories: Machine Learning | Artificial Intelligence

2022-09-10

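The continued digitization of societal processes translates into a proliferation of time series data covering applications such as fraud detection, intrusion detection, and energy management, where anomaly detection is often essential to ensuring reliability and safety. Many recent studies target anomaly detection for time series data. Indeed, time series anomaly detection is characterized by diverse data, methods, and evaluation strategies, and comparisons in existing studies consider only part of this diversity, which makes it difficult to select the best method for a particular problem setting. To address this shortcoming, we introduce taxonomies for data, methods, and evaluation strategies, use them to provide a comprehensive overview of unsupervised time series anomaly detection, and systematically evaluate and compare state-of-the-art traditional and deep learning techniques. In an empirical study using nine publicly available datasets, we apply the most commonly used performance evaluation metrics to typical methods under a fair implementation standard. Based on the structure offered by the taxonomies, we report on the empirical studies and provide guidelines, in the form of comparative tables, for choosing the methods best suited to particular application settings. Finally, we propose research directions for this dynamic field.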

Graph Neural Networks for Low-Energy Event Classification & Reconstruction in IceCube

R. Abbasi , M. Ackermann , J. Adams , N. Aggarwal , J. A. Aguilar , M. Ahlers , M. Ahrens , J. M. Alameddine , A. A. Alves Jr. , N. M. Amin

Category: Machine Learning

2022-09-07

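IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of IceCube data. Reconstructing and classifying events is a challenge because of the detector geometry, the inhomogeneous scattering and absorption of light in the ice and, below 100 GeV, the relatively low number of signal photons produced per event. To address this challenge, IceCube events can be represented as point cloud graphs, with a Graph Neural Network (GNN) used as the classification and reconstruction method. The GNN is capable of distinguishing neutrino events from cosmic-ray backgrounds, classifying different neutrino event types, and reconstructing the deposited energy, direction, and interaction vertex. Based on simulation, we provide a comparison in the 1-100 GeV energy range with the current state-of-the-art maximum likelihood techniques used in IceCube analyses, including the effects of known systematic uncertainties. For neutrino event classification, the GNN increases the signal efficiency by 18% at a fixed false positive rate (FPR) compared with current IceCube methods. Alternatively, the GNN reduces the FPR by more than a factor of 8 (to below half a percent) at a fixed signal efficiency. For the reconstruction of energy, direction, and interaction vertex, the resolution improves by an average of 13%-20% compared with current maximum likelihood techniques. When running on a GPU, the GNN can process IceCube events at a rate close to the median IceCube trigger rate of 2.7 kHz, which opens up the possibility of using low-energy neutrinos in online searches for transient events.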
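
To make the "point cloud graph" representation concrete, here is a minimal sketch of one common way to turn a set of sensor hits into a graph: connect each hit to its k nearest neighbours in space. The abstract does not state how the graphs are built, so the k-nearest-neighbour rule, the `(x, y, z)` hit format, and the name `knn_edges` are assumptions made for illustration only.

```python
import math


def knn_edges(points, k):
    """Return directed edges (i, j) linking each point to its k nearest neighbours.

    `points` is a list of (x, y, z) hit positions; this format is an assumption.
    """
    edges = []
    for i, p in enumerate(points):
        # Distance from point i to every other point, closest first.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        edges.extend((i, j) for _, j in dists[:k])
    return edges


# Three hypothetical hits along a line; each is linked to its single nearest hit.
hits = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
edges = knn_edges(hits, k=1)
```

A GNN would then pass messages along these edges, with per-hit quantities such as position, time, and charge as node features.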

The Metaverse Data Deluge: What Can We Do About It?

Beng Chin Ooi , Gang Chen , Mike Zheng Shou , Kian-Lee Tan , Anthony Tung , Xiaokui Xiao , James Wei Luen Yip , Meihui Zhang

Categories: Artificial Intelligence | Computer Vision

2022-06-14

In the Metaverse, the physical space and the virtual space co-exist, and interact simultaneously. While the physical space is virtually enhanced with information, the virtual space is continuously refreshed with real-time, real-world information. To allow users to process and manipulate information seamlessly between the real and digital spaces, novel technologies must be developed. These include smart interfaces, new augmented realities, efficient storage and data management and dissemination techniques. In this paper, we first discuss some promising co-space applications. These applications offer opportunities that neither of the spaces can realize on its own. We then discuss challenges. Finally, we discuss and envision what are likely to be required from the database and system perspectives.

CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition

Wenliang Dai , Samuel Cahyawijaya , Tiezheng Yu , Elham J. Barezi , Peng Xu , Cheuk Tung Shadow Yiu , Rita Frieske , Holy Lovenia , Genta Indra Winata , Qifeng Chen

Categories: Natural Language Processing | Artificial Intelligence

2022-01-11

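With the rise of deep learning and intelligent vehicles, the smart assistant has become an essential in-car component for facilitating driving and providing extra functionality. In-car smart assistants should be able to process both general and car-related commands and perform the corresponding actions, which eases driving and improves safety. However, low-resource languages suffer from data scarcity, which hinders the development of research and applications. In this paper, we introduce a new dataset, Cantonese In-car Audio-Visual Speech Recognition (CI-AVSR), for in-car command recognition in Cantonese with both video and audio data. It consists of 4,984 samples (8.3 hours) of 200 in-car commands recorded by 30 native Cantonese speakers. Furthermore, we augment our dataset with common in-car background noises to simulate real environments, producing a dataset 10 times larger than the collected one. We provide detailed statistics for both the clean and the augmented versions of our dataset. Moreover, we implement two multimodal baselines to demonstrate the validity of CI-AVSR. Experimental results show that leveraging the visual signal improves the overall performance of the model. Although our best model achieves considerable quality on the clean test set, speech recognition quality on the noisy data is still inferior, and this remains an extremely challenging task for real in-car speech recognition systems. The dataset and code will be released at https://github.com/hltchkust/ci-avsr.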

Automatic Speech Recognition Datasets in Cantonese Language: A Survey and a New Dataset

Tiezheng Yu , Rita Frieske , Peng Xu , Samuel Cahyawijaya , Cheuk Tung Shadow Yiu , Holy Lovenia , Wenliang Dai , Elham J. Barezi , Qifeng Chen , Xiaojuan Ma

Category: Natural Language Processing

2022-01-07

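Automatic speech recognition (ASR) for low-resource languages improves the access of linguistic minorities to the technological advantages provided by artificial intelligence (AI). In this paper, we address the problem of data scarcity for Hong Kong Cantonese by creating a new Cantonese dataset. Our dataset, the Multi-Domain Cantonese Corpus (MDCC), consists of 73.6 hours of clean read speech paired with transcripts, collected from Cantonese audiobooks from Hong Kong. It spans the philosophy, politics, education, culture, lifestyle, and family domains, covering a wide range of topics. We also review all existing Cantonese datasets and perform experiments on the two largest ones (MDCC and Common Voice zh-HK). We analyze the existing datasets according to their speech type, data source, total size, and availability. The results of experiments with the Fairseq S2T Transformer, a state-of-the-art ASR model, show the effectiveness of our dataset. In addition, we create a powerful and robust Cantonese ASR model by applying multi-dataset learning on MDCC and Common Voice zh-HK.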

Machine Learning Approach to Polymerization Reaction Engineering: Determining Monomers Reactivity Ratios

Tung Nguyen , Mona Bavarian

Category: Machine Learning

2023-01-03

Here, we demonstrate how machine learning enables the prediction of comonomer reactivity ratios based on the molecular structure of the monomers. We combined multi-task learning, multiple inputs, and a Graph Attention Network to build a model capable of predicting reactivity ratios from the monomers' chemical structures.

Egocentric Video Task Translation

Zihui Xue , Yale Song , Kristen Grauman , Lorenzo Torresani

Category: Computer Vision

2022-12-13

Different video understanding tasks are typically treated in isolation, and even with distinct types of curated data (e.g., classifying sports in one dataset, tracking animals in another). However, in wearable cameras, the immersive egocentric perspective of a person engaging with the world around them presents an interconnected web of video understanding tasks -- hand-object manipulations, navigation in the space, or human-human interactions -- that unfold continuously, driven by the person's goals. We argue that this calls for a much more unified approach. We propose EgoTask Translation (EgoT2), which takes a collection of models optimized on separate tasks and learns to translate their outputs for improved performance on any or all of them at once. Unlike traditional transfer or multi-task learning, EgoT2's flipped design entails separate task-specific backbones and a task translator shared across all tasks, which captures synergies between even heterogeneous tasks and mitigates task competition. Demonstrating our model on a wide array of video tasks from Ego4D, we show its advantages over existing transfer paradigms and achieve top-ranked results on four of the Ego4D 2022 benchmark challenges.

Deep Learning Generates Synthetic Cancer Histology for Explainability and Education

James M. Dolezal , Rachelle Wolk , Hanna M. Hieromnimon , Frederick M. Howard , Andrew Srisuwananukorn , Dmitry Karpeyev , Siddhi Ramesh , Sara Kochanny , Jung Woo Kwon , Meghana Agni

Category: Computer Vision

2022-11-12

Artificial intelligence methods including deep neural networks (DNN) can provide rapid molecular classification of tumors from routine histology with accuracy that matches or exceeds human pathologists. Discerning how neural networks make their predictions remains a significant challenge, but explainability tools help provide insights into what models have learned when corresponding histologic features are poorly defined. Here, we present a method for improving explainability of DNN models using synthetic histology generated by a conditional generative adversarial network (cGAN). We show that cGANs generate high-quality synthetic histology images that can be leveraged for explaining DNN models trained to classify molecularly-subtyped tumors, exposing histologic features associated with molecular state. Fine-tuning synthetic histology through class and layer blending illustrates nuanced morphologic differences between tumor subtypes. Finally, we demonstrate the use of synthetic histology for augmenting pathologist-in-training education, showing that these intuitive visualizations can reinforce and improve understanding of histologic manifestations of tumor biology.

Online pseudo labeling for polyp segmentation with momentum networks

Toan Pham Van , Linh Bao Doan , Thanh Tung Nguyen , Duc Trung Tran , Quan Van Nguyen , Dinh Viet Sang

Category: Computer Vision

2022-09-29

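Semantic segmentation is an essential task in developing medical image diagnosis systems. However, building an annotated medical dataset is expensive, so semi-supervised methods are important in this setting. In semi-supervised learning, the quality of the labels plays a crucial role in model performance. In this work, we present a new pseudo-labeling strategy that improves the quality of the pseudo labels used to train student networks. We follow the multi-stage semi-supervised training approach, which trains a teacher model on a labeled dataset and then uses the trained teacher to render pseudo labels for student training. By doing so, the pseudo labels are updated and become more precise as training progresses. The key difference between previous methods and ours is that we update the teacher model during the student training process, so the quality of the pseudo labels improves while the student is being trained. We also propose a simple but effective strategy for improving the quality of pseudo labels using a momentum model, a slowly updated copy of the original model maintained during training. By combining the momentum model with re-rendering pseudo labels during student training, we achieve an average Dice score of 84.1% on five datasets (Kvasir, CVC-ClinicDB, ETIS-LaribPolypDB, CVC-ColonDB, and CVC-300) with only 20% of the dataset used as labeled data. Our results surpass common practice by 3% and even reach fully supervised results on some datasets. Our source code and pre-trained models are available at https://github.com/sun-asterisk-research/online学习SSL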
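
The "momentum model" described above is commonly realized as an exponential moving average (EMA) of the student's weights. The sketch below illustrates that update under stated assumptions: the abstract does not give the exact rule, so the EMA form, the flat list-of-floats weight representation, the function name `ema_update`, and the decay value are all hypothetical.

```python
def ema_update(momentum_weights, student_weights, decay=0.99):
    """Move each momentum weight a small step toward the matching student weight.

    With a decay close to 1 the momentum model changes slowly, which is what
    makes it a "slow copy" of the model being trained. The default decay is a
    hypothetical value, not one taken from the paper.
    """
    return [
        decay * m + (1.0 - decay) * s
        for m, s in zip(momentum_weights, student_weights)
    ]


# Hypothetical two-parameter model: the momentum copy drifts toward the student.
momentum = [1.0, 0.0]
student = [0.0, 1.0]
momentum = ema_update(momentum, student, decay=0.9)
```

In a training loop this update would run after each student optimization step, and the slowly changing copy would be the one used to re-render pseudo labels.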
