Automatically detecting lung infections from computed tomography (CT) data plays an important role in combating COVID-19. However, developing such AI systems still faces several challenges. 1) Most current COVID-19 infection segmentation methods rely mainly on 2D CT images, which lack 3D sequential constraints. 2) Existing 3D CT segmentation methods focus on single-scale representations, which do not provide receptive fields of multiple sizes on the 3D volume. 3) The sudden outbreak of COVID-19 makes it hard to annotate enough CT volumes for training deep models. To address these issues, we first build a multiple dimensional-attention convolutional neural network (MDA-CNN) that aggregates multi-scale information along different dimensions of the input feature maps and imposes supervision on multiple predictions from different CNN layers. Second, we use this MDA-CNN as the basic network of a novel dual multi-scale mean teacher network (DM${^2}$T-Net) for semi-supervised COVID-19 lung infection segmentation on CT volumes, which leverages unlabeled data and explores multi-scale information. Our DM${^2}$T-Net encourages the multiple predictions at different CNN layers of the student and teacher networks to be consistent, yielding a multi-scale consistency loss on unlabeled data, which is then added to the supervised loss computed on the labeled data from the multiple predictions of MDA-CNN. Third, we collect two COVID-19 segmentation datasets to evaluate our method. The experimental results show that our network consistently outperforms the compared state-of-the-art methods.
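The mean-teacher training signal described in this abstract can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not give the exact form of the losses, so the exponential-moving-average teacher update, the use of mean squared error for consistency, the weighting factor, and all function names below (`ema_update`, `multi_scale_consistency`, `total_loss`) are hypothetical, chosen because they are common in mean-teacher methods.

```python
def ema_update(teacher_weights, student_weights, alpha=0.99):
    """Move each teacher weight toward the student weight.

    Assumption: the teacher is an exponential moving average (EMA) of the
    student, as in standard mean-teacher training.
    """
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_weights, student_weights)]


def mse(a, b):
    """Mean squared error between two equally sized prediction lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def multi_scale_consistency(student_preds, teacher_preds):
    """Average the per-scale MSE between student and teacher predictions.

    Each argument is a list with one flattened prediction per CNN layer
    (scale). MSE is an assumed choice of consistency measure.
    """
    per_scale = [mse(s, t) for s, t in zip(student_preds, teacher_preds)]
    return sum(per_scale) / len(per_scale)


def total_loss(supervised_loss, consistency_loss, weight=0.1):
    """Supervised loss on labeled data plus weighted consistency loss
    on unlabeled data. The weight value is a placeholder."""
    return supervised_loss + weight * consistency_loss
```

In such a setup, the student would be updated by gradient descent on `total_loss`, while the teacher would only be updated through `ema_update` after each step.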
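Cervical abnormal cell detection is a challenging task because the morphological differences between abnormal and normal cells are usually subtle. To determine whether a cervical cell is normal or abnormal, cytopathologists always take the surrounding cells as references and compare them carefully to identify its abnormality. To mimic this clinical behavior, we propose to explore contextual relationships to improve the performance of cervical abnormal cell detection. Specifically, both the contextual relationships between cells and those between cells and the global image are exploited to enhance the features of each region of interest (RoI) proposal. Accordingly, two modules, termed the RoI-relationship attention module (RRAM) and the global RoI attention module (GRAM), are developed, and their combination strategies are also investigated. We establish strong baselines by using single-head or double-head Faster R-CNN with a feature pyramid network (FPN), and integrate our RRAM and GRAM into them to validate the effectiveness of the proposed modules. Experiments conducted on a large cervical cell detection dataset consisting of 40,000 cytology images show that introducing either RRAM or GRAM achieves better average precision (AP) than the baseline methods. Moreover, when RRAM and GRAM are cascaded, our method outperforms state-of-the-art (SOTA) methods. Furthermore, we show that the proposed feature-enhancing scheme can facilitate both image-level and smear-level classification. The code and trained models are publicly available at https://github.com/cviu-csu/cr4cacd.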
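Achieving human-level safety performance for autonomous vehicles (AVs) remains a challenge. One critical bottleneck is the so-called "long-tail challenge", which usually refers to the problem that AVs should be able to handle seemingly endless low-probability safety-critical driving scenarios, even after millions of testing miles have been accumulated on public roads. However, there is neither a rigorous definition of this problem nor an analysis of its properties, which hinders progress in addressing it. In this paper, we systematically analyze the "long-tail challenge" and propose the concept of the "curse of rarity" (CoR). We conclude that the CoR, which stems from the rarity of safety-critical events in high-dimensional driving environments and compounds on top of the "curse of dimensionality" (CoD), is the root cause of the "long-tail challenge". We discuss the CoR in various aspects of AV development, including perception, prediction, and planning, as well as validation and verification. Based on these analyses and discussions, we propose potential solutions for addressing the CoR in order to accelerate AV development and deployment.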
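The 6th edition of the AI City Challenge specifically focuses on problems in two domains with tremendous unlocked potential at the intersection of computer vision and artificial intelligence: Intelligent Traffic Systems (ITS) and brick-and-mortar retail businesses. The four challenge tracks of the 2022 AI City Challenge received participation requests from 254 teams across 27 countries. Track 1 addressed city-scale multi-target multi-camera (MTMC) vehicle tracking. Track 2 addressed natural-language-based vehicle track retrieval. Track 3 was a brand-new track for naturalistic driving analysis, in which the data were captured by several cameras mounted inside the vehicle and focused on driver safety, and the task was to classify driver actions. Track 4 was another new track that aimed to achieve automated checkout in retail stores using only a single-view camera. We released two leaderboards for submissions based on different methods: a public leaderboard for the contest, where no external data is allowed, and a general leaderboard for all submitted results. The top performance of the participating teams established strong baselines and even outperformed the state of the art in the proposed challenge tracks.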
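Deep learning (DL) models have provided state-of-the-art performance in a variety of medical imaging benchmark challenges, including the Brain Tumor Segmentation (BraTS) challenges. However, the task of focal pathology multi-compartment segmentation (e.g., tumor and lesion sub-regions) is particularly challenging, and potential errors hinder the translation of DL models into clinical workflows. Quantifying the reliability of DL model predictions in the form of uncertainties could enable clinical review of the most uncertain regions, thereby building trust and paving the way toward clinical translation. Recently, a number of uncertainty estimation methods have been introduced for DL medical image segmentation tasks. Developing scores to evaluate and compare the performance of uncertainty measures will help end users make more informed decisions. In this study, we explore and evaluate a score developed during the BraTS 2019 and BraTS 2020 tasks on uncertainty quantification (QU-BraTS), designed to assess and rank uncertainty estimates for brain tumor multi-compartment segmentation. This score (1) rewards uncertainty estimates that produce high confidence in correct assertions and those that assign low confidence to incorrect assertions, and (2) penalizes uncertainty measures that lead to a higher percentage of under-confident correct assertions. We further benchmark the segmentation uncertainties generated by 14 independent participating teams of QU-BraTS 2020, all of which also participated in the main BraTS segmentation task. Overall, our findings confirm the importance and complementary value that uncertainty estimates provide to segmentation algorithms, highlighting the need for uncertainty quantification in medical image analysis. Our evaluation code is publicly available at https://github.com/ragmeh11/qu-brats.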
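Microscopic traffic simulation provides a controllable, repeatable, and efficient testing environment for autonomous vehicles (AVs). To evaluate the safety performance of AVs without bias, the probability distributions of environment statistics in the simulated naturalistic driving environment (NDE) must be consistent with those of the real-world driving environment. However, although human driving behavior has been studied extensively in transportation engineering, most existing models were developed for traffic flow analysis without considering the distributional consistency of driving behaviors, which may cause significant evaluation bias in AV testing. To fill this research gap, this paper proposes a distributionally consistent NDE modeling framework. Using large-scale naturalistic driving data, empirical distributions are obtained to construct stochastic human driving behavior models under different conditions. To address error accumulation during simulation, an optimization-based method is further designed to refine the empirical behavior models. Specifically, the evolution of vehicle states is modeled as a Markov chain, and its stationary distribution is twisted to match the distribution of the real-world driving environment. The framework is evaluated in a case study of multi-lane highway driving simulation, in which the distributional accuracy of the generated NDE is validated and the safety performance of an AV model is effectively evaluated.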
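To make the Markov-chain idea in this abstract concrete, the following minimal sketch computes the stationary distribution of a small transition matrix by power iteration and measures how far it is from a target distribution. It is not the authors' optimization method: the abstract does not describe how the stationary distribution is adjusted, so this only illustrates the quantity being matched. The function names, the total variation distance as the mismatch measure, and any matrix passed in are assumptions made for illustration.

```python
def stationary_distribution(P, iterations=500):
    """Approximate the stationary distribution of a row-stochastic matrix P.

    Uses power iteration (pi <- pi P), which converges when the chain is
    irreducible and aperiodic. P is a list of rows, each summing to 1.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))
```

For example, one could call `stationary_distribution` on a behavior model's transition matrix and pass the result, together with an empirical real-world distribution, to `total_variation`; a refinement procedure would then aim to drive that distance toward zero.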
A growing body of research focuses on sequential recommender systems, aiming to model dynamic sequence representations precisely. However, the most commonly used loss functions in state-of-the-art sequential recommendation models have essential limitations. To name a few, Bayesian Personalized Ranking (BPR) loss suffers from the vanishing gradient problem caused by numerous negative samples and prediction biases; Binary Cross-Entropy (BCE) loss is subject to the number of negative samples, so it is likely to ignore valuable negative examples and reduce training efficiency; Cross-Entropy (CE) loss focuses only on the last timestamp of the training sequence, which leads to low utilization of sequence information and results in inferior user sequence representations. To avoid these limitations, in this paper we propose to calculate a Cumulative Cross-Entropy (CCE) loss over the sequence. CCE is simple and direct, and enjoys the virtues of painless deployment, no negative sampling, and effective and efficient training. We conduct extensive experiments on five benchmark datasets to demonstrate the effectiveness and efficiency of CCE. The results show that employing CCE loss on three state-of-the-art models, GRU4Rec, SASRec, and S3-Rec, yields average improvements in full-ranking NDCG@5 of 125.63%, 69.90%, and 33.24%, respectively. With CCE, the performance curve of the models on the test data rises rapidly with wall-clock time and is superior to that of other loss functions for almost the entire training process.
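The contrast between last-step CE and a loss accumulated over the sequence can be sketched as follows. This is a hedged illustration rather than the paper's definition: the abstract does not state whether CCE sums or averages over timestamps, so summation is an assumption here, and the function names and the input layout (one list of item logits per timestamp, with the index of the true next item as target) are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of item scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def last_step_ce(logits_per_step, targets):
    """Cross-entropy that uses only the final timestamp of the sequence."""
    return -log_softmax(logits_per_step[-1])[targets[-1]]


def cumulative_ce(logits_per_step, targets):
    """Cross-entropy accumulated over every timestamp of the sequence.

    Assumption: the per-step losses are summed; the softmax runs over the
    full item set, so no negative sampling is involved.
    """
    return sum(-log_softmax(logits)[t]
               for logits, t in zip(logits_per_step, targets))
```

With uniform scores over four items and a sequence of three steps, `last_step_ce` contributes a single term of log 4, whereas `cumulative_ce` contributes three such terms, which shows how every position in the sequence produces a training signal.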
In the black-box adversarial attack scenario, the target model's parameters are unknown, and the attacker aims to find a successful adversarial perturbation based on query feedback under a query budget. Because the feedback information is limited, existing query-based black-box attack methods often require many queries to attack each benign example. To reduce the query cost, we propose to utilize the feedback information across historical attacks, dubbed example-level adversarial transferability. Specifically, by treating the attack on each benign example as one task, we develop a meta-learning framework that trains a meta-generator to produce perturbations conditioned on benign examples. When attacking a new benign example, the meta-generator can be quickly fine-tuned based on the feedback information of the new task as well as a few historical attacks to produce effective perturbations. Moreover, since the meta-training procedure consumes many queries to learn a generalizable generator, we utilize model-level adversarial transferability to train the meta-generator on a white-box surrogate model and then transfer it to help the attack against the target model. The proposed framework with the two types of adversarial transferability can be naturally combined with any off-the-shelf query-based attack method to boost its performance, which is verified by extensive experiments.
Supervised Deep-Learning (DL)-based reconstruction algorithms have shown state-of-the-art results for highly undersampled dynamic Magnetic Resonance Imaging (MRI) reconstruction. However, their need for large amounts of high-quality ground-truth data hinders their application because of the generalization problem. Recently, Implicit Neural Representation (INR) has emerged as a powerful DL-based tool for solving inverse problems by characterizing the attributes of a signal as a continuous function of the corresponding coordinates in an unsupervised manner. In this work, we propose an INR-based method to improve dynamic MRI reconstruction from highly undersampled k-space data, which takes only spatiotemporal coordinates as inputs. Specifically, the proposed INR represents the dynamic MRI images as an implicit function and encodes them into neural networks. The weights of the network are learned only from the sparsely acquired (k, t)-space data itself, without external training datasets or prior images. Benefiting from the strong implicit continuity regularization of INR together with explicit regularization for low-rankness and sparsity, our proposed method outperforms the compared scan-specific methods at various acceleration factors. For example, experiments on retrospective cardiac cine datasets show an improvement of 5.5 to 7.1 dB in PSNR at extremely high accelerations (up to 41.6-fold). The high quality and inner continuity of the images provided by INR have great potential to further improve the spatiotemporal resolution of dynamic MRI, without the need for any training data.
Recent studies have shown that using an external Language Model (LM) benefits end-to-end Automatic Speech Recognition (ASR). However, predicting tokens that appear less frequently in the training set is still quite challenging. Long-tail prediction problems have been widely studied in many applications, but have been addressed by only a few studies for ASR and LMs. In this paper, we propose a new memory-augmented, lookup-dictionary-based Transformer architecture for LMs. The newly introduced lookup dictionary incorporates rich contextual information from the training set, which is vital for correctly predicting long-tail tokens. In extensive experiments on Chinese and English data sets, our proposed method is shown to outperform the baseline Transformer LM by a large margin on both word/character error rate and tail-token error rate. This is achieved without affecting decoding efficiency. Overall, we demonstrate the effectiveness of our proposed method in boosting ASR decoding performance, especially for long-tail tokens.