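Human brains lie at the core of complex neurobiological systems, where neurons, circuits, and subsystems interact in enigmatic ways. Understanding the structural and functional mechanisms of the brain has long been an intriguing pursuit for neuroscience research and clinical disorder therapy. Mapping the connections of the human brain as a network is one of the most pervasive paradigms in neuroscience. Graph Neural Networks (GNNs) have recently emerged as a potential method for modeling complex network data. Deep models, on the other hand, have low interpretability, which prevents their use in decision-critical contexts such as healthcare. To bridge this gap, we propose an interpretable framework to analyze specific Regions of Interest (ROIs) and prominent connections. The proposed framework consists of two modules: a brain-network-oriented backbone model for disease prediction and a globally shared explanation generator that highlights disorder-specific biomarkers, including salient ROIs and important connections. We conduct experiments on three real-world datasets of brain disorders. The results verify that our framework can obtain outstanding performance and identify meaningful biomarkers. All code for this work is available at https://github.com/hennyjie/ibgnn.git.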
Translated by Google Translate
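Combining information from multi-view images is crucial for improving the performance and robustness of automated methods for disease diagnosis. However, because multi-view images are not aligned, building correlations and fusing data across views largely remains an open problem. In this study, we present TransFusion, a Transformer-based architecture that merges divergent multi-view imaging information using convolutional layers and powerful attention mechanisms. In particular, the Divergent Fusion Attention (DIFA) module is proposed for rich cross-view context modeling and semantic dependency mining, addressing the critical issue of capturing long-range correlations between unaligned data from different image views. We further propose Multi-Scale Attention (MSA) to collect global correspondences of multi-scale feature representations. We evaluate TransFusion on the Multi-Disease, Multi-View \& Multi-Center Right Ventricular Segmentation in Cardiac MRI (M\&Ms-2) challenge cohort. TransFusion demonstrates leading performance against state-of-the-art methods and opens up new perspectives for multi-view imaging integration towards robust medical image segmentation.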
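The top-down instance segmentation framework has shown its superiority in object detection compared to the bottom-up framework. While it efficiently addresses over-segmentation, top-down instance segmentation suffers from the over-crop problem. However, a complete segmentation mask is crucial for biological image analysis, as it carries important morphological properties such as shape and volume. In this paper, we propose a region proposal rectification (RPR) module to address this challenging incomplete segmentation problem. In particular, we offer a progressive ROIAlign module to gradually introduce neighbor information into a series of ROIs. The ROI features are fed into an attentive feed-forward network (FFN) for proposal box regression. With the additional neighbor information, the proposed RPR module significantly improves the correction of region proposal locations and thereby exhibits favorable instance segmentation performance on three biological image datasets compared to state-of-the-art baseline methods. Experimental results demonstrate that the proposed RPR module is effective in both anchor-based and anchor-free top-down instance segmentation approaches, suggesting that the method can be applied to general top-down instance segmentation of biological images. Code is available.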
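End-to-end Automatic Speech Recognition (ASR) models are usually trained to optimize the loss over the whole token sequence, while neglecting explicit phoneme-granularity supervision. This can result in recognition errors caused by confusion between similar phonemes or by phoneme reduction. To alleviate this problem, we propose a novel framework based on Supervised Contrastive Learning (SCaLa) to enhance phonemic representation learning for end-to-end ASR systems. Specifically, we extend self-supervised Masked Contrastive Predictive Coding (MCPC) to a fully supervised setting, where the supervision is applied in the following way. First, SCaLa masks variable-length encoder features according to phoneme boundaries, given phoneme forced alignments extracted from a pre-trained acoustic model; it then predicts the masked features via contrastive learning. The forced alignment provides phoneme labels that mitigate the noise introduced by positive-negative pairs in self-supervised MCPC. Experiments on read and spontaneous speech datasets show that our proposed approach achieves absolute Character Error Rate (CER) reductions of 2.8 and 1.4 points, respectively, compared to the baseline.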
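The two steps described above (masking whole phoneme segments, then scoring a predicted feature against positives and negatives) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `phoneme_segments` and `contrastive_loss`, the cosine similarity, the temperature value, and the InfoNCE-style loss form are all assumptions chosen for illustration.

```python
import math


def phoneme_segments(frame_labels):
    """Group consecutive frames sharing a phoneme label into
    (start, end, label) spans, with `end` exclusive.
    Such spans could serve as variable-length masking units."""
    segments = []
    start = 0
    for i in range(1, len(frame_labels) + 1):
        if i == len(frame_labels) or frame_labels[i] != frame_labels[start]:
            segments.append((start, i, frame_labels[start]))
            start = i
    return segments


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def contrastive_loss(prediction, positive, negatives, temperature=0.1):
    """InfoNCE-style loss (an assumed form): low when `prediction` is
    closer to `positive` than to every vector in `negatives`."""
    logits = [cosine(prediction, positive) / temperature]
    logits += [cosine(prediction, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    max_logit = max(logits)
    log_sum = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return log_sum - logits[0]


# Hypothetical frame-level alignment for six encoder frames.
spans = phoneme_segments(["a", "a", "b", "c", "c", "c"])

# Toy 2-D features: a prediction matching the positive vs. one matching a negative.
good = contrastive_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
bad = contrastive_loss([0.0, 1.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
```

With phoneme labels available, one plausible design is to draw negatives only from frames whose label differs from that of the masked segment, which avoids treating same-phoneme frames as negatives.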
Generating new fonts is a time-consuming and labor-intensive task, especially in a language with a huge number of characters such as Chinese. Various deep learning models have demonstrated the ability to efficiently generate new fonts from a few reference characters of a given style. This project aims to develop a few-shot cross-lingual font generator based on AGIS-Net and improve the performance metrics mentioned. Our approaches include redesigning the encoder and the loss function. We will validate our method on multiple languages and datasets mentioned.
In recent years, object detection has improved substantially, but the detection of small objects remains unsatisfactory. To address this issue, this work proposes a strategy based on feature fusion and dilated convolution, using dilated convolution to broaden the receptive field of feature maps at various scales. On the one hand, this can improve the detection accuracy of larger objects. On the other hand, it provides more contextual information for small objects, which helps improve their detection accuracy. The shallow semantic information of small objects is obtained by filtering out the noise in the feature map, and more feature information of small objects is preserved by using a multi-scale fusion feature module and an attention mechanism. Fusing this shallow feature information with deep semantic information generates richer feature maps for small object detection. Experiments show that this method achieves higher accuracy than the traditional YOLOv3 network in detecting small objects and occluded objects. In addition, we achieve 32.8\% Mean Average Precision on small objects on the MS COCO2017 test set. For 640*640 input, this method reaches 88.76\% mAP on the PASCAL VOC2012 dataset.
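To make the receptive-field claim concrete, the following sketch computes how dilation widens the receptive field of stacked convolutions. It uses the standard relation that a kernel of size k with dilation d spans k + (k - 1)(d - 1) input positions. The layer configurations are hypothetical examples, not the network from the abstract.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Input span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field (along one axis) of stacked convolution layers.

    `layers` is a list of (kernel_size, stride, dilation) tuples,
    ordered from the input towards the output.
    """
    rf = 1    # a single output position initially sees one input pixel
    jump = 1  # spacing, in input pixels, between adjacent positions
    for kernel_size, stride, dilation in layers:
        rf += (effective_kernel_size(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


# Hypothetical stacks of three 3x3, stride-1 convolutions.
plain = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]
dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]

print(receptive_field(plain))    # 7
print(receptive_field(dilated))  # 15
```

Both stacks have the same number of weights, but the dilated one sees roughly twice as much context per output position, which is the kind of extra context the abstract argues benefits small objects.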
Existing 3D scene stylization methods employ an arbitrary style reference to transfer textures and colors as styles without establishing meaningful semantic correspondences. We present Reference-Based Non-Photorealistic Radiance Fields, i.e., Ref-NPR. It is a controllable scene stylization method utilizing radiance fields to stylize a 3D scene, with a single stylized 2D view taken as reference. To achieve decent results, we propose a ray registration process based on the stylized reference view to obtain pseudo-ray supervision in novel views, and exploit the semantic correspondence in content images to fill occluded regions with perceptually similar styles. Combining these operations, Ref-NPR generates non-photorealistic and continuous novel view sequences with a single reference while obtaining reasonable stylization in occluded regions. Experiments show that Ref-NPR significantly outperforms other scene and video stylization methods in terms of both visual quality and semantic correspondence. Code and data will be made publicly available.
Computer vision and machine learning are playing an increasingly important role in computer-assisted diagnosis; however, the application of deep learning to medical imaging has challenges in data availability and data imbalance, and it is especially important that models for medical imaging are built to be trustworthy. Therefore, we propose TRUDLMIA, a trustworthy deep learning framework for medical image analysis, which adopts a modular design, leverages self-supervised pre-training, and utilizes a novel surrogate loss function. Experimental evaluations indicate that models generated from the framework are both trustworthy and high-performing. It is anticipated that the framework will support researchers and clinicians in advancing the use of deep learning for dealing with public health crises including COVID-19.
Various types of Multi-Agent Reinforcement Learning (MARL) methods have been developed, assuming that agents' policies are based on true states. Recent works have improved the robustness of MARL under uncertainties from the reward, transition probability, or other partners' policies. However, in real-world multi-agent systems, state estimations may be perturbed by sensor measurement noise or even adversaries. Agents' policies trained with only true state information will deviate from optimal solutions when facing adversarial state perturbations during execution. MARL under adversarial state perturbations has received limited study. Hence, in this work, we propose a State-Adversarial Markov Game (SAMG) and make the first attempt to study the fundamental properties of MARL under state uncertainties. We prove that the optimal agent policy and the robust Nash equilibrium do not always exist for an SAMG. Instead, we define a solution concept for the proposed SAMG under adversarial state perturbations, the robust agent policy, in which agents aim to maximize the worst-case expected state value. We then design a gradient descent ascent-based robust MARL algorithm to learn the robust policies for the MARL agents. Our experiments show that adversarial state perturbations decrease agents' rewards for several baselines from the existing literature, while our algorithm outperforms baselines with state perturbations and significantly improves the robustness of the MARL policies under state uncertainties.
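The max-min structure behind "maximize the worst-case expected state value" can be illustrated with gradient descent ascent on a toy problem. The sketch below is not the paper's algorithm: the scalar objective `value`, the parameter names `theta` (standing in for the agent) and `delta` (standing in for the adversary), and the step size are all assumptions chosen so that the saddle point is known in closed form.

```python
def value(theta: float, delta: float) -> float:
    """Toy objective: concave in theta (agent), convex in delta (adversary)."""
    return -(theta - 1.0) ** 2 + theta * delta + delta ** 2


def grad_theta(theta: float, delta: float) -> float:
    """Partial derivative of `value` with respect to theta."""
    return -2.0 * (theta - 1.0) + delta


def grad_delta(theta: float, delta: float) -> float:
    """Partial derivative of `value` with respect to delta."""
    return theta + 2.0 * delta


def gradient_descent_ascent(theta=0.0, delta=0.0, lr=0.1, steps=200):
    """Simultaneous updates: the agent ascends, the adversary descends."""
    for _ in range(steps):
        g_theta = grad_theta(theta, delta)
        g_delta = grad_delta(theta, delta)
        theta += lr * g_theta  # agent maximizes the value
        delta -= lr * g_delta  # adversary minimizes the value
    return theta, delta


theta_star, delta_star = gradient_descent_ascent()
```

Setting both partial derivatives to zero gives the saddle point theta = 0.8, delta = -0.4, which the iteration approaches for this step size. In a real MARL setting the objective is non-convex and estimated from samples, so convergence of this kind is not guaranteed.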
Recent work in sim2real has successfully enabled robots to act in physical environments by training in simulation with a diverse "population" of environments (i.e., domain randomization). In this work, we focus on enabling generalization in assistive tasks: tasks in which the robot is acting to assist a user (e.g., helping someone with motor impairments with bathing or with scratching an itch). Such tasks are particularly interesting relative to prior sim2real successes because the environment now contains a human who is also acting. This complicates the problem because the diversity of human users (instead of merely physical environment parameters) is more difficult to capture in a population, thus increasing the likelihood of encountering out-of-distribution (OOD) human policies at test time. We advocate that generalization to such OOD policies benefits from (1) learning a good latent representation for human policies, to which test-time humans can be accurately mapped, and (2) making that representation adaptable with test-time interaction data, instead of relying on it to perfectly capture the space of human policies based on the simulated population only. We study how to best learn such a representation by evaluating on purposefully constructed OOD test policies. We find that sim2real methods that encode environment (or population) parameters, and that work well in tasks robots do in isolation, do not work well in assistance. In assistance, it seems crucial to train the representation based on the history of interaction directly, because that is what the robot will have access to at test time. Further, training these representations to predict human actions not only gives them better structure, but also enables them to be fine-tuned at test time, when the robot observes the partner act. https://adaptive-caregiver.github.io.