Modeling the noise transition matrix is a promising approach to learning with label noise. Given the estimated noise transition matrix and the noisy posterior probabilities, the clean posterior probabilities, which are jointly called the Label Distribution (LD) in this paper, can be calculated and used as the supervision. To estimate the noise transition matrix reliably, some methods assume that anchor points are available during training. However, if the anchor points are invalid, the noise transition matrix may be poorly learned, resulting in poor performance. Other methods therefore treat reliable data points, extracted from the training data, as pseudo anchor points. From a statistical point of view, however, the noise transition matrix can be inferred from data with noisy labels under the clean-label-domination assumption. We therefore aim to estimate the noise transition matrix without (pseudo) anchor points. There is evidence that samples are more likely to be mislabeled as similar classes, which means that the mislabeling probability is highly correlated with the inter-class correlation. Inspired by this observation, we propose an instance-specific Label Distribution Regularization (LDR), in which the instance-specific LD is estimated as the supervision, to prevent DCNNs from memorizing noisy labels. Specifically, we estimate the noisy posterior under the supervision of noisy labels, and approximate the batch-level noise transition matrix by estimating the inter-class correlation matrix, using neither anchor points nor pseudo anchor points. Experimental results on two synthetic noisy datasets and two real-world noisy datasets demonstrate that our LDR outperforms existing methods.
Translated by Google Translate
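The relation between clean and noisy posteriors described in the label-noise abstract above can be illustrated with a small sketch. It assumes the common convention T[i][j] = P(noisy label = j | clean label = i), so the noisy posterior is the clean posterior multiplied by T. The function names and the example matrix are hypothetical, and this is not the paper's LDR estimator, only the linear relation it builds on.

```python
def noisy_from_clean(clean, T):
    """Noisy posterior: p(noisy=j | x) = sum_i p(clean=i | x) * T[i][j]."""
    k = len(T)
    return [sum(clean[i] * T[i][j] for i in range(k)) for j in range(k)]


def clean_from_noisy(noisy, T):
    """Recover the clean posterior by solving T^T c = noisy.

    Uses Gauss-Jordan elimination with partial pivoting; T must be invertible.
    """
    k = len(T)
    # Augmented matrix [T^T | noisy]: row j holds column j of T.
    A = [[T[i][j] for i in range(k)] + [noisy[j]] for j in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("transition matrix is singular")
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[r][k] for r in range(k)]


# Hypothetical 3-class transition matrix (each row sums to 1).
T = [[0.8, 0.2, 0.0],
     [0.1, 0.8, 0.1],
     [0.0, 0.3, 0.7]]
clean = [0.7, 0.2, 0.1]
noisy = noisy_from_clean(clean, T)
recovered = clean_from_noisy(noisy, T)
```

With an estimated rather than exact T, the recovered vector may fall slightly outside the probability simplex, so a practical pipeline would likely clip and renormalize it.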
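Automated abdominal multi-organ segmentation is a crucial yet challenging task in the computer-aided diagnosis of abdominal organ-related diseases. Although many deep learning models have achieved remarkable success in many medical image segmentation tasks, accurate segmentation of abdominal organs remains challenging because the organs vary in size and the boundaries among them are ambiguous. In this paper, we propose a boundary-aware network (BA-Net) to segment abdominal organs on CT scans and MRI scans. The model contains a shared encoder, a boundary decoder, and a segmentation decoder. Both decoders adopt a multi-scale deep supervision strategy, which can alleviate the issues caused by variable organ sizes. The boundary probability maps produced by the boundary decoder at each scale are used as attention to enhance the segmentation feature maps. We evaluated BA-Net on the Abdominal Multi-Organ Segmentation (AMOS) Challenge dataset and achieved an average Dice score of 89.29% for multi-organ segmentation on CT scans and an average Dice score of 71.92% on MRI scans. The results demonstrate that BA-Net outperforms nnUNet on both segmentation tasks.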
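The average Dice score reported above can be computed roughly as follows. This is a minimal sketch over flattened label maps; the function names are hypothetical, and the convention of scoring an organ absent from both maps as 1.0 is an assumption, since challenge evaluations differ on that point.

```python
def dice_score(pred, target, label):
    """Dice = 2 * |P ∩ G| / (|P| + |G|) for one class label."""
    p = [v == label for v in pred]
    g = [v == label for v in target]
    intersection = sum(1 for a, b in zip(p, g) if a and b)
    total = sum(p) + sum(g)
    # Assumed convention: a class absent from both maps counts as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_dice(pred, target, labels):
    """Average the per-class Dice over the given foreground labels."""
    return sum(dice_score(pred, target, lab) for lab in labels) / len(labels)


# Flattened toy label maps: 0 = background, 1 and 2 = two organs.
pred = [0, 1, 1, 2, 2, 2]
target = [0, 1, 2, 2, 2, 2]
score = mean_dice(pred, target, labels=[1, 2])
```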
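Kidney structure segmentation is a crucial yet challenging task in the computer-aided diagnosis of surgery-based renal cancer. Although many deep learning models have achieved remarkable success in many medical image segmentation tasks, accurate segmentation of kidney structures on computed tomography angiography (CTA) images remains challenging because kidney tumors vary in size and the boundaries between kidney tumors and their surroundings are ambiguous. In this paper, we propose a boundary-aware network (BA-Net) to segment kidneys, kidney tumors, arteries, and veins on CTA scans. The model contains a shared encoder, a boundary decoder, and a segmentation decoder. Both decoders adopt a multi-scale deep supervision strategy, which can alleviate the issues caused by variable tumor sizes. The boundary probability maps produced by the boundary decoder at each scale are used as attention to enhance the segmentation feature maps. We evaluated BA-Net on the Kidney PArsing (KiPA) Challenge dataset and achieved an average Dice score of 89.65% for kidney structure segmentation on CTA scans using 4-fold cross-validation. The results demonstrate the effectiveness of BA-Net.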
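Carotid vessel wall segmentation is a crucial yet challenging task in the computer-aided diagnosis of atherosclerosis. Although many deep learning models have achieved remarkable success in many medical image segmentation tasks, accurate segmentation of the carotid vessel wall on magnetic resonance (MR) images remains challenging because of limited annotations and heterogeneous arteries. In this paper, we propose a semi-supervised label propagation framework to segment the lumen, normal vessel walls, and atherosclerotic vessel walls on 3D MR images. By interpolating the provided annotations, we obtain 3D continuous labels for training a 3D segmentation model. With the trained model, we generate pseudo labels for unlabeled slices so that they can be incorporated into model training. We then use the whole MR scans and the propagated labels to retrain the segmentation model and improve its robustness. We evaluated the label propagation framework on the Carotid Vessel Wall Segmentation and Atherosclerosis Diagnosis (COSMOS) Challenge dataset and achieved a QuanM score of 83.41% on the testing dataset, which took first place on the online evaluation leaderboard. The results demonstrate the effectiveness of the proposed framework.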
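Manual annotation of medical images is highly subjective, leading to inevitable and large annotation biases. Deep learning models may surpass human performance on a variety of tasks, but they may also mimic or amplify these biases. Although we can have multiple annotators and fuse their annotations to reduce stochastic errors, we cannot use this strategy to handle the bias caused by annotators' preferences. In this paper, we highlight the issue of annotator-related biases in medical image segmentation tasks and propose a Preference-involved Annotation Distribution Learning (PADL) framework to address it. The framework uses distribution learning to disentangle an annotator's preference from stochastic errors, so that it produces not only a meta segmentation but also the segmentation that each annotator would make. Under this framework, a stochastic error modeling (SEM) module estimates the meta segmentation map and the average stochastic error map, and a series of human preference modeling (HPM) modules estimate each annotator's segmentation and the corresponding stochastic error. We evaluated our PADL framework on two medical image benchmarks with different imaging modalities, both annotated by multiple medical professionals, and achieved promising performance on all five medical image segmentation tasks.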
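The domain gap, caused mainly by variable medical image quality, is a major obstacle on the path from training a segmentation model in the lab to applying the trained model to unseen clinical data. Domain generalization methods have been proposed to address this issue, but they usually use static convolutions and are therefore less flexible. In this paper, we propose a multi-source domain generalization model based on domain and content adaptive convolution (DCAC) for segmenting medical images across different modalities. Specifically, we design a domain adaptive convolution (DAC) module and a content adaptive convolution (CAC) module and incorporate both into an encoder-decoder backbone. In the DAC module, a dynamic convolutional head is conditioned on the predicted domain code of the input so that the model adapts to the unseen target domain. In the CAC module, a dynamic convolutional head is conditioned on the global image features so that the model adapts to the test image. We evaluated the DCAC model against the baseline and four state-of-the-art domain generalization methods on prostate segmentation, COVID-19 lesion segmentation, and optic cup/optic disc segmentation tasks. Our results not only indicate that the proposed DCAC model outperforms all competing methods on each segmentation task, but also demonstrate the effectiveness of the DAC and CAC modules. Code is available at https://git.io/dcac.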
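The idea of a convolutional head whose kernel is conditioned on a domain code can be sketched as below. This is an illustrative toy rather than the DCAC implementation: the controller is assumed to be a single linear layer, the head is a 1x1 convolution applied to each pixel's channel vector, and all names and shapes are hypothetical.

```python
def linear(x, weights, bias):
    """Plain fully connected layer: y = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def dynamic_1x1_conv(features, code, ctrl_weights, ctrl_bias, out_channels):
    """Apply a 1x1 convolution whose kernel is generated from `code`.

    features: list of pixels, each a list of C channel values.
    code: conditioning vector (for example, a predicted domain code).
    The controller emits out_channels * C numbers, reshaped into the kernel.
    """
    in_channels = len(features[0])
    flat = linear(code, ctrl_weights, ctrl_bias)
    kernel = [flat[o * in_channels:(o + 1) * in_channels] for o in range(out_channels)]
    return [[sum(k * v for k, v in zip(row, pixel)) for row in kernel]
            for pixel in features]


# Two pixels with two channels each; a toy controller that copies the code
# into the kernel, so each domain code selects a different input channel.
features = [[3.0, 5.0], [2.0, 7.0]]
ctrl_weights = [[1.0, 0.0], [0.0, 1.0]]
ctrl_bias = [0.0, 0.0]
out_domain_a = dynamic_1x1_conv(features, [1.0, 0.0], ctrl_weights, ctrl_bias, 1)
out_domain_b = dynamic_1x1_conv(features, [0.0, 1.0], ctrl_weights, ctrl_bias, 1)
```

The same input features yield different outputs under the two codes, which is the property that distinguishes a dynamic head from a static convolution. Conditioning on global image features instead of a domain code would follow the same pattern.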
The interview is regarded as one of the most crucial steps in recruitment. To prepare fully for interviews with recruiters, job seekers usually practice mock interviews with each other. However, such a mock interview with peers is generally far from the real interview experience: the mock interviewers are not guaranteed to be professional and are unlikely to behave like a real interviewer. With the rapid growth of online recruitment in recent years, recruiters increasingly hold interviews online, which makes it possible to collect real interview data from real interviewers. In this paper, we propose a novel application named EZInterviewer, which aims to learn from online interview data and provide mock interview services to job seekers. The task is challenging in two ways: (1) interview data are now available but are still low-resource; (2) generating meaningful and relevant interview dialogs requires a thorough understanding of both resumes and job descriptions. To address the low-resource challenge, EZInterviewer is trained on a very small set of interview dialogs. The key idea is to reduce the number of parameters that rely on interview dialogs by disentangling the knowledge selector and the dialog generator, so that most parameters can be trained with ungrounded dialogs and with resume data, neither of which is low-resource. Evaluation results on a real-world job interview dialog dataset indicate that we achieve promising results in generating mock interviews. With the help of EZInterviewer, we hope to make mock interview practice easier for job seekers.
General nonlinear sieve learnings are classes of nonlinear sieves that can approximate nonlinear functions of high dimensional variables much more flexibly than various linear sieves (or series). This paper considers general nonlinear sieve quasi-likelihood ratio (GN-QLR) based inference on expectation functionals of time series data, where the functionals of interest are based on nonparametric functions that satisfy conditional moment restrictions and are learned using multilayer neural networks. While the asymptotic normality of the estimated functionals depends on some unknown Riesz representer of the functional space, we show that the optimally weighted GN-QLR statistic is asymptotically Chi-square distributed, regardless of whether the expectation functional is regular (root-$n$ estimable) or not. This holds when the data are weakly dependent and satisfy a beta-mixing condition. We apply our method to off-policy evaluation in reinforcement learning by formulating the Bellman equation in the conditional moment restriction framework, so that we can make inference about the state-specific value functional using the proposed GN-QLR method with time series data. In addition, estimation of the averaged partial means and averaged partial derivatives of nonparametric instrumental variables and quantile IV models is presented as a leading example. Finally, a Monte Carlo study shows the finite sample performance of the procedure.
This paper presents a safety-critical locomotion control framework for quadrupedal robots. Our goal is to enable quadrupedal robots to navigate safely in cluttered environments. To this end, we introduce exponential Discrete Control Barrier Functions (exponential DCBFs) with duality-based obstacle avoidance constraints into a Nonlinear Model Predictive Control (NMPC) with Whole-Body Control (WBC) framework for quadrupedal locomotion control. This enables us to use polytopes to describe the shapes of the robot and the obstacles for collision avoidance during locomotion control. In contrast to most prior work, especially work using CBFs, which relies on spherical and conservative approximations for obstacle avoidance, this work demonstrates a quadrupedal robot autonomously and safely navigating through very tight spaces in the real world. (Our open-source code is available at github.com/HybridRobotics/quadruped_nmpc_dcbf_duality, and the video is available at youtu.be/p1gSQjwXm1Q.)
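To give a feel for what a discrete-time barrier constraint enforces, the sketch below checks the standard DCBF condition h(x_{k+1}) - h(x_k) >= -gamma * h(x_k) with 0 < gamma <= 1 for a point robot. It is only an illustration under simplifying assumptions: the obstacle is a circle, which is exactly the kind of conservative approximation the paper moves away from, and neither the exponential variant nor the duality-based polytope constraints are reproduced. All function names are hypothetical.

```python
def h_circle(pos, center, radius):
    """Barrier function: signed distance from a point to a circular obstacle."""
    dx, dy = pos[0] - center[0], pos[1] - center[1]
    return (dx * dx + dy * dy) ** 0.5 - radius


def step(pos, vel, dt):
    """Single-integrator dynamics: x_{k+1} = x_k + v * dt."""
    return (pos[0] + vel[0] * dt, pos[1] + vel[1] * dt)


def dcbf_satisfied(h_now, h_next, gamma):
    """Standard discrete-time CBF condition with decay rate gamma in (0, 1]."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    return h_next - h_now >= -gamma * h_now


# Robot at the origin, obstacle of radius 1 centered at (2, 0), so h = 1.
pos, center, radius, gamma, dt = (0.0, 0.0), (2.0, 0.0), 1.0, 0.5, 1.0
h_now = h_circle(pos, center, radius)
slow_ok = dcbf_satisfied(h_now, h_circle(step(pos, (0.4, 0.0), dt), center, radius), gamma)
fast_ok = dcbf_satisfied(h_now, h_circle(step(pos, (0.8, 0.0), dt), center, radius), gamma)
```

The condition lets the robot approach the obstacle, but only as fast as the barrier value is allowed to decay. Here the slower velocity passes and the faster one is rejected even though neither step actually collides.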
Video semantic segmentation (VSS) is beneficial for dealing with dynamic scenes because the real-world environment is continuous. Some methods alleviate the problem of inconsistent predictions across consecutive frames, while others use the previous frame as prior information to assist in segmenting the current frame. Although these methods achieve superior performance on independent and identically distributed (i.i.d.) data, they cannot generalize well to unseen domains. We therefore explore a new task, video generalizable semantic segmentation (VGSS), which considers both continuous frames and domain generalization. In this paper, we propose a class-wise non-salient region generalized (CNSG) framework for the VGSS task. Concretely, we first define the class-wise non-salient feature, which describes features of the class-wise non-salient region that carry more generalizable information. Then, we propose a class-wise non-salient feature reasoning strategy to select and enhance the most generalized channels adaptively. Finally, we propose an inter-frame non-salient centroid alignment loss to alleviate the inconsistent-prediction problem in the VGSS task. We also extend our video-based framework to the image-based generalizable semantic segmentation (IGSS) task. Experiments demonstrate that our CNSG framework yields significant improvement in the VGSS and IGSS tasks.