Single-subject prediction of brain disorders from neuroimaging data has recently attracted considerable attention. However, for some heterogeneous disorders, such as major depressive disorder (MDD) and autism spectrum disorder (ASD), the performance of predictive models on large-scale multi-site datasets remains poor. We propose a two-stage framework to improve the diagnosis of heterogeneous psychiatric disorders from resting-state functional magnetic resonance imaging (rs-fMRI). First, we propose a self-supervised masked-prediction task on data from healthy individuals, which exploits the difference between healthy controls and patients in clinical datasets. Next, we train a supervised classifier on the learned discriminative representations. To model rs-fMRI data, we develop Graph-S4, an extension of the recently proposed state-space model S4 to the graph setting, where the underlying graph structure is not known in advance. We show that combining the framework and Graph-S4 significantly improves the diagnostic performance of neuroimaging-based single-subject predictive models for MDD and ASD on three open-source multi-center rs-fMRI clinical datasets.
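A minimal sketch of the two-stage idea follows, assuming a generic sequence encoder in place of Graph-S4; the module names, masking rate, and loss are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch of the two-stage framework (not the authors' code).
# Stage 1: self-supervised masked prediction on healthy controls;
# Stage 2: supervised classification on the learned representations.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Stand-in for Graph-S4: maps an rs-fMRI window (time x regions) to features."""
    def __init__(self, n_regions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.GRU(n_regions, hidden, batch_first=True)

    def forward(self, x):           # x: (batch, time, regions)
        out, _ = self.net(x)
        return out                  # (batch, time, hidden)

def masked_prediction_loss(encoder, head, x, mask_ratio=0.15):
    """Stage 1: randomly mask time points and reconstruct them from context."""
    mask = torch.rand(x.shape[:2], device=x.device) < mask_ratio  # (batch, time)
    x_masked = x.masked_fill(mask.unsqueeze(-1), 0.0)
    pred = head(encoder(x_masked))            # head: hidden -> n_regions
    return ((pred - x) ** 2)[mask].mean()     # loss only on masked positions

# Stage 2 (after pretraining on healthy controls only): a linear classifier,
# e.g. nn.Linear(hidden, 2), trained with cross-entropy on pooled encoder
# features such as encoder(x).mean(dim=1).
```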
Current neural network models of dialogue generation (chatbots) show great promise for generating responses for conversational agents, but they are short-sighted: they predict utterances one at a time while disregarding their impact on future outcomes. Modelling a dialogue's future direction is critical for generating coherent, interesting dialogues, a need that has led traditional NLP dialogue models to draw on reinforcement learning. In this article, we show how to combine these objectives by using deep reinforcement learning to predict future rewards in chatbot dialogue. The model simulates conversations between two virtual agents, with policy gradient methods used to reward sequences that exhibit three useful conversational characteristics: informativity, coherence, and ease of answering (related to the forward-looking function). We assess our model on diversity, dialogue length, and judgments from human evaluators. In dialogue simulation, evaluations demonstrate that the proposed model generates more interactive responses and encourages more sustained, successful conversations. This work marks a preliminary step toward developing a neural conversational model based on the long-term success of dialogues.
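As a rough illustration of the policy-gradient training described above, the sketch below combines three reward terms and applies a REINFORCE-style update; the reward weights, function names, and baseline are assumptions, not the paper's exact formulation:

```python
# Hypothetical sketch of the policy-gradient step (REINFORCE);
# the three reward terms and their weights are illustrative stand-ins.
import torch

def composite_reward(informativity, coherence, ease_of_answering,
                     w=(0.25, 0.5, 0.25)):
    """Weighted sum of the three conversational rewards."""
    return w[0] * informativity + w[1] * coherence + w[2] * ease_of_answering

def reinforce_loss(log_probs, reward, baseline=0.0):
    """log_probs: per-token log-probabilities of the sampled response
    under the policy (the seq2seq dialogue model)."""
    advantage = reward - baseline              # simple variance reduction
    return -(advantage * log_probs.sum())      # gradient ascent on E[reward]

# During simulated self-play between two agents, each sampled turn is scored
# with composite_reward and the policy is updated via reinforce_loss.backward().
```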
Recent advances in deep learning (DL) have led to the release of several DL software libraries, such as PyTorch, Caffe, and TensorFlow, to assist machine learning (ML) practitioners in developing and deploying state-of-the-art deep neural networks (DNNs); however, practitioners are not always able to properly cope with limitations of these libraries, such as in testing or data processing. In this paper, we present a qualitative and quantitative analysis of the most frequent DL library combinations and the distribution of DL library dependencies across the ML workflow, and we formulate a set of recommendations for (i) hardware builders, toward more optimized accelerators, and (ii) library builders, toward more refined future releases. Our study is based on 1,484 open-source DL projects with 46,110 contributors, selected based on their reputation. First, we found an increasing trend in the usage of deep learning libraries. Second, we highlight several usage patterns of deep learning libraries. In addition, we identify dependencies between DL libraries and the most frequent combinations, finding that PyTorch with Scikit-learn, and Keras with TensorFlow, are the most frequent combinations, appearing in 18% and 14% of the projects respectively. Developers use two or three DL libraries in the same project and tend to use multiple different DL libraries both in the same function and in the same files. Developers also show patterns in their use of the various deep learning libraries, preferring simple functions with fewer arguments and straightforward goals. Finally, we present the implications of our findings for researchers, library maintainers, and hardware vendors.
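For intuition, such library co-occurrence could be mined roughly as sketched below; the library list, regular expression, and function names are illustrative and not the study's actual pipeline:

```python
# Hypothetical sketch of mining DL-library co-occurrence across projects
# (illustrative only; assumes locally cloned repositories).
import re
from collections import Counter
from itertools import combinations
from pathlib import Path

LIBS = {"torch": "PyTorch", "tensorflow": "TensorFlow", "keras": "Keras",
        "sklearn": "Scikit-learn", "caffe": "Caffe"}
IMPORT_RE = re.compile(r"^\s*(?:import|from)\s+(\w+)", re.MULTILINE)

def libs_in_project(root: Path) -> frozenset:
    """Collect which known DL libraries a project imports anywhere."""
    found = set()
    for py in root.rglob("*.py"):
        for mod in IMPORT_RE.findall(py.read_text(errors="ignore")):
            if mod in LIBS:
                found.add(LIBS[mod])
    return frozenset(found)

def pair_counts(project_roots):
    """Count how often each pair of libraries co-occurs in a project."""
    counts = Counter()
    for root in project_roots:
        for pair in combinations(sorted(libs_in_project(root)), 2):
            counts[pair] += 1
    return counts
```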
The prevalence of toxic content on social media platforms, such as hate speech, offensive language, and misogyny, presents serious challenges to our interconnected society. These challenging issues have attracted widespread attention in the natural language processing (NLP) community. In this paper, we present our submitted systems to the first Arabic Misogyny Identification shared task. We investigate three multi-task learning models as well as their single-task counterparts. To encode the input text, our models rely on the pre-trained MARBERT language model. The overall obtained results show that all our submitted models achieved the best performance (top three ranked submissions) in both the misogyny identification and categorization tasks.
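A minimal sketch of one possible multi-task setup is shown below, with a shared MARBERT encoder and one head per task; the checkpoint name, label counts, and head design are assumptions rather than the submitted systems:

```python
# Hypothetical multi-task classifier with a shared MARBERT encoder
# (checkpoint name and head sizes are assumptions).
import torch.nn as nn
from transformers import AutoModel

class MultiTaskMisogynyModel(nn.Module):
    def __init__(self, checkpoint="UBC-NLP/MARBERT",
                 n_id_labels=2, n_cat_labels=8):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(checkpoint)  # shared encoder
        hidden = self.encoder.config.hidden_size
        self.id_head = nn.Linear(hidden, n_id_labels)    # misogyny yes/no
        self.cat_head = nn.Linear(hidden, n_cat_labels)  # misogyny category

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]     # [CLS] representation
        return self.id_head(cls), self.cat_head(cls)

# Training would sum the two cross-entropy losses so both tasks
# shape the shared encoder.
```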
The ability to accurately estimate depth information is crucial for many autonomous applications that must perceive the surrounding environment and predict the depth of important objects. One of the most recently used techniques is monocular depth estimation, where the depth map is inferred from a single image. This paper improves self-supervised deep learning techniques for accurate, generalized monocular depth estimation. The main idea is to train the deep model on sequences of frames, each geo-tagged with its location information. This enables the model to enhance depth estimation using the semantics of a given area. We demonstrate the effectiveness of our model in improving depth estimation results. The model was trained in a realistic environment, and the results show improvements in the depth maps after adding location data to the model training stage.
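One plausible way to inject the geo-tagged location into a depth network is sketched below; the fusion-by-addition strategy and layer sizes are assumptions, not the paper's architecture:

```python
# Hypothetical sketch of fusing geo-tagged location data into a depth network
# (fusion strategy and layer sizes are illustrative assumptions).
import torch
import torch.nn as nn

class GeoDepthNet(nn.Module):
    def __init__(self, image_encoder: nn.Module, feat_ch: int = 256):
        super().__init__()
        self.image_encoder = image_encoder          # any CNN backbone
        self.loc_embed = nn.Sequential(             # embed a (lat, lon) pair
            nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, feat_ch))
        self.decoder = nn.Sequential(
            nn.Conv2d(feat_ch, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 3, padding=1))         # per-pixel depth

    def forward(self, image, latlon):
        feats = self.image_encoder(image)              # (B, feat_ch, H, W)
        loc = self.loc_embed(latlon)[..., None, None]  # (B, feat_ch, 1, 1)
        return self.decoder(feats + loc)               # broadcast-add location cue
```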
The development of deep segmentation models for computational pathology (CPath) can help foster the investigation of interpretable morphological biomarkers. However, there is a major bottleneck in the success of such approaches, because supervised deep learning models require an abundance of accurately labelled data. This problem is compounded in the field of CPath, because generating detailed annotations usually demands input from a pathologist who can distinguish between different tissue constructs and nuclei. Manually labelling nuclei may not be a feasible approach for collecting large-scale annotated datasets, especially when a single image region can contain thousands of different cells. Yet, solely relying on automatically generated annotations would limit the accuracy and reliability of the ground truth. Therefore, to help overcome the above challenges, we propose a multi-stage annotation pipeline that enables the generation of large-scale datasets for histology image analysis, with pathologist-in-the-loop refinement steps. Using this pipeline, we generate the largest known nuclear instance segmentation and classification dataset, containing nearly half a million labelled nuclei in H&E-stained colon tissue. We release the dataset and encourage the research community to use it to drive forward the development of downstream cell models in CPath.
Joint image-text embedding is the bedrock for most Vision-and-Language (V+L) tasks, where multimodal inputs are simultaneously processed for joint visual and textual understanding. In this paper, we introduce UNITER, a UNiversal Image-TExt Representation, learned through large-scale pre-training over four image-text datasets (COCO, Visual Genome, Conceptual Captions, and SBU Captions), which can power heterogeneous downstream V+L tasks with joint multimodal embeddings. We design four pre-training tasks: Masked Language Modeling (MLM), Masked Region Modeling (MRM, with three variants), Image-Text Matching (ITM), and Word-Region Alignment (WRA). Different from previous work that applies joint random masking to both modalities, we use conditional masking on pre-training tasks (i.e., masked language/region modeling is conditioned on full observation of the image/text). In addition to ITM for global image-text alignment, we also propose WRA via the use of Optimal Transport (OT) to explicitly encourage fine-grained alignment between words and image regions during pre-training. Comprehensive analysis shows that both conditional masking and OT-based WRA contribute to better pre-training. We also conduct a thorough ablation study to find an optimal combination of pre-training tasks. Extensive experiments show that UNITER achieves new state of the art across six V+L tasks (over nine datasets), including Visual Question Answering.
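The contrast between conditional masking and joint random masking can be sketched as follows; the tensor shapes, mask token id, and 15% rate are illustrative assumptions:

```python
# Hypothetical sketch contrasting conditional masking with joint random masking.
import torch

def conditional_mask(text_ids, region_feats, p=0.15, mask_id=103):
    """Mask ONLY text tokens; image regions stay fully observed, so the
    masked-language objective is conditioned on the complete image."""
    mask = torch.rand(text_ids.shape) < p
    masked_ids = text_ids.masked_fill(mask, mask_id)
    return masked_ids, region_feats, mask     # regions untouched

def joint_mask(text_ids, region_feats, p=0.15, mask_id=103):
    """Baseline: mask both modalities at once, which can hide the very
    region a masked word refers to."""
    t_mask = torch.rand(text_ids.shape) < p
    r_mask = torch.rand(region_feats.shape[:-1]) < p
    masked_ids = text_ids.masked_fill(t_mask, mask_id)
    masked_feats = region_feats.masked_fill(r_mask.unsqueeze(-1), 0.0)
    return masked_ids, masked_feats, t_mask
```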
Non-linear state-space models, also known as general hidden Markov models, are ubiquitous in statistical machine learning, being the most classical generative models for serial data and sequences in general. The particle-based, rapid incremental smoother PaRIS is a sequential Monte Carlo (SMC) technique allowing for efficient online approximation of expectations of additive functionals under the smoothing distribution in these models. Such expectations appear naturally in several learning contexts, such as maximum-likelihood estimation (MLE) and Markov score climbing (MSC). PaRIS has linear computational complexity, limited memory requirements, and comes with non-asymptotic bounds, convergence results, and stability guarantees. Still, being based on self-normalised importance sampling, the PaRIS estimator is biased. Our first contribution is to design a novel additive smoothing algorithm, the Parisian particle Gibbs (PPG) sampler, which can be viewed as a PaRIS algorithm driven by conditional SMC moves, resulting in bias-reduced estimates of the targeted quantities. We substantiate the PPG algorithm with theoretical results, including new bounds on bias and variance as well as deviation inequalities. Our second contribution is to apply PPG in a learning framework, covering MLE and MSC as special examples. In this context, we establish, under standard assumptions, non-asymptotic bounds highlighting the value of bias reduction and the implicit Rao–Blackwellization of PPG. These are the first non-asymptotic results of this kind in this setting. We illustrate our theoretical results with numerical experiments supporting our claims.
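For concreteness, the smoothing expectations in question take the generic form below (a standard formulation for additive functionals in hidden Markov models, not copied from the paper):

```latex
% Smoothed expectation of an additive functional under a general HMM
% with hidden states (X_t) and observations (Y_t):
\mathbb{E}\bigl[\, h_T(X_{0:T}) \bigm| Y_{0:T} \,\bigr],
\qquad
h_T(x_{0:T}) \;=\; \sum_{t=0}^{T-1} \tilde{h}_t(x_t, x_{t+1}).
```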
The performance of Deep Learning (DL) models depends on the quality of labels. In some areas, the involvement of human annotators may introduce noise into the data. When these corrupted labels are blindly regarded as the ground truth (GT), DL models suffer from degraded performance. This paper presents a method that aims to learn a confident model in the presence of noisy labels, done in conjunction with estimating the uncertainty of multiple annotators. We robustly estimate the predictions given only the noisy labels by adding an entropy- or information-based regularizer to the classifier network. We conduct our experiments on noisy versions of the MNIST, CIFAR-10, and FMNIST datasets. Our empirical results demonstrate the robustness of our method, as it outperforms or performs comparably to other state-of-the-art (SOTA) methods. In addition, we evaluate the proposed method on a curated dataset, where the noise type and level of various annotators depend on the input image style. We show that our approach performs well and is adept at learning annotators' confusion. Moreover, we demonstrate how our model is more confident in predicting GT than other baselines. Finally, we assess our approach on a segmentation problem and showcase its effectiveness with experiments.
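As a sketch of the kind of regularizer mentioned above, the loss below adds a predictive-entropy penalty to the cross-entropy against the noisy labels; the weight lam and the exact form of the regularizer are assumptions:

```python
# Hypothetical entropy-regularized noisy-label loss
# (lam and the exact regularizer form are assumptions, not the paper's).
import torch
import torch.nn.functional as F

def entropy_regularized_loss(logits, noisy_labels, lam=0.1):
    """Cross-entropy against the noisy labels plus lam times predictive
    entropy: penalizing high-entropy predictions pushes the classifier
    to commit to confident outputs instead of fitting annotator noise."""
    ce = F.cross_entropy(logits, noisy_labels)
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=-1).mean()
    return ce + lam * entropy
```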
Recent advances in upper limb prostheses have led to significant improvements in the number of movements provided by the robotic limb. However, the method for controlling multiple degrees of freedom via user-generated signals remains challenging. To address this issue, various machine learning controllers have been developed to better predict movement intent. As these controllers become more intelligent and take on more autonomy in the system, the traditional approach of representing the human-machine interface as a human controlling a tool becomes limiting. One possible approach to improve the understanding of these interfaces is to model them as collaborative, multi-agent systems through the lens of joint action. The field of joint action has been commonly applied to two human partners who are trying to work jointly together to achieve a task, such as singing or moving a table together, by effecting coordinated change in their shared environment. In this work, we compare different prosthesis controllers (proportional electromyography with sequential switching, pattern recognition, and adaptive switching) in terms of how they present the hallmarks of joint action. The results of the comparison lead to a new perspective for understanding how existing myoelectric systems relate to each other, along with recommendations for how to improve these systems by increasing the collaborative communication between each partner.