Multi-modal fusion approaches aim to integrate information from different data sources. Unlike natural datasets, such as in audio-visual applications, where samples consist of "paired" modalities, data in healthcare is often collected asynchronously. Hence, requiring the presence of all modalities for a given sample is not realistic for clinical tasks and significantly limits the size of the dataset during training. In this paper, we propose MedFuse, a conceptually simple yet promising LSTM-based fusion module that can accommodate uni-modal as well as multi-modal input. We evaluate the fusion method and introduce new benchmark results for in-hospital mortality prediction and phenotype classification, using clinical time-series data from the MIMIC-IV dataset and corresponding chest X-ray images from MIMIC-CXR. Compared to more complex multi-modal fusion strategies, MedFuse provides a performance improvement by a large margin on the fully paired test set. It also remains robust on the partially paired test set, which contains samples with missing chest X-ray images. We release our code for reproducibility and to enable the evaluation of competing models in the future.
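The abstract leaves the architecture at a high level; a minimal PyTorch-style sketch of an LSTM fusion module that accepts a clinical time-series input with or without a chest X-ray embedding might look as follows (dimensions, class name, and the handling of the missing modality are illustrative assumptions, not the released MedFuse code):

```python
# Minimal sketch: fuse a clinical time-series encoding with an optional
# chest X-ray embedding by treating the modality representations as a short
# sequence fed to a second LSTM. All sizes are assumed for illustration.
import torch
import torch.nn as nn

class LSTMFusion(nn.Module):
    def __init__(self, ehr_dim=76, cxr_dim=512, hidden_dim=256, num_classes=25):
        super().__init__()
        self.ehr_encoder = nn.LSTM(ehr_dim, hidden_dim, batch_first=True)
        self.cxr_proj = nn.Linear(cxr_dim, hidden_dim)
        # A uni-modal sample is simply a modality sequence of length one.
        self.fusion_lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, ehr_ts, cxr_feat=None):
        _, (h_ehr, _) = self.ehr_encoder(ehr_ts)      # (1, B, H)
        tokens = [h_ehr.squeeze(0)]
        if cxr_feat is not None:                      # multi-modal sample
            tokens.append(self.cxr_proj(cxr_feat))
        seq = torch.stack(tokens, dim=1)              # (B, 1 or 2, H)
        _, (h_fused, _) = self.fusion_lstm(seq)
        return self.classifier(h_fused.squeeze(0))
```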
We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate a model's dependence on each modality, we compute the gain in accuracy the model attains when it also has access to the other modalities. We refer to this gain as the conditional utilization rate. In our experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm that balances the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, ModelNet40, and NVIDIA Dynamic Hand Gesture.
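As a rough illustration of how the conditional utilization rate described above could be estimated on held-out data (a sketch assuming two modalities and pre-computed predictions; the function names and the exact definition are illustrative, not the authors'):

```python
import numpy as np

def accuracy(preds, labels):
    return float(np.mean(preds == labels))

def conditional_utilization_rate(preds_both, preds_single, labels):
    """Gain in accuracy when the model also has access to the other modality,
    relative to using one modality alone (illustrative definition)."""
    return accuracy(preds_both, labels) - accuracy(preds_single, labels)

# Example: compare how much each modality is utilized on a validation set.
labels           = np.array([0, 1, 1, 0, 1])
preds_multimodal = np.array([0, 1, 1, 0, 1])   # model sees both modalities
preds_only_a     = np.array([0, 1, 0, 0, 1])   # modality B masked out
preds_only_b     = np.array([0, 0, 1, 0, 0])   # modality A masked out

u_b_given_a = conditional_utilization_rate(preds_multimodal, preds_only_a, labels)
u_a_given_b = conditional_utilization_rate(preds_multimodal, preds_only_b, labels)
print(f"gain from adding B: {u_b_given_a:.2f}, gain from adding A: {u_a_given_b:.2f}")
```

A large gap between the two gains would indicate the imbalance between modalities that the paper describes.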
The medical domain is characterized by heterogeneous data modalities, such as imaging and physiological data. In practice, medical data of diverse kinds assist clinicians in decision-making. However, most current state-of-the-art deep learning models rely solely on carefully curated data of a single modality. In this paper, we propose a dynamic training approach that learns modality-specific data representations and integrates auxiliary features, rather than relying on a single modality alone. Our preliminary experimental results on the MIMIC-CXR dataset show that the proposed approach achieves the highest area under the ROC curve (AUROC) on the patient phenotyping task using chest radiographs and physiological data from MIMIC-IV (0.764 AUROC), compared to the benchmark method from prior work that uses only physiological data (0.740 AUROC). For a set of five recurring or chronic diseases with periodic acute episodes, including cardiac dysrhythmias, conduction disorders, and congestive heart failure, the AUROC improves from 0.747 to 0.798. This illustrates the benefit of leveraging the chest imaging modality in the phenotyping task and highlights the potential of multi-modal learning in medical applications.
In this paper, we explore the capability of spiking neural networks to solve multi-task classification problems using a single-tasking-of-multiple-tasks approach. We design and implement a multi-task spiking neural network (MT-SNN) that can learn two or more classification tasks while performing one task at a time. The task to be performed is selected by modulating the firing threshold of the leaky integrate-and-fire neurons used in this work. The network is implemented using Intel's Lava platform for the Loihi 2 neuromorphic chip, and is tested on dynamic multi-task classification of NMNIST data. The results show that MT-SNN effectively learns multiple tasks by modifying its dynamics, namely the firing threshold of the spiking neurons.
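A toy illustration of the threshold-modulation idea in plain Python: a single leaky integrate-and-fire neuron whose firing threshold is switched per task, so the same input current produces different spiking dynamics. The parameters and task names are arbitrary assumptions, not values from the paper:

```python
import numpy as np

def lif_spikes(inputs, threshold, leak=0.9, v_reset=0.0):
    """Leaky integrate-and-fire neuron returning a binary spike train.
    The firing threshold is the knob used here to select the active task."""
    v, spikes = 0.0, []
    for x in inputs:
        v = leak * v + x              # leaky integration of the input current
        if v >= threshold:
            spikes.append(1)
            v = v_reset               # reset membrane potential after a spike
        else:
            spikes.append(0)
    return np.array(spikes)

rng = np.random.default_rng(0)
current = rng.uniform(0.0, 1.0, size=50)
# The same input yields different spike counts under task-specific thresholds.
task_thresholds = {"task_A": 1.5, "task_B": 3.0}
for task, th in task_thresholds.items():
    print(task, int(lif_spikes(current, th).sum()), "spikes")
```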
The principle of optimism in the face of uncertainty is prevalent throughout sequential decision making, in problems such as multi-armed bandits and reinforcement learning (RL). To be successful, an optimistic RL algorithm must over-estimate the true value function (optimism), but not by so much that it becomes inaccurate (estimation error). In the tabular setting, many state-of-the-art methods produce the required optimism through approaches that are intractable when scaling to deep RL. We re-interpret scalable optimistic model-based algorithms as solving a tractable noise-augmented MDP. This formulation achieves a competitive regret bound of $\tilde{\mathcal{O}}(|\mathcal{S}|H\sqrt{|\mathcal{A}|T})$ when augmenting with Gaussian noise, where $T$ is the total number of environment steps. We also explore how this trade-off changes in the deep RL setting, where we show empirically that estimation error is significantly more troublesome. However, we also show that if this error is reduced, optimistic model-based RL algorithms can match state-of-the-art performance on continuous control problems.
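To make the noise-augmentation idea concrete, here is a deliberately simplified sketch of injecting Gaussian noise into a learned tabular reward model so that acting greedily in the perturbed model is optimistic. The noise scale, shapes, and one-step setting are illustrative assumptions and do not reproduce the paper's algorithm or its regret analysis:

```python
import numpy as np

def optimistic_rewards(reward_model, noise_scale, rng):
    """Perturb a learned tabular reward model with Gaussian noise.
    Planning in the perturbed (noise-augmented) MDP over-estimates the value
    with some probability, which is the source of optimism."""
    return reward_model + rng.normal(0.0, noise_scale, size=reward_model.shape)

rng = np.random.default_rng(0)
n_states, n_actions = 4, 2
reward_model = rng.uniform(0.0, 1.0, size=(n_states, n_actions))  # learned estimates
r_tilde = optimistic_rewards(reward_model, noise_scale=0.1, rng=rng)
greedy_actions = r_tilde.argmax(axis=1)   # act greedily w.r.t. the optimistic model
print(greedy_actions)
```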
We demonstrate a proof-of-concept of a large language model conducting corporate lobbying related activities. We use an autoregressive large language model (OpenAI's text-davinci-003) to determine if proposed U.S. Congressional bills are relevant to specific public companies and provide explanations and confidence levels. For the bills the model deems as relevant, the model drafts a letter to the sponsor of the bill in an attempt to persuade the congressperson to make changes to the proposed legislation. We use hundreds of ground-truth labels of the relevance of a bill to a company to benchmark the performance of the model, which outperforms the baseline of predicting the most common outcome of irrelevance. However, we test the ability to determine the relevance of a bill with the previous OpenAI GPT-3 model (text-davinci-002), which was state-of-the-art on many language tasks until text-davinci-003 was released on November 28, 2022. The performance of text-davinci-002 is worse than simply always predicting that a bill is irrelevant to a company. These results suggest that, as large language models continue to improve core natural language understanding capabilities, performance on corporate lobbying related tasks will continue to improve. We then discuss why this could be problematic for societal-AI alignment.
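As a rough sketch of how such a relevance query could be issued with the legacy openai Python package (pre-1.0 Completion interface), note that the prompt wording, company description, and output format below are illustrative assumptions, not the authors' exact prompts or parsing:

```python
# pip install "openai<1.0"  (legacy Completion interface used for text-davinci-003)
import openai

openai.api_key = "YOUR_API_KEY"

def bill_relevance(bill_summary: str, company_description: str) -> str:
    prompt = (
        "You assess whether a proposed U.S. Congressional bill is relevant to a company.\n"
        f"Company: {company_description}\n"
        f"Bill summary: {bill_summary}\n"
        "Answer YES or NO, give a one-sentence explanation, and a confidence from 0 to 100."
    )
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=150,
        temperature=0.0,   # near-deterministic output for benchmarking
    )
    return response.choices[0].text.strip()

print(bill_relevance("A bill to regulate data brokers.",
                     "A consumer credit reporting agency."))
```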
In the past years, deep learning has seen an increase of usage in the domain of histopathological applications. However, while these approaches have shown great potential, in high-risk environments deep learning models need to be able to judge their own uncertainty and be able to reject inputs when there is a significant chance of misclassification. In this work, we conduct a rigorous evaluation of the most commonly used uncertainty and robustness methods for the classification of Whole-Slide-Images under domain shift using the H\&E stained Camelyon17 breast cancer dataset. Although it is known that histopathological data can be subject to strong domain shift and label noise, to our knowledge this is the first work that compares the most common methods for uncertainty estimation under these aspects. In our experiments, we compare Stochastic Variational Inference, Monte-Carlo Dropout, Deep Ensembles, Test-Time Data Augmentation as well as combinations thereof. We observe that ensembles of methods generally lead to higher accuracies and better calibration and that Test-Time Data Augmentation can be a promising alternative when choosing an appropriate set of augmentations. Across methods, a rejection of the most uncertain tiles leads to a significant increase in classification accuracy on both in-distribution as well as out-of-distribution data. Furthermore, we conduct experiments comparing these methods under varying conditions of label noise. We observe that the border regions of the Camelyon17 dataset are subject to label noise and evaluate the robustness of the included methods against different noise levels. Lastly, we publish our code framework to facilitate further research on uncertainty estimation on histopathological data.
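A small sketch of the rejection step described above: average the softmax outputs of several stochastic forward passes (e.g. MC Dropout or test-time augmentation), score each tile by predictive entropy, and drop the most uncertain fraction. The shapes, rejection fraction, and synthetic data are illustrative only:

```python
import numpy as np

def predictive_entropy(mc_probs):
    """mc_probs: (n_passes, n_tiles, n_classes) softmax outputs from
    MC Dropout / test-time augmentation passes."""
    mean_probs = mc_probs.mean(axis=0)                      # (n_tiles, n_classes)
    return -np.sum(mean_probs * np.log(mean_probs + 1e-12), axis=1)

def accuracy_after_rejection(mc_probs, labels, reject_fraction=0.2):
    entropy = predictive_entropy(mc_probs)
    keep = entropy.argsort()[: int(len(labels) * (1 - reject_fraction))]
    preds = mc_probs.mean(axis=0).argmax(axis=1)
    return float(np.mean(preds[keep] == labels[keep]))      # accuracy on retained tiles

rng = np.random.default_rng(0)
mc_probs = rng.dirichlet(alpha=[1.0, 1.0], size=(20, 1000))  # (passes, tiles, classes)
labels = rng.integers(0, 2, size=1000)
print(accuracy_after_rejection(mc_probs, labels, reject_fraction=0.2))
```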
In large-scale machine learning, recent works have studied the effects of compressing gradients in stochastic optimization in order to alleviate the communication bottleneck. These works have collectively revealed that stochastic gradient descent (SGD) is robust to structured perturbations such as quantization, sparsification, and delays. Perhaps surprisingly, despite the surge of interest in large-scale, multi-agent reinforcement learning, almost nothing is known about the analogous question: Are common reinforcement learning (RL) algorithms also robust to similar perturbations? In this paper, we investigate this question by studying a variant of the classical temporal difference (TD) learning algorithm with a perturbed update direction, where a general compression operator is used to model the perturbation. Our main technical contribution is to show that compressed TD algorithms, coupled with an error-feedback mechanism used widely in optimization, exhibit the same non-asymptotic theoretical guarantees as their SGD counterparts. We then extend our results significantly to nonlinear stochastic approximation algorithms and multi-agent settings. In particular, we prove that for multi-agent TD learning, one can achieve linear convergence speedups in the number of agents while communicating just $\tilde{O}(1)$ bits per agent at each time step. Our work is the first to provide finite-time results in RL that account for general compression operators and error-feedback in tandem with linear function approximation and Markovian sampling. Our analysis hinges on studying the drift of a novel Lyapunov function that captures the dynamics of a memory variable introduced by error feedback.
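A minimal sketch of the mechanism this result concerns: TD(0) with linear function approximation where the update direction is compressed (top-k here) and the compression error is fed back into the next step. The dimensions, step size, top-k operator, and toy data stream are illustrative choices, not the paper's setup:

```python
import numpy as np

def top_k(v, k):
    """Keep only the k largest-magnitude coordinates (a generic compression operator)."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def compressed_td_step(theta, e, phi_s, phi_s_next, reward, gamma=0.99, alpha=0.05, k=2):
    """One TD(0) step with linear function approximation, compression, and error feedback."""
    td_error = reward + gamma * phi_s_next @ theta - phi_s @ theta
    g = td_error * phi_s                 # uncompressed TD update direction
    compressed = top_k(g + e, k)         # compress the direction plus accumulated error
    e = (g + e) - compressed             # error feedback: remember what was dropped
    theta = theta + alpha * compressed
    return theta, e

rng = np.random.default_rng(0)
d = 8
theta, e = np.zeros(d), np.zeros(d)
for _ in range(1000):
    phi_s, phi_s_next = rng.normal(size=d), rng.normal(size=d)  # toy feature stream
    reward = phi_s[0]                                           # toy reward signal
    theta, e = compressed_td_step(theta, e, phi_s, phi_s_next, reward)
print(theta.round(3))
```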
While the capabilities of autonomous systems have been steadily improving in recent years, these systems still struggle to rapidly explore previously unknown environments without the aid of GPS-assisted navigation. The DARPA Subterranean (SubT) Challenge aimed to fast track the development of autonomous exploration systems by evaluating their performance in real-world underground search-and-rescue scenarios. Subterranean environments present a plethora of challenges for robotic systems, such as limited communications, complex topology, visually-degraded sensing, and harsh terrain. The presented solution enables long-term autonomy with minimal human supervision by combining a powerful and independent single-agent autonomy stack, with higher level mission management operating over a flexible mesh network. The autonomy suite deployed on quadruped and wheeled robots was fully independent, freeing the human supervision to loosely supervise the mission and make high-impact strategic decisions. We also discuss lessons learned from fielding our system at the SubT Final Event, relating to vehicle versatility, system adaptability, and re-configurable communications.
Research on automated essay scoring has become increasingly important because it serves as a method for evaluating students' written responses at scale. Scalable methods for scoring written responses are needed as students migrate to online learning environments, resulting in the need to evaluate large numbers of written-response assessments. The purpose of this study is to describe and evaluate three active learning methods that can be used to minimize the number of essays that must be scored by human raters while still providing the data needed to train a modern automated essay scoring system. The three active learning methods are the uncertainty-based, the topological-based, and the hybrid method. These three methods were used to select essays included as part of the Automated Student Assessment Prize competition, which were then classified using a scoring model trained with the Bidirectional Encoder Representations from Transformers (BERT) language model. All three active learning methods produced strong results, with the topological-based method producing the most efficient classification. Growth rate accuracy was also evaluated. The active learning methods produced different levels of efficiency under different sample size allocations but, overall, all three methods were highly efficient and produced classifications that were similar to one another.
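A small sketch of the uncertainty-based selection step mentioned above: score unlabeled essays by the entropy of the current classifier's predicted score distribution and send the most uncertain ones to human raters. The probability array below stands in for the outputs of a BERT-based scoring model, and all names and the budget are illustrative:

```python
import numpy as np

def select_for_labeling(probs, budget):
    """probs: (n_essays, n_score_levels) predicted probabilities from the
    current scoring model. Returns indices of the most uncertain essays."""
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    return np.argsort(entropy)[-budget:]           # highest-entropy essays

rng = np.random.default_rng(0)
probs = rng.dirichlet(alpha=np.ones(4), size=500)  # stand-in for model outputs
to_label = select_for_labeling(probs, budget=25)
print(to_label[:10])
```

The topological and hybrid methods described in the abstract would replace or combine this entropy criterion with a selection rule based on the structure of the essay embedding space.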