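Single-cell RNA-seq datasets are growing in size and complexity, enabling the study of changes in cellular composition across a variety of biological and clinical contexts. Scalable dimensionality reduction techniques are needed to disentangle the biological variation in these datasets while accounting for technical and biological confounders. In this work, we extend a popular approach to probabilistic non-linear dimensionality reduction, the Gaussian process latent variable model, so that it scales to massive single-cell datasets while explicitly accounting for technical and biological confounders. The key idea is to use an augmented kernel that preserves the factorisability of the lower bound, allowing for fast stochastic variational inference. We demonstrate its ability to reconstruct the latent signatures of innate immunity recovered in Kumasaka et al. (2021) with 9x lower training time. We further analyze a COVID dataset and demonstrate, across a cohort of 130 individuals, that this framework enables data integration while capturing interpretable signatures of infection. Specifically, we explore COVID severity as a latent dimension to refine patient stratification and capture disease-specific gene expression.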
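Amid mounting concern about the reliability and credibility of machine learning research, we present a principled framework for making robust and generalizable claims: the multiverse analysis. Our framework builds on the multiverse analysis (Steegen et al., 2016) that was introduced in response to psychology's own reproducibility crisis. To explore high-dimensional and often continuous ML search spaces efficiently, we model the multiverse with a Gaussian process surrogate and apply Bayesian experimental design. Our framework is designed to support robust scientific conclusions about model performance, so our approach focuses on exploration rather than conventional optimization. In the first of two case studies, we investigate disputed claims about the relative merits of adaptive optimizers. In the second, we synthesize conflicting research on the effect of learning rate on the generalization gap in large-batch training. For the machine learning community, the multiverse analysis is a simple and effective technique for identifying robust claims, increasing transparency, and taking a step toward improved reproducibility.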
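The exploration-first idea in the abstract above can be illustrated with a minimal sketch. It assumes a one-dimensional search space (for example, a normalized learning rate), a squared-exponential kernel with unit prior variance, and plain uncertainty sampling as the design criterion. These are illustrative assumptions, and the function names are hypothetical; this is not the acquisition function or code used in the paper.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=0.3):
    """Squared-exponential kernel between two 1-D arrays of settings."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)


def posterior_variance(x_obs, x_cand, lengthscale=0.3, noise=1e-4):
    """GP posterior variance at each candidate, given the evaluated settings.

    The variance depends only on where we have evaluated, not on the
    observed scores, which is why it suits pure exploration.
    """
    K = rbf_kernel(x_obs, x_obs, lengthscale) + noise * np.eye(len(x_obs))
    K_star = rbf_kernel(x_obs, x_cand, lengthscale)
    solved = np.linalg.solve(K, K_star)
    # Prior variance is 1 for this kernel; subtract what the data explains.
    return 1.0 - np.sum(K_star * solved, axis=0)


def most_uncertain(x_obs, x_cand, lengthscale=0.3):
    """Pick the candidate setting where the surrogate is least certain."""
    variance = posterior_variance(x_obs, x_cand, lengthscale)
    return x_cand[int(np.argmax(variance))]


if __name__ == "__main__":
    evaluated = np.array([0.0, 1.0])          # settings already run
    candidates = np.linspace(0.0, 1.0, 11)    # settings we could run next
    print(most_uncertain(evaluated, candidates))
```

An optimizer would instead favor candidates with a high predicted score; choosing by variance alone spreads evaluations across the whole space, which matches the goal of characterizing performance everywhere rather than finding a single best setting.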
In this paper we explore whether the fundamental tool of experimental psychology, the behavioral experiment, has the power to generate insight not only into humans and animals, but artificial systems too. We apply the techniques of experimental psychology to investigating catastrophic forgetting in neural networks. We present a series of controlled experiments with two-layer ReLU networks, and exploratory results revealing a new understanding of the behavior of catastrophic forgetting. Alongside our empirical findings, we demonstrate an alternative, behavior-first approach to investigating neural network phenomena.
Transferring knowledge from a teacher neural network pretrained on the same or a similar task to a student neural network can significantly improve the performance of the student. Existing knowledge transfer approaches match the activations, or the corresponding handcrafted features, of the teacher and student networks. We propose an information-theoretic framework that formulates knowledge transfer as maximizing the mutual information between the teacher and student networks. We compare our method with existing knowledge transfer methods on both knowledge distillation and transfer learning tasks and show that it consistently outperforms them. We further demonstrate the strength of our method on knowledge transfer across heterogeneous network architectures by transferring knowledge from a convolutional neural network (CNN) to a multi-layer perceptron (MLP) on CIFAR-10. The resulting MLP significantly outperforms state-of-the-art methods and achieves performance similar to that of a CNN with a single convolutional layer.
We introduce stochastic variational inference for Gaussian process models. This enables the application of Gaussian process (GP) models to data sets containing millions of data points. We show how GPs can be variationally decomposed to depend on a set of globally relevant inducing variables which factorize the model in the necessary manner to perform variational inference. Our approach is readily extended to models with non-Gaussian likelihoods and latent variable models based around Gaussian processes. We demonstrate the approach on a simple toy problem and two real world data sets.
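The factorization described above can be made concrete with a small numerical sketch. It assumes a one-dimensional squared-exponential kernel with unit variance, Gaussian noise with precision `beta`, inducing inputs `z`, and a Gaussian variational distribution over the inducing values with mean `m` and covariance `S`. All function names are hypothetical, and the code is an illustration of the per-point structure of the bound rather than the authors' implementation.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)


def inducing_gram(z, jitter=1e-6):
    """Kernel matrix of the inducing inputs, with jitter for stability."""
    return rbf_kernel(z, z) + jitter * np.eye(len(z))


def data_terms(x, y, z, m, S, beta):
    """One term of the bound per data point; each depends only on (x_i, y_i)."""
    Kmm = inducing_gram(z)
    Kmn = rbf_kernel(z, x)
    A = np.linalg.solve(Kmm, Kmn)               # columns are Kmm^{-1} k_i
    mean = A.T @ m                              # predictive mean at each x_i
    k_tilde = 1.0 - np.sum(Kmn * A, axis=0)     # k_ii = 1 for this kernel
    trace = np.sum(A * (S @ A), axis=0)         # k_i^T Kmm^{-1} S Kmm^{-1} k_i
    log_lik = 0.5 * np.log(beta / (2.0 * np.pi)) - 0.5 * beta * (y - mean) ** 2
    return log_lik - 0.5 * beta * k_tilde - 0.5 * beta * trace


def kl_to_prior(z, m, S):
    """KL divergence from q(u) = N(m, S) to the prior p(u) = N(0, Kmm)."""
    Kmm = inducing_gram(z)
    _, logdet_K = np.linalg.slogdet(Kmm)
    _, logdet_S = np.linalg.slogdet(S)
    quad = m @ np.linalg.solve(Kmm, m)
    tr = np.trace(np.linalg.solve(Kmm, S))
    return 0.5 * (tr + quad - len(z) + logdet_K - logdet_S)


def minibatch_bound(x_batch, y_batch, n_total, z, m, S, beta):
    """Estimate of the full bound from a minibatch, rescaled by N / batch size."""
    scale = n_total / len(x_batch)
    terms = data_terms(x_batch, y_batch, z, m, S, beta)
    return scale * np.sum(terms) - kl_to_prior(z, m, S)
```

Because the data enter only through a sum of per-point terms, a rescaled minibatch sum gives an unbiased estimate of the full bound, and the cost of each step depends on the batch size and the number of inducing points rather than on the size of the dataset.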
In this paper we introduce deep Gaussian process (GP) models. Deep GPs are a deep belief network based on Gaussian process mappings. The data is modeled as the output of a multivariate GP. The inputs to that Gaussian process are then governed by another GP. A single layer model is equivalent to a standard GP or the GP latent variable model (GP-LVM). We perform inference in the model by approximate variational marginalization. This results in a strict lower bound on the marginal likelihood of the model which we use for model selection (number of layers and nodes per layer). Deep belief networks are typically applied to relatively large data sets using stochastic gradient descent for optimization. Our fully Bayesian treatment allows for the application of deep models even when data is scarce. Model selection by our variational bound shows that a five layer hierarchy is justified even when modelling a digit data set containing only 150 examples.
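Meta-learning is a branch of machine learning that aims to synthesize data from a distribution of related tasks in order to solve new ones efficiently. In process control, many systems have similar and well-understood dynamics, which suggests that it is feasible to create a generalizable controller through meta-learning. In this work, we formulate a meta reinforcement learning (meta-RL) control strategy that takes advantage of known, offline information for training, such as the model structure. The meta-RL agent is trained over a distribution of model parameters rather than a single model, enabling the agent to adapt automatically to changes in the process dynamics while maintaining performance. A key design element is the ability to leverage model-based information offline during training while maintaining a model-free policy structure for interacting with new environments. Our previous work demonstrated how this approach can be applied to the industrially relevant problem of tuning proportional-integral controllers for first-order processes. In this work, we briefly reintroduce our methodology and demonstrate how it can be extended to proportional-integral-derivative controllers and second-order systems.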
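For readers unfamiliar with the control setting, the sketch below simulates a proportional-integral-derivative (PID) controller acting on a second-order process. It covers only the controller and the plant; the meta-RL tuning agent from the abstract is not included. The process parameters (`gain`, `tau`, `zeta`), the hand-picked controller gains, the explicit Euler integration, and the function name are all illustrative assumptions, not values or code from the paper.

```python
def simulate_pid(kp, ki, kd, setpoint=1.0, gain=1.0, tau=1.0, zeta=0.7,
                 dt=0.01, t_end=30.0):
    """Simulate PID control of tau^2 * y'' + 2 * zeta * tau * y' + y = gain * u.

    Returns the list of process outputs, one per time step.
    """
    y, dy = 0.0, 0.0          # process output and its rate of change
    integral = 0.0            # running integral of the error
    y_prev = y                # previous output, for the derivative term
    trajectory = []
    for _ in range(int(round(t_end / dt))):
        error = setpoint - y
        integral += error * dt
        # Derivative acts on the measurement to avoid a kick on setpoint steps.
        dy_measured = (y - y_prev) / dt
        u = kp * error + ki * integral - kd * dy_measured
        y_prev = y
        # Explicit Euler step of the second-order process.
        ddy = (gain * u - 2.0 * zeta * tau * dy - y) / tau ** 2
        y += dy * dt
        dy += ddy * dt
        trajectory.append(y)
    return trajectory


if __name__ == "__main__":
    with_integral = simulate_pid(kp=2.0, ki=1.0, kd=0.5)
    proportional_only = simulate_pid(kp=2.0, ki=0.0, kd=0.0)
    print(with_integral[-1], proportional_only[-1])
```

With integral action the output settles at the setpoint, whereas proportional-only control leaves a steady-state offset. Choosing the three gains so that this holds across a range of `gain`, `tau`, and `zeta` values is the tuning problem that the abstract proposes to hand to a meta-RL agent.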
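Common data models solve many of the challenges of standardizing electronic health record (EHR) data, but they are unable to semantically integrate the resources needed for deep phenotyping. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide semantically computable representations of biological knowledge and enable the integration of a variety of biomedical data. However, mapping EHR data to OBO Foundry ontologies requires significant manual curation and domain expertise. We introduce a framework for mapping Observational Medical Outcomes Partnership (OMOP) standard vocabularies to OBO Foundry ontologies. Using this framework, we produced mappings for 92,367 conditions, 8,615 drug ingredients, and 10,673 measurement results. Domain experts verified the accuracy of the mappings, and when examined across 24 hospitals, the mappings covered 99% of conditions and drug ingredients and 68% of measurements. Finally, we demonstrate that OMOP2OBO mappings can aid in the systematic identification of undiagnosed rare disease patients who might benefit from genetic testing.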
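A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing the abilities of these models in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive performance but also performs well consistently across many decision-making tasks involving uncertainty (e.g., selective prediction, open set recognition), robust generalization (e.g., accuracy and proper scoring rules such as log-likelihood on in-distribution and out-of-distribution datasets), and adaptation (e.g., active learning, few-shot uncertainty). We devise 10 types of tasks over 40 datasets to evaluate different aspects of reliability in both the vision and language domains. To improve reliability, we developed ViT-Plex and T5-Plex, pretrained large model extensions for the vision and language modalities, respectively. Plex greatly improves the state of the art across reliability tasks and simplifies the traditional protocol, because it improves out-of-the-box performance and does not require designing scores or tuning the model for each task. We demonstrate scaling effects over model sizes up to 1B parameters and pretraining dataset sizes up to 4B examples. We also demonstrate Plex's capabilities on challenging tasks, including zero-shot open set recognition, active learning, and uncertainty in conversational language understanding.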
The need for question answering datasets in low-resource languages motivates this research, which led to the development of the Kencorpus Swahili Question Answering Dataset, KenSwQuAD. The dataset is annotated from raw story texts in Swahili, a low-resource language spoken predominantly in Eastern Africa and also in other parts of the world. Question answering (QA) datasets are important for machine comprehension of natural language in tasks such as internet search and dialog systems. Machine learning systems need training data such as the gold-standard question answering set developed in this research. The research engaged annotators to formulate QA pairs from Swahili texts collected by the Kencorpus project, a Kenyan languages corpus. The project annotated 1,445 of the 2,585 texts with at least 5 QA pairs each, resulting in a final dataset of 7,526 QA pairs. A quality assurance set of 12.5% of the annotated texts confirmed that the QA pairs were all correctly annotated. A proof of concept applying the set to the QA task confirmed that the dataset is usable for such tasks. KenSwQuAD has also contributed to the resourcing of the Swahili language.