Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
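As a taste of what the open release enables, the sketch below prompts one of the publicly released BLOOM checkpoints through the Hugging Face transformers library. The small bloom-560m variant stands in for the full 176B model, which requires a multi-GPU setup, and the prompt and decoding settings are illustrative, not prescribed by the paper.

```python
# Minimal sketch: prompting a released BLOOM checkpoint via transformers.
# The 560M variant is used for illustration; the full 176B model does not
# fit on a single GPU.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

prompt = "Translate to French: I love open science."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```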
The course of COVID-19 has by now allowed researchers to assemble datasets accumulated over two years and use them for predictive analysis. This, in turn, makes it possible to assess the potential efficiency of more complex predictive models, including neural networks with different forecast horizons. In this paper, we present the results of a consistent comparative study of different types of methods on regional data from two countries: the United States and Russia. We used well-known statistical methods (e.g., exponential smoothing), a naive "tomorrow" baseline, and a set of classic machine learning models trained on data from individual regions. Alongside these, we considered neural network models based on long short-term memory (LSTM) layers, whose training samples aggregate data from all regions of both countries, the United States and Russia. Efficiency was evaluated with cross-validation using the MAPE metric. The results show that, for difficult periods with a sharp increase in the number of confirmed daily cases, the best results were obtained by the LSTM model trained on all regions of both countries, with a mean absolute percentage error (MAPE) of 18%, 30%, and 37% for Russia and 31%, 41%, and 50% for the United States at forecast horizons of 14, 28, and 42 days.
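To make the evaluation concrete, here is a minimal sketch of the MAPE metric and one plausible reading of the naive "tomorrow" baseline (forecast every future day as the last observed value); the function names, array shapes, and baseline definition are our assumptions, not the paper's code.

```python
import numpy as np

def mape(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute percentage error in percent; ignores zero targets."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    mask = y_true != 0
    return 100.0 * np.mean(np.abs((y_true[mask] - y_pred[mask]) / y_true[mask]))

def tomorrow_baseline(series: np.ndarray, horizon: int) -> np.ndarray:
    """Naive baseline: forecast every day of the horizon as the last observed value."""
    return np.full(horizon, series[-1])

# Toy usage: daily confirmed cases for one region.
cases = np.array([100, 120, 150, 200, 260, 330, 410], dtype=float)
forecast = tomorrow_baseline(cases[:-3], horizon=3)
print(f"baseline MAPE: {mape(cases[-3:], forecast):.1f}%")
```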
Magnetic resonance imaging (MRI) is a central modality for stroke imaging. It is used upon patient admission to make treatment decisions, such as selecting patients for intravenous thrombolysis or endovascular therapy. MRI is later used during the hospital stay to predict outcome by visualizing infarct core size and location. Furthermore, it can be used to characterize stroke etiology, e.g., to differentiate between (cardio-)embolic and non-embolic stroke. Computer-based automated medical image processing is increasingly finding its way into clinical routine. Previous iterations of the Ischemic Stroke Lesion Segmentation (ISLES) challenge have helped to establish benchmark methods for the segmentation of acute and sub-acute ischemic stroke lesions. Here we introduce an expert-annotated, multicenter MRI dataset for the segmentation of acute to subacute stroke lesions. The dataset comprises 400 multi-vendor MRI cases with high variability in stroke lesion size, quantity, and location. It is split into a training dataset of n=250 and a test dataset of n=150. All training data will be made publicly available. The test dataset will be used for model validation only and will not be released to the public. This dataset forms the basis of the ISLES 2022 challenge, whose goal is to enable the development and benchmarking of robust and accurate segmentation algorithms for ischemic stroke.
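Challenges of this kind typically score submissions with overlap metrics such as the Dice coefficient; the sketch below is a generic implementation for binary lesion masks, not the official ISLES 2022 evaluation code.

```python
import numpy as np

def dice_score(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-8) -> float:
    """Dice overlap between two binary lesion masks (True = lesion voxel)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return (2.0 * intersection + eps) / (pred.sum() + truth.sum() + eps)

# Toy usage with two small 3D masks.
truth = np.zeros((4, 4, 4), dtype=bool)
truth[1:3, 1:3, 1:3] = True
pred = np.zeros_like(truth)
pred[1:3, 1:3, 1:4] = True
print(f"Dice: {dice_score(pred, truth):.3f}")
```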
Language models demonstrate both quantitative improvements and new qualitative capabilities as they increase in scale. Despite their potentially transformative impact, these new capabilities remain poorly characterized. To inform future research, prepare for disruptive new model capabilities, and mitigate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, ranging over linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; and social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
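BIG-bench's official scoring is not reproduced here, but the "calibration" finding above can be made concrete with one standard measure, the expected calibration error (ECE); the binning scheme and names below are illustrative assumptions.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins: int = 10) -> float:
    """ECE: gap between mean confidence and accuracy, averaged over equal-width bins."""
    confidences = np.asarray(confidences, float)
    correct = np.asarray(correct, float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

# Toy usage: per-example top-class confidence and correctness.
conf = np.array([0.9, 0.8, 0.7, 0.95, 0.6])
hit = np.array([1, 1, 0, 1, 0])
print(f"ECE: {expected_calibration_error(conf, hit):.3f}")
```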
Transmission spectroscopy is a powerful tool for decoding the chemical composition of exoplanet atmospheres. In this paper we focus on unsupervised techniques for analyzing spectral data from transiting exoplanets. We demonstrate methods for i) cleaning and validating the data, ii) initial exploratory data analysis based on summary statistics (estimates of location and variability), iii) exploring and quantifying the existing correlations in the data, iv) pre-processing and linearly transforming the data to its principal components, v) dimensionality reduction and manifold learning, vi) clustering and anomaly detection, and vii) visualization and interpretation of the data. To illustrate the proposed unsupervised methodology, we use a well-known public benchmark dataset of synthetic transit spectra. We show that there is a high degree of correlation in the spectral data, which calls for an appropriate low-dimensional representation. We explore a number of different techniques for such dimensionality reduction and identify several suitable options in terms of summary statistics, principal components, and so on. We uncover interesting structures in the principal-component basis, namely, well-defined branches corresponding to different chemical regimes of the underlying atmospheres. We demonstrate that these branches can be successfully recovered with a K-means clustering algorithm in a fully unsupervised fashion. We advocate a three-dimensional representation of the spectroscopic data in terms of the first three principal components, in order to reveal the existing structure in the data and to quickly characterize the chemical class of a planet.
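Steps iv) through vi) map directly onto standard scikit-learn components; a minimal sketch (with random stand-in spectra and illustrative component and cluster counts, not the paper's actual data or settings) might look like this:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
spectra = rng.normal(size=(500, 52))  # stand-in for transit depths per wavelength bin

X = StandardScaler().fit_transform(spectra)       # iv) preprocessing
pca = PCA(n_components=3).fit(X)                  # iv)-v) principal components
Z = pca.transform(X)                              # 3D representation advocated above
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(Z)  # vi) clustering

print("explained variance ratio:", np.round(pca.explained_variance_ratio_, 3))
print("cluster sizes:", np.bincount(labels))
```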
The physical properties and atmospheric chemical composition of newly discovered exoplanets are often inferred from their transit spectra, which are obtained from complex numerical models of radiative transfer. Alternatively, simple analytical expressions can provide insightful physical intuition about the relevant atmospheric processes. The deep learning revolution has opened the door to deriving such analytical results directly, with a computer algorithm fitting the data. As a proof of concept, we successfully demonstrate the use of symbolic regression on synthetic data for the transit radii of a generic family of hot-Jupiter exoplanets, in order to derive a corresponding analytical formula. As a pre-processing step, we use dimensional analysis to identify the relevant dimensionless combinations of variables and to reduce the number of independent inputs, which improves the performance of the symbolic regression. The dimensional analysis also allows us to mathematically derive and properly parametrize the most general family of degeneracies among the input atmospheric parameters that affect the characterization of an exoplanet atmosphere through transit spectroscopy.
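A hedged sketch of the two-step recipe: collapse the dimensional inputs into dimensionless groups (here via the atmospheric scale height H = k_B T / (mu g)), then hand them to an off-the-shelf symbolic-regression library. The abstract does not name the paper's tooling or exact target, so gplearn and the toy target with a known analytic form below are our assumptions.

```python
import numpy as np
from gplearn.genetic import SymbolicRegressor

rng = np.random.default_rng(0)
n = 2000
T = rng.uniform(1000, 2500, n)      # temperature [K]
g = rng.uniform(5, 25, n)           # surface gravity [m/s^2]
mu = 2.3 * 1.66e-27                 # mean molecular mass [kg], H2-dominated assumption
k_B = 1.381e-23

# Dimensional analysis: (T, g, mu) collapse into one scale height H,
# and the regression sees only the dimensionless ratio x = R0 / H.
H = k_B * T / (mu * g)
R0 = rng.uniform(5e7, 1.5e8, n)     # reference radius [m]
x = R0 / H
y = np.log(x)                       # toy stand-in target with a known analytic form

est = SymbolicRegressor(population_size=1000, generations=10,
                        function_set=("add", "sub", "mul", "div", "log"),
                        parsimony_coefficient=0.01, random_state=0)
est.fit(x.reshape(-1, 1), y)
print(est._program)                 # recovered expression, ideally log(X0)
```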
In the past years, deep learning has seen increasing use in histopathological applications. However, while these approaches have shown great potential, in high-risk environments deep learning models need to be able to judge their own uncertainty and be able to reject inputs when there is a significant chance of misclassification. In this work, we conduct a rigorous evaluation of the most commonly used uncertainty and robustness methods for the classification of Whole-Slide-Images under domain shift using the H&E stained Camelyon17 breast cancer dataset. Although it is known that histopathological data can be subject to strong domain shift and label noise, to our knowledge this is the first work that compares the most common methods for uncertainty estimation under these aspects. In our experiments, we compare Stochastic Variational Inference, Monte-Carlo Dropout, Deep Ensembles, Test-Time Data Augmentation as well as combinations thereof. We observe that ensembles of methods generally lead to higher accuracies and better calibration and that Test-Time Data Augmentation can be a promising alternative when choosing an appropriate set of augmentations. Across methods, a rejection of the most uncertain tiles leads to a significant increase in classification accuracy on both in-distribution as well as out-of-distribution data. Furthermore, we conduct experiments comparing these methods under varying conditions of label noise. We observe that the border regions of the Camelyon17 dataset are subject to label noise and evaluate the robustness of the included methods against different noise levels. Lastly, we publish our code framework to facilitate further research on uncertainty estimation on histopathological data.
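As a sketch of two of the building blocks above, the snippet below shows Monte-Carlo dropout with predictive entropy as the uncertainty score, plus rejection of the most uncertain tiles; the function names and the entropy-based score are our assumptions, not the paper's exact implementation.

```python
import torch

@torch.no_grad()
def mc_dropout_predict(model: torch.nn.Module, x: torch.Tensor, n_samples: int = 20):
    """Monte-Carlo dropout: keep dropout active at test time and average predictions."""
    model.train()  # enables dropout; in practice, BatchNorm layers should stay frozen
    probs = torch.stack([model(x).softmax(dim=-1) for _ in range(n_samples)])
    mean = probs.mean(dim=0)
    entropy = -(mean * mean.clamp_min(1e-12).log()).sum(dim=-1)  # predictive uncertainty
    return mean, entropy

def reject_most_uncertain(entropy: torch.Tensor, keep_fraction: float = 0.8):
    """Indices of the tiles kept after rejecting the most uncertain fraction."""
    n_keep = int(keep_fraction * entropy.numel())
    return entropy.argsort()[:n_keep]
```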
Charisma is considered as one's ability to attract and potentially also influence others. Clearly, there can be considerable interest from an artificial intelligence's (AI) perspective to provide it with such skill. Beyond that, a plethora of use cases opens up for computational measurement of human charisma, such as for tutoring humans in the acquisition of charisma, mediating human-to-human conversation, or identifying charismatic individuals in big social data. A number of models exist that base charisma on various dimensions, often following the idea that charisma is given if someone could and would help others. Examples include influence (could help) and affability (would help) in scientific studies or power (could help), presence, and warmth (both would help) as a popular concept. Modelling high levels in these dimensions for humanoid robots or virtual agents seems accomplishable. Beyond that, automatic measurement also appears quite feasible with the recent advances in the related fields of Affective Computing and Social Signal Processing. Here, we therefore present a blueprint for building machines that can appear charismatic, but also analyse the charisma of others. To this end, we first provide the psychological perspective including different models of charisma and behavioural cues of it. We then switch to conversational charisma in spoken language as an exemplary modality that is essential for human-human and human-computer conversations. The computational perspective then deals with the recognition and generation of charismatic behaviour by AI. This includes an overview of the state of play in the field and the aforementioned blueprint. We then name exemplary use cases of computational charismatic skills before switching to ethical aspects and concluding this overview and perspective on building charisma-enabled AI.
Deep learning-based 3D human pose estimation performs best when trained on large amounts of labeled data, making combined learning from many datasets an important research direction. One obstacle to this endeavor is the different skeleton formats provided by different datasets, i.e., they do not label the same set of anatomical landmarks. There is little prior research on how to best supervise one model with such discrepant labels. We show that simply using separate output heads for different skeletons results in inconsistent depth estimates and insufficient information sharing across skeletons. As a remedy, we propose a novel affine-combining autoencoder (ACAE) method to perform dimensionality reduction on the number of landmarks. The discovered latent 3D points capture the redundancy among skeletons, enabling enhanced information sharing when used for consistency regularization. Our approach scales to an extreme multi-dataset regime, where we use 28 3D human pose datasets to supervise one model, which outperforms prior work on a range of benchmarks, including the challenging 3D Poses in the Wild (3DPW) dataset. Our code and models are available for research purposes.
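A minimal sketch of the core ACAE idea, under our reading of the abstract: latent 3D points are formed as convex (hence affine) combinations of the input landmarks, and reconstruction works the same way in reverse, so the mapping stays equivariant to rigid transformations of the skeleton. The softmax weighting, joint count, and latent size are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AffineCombiningAutoencoder(nn.Module):
    """Latent 3D points as affine combinations of the input landmarks
    (weights sum to 1), with the reconstruction formed the same way."""

    def __init__(self, n_joints: int, n_latent: int):
        super().__init__()
        self.enc_logits = nn.Parameter(torch.randn(n_latent, n_joints))
        self.dec_logits = nn.Parameter(torch.randn(n_joints, n_latent))

    def forward(self, joints: torch.Tensor):      # joints: (batch, n_joints, 3)
        w_enc = F.softmax(self.enc_logits, dim=-1)  # each row sums to 1
        w_dec = F.softmax(self.dec_logits, dim=-1)
        latent = w_enc @ joints                     # (batch, n_latent, 3)
        recon = w_dec @ latent                      # (batch, n_joints, 3)
        return latent, recon

# Toy usage: reduce a merged multi-dataset skeleton to a few latent points.
model = AffineCombiningAutoencoder(n_joints=555, n_latent=32)  # sizes are illustrative
joints = torch.randn(4, 555, 3)
latent, recon = model(joints)
loss = F.l1_loss(recon, joints)
```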
This article concerns Bayesian inference using deep linear networks with output dimension one. In the interpolating (zero noise) regime we show that with Gaussian weight priors and MSE negative log-likelihood loss both the predictive posterior and the Bayesian model evidence can be written in closed form in terms of a class of meromorphic special functions called Meijer-G functions. These results are non-asymptotic and hold for any training dataset, network depth, and hidden layer widths, giving exact solutions to Bayesian interpolation using a deep Gaussian process with a Euclidean covariance at each layer. Through novel asymptotic expansions of Meijer-G functions, a rich new picture of the role of depth emerges. Specifically, we find that the posteriors in deep linear networks with data-independent priors are the same as in shallow networks with evidence maximizing data-dependent priors. In this sense, deep linear networks make provably optimal predictions. We also prove that, starting from data-agnostic priors, Bayesian model evidence in wide networks is only maximized at infinite depth. This gives a principled reason to prefer deeper networks (at least in the linear case). Finally, our results show that with data-agnostic priors a novel notion of effective depth given by \[\#\text{hidden layers}\times\frac{\#\text{training data}}{\text{network width}}\] determines the Bayesian posterior in wide linear networks, giving rigorous new scaling laws for generalization error.
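As a worked instance of this effective-depth formula (the numbers are ours, for illustration): a linear network with 8 hidden layers trained on 1000 points at width 4000 has effective depth \[8\times\frac{1000}{4000}=2,\] matching, e.g., a 2-hidden-layer network of width 1000 on the same data, so in the wide regime the two configurations induce the same Bayesian posterior.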