We aim to bridge the gap between common-sense few-sample human learning and large-data machine learning. We derive a theory of human-like few-shot learning from the von Neumann-Landauer principle. Modelling human learning is difficult because how people learn varies from person to person. Under commonly accepted definitions, we prove that all human or animal few-shot learning, and major models of such learning including the Free Energy Principle and Bayesian Program Learning, approximate our theory under the Church-Turing thesis. We find that deep generative models such as the variational autoencoder (VAE) can be used to approximate our theory and perform significantly better than baseline models, including deep neural networks, on image recognition, low-resource language processing, and character recognition.
Most real-world problems that machine learning algorithms are expected to solve face the situation of 1) unknown data distribution, 2) little domain-specific knowledge, and 3) datasets with limited annotation. We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV), a learning framework for any dataset with abundant unlabeled data but very few labeled ones. By only training a generative model in an unsupervised way, the framework utilizes the data distribution to build a compressor. Using a compressor-based distance metric derived from Kolmogorov complexity, together with few labeled data, NPC-LV classifies without further training. We show that NPC-LV outperforms supervised methods on all three datasets on image classification in the low-data regime, and even outperforms semi-supervised learning methods on CIFAR-10. We demonstrate how and when the negative evidence lower bound (nELBO) can be used as an approximate compressed length for classification. By revealing the correlation between compression rate and classification accuracy, we illustrate that under NPC-LV, improvements in generative models can enhance downstream classification accuracy.
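As a rough illustration of the classification step, the sketch below combines a normalized compression distance with a k-nearest-neighbor vote, using a trained generative model's nELBO as the approximate compressed length. The functions `nelbo` and `nelbo_pair` are hypothetical stand-ins for the model's compressed-length estimates; the paper's exact handling of concatenated inputs may differ.

```python
# A minimal sketch of NPC-LV-style classification, assuming `nelbo(x)`
# returns the negative ELBO of x in bits (an approximate compressed
# length C(x)) and `nelbo_pair(x, y)` does the same for the pair C(xy).
# Both are hypothetical stand-ins for a trained generative model.

def ncd(c_x: float, c_y: float, c_xy: float) -> float:
    """Normalized Compression Distance computed from compressed lengths."""
    return (c_xy - min(c_x, c_y)) / max(c_x, c_y)

def classify(x, labeled, nelbo, nelbo_pair, k: int = 3):
    """k-NN over NCD against the few labeled examples (xi, yi)."""
    dists = sorted(
        ((ncd(nelbo(x), nelbo(xi), nelbo_pair(x, xi)), yi)
         for xi, yi in labeled),
        key=lambda t: t[0],
    )
    votes = [y for _, y in dists[:k]]
    return max(set(votes), key=votes.count)  # majority vote
```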
Recent progress in artificial intelligence (AI) has renewed interest in building systems that learn and think like people. Many advances have come from using deep neural networks trained end-to-end in tasks such as object recognition, video games, and board games, achieving performance that equals or even beats humans in some respects. Despite their biological inspiration and performance achievements, these systems differ from human intelligence in crucial ways. We review progress in cognitive science suggesting that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn, and how they learn it. Specifically, we argue that these machines should (a) build causal models of the world that support explanation and understanding, rather than merely solving pattern recognition problems; (b) ground learning in intuitive theories of physics and psychology, to support and enrich the knowledge that is learned; and (c) harness compositionality and learning-to-learn to rapidly acquire and generalize knowledge to new tasks and situations. We suggest concrete challenges and promising routes towards these goals that can combine the strengths of recent neural network advances with more structured cognitive models.
Deep neural networks (DNNs) are often used for text classification tasks as they usually achieve high levels of accuracy. However, DNNs can be computationally intensive, with billions of parameters and large amounts of labeled data, which can make them expensive to use, to optimize, and to transfer to out-of-distribution (OOD) cases in practice. In this paper, we propose a non-parametric alternative to DNNs that is easy, lightweight, and universal in text classification: a combination of a simple compressor like gzip with a $k$-nearest-neighbor classifier. Without any training, pre-training, or fine-tuning, our method achieves results that are competitive with non-pretrained deep learning methods on six in-distribution datasets. It even outperforms BERT on all five OOD datasets, including four low-resource languages. Our method also performs particularly well in few-shot settings, where labeled data are too scarce for DNNs to achieve satisfying accuracy.
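The method lends itself to a compact implementation. The sketch below is one plausible rendering of the abstract's description (gzip as the compressor, normalized compression distance as the metric, k-nearest-neighbor voting); the exact concatenation and tie-breaking details are assumptions.

```python
# A minimal sketch of the compressor-based text classifier: gzip as the
# compressor, NCD as the distance, k-NN as the classifier. No training.
import gzip

def clen(s: str) -> int:
    """Compressed length of a string in bytes."""
    return len(gzip.compress(s.encode("utf-8")))

def ncd(a: str, b: str) -> float:
    """Normalized Compression Distance between two texts; joining with a
    space is an assumption about how the pair is concatenated."""
    ca, cb, cab = clen(a), clen(b), clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)

def predict(text: str, train: list[tuple[str, str]], k: int = 3) -> str:
    """Label `text` by majority vote over its k nearest training texts."""
    neighbors = sorted(train, key=lambda ex: ncd(text, ex[0]))[:k]
    labels = [label for _, label in neighbors]
    return max(set(labels), key=labels.count)
```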
In popular media, a link is often drawn between the emergence of consciousness in artificial agents and those same agents simultaneously achieving human or superhuman levels of intelligence. In this work, we explore the validity and potential applications of this seemingly intuitive link between consciousness and intelligence. We do so by examining the cognitive abilities associated with three contemporary functional theories of consciousness: Global Workspace Theory (GWT), Information Generation Theory (IGT), and Attention Schema Theory (AST). We find that all three theories specifically relate conscious function to some aspect of domain-general intelligence in humans. With this insight, we turn to the field of artificial intelligence (AI) and find that, while still far from demonstrating general intelligence, many state-of-the-art deep learning methods have begun to incorporate key aspects of each of the three functional theories. Having identified this trend, we use the motivating example of mental time travel in humans to propose ways in which insights from each of the three theories may be combined into a single unified and implementable model. Given that the cognitive abilities underlying each of the three functional theories make this possible, artificial agents capable of mental time travel would not only possess greater general intelligence than current approaches, but would also be more consistent with our current understanding of the functional role of consciousness in humans, making this a promising near-term goal for AI research.
This survey draws a broad panoramic picture of the state of the art (SoTA) of research on generative methods for the analysis of social media data. It fills a gap, as existing survey articles are either much narrower in scope or dated. We include two important aspects that are currently gaining importance in mining and modeling social media: dynamics and networks. Social dynamics are important for understanding the spreading of influence or disease, the formation of friendships, and similar phenomena, while network modeling, on the other hand, can capture various complex relationships, providing additional insight and identifying important patterns that would otherwise go unnoticed.
Building a human-like integrative artificial cognitive system, that is, an artificial general intelligence (AGI), is the holy grail of the artificial intelligence (AI) field. Furthermore, a computational model that enables an artificial system to achieve cognitive development would be an excellent reference for brain and cognitive science. This paper describes an approach to developing a cognitive architecture by integrating elemental cognitive modules so that the modules can be trained as a whole. The approach is based on two ideas: (1) brain-inspired AI, which learns from human brain architecture to build human-level intelligence, and (2) a probabilistic generative model (PGM)-based cognitive system, which develops a cognitive system for developmental robots by integrating PGMs. The development framework is called a whole-brain PGM (WB-PGM); it differs fundamentally from existing cognitive architectures in that it can learn continually through a system based on sensory-motor information. In this study, we describe the rationale of WB-PGM, the current status of PGM-based elemental cognitive modules, their relationship with the human brain, the approach to integrating the cognitive modules, and future challenges. Our findings can serve as a reference for brain studies. As PGMs describe explicit informational relationships between variables, this description provides interpretable guidance from computational science to brain science. By providing such information, researchers in neuroscience can give feedback to researchers in AI and robotics on what the current models lack with reference to the brain. Further, it can facilitate collaboration among researchers in the neuro-cognitive sciences as well as in AI and robotics.
Ten years into the revival of deep networks and artificial intelligence, we propose a theoretical framework that sheds light on understanding deep networks within the bigger picture of intelligence in general. We introduce two fundamental principles, Parsimony and Self-consistency, which we believe to be cornerstones for the emergence of intelligence, artificial or natural. While these two principles have rich classical roots, we argue that they can be restated in entirely measurable and computable ways. More specifically, the two principles lead to an effective and efficient computational framework, compressive closed-loop transcription, which unifies and explains the evolution of modern deep networks and most practices of artificial intelligence. While we mainly use the modeling of visual data as an example, we believe the two principles will unify the understanding of broad families of autonomous intelligent systems and provide a framework for understanding the brain.
Consciousness and intelligence are properties commonly understood as dependent by folk psychology and society in general. The term artificial intelligence, and the kinds of problems it has managed to solve in recent years, has been used as an argument to establish that machines experience some sort of consciousness. Following Russell's analogy, if a machine can do what a conscious human being does, the likelihood that the machine is conscious increases. However, the social implications of this analogy are catastrophic. Concretely, if rights are granted to entities that can solve the kinds of problems a neurotypical person can, would a machine potentially have more rights than a person with a disability? For example, the autism spectrum disorder can make a person unable to solve the kinds of problems a machine solves. We believe the obvious answer is no, as solving problems does not imply consciousness. Therefore, we argue in this paper that phenomenal consciousness and, at least, computational intelligence are independent, and explain why machines do not possess phenomenal consciousness, although they may develop a higher computational intelligence than human beings. To do so, we attempt to formulate an objective measure of computational intelligence and study how it presents in humans, animals, and machines. Analogously, we study phenomenal consciousness as a dichotomous variable and how it is distributed across humans, animals, and machines. Since phenomenal consciousness and computational intelligence are independent, this fact has critical implications for society, which we also analyze in this work.
Advocates of Neuro-Symbolic Artificial Intelligence (NeSy) assert that combining deep learning with symbolic reasoning will lead to stronger AI than either paradigm on its own. As successful as deep learning has been, it is generally accepted that even our best deep learning systems are not very good at abstract reasoning. And since reasoning is inextricably linked to language, it makes intuitive sense that Natural Language Processing (NLP) would be a particularly well-suited candidate for NeSy. We conduct a structured review of studies implementing NeSy for NLP, with the aim of answering whether NeSy is indeed living up to its promises: reasoning, out-of-distribution generalization, interpretability, learning and transferability from small data, and reasoning in new domains. We examine the impact of knowledge representations, such as rules and semantic networks, of language structure and relational structure, and of whether implicit or explicit reasoning contributes to higher promise scores. We find that systems in which logic is compiled into the neural network satisfy the most NeSy goals, while other factors, such as the knowledge representation or the type of neural architecture, show no clear correlation with goals being met. We find many discrepancies in how reasoning is defined, specifically in relation to human-level reasoning, which affects decisions about model architectures and drives conclusions that are not always consistent across studies. Hence we advocate a more methodical approach to applying theories of human reasoning, as well as the development of appropriate benchmarks, which we hope can lead to a better understanding of progress in the field. We provide data and code for further analysis on GitHub.
The process of learning good features for machine learning applications can be very computationally expensive and may prove difficult in cases where little data is available. A prototypical example of this is the one-shot learning setting, in which we must correctly make predictions given only a single example of each new class. In this paper, we explore a method for learning Siamese neural networks, which employ a unique structure to naturally rank similarity between inputs. Once a network has been tuned, we can then capitalize on powerful discriminative features to generalize the predictive power of the network not just to new data, but to entirely new classes from unknown distributions. Using a convolutional architecture, we are able to achieve strong results which exceed those of other deep learning models, with near state-of-the-art performance on one-shot classification tasks.
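A minimal sketch of the core idea, a shared encoder plus a learned similarity on the embedding difference, is shown below in PyTorch. The fully connected encoder and layer sizes are assumptions made for brevity; the paper itself uses a convolutional architecture.

```python
# A minimal sketch of a Siamese similarity network: two inputs pass
# through one shared encoder, and the L1 distance between embeddings
# feeds a sigmoid "same class?" score. Layer sizes are toy assumptions.
import torch
import torch.nn as nn

class SiameseNet(nn.Module):
    def __init__(self, in_dim: int = 784, embed_dim: int = 128):
        super().__init__()
        # Shared encoder: both inputs are mapped with the same weights.
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, embed_dim), nn.ReLU(),
        )
        self.head = nn.Linear(embed_dim, 1)  # similarity logit

    def forward(self, x1: torch.Tensor, x2: torch.Tensor) -> torch.Tensor:
        h1, h2 = self.encoder(x1), self.encoder(x2)
        # Component-wise L1 distance between the two embeddings.
        return torch.sigmoid(self.head(torch.abs(h1 - h2)))

def one_shot_predict(net, query, support):
    """One-shot classification: compare the query to one example per
    class and pick the class whose example scores most similar."""
    scores = [(net(query, x).item(), y) for x, y in support]
    return max(scores)[1]
```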
In recent years, deep learning has infiltrated every field it has touched, reducing the need for specialist knowledge and automating the process of knowledge discovery from data. This review argues that astronomy is no different, and that we are currently in the midst of a deep learning revolution that is transforming the way we do astronomy. We trace the history of astronomical connectionism from the early days of multilayer perceptrons, through the second wave of convolutional and recurrent neural networks, to the current third wave of self-supervised and unsupervised deep learning. We then predict that we will soon enter a fourth wave of astronomical connectionism, in which finetuned versions of an all-encompassing 'foundation' model will replace expertly crafted deep learning models. We argue that such a model can only be brought about through a symbiotic relationship between astronomy and connectionism, whereby astronomy provides high quality multimodal data to train the foundation model, and in turn the foundation model is used to advance astronomical research.
The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms implementing such priors. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, auto-encoders, manifold learning, and deep networks. This motivates longer-term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation and manifold learning.
This white paper lays out a vision of research and development in the field of artificial intelligence for the next decade (and beyond). Its denouement is a cyber-physical ecosystem of natural and synthetic sense-making in which humans are integral participants: what we call "shared intelligence". This vision is premised on active inference, a formulation of adaptive behavior that can be read as a physics of intelligence, and which inherits from the physics of self-organization. In this context, we understand intelligence as the capacity to accumulate evidence for a generative model of one's sensed world, also known as self-evidencing. Formally, this corresponds to maximizing (Bayesian) model evidence, via belief updating over several scales: i.e., inference, learning, and model selection. Operationally, this self-evidencing can be realized via (variational) message passing or belief propagation on a factor graph. Crucially, active inference foregrounds an existential imperative of intelligent systems; namely, curiosity or the resolution of uncertainty. This same imperative underwrites belief sharing in ensembles of agents, in which certain aspects (i.e., factors) of each agent's generative world model provide a common ground or frame of reference. Active inference plays a foundational role in this ecology of belief sharing, leading to a formal account of collective intelligence that rests on shared narratives and goals. We also consider the kinds of communication protocols that must be developed to enable such an ecosystem of intelligences, and motivate the development of a shared hyper-spatial modeling language and transaction protocol as a first, and key, step towards such an ecology.
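To make the notion of self-evidencing concrete, the toy sketch below computes variational free energy for a two-state discrete generative model: minimizing free energy over beliefs recovers the Bayesian posterior, and at the minimum the free energy equals the negative log model evidence. The numbers are illustrative assumptions, not taken from the paper.

```python
# A minimal sketch of "self-evidencing": variational free energy F
# upper-bounds negative log evidence, so belief updating that lowers F
# accumulates evidence for the generative model. Toy numbers throughout.
import numpy as np

prior = np.array([0.5, 0.5])            # p(s): two hidden states
likelihood = np.array([[0.9, 0.2],      # p(o|s): rows index outcomes
                       [0.1, 0.8]])
o = 0                                    # observed outcome index

joint = likelihood[o] * prior            # p(o, s)
evidence = joint.sum()                   # p(o)
posterior = joint / evidence             # p(s | o), via Bayes' rule

def free_energy(q):
    """F = E_q[ln q(s) - ln p(o, s)] >= -ln p(o)."""
    return float(np.sum(q * (np.log(q) - np.log(joint))))

print(free_energy(prior))      # F under prior beliefs (larger)
print(free_energy(posterior))  # F at the posterior equals -ln p(o)
print(-np.log(evidence))       # negative log evidence, for comparison
```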
This paper reviews concepts, modeling approaches, and recent findings along a spectrum of different levels of abstraction of neural network models, covering generalization across (1) samples, (2) distributions, (3) domains, (4) tasks, (5) modalities, and (6) scopes. Results on (1) sample generalization show that, for ImageNet, nearly all recent improvements have reduced training error, while overfitting has stayed flat; with training error nearly eliminated, future progress will require a focus on reducing overfitting. Perspectives from statistics highlight how (2) distribution generalization can be viewed alternately as a change in sample weights or as a change in the input-output relationship. Transfer-learning approaches to (3) domain generalization are summarized, along with recent advances and the wealth of domain-adaptation benchmark datasets. Recent breakthroughs surveyed under (4) task generalization include few-shot meta-learning approaches and the BERT NLP engine, and recent (5) modality generalization studies are covered that integrate image and text data and that apply biologically inspired networks across olfactory, visual, and auditory modalities. Recent (6) scope generalization results are reviewed that embed knowledge graphs into deep NLP approaches. In addition, the modular architecture of brains is discussed, along with the steps by which dopamine-driven conditioning leads to abstract thinking.
A fascinating hypothesis is that human and animal intelligence could be explained by a few principles, rather than an encyclopedic list of heuristics. If this hypothesis is correct, we could more easily both understand our own intelligence and build intelligent machines. Just as in physics, the principles themselves would not suffice to predict the behavior of complex systems like brains, and substantial computation might be needed to simulate human-like intelligence. This hypothesis suggests that studying the kinds of inductive biases that humans and animals exploit could help clarify these principles and provide inspiration for AI research and neuroscience theories. Deep learning already exploits several key inductive biases, and this work considers a larger list, focusing on those that mostly concern higher-level and sequential conscious processing. The goal of clarifying these particular principles is that they could potentially help us build AI systems that benefit from humans' abilities in flexible out-of-distribution and systematic generalization, an area where a large gap currently exists between state-of-the-art machine learning and human intelligence.
There is growing concern about the typically opaque decision-making of high-performance machine learning algorithms. Providing an explanation of the reasoning process in domain-specific terms can be crucial for adoption in risk-sensitive domains such as healthcare. We argue that machine learning algorithms should be interpretable by design and that the language in which these interpretations are expressed should be domain- and task-dependent. Consequently, we base our model's predictions on a family of user-defined and task-specific binary functions of the data, each with a clear interpretation for the end user. We then minimize the expected number of queries needed for an accurate prediction on any given input. As the solution is generally intractable, following prior work we choose the queries sequentially based on information gain. However, in contrast to previous work, we need not assume the queries are conditionally independent. Instead, we leverage a stochastic generative model (VAE) and an MCMC algorithm (unadjusted Langevin) to select the most informative query about the input based on the previous query-answers. This enables the online determination of a query chain of whatever depth is required to resolve prediction ambiguity. Finally, experiments on vision and NLP tasks demonstrate the efficacy of our approach and its superiority over post-hoc explanations.
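The sequential query-selection step can be illustrated with a toy example in which the posterior is an explicit weighted table of hypotheses, as sketched below; the paper instead approximates this posterior with VAE samples obtained by unadjusted Langevin MCMC, since exact enumeration is intractable. All numbers and queries here are made up for illustration.

```python
# A minimal sketch of greedy query selection by information gain: ask
# the binary query that most reduces uncertainty about the label, given
# a (toy, explicit) posterior over hypotheses. Rows of `hypotheses` are
# (label, answer to q0, answer to q1); `weights` are posterior weights.
import numpy as np

hypotheses = np.array([[0, 0, 0],
                       [0, 0, 1],
                       [1, 1, 0],
                       [1, 1, 1]])
weights = np.array([0.4, 0.1, 0.2, 0.3])

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def label_entropy(w, h):
    """Entropy of the label marginal under hypothesis weights w."""
    p = np.array([w[h[:, 0] == c].sum() for c in np.unique(h[:, 0])])
    return entropy(p / p.sum())

def expected_gain(q, w, h):
    """Expected drop in label entropy from asking binary query q."""
    before = label_entropy(w, h)
    after = 0.0
    for a in (0, 1):
        mask = h[:, 1 + q] == a
        wa = w[mask].sum()
        if wa == 0:
            continue
        after += wa / w.sum() * label_entropy(w[mask], h[mask])
    return before - after

best = max(range(2), key=lambda q: expected_gain(q, weights, hypotheses))
print("ask query", best)  # q0 fully determines the label in this toy
```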
Predictive coding offers a potentially unifying account of cortical function, positing that the core function of the brain is to minimize prediction errors with respect to a generative model of the world. The theory is closely related to the Bayesian brain framework and has gained substantial influence in theoretical and cognitive neuroscience over the past two decades. A large body of research has arisen, both empirically testing improved and extended theoretical and mathematical models of predictive coding and evaluating its potential biological plausibility for implementation in the brain, as well as the concrete neurophysiological and psychological predictions the theory makes. Despite this enduring popularity, however, no comprehensive review of predictive coding theory, and especially of recent developments in the field, exists. Here, we provide a comprehensive review of the core mathematical structure and logic of predictive coding, complementing recent tutorials in the literature. We also review a wide range of classic and recent work within the framework, from the neurobiologically realistic microcircuits that could implement predictive coding, to the close relationship between predictive coding and the widely used backpropagation-of-error algorithm, as well as surveying the close relationships between predictive coding and modern machine learning techniques.
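A minimal sketch of the mathematics at the framework's core, for a two-layer linear Gaussian model, might look as follows: each layer predicts the activity of the layer below, and latent estimates descend the squared prediction errors. The weights, data, and learning rate are arbitrary toy choices, not values from the review.

```python
# A minimal sketch of predictive coding inference: gradient descent of
# latent states z1, z2 on the energy
#   E = ||x - W1 z1||^2 / 2 + ||z1 - W2 z2||^2 / 2,
# so each latent is driven by the error below and constrained from above.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 4))   # layer-1 generative weights: x_hat = W1 @ z1
W2 = rng.normal(size=(4, 2))   # layer-2 generative weights: z1_hat = W2 @ z2
x = rng.normal(size=8)         # observed data

z1, z2 = np.zeros(4), np.zeros(2)
lr = 0.05

for _ in range(200):
    e0 = x - W1 @ z1           # prediction error at the data layer
    e1 = z1 - W2 @ z2          # prediction error at the middle layer
    z1 += lr * (W1.T @ e0 - e1)
    z2 += lr * (W2.T @ e1)

print(np.linalg.norm(x - W1 @ z1))  # residual prediction error
```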
There is growing interest in the area of machine learning and creativity. This survey presents an overview of the history and the state of the art of computational creativity theories, key machine learning techniques (including generative deep learning), and corresponding automatic evaluation methods. After a critical discussion of the key contributions to this area, we outline the current research challenges and emerging opportunities in the field.
Continual Learning (CL) is a field dedicated to devising algorithms able to achieve lifelong learning. Overcoming the disruption of previously acquired knowledge, a drawback of deep learning models that goes by the name of catastrophic forgetting, is a hard challenge. Currently, deep learning methods can attain impressive results when the modeled data do not undergo a considerable distributional shift in subsequent learning sessions, but whenever we expose such systems to this incremental setting, performance drops very quickly. Overcoming this limitation is fundamental, as it would allow us to build truly intelligent systems showing stability and plasticity. Secondly, it would allow us to overcome the onerous limitation of retraining these architectures from scratch with the newly updated data. In this thesis, we tackle the problem from multiple directions. In a first study, we show that in rehearsal-based techniques (systems that use a memory buffer), the quantity of data stored in the rehearsal buffer matters more than the quality of the data. Secondly, we present one of the early works on incremental learning for ViT architectures, comparing functional, weight, and attention regularization approaches, and propose a novel, effective asymmetric loss. We then conclude with a study on pretraining and how it affects performance in Continual Learning, raising some questions about the effective progression of the field, followed by future directions and closing remarks.
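As a concrete illustration of the rehearsal-based setting examined in the first study, the sketch below implements a simple replay buffer with reservoir sampling, a common choice in the continual learning literature; the thesis does not prescribe this exact scheme.

```python
# A minimal sketch of a rehearsal (replay) buffer for continual learning.
# Old examples are replayed alongside the current task's batch to
# mitigate catastrophic forgetting. Reservoir sampling keeps an unbiased
# sample of the whole stream in a fixed-capacity buffer.
import random

class RehearsalBuffer:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = []          # stored (x, y) pairs
        self.seen = 0           # total examples observed so far

    def add(self, example):
        """Reservoir sampling: every example in the stream ends up in
        the buffer with equal probability capacity / seen."""
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = example

    def sample(self, k: int):
        """Draw a replay minibatch to mix with the current task's batch."""
        return random.sample(self.data, min(k, len(self.data)))
```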