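We explore a data-driven approach to learning to optimize neural networks. We construct a dataset of neural network checkpoints and train a generative model on the parameters. In particular, our model is a conditional diffusion transformer that, given an initial input parameter vector and a prompted loss, error, or return, predicts the distribution over parameter updates that achieve the desired metric. At test time, it can optimize neural networks with unseen parameters in a single update. We find that our approach successfully generates parameters for a wide range of loss prompts. Moreover, it can sample multimodal parameter solutions and has favorable scaling properties. We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.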
Translated by Google Translate
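In this work, we study how the performance and evaluation of generative image models are affected by the racial composition of their training datasets. By examining and controlling the racial distributions in various training datasets, we are able to observe the effect of different training distributions on the quality of generated images and on the racial distribution of the generated images. Our results show that the racial composition of generated images successfully preserves that of the training data. However, we observe that truncation, a technique used to generate higher-quality images during inference, exacerbates racial imbalances in the data. Finally, when examining the relationship between image quality and race, we find that the images of a given race with the highest perceived visual quality come from a distribution in which that race is well represented, and that annotators consistently prefer generated images of white people over those of Black people.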
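The effect of truncation on sample diversity can be illustrated with a toy example. The abstract does not say which truncation variant is used, so this sketch assumes the common form that interpolates each latent toward the mean latent with a factor `psi`; the one-dimensional latents and the names `truncate` and `psi` are hypothetical. It only shows that truncation shrinks the spread of sampled latents, which is one plausible way for less-represented modes to be sampled less often.

```python
import random
import statistics


def truncate(latents, psi):
    """Pull every latent toward the sample mean by a factor psi in [0, 1].

    psi = 1 leaves the latents unchanged; psi = 0 collapses them to the mean.
    This is an assumed, simplified form of the truncation trick.
    """
    mean = statistics.fmean(latents)
    return [mean + psi * (z - mean) for z in latents]


# Toy one-dimensional "latent space" with a fixed seed for reproducibility.
rng = random.Random(0)
latents = [rng.gauss(0.0, 1.0) for _ in range(1000)]
truncated = truncate(latents, psi=0.5)

# The spread of the truncated latents is psi times the original spread.
spread_before = statistics.pstdev(latents)
spread_after = statistics.pstdev(truncated)
```

With `psi=0.5` the standard deviation of the latents is halved, so samples concentrate around the mean of the latent distribution.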
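We present a video generation model that accurately reproduces object motion, changes in camera viewpoint, and new content that arises over time. Existing video generation methods often fail to produce new content as a function of time while maintaining the consistency expected in real environments, such as plausible dynamics and object persistence. A common failure case is that content never changes because of over-reliance on inductive biases to provide temporal consistency, such as a single latent code that dictates the content of the entire video. At the other extreme, without long-term consistency, generated videos may morph unrealistically between different scenes. To address these limitations, we prioritize the time axis by redesigning the temporal latent representation and by learning long-term consistency from data through training on longer videos. To this end, we use a two-phase training strategy in which we train separately on longer videos at low resolution and on shorter videos at high resolution. To evaluate the capabilities of our model, we introduce two new benchmark datasets with an explicit focus on long-term temporal dynamics.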
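What does human pose tell us about a scene? We propose a task to answer this question: given human pose as input, hallucinate a compatible scene. Subtle cues captured by human pose (pose semantics, environment affordances, object interactions) provide surprising insight into which scenes are compatible. We present a large-scale generative adversarial network for pose-conditioned scene generation. We significantly scale up the size and complexity of the training data, curating a large meta-dataset containing more than 19 million frames of humans in everyday environments. We increase the capacity of our model relative to StyleGAN2 to handle such complex data, and we design a pose-conditioning mechanism that drives our model to learn the nuanced relationship between pose and scene. We use our trained model for various applications: hallucinating pose-compatible scenes with or without humans, visualizing incompatible scenes and poses, placing a person from one generated image into another scene, and animating pose. Our model produces diverse samples and outperforms pose-conditioned StyleGAN2 and Pix2Pix baselines in terms of accurate human placement (percentage of correct keypoints) and image quality (Fréchet Inception Distance).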
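The original "Seven Motifs" set out a roadmap of essential methods for the field of scientific computing, where a motif is an algorithmic method that captures a pattern of computation and data movement. We present the "Nine Motifs of Simulation Intelligence", a roadmap for developing and integrating the essential algorithms needed to merge scientific computing, scientific simulation, and artificial intelligence. We call this merger simulation intelligence (SI) for short. We argue that the motifs of simulation intelligence are interconnected and interdependent, much like the components within the layers of an operating system. Using this metaphor, we explore the nature of each layer of the simulation intelligence operating system stack (SI-stack) and the motifs therein: (1) multi-physics and multi-scale modeling; (2) surrogate modeling and emulation; (3) simulation-based inference; (4) causal modeling and inference; (5) agent-based modeling; (6) probabilistic programming; (7) differentiable programming; (8) open-ended optimization; (9) machine programming. We believe that coordinated efforts between motifs offer an immense opportunity to accelerate scientific discovery, from solving inverse problems in synthetic biology and climate science, to directing nuclear energy experiments, to predicting emergent behavior in socioeconomic settings. We elaborate on each layer of the SI-stack, detailing state-of-the-art methods, presenting examples that highlight challenges and opportunities, and advocating specific ways to advance the motifs and the synergies from their combinations. Advancing and integrating these technologies can enable a robust and efficient hypothesis-simulation-analysis type of scientific method, which we introduce with several use cases for human-machine teaming and automated science.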
Modeling lies at the core of both the financial and the insurance industries for a wide variety of tasks. The rise and development of machine learning and deep learning models have created many opportunities to improve our modeling toolbox. Breakthroughs in these fields often require large amounts of data. Such large datasets are often not publicly available in finance and insurance, mainly because of privacy and ethics concerns. This lack of data is currently one of the main hurdles in developing better models. One possible way to alleviate this issue is generative modeling. Generative models are capable of simulating fake but realistic-looking data, also referred to as synthetic data, that can be shared more freely. Generative Adversarial Networks (GANs) are one such class of models, and they increase our capacity to fit very high-dimensional distributions of data. While research on GANs is an active topic in fields like computer vision, they have found limited adoption within the human sciences, such as economics and insurance. The reason is that in these fields most questions are inherently about the identification of causal effects, while to this day neural networks, which are at the center of the GAN framework, focus mostly on high-dimensional correlations. In this paper we study the causal preservation capabilities of GANs and whether the synthetic data they produce can reliably be used to answer causal questions. We do this by performing causal analyses on the synthetic data produced by a GAN, under increasingly lenient assumptions. We consider the cross-sectional case, the time series case, and the case with a complete structural model. We show that in the simple cross-sectional scenario, where correlation equals causation, the GAN preserves causality, but that challenges arise for more advanced analyses.
KL-regularized reinforcement learning from expert demonstrations has proved successful in improving the sample efficiency of deep reinforcement learning algorithms, allowing them to be applied to challenging physical real-world tasks. However, we show that KL-regularized reinforcement learning with behavioral reference policies derived from expert demonstrations can suffer from pathological training dynamics that can lead to slow, unstable, and suboptimal online learning. We show empirically that the pathology occurs for commonly chosen behavioral policy classes and demonstrate its impact on sample efficiency and online policy performance. Finally, we show that the pathology can be remedied by non-parametric behavioral reference policies and that this allows KL-regularized reinforcement learning to significantly outperform state-of-the-art approaches on a variety of challenging locomotion and dexterous hand manipulation tasks.
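As a rough illustration of what KL regularization toward a behavioral reference policy computes, the sketch below evaluates a per-step objective of the form reward minus `alpha` times KL(policy || reference) for univariate Gaussian policies. The Gaussian policy class, the weight `alpha`, and the function names are assumptions made for this example; the abstract does not specify the objective in this detail.

```python
import math


def kl_gaussian(mu_p, std_p, mu_q, std_q):
    """Closed-form KL(N(mu_p, std_p^2) || N(mu_q, std_q^2)) for univariate Gaussians."""
    return (
        math.log(std_q / std_p)
        + (std_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * std_q ** 2)
        - 0.5
    )


def regularized_reward(reward, alpha, policy, reference):
    """Per-step objective: reward minus alpha times KL(policy || reference).

    `policy` and `reference` are (mean, std) pairs; both are hypothetical
    stand-ins for the online policy and the behavioral reference policy.
    """
    return reward - alpha * kl_gaussian(*policy, *reference)


# Same online policy, two reference policies that differ only in their spread.
online = (0.5, 1.0)
penalty_wide = kl_gaussian(*online, 0.0, 1.0)    # reference std = 1.0
penalty_narrow = kl_gaussian(*online, 0.0, 0.1)  # reference std = 0.1
```

In this toy setting the penalty is much larger for the narrow reference policy than for the wide one. Whether this is the mechanism behind the pathology reported above is not stated in the abstract.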
Scientists and philosophers have debated whether humans can trust advanced artificial intelligence (AI) agents to respect humanity's best interests. Yet what about the reverse? Will advanced AI agents trust humans? Gauging an AI agent's trust in humans is challenging because--absent costs for dishonesty--such agents might respond falsely about their trust in humans. Here we present a method for incentivizing machine decisions without altering an AI agent's underlying algorithms or goal orientation. In two separate experiments, we then employ this method in hundreds of trust games between an AI agent (a Large Language Model (LLM) from OpenAI) and a human experimenter (author TJ). In our first experiment, we find that the AI agent decides to trust humans at higher rates when facing actual incentives than when making hypothetical decisions. Our second experiment replicates and extends these findings by automating game play and by homogenizing question wording. We again observe higher rates of trust when the AI agent faces real incentives. Across both experiments, the AI agent's trust decisions appear unrelated to the magnitude of stakes. Furthermore, to address the possibility that the AI agent's trust decisions reflect a preference for uncertainty, the experiments include two conditions that present the AI agent with a non-social decision task that provides the opportunity to choose a certain or uncertain option; in those conditions, the AI agent consistently chooses the certain option. Our experiments suggest that one of the most advanced AI language models to date alters its social behavior in response to incentives and displays behavior consistent with trust toward a human interlocutor when incentivized.
The cooperation of a human pilot with an autonomous agent during flight control realizes parallel autonomy. A parallel-autonomous system acts as a guardian that significantly enhances the robustness and safety of flight operations in challenging circumstances. Here, we propose an air-guardian concept that facilitates cooperation between an artificial pilot agent and a parallel end-to-end neural control system. Our vision-based air-guardian system combines a causal continuous-depth neural network model with a cooperation layer to enable parallel autonomy between a pilot agent and a control system based on perceived differences in their attention profiles. The attention profiles are obtained by computing the networks' saliency maps (feature importance) through the VisualBackProp algorithm. The guardian agent is trained via reinforcement learning in a simulated fixed-wing aircraft environment. When the attention profiles of the pilot and guardian agents align, the pilot makes control decisions. If the attention maps of the pilot and the guardian do not align, the air-guardian intervenes and takes over control of the aircraft. We show that our attention-based air-guardian system can balance the trade-off between its level of involvement in the flight and the pilot's expertise and attention. We demonstrate the effectiveness of our methods in simulated flight scenarios with a fixed-wing aircraft and on a real drone platform.
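The switching rule described above can be sketched as a comparison between two saliency maps. This is only an illustration under stated assumptions: the maps are flattened lists of non-negative importance scores, alignment is measured by cosine similarity, and control is handed over with a hard `threshold`. All of these choices, and the names `cosine_similarity` and `select_control`, are hypothetical; the abstract does not describe how the cooperation layer measures alignment.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length saliency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero map as not aligned with anything
    return dot / (norm_a * norm_b)


def select_control(pilot_map, guardian_map, pilot_action, guardian_action,
                   threshold=0.8):
    """Return the pilot's action when the attention maps align, else the guardian's.

    `threshold` is an assumed alignment cutoff, not a value from the source.
    """
    aligned = cosine_similarity(pilot_map, guardian_map) >= threshold
    return pilot_action if aligned else guardian_action
```

For example, `select_control([1, 0, 0], [1, 0, 0], "pilot", "guardian")` keeps the pilot in control, while orthogonal maps such as `[1, 0, 0]` and `[0, 1, 0]` hand control to the guardian.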
As demand for large corpora increases with the size of current state-of-the-art language models, using web data as the main part of the pre-training corpus for these models has become a ubiquitous practice. This, in turn, has introduced an important challenge for NLP practitioners, who are now confronted with the task of developing highly optimized models and pipelines for pre-processing large quantities of textual data, which in practice means effectively classifying and filtering multilingual, heterogeneous, and noisy data at web scale. One of the main components of this pre-processing step for the pre-training corpora of large language models is the removal of adult and harmful content. In this paper we explore different methods for detecting adult and harmful content in multilingual, heterogeneous web data. We first show how traditional methods in harmful content detection, which seemingly perform quite well on small and specialized datasets, quickly break down when confronted with heterogeneous, noisy web data. We then resort to a perplexity-based approach, but with a twist: instead of using a so-called "clean" corpus to train a small language model and then using perplexity to select the documents with low perplexity, i.e., the documents that most resemble this "clean" corpus, we train solely on adult and harmful textual data and then select the documents whose perplexity is above a given threshold. This approach virtually clusters our documents into two distinct groups, which greatly facilitates the choice of the perplexity threshold and also allows us to obtain higher precision than with traditional classification methods for detecting adult and harmful content.
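The filtering idea above can be sketched with a very small language model. This is a minimal illustration, not the pipeline from the paper: it assumes a whitespace tokenizer and a unigram model with add-one smoothing trained only on a hypothetical "harmful" corpus (placeholder tokens here), and it keeps the documents whose perplexity under that model exceeds a `threshold`. The function names and the threshold value are assumptions.

```python
import math
from collections import Counter


def train_unigram(corpus):
    """Count whitespace-separated tokens in the (harmful-only) training corpus."""
    counts = Counter(tok for doc in corpus for tok in doc.lower().split())
    return counts, sum(counts.values())


def perplexity(doc, counts, total):
    """Perplexity of `doc` under the add-one-smoothed unigram model."""
    tokens = doc.lower().split()
    if not tokens:
        return float("inf")  # empty documents never look like the training data
    vocab = len(counts) + 1  # one extra slot for unseen tokens
    log_prob = sum(math.log((counts[t] + 1) / (total + vocab)) for t in tokens)
    return math.exp(-log_prob / len(tokens))


def keep_documents(docs, counts, total, threshold):
    """Keep documents that do NOT resemble the harmful corpus (high perplexity)."""
    return [d for d in docs if perplexity(d, counts, total) > threshold]


# Placeholder tokens stand in for harmful vocabulary.
counts, total = train_unigram(["xxx yyy xxx", "yyy zzz"])
docs = ["xxx yyy", "weather report today"]
kept = keep_documents(docs, counts, total, threshold=5.0)
```

Here the document that reuses the training vocabulary gets a low perplexity and is filtered out, while the unrelated document gets a high perplexity and is kept.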