Our aim is to build autonomous agents that can solve tasks in environments like Minecraft. To do so, we use an imitation learning-based approach. We formulate our control problem as a search problem over a dataset of expert demonstrations, where the agent copies actions from a similar demonstration trajectory of image-action pairs. We perform a proximity search over the BASALT MineRL-dataset in the latent representation of a Video PreTraining model. The agent copies the actions from the selected expert trajectory as long as the distance between the state representation of the agent and that of the expert trajectory does not diverge. When it does, the proximity search is repeated. Our approach can effectively recover meaningful demonstration trajectories and show human-like behavior of an agent in the Minecraft environment.
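The search-and-copy loop described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: latent vectors are plain lists of floats, the distance is Euclidean, and the names `nearest_frame`, `run_agent`, `observe`, `act`, and `threshold` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two latent vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_frame(latent, trajectories):
    """Return (trajectory index, step index) of the expert frame closest to `latent`.

    `trajectories` is a list of expert trajectories, each a list of
    (latent, action) pairs.
    """
    best, best_dist = None, math.inf
    for ti, traj in enumerate(trajectories):
        for si, (expert_latent, _action) in enumerate(traj):
            d = euclidean(latent, expert_latent)
            if d < best_dist:
                best, best_dist = (ti, si), d
    return best


def run_agent(observe, act, trajectories, threshold, max_steps):
    """Copy expert actions and search again whenever the agent diverges.

    `observe()` returns the agent's current latent state and `act(a)`
    executes an action; both are hypothetical callbacks. Returns the
    number of proximity searches performed.
    """
    ti, si = nearest_frame(observe(), trajectories)
    searches = 1
    for _ in range(max_steps):
        latent = observe()
        traj = trajectories[ti]
        # Search again if the expert trajectory ended or the states diverged.
        if si >= len(traj) or euclidean(latent, traj[si][0]) > threshold:
            ti, si = nearest_frame(latent, trajectories)
            searches += 1
            traj = trajectories[ti]
        act(traj[si][1])
        si += 1
    return searches
```

In practice the exhaustive scan in `nearest_frame` would likely be replaced by an approximate nearest-neighbor index, and the latent vectors would come from the pretrained video model rather than hand-written lists.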
Our goal with this survey is to provide an overview of state-of-the-art deep learning technologies for face generation and editing. We cover popular recent architectures and discuss the key ideas that make them work, such as inversion, latent representation, loss functions, training procedures, editing methods, and cross-domain style transfer. We focus in particular on GAN-based architectures that have culminated in the StyleGAN approaches, which allow generation of high-quality face images and offer rich interfaces for controllable semantic editing while preserving photo quality. We aim to provide an entry point into the field for readers who have basic knowledge of deep learning and are looking for an accessible introduction and overview.
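Practical applications of learning agents require sample-efficient and interpretable algorithms. Learning from behavioral priors is a promising way to give agents a better exploration policy or a safeguard against the pitfalls of early learning. Existing imitation learning solutions require a large number of expert demonstrations and rely on hard-to-interpret learning methods such as deep Q-learning. In this work, we present a planning-based approach that can use these behavioral priors for effective exploration and learning in a reinforcement learning environment, and we demonstrate that carefully curated exploration policies in the form of behavioral priors can help an agent learn faster.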
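The Learn-to-Race Autonomous Racing Virtual Challenge, hosted on the www.aicrowd.com platform, consisted of two tracks: single-camera and multi-camera. Our Uniteam team was among the final winners in the single-camera track. The agent has to pass a previously unknown F1-style track in the minimum time with the least amount of off-road driving. In our approach, we used the U-Net architecture for road segmentation, a variational autoencoder to encode the binary road mask, and a nearest-neighbor search strategy that selects the best action for a given state. Our agent achieved an average speed of 105 km/h on stage 1 (known track) and 73 km/h on stage 2 (unknown track) without any off-road driving. Here, we present our solution and results. The code implementation is available here: https://gitlab.aicrowd.com/shivansh beohar/l2r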
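Accurate traffic prediction is a key ingredient of traffic management, such as rerouting cars to reduce road congestion or regulating traffic via dynamic speed limits to maintain a steady flow. One way to represent traffic data is as temporally changing heatmaps that visualize attributes of traffic, such as speed and volume. In recent works, U-Net models have shown SOTA performance on traffic forecasting from heatmaps. We propose to combine the U-Net architecture with graph layers, which improves spatial generalization to unseen road networks compared to a vanilla U-Net. In particular, we specialize existing graph operations to be sensitive to geographical topology and generalize pooling and upsampling operations to be applicable to graphs.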
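One common way to carry pooling and upsampling over from image grids to graphs is to group nodes into clusters, average features within each cluster, and later copy the cluster feature back to its member nodes. The sketch below illustrates that general idea only; the abstract does not specify the operations actually used, and the names `graph_pool`, `graph_unpool`, and the `clusters` assignment list are assumptions.

```python
def graph_pool(features, clusters):
    """Mean-pool node features into one feature vector per cluster.

    `features[i]` is the feature vector of node i and `clusters[i]` is the
    integer id of the cluster that node i belongs to (assumed precomputed,
    for example from the road network's geography).
    """
    n_clusters = max(clusters) + 1
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(n_clusters)]
    counts = [0] * n_clusters
    for feat, c in zip(features, clusters):
        counts[c] += 1
        for j, value in enumerate(feat):
            sums[c][j] += value
    # Clusters without members keep an all-zero feature vector.
    return [
        [s / counts[c] for s in sums[c]] if counts[c] else [0.0] * dim
        for c in range(n_clusters)
    ]


def graph_unpool(pooled, clusters):
    """Upsample by copying each cluster's feature back to its member nodes."""
    return [list(pooled[c]) for c in clusters]
```

Applying `graph_unpool(graph_pool(x, clusters), clusters)` returns one vector per original node, which is the property a U-Net-style encoder-decoder needs for its skip connections to line up.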
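In this work, we demonstrate how to adapt a publicly available pre-trained Jukebox model to the problem of audio source separation from a single mixed audio channel. Our neural network architecture, which uses transfer learning, is quick to train, and the results demonstrate performance comparable to other state-of-the-art approaches. We provide an open-source code implementation of our architecture (https://rebrand.ly/transfer-jukebox-github).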
The recent increase in public and academic interest in preserving biodiversity has led to the growth of the field of conservation technology. This field involves designing and constructing tools that utilize technology to aid in the conservation of wildlife. In this article, we will use case studies to demonstrate the importance of designing conservation tools with human-wildlife interaction in mind and provide a framework for creating successful tools. These case studies include a range of complexities, from simple cat collars to machine learning and game theory methodologies. Our goal is to introduce and inform current and future researchers in the field of conservation technology and provide references for educating the next generation of conservation technologists. Conservation technology not only has the potential to benefit biodiversity but also has broader impacts on fields such as sustainability and environmental protection. By using innovative technologies to address conservation challenges, we can find more effective and efficient solutions to protect and preserve our planet's resources.
A Digital Twin (DT) is a simulation of a physical system that provides information to make decisions that add economic, social or commercial value. Because the behaviour of a physical system changes over time, a DT must be continually updated with data from the physical system to reflect its changing behaviour. For resource-constrained systems, updating a DT is non-trivial because of challenges such as on-board learning and off-board data transfer. This paper presents a framework for updating data-driven DTs of resource-constrained systems geared towards system health monitoring. The proposed solution consists of: (1) an on-board system running a light-weight DT allowing the prioritisation and parsimonious transfer of data generated by the physical system; and (2) off-board robust updating of the DT and detection of anomalous behaviours. Two case studies are considered using a production gas turbine engine system to demonstrate the digital representation accuracy for real-world, time-varying physical systems.
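One plausible reading of the on-board prioritisation step is that the light-weight DT predicts each measurement and only the samples it predicts worst are sent off-board, within a fixed transfer budget. The sketch below illustrates that reading; the ranking criterion (absolute prediction residual) and the names `prioritise`, `predict`, and `budget` are assumptions, not details taken from the paper.

```python
import heapq


def prioritise(samples, predict, budget):
    """Select the `budget` samples that the on-board model predicts worst.

    `samples` is a list of (input, measured_output) pairs and `predict` is
    the light-weight on-board model (a hypothetical callable). The selected
    samples are returned in their original order.
    """
    # Score every sample by its absolute prediction residual.
    scored = ((abs(y - predict(x)), i) for i, (x, y) in enumerate(samples))
    top = heapq.nlargest(budget, scored)
    # Restore chronological order before transfer.
    return [samples[i] for _, i in sorted(top, key=lambda pair: pair[1])]
```

Samples with small residuals are already well explained by the current DT, so under this assumption dropping them costs little information while keeping the transfer parsimonious.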
We introduce Argoverse 2 (AV2) - a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras, and two stereo cameras in addition to lidar point clouds, and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently-sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry - sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.
We present a Machine Learning (ML) case study to illustrate the challenges of clinical translation for a real-time AI-empowered echocardiography system with data from ICU patients in LMICs. The case study covers data preparation, curation and labelling of 2D ultrasound videos from 31 ICU patients in LMICs, and model selection, validation and deployment of three thinner neural networks to classify the apical four-chamber view (4CV). Results of the ML heuristics are promising for the implementation, validation and application of thinner networks to classify 4CV with limited datasets. We conclude by noting the need for (a) datasets with greater diversity of demographics and diseases, and (b) further investigation of thinner models that can run on low-cost hardware so that they can be clinically translated to the ICU in LMICs. The code and other resources to reproduce this work are available at https://github.com/vital-ultrasound/ai-assisted-echocardiography-for-low-resource-countries.