Reinforcement Learning (RL) can enable agents to learn complex tasks. However, it is difficult to interpret the learned knowledge and reuse it across tasks. Inductive biases can address such issues by explicitly providing a generic yet useful decomposition that is otherwise difficult or expensive to learn implicitly. For example, object-centered approaches decompose a high-dimensional observation into individual objects. Expanding on this, we utilize an inductive bias for explicit object-centered knowledge separation that provides further decomposition into semantic representations and dynamics knowledge. For this, we introduce a semantic module that predicts an object's semantic state based on its context. The resulting affordance-like object state can then be used to enrich perceptual object representations. With a minimal setup and an environment that enables puzzle-like tasks, we demonstrate the feasibility and benefits of this approach. Specifically, we compare three different methods of integrating semantic representations into a model-based RL architecture. Our experiments show that the degree of explicitness in knowledge separation correlates with faster learning, better accuracy, better generalization, and better interpretability.
Translated by Google Translate
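In most scenarios, conditional image generation can be thought of as an inversion of the image understanding process. Since generic image understanding involves solving multiple tasks, it is natural to aim at generating images via multi-conditioning. However, multi-conditional image generation is a very challenging problem due to the heterogeneity and the sparsity of the conditioning labels available in practice. In this work, we propose a novel neural architecture to address the heterogeneity and sparsity of spatially multi-conditional labels. Our choice of spatial conditioning, such as semantics and depth, is driven by the promise it holds for better control of the image generation process. The proposed method uses a transformer-like architecture operating pixel-wise, which receives the available labels as input tokens and merges them in a learned homogeneous space of labels. The merged labels are then used for image generation via conditional generative adversarial training. In this process, the sparsity of the labels is handled by simply dropping the input tokens corresponding to the missing labels at the desired locations, which the proposed pixel-wise architecture makes possible. Our experiments on three benchmark datasets demonstrate the clear superiority of our method over the state of the art and the compared baselines. The source code will be made publicly available.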
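We introduce fully stochastic layers in vision transformers without causing any severe drop in performance. The additional stochasticity improves the robustness of visual features and strengthens privacy. In this process, linear layers with fully stochastic parameters are used, during both training and inference, to transform the feature activations of each multilayer perceptron. Such stochastic linear operations preserve the topological structure formed by the set of tokens passing through the shared multilayer perceptron. This operation encourages the recognition task to rely on the topological structure of the tokens rather than on their values, which in turn provides the desired robustness and privacy of the visual features. In this paper, we use our features for three different applications, namely adversarial robustness, network calibration, and feature privacy. Our features offer exciting results on these tasks. Furthermore, we showcase experimental setups for federated and transfer learning, where vision transformers with stochastic layers again perform well. Our source code will be made publicly available.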
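Combinatorial evolution, the creation of new things through the combination of existing things, can be a powerful way to evolve rather than design technical objects such as electronic circuits. Intriguingly, this seems to be an ongoing and thus open-ended process that creates novelty of increasing complexity. Here, we apply combinatorial evolution to software development. While current approaches such as genetic programming are efficient at solving particular problems, they all converge towards a solution and create nothing new afterwards. The combinatorial evolution of complex systems such as languages and technology is considered open-ended. Therefore, open-ended automatic programming might be possible through combinatorial evolution. We implemented a computer program that simulates the combinatorial evolution of code blocks, which are stored in a database to make them available for combination. Automatic programming, in the sense of algorithm-based code generation, is achieved by evaluating regular expressions. We found that the reserved keywords of a programming language are suitable for defining the basic code blocks at the beginning of the simulation. We also found that placeholders can be used to combine code blocks, and that code complexity can be described in terms of importance to the programming language. As in a previous combinatorial evolution simulation of electronic circuits, complexity increased from simple keywords and special characters to more complex variable declarations, class definitions, methods, and classes containing methods and variable declarations. Combinatorial evolution therefore seems to be a promising approach to open-ended automatic programming.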
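The idea of combining code blocks through placeholders can be illustrated with a minimal sketch. It is not the authors' simulation: the `<slot>` placeholder syntax, the `combine` and `is_complete` helpers, and the example blocks are all assumptions made for illustration, and only the standard `re` module is used.

```python
import re

# Hypothetical placeholder syntax; the abstract does not specify one.
PLACEHOLDER = re.compile(r"<slot>")


def combine(outer: str, inner: str) -> str:
    """Fill the first placeholder in `outer` with the block `inner`."""
    # A function replacement avoids backslash-escape processing of `inner`.
    return PLACEHOLDER.sub(lambda _match: inner, outer, count=1)


def is_complete(block: str) -> bool:
    """A block is complete once no placeholder remains in it."""
    return PLACEHOLDER.search(block) is None


# Basic blocks built from keywords and special characters (illustrative only).
declaration = combine("int <slot>;", "value")
partial_class = combine("class <slot> { <slot> }", "Counter")
full_class = combine(partial_class, declaration)

print(full_class)
```

Each combination step yields a block that can itself be stored and reused, which is the property that lets complexity grow from keywords to declarations to classes in this kind of simulation.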
View-dependent effects such as reflections pose a substantial challenge for image-based and neural rendering algorithms. Above all, curved reflectors are particularly hard, as they lead to highly non-linear reflection flows as the camera moves. We introduce a new point-based representation to compute Neural Point Catacaustics allowing novel-view synthesis of scenes with curved reflectors, from a set of casually-captured input photos. At the core of our method is a neural warp field that models catacaustic trajectories of reflections, so complex specular effects can be rendered using efficient point splatting in conjunction with a neural renderer. One of our key contributions is the explicit representation of reflections with a reflection point cloud which is displaced by the neural warp field, and a primary point cloud which is optimized to represent the rest of the scene. After a short manual annotation step, our approach allows interactive high-quality renderings of novel views with accurate reflection flow. Additionally, the explicit representation of reflection flow supports several forms of scene manipulation in captured scenes, such as reflection editing, cloning of specular objects, reflection tracking across views, and comfortable stereo viewing. We provide the source code and other supplemental material on https://repo-sam.inria.fr/fungraph/neural_catacaustics/
Edge computing is changing the face of many industries and services. Common edge computing models offload computation, which is prone to security risks and privacy violations. However, advances in deep learning have enabled Internet of Things (IoT) devices to make decisions and run cognitive tasks locally. This research introduces a decentralized-control edge model in which most computation and decisions are moved to the IoT level. The model aims to decrease communication with the edge, which in turn enhances efficiency and decreases latency. The model also avoids data transfer, which raises security and privacy risks. To examine the model, we developed SAFEMYRIDES, a scene-aware ridesharing monitoring system in which smartphones detect violations at runtime. Current real-time monitoring systems are costly and require continuous network connectivity. The system uses optimized deep learning models that run locally on IoT devices to detect violations in ridesharing and record violation incidents. The system would enhance safety and security in ridesharing without violating privacy.
Cognitive Computing (COC) aims to build highly cognitive machines with low computational resources that respond in real time. However, the scholarly literature shows varying research areas and various interpretations of COC. This calls for a cohesive architecture that delineates the nature of COC. We argue that if Herbert Simon considered design science to be the science of the artificial, then cognitive systems are the products of cognitive science, or 'the newest science of the artificial'. Therefore, building a conceptual basis for COC is an essential step toward prospective cognitive computing-based systems. This paper proposes an architecture of COC by analyzing the literature on COC using a myriad of statistical analysis methods. We then compare the statistical analysis results with previous qualitative analysis results to confirm our findings. The study also comprehensively surveys recent research on COC to identify the state of the art and connect the advances in varied research disciplines in COC. The study found that there are three underlying computing paradigms, Von Neumann, Neuromorphic Engineering, and Quantum Computing, that comprehensively complement the structure of cognitive computation. The research discusses possible applications and open research directions under the COC umbrella.
Reading comprehension of legal text can be a particularly challenging task due to the length and complexity of legal clauses and a shortage of expert-annotated datasets. To address this challenge, we introduce the Merger Agreement Understanding Dataset (MAUD), an expert-annotated reading comprehension dataset based on the American Bar Association's 2021 Public Target Deal Points Study, with over 39,000 examples and over 47,000 total annotations. Our fine-tuned Transformer baselines show promising results, with models performing well above random on most questions. However, on a large subset of questions, there is still room for significant improvement. As the only expert-annotated merger agreement dataset, MAUD is valuable as a benchmark for both the legal profession and the NLP community.
The application of deep learning algorithms to financial data is difficult due to heavy non-stationarities which can lead to over-fitted models that underperform under regime changes. Using the Numerai tournament data set as a motivating example, we propose a machine learning pipeline for trading market-neutral stock portfolios based on tabular data which is robust under changes in market conditions. We evaluate various machine-learning models, including Gradient Boosting Decision Trees (GBDTs) and Neural Networks with and without simple feature engineering, as the building blocks for the pipeline. We find that GBDT models with dropout display high performance, robustness and generalisability with relatively low complexity and reduced computational cost. We then show that online learning techniques can be used in post-prediction processing to enhance the results. In particular, dynamic feature neutralisation, an efficient procedure that requires no retraining of models and can be applied post-prediction to any machine learning model, improves robustness by reducing drawdown in volatile market conditions. Furthermore, we demonstrate that the creation of model ensembles through dynamic model selection based on recent model performance leads to improved performance over baseline by improving the Sharpe and Calmar ratios. We also evaluate the robustness of our pipeline across different data splits and random seeds with good reproducibility of results.
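The post-prediction nature of feature neutralisation can be made concrete with a small sketch. It is a simplification under stated assumptions, not the paper's dynamic procedure: it removes the linear exposure of a prediction vector to a single feature by least-squares projection, and the function names `neutralise` and `exposure` and the `proportion` parameter are illustrative choices.

```python
def exposure(predictions, feature):
    """Dot product between predictions and a feature (unnormalised exposure)."""
    return sum(f * p for f, p in zip(feature, predictions))


def neutralise(predictions, feature, proportion=1.0):
    """Subtract a fraction of the component of `predictions` along `feature`.

    Assumption: a single feature and an uncentred least-squares projection.
    No model is retrained; only the predictions are adjusted.
    """
    denom = sum(f * f for f in feature)
    if denom == 0:
        # An all-zero feature carries no exposure to remove.
        return list(predictions)
    beta = exposure(predictions, feature) / denom
    return [p - proportion * beta * f for p, f in zip(predictions, feature)]


preds = [0.2, 0.4, 0.6, 0.8]
feat = [1.0, -1.0, 1.0, -1.0]
fully_neutral = neutralise(preds, feat)            # exposure removed
half_neutral = neutralise(preds, feat, 0.5)        # exposure halved
```

Because the adjustment only needs the predictions and the feature values, it can be applied to the output of any model, which is the property the abstract highlights.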
In this work, we address the problem of unsupervised moving object segmentation (MOS) in 4D LiDAR data recorded from a stationary sensor, where no ground truth annotations are involved. Deep learning-based state-of-the-art methods for LiDAR MOS strongly depend on annotated ground truth data, which is expensive to obtain and scarce in existence. To close this gap in the stationary setting, we propose a novel 4D LiDAR representation based on multivariate time series that relaxes the problem of unsupervised MOS to a time series clustering problem. More specifically, we propose modeling the change in occupancy of a voxel by a multivariate occupancy time series (MOTS), which captures spatio-temporal occupancy changes on the voxel level and its surrounding neighborhood. To perform unsupervised MOS, we train a neural network in a self-supervised manner to encode MOTS into voxel-level feature representations, which can be partitioned by a clustering algorithm into moving or stationary. Experiments on stationary scenes from the Raw KITTI dataset show that our fully unsupervised approach achieves performance that is comparable to that of supervised state-of-the-art approaches.
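To make the multivariate occupancy time series (MOTS) idea more tangible, the following sketch builds a binary occupancy series per voxel from a sequence of point clouds and stacks each voxel's series with those of its neighbours. The binary occupancy encoding, the 6-connected neighbourhood, and the names `voxel_of`, `occupancy_series`, and `mots` are assumptions for illustration; the abstract does not specify the actual encoding.

```python
from math import floor

# Hypothetical neighbourhood: the voxel itself plus its 6 face neighbours.
NEIGHBOUR_OFFSETS = [
    (0, 0, 0), (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1),
]


def voxel_of(point, voxel_size):
    """Integer voxel index containing a 3D point."""
    return tuple(floor(c / voxel_size) for c in point)


def occupancy_series(frames, voxel_size):
    """Map every voxel occupied at least once to a 0/1 series over the frames."""
    series = {}
    for t, points in enumerate(frames):
        for p in points:
            v = voxel_of(p, voxel_size)
            series.setdefault(v, [0] * len(frames))[t] = 1
    return series


def mots(series, voxel, n_frames):
    """Stack the series of a voxel and its neighbours: one channel per offset."""
    x, y, z = voxel
    empty = [0] * n_frames
    return [
        list(series.get((x + dx, y + dy, z + dz), empty))
        for dx, dy, dz in NEIGHBOUR_OFFSETS
    ]


# Three frames from a stationary sensor: one point moves from voxel (0,0,0) to (1,0,0).
frames = [[(0.1, 0.1, 0.1)], [(1.2, 0.1, 0.1)], [(1.3, 0.2, 0.1)]]
series = occupancy_series(frames, voxel_size=1.0)
channels = mots(series, (0, 0, 0), n_frames=len(frames))
```

In the example, the centre channel switches from occupied to free while the +x neighbour switches from free to occupied, which is the kind of spatio-temporal pattern a learned encoder and a clustering step could separate into moving and stationary.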