The automated machine learning (AutoML) field has become increasingly relevant in recent years. These algorithms can develop models without the need for expert knowledge, facilitating the application of machine learning techniques in the industry. Neural Architecture Search (NAS) exploits deep learning techniques to autonomously produce neural network architectures whose results rival the state-of-the-art models hand-crafted by AI experts. However, this approach requires significant computational resources and hardware investments, making it less appealing for real-usage applications. This article presents the third version of Pareto-Optimal Progressive Neural Architecture Search (POPNASv3), a new sequential model-based optimization NAS algorithm targeting different hardware environments and multiple classification tasks. Our method is able to find competitive architectures within large search spaces, while keeping a flexible structure and data processing pipeline to adapt to different tasks. The algorithm employs Pareto optimality to reduce the number of architectures sampled during the search, drastically improving the time efficiency without loss in accuracy. The experiments performed on images and time series classification datasets provide evidence that POPNASv3 can explore a large set of assorted operators and converge to optimal architectures suited for the type of data provided under different scenarios.
translated by 谷歌翻译
如今,深度学习模型的所有者和开发人员必须考虑其培训数据的严格隐私保护规则,通常是人群来源且保留敏感信息。如今,深入学习模型执行隐私保证的最广泛采用的方法依赖于实施差异隐私的优化技术。根据文献,这种方法已被证明是针对多种模型的隐私攻击的成功防御,但其缺点是对模型的性能的实质性降级。在这项工作中,我们比较了差异私有的随机梯度下降(DP-SGD)算法与使用正则化技术的标准优化实践的有效性。我们分析了生成模型的实用程序,培训性能以及成员推理和模型反转攻击对学习模型的有效性。最后,我们讨论了差异隐私的缺陷和限制,并从经验上证明了辍学和L2型规范的卓越保护特性。
translated by 谷歌翻译
单眼语义同时定位和映射(SLAM)的有效对象级别表示仍然缺乏广泛接受的解决方案。在本文中,我们提出了基于结构点的有效表示的使用,以基于姿势格式的配方在单眼语义大满贯系统中用作地标的几何形状。特别是,为姿势图中的地标节点提出了一个反深度参数化,以存储对象位置,方向和大小/比例。所提出的配方是一般的,可以应用于不同的几何形状。在本文中,我们关注的是室内环境,其中人工制品通常具有平面矩形形状,例如窗户,门,橱柜等。模拟中的实验表现出良好的性能,尤其是在对象几何重建中。
translated by 谷歌翻译
事件摄像机是新型生物启发传感器,其异步捕获“事件”形式的像素级强度变化。由于它们的传感机制,事件相机几乎没有运动模糊,这是一个非常高的时间分辨率,并且需要比传统的基于帧的相机更小的电力和存储器。这些特性使它们成为一个完美的拟合若干现实世界应用,如在可穿戴设备上的专门动作识别,其中快速相机运动和有限的电力挑战传统视觉传感器。然而,迄今为止,基于事件的愿景的不断增长的愿景领域已经忽略了在此类应用中的活动摄像机的潜力。在本文中,我们表明事件数据是自我监测行动识别的非常有价值的模态。为此,我们介绍了N-EPIC-Kitchens,这是大型史诗厨房数据集的第一个基于事件的相机扩展。在此背景下,我们提出了两种策略:(i)使用传统的视频处理架构(E $ ^ 2 $(GO))和(ii)使用事件数据直接处理事件相机数据(E $ ^ 2 $(GO))和蒸馏光流信息(E $ ^ 2 $(go)mo)。在我们提出的基准测试中,我们表明事件数据为RGB和光流提供了可比性的性能,但在部署时没有任何额外的流量计算,以及相对于RGB的信息高达4%的性能。
translated by 谷歌翻译
隐私法规法(例如GDPR)将透明度和安全性作为数据处理算法的设计支柱。在这种情况下,联邦学习是保护隐私的分布式机器学习的最具影响力的框架之一,从而实现了许多自然语言处理和计算机视觉任务的惊人结果。一些联合学习框架采用差异隐私,以防止私人数据泄漏到未经授权的政党和恶意攻击者。但是,许多研究突出了标准联邦学习对中毒和推理的脆弱性,因此引起了人们对敏感数据潜在风险的担忧。为了解决此问题,我们提出了SGDE,这是一种生成数据交换协议,可改善跨索洛联合会中的用户安全性和机器学习性能。 SGDE的核心是共享具有强大差异隐私的数据生成器,保证了对私人数据培训的培训,而不是通信显式梯度信息。这些发电机合成了任意大量数据,这些数据保留了私人样品的独特特征,但有很大差异。我们展示了将SGDE纳入跨核心联合网络如何提高对联邦学习最有影响力的攻击的弹性。我们在图像和表格数据集上测试我们的方法,利用β变量自动编码器作为数据生成器,并突出了对非生成数据的本地和联合学习的公平性和绩效改进。
translated by 谷歌翻译
Computational units in artificial neural networks follow a simplified model of biological neurons. In the biological model, the output signal of a neuron runs down the axon, splits following the many branches at its end, and passes identically to all the downward neurons of the network. Each of the downward neurons will use their copy of this signal as one of many inputs dendrites, integrate them all and fire an output, if above some threshold. In the artificial neural network, this translates to the fact that the nonlinear filtering of the signal is performed in the upward neuron, meaning that in practice the same activation is shared between all the downward neurons that use that signal as their input. Dendrites thus play a passive role. We propose a slightly more complex model for the biological neuron, where dendrites play an active role: the activation in the output of the upward neuron becomes optional, and instead the signals going through each dendrite undergo independent nonlinear filterings, before the linear combination. We implement this new model into a ReLU computational unit and discuss its biological plausibility. We compare this new computational unit with the standard one and describe it from a geometrical point of view. We provide a Keras implementation of this unit into fully connected and convolutional layers and estimate their FLOPs and weights change. We then use these layers in ResNet architectures on CIFAR-10, CIFAR-100, Imagenette, and Imagewoof, obtaining performance improvements over standard ResNets up to 1.73%. Finally, we prove a universal representation theorem for continuous functions on compact sets and show that this new unit has more representational power than its standard counterpart.
translated by 谷歌翻译
Real-world robotic grasping can be done robustly if a complete 3D Point Cloud Data (PCD) of an object is available. However, in practice, PCDs are often incomplete when objects are viewed from few and sparse viewpoints before the grasping action, leading to the generation of wrong or inaccurate grasp poses. We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry from the partial PCD to produce reliable grasp poses. Our proposed PCD completion network is a Transformer-based encoder-decoder network with an Offset-Attention layer. Our network is inherently invariant to the object pose and point's permutation, which generates PCDs that are geometrically consistent and completed properly. Experiments on a wide range of partial PCD show that 3DSGrasp outperforms the best state-of-the-art method on PCD completion tasks and largely improves the grasping success rate in real-world scenarios. The code and dataset will be made available upon acceptance.
translated by 谷歌翻译
Uncertainty quantification is crucial to inverse problems, as it could provide decision-makers with valuable information about the inversion results. For example, seismic inversion is a notoriously ill-posed inverse problem due to the band-limited and noisy nature of seismic data. It is therefore of paramount importance to quantify the uncertainties associated to the inversion process to ease the subsequent interpretation and decision making processes. Within this framework of reference, sampling from a target posterior provides a fundamental approach to quantifying the uncertainty in seismic inversion. However, selecting appropriate prior information in a probabilistic inversion is crucial, yet non-trivial, as it influences the ability of a sampling-based inference in providing geological realism in the posterior samples. To overcome such limitations, we present a regularized variational inference framework that performs posterior inference by implicitly regularizing the Kullback-Leibler divergence loss with a CNN-based denoiser by means of the Plug-and-Play methods. We call this new algorithm Plug-and-Play Stein Variational Gradient Descent (PnP-SVGD) and demonstrate its ability in producing high-resolution, trustworthy samples representative of the subsurface structures, which we argue could be used for post-inference tasks such as reservoir modelling and history matching. To validate the proposed method, numerical tests are performed on both synthetic and field post-stack seismic data.
translated by 谷歌翻译
Explainability is a vibrant research topic in the artificial intelligence community, with growing interest across methods and domains. Much has been written about the topic, yet explainability still lacks shared terminology and a framework capable of providing structural soundness to explanations. In our work, we address these issues by proposing a novel definition of explanation that is a synthesis of what can be found in the literature. We recognize that explanations are not atomic but the product of evidence stemming from the model and its input-output and the human interpretation of this evidence. Furthermore, we fit explanations into the properties of faithfulness (i.e., the explanation being a true description of the model's decision-making) and plausibility (i.e., how much the explanation looks convincing to the user). Using our proposed theoretical framework simplifies how these properties are ope rationalized and provide new insight into common explanation methods that we analyze as case studies.
translated by 谷歌翻译
Fruit is a key crop in worldwide agriculture feeding millions of people. The standard supply chain of fruit products involves quality checks to guarantee freshness, taste, and, most of all, safety. An important factor that determines fruit quality is its stage of ripening. This is usually manually classified by experts in the field, which makes it a labor-intensive and error-prone process. Thus, there is an arising need for automation in the process of fruit ripeness classification. Many automatic methods have been proposed that employ a variety of feature descriptors for the food item to be graded. Machine learning and deep learning techniques dominate the top-performing methods. Furthermore, deep learning can operate on raw data and thus relieve the users from having to compute complex engineered features, which are often crop-specific. In this survey, we review the latest methods proposed in the literature to automatize fruit ripeness classification, highlighting the most common feature descriptors they operate on.
translated by 谷歌翻译