Spiking Neural Networks (SNNs) are bio-plausible models that hold great potential for realizing energy-efficient implementations of sequential tasks on resource-constrained edge devices. However, commercial edge platforms based on standard GPUs are not optimized to deploy SNNs, resulting in high energy consumption and latency. While analog In-Memory Computing (IMC) platforms can serve as energy-efficient inference engines, they are burdened by the immense energy, latency, and area requirements of high-precision ADCs (HP-ADCs), which overshadow the benefits of in-memory computation. We propose a hardware/software co-design methodology to deploy SNNs on an ADC-Less IMC architecture that uses sense amplifiers as 1-bit ADCs in place of conventional HP-ADCs, alleviating the above issues. Our proposed framework incurs minimal accuracy degradation by performing hardware-aware training and is able to scale beyond simple image classification tasks to more complex sequential regression tasks. Experiments on the complex tasks of optical flow estimation and gesture recognition show that progressively increasing the hardware awareness during SNN training allows the model to adapt to and learn the errors caused by the non-idealities associated with ADC-Less IMC. In addition, the proposed ADC-Less IMC offers significant energy and latency improvements, $2-7\times$ and $8.9-24.6\times$, respectively, depending on the SNN model and the workload, compared to HP-ADC IMC.
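To make the idea of a sense amplifier acting as a 1-bit ADC concrete, here is a minimal sketch. It is not the architecture from the abstract: the tiling of rows into fixed-size crossbars, the zero reference level, and the names `sense_amp_1bit` and `adcless_dot` are all assumptions made for illustration. It only shows how collapsing each analog partial sum to a single bit makes the result deviate from the ideal dot product, which is the kind of non-ideality that hardware-aware training would have to absorb.

```python
def sense_amp_1bit(partial_sum, reference=0.0):
    """Hypothetical 1-bit readout: compare an analog partial sum to a reference."""
    return 1 if partial_sum > reference else 0


def ideal_dot(spikes, weights):
    """Full-precision dot product, as an idealized high-precision ADC would report."""
    return sum(s * w for s, w in zip(spikes, weights))


def adcless_dot(spikes, weights, rows_per_crossbar, reference=0.0):
    """Split the rows into crossbar tiles (an assumed mapping), read each tile's
    partial sum out as a single bit, then accumulate the bits digitally."""
    total = 0
    for start in range(0, len(spikes), rows_per_crossbar):
        tile_spikes = spikes[start:start + rows_per_crossbar]
        tile_weights = weights[start:start + rows_per_crossbar]
        total += sense_amp_1bit(ideal_dot(tile_spikes, tile_weights), reference)
    return total


if __name__ == "__main__":
    spikes = [1, 1, 1, 0]
    weights = [-0.5, 0.2, 0.4, 0.1]
    print("ideal:", ideal_dot(spikes, weights))
    print("1-bit per tile:", adcless_dot(spikes, weights, rows_per_crossbar=2))
```

The gap between the two printed values is the quantization error that a hardware-aware training loop would expose to the network during training.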
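Event-based cameras have recently shown great potential for high-speed motion estimation owing to their ability to capture temporally rich information asynchronously. Spiking Neural Networks (SNNs), with their neuro-inspired event-driven processing, can handle such asynchronous data efficiently, while neuron models such as the leaky integrate-and-fire (LIF) model can keep track of the essential timing information contained in the inputs. SNNs achieve this by maintaining a dynamic state in the neuron memory, retaining important information while forgetting redundant data over time. We therefore posit that SNNs allow better performance on sequential regression tasks than similarly sized Analog Neural Networks (ANNs). However, deep SNNs are difficult to train because spikes vanish in the later layers. To address this, we propose an adaptive fully-spiking framework with learnable neuronal dynamics to alleviate the spike vanishing problem. We use surrogate-gradient-based backpropagation through time (BPTT) to train our deep SNNs from scratch. We validate our approach on the task of optical flow estimation on the Multi-Vehicle Stereo Event Camera (MVSEC) dataset and the DSEC-Flow dataset. Our experiments on these datasets show an average reduction of 13% in average endpoint error (AEE) compared to state-of-the-art ANNs. We also explore several down-scaled models and observe that our SNN models consistently outperform similarly sized ANNs, offering 10%-16% lower AEE. These results demonstrate the importance of SNNs for smaller models and their suitability at the edge. In terms of efficiency, our SNNs offer substantial savings in network parameters (48x) and computational energy (51x) while attaining ~10% lower EPE compared to state-of-the-art ANN implementations.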
Neural network-based approaches for solving partial differential equations (PDEs) have recently received special attention. However, the large majority of neural PDE solvers only apply to rectilinear domains, and do not systematically address the imposition of Dirichlet/Neumann boundary conditions over irregular domain boundaries. In this paper, we present a framework to neurally solve partial differential equations over domains with irregularly shaped (non-rectilinear) geometric boundaries. Our network takes in the shape of the domain as an input (represented using an unstructured point cloud, or any other parametric representation such as Non-Uniform Rational B-Splines) and is able to generalize to novel (unseen) irregular domains; the key technical ingredient to realizing this model is a novel approach for identifying the interior and exterior of the computational grid in a differentiable manner. We also perform a careful error analysis which reveals theoretical insights into several sources of error incurred in the model-building process. Finally, we showcase a wide variety of applications, along with favorable comparisons with ground truth solutions.
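Many organizations use compute clusters equipped with accelerators such as GPUs and TPUs to train deep learning models in a distributed fashion. Training is resource-intensive, consuming significant compute, memory, and network resources. Many prior works explore how to reduce the resource footprint of training without impacting quality, but their focus on a subset of the bottlenecks (typically only the network) limits their ability to improve overall cluster utilization. In this work, we exploit the unique characteristics of deep learning workloads to propose Structured Partial Backpropagation (SPB), a technique that systematically controls the amount of backpropagation performed at individual workers in distributed training. This simultaneously reduces network bandwidth, compute utilization, and memory footprint while preserving model quality. To leverage the benefits of SPB efficiently at the cluster level, we introduce JigSaw, an SPB-aware scheduler that schedules Deep Learning Training (DLT) jobs at the iteration level. We find that JigSaw can improve large-scale cluster efficiency by up to 28%.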
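Climate change has become one of the biggest global problems, increasingly compromising the Earth's habitability. Recent developments such as the extraordinary heat waves in California and Canada and the devastating floods in Germany point to the role of climate change in the ever-increasing frequency of extreme weather. Numerical models of weather and climate have seen tremendous improvements over the last five decades, yet stringent limitations remain to be overcome. Spatially and temporally localized forecasting is the need of the hour for effective adaptation measures that minimize the loss of life and property. Artificial Intelligence-based methods are showing promising results in improving predictions, but they are still limited by the availability of the hardware and software required to process data at the scale of the planet Earth. Quantum computing is an emerging paradigm that has found potential applicability in several fields. In this opinion piece, we argue that new developments in Artificial Intelligence algorithms designed for quantum computers, also known as Quantum Artificial Intelligence (QAI), may provide the key breakthroughs needed to further the science of climate change. The resulting improvements in weather and climate forecasts are expected to cascade into numerous societal benefits.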
Iterative text revision improves text quality by fixing grammatical errors, rephrasing for better readability or contextual appropriateness, or reorganizing sentence structures throughout a document. Most recent research has focused on understanding and classifying different types of edits in the iterative revision process from human-written text instead of building accurate and robust systems for iterative text revision. In this work, we aim to build an end-to-end text revision system that can iteratively generate helpful edits by explicitly detecting editable spans (where-to-edit) with their corresponding edit intents and then instructing a revision model to revise the detected edit spans. Leveraging datasets from other related text editing NLP tasks, combined with the specification of editable spans, leads our system to more accurately model the process of iterative text refinement, as evidenced by empirical results and human evaluations. Our system significantly outperforms previous baselines on our text revision tasks and other standard text revision tasks, including grammatical error correction, text simplification, sentence fusion, and style transfer. Through extensive qualitative and quantitative analysis, we make vital connections between edit intentions and writing quality, and better computational modeling of iterative text revisions.
To track the 3D locations and trajectories of the other traffic participants at any given time, modern autonomous vehicles are equipped with multiple cameras that cover the vehicle's full surroundings. Yet, camera-based 3D object tracking methods prioritize optimizing the single-camera setup and resort to post-hoc fusion in a multi-camera setup. In this paper, we propose a method for panoramic 3D object tracking, called CC-3DT, that associates and models object trajectories both temporally and across views, and improves the overall tracking consistency. In particular, our method fuses 3D detections from multiple cameras before association, reducing identity switches significantly and improving motion modeling. Our experiments on large-scale driving datasets show that fusion before association leads to a large margin of improvement over post-hoc fusion. We set a new state-of-the-art with 12.6% improvement in average multi-object tracking accuracy (AMOTA) among all camera-based methods on the competitive NuScenes 3D tracking benchmark, outperforming previously published methods by 6.5% in AMOTA with the same 3D detector.
This is a continuation of our recent paper in which we developed the theory of sequential parametrized motion planning. A sequential parametrized motion planning algorithm produces a motion of the system which is required to visit a prescribed sequence of states, in a certain order, at specified moments of time. In the previous publication we analysed the sequential parametrized topological complexity of the Fadell-Neuwirth fibration, which is relevant to the problem of moving multiple robots while avoiding collisions with other robots and with obstacles in Euclidean space. In the preceding paper we also found the sequential parametrized topological complexity of the Fadell-Neuwirth bundle for the case of the Euclidean space $\Bbb R^d$ of odd dimension as well as for the case $d=2$. In the present paper we give the complete answer for an arbitrary even $d\ge 2$. Moreover, we present an explicit motion planning algorithm for controlling multiple robots in $\Bbb R^d$ having the minimal possible topological complexity; this algorithm is applicable to any number $n$ of robots and any number $m\ge 2$ of obstacles.
Deep Reinforcement Learning (DRL) has the potential to be used for synthesizing feedback controllers (agents) for various complex systems with unknown dynamics. These systems are expected to satisfy diverse safety and liveness properties best captured using temporal logic. In RL, the reward function plays a crucial role in specifying the desired behaviour of these agents. However, the problem of designing the reward function for an RL agent to satisfy complex temporal logic specifications has received limited attention in the literature. To address this, we provide a systematic way of generating rewards in real-time by using the quantitative semantics of Signal Temporal Logic (STL), a widely used temporal logic to specify the behaviour of cyber-physical systems. We propose a new quantitative semantics for STL having several desirable properties, making it suitable for reward generation. We evaluate our STL-based reinforcement learning mechanism on several complex continuous control benchmarks and compare our STL semantics with those available in the literature in terms of their efficacy in synthesizing the controller agent. Experimental results establish our new semantics to be the most suitable for synthesizing feedback controllers for complex continuous dynamical systems through reinforcement learning.
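As a rough illustration of how quantitative semantics can turn a temporal specification into a scalar reward, the sketch below uses the standard min/max robustness semantics for two simple STL formulas over a sampled signal. It is not the new semantics proposed in the abstract, and the function names are hypothetical. A positive value means the formula is satisfied, a negative value means it is violated, and the magnitude indicates the margin.

```python
def robustness_always_gt(signal, c):
    """Robustness of 'always (x > c)': the worst-case margin over the trace."""
    return min(x - c for x in signal)


def robustness_eventually_gt(signal, c):
    """Robustness of 'eventually (x > c)': the best-case margin over the trace."""
    return max(x - c for x in signal)


if __name__ == "__main__":
    trace = [1.0, 2.0, 0.5]  # sampled values of some state variable x
    print(robustness_always_gt(trace, 1.0))      # negative: x dips below 1.0
    print(robustness_eventually_gt(trace, 1.0))  # positive: x exceeds 1.0 at some step
```

An RL setup could, under these assumptions, use such a robustness value computed over a recent window of the trajectory as the reward signal at each step.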
State-of-the-art object detectors are fast and accurate, but they require a large amount of well-annotated training data to obtain good performance. However, obtaining a large amount of training annotations specific to a particular task, i.e., fine-grained annotations, is costly in practice. In contrast, obtaining common-sense relationships from text, e.g., "a table-lamp is a lamp that sits on top of a table", is much easier. Additionally, common-sense relationships like "on-top-of" are easy to annotate in a task-agnostic fashion. In this paper, we propose a probabilistic model that uses such relational knowledge to transform an off-the-shelf detector of coarse object categories (e.g., "table", "lamp") into a detector of fine-grained categories (e.g., "table-lamp"). We demonstrate that our method, RelDetect, achieves performance competitive with finetuning-based state-of-the-art object detector baselines when only an extremely small amount of fine-grained annotations is available ($0.2\%$ of the entire dataset). We also demonstrate that RelDetect is able to utilize the inherent transferability of relationship information to obtain better performance ($+5$ mAP points) than the above baselines on an unseen dataset (zero-shot transfer). In summary, we demonstrate the power of using relationships for object detection on datasets where fine-grained object categories can be linked to coarse-grained categories via suitable relationships.