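Batch normalization (BN) is widely used in deep learning to normalize intermediate activations. Deep networks are notorious for increasing training complexity: they require careful weight initialization, lower learning rates, and so on. BN addresses these issues by normalizing the inputs of the activations to zero mean and unit standard deviation. Making batch normalization part of the training process significantly accelerates the training of very deep networks. A new line of research examines the exact theoretical explanation behind the success of BN. Most of these theoretical insights attempt to explain the benefits of BN by attributing them to its effect on optimization, weight-scale invariance, and regularization. Despite BN's undeniable success in accelerating generalization, an analytical gap remains in relating the effect of BN to the regularization parameter. This paper aims to bring out the data-dependent auto-tuning of the regularization parameter by BN, with analytical proofs. We pose BN as a constrained optimization over the non-BN weights, through which we demonstrate its data-statistics-dependent auto-tuning of the regularization parameter. We also give an analytical proof of its behavior under a noisy-input scenario, which reveals the signal-versus-noise tuning of the regularization parameter. We substantiate our claims with empirical results from experiments on the MNIST dataset.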
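The normalization step described in this abstract can be illustrated with a minimal sketch. It assumes a batch stored as a list of rows, per-feature statistics computed over the batch, and a small stabilizing constant `eps`; the function name `batch_norm` and the sample values are hypothetical, and the learnable scale and shift used in practice are omitted. It is not the constrained-optimization formulation the paper develops.

```python
import math

def batch_norm(batch, eps=1e-5):
    """Normalize each feature (column) of `batch` to roughly zero mean and unit std.

    `batch` is a list of rows; `eps` is an assumed small constant that
    keeps the division stable when a feature has near-zero variance.
    """
    n = len(batch)
    d = len(batch[0])
    # Per-feature mean and (biased) variance computed over the batch.
    means = [sum(row[j] for row in batch) / n for j in range(d)]
    variances = [sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(d)]
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(d)]
        for row in batch
    ]

# Hypothetical activations: three samples, two features on very different scales.
activations = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = batch_norm(activations)
```

After normalization both features have approximately the same scale, which is the property the abstract credits with easing initialization and learning-rate choices.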
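Analog computing is attractive compared to digital computing because it can achieve higher computational density and higher energy efficiency. However, unlike digital circuits, conventional analog computing circuits cannot easily be mapped across different process nodes, owing to differences in transistor biasing, temperature variations, and limited dynamic range. In this work, we generalize the previously reported margin-propagation-based analog computing framework to design novel shape-based analog computing (S-AC) circuits that can easily be cross-mapped across different process nodes. Like digital designs, S-AC designs can also be scaled for precision, speed, and power. As a proof of concept, we show several examples of S-AC circuits that implement mathematical functions commonly used in machine learning (ML) architectures. Using circuit simulations, we demonstrate that the circuit input/output characteristics remain robust when mapped from a planar CMOS 180nm process to a FinFET 7nm process. Likewise, using benchmark datasets, we demonstrate that the classification accuracy of an S-AC based neural network remains robust when mapped across the two processes and to variations in temperature.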
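Bias-scalable analog computing is attractive for implementing machine learning (ML) processors with distinct power-performance specifications. For example, ML implementations for server workloads focus on computational throughput and faster training, whereas ML implementations for edge devices focus on energy-efficient inference. In this paper, we demonstrate the implementation of bias-scalable analog computing circuits using a generalization of the margin propagation (MP) principle called shape-based analog computing (S-AC). The resulting S-AC core integrates several near-memory compute elements, including: (a) non-linear activation functions; (b) inner-product compute circuits; and (c) a mixed-signal compressive memory. Using measured results from prototypes fabricated in a 180nm CMOS process, we demonstrate that the performance of the computing modules remains robust to transistor biasing and temperature variations. We also demonstrate bias scalability on a simple ML regression task.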
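We present a novel framework for designing multiplierless kernel machines that can be used on resource-constrained platforms such as intelligent edge devices. The framework uses a piecewise linear (PWL) approximation based on the margin propagation (MP) technique and requires only addition/subtraction, shift, comparison, and register underflow/overflow operations. We propose a hardware-based MP inference and online training algorithm optimized for a Field Programmable Gate Array (FPGA) platform. Our FPGA implementation eliminates the need for DSP units and reduces the number of LUTs. By reusing the same hardware for inference and training, we show that the platform can overcome the classification errors and local minima that result from the MP approximation. Implementing the proposed multiplierless MP-kernel machine on an FPGA results in an estimated energy consumption of 13.4 pJ and a power consumption of 107 mW, with ~9k LUTs and ~9k FFs, for a kernel of size 256 x 32, making it superior in terms of power, performance, and area to other comparable implementations.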
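Plant diseases are the primary cause of crop losses globally, with an impact on the world economy. To address these issues, smart agriculture solutions are evolving that combine the Internet of Things and machine learning for early disease detection and control. Many such systems use vision-based machine learning methods for real-time disease detection and diagnosis. With the advancement of deep learning techniques, new methods have emerged that employ convolutional neural networks for plant disease detection and identification. Another trend in vision-based deep learning is the use of vision transformers, which have proved to be powerful models for classification and other problems. However, vision transformers have rarely been investigated for plant pathology applications. In this study, a vision-transformer-enabled convolutional neural network model is proposed for plant disease identification. The proposed model combines the capabilities of traditional convolutional neural networks with those of vision transformers to efficiently identify a large number of plant diseases across several crops. The proposed model has a lightweight structure with only 0.8 million trainable parameters, which makes it suitable for IoT-based smart agriculture services. The performance of the proposed model, PlantXvit, is evaluated on five publicly available datasets. The proposed PlantXvit network performs better than five state-of-the-art methods on all five datasets. Even under challenging background conditions, the average accuracy for recognizing plant diseases exceeds 93.55%, 92.59%, and 98.33% on the Apple, Maize, and Rice datasets, respectively. The explainability of the proposed model is evaluated using gradient-weighted class activation mapping and Local Interpretable Model-agnostic Explanations.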
We introduce Argoverse 2 (AV2), a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras and two stereo cameras, in addition to lidar point clouds and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with predicting the future motion of "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry, sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.
The rise in data has led to the need for dimension reduction techniques, especially for non-scalar variables such as those arising in time series, natural language processing, and computer vision. In this paper, we specifically investigate dimension reduction for time series through functional data analysis. Current methods for dimension reduction in functional data are functional principal component analysis and functional autoencoders, which are limited to linear mappings or to scalar representations of the time series; this is inefficient, because in real data applications the nature of the data is much more complex. We propose a non-linear function-on-function approach consisting of a functional encoder and a functional decoder. It uses continuous hidden layers made up of continuous neurons to learn the structure inherent in functional data, which addresses these limitations of the existing approaches. Our approach gives a low-dimensional latent representation by reducing both the number of functional features and the number of timepoints at which the functions are observed. The effectiveness of the proposed model is demonstrated through multiple simulations and real data examples.
Object movement identification is one of the most researched problems in the field of computer vision. In this task, we try to classify each pixel as foreground or background. Even though numerous traditional machine learning and deep learning methods already exist for this problem, most of them share two major issues: the need for large amounts of ground truth data and inferior performance on unseen videos. Since every pixel of every frame has to be labeled, acquiring large amounts of data for these techniques is rather expensive. Recently, Zhao et al. [1] proposed a one-of-a-kind Arithmetic Distribution Neural Network (ADNN) for universal background subtraction, which utilizes probability information from the histogram of temporal pixels and achieves promising results. Building on this work, we developed an intelligent video surveillance system that uses the ADNN architecture for motion detection, trims the video to only the parts containing motion, and performs anomaly detection on the trimmed video.
Machine translation automatically translates texts between different natural languages, and Neural Machine Translation (NMT) has gained attention for its rational context analysis and fluent, accurate translations. However, processing low-resource languages, which lack relevant training resources such as supervised data, remains a challenge for Natural Language Processing (NLP). We incorporated a technique known as Active Learning into the NMT toolkit Joey NMT to reach sufficient accuracy and robust predictions for low-resource language translation. With active learning, a semi-supervised machine learning strategy, the training algorithm uses selected query techniques to determine which unlabeled data would be most beneficial to label. We implemented two model-driven acquisition functions for selecting the samples to be validated. This work uses four transformer-based NMT systems for translating English to Hindi: a baseline model (BM), a fully trained model (FTM), an active learning least-confidence-based model (ALLCM), and an active learning margin-sampling-based model (ALMSM). The Bilingual Evaluation Understudy (BLEU) metric is used to evaluate the systems. The BLEU scores of the BM, FTM, ALLCM, and ALMSM systems are 16.26, 22.56, 24.54, and 24.20, respectively. The findings in this paper demonstrate that active learning techniques help the model converge early and improve the overall quality of the translation system.
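The two acquisition functions named in this abstract, least confidence and margin sampling, can be sketched in their generic textbook form. The sketch assumes each unlabeled sample comes with a single probability distribution over candidate outputs; how the paper aggregates token-level probabilities over a translated sentence is not stated here, so the scoring below is an illustrative assumption, and `select_queries`, the pool, and its values are hypothetical.

```python
def least_confidence(probs):
    """Uncertainty score: 1 minus the probability of the most likely output."""
    return 1.0 - max(probs)

def margin(probs):
    """Gap between the two most likely outputs; a small gap means high uncertainty."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second

def select_queries(pool, k, strategy):
    """Return the ids of the k samples the model is least sure about.

    `pool` maps a sample id to the model's predicted distribution
    (an assumed, simplified stand-in for sentence-level NMT scores).
    """
    if strategy == "least_confidence":
        ranked = sorted(pool, key=lambda i: least_confidence(pool[i]), reverse=True)
    elif strategy == "margin":
        ranked = sorted(pool, key=lambda i: margin(pool[i]))
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return ranked[:k]

# Hypothetical pool of three unlabeled samples.
pool = {
    "a": [0.90, 0.05, 0.05],
    "b": [0.40, 0.35, 0.25],
    "c": [0.50, 0.49, 0.01],
}
by_confidence = select_queries(pool, 1, "least_confidence")
by_margin = select_queries(pool, 1, "margin")
```

On this pool the two strategies disagree: least confidence picks "b", whose top probability is lowest, while margin sampling picks "c", whose top two candidates are nearly tied. That difference is one reason two systems trained with the two strategies can reach different scores.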
We study the problem of planning under model uncertainty in an online meta-reinforcement learning (RL) setting where an agent is presented with a sequence of related tasks with limited interactions per task. The agent can use its experience in each task and across tasks to estimate both the transition model and the distribution over tasks. We propose an algorithm to meta-learn the underlying structure across tasks, utilize it to plan in each task, and upper-bound the regret of the planning loss. Our bound suggests that the average regret over tasks decreases as the number of tasks increases and as the tasks become more similar. In the classical single-task setting, it is known that the planning horizon should depend on the accuracy of the estimated model, that is, on the number of samples within the task. We generalize this finding to meta-RL and study how the planning horizon depends on the number of tasks. Based on our theoretical findings, we derive heuristics for selecting slowly increasing discount factors, and we validate their significance empirically.