Smart Paper Notes

ApproxTrain: Fast Simulation of Approximate Multipliers for DNN Training and Inference

Jing Gong, Hassaan Saadat, Hasindu Gamaarachchi, Haris Javaid, Xiaobo Sharon Hu, Sri Parameswaran

Category: Artificial Intelligence | Machine Learning

2022-09-09

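Edge training of deep neural networks (DNNs) is a desirable goal for continuous learning, but it is hindered by the enormous computational power that training requires. Hardware approximate multipliers have proven effective at improving resource efficiency in DNN inference accelerators; training with approximate multipliers, however, remains largely unexplored. Building resource-efficient accelerators whose approximate multipliers also support DNN training requires a thorough evaluation across different DNN architectures and different approximate multipliers. This paper presents ApproxTrain, an open-source framework that allows fast evaluation of DNN training and inference using simulated approximate multipliers. ApproxTrain is as user-friendly as TensorFlow (TF) and requires only a high-level description of the DNN architecture together with a C/C++ functional model of the approximate multiplier. We speed up simulation at the multiplier level with AMSim, a LUT-based approximate floating-point (FP) multiplier simulator that runs on GPU. ApproxTrain leverages CUDA and efficiently integrates AMSim into the TensorFlow library to overcome the absence of native hardware approximate multipliers in commercial GPUs. We use ApproxTrain to evaluate the convergence and accuracy of DNN training with approximate multipliers on small and large datasets (including ImageNet) using LeNet and ResNet architectures. The evaluations show similar convergence behavior and a negligible change in test accuracy compared with FP32 and bfloat16 multipliers. Compared with CPU-based approximate multiplier simulation in training and inference, GPU-accelerated ApproxTrain is more than 2500x faster. The original TensorFlow, which is built on the highly optimized closed-source cuDNN/cuBLAS libraries with native hardware multipliers, is only 8x faster than ApproxTrain.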
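To make the idea of a LUT-based approximate FP multiplier simulator concrete, here is a minimal Python sketch. It is not AMSim's actual design: the choice to truncate each float32 mantissa to `MANT_BITS` bits, the table layout, and the names `build_lut` and `approx_mul` are all assumptions made for illustration. The sketch precomputes every mantissa product once and then performs each multiplication as a table lookup plus sign and exponent arithmetic.

```python
import struct

MANT_BITS = 4  # assumption: keep only the top 4 mantissa bits of each operand


def f32_bits(x):
    """Return the IEEE 754 float32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b):
    """Return the float32 value encoded by the 32-bit pattern b."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def build_lut(mant_bits):
    """Precompute (carry, mantissa) for every pair of truncated mantissas."""
    size = 1 << mant_bits
    lut = [[(0, 0)] * size for _ in range(size)]
    for a in range(size):
        for b in range(size):
            # Significands 1.a and 1.b as integers scaled by 2^mant_bits;
            # their product lies in [2^(2m), 2^(2m+2)).
            prod = (size + a) * (size + b)
            carry = prod >> (2 * mant_bits + 1)  # 1 if the product is >= 2.0
            # Renormalize to 1.xxx and truncate back to mant_bits bits.
            mant = (prod >> (mant_bits + carry)) & (size - 1)
            lut[a][b] = (carry, mant)
    return lut


def approx_mul(x, y, lut, mant_bits=MANT_BITS):
    """Approximate x * y for float32-range inputs via a mantissa lookup."""
    bx, by = f32_bits(x), f32_bits(y)
    sign = (bx ^ by) & 0x80000000
    ex, ey = (bx >> 23) & 0xFF, (by >> 23) & 0xFF
    if ex in (0, 255) or ey in (0, 255):
        # Zeros, subnormals, infinities, NaNs: fall back to exact multiply.
        return x * y
    mx = (bx & 0x7FFFFF) >> (23 - mant_bits)
    my = (by & 0x7FFFFF) >> (23 - mant_bits)
    carry, mant = lut[mx][my]
    e = ex + ey - 127 + carry
    if e <= 0:
        return bits_f32(sign)  # underflow: flush to signed zero
    if e >= 255:
        return bits_f32(sign | 0x7F800000)  # overflow: signed infinity
    return bits_f32(sign | (e << 23) | (mant << (23 - mant_bits)))


LUT = build_lut(MANT_BITS)
```

With 4 mantissa bits the table has only 256 entries, and products of values that are exactly representable in 4 mantissa bits (such as 1.5 and 2.0) come out exact; other inputs incur a bounded relative error from truncation. A real simulator would derive the table from the functional model of the multiplier under study rather than from plain truncation.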

translated by Google Translate

Deep learning-based object detection is a powerful approach for detecting faulty insulators in power lines. This involves training an object detection model from scratch, or fine-tuning a model that is pre-trained on benchmark computer vision datasets. This approach works well with a large number of insulator images, but can result in unreliable models in the low-data regime. The current literature mainly focuses on detecting the presence or absence of insulator caps, which is a relatively easy detection task, and does not consider detection of finer faults such as flashed and broken disks. In this article, we formulate three object detection tasks for insulator and asset inspection from aerial images, focusing on incipient faults in disks. We curate a large reference dataset of insulator images that can be used to learn robust features for detecting healthy and faulty insulators. We study the advantage of using this dataset in the low target data regime by pre-training on the reference dataset followed by fine-tuning on the target dataset. The results suggest that object detection models can be used to detect faults in insulators at a much earlier, incipient stage, and that transfer learning adds value depending on the type of object detection model. We identify key factors that dictate performance in the low-data regime and outline potential approaches to improve the state of the art.


In recent years, Monte Carlo tree search (MCTS) has achieved widespread adoption within the game community. Its use in conjunction with deep reinforcement learning has produced success stories in many applications. While these approaches have been implemented in various games, from simple board games to more complicated video games such as StarCraft, the use of deep neural networks requires a substantial training period. In this work, we explore online adaptivity in MCTS without requiring pre-training. We present MCTS-TD, an adaptive MCTS algorithm improved with temporal difference learning. We demonstrate our new approach on the game miniXCOM, a simplified version of XCOM, a popular commercial franchise consisting of several turn-based tactical games, and show how adaptivity in MCTS-TD allows for improved performance against opponents.
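For readers unfamiliar with temporal difference learning, the following is a minimal sketch of a generic tabular TD(0) value update. It is not the MCTS-TD algorithm itself, whose details the abstract does not give; the function name `td0_update`, the dictionary-based value table, and the default `alpha` and `gamma` values are illustrative assumptions.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """One TD(0) step: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)).

    `values` is a plain dict mapping states to estimates; unseen states
    default to 0.0 (an assumption of this sketch).
    """
    v = values.get(state, 0.0)
    target = reward + gamma * values.get(next_state, 0.0)
    values[state] = v + alpha * (target - v)
    return values[state]
```

Each call nudges the estimate for `state` toward the one-step bootstrapped target, so estimates can adapt during play without a separate pre-training phase.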
