Many state-of-the-art deep learning models for computer vision tasks are based on the transformer architecture. Such models can be computationally expensive and are typically statically set to meet the deployment scenario. However, in real-time applications, the resources available for every inference can vary considerably and be smaller than what state-of-the-art models use. We can use dynamic models to adapt the model execution to meet real-time application resource constraints. While prior dynamic work has primarily minimized resource utilization for less complex input images while maintaining accuracy and focused on CNNs and early transformer models such as BERT, we adapt vision transformers to meet system dynamic resource constraints, independent of the input image. We find that unlike early transformer models, recent state-of-the-art vision transformers heavily rely on convolution layers. We show that pretrained models are fairly resilient to skipping computation in the convolution and self-attention layers, enabling us to create a low-overhead system for dynamic real-time inference without additional training. Finally, we create a optimized accelerator for these dynamic vision transformers in a 5nm technology. The PE array occupies 2.26mm$^2$ and is 17 times faster than a NVIDIA TITAN V GPU for state-of-the-art transformer-based models for semantic segmentation.
translated by 谷歌翻译
State-of-the-art deep neural networks (DNNs) have hundreds of millions of connections and are both computationally and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources and power budgets. While custom hardware helps the computation, fetching weights from DRAM is two orders of magnitude more expensive than ALU operations, and dominates the required power.Previously proposed 'Deep Compression' makes it possible to fit large DNNs (AlexNet and VGGNet) fully in on-chip SRAM. This compression is achieved by pruning the redundant connections and having multiple connections share the same weight. We propose an energy efficient inference engine (EIE) that performs inference on this compressed network model and accelerates the resulting sparse matrix-vector multiplication with weight sharing. Going from DRAM to SRAM gives EIE 120× energy saving; Exploiting sparsity saves 10×; Weight sharing gives 8×; Skipping zero activations from ReLU saves another 3×. Evaluated on nine DNN benchmarks, EIE is 189× and 13× faster when compared to CPU and GPU implementations of the same DNN without compression. EIE has a processing power of 102 GOPS/s working directly on a compressed network, corresponding to 3 TOPS/s on an uncompressed network, and processes FC layers of AlexNet at 1.88×10 4 frames/sec with a power dissipation of only 600mW. It is 24,000× and 3,400× more energy efficient than a CPU and GPU respectively. Compared with DaDianNao, EIE has 2.9×, 19× and 3× better throughput, energy efficiency and area efficiency.
translated by 谷歌翻译
北极河口环境中温室气体排放的估计是由于在这些条件下进行现场观测的后勤易于进行现场溶解气体测量所主导的。然而,沿海北极环境中的最新证据表明,溶解的甲烷(CH4)和二氧化碳(CO2)在季节性上有很大的可变,并且在春季新鲜物体期间至少发生了一个重要的气通气事件。北极是否充当温室排放的来源或水槽对建模气候变化及其反馈机制具有重要意义。为了使溶解气体在典型的不足采样条件下进行更高分辨率的时空测量,可以使用远程操作的车辆(ROV)来在春季新鲜材料之前和期间提取冰以下的连续水样品。在这里,我们提出了一个适合于服用冰质地球化学观测值的尺寸,重量和功率约束(交换)水下车辆(UV)和新型的地球化学采样系统,并在北极河口研究的现场 - 肛门设置中演示了所提出的系统。
translated by 谷歌翻译
我们提出了一种惩罚的非参数方法,以使用整流器二次单元(REEND)激活了深层神经网络,以估计不可分割的模型中的分位数回归过程(QRP),并引入了新的惩罚函数,以实施对瓦解回归曲线的非交叉。我们为估计的QRP建立了非反应过量的风险界限,并在轻度平滑度和规律性条件下得出估计的QRP的平均综合平方误差。为了建立这些非反应风险和估计误差范围,我们还使用$ s> 0 $及其衍生物及其衍生物使用所需的激活的神经网络开发了一个新的错误,用于近似$ c^s $平滑功能。这是必需网络的新近似结果,并且具有独立的兴趣,并且可能在其他问题中有用。我们的数值实验表明,所提出的方法具有竞争性或胜过两种现有方法,包括使用再现核和随机森林的方法,用于非参数分位数回归。
translated by 谷歌翻译
部署的AI系统通常不起作用。它们可以随意地构造,不加选择地部署并欺骗性地促进。然而,尽管有这一现实,但学者,新闻界和决策者对功能的关注很少。这导致技术和政策解决方案的重点是“道德”或价值一致的部署,通常会跳过先前的问题,即给定系统功能或完全提供任何好处。描述各种功能失败的危害,我们分析一组案例研究,以创建已知的AI功能问题的分类法。然后,我们指出的是政策和组织响应,这些策略和组织响应经常被忽略,并在功能成为重点后变得更容易获得。我们认为功能是一项有意义的AI政策挑战,是保护受影响社区免受算法伤害的必要第一步。
translated by 谷歌翻译
Research on automated essay scoring has become increasing important because it serves as a method for evaluating students' written-responses at scale. Scalable methods for scoring written responses are needed as students migrate to online learning environments resulting in the need to evaluate large numbers of written-response assessments. The purpose of this study is to describe and evaluate three active learning methods than can be used to minimize the number of essays that must be scored by human raters while still providing the data needed to train a modern automated essay scoring system. The three active learning methods are the uncertainty-based, the topological-based, and the hybrid method. These three methods were used to select essays included as part of the Automated Student Assessment Prize competition that were then classified using a scoring model that was training with the bidirectional encoder representations from transformer language model. All three active learning methods produced strong results, with the topological-based method producing the most efficient classification. Growth rate accuracy was also evaluated. The active learning methods produced different levels of efficiency under different sample size allocations but, overall, all three methods were highly efficient and produced classifications that were similar to one another.
translated by 谷歌翻译
This paper presents a novel framework for planning in unknown and occluded urban spaces. We specifically focus on turns and intersections where occlusions significantly impact navigability. Our approach uses an inpainting model to fill in a sparse, occluded, semantic lidar point cloud and plans dynamically feasible paths for a vehicle to traverse through the open and inpainted spaces. We demonstrate our approach using a car's lidar data with real-time occlusions, and show that by inpainting occluded areas, we can plan longer paths, with more turn options compared to without inpainting; in addition, our approach more closely follows paths derived from a planner with no occlusions (called the ground truth) compared to other state of the art approaches.
translated by 谷歌翻译
Feature acquisition algorithms address the problem of acquiring informative features while balancing the costs of acquisition to improve the learning performances of ML models. Previous approaches have focused on calculating the expected utility values of features to determine the acquisition sequences. Other approaches formulated the problem as a Markov Decision Process (MDP) and applied reinforcement learning based algorithms. In comparison to previous approaches, we focus on 1) formulating the feature acquisition problem as a MDP and applying Monte Carlo Tree Search, 2) calculating the intermediary rewards for each acquisition step based on model improvements and acquisition costs and 3) simultaneously optimizing model improvement and acquisition costs with multi-objective Monte Carlo Tree Search. With Proximal Policy Optimization and Deep Q-Network algorithms as benchmark, we show the effectiveness of our proposed approach with experimental study.
translated by 谷歌翻译
The celebrated proverb that "speech is silver, silence is golden" has a long multinational history and multiple specific meanings. In written texts punctuation can in fact be considered one of its manifestations. Indeed, the virtue of effectively speaking and writing involves - often decisively - the capacity to apply the properly placed breaks. In the present study, based on a large corpus of world-famous and representative literary texts in seven major Western languages, it is shown that the distribution of intervals between consecutive punctuation marks in almost all texts can universally be characterised by only two parameters of the discrete Weibull distribution which can be given an intuitive interpretation in terms of the so-called hazard function. The values of these two parameters tend to be language-specific, however, and even appear to navigate translations. The properties of the computed hazard functions indicate that among the studied languages, English turns out to be the least constrained by the necessity to place a consecutive punctuation mark to partition a sequence of words. This may suggest that when compared to other studied languages, English is more flexible, in the sense of allowing longer uninterrupted sequences of words. Spanish reveals similar tendency to only a bit lesser extent.
translated by 谷歌翻译
This report summarizes the 3rd International Verification of Neural Networks Competition (VNN-COMP 2022), held as a part of the 5th Workshop on Formal Methods for ML-Enabled Autonomous Systems (FoMLAS), which was collocated with the 34th International Conference on Computer-Aided Verification (CAV). VNN-COMP is held annually to facilitate the fair and objective comparison of state-of-the-art neural network verification tools, encourage the standardization of tool interfaces, and bring together the neural network verification community. To this end, standardized formats for networks (ONNX) and specification (VNN-LIB) were defined, tools were evaluated on equal-cost hardware (using an automatic evaluation pipeline based on AWS instances), and tool parameters were chosen by the participants before the final test sets were made public. In the 2022 iteration, 11 teams participated on a diverse set of 12 scored benchmarks. This report summarizes the rules, benchmarks, participating tools, results, and lessons learned from this iteration of this competition.
translated by 谷歌翻译