Machine learning (ML) techniques are enjoying rapidly increasing adoption. However, designing and implementing the systems that support ML models in real-world deployments remains a significant obstacle, in large part due to the radically different development and deployment profile of modern ML methods, and the range of practical concerns that come with broader adoption. We propose to foster a new systems machine learning research community at the intersection of the traditional systems and ML communities, focused on topics such as hardware systems for ML, software systems for ML, and ML optimized for metrics beyond predictive accuracy. To do this, we describe a new conference, SysML, that explicitly targets research at the intersection of systems and machine learning with a program committee split evenly between experts in systems and ML, and an explicit focus on topics at the intersection of the two.
Labeling training data is one of the most costly bottlenecks in developing or modifying machine learning-based applications. We survey how resources from across an organization can be used as weak supervision sources for three classification tasks at Google, in order to bring development time and cost down by an order of magnitude. We build on the Snorkel framework, extending it as a new system, Snorkel DryBell, which integrates with Google's distributed production systems and enables engineers to develop and execute weak supervision strategies over millions of examples in less than thirty minutes. We find that Snorkel DryBell creates classifiers of comparable quality to ones trained on up to tens of thousands of hand-labeled examples, in part by leveraging organizational resources that are not servable in production, which improves the performance of the weakly supervised classifiers by an average of 52%.
As machine learning models continue to increase in complexity, collecting large hand-labeled training sets has become one of the biggest roadblocks in practice. Instead, weaker forms of supervision that provide noisier but cheaper labels are often used. However, these weak supervision sources have diverse and unknown accuracies, may output correlated labels, and may label different tasks or apply at different levels of granularity. We propose a framework for integrating and modeling such weak supervision sources by viewing them as labeling different related sub-tasks of a problem, which we refer to as the multi-task weak supervision setting. We show that by solving a matrix completion-style problem, we can recover the accuracies of these multi-task sources given their dependency structure, but without any labeled data, leading to higher-quality supervision for training an end model. Theoretically, we prove that the generalization error of models trained with this approach improves with the number of unlabeled data points, and we characterize the scaling with respect to the task and dependency structures. On three fine-grained classification problems, we show that our approach leads to average gains of 20.2 accuracy points over a traditional supervised approach, 6.8 points over a majority vote baseline, and 4.1 points over a previously proposed weak supervision method that models tasks separately.
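As a toy illustration of why modeling source accuracies matters, the sketch below combines three weak sources with a log-odds accuracy-weighted vote. The label matrix and the accuracy values are fabricated for illustration; the method described above instead recovers source accuracies from unlabeled data via a matrix completion-style problem, which this sketch does not implement.

```python
import numpy as np

# Toy label matrix: 4 unlabeled examples x 3 weak sources.
# Labels are in {+1, -1}; 0 means the source abstains.
L = np.array([[ 1,  1, -1],
              [-1, -1,  0],
              [ 1,  0,  1],
              [-1,  1, -1]])

# Assumed per-source accuracies (in the real setting these are unknown
# and must be estimated without labeled data).
acc = np.array([0.9, 0.6, 0.7])

# Weight each source by its log-odds of being correct, so accurate
# sources dominate the combined vote.
w = np.log(acc / (1 - acc))

scores = L @ w            # accuracy-weighted vote per example
labels = np.sign(scores)  # probabilistic label estimate, thresholded
print(labels)
```

Compared with an unweighted majority vote, the weighting lets a single high-accuracy source overrule two noisy ones, which is the intuition behind estimating accuracies before aggregating.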
Data augmentation, a technique in which a training set is expanded with class-preserving transformations, is ubiquitous in modern machine learning pipelines. In this paper, we seek to establish a theoretical framework for understanding data augmentation. We approach this from two directions: First, we provide a general model of augmentation as a Markov process, and show that kernels appear naturally with respect to this model, even when we do not employ kernel classification. Next, we analyze more directly the effect of augmentation on kernel classifiers, showing that data augmentation can be approximated by first-order feature averaging and second-order variance regularization components. These frameworks both serve to illustrate the ways in which data augmentation affects the downstream learning model, and the resulting analyses provide novel connections between prior work in invariant kernels, tangent propagation, and robust optimization. Finally, we provide several proof-of-concept applications showing that our theory can be useful for accelerating machine learning workflows, such as reducing the amount of computation needed to train using augmented data, and predicting the utility of a transformation prior to training.
Large labeled training sets are the critical building blocks of supervised learning methods and are key enablers of deep learning techniques. For some applications, creating labeled training sets is the most time-consuming and expensive part of applying machine learning. We therefore propose a paradigm for the programmatic creation of training sets called data programming in which users express weak supervision strategies or domain heuristics as labeling functions, which are programs that label subsets of the data, but that are noisy and may conflict. We show that by explicitly representing this training set labeling process as a generative model, we can "denoise" the generated training set, and establish theoretically that we can recover the parameters of these generative models in a handful of settings. We then show how to modify a discriminative loss function to make it noise-aware, and demonstrate our method over a range of discriminative models including logistic regression and LSTMs. Experimentally, on the 2014 TAC-KBP Slot Filling challenge, we show that data programming would have led to a new winning score, and also show that applying data programming to an LSTM model leads to a TAC-KBP score almost 6 F1 points over a state-of-the-art LSTM baseline (and into second place in the competition). Additionally, in initial user studies we observed that data programming may be an easier way for non-experts to create machine learning models when training data is limited or unavailable.
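A minimal sketch of the labeling-function idea follows. All function names and heuristics here are hypothetical, and a simple majority vote stands in for the generative denoising model that data programming actually fits over the labeling process.

```python
import numpy as np

# Hypothetical labeling functions for a toy binary relation task:
# each returns a label in {0, 1} or -1 to abstain.
def lf_contains_born(x):
    return 1 if "born in" in x else -1

def lf_contains_ceo(x):
    return 0 if "CEO of" in x else -1

def lf_short(x):
    return 0 if len(x.split()) < 4 else -1

labeling_functions = [lf_contains_born, lf_contains_ceo, lf_short]

def label_matrix(examples):
    """Apply every labeling function to every example."""
    return np.array([[lf(x) for lf in labeling_functions]
                     for x in examples])

def majority_vote(L):
    """Baseline denoising: majority vote over non-abstaining LFs.
    Data programming instead models LF accuracies generatively."""
    labels = []
    for row in L:
        votes = row[row != -1]
        labels.append(int(np.round(votes.mean())) if len(votes) else -1)
    return labels

L = label_matrix(["Alan Turing was born in London",
                  "Tim Cook is CEO of Apple"])
print(majority_vote(L))
```

The noise-aware step of the paradigm would then train a discriminative model against probabilistic labels rather than these hard votes.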
In this study, we propose the Affine Variational Autoencoder (AVAE), a variant of the Variational Autoencoder (VAE) designed to improve robustness by overcoming the inability of VAEs to generalize to distribution shifts in the form of affine perturbations. By optimizing an affine transformation of the input to maximize the ELBO, the proposed AVAE maps inputs back toward the training distribution without requiring increased model complexity to model the full distribution of affine transformations. In addition, we introduce a training procedure that creates an efficient model by learning only a subset of the training distribution and using the AVAE to improve generalization and robustness to distribution shift at test time. Experiments on affine perturbations demonstrate that the proposed AVAE significantly improves generalization and robustness to distribution shift in the form of affine perturbations without an increase in model complexity.
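The core idea (search over affine transformations of the input and keep the one the model scores highest) can be sketched on a 1-D toy problem. Everything below is illustrative: `elbo` is a stand-in log-density for a model trained on N(0, 1) data, the grid search replaces gradient-based optimization of the transform, and none of the names come from the paper.

```python
import numpy as np

def elbo(x):
    """Toy surrogate ELBO: log-density of a standard normal,
    averaged over the input, up to a constant. Highest for inputs
    that look like the (assumed) training distribution N(0, 1)."""
    return float(np.mean(-0.5 * x ** 2))

def avae_infer(x, scales, shifts):
    """Search affine transforms (scale a, shift b) of the input and
    return the transformed input that maximizes the surrogate ELBO."""
    a, b = max(((a, b) for a in scales for b in shifts),
               key=lambda ab: elbo(ab[0] * x + ab[1]))
    return a * x + b, (a, b)

# Input hit by an affine shift away from the training distribution.
x_shifted = np.array([4.0, 5.0, 6.0])
x_aligned, (a, b) = avae_infer(x_shifted,
                               scales=[0.5, 1.0, 2.0],
                               shifts=[-5.0, 0.0, 5.0])
print(a, b)
```

The selected transform undoes the perturbation, so the model only ever has to fit the canonical, untransformed distribution.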
This paper presents the algorithms and system architecture of an autonomous racecar. The introduced vehicle is powered by a software stack designed for robustness, reliability, and extensibility. In order to autonomously race around a previously unknown track, the proposed solution combines state-of-the-art techniques from different fields of robotics. Specifically, perception, estimation, and control are incorporated into one high-performance autonomous racecar. This complex robotic system, developed by AMZ Driverless and ETH Zurich, finished first overall at each competition we attended: Formula Student Germany 2017, Formula Student Italy 2018, and Formula Student Germany 2018. We discuss the findings and learnings from these competitions and present an experimental evaluation of each module of our solution.
We propose the use of a stochastic variational frame prediction deep neural network with a learned prior distribution, trained on two-dimensional rain radar reflectivity maps, for precipitation nowcasting with lead times of up to 2.5 hours. We present a comparison with a standard convolutional LSTM network and evaluate the evolution of the structural similarity index for both approaches. Case studies show that the new approach can produce meaningful predictions without excessive blurring over the time horizon of interest.
Maritime collisions involving multiple ships are considered rare, but in 2017 several United States Navy vessels were involved in fatal at-sea collisions that resulted in the deaths of seventeen American service members. The experimentation presented in this paper is a direct response to these incidents. We propose a shipboard collision-at-sea avoidance system based on video image processing that will help ensure the safe stationing and navigation of maritime vessels. Our system leverages a convolutional neural network trained on synthetic maritime imagery in order to detect nearby vessels within a scene, perform heading analysis of detected vessels, and provide an alert in the presence of an inbound vessel. Additionally, we present the Navigational Hazards - Synthetic (NAVHAZ-Synthetic) dataset. This dataset comprises one million annotated images of ten vessel classes observed from virtual shipboard cameras, as well as a human "Topside Lookout" perspective. NAVHAZ-Synthetic includes imagery displaying varying sea states, lighting conditions, and optical degradations such as fog, sea spray, and salt accumulation. We present our results on the use of synthetic imagery in a computer vision-based collision-at-sea warning system with promising performance.
In this paper, we address the hyperspectral image (HSI) classification task with a generative adversarial network and conditional random field (GAN-CRF)-based framework, which integrates semi-supervised deep learning with a probabilistic graphical model, and we make three contributions. First, we design four types of convolutional and transposed convolutional layers that consider the characteristics of HSIs to help extract discriminative features from limited numbers of labeled HSI samples. Second, we construct semi-supervised GANs to alleviate the shortage of training samples by adding labels to them and implicitly reconstructing the real HSI data distribution through adversarial training. Third, we build dense conditional random fields (CRFs) on top of random variables that are initialized to the softmax predictions of the trained GANs and conditioned on the HSIs to refine the classification maps. This semi-supervised framework leverages the merits of discriminative and generative models through a game-theoretical approach. Moreover, even though we used very small numbers of labeled training HSI samples from two of the most challenging and extensively studied datasets, the experimental results demonstrate that the spectral-spatial GAN-CRF (SS-GAN-CRF) models achieve top-ranking accuracy for semi-supervised HSI classification.