When does a machine learning model predict an individual's future, and when does it merely recite patterns from the individual's past? In this work, we propose a distinction between these two pathways of prediction, supported by theoretical, empirical, and normative arguments. At the center of our proposal is a family of simple and efficient statistical tests, called backward baselines, that show whether, and to what extent, a model recounts the past. Our statistical theory provides guidance for interpreting backward baselines, establishing equivalences between different baselines and familiar statistical concepts. Concretely, we audit a prediction system as a black box, given only background variables and the system's predictions. Empirically, we evaluate the framework on different prediction tasks derived from longitudinal panel surveys, demonstrating the ease and effectiveness of incorporating backward baselines into the practice of machine learning.
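A minimal sketch of how such a backward-baseline audit might look in code; the synthetic data, the split into background and recent variables, and the model choices are illustrative assumptions, not the paper's implementation:

```python
# Backward-baseline sketch: how much of a predictor's performance is
# explained by background variables that predate the outcome being predicted?
# All variable names and the data-generating process are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical data: X_bg are background variables (e.g., demographics),
# X_rec are more recent individual-level variables, y is the target.
n = 5000
X_bg = rng.normal(size=(n, 5))
X_rec = rng.normal(size=(n, 5)) + 0.5 * X_bg
y = (X_bg[:, 0] + 0.3 * X_rec[:, 0] + rng.normal(scale=0.5, size=n) > 0).astype(int)

X = np.hstack([X_bg, X_rec])
X_tr, X_te, y_tr, y_te, bg_tr, bg_te = train_test_split(X, y, X_bg, random_state=0)

# The audited predictor: trained on all features (stands in for a black box).
full_model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
preds = full_model.predict(X_te)

# Backward baseline: predict the target from background variables alone.
backward = LogisticRegression(max_iter=1000).fit(bg_tr, y_tr)

print("audited predictor accuracy:", accuracy_score(y_te, preds))
print("backward baseline accuracy:", accuracy_score(y_te, backward.predict(bg_te)))

# Black-box variant: how well can the system's own predictions be
# reconstructed from background variables alone? (In-sample, for illustration.)
imitator = LogisticRegression(max_iter=1000).fit(bg_te, preds)
print("predictions recoverable from background variables:",
      accuracy_score(preds, imitator.predict(bg_te)))
```

If the backward baseline matches the audited system closely, the system's apparent predictive power is largely attributable to information that predates the individual outcome.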
The U.S. criminal legal system increasingly relies on software outputs to convict and incarcerate people. In a large number of cases each year, the government makes these consequential decisions based on evidence from statistical software (such as probabilistic genotyping, environmental audio detection, and toolmark analysis tools) that defense counsel cannot fully cross-examine or scrutinize. This undermines the commitments of the adversarial criminal legal system, which relies on the defense's ability to probe and test the prosecution's case in order to safeguard individual rights. In response to this need for adversarial scrutiny of the outputs of such software, we propose robust adversarial testing as an audit framework for examining the validity of evidentiary statistical software. We define and operationalize this notion of robust adversarial testing by drawing on a large body of recent work in robust machine learning and algorithmic fairness. We demonstrate how this framework both standardizes the process of scrutinizing such tools and enables defense lawyers to examine their validity for the instances most relevant to the case at hand. We further discuss existing structural and institutional challenges within the U.S. criminal legal system that may create barriers to implementing this and other such audit frameworks, and close with a discussion of policy changes that could help address these concerns.
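As a rough, purely illustrative sketch of the kind of black-box check such a framework might standardize (the `evidentiary_tool` stand-in and the perturbation model below are hypothetical, not drawn from the paper or any real forensic product):

```python
# Sketch: probe a black-box evidentiary tool on perturbations of the
# case-relevant instance and report how stable its output is.
import numpy as np

def evidentiary_tool(sample: np.ndarray) -> float:
    """Hypothetical black box returning, e.g., a likelihood-ratio-style score."""
    return float(1.0 / (1.0 + np.exp(-sample.sum())))

def robust_adversarial_test(instance, tool, noise_scale=0.05, n_trials=1000, seed=0):
    """Evaluate the tool on perturbed copies of the case instance."""
    rng = np.random.default_rng(seed)
    scores = np.array([
        tool(instance + rng.normal(scale=noise_scale, size=instance.shape))
        for _ in range(n_trials)
    ])
    return {
        "baseline": tool(instance),
        "min": float(scores.min()),
        "max": float(scores.max()),
        "std": float(scores.std()),
    }

case_instance = np.array([0.2, -0.1, 0.4, 0.05])
print(robust_adversarial_test(case_instance, evidentiary_tool))
```

A large spread of scores across plausible perturbations of the same case would flag an output whose validity the defense could reasonably contest.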
Although the fairness community has recognized the importance of data, researchers in the area primarily rely on UCI Adult when it comes to tabular data. Derived from a 1994 US Census survey, this dataset has appeared in hundreds of research papers and served as the basis for the development and comparison of many algorithmic fairness interventions. We reconstruct a superset of the UCI Adult data from available US Census sources and reveal idiosyncrasies of the UCI Adult dataset that limit its external validity. Our primary contribution is a suite of new datasets derived from US Census surveys that extend the existing data ecosystem for research on fair machine learning. We create prediction tasks relating to income, employment, health, transportation, and housing. The data span multiple years and all states of the United States, allowing researchers to study temporal shift and geographic variation. Based on our new datasets, we highlight a broad initial sweep of new empirical insights relating to trade-offs between fairness criteria, the performance of algorithmic interventions, and the role of distribution shift. Our findings inform ongoing debates, challenge some existing narratives, and point to future research directions. Our datasets are available at https://github.com/zykls/folktables.
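A minimal usage sketch for the released datasets, following the example in the folktables repository linked above; the survey year, state, and downstream model are arbitrary illustrative choices:

```python
# Sketch: load one of the Census-derived prediction tasks and fit a simple
# classifier. Requires the folktables package from the linked repository.
from folktables import ACSDataSource, ACSIncome
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Pull 2018 1-Year ACS person records for a single state (illustrative choice).
data_source = ACSDataSource(survey_year='2018', horizon='1-Year', survey='person')
acs_data = data_source.get_data(states=["CA"], download=True)

# ACSIncome is one of the predefined tasks (predict income above $50k).
features, label, group = ACSIncome.df_to_numpy(acs_data)

X_tr, X_te, y_tr, y_te = train_test_split(features, label, test_size=0.2, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("test accuracy:", model.score(X_te, y_te))
```

Swapping the state list or survey year is enough to set up the geographic-variation and temporal-shift experiments described above.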
Saliency methods have emerged as a popular tool to highlight features in an input deemed relevant for the prediction of a learned model. (We refer here to the broad category of visualization and attribution methods aimed at interpreting trained models; these methods are often used for interpreting deep neural networks, particularly on image data.) Several saliency methods have been proposed, often guided by visual appeal on image data. In this work, we propose an actionable methodology to evaluate what kinds of explanations a given method can and cannot provide. We find that reliance, solely, on visual assessment can be misleading. Through extensive experiments we show that some existing saliency methods are independent both of the model and of the data generating process. Consequently, methods that fail the proposed tests are inadequate for tasks that are sensitive to either data or model, such as finding outliers in the data, explaining the relationship between inputs and outputs that the model learned, and debugging the model. We interpret our findings through an analogy with edge detection in images, a technique that requires neither training data nor model. Theory in the case of a linear model and a single-layer convolutional neural network supports our experimental findings. All code to replicate our findings will be available here: https://goo.gl/hBmhDt
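A compact sketch of the model-randomization flavor of the proposed tests, using plain input gradients as the saliency method; the architecture, the elided training step, and the similarity measure are illustrative choices, not the paper's exact protocol:

```python
# Sketch: compare gradient saliency maps from a trained model against those
# from a randomly re-initialized copy. High similarity would indicate the
# saliency method is insensitive to the learned parameters.
import torch
import torch.nn as nn
from scipy.stats import spearmanr

def gradient_saliency(model, x):
    x = x.clone().requires_grad_(True)
    score = model(x).max()          # saliency w.r.t. the top logit
    score.backward()
    return x.grad.abs().detach().flatten()

# Illustrative small model and input (stand-ins for a trained classifier).
torch.manual_seed(0)
trained = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10))
# ... assume `trained` has been fit on real data at this point ...

randomized = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10))
for p in randomized.parameters():       # re-initialize all weights
    nn.init.normal_(p, std=0.1)

x = torch.rand(1, 1, 28, 28)
s_trained = gradient_saliency(trained, x)
s_random = gradient_saliency(randomized, x)

corr, _ = spearmanr(s_trained.numpy(), s_random.numpy())
print("rank correlation between saliency maps:", corr)
```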
Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance. Conventional wisdom attributes small generalization error either to properties of the model family, or to the regularization techniques used during training. Through extensive systematic experiments, we show how these traditional approaches fail to explain why large neural networks generalize well in practice. Specifically, our experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data. This phenomenon is qualitatively unaffected by explicit regularization, and occurs even if we replace the true images by completely unstructured random noise. We corroborate these experimental findings with a theoretical construction showing that simple depth two neural networks already have perfect finite sample expressivity as soon as the number of parameters exceeds the number of data points, as it usually does in practice. We interpret our experimental findings by comparison with traditional models.
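A small-scale sketch of the randomization test described above; synthetic features and an over-parameterized fully-connected network stand in for the image benchmarks and convolutional architectures used in the paper:

```python
# Sketch: an over-parameterized network fits completely random labels.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n, d = 500, 50
X = rng.normal(size=(n, d))
y_random = rng.integers(0, 2, size=n)   # labels carry no signal at all

# Far more parameters than data points; trained to (near) convergence.
net = MLPClassifier(hidden_layer_sizes=(512,), max_iter=5000, random_state=0)
net.fit(X, y_random)

# Training accuracy is typically close to 1.0 even though the labels are pure
# noise, so properties of the hypothesis class alone cannot explain why the
# same networks generalize on real labels.
print("training accuracy on random labels:", net.score(X, y_random))
```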
We propose a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features. Assuming data about the predictor, target, and membership in the protected group are available, we show how to optimally adjust any learned predictor so as to remove discrimination according to our definition. Our framework also improves incentives by shifting the cost of poor classification from disadvantaged groups to the decision maker, who can respond by improving the classification accuracy. In line with other studies, our notion is oblivious: it depends only on the joint statistics of the predictor, the target and the protected attribute, but not on interpretation of individual features. We study the inherent limits of defining and identifying biases based on such oblivious measures, outlining what can and cannot be inferred from different oblivious tests. We illustrate our notion using a case study of FICO credit scores.
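The paper derives an optimal post-hoc adjustment that mixes group-specific thresholds; the sketch below uses a much cruder per-group threshold search, only to illustrate the oblivious, post-processing nature of the adjustment (the scores, labels, and target rate are made up):

```python
# Sketch: post-process scores with group-specific thresholds so that true
# positive rates roughly match across groups (a coarse proxy for the
# oblivious adjustment described above).
import numpy as np

def tpr(scores, labels, threshold):
    pred = scores >= threshold
    return pred[labels == 1].mean()

def pick_group_thresholds(scores, labels, group, target_tpr=0.8):
    """For each group, choose the highest threshold achieving the target TPR."""
    thresholds = {}
    for g in np.unique(group):
        s, l = scores[group == g], labels[group == g]
        candidates = [t for t in np.linspace(0, 1, 101) if tpr(s, l, t) >= target_tpr]
        thresholds[g] = max(candidates) if candidates else 0.0
    return thresholds

# Illustrative data: one group's scores are systematically shifted.
rng = np.random.default_rng(0)
n = 4000
group = rng.integers(0, 2, size=n)
labels = rng.integers(0, 2, size=n)
scores = np.clip(0.5 * labels + 0.15 * group + rng.normal(scale=0.2, size=n), 0, 1)

thresholds = pick_group_thresholds(scores, labels, group)
for g, t in thresholds.items():
    mask = group == g
    print(f"group {g}: threshold={t:.2f}, TPR={tpr(scores[mask], labels[mask], t):.2f}")
```

Note that the adjustment only touches the joint statistics of score, label, and group, never the interpretation of individual features, which is what makes it oblivious in the sense above.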
We show that parametric models trained by a stochastic gradient method (SGM) with few iterations have vanishing generalization error. We prove our results by arguing that SGM is algorithmically stable in the sense of Bousquet and Elisseeff. Our analysis only employs elementary tools from convex and continuous optimization. We derive stability bounds for both convex and non-convex optimization under standard Lipschitz and smoothness assumptions. Applying our results to the convex case, we provide new insights for why multiple epochs of stochastic gradient methods generalize well in practice. In the non-convex case, we give a new interpretation of common practices in neural networks, and formally show that popular techniques for training large deep models are indeed stability-promoting. Our findings conceptually underscore the importance of reducing training time beyond its obvious benefit.
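A toy empirical counterpart of this stability notion (not the paper's proof): run the same SGD procedure, with the same seed and sample order, on two datasets that differ in a single example and measure how far the resulting parameters drift apart. The logistic model and step sizes below are arbitrary illustrative choices:

```python
# Sketch: algorithmic stability, measured empirically on neighboring datasets.
import numpy as np

def sgd_logreg(X, y, epochs=5, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        order = rng.permutation(len(y))         # identical order across runs
        for i in order:
            p = 1.0 / (1.0 + np.exp(-X[i] @ w))
            w -= lr * (p - y[i]) * X[i]          # logistic-loss gradient step
    return w

rng = np.random.default_rng(1)
n, d = 1000, 20
X = rng.normal(size=(n, d))
y = (X[:, 0] + rng.normal(scale=0.5, size=n) > 0).astype(float)

# Neighboring dataset: replace a single example.
X2, y2 = X.copy(), y.copy()
X2[0], y2[0] = rng.normal(size=d), 1.0 - y[0]

w1 = sgd_logreg(X, y)
w2 = sgd_logreg(X2, y2)
print("parameter gap after SGD on neighboring datasets:", np.linalg.norm(w1 - w2))
```

A small gap after few epochs is the empirical face of the stability bounds above; running far longer tends to widen it, which echoes the point about training time.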
We study fairness in classification, where individuals are classified, e.g., admitted to a university, and the goal is to prevent discrimination against individuals based on their membership in some group, while maintaining utility for the classifier (the university). The main conceptual contribution of this paper is a framework for fair classification comprising (1) a (hypothetical) task-specific metric for determining the degree to which individuals are similar with respect to the classification task at hand; (2) an algorithm for maximizing utility subject to the fairness constraint, that similar individuals are treated similarly. We also present an adaptation of our approach to achieve the complementary goal of "fair affirmative action," which guarantees statistical parity (i.e., the demographics of the set of individuals receiving any classification are the same as the demographics of the underlying population), while treating similar individuals as similarly as possible. Finally, we discuss the relationship of fairness to privacy: when fairness implies privacy, and how tools developed in the context of differential privacy may be applied to fairness.
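As a toy illustration of item (2), the utility-maximization step under the Lipschitz fairness constraint can be posed as a small linear program; the metric, utilities, and solver below are illustrative assumptions, not the paper's construction:

```python
# Sketch: maximize utility subject to the individual-fairness (Lipschitz)
# constraint |p_i - p_j| <= d(i, j), where p_i is the probability of a
# positive decision for individual i. Metric and utilities are made up.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n = 6
features = rng.normal(size=(n, 3))
utility = rng.normal(size=n)            # decision maker's gain from accepting i

# Hypothetical task-specific metric: Euclidean distance between feature vectors.
d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)

# Pairwise constraints p_i - p_j <= d_ij and p_j - p_i <= d_ij.
A_ub, b_ub = [], []
for i in range(n):
    for j in range(i + 1, n):
        row = np.zeros(n); row[i], row[j] = 1, -1
        A_ub.append(row);  b_ub.append(d[i, j])
        A_ub.append(-row); b_ub.append(d[i, j])

# linprog minimizes, so negate the utility to maximize it.
res = linprog(c=-utility, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(0, 1)] * n, method="highs")
print("acceptance probabilities:", np.round(res.x, 3))
```

The key design point carried over from the framework is that the fairness requirement binds pairs of similar individuals directly, rather than only group-level statistics.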
With the advent of Neural Style Transfer (NST), stylizing an image has become quite popular. A convenient way of extending stylization techniques to videos is to apply them on a per-frame basis. However, such per-frame application usually lacks temporal consistency, which manifests as undesirable flickering artifacts. Most existing approaches for enforcing temporal consistency suffer from one or more of the following drawbacks. They (1) are only suitable for a limited range of stylization techniques, (2) can only be applied in an offline fashion requiring the complete video as input, (3) cannot provide consistency for the task of stylization, or (4) do not provide interactive consistency-control. Note that existing consistent video-filtering approaches aim to completely remove flickering artifacts and thus do not respect any specific consistency-control aspect. For stylization tasks, however, consistency-control is an essential requirement, since a certain amount of flickering can add to the artistic look and feel. Moreover, making this control interactive is paramount from a usability perspective. To meet these requirements, we propose an approach that can stylize video streams while providing interactive consistency-control. Apart from stylization, our approach also supports various other image processing filters. To achieve interactive performance, we develop a lite optical-flow network that operates at 80 frames per second (FPS) on desktop systems with sufficient accuracy. We show that the final consistent video output using our flow network is comparable to that obtained using a state-of-the-art optical-flow network. Further, we employ an adaptive combination of local and global consistent features and enable interactive selection between the two. Through objective and subjective evaluation, we show that our method is superior to state-of-the-art approaches.
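A minimal sketch of the warp-and-blend idea behind such consistency control, substituting OpenCV's Farneback flow for the paper's lite optical-flow network; the stylization filter and the blend weight are illustrative choices:

```python
# Sketch: temporally-consistent per-frame stylization. The previous stylized
# frame is warped along optical flow and blended with the current stylized
# frame; the blend weight acts as an interactive consistency control.
import cv2
import numpy as np

def stylize(frame):
    """Illustrative per-frame filter; any image stylization could go here."""
    return cv2.stylization(frame, sigma_s=60, sigma_r=0.45)

def warp(prev, flow):
    """Backward-warp `prev` into the current frame's coordinates."""
    h, w = flow.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(prev, map_x, map_y, cv2.INTER_LINEAR)

def consistent_stream(frames, consistency=0.5):
    prev_gray, prev_out = None, None
    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        out = stylize(frame)
        if prev_out is not None:
            # Flow from the current frame back to the previous one.
            flow = cv2.calcOpticalFlowFarneback(gray, prev_gray, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            warped = warp(prev_out, flow)
            # consistency=0 keeps the per-frame output (maximum flicker);
            # consistency=1 fully reuses the warped previous result.
            out = cv2.addWeighted(out, 1 - consistency, warped, consistency, 0)
        prev_gray, prev_out = gray, out
        yield out
```

Exposing the `consistency` weight as a live slider is one way to make the control interactive, in the spirit of the requirement described above.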
This short paper discusses continually updated causal abstractions as a potential direction of future research. The key idea is to revise the existing level of causal abstraction to a different level of detail that is both consistent with the history of observed data and more effective in solving a given task.