Existing regulations prohibit model developers from accessing protected attributes (gender, race, etc.), often resulting in fairness assessments on populations without knowing their protected groups. In such scenarios, institutions often adopt a separation between the model developers (who train models with no access to the protected attributes) and a compliance team (who may have access to the entire dataset for auditing purposes). However, the model developers might be allowed to test their models for bias by querying the compliance team for group fairness metrics. In this paper, we first demonstrate that simply querying for fairness metrics, such as statistical parity and equalized odds, can leak the protected attributes of individuals to the model developers. We demonstrate that there always exist strategies by which the model developers can identify the protected attribute of a targeted individual in the test dataset from just a single query. In particular, we show that one can reconstruct the protected attributes of all the individuals from $O(N_k \log(n/N_k))$ queries when $N_k \ll n$, using techniques from compressed sensing ($n$: size of the test dataset, $N_k$: size of the smallest group). Our results raise an interesting debate in algorithmic fairness: should querying for fairness metrics be viewed as a neutral-valued solution to ensure compliance with regulations? Or does it constitute a violation of regulations and privacy if the number of queries answered is enough for the model developers to identify the protected attributes of specific individuals? To address this supposed violation, we also propose Attribute-Conceal, a novel technique that achieves differential privacy by calibrating noise to the smooth sensitivity of our bias query, outperforming naive techniques such as the Laplace mechanism. We also include experimental results on the Adult dataset and on synthetic data (over a broad range of parameters).
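A minimal sketch of the leakage intuition, not the paper's exact attack or the Attribute-Conceal defence: the developer submits predictions that are positive only for one targeted individual, and the sign of the returned statistical parity gap reveals that individual's group. The query interface and all names below are hypothetical.

```python
import numpy as np

def statistical_parity_gap(y_pred, groups):
    """Compliance-side query: P(y_hat=1 | group 0) - P(y_hat=1 | group 1)."""
    g0, g1 = (groups == 0), (groups == 1)
    return y_pred[g0].mean() - y_pred[g1].mean()

rng = np.random.default_rng(0)
groups = rng.integers(0, 2, size=100)   # protected attributes, hidden from the developer
target = 17                             # index of the targeted individual

y_pred = np.zeros(100)
y_pred[target] = 1                      # positive prediction only for the target

gap = statistical_parity_gap(y_pred, groups)
inferred_group = 0 if gap > 0 else 1    # sign of the reported gap reveals the group
print(inferred_group == groups[target])  # True: a single query suffices in this toy setup
```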
Counterfactual explanations inform ways to achieve a desired outcome from a machine learning model. However, such explanations are not robust to certain real-world changes in the underlying model (e.g., retraining the model, changing hyperparameters, etc.), calling their reliability into question in several applications such as credit lending. In this work, we propose a novel strategy, which we call RobX, to generate robust counterfactuals for tree-based ensembles such as XGBoost. Tree-based ensembles pose additional challenges for robust counterfactual generation, e.g., they have non-smooth and non-differentiable objective functions, and they can change considerably in parameter space when retrained on very similar data. We first introduce a novel metric, which we call Counterfactual Stability, that attempts to quantify how robust a counterfactual will be to model changes under retraining, and which comes with desirable theoretical properties. Our proposed strategy RobX works with any counterfactual generation method (base method) and searches for robust counterfactuals by iteratively refining the counterfactual generated by the base method using our Counterfactual Stability metric. We compare the performance of RobX with popular counterfactual generation methods (for tree-based ensembles) across benchmark datasets. The results demonstrate that our strategy generates counterfactuals that are robust (nearly 100% validity after actual model changes) and also realistic (in terms of local outlier factor) relative to existing state-of-the-art methods.
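A rough sketch of the refinement idea under stated assumptions: stability is approximated here as the mean predicted score over a Gaussian neighbourhood of the candidate, and refinement falls back to the closest sufficiently stable data point. The paper's Counterfactual Stability metric and RobX's search are more involved; `model` is any scikit-learn-style classifier with `predict_proba`, and all thresholds are illustrative.

```python
import numpy as np

def counterfactual_stability(model, x, sigma=0.1, n_samples=100, rng=None):
    # Proxy metric (an assumption, not the paper's exact definition): mean
    # predicted probability of the desired class over a Gaussian neighbourhood
    # of x. Higher values suggest the counterfactual is less likely to be
    # invalidated by small changes to the model after retraining.
    rng = rng or np.random.default_rng(0)
    neighbours = x + sigma * rng.standard_normal((n_samples, x.shape[0]))
    return model.predict_proba(neighbours)[:, 1].mean()

def robust_counterfactual(model, x_cf, candidates, tau=0.8, sigma=0.1):
    # Keep the base counterfactual if it is already stable enough; otherwise
    # move to the closest candidate data point whose stability exceeds tau.
    if counterfactual_stability(model, x_cf, sigma) >= tau:
        return x_cf
    stable = [c for c in candidates if counterfactual_stability(model, c, sigma) >= tau]
    if not stable:
        return x_cf  # no sufficiently stable candidate found
    dists = [np.linalg.norm(c - x_cf) for c in stable]
    return stable[int(np.argmin(dists))]
```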
The success of DNNs is driven by the counter-intuitive ability of over-parameterized networks to generalize even when they perfectly fit the training data. In practice, test error often continues to decrease with increasing over-parameterization, a phenomenon referred to as double descent. This allows practitioners to instantiate large models without having to worry about over-fitting. Despite its benefits, however, prior work has shown that over-parameterization can exacerbate bias against minority subgroups. Several fairness-constrained DNN training methods have been proposed to address this concern. Here, we conduct a rigorous study of MinDiff, a fairness-constrained training procedure implemented in TensorFlow's Responsible AI Toolkit that aims to achieve Equality of Opportunity. We show that although MinDiff improves fairness for under-parameterized models, it is likely to be ineffective in the over-parameterized regime. This is because an overfit model with zero training loss is trivially group-wise fair on the training data, creating an "illusion of fairness" that switches off the MinDiff optimization (this applies to any disparity-based measure that cares about errors or accuracy; it does not apply to demographic parity). Within specified fairness constraints, under-parameterized MinDiff models can even have lower error than their over-parameterized counterparts (despite baseline over-parameterized models having lower error). We further show that MinDiff optimization is very sensitive to the batch size in the under-parameterized regime. Thus, training fair models with MinDiff requires time-consuming hyper-parameter searches. Finally, we suggest using previously proposed regularization techniques, viz. L2, early stopping, and flooding, in conjunction with MinDiff to train fair over-parameterized models.
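The "illusion of fairness" argument can be illustrated with a small, hedged sketch: an MMD-style disparity penalty computed on training examples (as error-sensitive MinDiff-style regularizers are) evaluates to zero once the model interpolates the training data, so the fairness term exerts no pressure. The Gaussian kernel and the restriction to one label slice are assumptions for illustration, not MinDiff's exact implementation.

```python
import numpy as np

def mmd_penalty(scores_a, scores_b, gamma=1.0):
    # Kernel MMD between the score distributions of two groups (Gaussian kernel),
    # in the spirit of a MinDiff-style regularizer; details are an assumption.
    def k(u, v):
        return np.exp(-gamma * (u[:, None] - v[None, :]) ** 2)
    return (k(scores_a, scores_a).mean() + k(scores_b, scores_b).mean()
            - 2 * k(scores_a, scores_b).mean())

# In the interpolating regime the penalty is restricted to training examples
# that an overfit model scores (near) perfectly in both groups, so the
# regularizer sees identical distributions and contributes nothing.
scores_group_a = np.zeros(50)   # overfit model: perfect scores on training data
scores_group_b = np.zeros(80)
print(mmd_penalty(scores_group_a, scores_group_b))  # ~0: fairness term is inactive
```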
When a machine learning algorithm makes biased decisions, it can be helpful to understand the sources of disparity in order to explain why the bias exists. Towards this, we examine the problem of quantifying the contribution of each individual feature to the observed disparity. If we have access to the decision-making model, one potential approach (inspired by intervention-based approaches in the explainability literature) is to vary each individual feature (while keeping the others fixed) and use the resulting change in disparity to quantify its contribution. However, we may not have access to the model, nor be able to test or audit its outputs for individually varied features. Furthermore, the decision may not always be a deterministic function of the input features (e.g., with humans in the loop). For these situations, we may need to explain contributions using purely distributional (i.e., observational) techniques rather than interventional ones. We ask the question: what is the "potential" contribution of each individual feature to the observed disparity in decisions when the exact decision-making mechanism is not accessible? We first provide canonical examples (thought experiments) that illustrate the difference between distributional and interventional approaches to explaining contributions, and when each is better suited. When intervening on the inputs is not possible, we quantify the "redundant" statistical dependency about the protected attribute that is present in both the final decision and an individual feature, by leveraging a body of work in information theory called Partial Information Decomposition. We also perform a simple case study to show how this technique can be applied to quantify contributions.
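For the interventional baseline that the abstract contrasts against (not the proposed PID-based observational measure), a hedged sketch might look as follows: intervene on one feature by permuting it across individuals while holding the others fixed, and record the resulting change in the statistical parity gap. The permutation-style intervention and all names are illustrative assumptions.

```python
import numpy as np

def statistical_parity_gap(y_pred, protected):
    return y_pred[protected == 0].mean() - y_pred[protected == 1].mean()

def interventional_contribution(model, X, protected, feature_idx, rng=None):
    # Estimate one feature's contribution to the observed disparity by
    # randomly permuting that feature (keeping all other features fixed)
    # and measuring the drop in the absolute disparity.
    rng = rng or np.random.default_rng(0)
    base_gap = statistical_parity_gap(model.predict(X), protected)
    X_int = X.copy()
    X_int[:, feature_idx] = rng.permutation(X_int[:, feature_idx])
    new_gap = statistical_parity_gap(model.predict(X_int), protected)
    return abs(base_gap) - abs(new_gap)   # positive: feature contributes to disparity
```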
Motivated by neuroscience and clinical applications, we empirically examine whether observational measures of information flow can suggest interventions. We do so by performing experiments on artificial neural networks in the context of fairness in machine learning, where the goal is to induce fairness in the system through interventions. Using our recently developed $M$-Information Flow framework, we measure the flow of information about the true label (responsible for accuracy, and hence desirable) and, separately, the flow of information about a protected attribute (responsible for bias, and hence undesirable) along the edges of a trained neural network. We then compare the magnitudes of these flows against the effect of intervening on those edges by pruning. We show that pruning edges that carry larger information flows about the protected attribute reduces bias at the output to a greater extent. This demonstrates that $M$-Information Flow can meaningfully suggest interventions, answering the question in the title in the affirmative. We also evaluate the bias-accuracy tradeoffs of different intervention strategies, analysing how one might use estimates of desirable and undesirable information flows (here, accuracy- and bias-flows) to inform interventions that preserve the former while reducing the latter.
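A hedged sketch of the intervention step only, assuming per-edge bias-flow estimates for one layer are already available from an $M$-Information Flow estimator (which is not reproduced here): prune the edges carrying the largest estimated flow about the protected attribute by zeroing their weights.

```python
import numpy as np

def prune_by_bias_flow(weights, bias_flow, k):
    """Zero out the k edges with the largest estimated flow of protected-attribute
    information. `weights` and `bias_flow` are same-shape arrays for one layer;
    how the flow estimates are obtained is outside the scope of this sketch."""
    pruned = weights.copy()
    top_k = np.argsort(bias_flow, axis=None)[::-1][:k]      # flat indices of top-k flows
    pruned[np.unravel_index(top_k, weights.shape)] = 0.0    # prune those edges
    return pruned
```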
There exist several methods that aim to address the crucial task of understanding the behaviour of AI/ML models. Arguably, the most popular among them are local explanations that focus on investigating model behaviour for individual instances. Several methods have been proposed for local analysis, but relatively lesser effort has gone into understanding if the explanations are robust and accurately reflect the behaviour of underlying models. In this work, we present a survey of the works that analysed the robustness of two classes of local explanations (feature importance and counterfactual explanations) that are popularly used in analysing AI/ML models in finance. The survey aims to unify existing definitions of robustness, introduces a taxonomy to classify different robustness approaches, and discusses some interesting results. Finally, the survey introduces some pointers about extending current robustness analysis approaches so as to identify reliable explainability methods.
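One concrete robustness probe that appears in this literature (illustrative only; the survey's taxonomy covers several alternative definitions) is a local-Lipschitz-style sensitivity estimate for feature-importance explanations: perturb the input slightly and measure how much the attribution vector moves.

```python
import numpy as np

def explanation_sensitivity(explainer, x, eps=0.01, n_trials=20, rng=None):
    # `explainer` maps an input to a feature-attribution vector (e.g., a wrapped
    # local explanation method). Returns a worst-case ratio of attribution change
    # to input change over random perturbations of radius ~eps.
    rng = rng or np.random.default_rng(0)
    base = explainer(x)
    worst = 0.0
    for _ in range(n_trials):
        x_p = x + eps * rng.standard_normal(x.shape)
        worst = max(worst,
                    np.linalg.norm(explainer(x_p) - base) / np.linalg.norm(x_p - x))
    return worst
```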
Deep neural networks have emerged as the workhorse for a large section of robotics and control applications, especially as models for dynamical systems. Such data-driven models are in turn used for designing and verifying autonomous systems. This is particularly useful in modeling medical systems where data can be leveraged to individualize treatment. In safety-critical applications, it is important that the data-driven model is conformant to established knowledge from the natural sciences. Such knowledge is often available or can often be distilled into a (possibly black-box) model $M$. For instance, the unicycle model for an F1 racing car. In this light, we consider the following problem - given a model $M$ and state transition dataset, we wish to best approximate the system model while being bounded distance away from $M$. We propose a method to guarantee this conformance. Our first step is to distill the dataset into few representative samples called memories, using the idea of a growing neural gas. Next, using these memories we partition the state space into disjoint subsets and compute bounds that should be respected by the neural network, when the input is drawn from a particular subset. This serves as a symbolic wrapper for guaranteed conformance. We argue theoretically that this only leads to bounded increase in approximation error; which can be controlled by increasing the number of memories. We experimentally show that on three case studies (Car Model, Drones, and Artificial Pancreas), our constrained neurosymbolic models conform to specified $M$ models (each encoding various constraints) with order-of-magnitude improvements compared to the augmented Lagrangian and vanilla training methods.
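A hedged sketch of the symbolic wrapper step, assuming the memories and the per-cell bounds (derived from the reference model $M$) have already been computed; the growing-neural-gas distillation and the bound computation are not reproduced, and the shapes and names are illustrative.

```python
import numpy as np

def conformant_predict(net, memories, lower, upper, x):
    """Route x to the partition cell of its nearest memory and clip the
    network's prediction to that cell's bounds, so the output stays within
    bounded distance of the reference model M by construction."""
    cell = int(np.argmin(np.linalg.norm(memories - x, axis=1)))  # nearest memory
    return np.clip(net(x), lower[cell], upper[cell])
```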
This paper is a technical overview of DeepMind and Google's recent work on reinforcement learning for controlling commercial cooling systems. Building on expertise that began with cooling Google's data centers more efficiently, we recently conducted live experiments on two real-world facilities in partnership with Trane Technologies, a building management system provider. These live experiments had a variety of challenges in areas such as evaluation, learning from offline data, and constraint satisfaction. Our paper describes these challenges in the hope that awareness of them will benefit future applied RL work. We also describe the way we adapted our RL system to deal with these challenges, resulting in energy savings of approximately 9% and 13% respectively at the two live experiment sites.
Aspect Based Sentiment Analysis is a dominant research area with potential applications in social media analytics, business, finance, and health. Prior works in this area are primarily based on supervised methods, with a few techniques using weak supervision limited to predicting a single aspect category per review sentence. In this paper, we present an extremely weakly supervised multi-label Aspect Category Sentiment Analysis framework which does not use any labelled data. We only rely on a single word per class as an initial indicative information. We further propose an automatic word selection technique to choose these seed categories and sentiment words. We explore unsupervised language model post-training to improve the overall performance, and propose a multi-label generator model to generate multiple aspect category-sentiment pairs per review sentence. Experiments conducted on four benchmark datasets showcase our method to outperform other weakly supervised baselines by a significant margin.
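A toy sketch of the "one seed word per class" starting point only; the actual framework selects seeds automatically, post-trains a language model, and uses a multi-label generator, none of which is shown here, and the seed lexicon below is hypothetical.

```python
import re

# Hypothetical seed lexicon: a single indicative word per aspect category.
seed_words = {"food": "pasta", "service": "waiter", "price": "expensive"}

def candidate_categories(sentence):
    # Assign every category whose seed word occurs in the sentence, allowing
    # multiple aspect categories per review sentence.
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [cat for cat, seed in seed_words.items() if seed in tokens]

print(candidate_categories("The waiter was rude and the pasta was expensive."))
# ['food', 'service', 'price'] -> multiple aspect categories for one sentence
```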
The SNMMI Artificial Intelligence (SNMMI-AI) Summit, organized by the SNMMI AI Task Force, took place in Bethesda, MD on March 21-22, 2022. It brought together various community members and stakeholders from academia, healthcare, industry, patient representatives, and government (NIH, FDA), and considered various key themes to envision and facilitate a bright future for routine, trustworthy use of AI in nuclear medicine. In what follows, essential issues, challenges, controversies and findings emphasized in the meeting are summarized.