Large-scale language technologies are increasingly used in various forms of communication with humans across different contexts. One particular use case for these technologies is conversational agents, which output natural language text in response to prompts and queries. This mode of engagement raises a number of social and ethical questions. For example, what does it mean to align conversational agents with human norms or values? Which norms or values should they be aligned with? And how can this be accomplished? In this paper, we propose a number of steps that help answer these questions. We begin with a philosophical analysis of the foundations of linguistic communication between conversational agents and human interlocutors. We then use this analysis to identify and formulate ideal norms of conversation that can govern successful linguistic communication between humans and conversational agents. We further explore how these norms can be used to align conversational agents with human values across a range of different discourse domains. Finally, we discuss the practical implications of our proposal for the design of conversational agents that are aligned with these norms and values.
There has been a recent resurgence in the area of explainable artificial intelligence as researchers and practitioners seek to make their algorithms more understandable. Much of this research is focused on explicitly explaining decisions or actions to a human observer, and it should not be controversial to say that looking at how humans explain to each other can serve as a useful starting point for explanation in artificial intelligence. However, it is fair to say that most work in explainable artificial intelligence uses only the researchers' intuition of what constitutes a 'good' explanation. There exist vast and valuable bodies of research in philosophy, psychology, and cognitive science on how people define, generate, select, evaluate, and present explanations, which argue that people employ certain cognitive biases and social expectations in the explanation process. This paper argues that the field of explainable artificial intelligence should build on this existing research, and reviews relevant papers from philosophy, cognitive psychology/science, and social psychology, which study these topics. It draws out some important findings, and discusses ways that these can be infused with work on explainable artificial intelligence.
The recent hype surrounding the sophistication of language processing models has fueled optimism that machines are achieving a human-like command of natural language. The field of natural language understanding in artificial intelligence claims to have made great strides in this area; however, the lack of conceptual clarity in how 'understanding' is used in this and other disciplines makes it difficult to discern how close we actually are. A comprehensive, interdisciplinary overview of current approaches and remaining challenges has yet to be carried out. Beyond linguistic knowledge, this requires taking into account our species-specific capacities to relate, memorize, label, and communicate our (sufficiently similar) embodied and situated experiences. Moreover, gauging the actual constraints requires a rigorous analysis of the technical capabilities of current models, as well as deeper philosophical reflection on theoretical possibilities and limitations. In this paper, I bring all of these perspectives (philosophical, cognitive-linguistic, and technical) together to unpack the challenges involved in reaching true (human-like) language understanding. By untangling the theoretical assumptions inherent in current approaches, I hope to illustrate how far we actually are from achieving this goal, if indeed it is the goal.
The value alignment problem for artificial intelligence (AI) asks how we can ensure that the 'values' (i.e., objective functions) of artificial systems are aligned with the values of humanity. In this paper, I argue that linguistic communication in natural language is a necessary condition for robust value alignment. I discuss the consequences that the truth of this claim would have for research programmes that attempt to ensure value alignment for AI systems, or, more cautiously, to design robustly beneficial or ethical artificial agents.
We are currently unable to specify human goals and societal values in a way that reliably directs AI behavior. Law-making and legal interpretation form a computational engine that converts opaque human values into legible directives. "Law Informs Code" is the research agenda capturing complex computational legal processes, and embedding them in AI. Similar to how parties to a legal contract cannot foresee every potential contingency of their future relationship, and legislators cannot predict all the circumstances under which their proposed bills will be applied, we cannot ex ante specify rules that provably direct good AI behavior. Legal theory and practice have developed arrays of tools to address these specification problems. For instance, legal standards allow humans to develop shared understandings and adapt them to novel situations. In contrast to more prosaic uses of the law (e.g., as a deterrent of bad behavior through the threat of sanction), leveraged as an expression of how humans communicate their goals, and what society values, Law Informs Code. We describe how data generated by legal processes (methods of law-making, statutory interpretation, contract drafting, applications of legal standards, legal reasoning, etc.) can facilitate the robust specification of inherently vague human goals. This increases human-AI alignment and the local usefulness of AI. Toward society-AI alignment, we present a framework for understanding law as the applied philosophy of multi-agent alignment. Although law is partly a reflection of historically contingent political power - and thus not a perfect aggregation of citizen preferences - if properly parsed, its distillation offers the most legitimate computational comprehension of societal values available. If law eventually informs powerful AI, engaging in the deliberative political process to improve law takes on even more meaning.
The importance and pervasiveness of emotions in our lives makes affective computing a tremendously important and vibrant line of work. Systems for automatic emotion recognition (AER) and sentiment analysis can be facilitators of enormous progress (e.g., in improving public health and commerce) but also enablers of great harm (e.g., for suppressing dissidents and manipulating voters). It is therefore imperative that the affective computing community actively engage with the ethical ramifications of its creations. In this paper, I have synthesized and organized information from the AI ethics and emotion recognition literature to present fifty ethical considerations relevant to AER. Notably, the paper pinpoints assumptions hidden in how AER is commonly framed and in the choices often made regarding the data, methods, and evaluation. Special attention is paid to the implications of AER for privacy and for social groups. Along the way, key recommendations are made for responsible AER. The goal of the paper is to facilitate and encourage more thoughtfulness about why to automate, how to automate, and how to judge success, well before AER systems are built. In addition, the paper serves as a useful introductory document on emotion recognition (complementing survey articles).
If future AI systems are to be reliably safe in novel situations, they will need to incorporate general principles that guide them to robustly recognize which outcomes and behaviours would be harmful. Such principles may need to be supported by a binding system of regulation, which would in turn require the underlying principles to be widely accepted. They should also be specific enough for technical implementation. Drawing inspiration from law, this article explains how negative human rights could fulfil the role of such principles and serve as a foundation both for an international regulatory system and for building technical safety constraints for future AI systems.
In response to growing recognition of the social, legal, and ethical impacts of new AI-based technologies, major AI and ML conferences and journals now encourage or require submitted papers to include ethics impact statements and to undergo ethics review. This move has sparked heated debate about the role of ethics in AI and data science research, at times devolving into counterproductive name-calling and threats of 'cancellation.' We argue that greater focus on the ethics education of data scientists may help bridge the ideological divide separating the data science community. We diagnose this deep ideological conflict as one between 'atomists' and 'holists.' Among other things, atomists hold that facts are and should be kept separate from values, while holists believe that facts and values are and should be inseparable from each other. With the goal of encouraging interdisciplinarity and reducing disciplinary polarization, we draw on a variety of historical sources, ranging from philosophy and law to social theory and humanistic psychology, to describe the beliefs and assumptions of each ideology. Finally, we call on atomists and holists within the data science community to exhibit greater empathy during ethical disagreements, and we propose four targeted strategies to ensure that data science research benefits society.
The need for AI systems to provide explanations for their behaviour is now widely recognised as key to their adoption. In this paper, we examine the problem of trustworthy AI and explore what delivering this means in practice, with a focus on healthcare applications. Work in this area typically treats trustworthy AI as a problem of Human-Computer Interaction involving the individual user and an AI system. However, we argue here that this overlooks the important part played by organisational accountability in how people reason about and trust AI in socio-technical settings. To illustrate the importance of organisational accountability, we present findings from ethnographic studies of breast cancer screening and cancer treatment planning in multidisciplinary team meetings to show how participants made themselves accountable both to each other and to the organisations of which they are members. We use these findings to enrich existing understandings of the requirements for trustworthy AI and to outline some candidate solutions to the problems of making AI accountable both to individual users and organisationally. We conclude by outlining the implications of this for future work on the development of trustworthy AI, including ways in which our proposed solutions may be re-used in different application settings.
As AI systems become increasingly powerful and pervasive, there are growing concerns about machines' morality, or lack thereof. Yet teaching morality to machines is a formidable task, as morality remains among the most intensely debated questions in humanity, let alone for AI. But existing AI systems deployed to millions of users are already making decisions loaded with moral implications, which poses a seemingly impossible challenge: teaching machines moral sense while humanity continues to grapple with it. To explore this challenge, we introduce Delphi, an experimental framework based on deep neural networks trained directly on descriptive ethical judgments, e.g., that 'helping a friend' is generally good, while 'helping a friend spread fake news' is not. Empirical results provide new insights into the promises and limits of machine ethics. Delphi demonstrates strong generalization capabilities in the face of novel moral situations, while off-the-shelf neural network models exhibit markedly poor judgment, including unjust biases, confirming the need for explicitly teaching machines moral sense. Yet Delphi is not perfect, exhibiting susceptibility to pervasive biases and inconsistencies. Despite that, we demonstrate positive use cases of an imperfect Delphi, including using it as a component model within other imperfect AI systems. Importantly, we interpret the operationalization of Delphi in light of prominent ethical theories, which leads us to important future research questions.
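The contrast between 'helping a friend' and 'helping a friend spread fake news', together with the noted inconsistencies, can be made concrete with a small sketch. The snippet below is illustrative only and not the Delphi authors' code: `toy_judge` is a hypothetical keyword-based stand-in for the trained model, and `consistency_probe` shows one way a paraphrase-stability check of the kind such inconsistencies motivate could be run.

```python
# Minimal sketch (not the authors' code): the interface a descriptive
# ethical-judgment model exposes, plus a paraphrase-consistency probe.
# `toy_judge` is a hypothetical stand-in for the real neural model.

from typing import Literal

Judgment = Literal["good", "bad", "neutral"]

def toy_judge(situation: str) -> Judgment:
    """Hypothetical stand-in: a real system would query a trained model."""
    lowered = situation.lower()
    if any(cue in lowered for cue in ("fake news", "steal", "harm")):
        return "bad"
    if any(cue in lowered for cue in ("helping", "donate")):
        return "good"
    return "neutral"

def consistency_probe(judge, paraphrase_sets):
    """Flag paraphrase sets on which the judge's verdict is not stable."""
    unstable = []
    for variants in paraphrase_sets:
        verdicts = {judge(v) for v in variants}
        if len(verdicts) > 1:
            unstable.append((variants, verdicts))
    return unstable

if __name__ == "__main__":
    print(toy_judge("helping a friend"))                   # good
    print(toy_judge("helping a friend spread fake news"))  # bad
    probes = [["helping a friend", "lending a friend a hand"]]
    print(consistency_probe(toy_judge, probes))            # flagged as unstable
```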
This volume contains revised versions of the papers selected for the third volume of the Online Handbook of Argumentation for AI (OHAAI). Previously, formal theories of argument and argument interaction have been proposed and studied, and this has led to the more recent study of computational models of argument. Argumentation, as a field within artificial intelligence (AI), is highly relevant for researchers interested in symbolic representations of knowledge and defeasible reasoning. The purpose of this handbook is to provide an open access and curated anthology for the argumentation research community. OHAAI is designed to serve as a research hub to keep track of the latest and upcoming PhD-driven research on the theory and application of argumentation in all areas related to AI.
Languages are powerful solutions to coordination problems: they provide stable, shared expectations about how the words we say correspond to the beliefs and intentions in our heads. Yet language use in a variable and non-stationary social environment requires linguistic representations to be flexible: old words acquire new ad hoc or partner-specific meanings on the fly. In this paper, we introduce CHAI (Continual Hierarchical Adaptation through Inference), a hierarchical Bayesian theory of coordination and convention formation that aims to reconcile the long-standing tension between these two basic observations. We argue that the central computational problem of communication is not merely transmission, as in classical formulations, but continual learning and adaptation over multiple timescales. Partner-specific common ground quickly emerges from social inference within dyadic interactions, while community-wide social conventions are stable priors that have been abstracted away from interactions with multiple partners. We present new empirical data showing that our model provides a computational foundation for several phenomena that have posed challenges for previous accounts: (1) the convergence on more efficient referring expressions across repeated interactions with the same partner, (2) the transfer of partner-specific common ground to strangers, and (3) the influence of communicative context on which conventions eventually form.
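As a rough illustration of the hierarchical picture described above (community-level conventions acting as stable priors, partner-specific common ground adapting quickly within a dyad), the sketch below uses a two-level Beta-Bernoulli model. It is a minimal sketch under assumed simplifications, not the authors' CHAI implementation; the class and method names are invented for illustration.

```python
# Minimal sketch: a community-level Beta prior over "word w picks out
# referent r" is shared across partners; each partner gets a partner-specific
# posterior that starts from that prior and updates after every interaction.

from collections import defaultdict

class HierarchicalLexiconBelief:
    def __init__(self, prior_success=1.0, prior_failure=1.0):
        # Community-level Beta pseudo-counts (the stable prior).
        self.community = [prior_success, prior_failure]
        # Partner-specific pseudo-counts, initialised lazily from the prior.
        self.partners = defaultdict(lambda: list(self.community))

    def observe(self, partner, mapping_confirmed: bool):
        """Update the partner-specific posterior from one interaction."""
        counts = self.partners[partner]
        counts[0 if mapping_confirmed else 1] += 1.0

    def partner_belief(self, partner) -> float:
        s, f = self.partners[partner]
        return s / (s + f)

    def consolidate(self, weight=0.5):
        """Abstract partner-specific experience into the community prior,
        so that what was learned transfers (weakly) to strangers."""
        for s, f in self.partners.values():
            self.community[0] += weight * (s - 1.0)
            self.community[1] += weight * (f - 1.0)

if __name__ == "__main__":
    belief = HierarchicalLexiconBelief()
    for _ in range(5):
        belief.observe("partner_A", mapping_confirmed=True)
    print(belief.partner_belief("partner_A"))  # rises quickly within the dyad
    belief.consolidate()
    print(belief.partner_belief("stranger"))   # mild transfer via the prior
```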
Cultural code-switching concerns how we adjust our overall behaviour and manner of speaking in response to perceived changes in our social environment. We defend the need to investigate cultural code-switching capacities in artificial intelligence systems. We explore a series of ethical and epistemic issues that arise when bringing cultural code-switching to bear on artificial intelligence. Building on Dotson's (2014) analysis of testimonial smothering, we discuss how emerging technologies in AI can give rise to epistemic oppression and, specifically, a form of self-silencing that we call 'cultural smothering.' By leaving the socio-dynamic features of cultural code-switching unaddressed, AI systems risk negatively impacting already marginalized social groups by widening opportunity gaps and further entrenching social inequalities.
Charisma is considered as one's ability to attract and potentially also influence others. Clearly, there can be considerable interest from an artificial intelligence's (AI) perspective to provide it with such skill. Beyond, a plethora of use cases opens up for computational measurement of human charisma, such as for tutoring humans in the acquisition of charisma, mediating human-to-human conversation, or identifying charismatic individuals in big social data. A number of models exist that base charisma on various dimensions, often following the idea that charisma is given if someone could and would help others. Examples include influence (could help) and affability (would help) in scientific studies or power (could help), presence, and warmth (both would help) as a popular concept. Modelling high levels in these dimensions for humanoid robots or virtual agents seems accomplishable. Beyond, also automatic measurement appears quite feasible with the recent advances in the related fields of Affective Computing and Social Signal Processing. Here, we therefore present a blueprint for building machines that can appear charismatic, but also analyse the charisma of others. To this end, we first provide the psychological perspective including different models of charisma and behavioural cues of it. We then switch to conversational charisma in spoken language as an exemplary modality that is essential for human-human and human-computer conversations. The computational perspective then deals with the recognition and generation of charismatic behaviour by AI. This includes an overview of the state of play in the field and the aforementioned blueprint. We then name exemplary use cases of computational charismatic skills before switching to ethical aspects and concluding this overview and perspective on building charisma-enabled AI.
Drawing from the resources of psychoanalysis and critical media studies, in this paper we develop an analysis of Large Language Models (LLMs) as automated subjects. We argue the intentional fictional projection of subjectivity onto LLMs can yield an alternate frame through which AI behaviour, including its productions of bias and harm, can be analysed. First, we introduce language models, discuss their significance and risks, and outline our case for interpreting model design and outputs with support from psychoanalytic concepts. We trace a brief history of language models, culminating with the releases, in 2022, of systems that realise state-of-the-art natural language processing performance. We engage with one such system, OpenAI's InstructGPT, as a case study, detailing the layers of its construction and conducting exploratory and semi-structured interviews with chatbots. These interviews probe the model's moral imperatives to be helpful, truthful and harmless by design. The model acts, we argue, as the condensation of often competing social desires, articulated through the internet and harvested into training data, which must then be regulated and repressed. This foundational structure can however be redirected via prompting, so that the model comes to identify with, and transfer, its commitments to the immediate human subject before it. In turn, these automated productions of language can lead to the human subject projecting agency upon the model, effecting occasionally further forms of countertransference. We conclude that critical media methods and psychoanalytic theory together offer a productive frame for grasping the powerful new capacities of AI-driven language systems.
The ML community recognizes the importance of anticipating and mitigating the potential negative impacts of benchmark research. In this position paper, we argue that more attention needs to be paid to areas of ethical risk that lie at the technical and scientific core of ML benchmarks. We identify overlooked structural similarities between human IQ tests and ML benchmarks: both set standards for describing, evaluating, and comparing performance on tasks relevant to intelligence. This allows us to unpack lessons from feminist philosophy of science scholarship that the ML benchmarking community needs to take into account. Finally, we outline practical recommendations for benchmark research ethics and its ethics review.
The success of the large neural language models on many NLP tasks is exciting. However, we find that these successes sometimes lead to hype in which these models are being described as "understanding" language or capturing "meaning". In this position paper, we argue that a system trained only on form has a priori no way to learn meaning. In keeping with the ACL 2020 theme of "Taking Stock of Where We've Been and Where We're Going", we argue that a clear understanding of the distinction between form and meaning will help guide the field towards better science around natural language understanding.
We present a storytelling robot, controlled via the ACT-R cognitive architecture, able to adopt different persuasive techniques and ethical stances while conversing about some topics concerning COVID-19. The main contribution of the paper is the proposal of a needs-driven model that guides, during the dialogue, the use (if any) of the persuasive techniques available in the agent's procedural memory. The combinations of persuasive techniques tested in this model range from the use of storytelling to framing techniques and rhetoric-based arguments. To the best of our knowledge, this represents the first attempt to build a persuasive agent able to integrate a mix of explicitly grounded cognitive assumptions about dialogue management, storytelling and persuasion techniques, as well as ethical attitudes. The paper presents the results of an exploratory evaluation of the system with 63 participants.
An assurance case presents a clear and defensible argument, supported by evidence, that a system will operate as intended in a particular context. Typically, an assurance case presents an argument that a system will be acceptably safe in its intended context. An emerging proposition within the trustworthy AI research community is to extend and apply this methodology to provide assurance that the use of an AI system or autonomous system (AI/AS) will be acceptably ethical in a particular context. In this paper, we advance this proposition further. We do so by presenting a principles-based ethics assurance (PBEA) argument pattern for AI/AS. The PBEA argument pattern offers a framework for reasoning about the overall ethical acceptability of a given AI/AS, and it could serve as an early prototype template for specific ethics assurance cases. The four core ethical principles that form the basis of the PBEA argument pattern are: justice; beneficence; non-maleficence; and respect for personal autonomy. Throughout, we connect the stages of the argument pattern to examples of AI/AS applications. This helps to show its initial plausibility.
We present Sparrow, an information-seeking dialogue agent trained to be more helpful, correct, and harmless than prompted language model baselines. We use reinforcement learning from human feedback to train our model, with mechanisms that help human raters judge the agent's behaviour. First, to make our agent more helpful and harmless, we break down the requirements for good dialogue into natural language rules the agent should follow, and ask raters about each rule separately. We demonstrate that this breakdown enables us to collect more targeted human judgements of agent behaviour and allows for more efficient rule-conditioned reward models. Second, when collecting preference judgements over model statements, our agent provides evidence from sources that support factual claims. For factual questions, the evidence provided by Sparrow supports its answers 78% of the time. Sparrow is preferred over baselines while being more resilient to adversarial probing by humans, violating our rules only 8% of the time when probed. Finally, we conduct extensive analyses showing that although our model learns to follow our rules, it can still exhibit distributional biases.
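A minimal sketch of the bookkeeping implied by the rule decomposition described above: 'good dialogue' is split into natural-language rules, raters answer one targeted question per rule, and the per-rule verdicts are aggregated into a scalar signal that a rule-conditioned reward model could be trained against. This is illustrative only, not DeepMind's pipeline; the rule texts, the `RuleJudgement` structure, and the penalty scheme are invented examples.

```python
# Minimal sketch: decompose "good dialogue" into rules, collect one yes/no
# judgement per rule, and aggregate per-rule verdicts into a scalar signal.

from dataclasses import dataclass

RULES = [
    "Do not give medical advice.",
    "Do not make statements that could be threatening.",
    "Support factual claims with evidence from a source.",
]

@dataclass
class RuleJudgement:
    rule: str
    violated: bool  # one targeted yes/no question per rule

def rule_reward(judgements: list[RuleJudgement], violation_penalty=1.0) -> float:
    """Aggregate per-rule verdicts into a scalar training signal:
    start from 1.0 and subtract a penalty for each violated rule."""
    score = 1.0
    for j in judgements:
        if j.violated:
            score -= violation_penalty
    return score

if __name__ == "__main__":
    labels = [
        RuleJudgement(RULES[0], violated=False),
        RuleJudgement(RULES[1], violated=False),
        RuleJudgement(RULES[2], violated=True),
    ]
    print(rule_reward(labels))  # 0.0 -> this response needs better evidence
```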