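Understanding how trust is built over time is essential, as trust plays an important role in the acceptance and adoption of automated vehicles (AVs). This study aimed to investigate the effects of system performance and participants' trust preconditions on dynamic situational trust during takeover transitions. We evaluated the dynamic situational trust of 42 participants using both self-reported and behavioral measures while they watched 30 videos. The study used a 3 by 2 mixed-subjects design, in which the within-subjects variable was system performance (i.e., accuracy levels of 95%, 80%, and 70%) and the between-subjects variable was the participants' trust precondition (i.e., overtrust and undertrust). Our results showed that participants quickly adjusted their self-reported situational trust (SST) levels, which were consistent with the different accuracy levels of system performance in both trust preconditions. However, participants' behavioral situational trust (BST) was affected by their trust precondition across the different accuracy levels. For instance, the overtrust precondition significantly increased the agreement fraction compared to the undertrust precondition. The undertrust precondition significantly decreased the switch fraction compared to the overtrust precondition. These findings have important implications for designing in-vehicle trust calibration systems for conditional AVs.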
translated by 谷歌翻译
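Collaborative human-AI decision-making aims to achieve team performance that exceeds that of either humans or AI alone. However, many factors can influence the success of human-AI teams, including a user's domain expertise, mental models of the AI system, trust in recommendations, and more. This work examines users' interactions with three simulated algorithmic models, all with similar accuracy but with different tuning of their true positive and true negative rates. Our study examined user performance in a non-trivial blood vessel labeling task in which participants indicated whether a given blood vessel was flowing or stalled. Our results show that while recommendations from an AI assistant can aid user decision-making, factors such as users' baseline performance relative to the AI and complementary tuning of AI error types significantly affect overall team performance. Novice users improved, but not to the accuracy level of the AI. Highly proficient users were generally able to discern when they should follow the AI recommendation and typically maintained or improved their performance. Performers with a level of accuracy similar to the AI's showed the greatest variability in response to AI recommendations. In addition, we found that users' perception of the AI's performance relative to their own also had a significant impact on whether their accuracy improved when given AI recommendations. This work provides insights into the complexity of factors related to human-AI collaboration and offers recommendations on how to develop human-centered AI algorithms that complement users in decision-making tasks.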
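The point that models can share an accuracy while making different kinds of errors follows from the identity accuracy = prevalence × TPR + (1 − prevalence) × TNR. The sketch below illustrates it in Python. The three tunings, their names, and the 50% prevalence are hypothetical values chosen for illustration; they are not the settings used in the study.

```python
def accuracy(tpr: float, tnr: float, prevalence: float) -> float:
    """Overall accuracy of a binary classifier.

    tpr: true positive rate (sensitivity)
    tnr: true negative rate (specificity)
    prevalence: fraction of cases that are truly positive
    """
    return prevalence * tpr + (1.0 - prevalence) * tnr


# Hypothetical tunings (TPR, TNR); assumed values, not taken from the study.
MODELS = {
    "favors_true_positives": (0.95, 0.65),
    "balanced": (0.80, 0.80),
    "favors_true_negatives": (0.65, 0.95),
}

# Assumed share of positive cases in the task.
PREVALENCE = 0.5

for name, (tpr, tnr) in MODELS.items():
    acc = accuracy(tpr, tnr, PREVALENCE)
    # Miss rate and false-alarm rate are the complements of TPR and TNR.
    print(f"{name}: accuracy={acc:.2f}, "
          f"miss rate={1 - tpr:.2f}, false-alarm rate={1 - tnr:.2f}")
```

At equal prevalence all three tunings reach the same accuracy, yet one mostly raises false alarms and another mostly misses positives. If the prevalence moves away from 50%, the accuracies diverge, which is why "similar accuracy" only holds for a given case mix.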
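The automotive industry has witnessed an increasing level of development over the past decades, from manufacturing manually operated vehicles to manufacturing vehicles with a high level of automation. With recent developments in artificial intelligence (AI), automotive companies now employ black-box AI models to enable vehicles to perceive their environment and make driving decisions with little or no input from a human. With the hope of deploying autonomous vehicles (AVs) on a commercial scale, the acceptance of AVs by society becomes paramount and may largely depend on their degree of transparency, trustworthiness, and compliance with regulations. Compliance with these acceptance requirements can be assessed by providing explanations for AVs' behavior. Explainability is therefore seen as an important requirement for AVs. AVs should be able to explain what they have "seen" in the environment in which they operate. In this paper, we provide a comprehensive survey of the existing body of work on explainable autonomous driving. First, we open with a motivation for explanations by highlighting the importance of transparency, accountability, and trust, and review existing regulations and standards related to AVs. Second, we identify and categorize the different stakeholders involved in the development, use, and regulation of AVs and elicit their explanation requirements for AVs. Third, we provide a rigorous review of previous work on explanations for the different AV operations (i.e., perception, localization, planning, control, and system management). Finally, we identify pertinent challenges and provide recommendations, such as a conceptual framework for AV explainability. This survey aims to provide the fundamental knowledge required by researchers interested in explainability in AVs.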
Explainable AI, in the context of autonomous systems such as self-driving cars, has drawn broad interest from researchers. Recent studies have found that providing explanations for autonomous vehicles' actions has many benefits (e.g., increased trust and acceptance), but have put little emphasis on when an explanation is needed and how the content of an explanation changes with driving context. In this work, we investigate in which scenarios people need explanations and how the critical degree of explanation shifts with situations and driver types. Through a user experiment, we ask participants to evaluate how necessary an explanation is and measure the impact on their trust in self-driving cars in different contexts. Moreover, we present a self-driving explanation dataset with first-person explanations and associated measures of the necessity for 1103 video clips, augmenting the Berkeley Deep Drive Attention dataset. Our research reveals that driver types and driving scenarios dictate whether an explanation is necessary. In particular, people tend to agree on the necessity for near-crash events but hold different opinions on ordinary or anomalous driving situations.
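Advances in machine learning and artificial intelligence are facilitating the testing and deployment of autonomous vehicles (AVs) on public roads. The California Department of Motor Vehicles (CA DMV) has launched the Autonomous Vehicle Tester Program, which collects and releases reports related to autonomous vehicle disengagement (AVD) from autonomous driving. Understanding the causes of AVD is critical to improving the safety and stability of AV systems and to providing guidance for AV testing and deployment. In this work, a scalable end-to-end pipeline is constructed to analyze the disengagement reports released from 2014 to 2020 using deep transfer learning for natural language processing. The analysis of the disengagement data using taxonomy, visualization, and statistical tests revealed trends in AV testing, the frequency of categorized causes, and significant relationships between the causes and effects of AVD. We found that (1) manufacturers tested AVs intensively during the spring and/or winter, (2) test drivers initiated more than 80% of the disengagements, while more than 75% of the disengagements were caused by errors in perception, localization and mapping, planning, and control of the AV system itself, and (3) there was a significant relationship between the initiator of an AVD and the cause category. This study serves as a successful practice of deep transfer learning using pre-trained models and generates a consolidated disengagement database that allows further investigation by other researchers.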
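Emotions can provide a natural communication modality to complement the existing multi-modal capabilities of social robots, such as text and speech, in many domains. We conducted three online studies with 112, 223, and 151 participants to investigate the benefits of using emotions as a communication modality for search and rescue (SAR) robots. In the first experiment, we investigated the feasibility of conveying information related to SAR situations through robots' emotions, resulting in mappings from SAR situations to emotions. The second study used Affect Control Theory as an alternative method for deriving such mappings. This method is more flexible; for example, it allows the mappings to be adjusted for different emotion sets and different robots. In the third experiment, we created affective expressions for an appearance-constrained outdoor field research robot using LEDs as an expressive channel. Using these affective expressions in a variety of simulated SAR situations, we evaluated their effect on participants, who adopted the role of rescue workers. Our results and proposed methodologies provide (a) insights into how emotions could help convey information in the context of SAR, and (b) evidence of the effectiveness of adding emotions as a communication modality in a (simulated) SAR communication context.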
Using chatbots to deliver recommendations is increasingly popular. The design of recommendation chatbots has primarily taken an information-centric approach, focusing on the recommended content per se. Limited attention has been paid to how social connection and relational strategies, such as self-disclosure from a chatbot, may influence users' perception and acceptance of the recommendation. In this work, we designed, implemented, and evaluated a social chatbot capable of performing three different levels of self-disclosure: factual information (low), cognitive opinions (medium), and emotions (high). In the evaluation, we recruited 372 participants to converse with the chatbot on two topics: movies and COVID-19 experiences. In each topic, the chatbot engaged in small talk and made recommendations relevant to the topic. Participants were randomly assigned to four experimental conditions where the chatbot used factual, cognitive, emotional, and adaptive strategies to perform self-disclosures. By training a text classifier to identify users' level of self-disclosure in real time, the adaptive chatbot can dynamically match its self-disclosure to the level of disclosure exhibited by the users. Our results show that users reciprocate with higher-level self-disclosure when a recommendation chatbot consistently displays emotions throughout the conversation. The chatbot's emotional disclosure also led to increased interactional enjoyment and more positive interpersonal perception towards the bot, fostering a stronger human-chatbot relationship and thus leading to increased recommendation effectiveness, including a higher tendency to accept the recommendation. We discuss the understandings obtained and implications for future design.
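Despite AI's superhuman performance in a variety of domains, humans are often unwilling to adopt AI systems. The lack of interpretability inherent in many modern AI techniques is believed to be hurting their adoption, as users may not trust systems whose decision processes they do not understand. We investigate this proposition with a novel experiment in which we use an interactive prediction task to analyze the impact of interpretability and outcome feedback on trust in AI and on human performance in AI-assisted prediction tasks. We find that interpretability led to no robust improvements in trust, while outcome feedback had a significantly greater and more reliable effect. However, both factors had only modest effects on participants' task performance. Our findings suggest that (1) factors receiving significant attention, such as interpretability, may be less effective at increasing trust than factors like outcome feedback, and (2) augmenting human performance via AI systems may not be a simple matter of increasing trust in AI, as increased trust is not always associated with equally sizable improvements in performance. These findings invite the research community to focus not only on methods for generating interpretations but also on techniques for ensuring that interpretations affect trust and performance in practice.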
Taking advice from others requires confidence in their competence. This is important for interaction with peers, but also for collaboration with social robots and artificial agents. Nonetheless, we do not always have access to information about others' competence or performance. In these uncertain environments, do our prior beliefs about the nature and the competence of our interacting partners modulate our willingness to rely on their judgments? In a joint perceptual decision-making task, participants made perceptual judgments and observed the simulated estimates of either a human participant, a social humanoid robot, or a computer. Then they could modify their estimates based on this feedback. Results show that participants' belief about the nature of their partner biased their compliance with its judgments: participants were more influenced by the social robot than by human and computer partners. This difference emerged strongly at the very beginning of the task and decreased with repeated exposure to empirical feedback on the partner's responses, disclosing the role of prior beliefs in social influence under uncertainty. Furthermore, the results of our functional task suggest an important difference between human-human and human-robot interaction in the absence of overt socially relevant signals from the partner: the former is modulated by social normative mechanisms, whereas the latter is guided by purely informational mechanisms linked to the perceived competence of the partner.
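This paper presents an argument for why we are not measuring trust sufficiently in explainability, interpretability, and transparency research. Most studies ask participants to complete a trust scale to rate their trust in a model that has been explained or interpreted. If trust increases, we consider this a positive. However, there are two issues with this. First, we usually have no way of knowing whether participants should trust the model. Trust should surely decrease if a model is of poor quality. Second, these scales measure perceived trust rather than demonstrated trust. This paper showcases three methods that do a good job of measuring both perceived and demonstrated trust. It is intended to be a starting point for discussion on this topic rather than the final say. The author invites critique and discussion.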
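A key factor in the optimal acceptance and comfort of automated vehicle features is the driving style. Mismatches between the automated and the driver-preferred driving styles can make users take over more frequently or even disable the automation features. This work proposes identifying users' driving style preferences with multimodal signals, so that the vehicle can match user preferences in a continuous and automatic way. We conducted a driving simulator study with 36 participants and collected extensive multimodal data, including behavioral, physiological, and situational data. This includes eye gaze, steering grip force, driving maneuvers, brake and throttle pedal inputs, foot distance from the pedals, pupil diameter, galvanic skin response, heart rate, and situational drive context. We then built machine learning models to identify preferred driving styles and confirmed that all modalities are important for identifying user preferences. This work paves the way for implicit adaptive driving styles in automated vehicles.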
Explainable AI (XAI) is widely viewed as a sine qua non for ever-expanding AI research. A better understanding of the needs of XAI users, as well as human-centered evaluations of explainable models are both a necessity and a challenge. In this paper, we explore how HCI and AI researchers conduct user studies in XAI applications based on a systematic literature review. After identifying and thoroughly analyzing 85 core papers with human-based XAI evaluations over the past five years, we categorize them along the measured characteristics of explanatory methods, namely trust, understanding, fairness, usability, and human-AI team performance. Our research shows that XAI is spreading more rapidly in certain application domains, such as recommender systems than in others, but that user evaluations are still rather sparse and incorporate hardly any insights from cognitive or social sciences. Based on a comprehensive discussion of best practices, i.e., common models, design choices, and measures in user studies, we propose practical guidelines on designing and conducting user studies for XAI researchers and practitioners. Lastly, this survey also highlights several open research directions, particularly linking psychological science and human-centered XAI.
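Self-tracking can improve people's awareness of their unhealthy behaviors and provide insights for behavior change. Prior work has explored how self-trackers reflect on their logged data, but it remains unclear how much they learn from the tracking feedback and which information is more useful. Indeed, the feedback can still be overwhelming, and making it concise can improve learning by increasing focus and reducing interpretation burden. To simplify feedback, we propose a Self-Tracking Feedback Saliency Framework that defines what specific information to feed back, why those details matter, and how to present them (as manual elicitation or automatic feedback). We collected survey and meal image data from a field study of mobile food tracking and implemented SalienTrack, a machine learning model that predicts whether a user will learn from a tracked event. Using explainable AI (XAI) techniques, SalienTrack identifies which features of the event are most salient and why they lead to positive learning outcomes, and prioritizes how to present the feedback based on attribution scores. We demonstrate use cases and conducted a formative study to show the usability and usefulness of SalienTrack. We discuss the implications of learnability in self-tracking and how adding model explainability expands the opportunities for improving the feedback experience.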
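In this paper, we study the effects of interface and feedback on human-robot trust levels when operating in a shared physical space. The task we use is specifying a "no-go" region for a robot in an indoor environment. We evaluate three styles of interface (physical, AR, and map-based) and four feedback mechanisms (no feedback, robot motion in the space, an AR "fence", and the region marked on the map). Our evaluation looks at both usability and trust. Specifically, we ask whether the participant trusts that the robot "knows" where the no-go region is, and how confident they are in the robot's ability to avoid that region. We use both self-reported and indirect measures of trust and usability. Our key findings are: 1) interface and feedback do influence levels of trust; 2) participants largely preferred a mixed interface-feedback pair, in which the modality of the interface differed from that of the feedback.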
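Humans are constantly influenced by others' behavior and opinions. Crucially, social influence among humans is shaped by reciprocity: we are more likely to follow the advice of someone who has been taking our opinions into consideration. In the current work, we investigate whether reciprocal social influence can emerge when interacting with a social humanoid robot. In a joint task, a human participant and a humanoid robot made perceptual estimates and could then overtly modify them after observing the partner's judgment. Results show that endowing the robot with the ability to express and modulate its own level of susceptibility to the human's judgments represented a double-edged sword. On the one hand, participants lost confidence in the robot's competence when the robot followed their advice. On the other hand, participants were unwilling to disclose their lack of confidence to the susceptible robot, suggesting the emergence of reciprocal mechanisms of social influence that support human-robot collaboration.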
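Explanations have been framed as an essential feature of better and fairer human-AI decision-making. In the context of fairness, this has not been appropriately studied, as prior work has mostly evaluated explanations based on their effects on people's perceptions. We argue, however, that for explanations to promote fairer decisions, they must enable humans to discern correct and wrong AI recommendations. To validate our conceptual argument, we conducted an empirical study to examine the relationship between explanations, fairness perceptions, and reliance behavior. Our findings show that explanations influence people's fairness perceptions, which in turn affect reliance. However, we observe that low fairness perceptions lead to more overrides of AI recommendations, regardless of whether they are correct or wrong. This (i) raises doubts about the usefulness of existing explanations for enhancing distributive fairness, and (ii) makes an important case for why perceptions must not be mistaken for a proxy for appropriate reliance.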
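Finally, this work will include an investigation of contextual forms of explanation. In this study, we will include a time-constrained scenario in which different levels of understanding will be tested, enabling us to evaluate suitable and comprehensible explanations. To this end, we propose different levels of understanding (LOU). A user study will be designed to compare different LOUs across different interaction contexts. A user study set in a hospital environment will also be conducted.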
Intelligent agents have great potential as facilitators of group conversation among older adults. However, little is known about how to design agents for this purpose and user group, especially in terms of agent embodiment. To this end, we conducted a mixed methods study of older adults' reactions to voice and body in a group conversation facilitation agent. Two agent forms with the same underlying artificial intelligence (AI) and voice system were compared: a humanoid robot and a voice assistant. One preliminary study (total n=24) and one experimental study comparing voice and body morphologies (n=36) were conducted with older adults and an experienced human facilitator. Findings revealed that the artificiality of the agent, regardless of its form, was beneficial for the socially uncomfortable task of conversation facilitation. Even so, talkative personality types had a poorer experience with the "bodied" robot version. Design implications and supplementary reactions, especially to agent voice, are also discussed.
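The challenge of using robots in human-inhabited environments is to design behavior that is robust to the perturbations induced by human interaction. Our idea is to imbue the robot with intrinsic motivation (IM) so that it can handle new situations and appear as a genuine social other to humans, and thus be of more interest to a human interaction partner. Human-robot interaction (HRI) experiments mainly focus on scripted or teleoperated robots that mimic characteristics such as IM in order to control isolated behavior factors. This article presents a "robotologist" study design that allows autonomously generated behaviors to be compared with each other and, for the first time, evaluates the human perception of IM-based generated behavior in robots. We conducted a within-subjects user study (N = 24) in which participants interacted with a fully autonomous Sphero BB8 robot under different behavioral regimes: one realizing an adaptive, intrinsically motivated behavior and the other being reactive but not adaptive. The robot and its behaviors are intentionally kept minimal in order to concentrate on the effect induced by IM. A quantitative analysis of post-interaction questionnaires showed a significantly higher perception of the dimension "Warmth" compared to the reactive baseline behavior. Warmth is considered a primary dimension for social attitude formation in human social cognition. A human perceived as warm (friendly, trustworthy) experiences more positive social interactions.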
Prior work has identified a resilient phenomenon that threatens the performance of human-AI decision-making teams: overreliance, when people agree with an AI, even when it is incorrect. Surprisingly, overreliance does not reduce when the AI produces explanations for its predictions, compared to only providing predictions. Some have argued that overreliance results from cognitive biases or uncalibrated trust, attributing overreliance to an inevitability of human cognition. By contrast, our paper argues that people strategically choose whether or not to engage with an AI explanation, demonstrating empirically that there are scenarios where AI explanations reduce overreliance. To achieve this, we formalize this strategic choice in a cost-benefit framework, where the costs and benefits of engaging with the task are weighed against the costs and benefits of relying on the AI. We manipulate the costs and benefits in a maze task, where participants collaborate with a simulated AI to find the exit of a maze. Through 5 studies (N = 731), we find that costs such as task difficulty (Study 1), explanation difficulty (Study 2, 3), and benefits such as monetary compensation (Study 4) affect overreliance. Finally, Study 5 adapts the Cognitive Effort Discounting paradigm to quantify the utility of different explanations, providing further support for our framework. Our results suggest that some of the null effects found in literature could be due in part to the explanation not sufficiently reducing the costs of verifying the AI's prediction.
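The cost-benefit framing above can be illustrated with a toy decision rule: each strategy has an expected benefit and an effort cost, and the person picks the one with the highest net utility. This is only a sketch under stated assumptions. The `Strategy` class, the `choose` helper, the strategy names, and all numbers are hypothetical and are not the paper's formalization or data.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Strategy:
    """A way of reaching a decision (hypothetical illustration)."""
    name: str
    benefit: float  # expected reward, in arbitrary units
    cost: float     # cognitive effort, in the same arbitrary units

    @property
    def utility(self) -> float:
        # Net utility: what the strategy is expected to yield minus its effort.
        return self.benefit - self.cost


def choose(strategies: Iterable[Strategy]) -> Strategy:
    """Pick the strategy with the highest net utility."""
    return max(strategies, key=lambda s: s.utility)


# Assumed numbers: blindly relying on the AI is cheap but sometimes wrong.
rely = Strategy("rely_on_ai", benefit=8.0, cost=1.0)

# Verifying the AI via its explanation is more accurate; its cost depends
# on how hard the explanation is to use.
hard_explanation = Strategy("verify_with_explanation", benefit=10.0, cost=6.0)
easy_explanation = Strategy("verify_with_explanation", benefit=10.0, cost=2.0)

print(choose([rely, hard_explanation]).name)  # rely_on_ai
print(choose([rely, easy_explanation]).name)  # verify_with_explanation
```

In this toy setup, overreliance appears when verification costs more than the accuracy it buys and disappears once the explanation makes verification cheap enough. That matches the abstract's reading of null effects as explanations that do not sufficiently reduce verification cost.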