Smart Paper Notes

Chatbots for Mental Health Support: Exploring the Impact of Emohaa on Reducing Mental Distress in China

Sahand Sabour , Wen Zhang , Xiyao Xiao , Yuwei Zhang , Yinhe Zheng , Jiaxin Wen , Jialu Zhao , Minlie Huang

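Category: Natural Language Processing

2022-09-21

The growing demand for mental health support has highlighted the importance of conversational agents as human supporters, both worldwide and in China. These agents could increase availability and reduce the relative cost of mental health support. The support provided can be divided into two main types: cognitive and emotional. Existing work on this topic mainly focuses on constructing agents that adopt Cognitive Behavioral Therapy (CBT) principles. Such agents operate on pre-defined templates and exercises to provide cognitive support. However, research on emotional support using such agents is limited. In addition, most of the constructed agents operate in English, which highlights the importance of conducting such studies in China. In this study, we analyze the effectiveness of Emohaa in reducing symptoms of mental distress. Emohaa is a conversational agent that provides cognitive support through CBT-based exercises and guided conversations. It also supports users emotionally by enabling them to vent about the emotional problems they wish to discuss. The study included 134 participants, split into three groups: Emohaa (CBT-based), Emohaa (Full), and control. Experimental results showed that, compared with the control group, participants who used Emohaa experienced considerably greater improvements in symptoms of mental distress. We also found that adding the emotional support agent had a complementary effect on these improvements, mainly for depression and insomnia. Based on the results obtained and participants' satisfaction with the platform, we conclude that Emohaa is a practical and effective tool for reducing mental distress.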

Translated by Google Translate

Robots as Mental Well-being Coaches: Design and Ethical Recommendations

Minja Axelsson , Micol Spitale , Hatice Gunes

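Category: Robotics

2022-08-31

The last decade has shown growing interest in robots as well-being coaches. However, cohesive and comprehensive guidelines for designing robots as coaches that promote mental well-being have not yet been proposed. This paper details design and ethical recommendations based on a qualitative meta-analysis that draws on a grounded theory approach. The meta-analysis covers three distinct user-centered design studies involving robotic well-being coaches: (1) a participatory design study with 11 participants, consisting of prospective users who had taken part in a Brief Solution-Focused Practice study with a human coach, as well as coaches from different disciplines; (2) semi-structured individual interview data gathered from 20 participants in a Positive Psychology intervention study conducted with the robotic well-being coach Pepper; and (3) a participatory design study with 3 participants from the Positive Psychology study and 2 relevant well-being coaches. After conducting a thematic analysis and a qualitative meta-analysis, we collated the gathered data into convergent and divergent themes and distilled from those results a set of design guidelines and ethical considerations. Our findings can inform the key aspects to take into account when designing robotic mental well-being coaches.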

Voice Over Body? Older Adults' Reactions to Robot and Voice Assistant Facilitators of Group Conversation

Katie Seaborn , Takuya Sekiguchi , Seiki Tokunaga , Norihisa P. Miyake , Mihoko Otake-Matsuura

Category: Robotics

2022-12-08

Intelligent agents have great potential as facilitators of group conversation among older adults. However, little is known about how to design agents for this purpose and user group, especially in terms of agent embodiment. To this end, we conducted a mixed methods study of older adults' reactions to voice and body in a group conversation facilitation agent. Two agent forms with the same underlying artificial intelligence (AI) and voice system were compared: a humanoid robot and a voice assistant. One preliminary study (total n=24) and one experimental study comparing voice and body morphologies (n=36) were conducted with older adults and an experienced human facilitator. Findings revealed that the artificiality of the agent, regardless of its form, was beneficial for the socially uncomfortable task of conversation facilitation. Even so, talkative personality types had a poorer experience with the "bodied" robot version. Design implications and supplementary reactions, especially to agent voice, are also discussed.

Understanding the Information Needs and Practices of Human Supporters of an Online Mental Health Intervention to Inform Machine Learning Applications

Anja Thieme

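Category: Machine Learning

2021-11-12

In the context of digital therapy interventions, such as internet-delivered Cognitive Behavioral Therapy (iCBT) for the treatment of depression and anxiety, extensive research has shown that involving a human supporter or coach who assists the person undergoing treatment improves user engagement in therapy and leads to more effective health outcomes than unsupported interventions. Seeking to maximize the effects and outcomes of this human support, the research investigates how new opportunities created by recent advances in AI and machine learning (ML) can help to effectively support the work practices of iCBT supporters. This paper reports detailed findings from an interview study with 15 iCBT supporters that deepens understanding of their existing work practices and information needs, with the aim of meaningfully informing the development of useful, implementable ML applications in the context of treatment for depression and anxiety. The analysis contributes (1) a set of six themes that summarize the strategies and challenges of iCBT supporters in providing effective, personalized feedback to their mental health clients; and, in response to these learnings, (2) concrete opportunities for each theme showing how ML methods could help support and address the identified challenges and information needs. The paper closes with reflections on the potential social, emotional, and pragmatic implications of introducing new machine-generated data insights into supporter-led client review practices.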

Taking a Language Detour: How International Migrants Speaking a Minority Language Seek COVID-Related Information in Their Host Countries

Ge Gao , Jian Zheng , Eun Kyoung Choe , Naomi Yamashita

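Category: Natural Language Processing

2022-09-07

Information seeking is crucial for people's self-care and well-being in times of public crisis. Extensive research has investigated empirical understandings and technical solutions to facilitate information seeking by domestic citizens of affected regions. However, limited knowledge has been established to support international migrants who need to get through a crisis in their host countries. The current paper presents an interview study with two cohorts of Chinese migrants living in Japan (n = 14) and the United States (n = 14). Participants reflected on their information-seeking experiences during the COVID pandemic. The reflection was supplemented by two weeks of self-tracking, during which participants kept records of their COVID-related information-seeking practices. Our data indicate that participants often took language detours, that is, visits to Mandarin resources for information about the COVID outbreak in their host countries. They also made strategic use of Mandarin information to perform selective reading, cross-checking, and contextualized interpretation of COVID-related information in Japanese or English. While such practices enhanced the perceived effectiveness of participants' COVID-related information gathering and sensemaking, they also disadvantaged people in ways that sometimes went unrecognized. Further, participants lacked the awareness or the preference to review migrant-oriented information issued by the host country's public authorities, despite its availability. Building on these findings, we discuss solutions to improve international migrants' COVID-related information seeking in non-native language and cultural environments. We advocate inclusive crisis infrastructures that engage people with diverse levels of local language fluency, information literacy, and experience in using public services.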

SalienTrack: providing salient information for semi-automated self-tracking feedback with model explanations

Yunlong Wang , Jiaying Liu , Homin Park , Jordan Schultz-McArdle , Stephanie Rosenthal , Brian Y. Lim

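Category: Artificial Intelligence

2021-09-21

Self-tracking can improve people's awareness of their unhealthy behaviors and provide insights for behavior change. Prior work has explored how self-trackers reflect on their logged data, but it remains unclear how much they learn from the tracking feedback and which information is more useful. Indeed, the feedback can still be overwhelming, and making it concise can improve learning by increasing focus and reducing the interpretation burden. To simplify feedback, we propose a Self-Tracking Feedback Saliency Framework that defines what specific information to feed back, why those details matter, and how to present them (as manual elicitation or automatic feedback). We collected survey and meal image data from a field study of mobile food tracking and implemented SalienTrack, a machine learning model that predicts whether a user will learn from a tracked event. Using explainable AI (XAI) techniques, SalienTrack identifies which features of the event are most salient and why they lead to positive learning outcomes, and it prioritizes how to present the feedback based on attribution scores. We demonstrate use cases and conduct a formative study to show the usability and usefulness of SalienTrack. We discuss the implications of learnability in self-tracking and how adding model explainability expands the opportunities for improving the feedback experience.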

Dialoging Resonance: How Users Perceive, Reciprocate and React to Chatbot's Self-Disclosure in Conversational Recommendations

Kai-Hui Liang , Weiyan Shi , Yoojung Oh , Hao-Chuan Wang , Jingwen Zhang , Zhou Yu

Category: Natural Language Processing | Artificial Intelligence

2021-06-03

Using chatbots to deliver recommendations is increasingly popular. The design of recommendation chatbots has primarily taken an information-centric approach, focusing on the recommended content per se. Limited attention has been paid to how social connection and relational strategies, such as self-disclosure from a chatbot, may influence users' perception and acceptance of the recommendation. In this work, we designed, implemented, and evaluated a social chatbot capable of performing three different levels of self-disclosure: factual information (low), cognitive opinions (medium), and emotions (high). In the evaluation, we recruited 372 participants to converse with the chatbot on two topics: movies and COVID-19 experiences. In each topic, the chatbot engaged in small talk and made recommendations relevant to the topic. Participants were randomly assigned to four experimental conditions in which the chatbot used factual, cognitive, emotional, and adaptive strategies to perform self-disclosures. By training a text classifier to identify users' level of self-disclosure in real time, the adaptive chatbot can dynamically match its self-disclosure to the level of disclosure exhibited by the users. Our results show that users reciprocate with higher-level self-disclosure when a recommendation chatbot consistently displays emotions throughout the conversation. The chatbot's emotional disclosure also led to increased interactional enjoyment and a more positive interpersonal perception of the bot, fostering a stronger human-chatbot relationship and thus leading to increased recommendation effectiveness, including a higher tendency to accept the recommendation. We discuss the understandings obtained and the implications for future design.

Understanding Postpartum Parents' Experiences via Two Digital Platforms

Xuewen Yao , Miriam Mikhelson , Megan Micheletti , Eunsol Choi , S Craig Watkins , Edison Thomaz , Kaya De Barbaro

Category: Natural Language Processing

2022-12-22

Digital platforms, including online forums and helplines, have emerged as avenues of support for caregivers suffering from postpartum mental health distress. Understanding support seekers' experiences as shared on these platforms could provide crucial insight into caregivers' needs during this vulnerable time. In the current work, we provide a descriptive analysis of the concerns, psychological states, and motivations shared by healthy and distressed postpartum support seekers on two digital platforms, a one-on-one digital helpline and a publicly available online forum. Using a combination of human annotations, dictionary models and unsupervised techniques, we find stark differences between the experiences of distressed and healthy mothers. Distressed mothers described interpersonal problems and a lack of support, with 8.60% - 14.56% reporting severe symptoms including suicidal ideation. In contrast, the majority of healthy mothers described childcare issues, such as questions about breastfeeding or sleeping, and reported no severe mental health concerns. Across the two digital platforms, we found that distressed mothers shared similar content. However, the patterns of speech and affect shared by distressed mothers differed between the helpline vs. the online forum, suggesting the design of these platforms may shape meaningful measures of their support-seeking experiences. Our results provide new insight into the experiences of caregivers suffering from postpartum mental health distress. We conclude by discussing methodological considerations for understanding content shared by support seekers and design considerations for the next generation of support tools for postpartum parents.

Towards Explainable Social Agent Authoring tools: A case study on FAtiMA-Toolkit

Manuel Guimarães , Joana Campos , Pedro A. Santos , João Dias , Rui Prada

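Category: Artificial Intelligence

2022-06-07

The deployment of Socially Intelligent Agents (SIAs) in learning environments has proven to have several advantages in different areas of application. Social agent authoring tools allow scenario designers to create tailored experiences with a high degree of control over SIAs' behaviour. On the flip side, this comes at a cost, because the complexity of the scenarios and their authoring can become overbearing. In this paper, we introduce the concept of Explainable Social Agent Authoring Tools, with the goal of analysing whether authoring tools for social agents are understandable and interpretable. To this end, we examine whether an authoring tool, FAtiMA-Toolkit, is understandable and whether its authoring steps are interpretable from the author's point of view. We conducted two user studies to quantitatively assess the interpretability, comprehensibility, and transparency of FAtiMA-Toolkit from the perspective of a scenario designer. One of the key findings is that FAtiMA-Toolkit's conceptual model is, in general, understandable, but the emotion-based concepts were not as easily understood and used. Although there are some positive aspects regarding the explainability of FAtiMA-Toolkit, progress is still needed to achieve a fully explainable social agent authoring tool. We provide a set of key concepts and possible solutions that can guide developers in building such tools.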

Thread With Caution: Proactively Helping Users Assess and Deescalate Tension in Their Online Discussions

Jonathan P. Chang , Charlotte Schluger , Cristian Danescu-Niculescu-Mizil

Category: Artificial Intelligence | Natural Language Processing

2022-12-02

Incivility remains a major challenge for online discussion platforms, to such an extent that even conversations between well-intentioned users can often derail into uncivil behavior. Traditionally, platforms have relied on moderators to -- with or without algorithmic assistance -- take corrective actions such as removing comments or banning users. In this work we propose a complementary paradigm that directly empowers users by proactively enhancing their awareness about existing tension in the conversation they are engaging in and actively guides them as they are drafting their replies to avoid further escalation. As a proof of concept for this paradigm, we design an algorithmic tool that provides such proactive information directly to users, and conduct a user study in a popular discussion platform. Through a mixed methods approach combining surveys with a randomized controlled experiment, we uncover qualitative and quantitative insights regarding how the participants utilize and react to this information. Most participants report finding this proactive paradigm valuable, noting that it helps them to identify tension that they may have otherwise missed and prompts them to further reflect on their own replies and to revise them. These effects are corroborated by a comparison of how the participants draft their reply when our tool warns them that their conversation is at risk of derailing into uncivil behavior versus in a control condition where the tool is disabled. These preliminary findings highlight the potential of this user-centered paradigm and point to concrete directions for future implementations.

A repeated-measures study on emotional responses after a year in the pandemic

Maximilian Mozes , Isabelle van der Vegt , Bennett Kleinberg

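Category: Natural Language Processing

2021-07-07

The introduction of COVID-19 lockdown measures and the outlook on a return to normality are demanding societal changes. Among the most pressing questions is how individuals adjust to the pandemic. This paper examines emotional responses to the pandemic in a repeated-measures design. Data (n = 1698) were collected in April 2020 (during strict lockdown measures) and in April 2021 (when vaccination programmes were gaining traction). We asked participants to report their emotions and to express them in text. Statistical tests revealed an average trend towards better adjustment to the pandemic. However, clustering analyses suggested a more complex, heterogeneous pattern, with a well-coping subgroup and a resigning subgroup of participants. Computational linguistic analyses found that topics and n-gram frequencies shifted towards attention to the vaccination programme and away from general worrying. Implications for public mental health efforts to identify people at heightened risk are discussed. The dataset is publicly available.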

Testing Human Ability To Detect Deepfake Images of Human Faces

Sergi D. Bray , Shane D. Johnson , Bennett Kleinberg

Category: Computer Vision

2022-12-07

Deepfakes are computationally-created entities that falsely represent reality. They can take image, video, and audio modalities, and pose a threat to many areas of systems and societies, comprising a topic of interest to various aspects of cybersecurity and cybersafety. In 2020 a workshop consulting AI experts from academia, policing, government, the private sector, and state security agencies ranked deepfakes as the most serious AI threat. These experts noted that since fake material can propagate through many uncontrolled routes, changes in citizen behaviour may be the only effective defence. This study aims to assess human ability to identify image deepfakes of human faces (StyleGAN2:FFHQ) from nondeepfake images (FFHQ), and to assess the effectiveness of simple interventions intended to improve detection accuracy. Using an online survey, 280 participants were randomly allocated to one of four groups: a control group, and 3 assistance interventions. Each participant was shown a sequence of 20 images randomly selected from a pool of 50 deepfake and 50 real images of human faces. Participants were asked if each image was AI-generated or not, to report their confidence, and to describe the reasoning behind each response. Overall detection accuracy was only just above chance and none of the interventions significantly improved this. Participants' confidence in their answers was high and unrelated to accuracy. Assessing the results on a per-image basis reveals participants consistently found certain images harder to label correctly, but reported similarly high confidence regardless of the image. Thus, although participant accuracy was 62% overall, this accuracy across images ranged quite evenly between 85% and 30%, with an accuracy of below 50% for one in every five images. We interpret the findings as suggesting that there is a need for an urgent call to action to address this threat.

AI in HCI Design and User Experience

Wei Xu

Category: Artificial Intelligence

2023-01-03

In this chapter, we review and discuss the transformation of AI technology in HCI/UX work and assess how AI technology will change how we do the work. We first discuss how AI can be used to enhance the result of user research and design evaluation. We then discuss how AI technology can be used to enhance HCI/UX design. Finally, we discuss how AI-enabled capabilities can improve UX when users interact with computing systems, applications, and services.

Towards Better User Studies in Computer Graphics and Vision

Zoya Bylinskii , Laura Herman , Aaron Hertzmann , Stefanie Hutka , Yile Zhang

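Category: Computer Vision

2022-06-23

Online crowdsourcing platforms make it easy to evaluate algorithm outputs with surveys that ask questions like "which image is better, A or B?" The proliferation of these "user studies" in vision and graphics research papers has led to an increase in hastily conducted studies that are sloppy and uninformative at best, and potentially harmful and misleading. We argue that more attention needs to be paid to both the design and the reporting of user studies in computer vision and graphics papers. To improve practitioners' knowledge and increase the trustworthiness and replicability of user studies, we provide an overview of methodologies from user experience research (UXR), human-computer interaction (HCI), and related fields. We discuss foundational user research methods (e.g., needfinding) that are currently underutilized in computer vision and graphics research but can provide valuable guidance for research projects. We provide further pointers for readers interested in exploring other UXR methodologies. Finally, we describe broader open issues and recommendations for the research community. We encourage authors and reviewers alike to recognize that not every research contribution requires a user study, and that having no study at all is better than having a carelessly conducted one.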

Participant Perceptions of a Robotic Coach Conducting Positive Psychology Exercises: A Systematic Analysis

Minja Axelsson , Nikhil Churamani , Atahan Caldir , Hatice Gunes

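Category: Robotics

2022-09-08

This paper provides a detailed overview of a case study that applied Continual Learning (CL) to a single-session Human-Robot Interaction (HRI) session (avg. 31 ± 10 minutes), in which a robotic mental well-being coach conducted Positive Psychology (PP) exercises with participants (n = 20). We present the results of a Thematic Analysis (TA) of data recorded from brief semi-structured interviews conducted with participants after the interaction sessions, as well as an analysis of statistical results demonstrating how participants' personalities may affect how they perceive the robot and its interactions.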

A critical appraisal of equity in conversational AI: Evidence from auditing GPT-3's dialogues with different publics on climate change and Black Lives Matter

Kaiping Chen , Anqi Shao , Jirayu Burapacheep , Yixuan Li

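Category: Artificial Intelligence | Natural Language Processing

2022-09-27

Autoregressive language models, which use deep learning to produce human-like text, have become increasingly widespread. Such models power popular virtual assistants in areas such as smart health, finance, and autonomous driving. While the parameters of these large language models are improving, concerns persist that the models might not work equally well for all subgroups in society. Despite growing discussion of AI fairness across disciplines, systematic metrics are lacking for assessing what equity means in dialogue systems and how to engage different populations in the assessment loop. Grounded in theories of deliberative democracy and science and technology studies, this paper proposes an analytical framework for unpacking the meaning of equity in human-AI dialogues. Using this framework, we conducted an auditing study to examine how GPT-3 responded to different sub-populations on crucial science and social topics: climate change and the Black Lives Matter (BLM) movement. Our corpus consists of over 20,000 rounds of dialogue between GPT-3 and 3290 individuals who vary in gender, race and ethnicity, education level, English as a first language, and opinions on the issues. We found a substantively worse user experience with GPT-3 among the opinion-minority and education-minority subpopulations; however, these two groups achieved the largest knowledge gain and changed their attitudes towards supporting BLM and climate change efforts after the chat. We traced these user experience divides to conversational differences and found that GPT-3 used more negative expressions when responding to the education and opinion minority groups than when responding to the majority groups. We discuss the implications of our findings for a deliberative conversational AI system that centers diversity, equity, and inclusion.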

Advancing Human-AI Complementarity: The Impact of User Expertise and Algorithmic Tuning on Joint Decision Making

Kori Inkpen , Shreya Chappidi , Keri Mallari , Besmira Nushi , Divya Ramesh , Pietro Michelucci , Vani Mandava , Libuše Hannah Vepřek , Gabrielle Quinn

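Category: Artificial Intelligence

2022-08-16

Human-AI collaboration in decision making strives to achieve team performance that exceeds the performance of humans or AI alone. However, many factors can affect the success of human-AI teams, including a user's domain expertise, mental model of the AI system, trust in its recommendations, and more. This work examines users' interactions with three simulated algorithmic models, all with similar accuracy but tuned differently on their true positive and true negative rates. Our study examined user performance in a non-trivial blood vessel labeling task in which participants indicated whether a given blood vessel was flowing or stalled. Our results show that while recommendations from an AI assistant can aid user decision making, factors such as users' baseline performance relative to the AI and complementary tuning of AI error types significantly affect overall team performance. Novice users improved, but not to the accuracy level of the AI. Highly proficient users were generally able to discern when they should follow the AI recommendation and typically maintained or improved their performance. Performers with accuracy similar to the AI's showed the most variable response to the AI recommendations. In addition, we found that users' perception of the AI's performance relative to their own also had a significant impact on whether their accuracy improved when they were given AI recommendations. This work provides insights into the complexity of the factors involved in human-AI collaboration and offers recommendations on how to develop human-centered AI algorithms that complement users in decision-making tasks.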

An Empathetic AI Coach for Self-Attachment Therapy

Lisa Alazraki , Ali Ghachem , Neophytos Polydorou , Foaad Khosmood , Abbas Edalat

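Category: Artificial Intelligence | Natural Language Processing | Machine Learning

2022-09-17

In this work, we present a new dataset and a computational strategy for a digital coach that aims to guide users in practicing the protocols of self-attachment therapy. Our framework augments a rule-based conversational agent with a deep-learning classifier that identifies the underlying emotion in a user's text response, as well as a deep-learning-assisted retrieval method for producing novel, fluent, and empathetic utterances. We also craft a set of human-like personas that users can choose to interact with. Our goal is to achieve a high level of engagement during virtual therapy sessions. We evaluate the effectiveness of our framework in a non-clinical trial with n = 16 participants, all of whom had at least four interactions with the agent over the course of five days. We find that our platform is consistently rated higher for empathy, user engagement, and usefulness than a simple rule-based framework. Finally, we provide guidelines for further improving the design and performance of the application in light of the feedback received.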

Dimensional Modeling of Emotions in Text with Appraisal Theories: Corpus Creation, Annotation Reliability, and Prediction

Enrica Troiano , Laura Oberländer , Roman Klinger

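Category: Natural Language Processing

2022-06-10

The most prominent tasks in emotion analysis are to assign emotions to texts and to understand how emotions manifest in language. An important observation for natural language processing is that emotions can be communicated implicitly by referring to events alone, even without explicitly mentioning an emotion name. In psychology, the class of emotion theories known as appraisal theories aims to explain the link between events and emotions. Appraisals can be formalized as variables that measure the cognitive evaluation made by people living through an event they consider relevant. They include the assessment of whether an event is novel, whether the person considers themselves responsible, whether it is in line with their own goals, and many others. Such appraisals explain which emotions develop from an event; for example, a novel situation can induce surprise, and one with uncertain consequences could evoke fear. We analyze the suitability of appraisal theories for emotion analysis in text, with the goal of understanding whether appraisal concepts can be reliably reconstructed by annotators, whether they can be predicted by text classifiers, and whether appraisal concepts help to identify emotion categories. To achieve this, we compile a corpus by asking people to describe in text events that triggered particular emotions and to disclose their appraisals. We then ask readers to reconstruct the emotions and appraisals from the text. This setup allows us to measure whether emotions and appraisals can be recovered purely from text, and it provides a human baseline against which to judge models' performance measures. Our comparison of text classification methods with human annotators shows that both can reliably detect emotions and appraisals with similar performance. We further show that appraisal concepts improve the categorization of emotions in text.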

GPT-3-driven pedagogical agents for training children's curious question-asking skills

Rania Abdelghani , Yen-Hsiang Wang , Xingdi Yuan , Tong Wang , Hélène Sauzéon , Pierre-Yves Oudeyer

Category: Natural Language Processing

2022-11-25

Students' ability to ask curious questions is a crucial skill that improves their learning processes. To train this skill, previous research has used a conversational agent that proposes specific cues to prompt children's curiosity during learning. Despite showing pedagogical efficiency, this method is still limited because it relies on generating these prompts by hand for each educational resource, which can be a very long and costly process. In this context, we leverage advances in the natural language processing field and explore using a large language model (GPT-3) to automate the generation of this agent's curiosity-prompting cues to help children ask more and deeper questions. We then used this study to investigate a different curiosity-prompting behavior for the agent. The study was conducted with 75 students aged between 9 and 10. They interacted with one of three agents: a hand-crafted conversational agent that proposes "closed" manually extracted cues leading to predefined questions, a GPT-3-driven one that proposes the same type of cues, or a GPT-3-driven one that proposes "open" cues that can lead to several possible questions. Results showed similar question-asking performance between children who had the two "closed" agents, but significantly better performance for participants with the "open" agent. Our first results suggest the validity of using GPT-3 to facilitate the implementation of curiosity-stimulating learning technologies. In a second step, we also show that GPT-3 can be efficient in proposing relevant open cues that leave children more autonomy to express their curiosity.
