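Combining human operators and virtual agents (bots) into an effective hybrid system that provides proper customer service is a promising yet challenging prospect. A hybrid system reduces customers' frustration when bots cannot provide appropriate service and customers would prefer to interact with a human operator. Furthermore, we show that the cost and effort of building and maintaining such virtual agents can be reduced by enabling the virtual agents to learn incrementally from the human operators. We employ queueing theory to identify the key parameters that govern the behavior and efficiency of such hybrid systems and to determine the main parameters that should be optimized in order to improve the service. We formally prove, and demonstrate in extensive simulations and a user study, that with appropriate choices such hybrid systems can increase the number of customers served while simultaneously reducing their expected waiting time and increasing their satisfaction.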
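As a rough illustration of how queueing theory relates bot coverage to waiting time, the sketch below computes the expected queueing delay of an M/M/c queue with the Erlang C formula. The abstract does not say which queueing model the authors use, so the M/M/c model, the assumption that bots fully resolve a fixed fraction of requests, and all function and parameter names are hypothetical.

```python
from math import factorial


def erlang_c_wait(arrival_rate: float, service_rate: float, servers: int) -> float:
    """Expected time spent waiting in queue (W_q) for an M/M/c queue."""
    offered_load = arrival_rate / service_rate   # a = lambda / mu
    utilisation = offered_load / servers         # rho = a / c
    if utilisation >= 1:
        raise ValueError("unstable queue: arrivals exceed total service capacity")
    # Erlang C: probability that an arriving customer has to wait.
    tail = offered_load ** servers / factorial(servers) / (1 - utilisation)
    head = sum(offered_load ** k / factorial(k) for k in range(servers))
    p_wait = tail / (head + tail)
    return p_wait / (servers * service_rate - arrival_rate)


def human_queue_wait(arrival_rate: float, bot_resolution_rate: float,
                     service_rate: float, operators: int) -> float:
    """Assumed hybrid model: bots resolve a fraction of requests outright,
    and only the remainder joins the human operators' queue."""
    escalated = arrival_rate * (1 - bot_resolution_rate)
    return erlang_c_wait(escalated, service_rate, operators)
```

Under these assumptions, raising `bot_resolution_rate` lowers the load on the human queue and therefore the expected wait, which is one way to see why the fraction of requests handled by bots is a parameter worth optimizing.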
Translated by Google Translate
Every organization needs to improve its products, services, and processes. In this context, engaging with customers and understanding their journey is essential. Organizations have leveraged various techniques and technologies to support customer engagement, from call centres to chatbots and virtual agents. Recently, these systems have used Machine Learning (ML) and Natural Language Processing (NLP) to analyze large volumes of customer feedback and engagement data. The goal is to understand customers in context and provide meaningful answers across various channels. Despite multiple advances in Conversational Artificial Intelligence (AI) and Recommender Systems (RS), it is still challenging to understand the intent behind customer questions during the customer journey. To address this challenge, in this paper, we study and analyze recent work in Conversational Recommender Systems (CRS) in general and, more specifically, in chatbot-based CRS. We introduce a pipeline to contextualize the input utterances in conversations. We then take the next step towards leveraging reverse feature engineering to link the contextualized input and the learning model to support intent recognition. Since performance evaluation depends on the ML model used, we use transformer-based models to evaluate the proposed approach on a labelled dialogue dataset (MSDialogue) of question-answering interactions between information seekers and answer providers.
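This paper presents the design and implementation of CAIR: a cloud system for knowledge-based autonomous interaction designed for social robots and other conversational agents. The system is particularly convenient for low-cost robots and devices. It offers developers a sustainable solution for managing verbal and non-verbal interaction over a network connection, with about 3,000 conversation topics available for "chit-chat" and a library of pre-cooked plans that only need to be grounded in the robot's physical capabilities. The system is structured as a set of REST API endpoints, so it can easily be extended by adding new APIs to improve the capabilities of the clients connected to the cloud. Another key feature of the system is that it is designed to make client development straightforward: in this way, multiple devices can easily be given the ability to interact autonomously with users, to understand when to perform specific actions, and to exploit all the information provided by cloud services. The article outlines and discusses the results of experiments performed to assess the system's response-time performance, paving the way for research and market solutions. Links to repositories of clients for ROS and for popular robots such as Pepper and Nao are provided.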
Artificial intelligence and natural language processing (NLP) are increasingly being used in customer service to interact with users and answer their questions. The goal of this systematic review is to examine existing research on the use of NLP technology in customer service, including the research domain, applications, datasets used, and evaluation methods. The review also looks at the future direction of the field and any significant limitations. The review covers the time period from 2015 to 2022 and includes papers from five major scientific databases. Chatbots and question-answering systems were found to be used in 10 main fields, with the most common use in general, social networking, and e-commerce areas. Twitter was the second most commonly used dataset, with most research also using their own original datasets. Accuracy, precision, recall, and F1 were the most common evaluation methods. Future work aims to improve the performance and understanding of user behavior and emotions, and address limitations such as the volume, diversity, and quality of datasets. This review includes research on different spoken languages and models and techniques.
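The deployment of Socially Intelligent Agents (SIAs) in learning environments has been shown to offer multiple advantages across different application domains. Social agent authoring tools allow scenario designers to create tailored experiences with a high degree of control over SIA behaviour. This comes at a cost, however, because the complexity of the scenarios and of their authoring can become overwhelming. In this paper, we introduce the concept of explainable social agent authoring tools, with the goal of analysing whether authoring tools for social agents are understandable and interpretable. To this end, we examine whether the authoring tool FAtiMA-Toolkit is understandable and whether its authoring steps are interpretable from the author's perspective. We conducted two user studies to quantitatively assess the interpretability, comprehensibility, and transparency of FAtiMA-Toolkit from the perspective of scenario designers. One of the key findings is that the conceptual model of FAtiMA-Toolkit is generally understandable, but its emotion-based concepts are not as easy to understand and use. Although there are some positive aspects regarding the explainability of FAtiMA-Toolkit, progress is still needed to achieve a fully explainable social agent authoring tool. We provide a set of key concepts and possible solutions that can guide developers in building such tools.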
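Crafting machine learning (ML) based systems requires statistical control throughout their lifecycle. Carefully quantifying business requirements and identifying the key factors that affect them reduces the risk of project failure. Quantifying business requirements leads to the definition of random variables that represent the system's key performance indicators, which need to be analysed through statistical experiments. In addition, the available training and the experimental results influence the design of the system. Once the system is developed, it is tested and continuously monitored to ensure that it meets its business requirements. This is done by continually applying statistical experiments to analyse and control the key performance indicators. This book teaches the art of crafting and developing ML-based systems. It advocates an "experiment first" approach, emphasising the need to define statistical experiments from the start of the project lifecycle. It also discusses in detail how to exercise statistical control over an ML-based system throughout its lifecycle.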
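In the context of digital therapy interventions, such as internet-delivered Cognitive Behavioural Therapy (iCBT) for the treatment of depression and anxiety, extensive research has shown that involving a human supporter or coach to assist the person undergoing treatment improves user engagement with therapy and leads to more effective health outcomes than unsupported interventions. Seeking to maximise the effects and outcomes of this human support, this research investigates how new opportunities offered by recent advances in AI and machine learning (ML) can help to effectively support the work practices of iCBT supporters. This paper reports detailed findings from an interview study with 15 iCBT supporters that deepens understanding of their existing work practices and information needs, with the aim of meaningfully informing the development of useful, implementable ML applications in the context of depression and anxiety treatment. The analysis contributes (1) a set of six themes that summarise the strategies and challenges of iCBT supporters in providing effective, personalised feedback to their mental health clients; and, in response to these learnings, (2) concrete opportunities for each theme showing how ML methods could help to support and address the identified challenges and information needs. It also reflects on the potential social, emotional, and pragmatic implications of introducing new machine-generated data insights into supporter-led client review practices.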
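Question answering systems are recognised as a popular and frequently effective means of seeking information on the web. In such systems, information seekers can receive a concise response to their query by posing questions in natural language. Interactive question answering is a recently proposed and increasingly popular solution that sits at the intersection of question answering and dialogue systems. On the one hand, the user can ask questions in ordinary language and find the actual answer to her inquiry; on the other hand, the system can extend the question-answering session into a dialogue if the initial request has multiple possible replies, very few replies, or ambiguities. By allowing the user to ask more questions, interactive question answering enables users to interact dynamically with the system and obtain more precise results. This survey offers a detailed overview of the interactive question answering methods prevalent in the current literature. It begins by explaining the foundational principles of question answering systems, and then defines new notations and taxonomies to bring all identified works together within a unified framework. The reviewed published work on interactive question answering systems is then presented and examined in terms of the proposed methodology, evaluation approaches, and dataset/application domain. We also describe trends surrounding specific tasks and issues raised by the community, shedding light on the future interests of scholars. Our work is further supported by a GitHub page that synthesises all the major topics covered in this literature study. https://sisinflab.github.io/interactive-question-answering-systems-survey/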
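Chatbots have revolutionised the way humans interact with computer systems, and they have replaced service agents, call-centre representatives, and similar roles. The fitness industry has been growing steadily, although it has not yet adapted to the latest technologies such as AI, ML, and cloud computing. In this paper, we propose developing a chatbot for fitness management using IBM Watson and integrating it with a web application. We propose using Natural Language Processing (NLP) and Natural Language Understanding (NLU) together with the IBM Cloud Watson framework provided for the chatbot assistant. The software uses a serverless architecture and brings together the services of a professional by offering diet plans, home exercises, interactive counselling sessions, and fitness recommendations.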
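In this paper, we show how a Portuguese BERT model can be combined with structured data to deploy a chatbot based on a finite state machine, creating a conversational AI system that helps a real-estate company predict its clients' contact motivation. The model achieves human-level results on a dataset containing 235 unbalanced labels. We then also show its benefits in terms of business impact compared with classical NLP methods.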
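The Internet of Behaviors (IoB) puts human behavior at the core of engineering intelligent connected systems. IoB links the digital world to human behavior in order to establish human-driven design, development, and adaptation processes. This paper defines the novel concept of an IoB model, based on a collective effort involving interaction with software engineers, human-computer interaction scientists, social scientists, and the cognitive science communities. The IoB model is based on an exploratory study that synthesises state-of-the-art analysis and expert interviews. The architecture of a real Industry 4.0 manufacturing infrastructure helps to explain the IoB model and its application. The conceptual model was used to successfully implement a socio-technical infrastructure for a crowd monitoring and queue management system at the Uffizi Gallery in Florence, Italy. The experiment, which began in the fall of 2016 and was operational in the fall of 2018, used a data-driven approach to feed the system with real-time sensory data. It also incorporated prediction models of visitors' mobility behavior. The system's main goal was to capture human behavior, model it, and build a mechanism that takes changes into account, adapts to them in real time, and continuously learns from repetitive behavior. In addition to the conceptual model and the real-life evaluation, this paper provides recommendations from experts and future directions for IoB to become a significant technological advancement in the coming years.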
Intelligent agents have great potential as facilitators of group conversation among older adults. However, little is known about how to design agents for this purpose and user group, especially in terms of agent embodiment. To this end, we conducted a mixed methods study of older adults' reactions to voice and body in a group conversation facilitation agent. Two agent forms with the same underlying artificial intelligence (AI) and voice system were compared: a humanoid robot and a voice assistant. One preliminary study (total n=24) and one experimental study comparing voice and body morphologies (n=36) were conducted with older adults and an experienced human facilitator. Findings revealed that the artificiality of the agent, regardless of its form, was beneficial for the socially uncomfortable task of conversation facilitation. Even so, talkative personality types had a poorer experience with the "bodied" robot version. Design implications and supplementary reactions, especially to agent voice, are also discussed.
Developing safe and useful general-purpose AI systems will require us to make progress on scalable oversight: the problem of supervising systems that potentially outperform us on most skills relevant to the task at hand. Empirical work on this problem is not straightforward, since we do not yet have systems that broadly exceed our abilities. This paper discusses one of the major ways we think about this problem, with a focus on how to turn it into one that can be productively studied empirically. We first present an experimental design centered on choosing tasks for which human specialists succeed but unaided humans and current general AI systems fail. We then present a proof-of-concept experiment meant to demonstrate a key feature of this experimental design and show its viability with two question-answering tasks: MMLU and time-limited QuALITY. On these tasks, we find that human participants who interact with an unreliable large-language-model dialog assistant through chat -- a trivial baseline strategy for scalable oversight -- substantially outperform both the model alone and their own unaided performance. These results are an encouraging sign that scalable oversight will be tractable to study with present models and bolster recent findings that large language models can productively assist humans with difficult tasks.
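Organizations rely on machine learning engineers (MLEs) to operationalize ML, i.e., to deploy and maintain ML pipelines in production. The process of operationalizing ML, or MLOps, consists of a continual loop of (i) data collection and labeling, (ii) experimentation to improve ML performance, (iii) evaluation throughout a multi-staged deployment process, and (iv) monitoring for performance drops. When considered together, these responsibilities seem staggering: how does anyone do MLOps, what are the unaddressed challenges, and what are the implications for tool builders? We conducted semi-structured ethnographic interviews with 18 MLEs working across many applications, including chatbots, autonomous vehicles, and finance. Our interviews expose three variables that govern the success of a production ML deployment: Velocity, Validation, and Versioning. We summarize common practices for successful ML experimentation, deployment, and sustaining production performance. Finally, we discuss interviewees' pain points and anti-patterns, with implications for tool design.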
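Creating agents that can interact naturally with humans is a common goal in artificial intelligence (AI) research. However, evaluating these interactions is challenging: collecting online human-agent interactions is slow and expensive, yet faster proxy metrics often do not correlate well with interactive evaluation. In this paper, we assess the merits of these existing evaluation metrics and present a novel approach to evaluation called the Standardised Test Suite (STS). The STS uses behavioural scenarios mined from real human interaction data. Agents see replayed scenario context, receive an instruction, and are then given control to complete the interaction offline. These agent continuations are recorded and sent to human annotators to mark as success or failure, and agents are ranked according to the proportion of continuations in which they succeed. The resulting STS is fast, controlled, interpretable, and representative of naturalistic interactions. Altogether, the STS consolidates much of what is desirable across many of our standard evaluation metrics, allowing us to accelerate research progress towards producing agents that can interact naturally with humans. See https://youtu.be/yr1tnggorgq for a video.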
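The ranking step described above, ordering agents by the proportion of continuations that annotators marked as successful, can be sketched as follows. The input format and the function name `rank_agents` are assumptions made for illustration; the abstract does not describe the authors' implementation.

```python
from collections import defaultdict


def rank_agents(annotations):
    """Rank agents by their share of successful continuations.

    `annotations` is an iterable of (agent_name, succeeded) pairs, one per
    annotated continuation (a hypothetical input format).
    """
    counts = defaultdict(lambda: [0, 0])  # agent -> [successes, total]
    for agent, succeeded in annotations:
        counts[agent][0] += int(succeeded)
        counts[agent][1] += 1
    rates = {agent: wins / total for agent, (wins, total) in counts.items()}
    # Highest success rate first.
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)
```

For example, `rank_agents([("a", True), ("a", False), ("b", True), ("b", True)])` places agent `b` ahead of agent `a`.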
In this paper, we increase the availability and integration of devices in the learning process to enhance the convergence of federated learning (FL) models. To address the issue of having all the data in one location, federated learning, which maintains the ability to learn over decentralized data sets, combines privacy and technology. Until the model converges, the server combines the updated weights obtained from each dataset over a number of rounds. Most of the literature proposes client selection techniques to accelerate convergence and boost accuracy. However, none of the existing proposals have focused on the flexibility to deploy and select clients as needed, wherever and whenever that may be. Because the environment is extremely dynamic, some devices are not actually available to serve as clients in FL, which affects the availability of data for learning and the applicability of existing client selection solutions. In this paper, we address these limitations by introducing On-Demand-FL, a client deployment approach for FL that offers more volume and heterogeneity of data in the learning process. We use containerization technology such as Docker to build efficient environments on IoT and mobile devices serving as volunteers. Furthermore, Kubernetes is used for orchestration. A genetic algorithm (GA) is used to solve the multi-objective optimization problem due to its evolutionary strategy. Experiments using the Mobile Data Challenge (MDC) dataset and the Localfed framework illustrate the relevance of the proposed approach and the efficiency of deploying clients on the fly, whenever and wherever needed, with fewer discarded rounds and more available data.
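The server-side step in which updated weights from the clients are combined each round can be sketched as a weighted average. Weighting each client by its number of training samples is the standard federated averaging rule and is an assumption here, since the abstract does not state the aggregation rule; the function name and the flat-list weight format are also hypothetical.

```python
def federated_average(client_updates):
    """Combine client weight vectors into one global weight vector.

    `client_updates` is a list of (weights, num_samples) pairs, where
    `weights` is a flat list of floats of equal length for every client.
    """
    total_samples = sum(num_samples for _, num_samples in client_updates)
    if total_samples == 0:
        raise ValueError("no client data available for this round")
    size = len(client_updates[0][0])
    # Each coordinate is the sample-weighted mean across clients.
    return [
        sum(weights[i] * num_samples for weights, num_samples in client_updates)
        / total_samples
        for i in range(size)
    ]
```

In this sketch, a round in which no client is available raises an error and would have to be discarded, which is the situation that deploying clients on demand is meant to make rarer.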
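In ride-hailing systems with electric fleets, charging is a complex decision-making process. Most electric vehicle (EV) taxi services require drivers to make egoistic decisions, leading to decentralised ad-hoc charging strategies. Information about the current state of the mobility system is often missing or not shared between vehicles, which makes it impossible to reach a system-optimal decision. Most existing approaches either do not combine time, location, and duration into a comprehensive control algorithm or are unsuitable for real-time operation. We therefore present a real-time predictive charging method for ride-hailing services with a single operator, called Idle Time Exploitation (ITX), which predicts the periods in which vehicles are idle and exploits these periods to harvest energy. It relies on Graph Convolutional Networks and a linear assignment algorithm to devise an optimal pairing of vehicles and charging stations that maximises the exploited idle time. We evaluated our approach through extensive simulation studies on real-world datasets from New York City. The results show that ITX outperforms all baseline methods by at least 5% (equivalent to $70,000 for a 6,000-vehicle operation) in terms of a monetary reward function that was modelled to replicate the profitability of a real-world ride-hailing system. Moreover, ITX can reduce delays by at least 4.68% compared with the baseline methods and generally increases passenger comfort by spreading customers more evenly across the fleet. Our results also show that ITX enables vehicles to harvest energy during the day, stabilising battery levels and increasing resilience to unexpected surges in demand. Lastly, compared with the best-performing baseline strategy, peak loads are reduced by 17.39%, which benefits grid operators and paves the way for more sustainable use of the electrical grid.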
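The pairing of vehicles with charging stations is a linear assignment problem: choose one distinct station per vehicle so that the total exploited idle time is as large as possible. The sketch below solves a tiny instance by brute force, purely to make the objective concrete. The matrix layout and the name `best_pairing` are hypothetical, and a real system would use a polynomial-time solver such as the Hungarian algorithm instead of enumerating permutations.

```python
from itertools import permutations


def best_pairing(exploited_idle_time):
    """Return (pairs, total) maximising the total exploited idle time.

    `exploited_idle_time[v][s]` is the idle time vehicle v could use for
    charging at station s (an assumed input). Requires at least as many
    stations as vehicles. Brute force, so only suitable for tiny inputs.
    """
    n_vehicles = len(exploited_idle_time)
    n_stations = len(exploited_idle_time[0])
    if n_vehicles > n_stations:
        raise ValueError("need at least one station per vehicle")
    best_total, best_choice = float("-inf"), None
    # Try every way of giving each vehicle a distinct station.
    for stations in permutations(range(n_stations), n_vehicles):
        total = sum(exploited_idle_time[v][s] for v, s in enumerate(stations))
        if total > best_total:
            best_total, best_choice = total, stations
    return list(enumerate(best_choice)), best_total
```

With the matrix `[[10, 4], [9, 1]]`, sending vehicle 0 to station 1 and vehicle 1 to station 0 gives a total of 13, which beats the greedy-looking choice of station 0 for vehicle 0 (total 11).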
In this chapter, we review and discuss the transformation of AI technology in HCI/UX work and assess how AI technology will change how we do the work. We first discuss how AI can be used to enhance the result of user research and design evaluation. We then discuss how AI technology can be used to enhance HCI/UX design. Finally, we discuss how AI-enabled capabilities can improve UX when users interact with computing systems, applications, and services.
Recent work pre-training Transformers with self-supervised objectives on large text corpora has shown great success when fine-tuned on downstream NLP tasks including text summarization. However, pre-training objectives tailored for abstractive text summarization have not been explored. Furthermore, there is a lack of systematic evaluation across diverse domains. In this work, we propose pre-training large Transformer-based encoder-decoder models on massive text corpora with a new self-supervised objective. In PEGASUS, important sentences are removed/masked from an input document and are generated together as one output sequence from the remaining sentences, similar to an extractive summary. We evaluated our best PEGASUS model on 12 downstream summarization tasks spanning news, science, stories, instructions, emails, patents, and legislative bills. Experiments demonstrate it achieves state-of-the-art performance on all 12 downstream datasets measured by ROUGE scores. Our model also shows surprising performance on low-resource summarization, surpassing previous state-of-the-art results on 6 datasets with only 1000 examples. Finally, we validated our results using human evaluation and show that our model summaries achieve human performance on multiple datasets.
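The pre-training objective described above, masking important sentences in the input and generating them as one output sequence, can be illustrated with a small data-preparation sketch. The importance score used here (the fraction of a sentence's words that also occur in the rest of the document) is a simplified stand-in chosen for illustration, and the mask token string and function names are assumptions rather than details given in the abstract.

```python
def sentence_importance(sentence, others):
    """Share of the sentence's distinct words that appear in the other sentences."""
    words = set(sentence.lower().split())
    rest = {word for other in others for word in other.lower().split()}
    return len(words & rest) / len(words) if words else 0.0


def make_gap_sentence_example(sentences, num_masked=1, mask_token="[MASK1]"):
    """Build one (source, target) pair by masking the top-scoring sentences."""
    scores = [
        sentence_importance(s, sentences[:i] + sentences[i + 1:])
        for i, s in enumerate(sentences)
    ]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    masked = sorted(ranked[:num_masked])  # keep document order in the target
    source = " ".join(
        mask_token if i in masked else s for i, s in enumerate(sentences)
    )
    target = " ".join(sentences[i] for i in masked)
    return source, target
```

An encoder-decoder model would then be trained to produce `target` from `source`, so that pre-training resembles the downstream task of writing a summary from a document.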
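Pull requests are a key part of today's collaborative software development and code review process. However, pull requests can also slow down software development when the reviewers or the author do not actively engage with them. In this work, we design an end-to-end service that accelerates overdue pull requests towards completion by reminding the author or the reviewers to engage with their overdue pull requests. First, we use models based on effort estimation and machine learning to predict the completion time of a given pull request. Second, we use activity detection to filter out pull requests that may be overdue but are still seeing sufficient activity. Lastly, we use actor identification to determine who is blocking the pull request and nudge the appropriate actor (author or reviewer). The key novelty of Nudge is that it succeeds in reducing pull request resolution time while ensuring that developers perceive the notifications it sends as useful, at the scale of thousands of repositories. In a randomized trial on 147 repositories in use at Microsoft, Nudge reduced pull request resolution time by 60% for 8,500 pull requests, compared with overdue pull requests for which Nudge did not send a notification. Furthermore, developers who received Nudge notifications rated 73% of them positively. We observed similar results when scaling up the deployment of Nudge to 8,000 repositories at Microsoft, for which Nudge sent 210,000 notifications over a full year. This demonstrates Nudge's ability to scale to thousands of repositories. Lastly, our qualitative analysis of a selection of Nudge notifications indicates areas for future research, such as taking dependencies among pull requests and developer availability into account.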
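The three steps above (completion-time prediction, activity detection, actor identification) can be read as a simple decision pipeline. The sketch below is a minimal illustration under stated assumptions: the predicted completion time is taken as a given input rather than produced by a model, the 24-hour quiet threshold is arbitrary, and the rule that the party who did not act last is the blocker is a hypothetical simplification, not the method used by Nudge.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PullRequest:
    age_hours: float
    predicted_completion_hours: float  # assumed output of an effort model
    hours_since_last_activity: float
    last_actor: str                    # "author" or "reviewer"


def actor_to_nudge(pr: PullRequest, quiet_hours: float = 24.0) -> Optional[str]:
    """Return who should be reminded, or None if no reminder is warranted."""
    # Step 1: only pull requests past their predicted completion time are overdue.
    if pr.age_hours <= pr.predicted_completion_hours:
        return None
    # Step 2: skip overdue pull requests that still show recent activity.
    if pr.hours_since_last_activity < quiet_hours:
        return None
    # Step 3 (hypothetical rule): whoever acted last is waiting on the other party.
    return "reviewer" if pr.last_actor == "author" else "author"
```

Keeping the three checks separate mirrors the order in the text: a reminder is sent only when a pull request is both overdue and quiet, and it goes to the party assumed to be blocking progress.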