Smart Paper Notes

Understanding occupants' behaviour, engagement, emotion, and comfort indoors with heterogeneous sensors and wearables

Nan Gao, Max Marschall, Jane Burry, Simon Watkins, Flora D. Salim

Category: Machine Learning

2021-05-14

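We conducted a field study at a K-12 private school in the suburbs of Melbourne, Australia. The data capture comprised two elements. First, a 5-month longitudinal field study used two outdoor weather stations, together with indoor weather stations in 17 classrooms and temperature sensors on the vents of occupant-controlled room air conditioners; these were collated into individual datasets for each classroom at a 5-minute logging frequency, including additional data on occupant presence. The dataset was used to derive predictive models of how occupants operate room air-conditioning units. Second, in the 4-week cross-sectional study En-Gage, we tracked 23 students and 6 teachers, using wearable sensors to log physiological data and daily surveys to query the occupants' thermal comfort, learning engagement, emotions, and seating behaviours. Overall, the combined dataset can be used to analyse the relationships between indoor/outdoor climates and students' behaviours and mental states on campus, which offers opportunities for the future design of intelligent feedback systems that benefit both students and staff.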


High-resolution synthetic residential energy use profiles for the United States

Swapna Thorve, Young Yun Baek, Samarth Swarup, Henning Mortveit, Achla Marathe, Anil Vullikanti, Madhav Marathe

Category: Artificial Intelligence

2022-10-14

Efficient energy consumption is crucial for achieving sustainable energy goals in the era of climate change and grid modernization. Thus, it is vital to understand how energy is consumed at finer resolutions, such as the household level, in order to plan demand-response events or analyze the impacts of weather, electricity prices, electric vehicles, solar, and occupancy schedules on energy consumption. However, availability of and access to detailed energy-use data, which would enable detailed studies, have been rare. In this paper, we release a unique, large-scale, synthetic, residential energy-use dataset for the residential sector across the contiguous United States covering millions of households. The data comprise hourly energy use profiles for synthetic households, disaggregated into Thermostatically Controlled Loads (TCL) and appliance use. The underlying framework is constructed using a bottom-up approach. Diverse open-source surveys and first principles models are used for end-use modeling. Extensive validation of the synthetic dataset has been conducted through comparisons with reported energy-use data. We present a detailed, open, high-resolution, residential energy-use dataset for the United States.


Cohort comfort models -- Using occupants' similarity to predict personal thermal preference with less data

Matias Quintana, Stefano Schiavon, Federico Tartarini, Joyce Kim, Clayton Miller

Category: Machine Learning

2022-08-05

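We introduce Cohort Comfort Models, a new framework for predicting how new occupants will perceive their thermal environment. Cohort Comfort Models leverage historical data collected from a sample population with some underlying preference similarity to predict the thermal preference responses of new occupants. Our framework can exploit available background information, such as physical characteristics and one-time on-boarding surveys (satisfaction with life scale, highly sensitive person scale, the Big Five personality traits) from the new occupant, as well as physiological and environmental sensor measurements paired with thermal preference responses. We implemented the framework on two publicly available datasets containing longitudinal data from 55 people, comprising more than 6,000 individual thermal comfort surveys. We observed that a Cohort Comfort Model using background information, but no historical data, produced very little change in thermal preference prediction performance. On the other hand, for half and one third of the occupant population of each dataset, Cohort Comfort Models using less historical data from the target occupants increased thermal preference prediction by 8% and 5% on average, and by up to 36% and 46% for some occupants, compared with general-purpose models trained on the whole occupant population. The framework is presented in a data- and site-agnostic manner, and its components are easily tailored to the data available for the occupants and the buildings. Cohort Comfort Models can be an important step towards personalization without the need to develop a personalized model for each new occupant.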


Building Matters: Spatial Variability in Machine Learning Based Thermal Comfort Prediction in Winters

Betty Lala, Srikant Manas Kala, Anmol Rastogi, Kunal Dahiya, Hirozumi Yamaguchi, Aya Hagishima

Category: Machine Learning

2022-06-28

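Thermal comfort in indoor environments has an enormous impact on the health, well-being, and performance of occupants. Given the focus on energy efficiency and the realization of smart buildings, machine learning (ML) is increasingly used for data-driven thermal comfort (TC) prediction. Generally, ML-based solutions are proposed for air-conditioned or HVAC-ventilated buildings, and the models are primarily designed for adults. On the other hand, naturally ventilated (NV) buildings are the norm in most countries. They are also ideal for energy conservation and long-term sustainability goals. However, the indoor environment of NV buildings lacks thermal regulation and varies significantly across spatial contexts. These factors make TC prediction extremely challenging. It is therefore important to determine the impact of the building environment on the performance of TC models. Further, the generalization capability of TC prediction models across different NV indoor spaces needs to be studied. This work addresses these problems. Data were gathered through month-long field experiments conducted in 5 naturally ventilated school buildings, involving 512 primary school students. The impact of spatial variability on student comfort is demonstrated through variation in prediction accuracy (by as much as 71%). The influence of the building environment on TC prediction is also demonstrated through variation in feature importance. In addition, a comparative analysis of spatial variability in model performance is carried out for children (our dataset) and adults (ASHRAE-II database). Finally, the generalization capability of thermal comfort models in NV classrooms is assessed and the major challenges are highlighted.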


Inconsistencies in Measuring Student Engagement in Virtual Learning -- A Critical Review

Shehroz S. Khan, Ali Abedi, Tracey Colella

Category: Computer Vision

2022-08-09

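In recent years, virtual learning has emerged as an alternative to traditional classroom teaching. Student engagement in virtual learning can have a major impact on meeting learning objectives and on program dropout risk. Many measurement instruments are specifically geared to Student Engagement (SE) in virtual learning environments. In this critical review, we analyze these works and highlight inconsistencies in their differing engagement definitions and measurement scales. This diversity among existing studies can be problematic when comparing different annotations and building generalizable predictive models. We further discuss issues with engagement annotations and design flaws. We analyze the existing SE annotation scales along seven dimensions of engagement annotation that we define: sources, data modalities used for annotation, the time when the annotation takes place, the timesteps over which the annotation takes place, level of abstraction, combination, and quantification. One surprising finding is that very few of the reviewed datasets on SE measurement used existing psychometrically validated scales in their annotation. Finally, we discuss some scales from settings other than virtual learning that have the potential to be used for measuring SE in virtual learning.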


GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling Generalization

Xuhai Xu, Han Zhang, Yasaman Sefidgar, Yiyi Ren, Xin Liu, Woosuk Seo, Jennifer Brown, Kevin Kuehn, Mike Merrill, Paula Nurius

Category: Machine Learning | Artificial Intelligence

2022-11-04

Recent research has demonstrated the capability of behavior signals captured by smartphones and wearables for longitudinal behavior modeling. However, there is a lack of a comprehensive public dataset that serves as an open testbed for fair comparison among algorithms. Moreover, prior studies mainly evaluate algorithms using data from a single population within a short period, without measuring the cross-dataset generalizability of these algorithms. We present the first multi-year passive sensing datasets, containing over 700 user-years and 497 unique users' data collected from mobile and wearable sensors, together with a wide range of well-being metrics. Our datasets can support multiple cross-dataset evaluations of behavior modeling algorithms' generalizability across different users and years. As a starting point, we provide the benchmark results of 18 algorithms on the task of depression detection. Our results indicate that both prior depression detection algorithms and domain generalization techniques show potential but need further research to achieve adequate cross-dataset generalizability. We envision our multi-year datasets can support the ML community in developing generalizable longitudinal behavior modeling algorithms.


"Are you okay, honey?": Recognizing Emotions among Couples Managing Diabetes in Daily Life using Multimodal Real-World Smartwatch Data

George Boateng, Prabhakaran Santhanam, Elgar Fleisch, Janina Lüscher, Theresa Pauly, Urte Scholz, Guy Bodenmann, Tobias Kowatsch

Category: Natural Language Processing

2022-08-16

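Couples generally manage chronic diseases together, and the management takes an emotional toll on both patients and their romantic partners. Consequently, recognizing the emotions of each partner in daily life could provide insight into their emotional well-being in chronic disease management. Currently, the process of assessing each partner's emotions is manual, time-intensive, and costly. Although works on emotion recognition among couples exist, none of them has used data collected from couples' interactions in daily life. In this work, we collected 85 hours (1,021 5-minute samples) of real-world multimodal smartwatch sensor data (speech, heart rate, accelerometer, and gyroscope) and self-reported emotion data (n = 612) from 13 couples managing type 2 diabetes in daily life. We extracted physiological, movement, acoustic, and linguistic features, and trained machine learning models (support vector machine and random forest) to recognize each partner's self-reported emotions (valence and arousal). The results of our best models were better than chance, reaching 63.8% for arousal and 78.1% for valence. This work contributes toward building automated emotion recognition systems that would eventually enable partners to monitor their emotions in daily life and enable the delivery of interventions to improve their emotional well-being.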


Are You Comfortable Now: Deep Learning the Temporal Variation in Thermal Comfort in Winters

Betty Lala, Srikant Manas Kala, Anmol Rastogi, Kunal Dahiya, Aya Hagishima

Category: Machine Learning | Artificial Intelligence

2022-08-20

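Indoor thermal comfort in smart buildings has a significant impact on the health and performance of occupants. Consequently, machine learning (ML) is increasingly used to address challenges related to indoor thermal comfort. Temporal variability of thermal comfort perception is an important problem that affects occupant well-being and energy consumption. However, most ML-based thermal comfort studies do not consider temporal aspects such as the time of day, circadian rhythm, and outdoor temperature. This work addresses these problems. It investigates the impact of circadian rhythm and outdoor temperature on the prediction accuracy and classification performance of ML models. The data were gathered through month-long field experiments carried out in 14 classrooms, involving 512 primary school students. Four thermal comfort metrics are considered as the outputs of Deep Neural Network and Support Vector Machine models for the dataset. The effect of temporal variability on school children's comfort is shown through a "time of day" analysis. Temporal variability in prediction accuracy is demonstrated (up to 80%). Furthermore, we show that outdoor temperature (which varies over time) improves the prediction performance of thermal comfort models by up to 30%. The importance of spatio-temporal context is demonstrated by contrasting micro-level (location-specific) and macro-level (6 locations across a city) performance. The most important finding of this work is that, for multiple thermal comfort metrics, prediction accuracy shows a definitive improvement as the time of day and sky illuminance increase.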


Calico: Relocatable On-cloth Wearables with Fast, Reliable, and Precise Locomotion

Anup Sathya, Jiasheng Li, Tauhidur Rahman, Ge Gao, Huaishu Peng

Category: Robotics

2022-08-17

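We explore Calico, a miniature relocatable wearable system with fast and precise locomotion for on-body interaction, actuation, and sensing. Calico consists of a two-wheel robot and a track mechanism, or "railway," on which the robot travels. The robot is self-contained, small in size, and has additional sensor expansion options. The track system allows the robot to move along the user's body and reach any predetermined location. It also includes rotational switches that enable complex routing options when diverging tracks are present. We report the design and implementation of Calico, together with a series of evaluations of system performance. We then present several application scenarios and user studies to understand the potential of Calico as a dance trainer, and explore the qualitative perception of our scenarios to inform future research in this area.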


Camera Measurement of Physiological Vital Signs

Daniel McDuff

Category: Computer Vision | Machine Learning

2021-11-22

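The need for remote tools for healthcare monitoring has never been more apparent. Camera measurement of vital signs uses imaging devices to compute physiological changes by analyzing images of the human body. Building on advances in optics, machine learning, computer vision, and medicine, these techniques have progressed significantly since the invention of digital cameras. This paper presents a comprehensive survey of camera measurement of physiological vital signs, describing the vital signs that can be measured and the computational techniques for doing so. I cover both clinical and non-clinical applications and the challenges that must be overcome for these applications to advance beyond the concept stage. Finally, I describe the current resources (datasets and code) available to the research community and provide a comprehensive webpage (https://cameravitals.github.io/) with links to these resources and a categorized list of all the papers referenced in this article.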


Dimensional Modeling of Emotions in Text with Appraisal Theories: Corpus Creation, Annotation Reliability, and Prediction

Enrica Troiano, Laura Oberländer, Roman Klinger

Category: Natural Language Processing

2022-06-10

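The most prominent tasks in emotion analysis are to assign emotions to texts and to understand how emotions manifest in language. An important observation for natural language processing is that emotions can be communicated implicitly by referring to events alone, even without explicitly mentioning an emotion name. In psychology, the class of emotion theories known as appraisal theories aims to explain the link between events and emotions. Appraisals can be formalized as variables that measure the cognitive evaluation made by people living through an event that they consider relevant. They include the assessment of whether an event is novel, whether the person considers themselves responsible, whether it is in line with their own goals, and many others. Such appraisals explain which emotions develop from an event; for example, a novel situation can induce surprise, and one with uncertain consequences can evoke fear. We analyze the suitability of appraisal theories for emotion analysis in text, with the goal of understanding whether appraisal concepts can be reliably reconstructed by annotators, whether they can be predicted by text classifiers, and whether appraisal concepts help to identify emotion categories. To this end, we compile a corpus by asking people to describe in text events that triggered particular emotions and to disclose their appraisals. We then ask readers to reconstruct the emotions and appraisals from the text. This setup allows us to measure whether emotions and appraisals can be recovered purely from text, and it provides a human baseline against which to judge model performance. Our comparison of text classification methods with human annotators shows that both can reliably detect emotions and appraisals with similar performance. We further show that appraisal concepts improve the categorization of emotions in text.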


MUTLA: A Large-Scale Dataset for Multimodal Teaching and Learning Analytics

Fangli Xu, Lingfei Wu, KP Thai, Carol Hsu, Wei Wang, Richard Tong

Category: (Statistics) Machine Learning

2019-10-05

Automatic analysis of teacher and student interactions could be very important to improve the quality of teaching and student engagement. However, despite some recent progress in utilizing multimodal data for teaching and learning analytics, a thorough analysis of a rich multimodal dataset coming from a complex real learning environment has yet to be done. To bridge this gap, we present a large-scale MUlti-modal Teaching and Learning Analytics (MUTLA) dataset. This dataset includes time-synchronized multimodal data records of students (learning logs, videos, EEG brainwaves) as they work in various subjects from Squirrel AI Learning System (SAIL) to solve problems of varying difficulty levels. The dataset resources include user records from the learner records store of SAIL, brainwave data collected by EEG headset devices, and video data captured by web cameras while students worked in the SAIL products. Our hope is that by analyzing real-world student learning activities, facial expressions, and brainwave patterns, researchers can better predict engagement, which can then be used to improve adaptive learning selection and student learning outcomes. An additional goal is to provide a dataset gathered from real-world educational activities versus those from controlled lab environments to benefit the educational learning community.


Artificial intelligence-driven digital twin of a modern house demonstrated in virtual reality

Elias Mohammed Elfarri, Adil Rasheed, Omer San

Category: Computer Vision

2022-12-14

A digital twin is defined as a virtual representation of a physical asset enabled through data and simulators for real-time prediction, optimization, monitoring, controlling, and improved decision-making. Unfortunately, the term remains vague and says little about its capability. Recently, the concept of capability level has been introduced to address this issue. Based on its capability, the concept states that a digital twin can be categorized on a scale from zero to five, referred to as standalone, descriptive, diagnostic, predictive, prescriptive, and autonomous, respectively. The current work introduces the concept in the context of the built environment. It demonstrates the concept by using a modern house as a use case. The house is equipped with an array of sensors that collect timeseries data regarding the internal state of the house. Together with physics-based and data-driven models, these data are used to develop digital twins at different capability levels demonstrated in virtual reality. The work, in addition to presenting a blueprint for developing digital twins, also provides future research directions to enhance the technology.


Designing A Clinically Applicable Deep Recurrent Model to Identify Neuropsychiatric Symptoms in People Living with Dementia Using In-Home Monitoring Data

Francesca Palermo, Honglin Li, Alexander Capstick, Nan Fletcher-Lloyd, Yuchen Zhao, Samaneh Kouchaki, Ramin Nilforooshan, David Sharp, Payam Barnaghi

Category: Machine Learning | Artificial Intelligence

2021-10-19

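Agitation is one of the neuropsychiatric symptoms with high prevalence in dementia, and it can negatively affect the Activities of Daily Living (ADL) and the independence of individuals. Detecting agitation episodes can help provide People Living with Dementia (PLWD) with early and timely interventions. Analysing agitation episodes will also help identify modifiable factors, such as ambient temperature and sleep, that may cause agitation in an individual. This preliminary study presents a supervised learning model for analysing the risk of agitation in PLWD using in-home monitoring data. The in-home monitoring data include motion sensors, physiological measurements, and the use of kitchen appliances, collected from PLWD between April 2019 and June 2021. We apply a recurrent deep learning model to identify agitation episodes validated and recorded by a clinical monitoring team. We present experiments to assess the efficacy of the proposed model. With the optimal parameters, the proposed model achieves on average 79.78% recall, 27.66% precision, and a 37.64% F1 score, suggesting a good ability to recognise agitation events. We also discuss the use of machine learning models to analyse behavioural patterns from continuous monitoring data, and we explore clinical applicability and the trade-off between sensitivity and specificity in in-home monitoring applications.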


When Creators Meet the Metaverse: A Survey on Computational Arts

Lik-Hang Lee, Zijun Lin, Rui Hu, Zhengya Gong, Abhishek Kumar, Tangyao Li, Sijia Li, Pan Hui

Category: Artificial Intelligence | Machine Learning

2021-11-26

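The metaverse, an enormous virtual-physical cyberspace, has brought artists unprecedented opportunities to blend every corner of our physical surroundings with digital creativity. This article presents a comprehensive survey of computational arts, covering seven critical topics relevant to the metaverse and describing novel artworks in blended virtual-physical realities. The topics first cover the building elements of the metaverse, such as virtual scenes and characters, and auditory and textual elements. Next, several remarkable types of novel creation are reviewed, such as immersive arts, robotic arts, and other user-centric approaches. Finally, we propose several research agendas: democratising computational arts, digital privacy and safety for metaverse artists, ownership recognition for digital artworks, technological challenges, and so on. The survey also serves as introductory material for artists and metaverse technologists who want to begin creating in the realm of surrealistic cyberspace.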


A large-scale and PCR-referenced vocal audio dataset for COVID-19

Jobie Budd, Kieran Baker, Emma Karoune, Harry Coppock, Selina Patel, Ana Tendero Cañadas, Alexander Titcomb, Richard Payne, David Hurley, Sabrina Egglestone

Category: Machine Learning

2022-12-15

The UK COVID-19 Vocal Audio Dataset is designed for the training and evaluation of machine learning models that classify SARS-CoV-2 infection status or associated respiratory symptoms using vocal audio. The UK Health Security Agency recruited voluntary participants through the national Test and Trace programme and the REACT-1 survey in England from March 2021 to March 2022, during dominant transmission of the Alpha and Delta SARS-CoV-2 variants and some Omicron variant sublineages. Audio recordings of volitional coughs, exhalations, and speech were collected in the 'Speak up to help beat coronavirus' digital survey alongside demographic, self-reported symptom and respiratory condition data, and linked to SARS-CoV-2 test results. The UK COVID-19 Vocal Audio Dataset represents the largest collection of SARS-CoV-2 PCR-referenced audio recordings to date. PCR results were linked to 70,794 of 72,999 participants and 24,155 of 25,776 positive cases. Respiratory symptoms were reported by 45.62% of participants. This dataset has additional potential uses for bioacoustics research, with 11.30% of participants reporting asthma, and 27.20% with linked influenza PCR test results.


Multimedia Datasets for Anomaly Detection: A Survey

Pratibha Kumari, Anterpreet Kaur Bedi, Mukesh Saini

Category: Computer Vision

2021-12-10

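Multimedia anomaly datasets play a crucial role in automated surveillance. They have a wide range of applications, from outlier object/situation detection to the detection of life-threatening events. The field has attracted a great deal of research interest for more than a decade and a half, and consequently more and more datasets dedicated to anomalous action and object detection have been created. Tapping these public anomaly datasets enables researchers to build and compare various anomaly detection frameworks on the same input data. This paper presents a comprehensive survey of a variety of video and audio datasets for anomaly detection applications. The survey aims to address the lack of a comprehensive comparison and analysis of public multimedia datasets for anomaly detection. It can also help researchers select the best available dataset for benchmarking frameworks. In addition, we discuss gaps in the existing datasets and offer insights into future directions for developing multimodal anomaly detection datasets.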


ArchABM: an agent-based simulator of human interaction with the built environment. $CO_2$ and viral load analysis for indoor air quality

Iñigo Martinez, Jan L. Bruse, Ane M. Florez-Tapia, Elisabeth Viles, Igor G. Olaizola

Category: Artificial Intelligence

2021-11-02

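Recent evidence suggests that SARS-CoV-2, the virus that caused a global pandemic in 2020, is predominantly transmitted via airborne aerosols in indoor environments. This calls for novel strategies for assessing and controlling a building's indoor air quality (IAQ). IAQ can generally be controlled by ventilation and/or policies that regulate human-building interaction. However, occupants use the rooms of a building in different ways, and it may not be obvious which measure or combination of measures leads to a cost- and energy-effective solution that ensures good IAQ across the entire building. Therefore, in this article, we introduce a novel agent-based simulator, ArchABM, designed to assist in creating new buildings or adapting existing ones by estimating adequate room sizes and ventilation parameters and by testing the effect of policies, while taking into account IAQ as a result of complex human-building interaction patterns. A recently published aerosol model was adapted to calculate time-dependent carbon dioxide ($CO_2$) and virus quanta concentrations in each room, and the $CO_2$ and virus quanta inhaled by each occupant over a day, as a measure of physiological response. Thanks to its modular architecture, ArchABM is flexible with respect to the aerosol model and the building layout, which allows further models to be implemented along with any number and size of rooms, agents, and actions reflecting human-building interaction patterns. We present a use case based on a real floor plan and the working schedules adopted in our research center. This study demonstrates how advanced simulation tools can contribute to improving IAQ across a building, thereby ensuring a healthy indoor environment.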


MobilePhys: Personalized Mobile Camera-Based Contactless Physiological Sensing

Xin Liu, Yuntao Wang, Sinan Xie, Xiaoyu Zhang, Zixian Ma, Daniel McDuff, Shwetak Patel

Category: Computer Vision

2022-01-11

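Camera-based contactless photoplethysmography (PPG) refers to a set of popular techniques for contactless physiological measurement. Current state-of-the-art neural models are typically trained in a supervised manner using videos accompanied by gold-standard physiological measurements. However, they often generalize poorly to out-of-domain examples (i.e., videos unlike those in the training set). Personalizing models can help improve generalizability, but many personalization techniques still require some gold-standard data. To help alleviate this dependency, in this paper we present a novel mobile sensing system called MobilePhys, the first mobile personalized remote physiological sensing system, which leverages both the front and rear cameras of a smartphone to generate high-quality self-supervised labels for training personalized contactless camera-based PPG models. To evaluate the robustness of MobilePhys, we conducted a user study with 39 participants who completed a set of tasks under different mobile devices, lighting conditions/intensities, motion tasks, and skin types. Our results show that MobilePhys significantly outperforms state-of-the-art on-device supervised training and few-shot adaptation methods. Through extensive user studies, we further examine how MobilePhys performs in complex real-world settings. We envision that calibrated or personalized camera-based contactless PPG models generated by our proposed dual-camera mobile sensing system will open the door to numerous future applications, such as smart mirrors, fitness, and mobile health applications.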


PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu

Category:

2019-12-18

Recent work pre-training Transformers with self-supervised objectives on large text corpora has shown great success when fine-tuned on downstream NLP tasks including text summarization. However, pre-training objectives tailored for abstractive text summarization have not been explored. Furthermore, there is a lack of systematic evaluation across diverse domains. In this work, we propose pre-training large Transformer-based encoder-decoder models on massive text corpora with a new self-supervised objective. In PEGASUS, important sentences are removed/masked from an input document and are generated together as one output sequence from the remaining sentences, similar to an extractive summary. We evaluated our best PEGASUS model on 12 downstream summarization tasks spanning news, science, stories, instructions, emails, patents, and legislative bills. Experiments demonstrate it achieves state-of-the-art performance on all 12 downstream datasets measured by ROUGE scores. Our model also shows surprising performance on low-resource summarization, surpassing previous state-of-the-art results on 6 datasets with only 1000 examples. Finally, we validated our results using human evaluation and show that our model summaries achieve human performance on multiple datasets.
