Electronic health records (EHRs) are an essential part of modern medical systems, affecting healthcare delivery, operations, and research. Despite the structured information available in EHRs, unstructured text has attracted much attention and has become an exciting research field. The success of recent neural natural language processing (NLP) methods has opened new directions for processing unstructured clinical notes. In this work, we create EHRKit, a Python library for clinical text. The library contains two main parts: MIMIC-III-specific functions and task-specific functions. The first part introduces a list of interfaces for accessing MIMIC-III NOTEEVENTS data, including basic search, information retrieval, and information extraction. The second part integrates many third-party libraries for up to 12 off-the-shelf NLP tasks, such as named entity recognition, summarization, and machine translation.
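The abstract gives no usage code, so the following is only a hypothetical illustration of the kind of off-the-shelf tasks such a toolkit wraps; it uses Hugging Face `transformers` pipelines (not the EHRKit API), and the clinical note is synthetic.

```python
# Hypothetical illustration of off-the-shelf clinical NLP tasks (NER, summarization)
# via Hugging Face pipelines; this is NOT the EHRKit API, and the note is synthetic.
from transformers import pipeline

note = (
    "Patient is a 67-year-old male admitted with chest pain. "
    "History of hypertension and type 2 diabetes. Started on aspirin and metoprolol."
)

# Generic NER pipeline (a clinical-domain model could be substituted here).
ner = pipeline("token-classification", aggregation_strategy="simple")
for ent in ner(note):
    print(ent["entity_group"], ent["word"], round(ent["score"], 3))

# Abstractive summarization of the note.
summarizer = pipeline("summarization")
print(summarizer(note, max_length=40, min_length=10)[0]["summary_text"])
```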
Research on text summarization for low-resource Indian languages has been limited by the scarcity of relevant datasets. This paper presents a summary of various deep-learning approaches used for the ILSUM 2022 Indic language summarization datasets. The ILSUM 2022 dataset consists of news articles written in Indian English, Hindi, and Gujarati, along with their ground-truth summaries. In our work, we explore different pre-trained seq2seq models and fine-tune them on the ILSUM 2022 datasets. In our case, the fine-tuned PEGASUS model worked best for English, the fine-tuned IndicBART model with augmented data for Hindi, and the fine-tuned PEGASUS model combined with a translation mapping-based approach for Gujarati. The generated summaries were evaluated using ROUGE-1, ROUGE-2, and ROUGE-4 as the evaluation metrics.
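As a minimal sketch of this workflow (not the authors' code), the snippet below generates a summary with a pre-trained PEGASUS checkpoint and scores it with ROUGE, assuming Hugging Face `transformers` and the `rouge_score` package; fine-tuning on ILSUM 2022 would replace the raw checkpoint, and the article text is a placeholder.

```python
# Sketch: summarize with a pre-trained PEGASUS checkpoint and score with ROUGE.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from rouge_score import rouge_scorer

model_name = "google/pegasus-xsum"  # assumed general-purpose checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

article = "The state government announced a new metro line connecting the airport to the city centre."
reference = "New metro line to link airport with city centre."

inputs = tokenizer(article, truncation=True, return_tensors="pt")
summary_ids = model.generate(**inputs, max_new_tokens=48, num_beams=4)
prediction = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2"], use_stemmer=True)
print(prediction)
print(scorer.score(reference, prediction))
```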
Natural language processing (NLP) is an area of artificial intelligence that applies information technologies to process human language, understand it to a certain degree, and use it in various applications. The area has evolved rapidly over the past few years and now employs modern variants of deep neural networks to extract relevant patterns from large text corpora. The main objective of this work is to survey the recent use of NLP in the field of pharmacology. As our work shows, NLP is a highly relevant information extraction and processing approach for pharmacology. It has been used extensively, from intelligent searches through thousands of medical documents to finding traces of adverse drug interactions in social media. We split our coverage into five categories to survey modern NLP methodology: common tasks, relevant text data, knowledge bases, and useful programming libraries. We divide each of the five categories into appropriate subcategories, describe their main properties and ideas, and summarize them in tabular form. The resulting survey presents a comprehensive overview of the field, useful to practitioners and interested observers.
Since radiology reports and studies required for clinical practice are written and stored as free-text narratives, it is difficult to extract relevant information for further analysis. In this context, natural language processing (NLP) techniques can facilitate automatic information extraction and the transformation of free text into structured data. In recent years, deep learning (DL)-based models have been applied in NLP experiments with encouraging results. Although DL models based on artificial neural networks (ANN) and convolutional neural networks (CNN) show significant potential, these models still face some limitations for implementation in clinical practice. Transformers, another new DL architecture, have been increasingly used to improve this process. Therefore, in this study we propose a transformer-based fine-grained named entity recognition (NER) architecture for clinical information extraction. We collected 88 abdominal ultrasound reports in free-text format and annotated them according to an information schema we developed. The text-to-text transfer transformer model (T5) and covive, a pre-trained domain-specific adaptation of the T5 model, were fine-tuned to extract entities and relations and to convert the input into a structured format. Our transformer-based model in this study outperformed previously applied approaches such as the ANN and CNN models, with ROUGE-1, ROUGE-2, ROUGE-L, and BLEU scores of 0.816, 0.668, 0.528, and 0.743, respectively, while also providing an interpretable structured report.
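The sketch below illustrates the general idea of casting clinical NER as text-to-text generation with T5 using Hugging Face `transformers`; the prompt prefix and the structured target format shown in the comment are invented for illustration and are not the authors' annotation schema.

```python
# Hedged sketch: fine-grained clinical NER framed as text-to-text generation with T5.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

report = "Liver shows a 12 mm hyperechoic lesion in segment VII. Gallbladder is unremarkable."
# A fine-tuned model would be trained to emit structured output such as:
#   organ: liver | finding: hyperechoic lesion | size: 12 mm | location: segment VII
prompt = "extract entities: " + report

inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```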
To facilitate research on text generation, this paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs). To be comprehensive, our library covers 13 common text generation tasks and their corresponding 83 datasets and further incorporates 45 PLMs covering general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight PLMs. We also implement 4 efficient training strategies and provide 4 generation objectives for pre-training new PLMs from scratch. To be unified, we design the interfaces to support the entire research pipeline (from data loading to training and evaluation), ensuring that each step can be fulfilled in a unified way. Despite the rich functionality, it is easy to use our library, either through the friendly Python API or command line. To validate the effectiveness of our library, we conduct extensive experiments and exemplify four types of research scenarios. The project is released at the link: https://github.com/RUCAIBox/TextBox.
A classification scheme for a scientific subject gives an overview of its body of knowledge. It can also be used to facilitate access to research articles and other materials related to the subject. For example, the ACM Computing Classification System (CCS) is used in the ACM Digital Library search interface and for indexing computer science papers. We observe that no comprehensive classification system like the CCS or the Mathematics Subject Classification (MSC) exists for computational linguistics (CL) and natural language processing (NLP). We propose CLICKER, a classification scheme for CL/NLP based on the analysis of online lectures from 77 university courses on the subject. The currently proposed taxonomy includes 334 topics and focuses on the educational aspects of CL/NLP; it is based mainly, but not exclusively, on lecture notes from NLP courses. We discuss how such a classification system can help in various real-world applications, including tutoring platforms, resource retrieval, resource recommendation, prerequisite chain learning, and survey generation.
Health literacy has emerged as a crucial factor in making appropriate health decisions and ensuring treatment outcomes. However, medical jargon and the complex structure of professional language in this domain make health information especially hard to interpret. Thus, there is an urgent need for automated methods that enhance the accessibility of the biomedical literature for the general population. This problem can be framed as a type of translation problem between the language of healthcare professionals and that of the general public. In this paper, we introduce the novel task of automated generation of lay language summaries of biomedical scientific reviews and construct a dataset to support the development and evaluation of automated methods for making the biomedical literature more accessible. We analyze the various challenges of this task, including not only summarization of the key points but also explanation of background knowledge and simplification of professional language. We experiment with state-of-the-art summarization models as well as several data augmentation techniques, and evaluate their performance using both automated metrics and human assessment. Results indicate that automatically generated summaries produced with contemporary neural architectures can achieve promising quality and readability compared with reference summaries developed for the lay public by experts (best ROUGE-L of 50.24 and Flesch-Kincaid readability score of 13.30). We also discuss the limitations of the current attempt, providing insights and directions for future work.
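As a minimal sketch of the automatic evaluation side described above, the snippet computes ROUGE-L against a reference lay summary and the Flesch-Kincaid grade level, assuming the `rouge_score` and `textstat` packages; the texts are placeholders, not the paper's data.

```python
# Sketch: ROUGE-L plus Flesch-Kincaid readability for a generated lay summary.
from rouge_score import rouge_scorer
import textstat

reference = "This review found that regular exercise helps lower blood pressure in most adults."
generated = "The review shows that exercising regularly reduces blood pressure for most adults."

rouge_l = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True).score(reference, generated)["rougeL"]
print(f"ROUGE-L F1: {rouge_l.fmeasure:.3f}")

# A lower grade level indicates more accessible text (the paper reports 13.30).
print(f"Flesch-Kincaid grade: {textstat.flesch_kincaid_grade(generated):.2f}")
```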
The publication rates are skyrocketing across many fields of science, and it is difficult to stay up to date with the latest research. This makes automatically summarizing the latest findings and helping scholars to synthesize related work in a given area an attractive research objective. In this paper we study the problem of citation text generation, where given a set of cited papers and citing context the model should generate a citation text. While citation text generation has been tackled in prior work, existing studies use different datasets and task definitions, which makes it hard to study citation text generation systematically. To address this, we propose CiteBench: a benchmark for citation text generation that unifies the previous datasets and enables standardized evaluation of citation text generation models across task settings and domains. Using the new benchmark, we investigate the performance of multiple strong baselines, test their transferability between the datasets, and deliver new insights into task definition and evaluation to guide the future research in citation text generation. We make CiteBench publicly available at https://github.com/UKPLab/citebench.
The internet has had a dramatic effect on the healthcare industry, allowing documents to be saved, shared, and managed digitally. This has made it easier to locate and share important data, improving patient care and providing more opportunities for medical studies. As there is so much data accessible to doctors and patients alike, summarizing it has become increasingly necessary; this has been supported by the introduction of deep learning and transformer-based networks, which have boosted the sector significantly in recent years. This paper gives a comprehensive survey of the current techniques and trends in medical summarization.
Recent lay language generation systems have used Transformer models trained on a parallel corpus to increase health information accessibility. However, the applicability of these models is constrained by the limited size and topical breadth of available corpora. We introduce CELLS, the largest (63k pairs) and broadest-ranging (12 journals) parallel corpus for lay language generation. The abstract and the corresponding lay language summary are written by domain experts, assuring the quality of our dataset. Furthermore, qualitative evaluation of expert-authored plain language summaries has revealed background explanation as a key strategy to increase accessibility. Such explanation is challenging for neural models to generate because it goes beyond simplification by adding content absent from the source. We derive two specialized paired corpora from CELLS to address key challenges in lay language generation: generating background explanations and simplifying the original abstract. We adopt retrieval-augmented models as an intuitive fit for the task of background explanation generation, and show improvements in summary quality and simplicity while maintaining factual correctness. Taken together, this work presents the first comprehensive study of background explanation for lay language generation, paving the path for disseminating scientific knowledge to a broader audience. CELLS is publicly available at: https://github.com/LinguisticAnomalies/pls_retrieval.
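The sketch below illustrates only the retrieval step of retrieval-augmented generation: rank candidate background passages by embedding similarity to the abstract and prepend the best ones to the generator input. It uses `sentence-transformers`; the encoder name, passages, and top-k value are illustrative assumptions, not the paper's setup.

```python
# Hedged sketch of retrieval-augmented input construction for background explanation.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

abstract = "We show that inhibiting the JAK/STAT pathway reduces inflammation in murine models."
background_corpus = [
    "The JAK/STAT pathway is a chain of protein interactions that transmits signals inside a cell.",
    "Inflammation is the body's response to injury or infection.",
    "Mice are commonly used as model organisms in biomedical research.",
]

abstract_emb = encoder.encode(abstract, convert_to_tensor=True)
corpus_emb = encoder.encode(background_corpus, convert_to_tensor=True)

hits = util.semantic_search(abstract_emb, corpus_emb, top_k=2)[0]
retrieved = [background_corpus[h["corpus_id"]] for h in hits]

# The generator (e.g., a seq2seq model) would receive the retrieved background
# prepended to the abstract as its input.
generator_input = " ".join(retrieved) + " " + abstract
print(generator_input)
```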
The current state of adoption of well-structured electronic health records and the integration of digital methods for storing patient data in structured form is often considered inferior compared to traditional documentation of patient data as unstructured text. Data mining for medical data analysis therefore often has to rely solely on processing unstructured data to retrieve relevant information. In natural language processing (NLP), statistical models have been shown to be successful in various tasks such as part-of-speech tagging, relation extraction (RE), and named entity recognition (NER). In this work, we present the first open neural NLP model for the NER task dedicated to detecting medical entity types in German text data. Here, we avoid the conflict between protecting sensitive patient data and publishing the statistical model weights by training our model on a custom dataset that was translated from publicly available datasets by a pre-trained neural machine translation model. The sample code and the statistical model are available at: https://github.com/frankkramer-lab/germed
Automatically summarizing a patient's main problems from daily progress notes using natural language processing methods helps combat information and cognitive overload in hospital settings and can potentially provide clinicians with computerized diagnostic decision support. Problem list summarization requires a model to understand, abstract, and generate clinical documentation. In this work, we propose a new NLP task that aims to generate a list of problems in a patient's daily care plan using input from the provider's progress notes during hospitalization. We investigate the performance of two state-of-the-art seq2seq transformer architectures, T5 and BART, on this problem. We provide a corpus built on the publicly available electronic health record progress notes in the Medical Information Mart for Intensive Care (MIMIC)-III. T5 and BART are trained on general-domain text, and we experiment with a data augmentation method and a domain-adaptation pre-training method to increase exposure to medical vocabulary and knowledge. Evaluation methods include ROUGE, BERTScore, cosine similarity of sentence embeddings, and F-score on medical concepts. Results show that T5 with domain-adaptive pre-training achieves significant performance gains compared to a rule-based system and general-domain pre-trained language models, indicating a promising direction for tackling the problem list summarization task.
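As a minimal sketch of one of the evaluation methods listed above, the snippet scores a generated problem list against a reference with BERTScore, assuming the `bert-score` package; the strings are synthetic, not MIMIC-III data.

```python
# Sketch: BERTScore between a generated problem list and a reference problem list.
from bert_score import score

predictions = ["acute kidney injury; community-acquired pneumonia; type 2 diabetes"]
references = ["pneumonia; acute renal failure; diabetes mellitus type 2"]

P, R, F1 = score(predictions, references, lang="en", verbose=False)
print(f"BERTScore F1: {F1.mean().item():.3f}")
```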
In recent years, there has been particular interest in using electronic medical records (EMRs) for secondary purposes to enhance the quality and safety of healthcare delivery. EMRs tend to contain large amounts of valuable clinical notes. Learning embeddings is a way of converting these notes into a format that makes them comparable. Transformer-based representation models have recently made a great leap forward; these models are pre-trained on large online datasets to understand natural language text effectively. The quality of the learned embeddings is influenced by how the clinical notes are used as input to the representation models. A clinical note has several sections with different levels of informational value, and it is also common for healthcare providers to use different expressions for the same concept. Existing methods use the clinical notes directly, or after basic preprocessing, as input to the representation models. In contrast, to learn good embeddings, we identify the most informative sections of the clinical notes, then map the concepts extracted from the selected sections to their standard names in the Unified Medical Language System (UMLS). We use the standard phrases corresponding to the unique concepts as input for the clinical models. We conducted experiments to measure the usefulness of the learned embedding vectors on the task of hospital mortality prediction on a subset of the publicly available Medical Information Mart for Intensive Care (MIMIC-III) dataset. According to the experiments, clinical transformer-based representation models produced better results with input generated from the standard names of the extracted unique concepts than with other input formats. The best-performing models were BioBERT, PubMedBERT, and UMLSBERT, respectively.
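The sketch below shows one way to map note text to UMLS canonical names, using scispaCy's UMLS entity linker; scispaCy is an assumption here (the abstract does not name its concept-extraction tool), the `en_core_sci_sm` model must be installed separately, and the linker downloads a large knowledge base on first use.

```python
# Hedged sketch: map clinical note text to UMLS canonical names with scispaCy.
import spacy
from scispacy.abbreviation import AbbreviationDetector  # noqa: F401 (registers "abbreviation_detector")
from scispacy.linking import EntityLinker  # noqa: F401 (registers "scispacy_linker")

nlp = spacy.load("en_core_sci_sm")
nlp.add_pipe("abbreviation_detector")
nlp.add_pipe("scispacy_linker", config={"resolve_abbreviations": True, "linker_name": "umls"})
linker = nlp.get_pipe("scispacy_linker")

text = "Pt with CHF and afib, on warfarin."
doc = nlp(text)

standard_names = []
for ent in doc.ents:
    if ent._.kb_ents:                       # (CUI, score) candidates from the linker
        cui, _ = ent._.kb_ents[0]
        standard_names.append(linker.kb.cui_to_entity[cui].canonical_name)

# The standard phrases, rather than the raw note text, become the input
# to the clinical representation model (e.g., BioBERT).
print(" ; ".join(standard_names))
```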
We present a statistical model for German medical natural language processing, trained for named entity recognition (NER), as an open, publicly available model. The work is a refined successor to our first GERNERMED model, which it substantially outperforms. We demonstrate the effectiveness of combining multiple techniques to improve entity recognition performance by means of transfer learning on pre-trained deep language models (LMs), word alignment, and neural machine translation. Given the scarcity of open, public medical entity recognition models for German text, this work offers the German medical NLP research community a useful baseline model. Since our model is based on public English data, its weights are provided without legal restrictions on use and distribution. The sample code and the statistical models are available at: https://github.com/frankkramer-lab/gernermed-pp
Information extraction from scholarly articles is a challenging task due to the sizable document length and implicit information hidden in text, figures, and citations. Scholarly information extraction has various applications in exploration, archival, and curation services for digital libraries and knowledge management systems. We present MORTY, an information extraction technique that creates structured summaries of text from scholarly articles. Our approach condenses the article's full-text to property-value pairs as a segmented text snippet called structured summary. We also present a sizable scholarly dataset combining structured summaries retrieved from a scholarly knowledge graph and corresponding publicly available scientific articles, which we openly publish as a resource for the research community. Our results show that structured summarization is a suitable approach for targeted information extraction that complements other commonly used methods such as question answering and named entity recognition.
Pre-trained language models (PLMs) often take advantage of monolingual and multilingual datasets that are freely available online to acquire general or mixed-domain knowledge before being deployed to specific tasks. Extra-large PLMs (xLPLMs) have recently been proposed, claiming top performance over smaller-sized PLMs on tasks such as machine translation (MT); these xLPLMs include Meta-AI's wmt21-dense-24-wide-en-X and NLLB. In this work, we examine whether xLPLMs are absolutely superior to smaller-sized PLMs when fine-tuned for domain-specific MT. We use two in-domain datasets of different sizes: commercial automotive in-house data and clinical shared-task data from the ClinSpEn2022 challenge at WMT2022. We choose the popular Marian Helsinki as the smaller-sized PLM and two massive transformers from Meta-AI as xLPLMs. Our experimental investigation shows that 1) on the smaller-sized in-domain commercial automotive data, the xLPLM wmt21-dense-24-wide-en-X indeed achieves much better evaluation scores than the smaller-sized Marian on the SacreBLEU and hLEPOR metrics, even though its rate of score improvement after fine-tuning is lower than Marian's; 2) when fine-tuned on the relatively larger, carefully prepared clinical data, the xLPLM NLLB tends to lose its advantage over the smaller-sized Marian on two sub-tasks (clinical terms and ontology concepts) under the metrics provided by ClinSpEn (METEOR, COMET, and ROUGE-L), and loses to Marian entirely on all metrics, including SacreBLEU and BLEU; 3) the metrics do not always agree with each other on the same tasks using the same model outputs.
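As a minimal sketch of the evaluation step described above, the snippet computes a corpus-level SacreBLEU score for a set of MT hypotheses, assuming the `sacrebleu` package; the sentence pairs are placeholders, not ClinSpEn2022 data.

```python
# Sketch: corpus-level SacreBLEU scoring of machine translation output.
import sacrebleu

hypotheses = [
    "The patient was discharged with oral antibiotics.",
    "Renal function remained stable during the stay.",
]
references = [[
    "The patient was discharged on oral antibiotics.",
    "Kidney function remained stable during the hospital stay.",
]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"SacreBLEU: {bleu.score:.2f}")
```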
Long document summarization is an important and difficult task in the field of natural language processing. Good performance on long document summarization reveals a model's understanding of human language. Currently, most research focuses on how to modify the attention mechanism of transformers to achieve higher ROUGE scores; research on data pre-processing and post-processing is relatively scarce. In this paper, we use two pre-processing methods and a post-processing method and analyze the effects of these methods on various long document summarization models.
Spoken language understanding (SLU) tasks have been studied for many decades in the speech research community, but have not received as much attention as lower-level tasks like speech and speaker recognition. In particular, there are not nearly as many SLU task benchmarks, and many of the existing ones use data that is not freely available to all researchers. Recent work has begun to introduce such benchmark datasets for several tasks. In this work, we introduce several new annotated SLU benchmark tasks based on freely available speech data, which complement existing benchmarks and address gaps in the SLU evaluation landscape. We contribute four tasks: question answering and summarization involve inference over longer speech sequences; named entity localization addresses the speech-specific task of locating the targeted content in the signal; dialog act classification identifies the function of a given speech utterance. We follow the blueprint of the Spoken Language Understanding Evaluation (SLUE) benchmark suite. In order to facilitate the development of SLU models that leverage the success of pre-trained speech representations, we will be publishing for each task (i) annotations for a relatively small fine-tuning set, (ii) annotated development and test sets, and (iii) baseline models for easy reproducibility and comparisons. In this work, we present the details of data collection and annotation and the performance of the baseline models. We also perform sensitivity analysis of pipeline models' performance (speech recognizer + text model) to the speech recognition accuracy, using more than 20 state-of-the-art speech recognition models.
Automated summarization of clinical texts can reduce the burden on medical professionals. Discharge summaries are one promising application of summarization because they can be generated from daily inpatient records. Our preliminary experiment suggests that 20-31% of the descriptions in discharge summaries overlap with the content of the inpatient records. However, it remains unclear how summaries should be generated from such unstructured sources. To decompose the physician's summarization process, this study aimed to identify the optimal granularity for summarization. We first defined three types of summarization units with different granularities to compare the performance of discharge summary generation: whole sentences, clinical segments, and clauses. We defined clinical segments in this study with the aim of expressing the smallest medically meaningful concepts. To obtain clinical segments, the text must be split automatically in the first stage of the pipeline; accordingly, we compared rule-based methods with a machine learning method, and the latter outperformed the former with an F1 score of 0.846 on the splitting task. Next, we experimentally measured the accuracy of extractive summarization using the three types of units, based on the ROUGE-1 metric, on a multi-institutional national collection of health records in Japan. The measured accuracies of extractive summarization using whole sentences, clinical segments, and clauses were 31.91, 36.15, and 25.18, respectively. We found that clinical segments yielded higher accuracy than both sentences and clauses. This result indicates that summarization of inpatient records requires a finer granularity than sentence-oriented processing. Although we used only Japanese health records, the result can be interpreted as follows: physicians extract "medically meaningful concepts" from patient records and recombine them...
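The sketch below illustrates the comparison idea in a simplified form: greedily select source units that maximize ROUGE-1 F1 against a reference summary, so that different unit granularities (sentences vs. finer segments) can be compared. The `rouge_score` package, the English placeholder text, and the naive unit list are illustrative assumptions; the study itself works on Japanese records with a learned segmenter.

```python
# Hedged sketch: greedy extractive selection of units scored by ROUGE-1 F1.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1"], use_stemmer=True)

def greedy_extract(units, reference, budget=2):
    """Greedily add units while they improve ROUGE-1 F1, up to `budget` units."""
    selected, best = [], 0.0
    for _ in range(budget):
        gains = []
        for u in units:
            if u in selected:
                continue
            cand = " ".join(selected + [u])
            gains.append((scorer.score(reference, cand)["rouge1"].fmeasure, u))
        if not gains:
            break
        score, unit = max(gains)
        if score <= best:
            break
        best, selected = score, selected + [unit]
    return selected, best

record_sentences = [
    "Patient admitted with fever and productive cough.",
    "Chest X-ray showed right lower lobe infiltrate.",
    "Family visited in the afternoon.",
]
reference = "Admitted with fever and cough; X-ray showed right lower lobe pneumonia."

units, f1 = greedy_extract(record_sentences, reference)
print(units, round(f1, 3))
```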
Translating training data into many languages has emerged as a practical solution for improving cross-lingual transfer. For tasks that involve span-level annotations, such as information extraction or question answering, an additional label projection step is required to map annotated spans onto the translated texts. Recently, a few efforts have utilized a simple mark-then-translate method to jointly perform translation and projection by inserting special markers around the labeled spans in the original sentence. However, as far as we are aware, no empirical analysis has been conducted on how this approach compares to traditional annotation projection based on word alignment. In this paper, we present an extensive empirical study across 42 languages and three tasks (QA, NER, and Event Extraction) to evaluate the effectiveness and limitations of both methods, filling an important gap in the literature. Experimental results show that our optimized version of mark-then-translate, which we call EasyProject, is easily applied to many languages and works surprisingly well, outperforming the more complex word alignment-based methods. We analyze several key factors that affect end-task performance, and show EasyProject works well because it can accurately preserve label span boundaries after translation. We will publicly release all our code and data.
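The sketch below illustrates the mark-then-translate idea in a minimal form: wrap annotated spans in markers, translate the marked sentence with an off-the-shelf NMT model, and read the projected spans back out of the translation. The square-bracket marker format and the MarianMT checkpoint are illustrative choices, not the EasyProject implementation, and the sketch assumes the markers survive translation.

```python
# Hedged sketch of mark-then-translate label projection for span annotations.
import re
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-de"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

sentence = "Barack Obama visited Berlin in 2013."
spans = [("Barack Obama", "PER"), ("Berlin", "LOC")]  # gold annotations on the source

# 1) Insert markers around each labeled span.
marked = sentence
for text, _label in spans:
    marked = marked.replace(text, f"[ {text} ]")

# 2) Translate the marked sentence; the method assumes markers survive translation.
batch = tokenizer([marked], return_tensors="pt")
translated = tokenizer.decode(model.generate(**batch)[0], skip_special_tokens=True)

# 3) Read the projected spans back out in source-annotation order.
projected = re.findall(r"\[\s*(.*?)\s*\]", translated)
for (src, label), tgt in zip(spans, projected):
    print(f"{label}: '{src}' -> '{tgt}'")
print(translated)
```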