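Natural language processing (NLP) is increasingly used to provide adaptivity in educational applications. However, recent research has highlighted a variety of biases in pre-trained language models. While existing studies investigate bias in different domains, they are limited in addressing fine-grained analysis of educational and multilingual corpora. In this work, we analyze bias across text and through multiple architectures on a corpus of 9,165 German peer reviews collected from university students over five years. Notably, our corpus includes labels such as helpfulness, quality, and critical-aspect ratings from the peer-review recipient, as well as demographic attributes. We conduct a Word Embedding Association Test (WEAT) analysis on (1) our collected corpus in connection with the clustered labels, (2) the most common pre-trained German language models (T5, BERT, and GPT-2) and GloVe embeddings, and (3) the language models after fine-tuning on our collected dataset. Contrary to our initial expectations, we found that our collected corpus does not reveal many biases in the co-occurrence analysis or in the GloVe embeddings. However, the pre-trained German language models exhibit substantial conceptual, racial, and gender bias, and their bias changes significantly along the conceptual and racial axes during fine-tuning on the peer-review data. With our research, we aim to contribute to the fourth UN Sustainable Development Goal (quality education) with a novel dataset, an understanding of biases in natural language education data, and an account of the potential harms of not counteracting biases in language models for educational tasks.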
Translated by Google Translate
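Language can be used as a means of reproducing and enforcing harmful stereotypes and biases, and it has been analyzed as such in numerous studies. In this paper, we present a survey of 304 papers on gender bias in natural language processing. We analyze definitions of gender and its categories within the social sciences and connect them to formal definitions of gender bias in NLP research. We survey the lexica and datasets applied in research on gender bias, and then compare and contrast approaches to detecting and mitigating gender bias. We find that research on gender bias suffers from four core limitations. 1) Most research treats gender as a binary variable, neglecting its fluidity and continuity. 2) Most of the work has been conducted in monolingual setups for English or other high-resource languages. 3) Despite a myriad of papers on gender bias in NLP methods, we find that most newly developed algorithms do not test their models for bias and disregard possible ethical considerations of their work. 4) Finally, the methodologies developed in this line of research are fundamentally flawed: they cover very limited definitions of gender bias and lack evaluation baselines and pipelines. We offer recommendations for overcoming these limitations as a guide for future research.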
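An increasing awareness of biased patterns in natural language processing resources such as BERT has motivated many metrics to quantify "bias" and "fairness". However, comparing the results of different metrics, and the works that evaluate with such metrics, remains difficult, if not outright impossible. We survey the existing literature on fairness metrics for pre-trained language models and experimentally evaluate their compatibility, covering biases both in language models and in their downstream tasks. We do this through a mixture of traditional literature survey and correlation analysis, as well as by running empirical evaluations. We find that many metrics are not compatible with one another and depend strongly on (i) templates, (ii) attribute and target seeds, and (iii) the choice of embeddings. These results indicate that fairness or bias evaluation remains challenging for contextualized language models, if not at least highly subjective. To improve future comparisons and fairness evaluations, we recommend avoiding embedding-based metrics and focusing on fairness evaluations in downstream tasks.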
News articles both shape and reflect public opinion across the political spectrum. Analyzing them for social bias can thus provide valuable insights, such as prevailing stereotypes in society and the media, which are often adopted by NLP models trained on respective data. Recent work has relied on word embedding bias measures, such as WEAT. However, several representation issues of embeddings can harm the measures' accuracy, including low-resource settings and token frequency differences. In this work, we study what kind of embedding algorithm serves best to accurately measure types of social bias known to exist in US online news articles. To cover the whole spectrum of political bias in the US, we collect 500k articles and review psychology literature with respect to expected social bias. We then quantify social bias using WEAT along with embedding algorithms that account for the aforementioned issues. We compare how models trained with the algorithms on news articles represent the expected social bias. Our results suggest that the standard way to quantify bias does not align well with knowledge from psychology. While the proposed algorithms reduce the gap, they still do not fully match the literature.
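The statistical regularities in language corpora encode well-known social biases into word embeddings. Here, we focus on gender to provide a comprehensive analysis of widely used static English word embeddings trained on internet corpora (GloVe 2014, fastText 2017). Using the Single-Category Word Embedding Association Test, we demonstrate the widespread prevalence of gender biases that also show differences in: (a) the frequencies of words associated with men versus women; (b) part-of-speech tags in gender-associated words; (c) semantic categories in gender-associated words; and (d) valence, arousal, and dominance in gender-associated words. First, in terms of word frequency: we find that, of the 1,000 most frequent words in the vocabulary, 77% are more associated with men than with women, providing direct evidence of a masculine default in the everyday language of the English-speaking world. Second, turning to parts of speech: the top male-associated words are typically verbs (e.g., fight, overpower), while the top female-associated words are typically adjectives and adverbs (e.g., giving, emotionally). Gender biases in embeddings thus also permeate parts of speech. Third, for semantic categories: we run bottom-up cluster analyses of the top 1,000 words associated with each gender. The top male-associated concepts include roles and domains of big tech, engineering, religion, sports, and violence; in contrast, the top female-associated concepts are less focused on roles and include female-specific slurs and sexual content, as well as appearance and kitchen terms. Fourth, using human ratings of word valence, arousal, and dominance from a lexicon of about 20,000 words, we find that male-associated words are higher on arousal and dominance, while female-associated words are higher on valence.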
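The abstract above names the Single-Category WEAT without defining it. As a rough sketch, assuming the commonly used formulation (the difference between a word's mean cosine similarity to two attribute sets, divided by the standard deviation of its similarities to all attribute words), a per-word effect size could be computed as follows. The function names, the toy two-dimensional vectors, and the choice of the sample standard deviation are illustrative assumptions, not details taken from the paper.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sc_weat_effect_size(w, attrs_a, attrs_b):
    """Effect size of one word's association with attribute set A versus B.

    Positive values mean w is closer to A, negative values closer to B.
    Assumption: the sample standard deviation (n - 1) is used as the normalizer.
    """
    sims_a = [cosine(w, a) for a in attrs_a]
    sims_b = [cosine(w, b) for b in attrs_b]
    return (mean(sims_a) - mean(sims_b)) / stdev(sims_a + sims_b)


# Hypothetical toy embeddings; real use would look up trained word vectors.
female_attrs = [(1.0, 0.0), (0.9, 0.1)]
male_attrs = [(0.0, 1.0), (0.1, 0.9)]
word = (0.2, 0.8)

score = sc_weat_effect_size(word, female_attrs, male_attrs)
print(f"effect size (female vs. male attributes): {score:.3f}")
```

In this toy setup the word vector points toward the second attribute set, so the sign of the score is negative; ranking a vocabulary by this score is one way to obtain lists of the most strongly associated words per group.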
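Automatic processing of language is becoming pervasive in our lives, often taking a central role in our decision making: choosing the wording of our messages and emails, translating what we read, or even holding full conversations with us. Word embeddings are a key component of modern natural language processing systems. They provide a representation of words that has boosted the performance of many applications by working as a semblance of meaning. Word embeddings seem to capture a semblance of the meaning of words from raw text, but at the same time they also distill stereotypes and societal biases, which are subsequently relayed to the final applications. Such biases can be discriminatory. It is very important to detect and mitigate these biases to prevent discriminatory behavior by automated processes, which can be much more harmful than human behavior because of their scale. There are currently many tools and techniques for detecting and mitigating biases in word embeddings, but they present many barriers to the engagement of people without technical skills. As it happens, most experts on bias, whether social scientists or people with deep knowledge of the context in which bias is harmful, do not have such skills and cannot take part in bias detection because of these technical barriers. We have studied the barriers in existing tools and explored their possibilities and limitations with different kinds of users. Based on this exploration, we propose to develop a tool specifically aimed at lowering the technical barriers and providing the exploratory power needed to meet the requirements of experts, scientists, and people in general who are willing to audit these technologies.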
Artificial intelligence and machine learning are in a period of astounding growth. However, there are concerns that these technologies may be used, either with or without intention, to perpetuate the prejudice and unfairness that unfortunately characterize many human institutions. Here we show for the first time that human-like semantic biases result from the application of standard machine learning to ordinary language, the same sort of language humans are exposed to every day. We replicate a spectrum of standard human biases as exposed by the Implicit Association Test and other well-known psychological studies. We replicate these using a widely used, purely statistical machine-learning model, namely the GloVe word embedding, trained on a corpus of text from the Web. Our results indicate that language itself contains recoverable and accurate imprints of our historic biases, whether these are morally neutral as towards insects or flowers, problematic as towards race or gender, or even simply veridical, reflecting the status quo for the distribution of gender with respect to careers or first names. These regularities are captured by machine learning along with the rest of semantics. In addition to our empirical findings concerning language, we also contribute new methods for evaluating bias in text, the Word Embedding Association Test (WEAT) and the Word Embedding Factual Association Test (WEFAT). Our results have implications not only for AI and machine learning, but also for the fields of psychology, sociology, and human ethics, since they raise the possibility that mere exposure to everyday language can account for the biases we replicate here.
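Because several abstracts in this list rely on WEAT, a minimal sketch of its effect size may help. It assumes the usual formulation: each target word's association is its mean cosine similarity to attribute set A minus its mean cosine similarity to attribute set B, and the effect size is the difference of the mean associations of target sets X and Y divided by the standard deviation of the associations over all target words. The helper names, the toy vectors, and the use of the sample standard deviation are assumptions made for illustration; the permutation test for significance is omitted.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, attrs_a, attrs_b):
    """s(w, A, B): mean similarity of w to A minus mean similarity of w to B."""
    return mean(cosine(w, a) for a in attrs_a) - mean(cosine(w, b) for b in attrs_b)


def weat_effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """Difference of mean associations of X and Y, normalized by the spread
    of associations over all target words (sample standard deviation)."""
    s_x = [association(x, attrs_a, attrs_b) for x in targets_x]
    s_y = [association(y, attrs_a, attrs_b) for y in targets_y]
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Hypothetical 2-D embeddings standing in for real word vectors.
flowers = [(1.0, 0.1), (0.9, 0.2)]      # target set X
insects = [(0.1, 1.0), (0.2, 0.9)]      # target set Y
pleasant = [(1.0, 0.0), (0.8, 0.1)]     # attribute set A
unpleasant = [(0.0, 1.0), (0.1, 0.8)]   # attribute set B

effect = weat_effect_size(flowers, insects, pleasant, unpleasant)
print(f"WEAT effect size: {effect:.3f}")
```

A positive value indicates that the first target set is more strongly associated with the first attribute set than the second target set is; swapping the two target sets flips the sign.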
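Word embeddings learn implicit biases from the linguistic regularities captured by word co-occurrence statistics. By extending methods that quantify human-like biases in word embeddings, we introduce ValNorm, a novel intrinsic evaluation task and method for quantifying the valence dimension of affect in human-rated word sets from social psychology. We apply ValNorm to static word embeddings from seven languages (Chinese, English, German, Polish, Portuguese, Spanish, and Turkish) and from historical English text spanning 200 years. ValNorm achieves consistently high accuracy in quantifying the valence of non-discriminatory, non-social-group word sets. Specifically, ValNorm achieves a Pearson correlation of r = 0.88 with human judgment scores of valence for 399 words collected to establish pleasantness norms in English. In contrast, we measure gender stereotypes using the same set of word embeddings and find that social biases vary across languages. Our results indicate that the valence associations of non-discriminatory, non-social-group words represent widely shared associations across seven languages and over 200 years.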
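We employ natural language processing techniques to analyze 377,808 English song lyrics from the "Two Million Song Database" corpus, focusing on the expression of sexism across five decades (1960-2010) and on the measurement of gender biases. Using a sexism classifier, we identify sexist lyrics at a larger scale than previous studies, which used manually annotated samples of popular songs. Furthermore, we reveal gender biases by measuring associations in word embeddings learned on song lyrics. We find that sexist content increases over time, especially in songs by male artists and in popular songs appearing in the Billboard charts. Songs also contain different language biases depending on the gender of the performer, with songs by male solo artists containing more and stronger biases. This is the first large-scale analysis of this type, giving insights into language usage in such an influential part of popular culture.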
We demonstrate that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even becoming competitive with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks. We also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora.
Despite being responsible for state-of-the-art results in several computer vision and natural language processing tasks, neural networks have faced harsh criticism due to some of their current shortcomings. One of them is that neural networks are correlation machines prone to modeling biases within the data instead of focusing on actually useful causal relationships. This problem is particularly serious in application domains affected by aspects such as race, gender, and age. To prevent models from making unfair decisions, the AI community has concentrated efforts on correcting algorithmic biases, giving rise to the research area now widely known as fairness in AI. In this survey paper, we provide an in-depth overview of the main debiasing methods for fairness-aware neural networks in the context of vision and language research. We propose a novel taxonomy to better organize the literature on debiasing methods for fairness, and we discuss the current challenges, trends, and important future work directions for the interested researcher and practitioner.
Objective. Chemical named entity recognition (NER) models have the potential to impact a wide range of downstream tasks, from identifying adverse drug reactions to general pharmacoepidemiology. However, it is unknown whether these models work the same for everyone. Performance disparities can potentially cause harm rather than the intended good. Hence, in this paper, we measure gender-related performance disparities of chemical NER systems. Materials and Methods. We develop a framework to measure gender bias in chemical NER models using synthetic data and a newly annotated dataset of over 92,405 words with self-identified gender information from Reddit. We applied and evaluated state-of-the-art biomedical NER models. Results. Our findings indicate that chemical NER models are biased. The results of the bias tests on the synthetic dataset and the real-world data reveal multiple fairness issues. For example, for synthetic data, we find that female-related names are generally classified as chemicals, particularly in datasets containing many brand names rather than standard ones. For both datasets, we find consistent fairness issues resulting in substantial performance disparities between female- and male-related data. Discussion. Our study highlights the issue of biases in chemical NER models. For example, we find that many systems cannot detect contraceptives (e.g., birth control). Conclusion. Chemical NER models are biased and can be harmful to female-related groups. Therefore, practitioners should carefully consider the potential biases of these models and take steps to mitigate them.
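We study the relationship between task-agnostic intrinsic and task-specific extrinsic social bias evaluation measures for masked language models (MLMs), and find that there is only a weak correlation between these two types of evaluation measures. Moreover, we find that MLMs debiased using different methods still re-learn social biases during fine-tuning on downstream tasks. We identify the social biases in both the training instances and their assigned labels as reasons for the discrepancy between intrinsic and extrinsic bias evaluation measurements. Overall, our findings highlight the limitations of existing MLM bias evaluation measures and raise concerns about deploying MLMs in downstream applications on the basis of those measures.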
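Detecting and mitigating harmful biases in modern language models are widely recognized as crucial open problems. In this paper, we take a step back and investigate how language models come to be biased in the first place. We use a relatively small language model with an LSTM architecture, trained on an English Wikipedia corpus. With full access to the data and to the model parameters as they change at every step of training, we can map in detail how the representation of gender develops, which patterns in the dataset drive this, and how the model's internal state relates to the bias in a downstream task (semantic textual similarity). We find that the representation of gender is dynamic and identify different phases during training. Furthermore, we show that gender information is represented increasingly locally in the input embeddings of the model and that, as a consequence, debiasing these embeddings can be effective in reducing the downstream bias. Monitoring the training dynamics allows us to detect an asymmetry in how the female and male genders are represented in the input embeddings. This is important, as it may cause naive mitigation strategies to introduce new undesirable biases. We discuss the relevance of the findings for mitigation strategies more generally, and the prospects of generalizing our methods to larger language models, the Transformer architecture, other languages, and other undesirable biases.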
The recent success of large language models for text generation poses a severe threat to academic integrity, as plagiarists can generate realistic paraphrases indistinguishable from original work. However, the role of large autoregressive transformers in generating machine-paraphrased plagiarism and their detection is still developing in the literature. This work explores T5 and GPT-3 for machine-paraphrase generation on scientific articles from arXiv, student theses, and Wikipedia. We evaluate the detection performance of six automated solutions and one commercial plagiarism detection software and perform a human study with 105 participants regarding their detection performance and the quality of generated examples. Our results suggest that large models can rewrite text humans have difficulty identifying as machine-paraphrased (53% mean acc.). Human experts rate the quality of paraphrases generated by GPT-3 as high as original texts (clarity 4.0/5, fluency 4.2/5, coherence 3.8/5). The best-performing detection model (GPT-3) achieves a 66% F1-score in detecting paraphrases.
How do we design measures of social bias that we trust? While prior work has introduced several measures, no measure has gained widespread trust: instead, mounting evidence argues we should distrust these measures. In this work, we design bias measures that warrant trust based on the cross-disciplinary theory of measurement modeling. To combat the frequently fuzzy treatment of social bias in NLP, we explicitly define social bias, grounded in principles drawn from social science research. We operationalize our definition by proposing a general bias measurement framework DivDist, which we use to instantiate 5 concrete bias measures. To validate our measures, we propose a rigorous testing protocol with 8 testing criteria (e.g. predictive validity: do measures predict biases in US employment?). Through our testing, we demonstrate considerable evidence to trust our measures, showing they overcome conceptual, technical, and empirical deficiencies present in prior measures.
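Data augmentation is an important component in the robustness evaluation of natural language processing (NLP) models, as well as in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework that supports the creation of both transformations (modifications to the data) and filters (data splits according to specific features). We describe the framework and an initial set of 117 transformations and 23 filters for a variety of natural language tasks. We demonstrate the efficacy of NL-Augmenter by using several of its transformations to analyze the robustness of popular natural language models. The infrastructure, datacards, and robustness analysis results are publicly available in the NL-Augmenter repository (https://github.com/gem-benchmark/nl-augmenter).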
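The stylistic properties of text have intrigued computational linguistics researchers in recent years. Specifically, researchers have investigated the text style transfer (TST) task, which aims to change the stylistic properties of a text while retaining its style-independent content. Over the last few years, many novel TST algorithms have been developed, while industry has leveraged these algorithms to enable exciting TST applications. The field of TST research has grown rapidly because of this symbiosis. This article aims to provide a comprehensive review of recent research efforts on text style transfer. More concretely, we create a taxonomy to organize the TST models and provide a comprehensive summary of the state of the art. We review the existing evaluation methodologies for TST tasks and conduct a large-scale reproducibility study in which we experimentally benchmark 19 state-of-the-art TST algorithms on two publicly available datasets. Finally, we expand on current trends and provide new perspectives on the new and exciting developments in the TST field.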
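Various existing studies have analyzed which social biases are inherited by NLP models. These biases may directly or indirectly harm people, and previous studies have therefore focused only on human attributes. Until recently, however, there was no research on social biases in NLP regarding nonhumans. In this paper, we analyze biases against nonhuman animals, i.e., speciesist bias, inherent in English masked language models such as BERT. We analyze speciesist bias against 46 animal names using template-based and corpus-extracted sentences containing speciesist (or non-speciesist) language. We find that pre-trained masked language models tend to associate harmful words with nonhuman animals and are biased toward using speciesist language for some nonhuman animal names. Our code for reproducing the experiments will be made available on GitHub.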
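Transformer-based language models have recently achieved remarkable results in many natural language tasks. However, performance on leaderboards is generally achieved by leveraging massive amounts of training data, and rarely by encoding explicit linguistic knowledge into neural models. This has led many to question the relevance of linguistics to modern natural language processing. In this dissertation, I present several case studies to illustrate how theoretical linguistics and neural language models are still relevant to each other. On the one hand, language models are useful to linguists because they provide an objective tool for measuring semantic distance, which is difficult to do using traditional methods. On the other hand, linguistic theory contributes to language modeling research by providing frameworks and sources of data for probing language models on specific aspects of language understanding. This thesis contributes three studies that explore different aspects of the syntax-semantics interface in language models. In the first part of the thesis, I apply language models to the problem of word class flexibility. Using mBERT as a source of semantic distance measurements, I present evidence in favor of analyzing word class flexibility as a directional process. In the second part, I propose a method for measuring surprisal at intermediate layers of language models. My experiments show that sentences containing morphosyntactic anomalies trigger surprisal earlier in language models than semantic and commonsense anomalies do. Finally, in the third part, I adapt several psycholinguistic studies to show that language models contain knowledge of argument structure constructions. In summary, my thesis develops new connections between natural language processing, linguistic theory, and psycholinguistics to provide fresh perspectives on the interpretation of language models.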