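Grammatical error correction (GEC) systems perform a sequence-to-sequence task: given an input word sequence containing grammatical errors, the system outputs a grammatically correct word sequence. With the advent of deep learning methods, automated GEC systems have become increasingly popular. For example, GEC systems are often applied to speech transcriptions of English learners as a form of assessment and feedback; these powerful GEC systems can be used to automatically measure one aspect of a candidate's fluency. The count of \textit{edits} from a candidate's input sentence (or essay) to the GEC system's grammatically corrected output sentence is indicative of the candidate's language ability, where fewer edits suggest better fluency. The edit count can therefore be viewed as a \textit{fluency score}, with zero implying perfect fluency. However, although deep-learning-based GEC systems are powerful and accurate, they are susceptible to adversarial attacks: an adversary can introduce a small, specific change at the input of a system that causes a large, undesired change at the output. When GEC systems are applied to automated language assessment, an adversary may aim to cheat by making a small change to a grammatically incorrect input sentence that conceals the errors from the GEC system, so that the candidate is unjustly awarded a perfect fluency score. This work examines a simple universal substitution adversarial attack that non-native speakers of English could realistically employ to deceive GEC systems used for assessment.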
translated by 谷歌翻译
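The edit count described in the first abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization and a word-level Levenshtein distance as a stand-in for the edit extraction; the abstract does not say how edits are actually counted, and the function name `edit_count` is hypothetical.

```python
def edit_count(source: str, corrected: str) -> int:
    """Word-level Levenshtein distance between a candidate sentence and
    its GEC-corrected version (an assumed proxy for the fluency score)."""
    src = source.split()
    tgt = corrected.split()
    # prev[j] is the distance between the source words processed so far
    # (before the current one) and the first j target words.
    prev = list(range(len(tgt) + 1))
    for i, s in enumerate(src, start=1):
        curr = [i]  # deleting all i source words to reach an empty target
        for j, t in enumerate(tgt, start=1):
            cost = 0 if s == t else 1
            curr.append(min(prev[j] + 1,          # delete a source word
                            curr[j - 1] + 1,      # insert a target word
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


# A score of zero means the GEC system found nothing to correct.
score = edit_count("he go to school yesterday", "he went to school yesterday")
```

Under this proxy, an attack succeeds when the corrected output equals the (still ungrammatical) input, which drives the count to zero.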
Machine learning algorithms are often vulnerable to adversarial examples that have imperceptible alterations from the original counterparts but can fool the state-of-the-art models. It is helpful to evaluate or even improve the robustness of these models by exposing the maliciously crafted adversarial examples. In this paper, we present TEXTFOOLER, a simple but strong baseline to generate adversarial text. By applying it to two fundamental natural language tasks, text classification and textual entailment, we successfully attacked three target models, including the powerful pre-trained BERT, and the widely used convolutional and recurrent neural networks. We demonstrate three advantages of this framework: (1) effective: it outperforms previous attacks by success rate and perturbation rate, (2) utility-preserving: it preserves semantic content, grammaticality, and correct types classified by humans, and (3) efficient: it generates adversarial text with computational complexity linear to the text length.
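This paper presents a new method for automatically detecting words with lexical gender in large-scale language datasets. Currently, the evaluation of gender bias in natural language processing relies on manually compiled lexicons of gendered expressions, such as pronouns ('he', 'she', etc.) and nouns with lexical gender ('mother', 'boyfriend', 'policewoman', etc.). However, manually compiled lists become static if they are not periodically updated, and compiling them often involves value judgments by individual annotators and researchers. Moreover, terms not included in a list fall outside the scope of the analysis. To address these issues, we devised a scalable, dictionary-based method to automatically detect lexical gender that can provide a dynamic, up-to-date analysis with high coverage. Our method reaches over 80% accuracy both in determining the lexical gender of nouns retrieved randomly from a Wikipedia sample and when tested on a list of gendered words used in previous research.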
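Data augmentation is an important component in the robustness evaluation of natural language processing (NLP) models, as well as in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework that supports the creation of both transformations (modifications to the data) and filters (data splits according to specific features). We describe the framework and an initial set of 117 transformations and 23 filters for a variety of natural language tasks. We demonstrate the efficacy of NL-Augmenter by using several of its transformations to analyze the robustness of popular natural language models. The infrastructure, datacards, and robustness analysis results are publicly available in the NL-Augmenter repository (\url{https://github.com/gem-benchmark/nl-augmenter}).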
The extent to which men and women use language differently has been questioned previously. Finding clear and consistent gender differences in language is not conclusive in general, and the research is heavily influenced by the context and method employed to identify the difference. In addition, the majority of the research was conducted in written form, and the sample was collected in writing. Therefore, we compared the word choices of male and female presenters in public addresses such as TED lectures. The frequency of numerous types of words, such as parts of speech (POS), linguistic, psychological, and cognitive terms were analyzed statistically to determine how male and female speakers use words differently. Based on our data, we determined that male speakers use specific types of linguistic, psychological, cognitive, and social words in considerably greater frequency than female speakers.
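In recent years, protecting NLP models against misspellings has been an object of research interest. Existing remedies typically compromise accuracy or require full model retraining for each new class of attack. We propose a novel method for adding resilience to misspellings to transformer-based NLP models. This robustness can be achieved without retraining the original NLP model, and with only a minimal loss of language understanding performance on inputs without misspellings. In addition, we propose a new efficient approximate method for generating adversarial misspellings, which significantly reduces the cost of evaluating a model's resilience to adversarial attacks.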
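State-of-the-art attacks on NLP models lack a shared definition of what constitutes a successful attack. We distill ideas from past work into a unified framework: a successful natural language adversarial example is a perturbation that fools the model and follows some linguistic constraints. We then analyze the outputs of two state-of-the-art synonym substitution attacks. We find that their perturbations often do not preserve semantics, and 38% introduce grammatical errors. Human surveys reveal that, to successfully preserve semantics, we need to significantly increase the minimum cosine similarity between the embeddings of swapped words and between the sentence encodings of the original and perturbed sentences. When semantics and grammaticality are better preserved, the attack success rate drops by more than 70 percentage points.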
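The two cosine-similarity constraints mentioned in this abstract can be sketched as follows. This is an illustrative check only: the function names and the default thresholds of 0.9 are hypothetical placeholders, not values taken from the paper, and the vectors are assumed to come from whatever word-embedding and sentence-encoder models an attack uses.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def passes_constraints(word_orig, word_swap, sent_orig, sent_pert,
                       min_word_sim=0.9, min_sent_sim=0.9):
    """Accept a substitution only if both similarity constraints hold.

    The threshold defaults are hypothetical; raising them makes the
    constraint stricter and rejects more candidate perturbations.
    """
    return (cosine_similarity(word_orig, word_swap) >= min_word_sim
            and cosine_similarity(sent_orig, sent_pert) >= min_sent_sim)


# Toy vectors: the swapped word is close to the original, but the
# perturbed sentence encoding is orthogonal to the original one.
accepted = passes_constraints([1.0, 0.0], [1.0, 0.1], [1.0, 0.0], [0.0, 1.0])
```

Stricter minimum similarities shrink the set of admissible substitutions, which is consistent with the drop in attack success rate the abstract reports.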
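In this paper, we investigate the application of end-to-end and multi-module frameworks to grapheme-to-phoneme (G2P) conversion for Persian. The results show that our proposed multi-module G2P system outperforms our end-to-end systems in both accuracy and speed. The system consists of a pronunciation dictionary used as a look-up table, together with separate models for handling homographs, OOVs, and ezafe in Persian, built with GRU and Transformer architectures. The system operates at the sequence level rather than the word level, which allows it to capture the unwritten relations between words (cross-word information) needed for homograph disambiguation and ezafe recognition without any pre-processing. In our evaluation, the system achieved a word-level accuracy of 94.48%, outperforming previous G2P systems for Persian.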
In this paper, we propose dictionary attacks against speaker verification - a novel attack vector that aims to match a large fraction of the speaker population by chance. We introduce a generic formulation of the attack that can be used with various speech representations and threat models. The attacker uses adversarial optimization to maximize raw similarity of speaker embeddings between a seed speech sample and a proxy population. The resulting master voice successfully matches a non-trivial fraction of people in an unknown population. Adversarial waveforms obtained with our approach can match on average 69% of females and 38% of males enrolled in the target system at a strict decision threshold calibrated to yield a false alarm rate of 1%. By using the attack with a black-box voice cloning system, we obtain master voices that are effective in the most challenging conditions and transferable between speaker encoders. We also show that, combined with multiple attempts, this attack raises even more serious security concerns for these systems.
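Part-of-speech (POS) tagging plays an important role in natural language processing (NLP). Its applications can be found in many NLP tasks, such as named entity recognition, syntactic parsing, dependency parsing, and text chunking. In the investigation conducted in this paper, we use the techniques of two widely used toolkits, ClearNLP and Stanford POS Tagger, to develop two new POS taggers for Vietnamese, and then compare them with three well-known Vietnamese taggers, including vnTagger and RDRPOSTagger. We carry out a systematic comparison to find the tagger with the best performance. We also design a new feature set to measure the performance of the statistical taggers. Our new tagger, built from Stanford Tagger with the new feature set, can outperform all other current Vietnamese taggers in terms of tagging accuracy. In addition, we analyze the effect of certain features on the performance of the statistical taggers. Finally, the experimental results also show that the transformation-based tagger, RDRPOSTagger, runs significantly faster than any other statistical tagger.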
In this paper we investigated two different methods to parse relative and noun complement clauses in English and resorted to distinct tags for their corresponding that as a relative pronoun and as a complementizer. We used an algorithm to relabel a corpus parsed with the GUM Treebank using Universal Dependency. Our second experiment consisted in using TreeTagger, a Probabilistic Decision Tree, to learn the distinction between the two complement and relative uses of postnominal "that". We investigated the effect of the training set size on TreeTagger accuracy and how representative the GUM Treebank files are for the two structures under scrutiny. We discussed some of the linguistic and structural tenets of the learnability of this distinction.
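Representations in large language models contain multiple types of gender information. We focus on two types of such signals in English texts: factual gender information, which is a grammatical or semantic property, and gender bias, which is the correlation between a word and a specific gender. We can disentangle the model's embeddings and identify the components that encode each type of information. Our aim is to reduce the stereotypical bias in the representations while preserving the factual gender signal. Our filtering method shows that it is possible to reduce the bias of gender-neutral profession names without significantly degrading the model's capabilities. These findings can be applied to language generation to mitigate reliance on stereotypes while preserving gender agreement in coreference.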
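While end-to-end neural machine translation (NMT) has achieved impressive progress, noisy input usually makes models fragile and unstable. Generating adversarial examples as augmented data has proved useful for alleviating this problem. Existing methods for adversarial example generation (AEG) operate at the word level or the character level. In this paper, we propose a phrase-level adversarial example generation (PAEG) method to enhance the robustness of the model. Our method uses a gradient-based strategy to substitute phrases at vulnerable positions in the source input. We verify our method on three benchmarks: the LDC Chinese-English, IWSLT14 German-English, and WMT14 English-German tasks. Experimental results show that our approach significantly improves performance compared with previous methods.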
Part of Speech (POS) tagging is crucial to Natural Language Processing (NLP). It is a well-studied topic in several resource-rich languages. However, the development of computational linguistic resources is still in its infancy despite the existence of numerous languages that are historically and literary rich. Assamese, an Indian scheduled language, spoken by more than 25 million people, falls under this category. In this paper, we present a Deep Learning (DL)-based POS tagger for Assamese. The development process is divided into two stages. In the first phase, several pre-trained word embeddings are employed to train several tagging models. This allows us to evaluate the performance of the word embeddings in the POS tagging task. The top-performing model from the first phase is employed to annotate another set of new sentences. In the second phase, the model is trained further using the fresh dataset. Finally, we attain a tagging accuracy of 86.52% in F1 score. The model may serve as a baseline for further study on DL-based Assamese POS tagging.
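This paper proposes approaches for automatically creating a large number of new bilingual dictionaries for low-resource languages, especially resource-poor ones, from a single input bilingual dictionary. Our algorithms use available WordNets and a machine translator (MT) to translate words in the source language into many target languages. Since our approaches rely only on one input dictionary, available WordNets, and an MT, they are applicable to any bilingual dictionary as long as one of the two languages is English or has a WordNet linked to the Princeton WordNet. Starting from 5 available bilingual dictionaries, we create 48 new bilingual dictionaries. Of these, 30 language pairs are not supported by the popular MTs, Google and Bing.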
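The general idea of reaching new target languages through linked WordNets can be sketched with toy data. This is a simplified illustration under stated assumptions, not the authors' algorithm: it omits the machine translator and any candidate ranking, and the function name, the synset identifiers, and all dictionary entries are hypothetical.

```python
from collections import defaultdict


def pivot_dictionary(src_to_pivot, pivot_to_synsets, synset_to_target):
    """Compose a source->target dictionary through shared synset IDs.

    src_to_pivot:     source word -> pivot-language words (input dictionary)
    pivot_to_synsets: pivot word  -> synset IDs (pivot-language WordNet)
    synset_to_target: synset ID   -> target-language words (linked WordNet)
    """
    result = defaultdict(set)
    for src_word, pivot_words in src_to_pivot.items():
        for pivot_word in pivot_words:
            for synset_id in pivot_to_synsets.get(pivot_word, ()):
                result[src_word].update(synset_to_target.get(synset_id, ()))
    # Sort candidates so the output is deterministic.
    return {word: sorted(targets) for word, targets in result.items()}


# Toy, made-up entries for illustration only.
new_dict = pivot_dictionary(
    {"src_dog": ["dog"]},
    {"dog": ["syn-001"]},
    {"syn-001": ["tgt_dog", "tgt_hound"]},
)
```

Because every sense of the pivot word contributes candidates, a real system would need some way to filter or rank them; the sketch simply returns all of them.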
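Manually constructing a WordNet is a difficult task that requires years of expert time. As a first step toward automatically constructing full WordNets, we propose approaches for generating WordNet synsets for both resource-rich and resource-poor languages, using publicly available WordNets, a machine translator, and/or a single bilingual dictionary. Our algorithms translate the synsets of existing WordNets into a target language T, then apply a ranking method to the translation candidates to find the best translations in T. Our approaches are applicable to any language that has at least one existing bilingual dictionary translating from English to it.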
Chronic pain is a multi-dimensional experience, and pain intensity plays an important part, impacting the patient's emotional balance, psychology, and behaviour. Standard self-reporting tools, such as the Visual Analogue Scale for pain, fail to capture this burden. Moreover, this type of tool is susceptible to a degree of subjectivity, dependent on the patient's clear understanding of how to use it, social biases, and their ability to translate a complex experience to a scale. To overcome these and other self-reporting challenges, pain intensity estimation has been previously studied based on facial expressions, electroencephalograms, brain imaging, and autonomic features. However, to the best of our knowledge, it has never been attempted to base this estimation on the patient narratives of the personal experience of chronic pain, which is what we propose in this work. Indeed, in the clinical assessment and management of chronic pain, verbal communication is essential to convey information to physicians that would otherwise not be easily accessible through standard reporting tools, since language, sociocultural, and psychosocial variables are intertwined. We show that language features from patient narratives indeed convey information relevant for pain intensity estimation, and that our computational models can take advantage of that. Specifically, our results show that patients with mild pain focus more on the use of verbs, whilst moderate and severe pain patients focus on adverbs, and nouns and adjectives, respectively, and that these differences allow for the distinction between these three pain classes.
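This paper studies approaches for generating lexical resources for endangered languages. Our algorithms construct bilingual dictionaries and multilingual thesauruses using public WordNets and a machine translator (MT). Since our work relies on only one bilingual dictionary between an endangered language and an "intermediate helper" language, it is applicable to languages that lack many existing resources.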
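We apply natural language processing techniques to analyze 377,808 English song lyrics from the "Two Million Song Database" corpus, focusing on the expression of sexism across five decades (1960-2010) and the measurement of gender bias. Using a sexism classifier, we identify sexist lyrics on a larger scale than previous studies, which used manually annotated samples of popular songs. In addition, we reveal gender bias by measuring associations in word embeddings learned on song lyrics. We find that sexist content increases over time, especially from male artists and for popular songs appearing in the Billboard charts. Songs also contain different language biases depending on the gender of the performer, with songs by male solo artists containing more and stronger biases. This is the first large-scale analysis of this type, giving insights into language usage in such an influential part of popular culture.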
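State-of-the-art text classification models are becoming increasingly reliant on deep neural networks (DNNs). Because of their black-box nature, faithful and robust explanation methods need to accompany classifiers deployed in real-life scenarios. However, it has been shown in vision applications that explanation methods are susceptible to local, imperceptible perturbations that can significantly alter the explanations without changing the predicted classes. We show here that the existence of such perturbations extends to text classifiers as well. Specifically, we introduce TEF, a novel explanation attack algorithm that imperceptibly alters text input samples so that the outcome of widely used explanation methods changes considerably while the classifier predictions remain unchanged. We evaluate the attribution robustness estimation performance of TEF on five sequence classification datasets, using three DNN architectures and three transformer architectures for each dataset. TEF can significantly decrease the correlation between the attributions of unchanged and perturbed inputs, which shows that all models and explanation methods are susceptible to TEF perturbations. Moreover, we evaluate how the perturbations transfer to other model architectures and attribution methods, and show that TEF perturbations are also effective when the target model and explanation method are unknown. Finally, we introduce a semi-universal attack that can compute fast, computationally light perturbations with no knowledge of the attacked classifier or explanation method. Overall, our work shows that explanations in text classifiers are very fragile, and users need to carefully address their robustness before relying on them in critical applications.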
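The robustness measure in this abstract, a correlation between attributions before and after perturbation, can be sketched as below. The abstract does not say which correlation coefficient is used, so Pearson correlation is an assumption here, and the attribution values are made-up toy numbers.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation of two equal-length attribution vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector has no defined correlation
    return cov / math.sqrt(var_x * var_y)


# Toy per-token attribution scores for the same sentence positions.
original_attr = [0.1, 0.7, 0.2]
perturbed_attr = [0.6, 0.1, 0.3]
robustness = pearson_correlation(original_attr, perturbed_attr)
```

A value near 1 would mean the explanation barely moved; a low or negative value with an unchanged prediction is the kind of outcome the attack aims for.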