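Target-oriented opinion words extraction (TOWE) is a fine-grained sentiment analysis task that aims to extract, from a sentence, the opinion words corresponding to a given opinion target. Recently, deep learning approaches have made remarkable progress on this task. Nevertheless, TOWE still suffers from a scarcity of training data because data annotation is expensive. Limited labeled data increases the risk of distribution shift between test data and training data. In this paper, we propose exploiting massive unlabeled data to reduce this risk by increasing the model's exposure to varying distribution shifts. Specifically, we propose a novel Multi-Grained Consistency Regularization (MGCR) method to make use of unlabeled data, and we design two filters specifically for TOWE that filter noisy data at different granularities. Extensive experimental results on four TOWE benchmark datasets indicate the superiority of MGCR over current state-of-the-art methods. An in-depth analysis also demonstrates the effectiveness of the filters at different granularities. Our code is available at https://github.com/towessl/towessl.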
As an important fine-grained sentiment analysis problem, aspect-based sentiment analysis (ABSA), which aims to analyze and understand people's opinions at the aspect level, has attracted considerable interest in the last decade. To handle ABSA in different scenarios, various tasks have been introduced for analyzing different sentiment elements and their relations, including the aspect term, aspect category, opinion term, and sentiment polarity. Unlike early ABSA works that focus on a single sentiment element, many compound ABSA tasks involving multiple elements have been studied in recent years to capture more complete aspect-level sentiment information. However, a systematic review of the various ABSA tasks and their corresponding solutions is still lacking, a gap we aim to fill in this survey. More specifically, we provide a new taxonomy for ABSA that organizes existing studies along the axes of the sentiment elements concerned, with an emphasis on recent advances in compound ABSA tasks. From the perspective of solutions, we summarize the use of pre-trained language models for ABSA, which has brought the performance of ABSA to a new level. In addition, techniques for building more practical ABSA systems in cross-domain and cross-lingual scenarios are discussed. Finally, we review some emerging topics and discuss open challenges to outline potential future directions for ABSA.
Aspect-based sentiment analysis (ABSA) aims at extracting opinionated aspect terms in review texts and determining their sentiment polarities, and it is widely studied in both academia and industry. Because it is a fine-grained classification task, the annotation cost is extremely high. Domain adaptation is a popular solution for alleviating the data deficiency issue in new domains by transferring common knowledge across domains. Most cross-domain ABSA studies are based on structural correspondence learning (SCL) and use pivot features to construct auxiliary tasks that narrow the gap between domains. However, their pivot-based auxiliary tasks can only transfer knowledge of aspect terms, not sentiment, which limits the performance of existing models. In this work, we propose a novel Syntax-guided Domain Adaptation Model, named SDAM, for more effective cross-domain ABSA. SDAM exploits syntactic structure similarities to build pseudo training instances, during which aspect terms of the target domain are explicitly related to sentiment polarities. In addition, we propose a syntax-based BERT masked language model to further capture domain-invariant features. Finally, to alleviate the sentiment inconsistency issue in multi-gram aspect terms, we introduce a span-based joint aspect term and sentiment analysis module into cross-domain End2End ABSA. Experiments on five benchmark datasets show that our model consistently outperforms the state-of-the-art baselines with respect to the Micro-F1 metric for the cross-domain End2End ABSA task.
Natural Language Inference (NLI) or Recognizing Textual Entailment (RTE) aims at predicting the relation between a pair of sentences (premise and hypothesis) as entailment, contradiction or semantic independence. Although deep learning models have shown promising performance for NLI in recent years, they rely on large scale expensive human-annotated datasets. Semi-supervised learning (SSL) is a popular technique for reducing the reliance on human annotation by leveraging unlabeled data for training. However, despite its substantial success on single sentence classification tasks where the challenge in making use of unlabeled data is to assign "good enough" pseudo-labels, for NLI tasks, the nature of unlabeled data is more complex: one of the sentences in the pair (usually the hypothesis) along with the class label are missing from the data and require human annotations, which makes SSL for NLI more challenging. In this paper, we propose a novel way to incorporate unlabeled data in SSL for NLI where we use a conditional language model, BART to generate the hypotheses for the unlabeled sentences (used as premises). Our experiments show that our SSL framework successfully exploits unlabeled data and substantially improves the performance of four NLI datasets in low-resource settings. We release our code at: https://github.com/msadat3/SSL_for_NLI.
Information Extraction (IE) aims to extract structured information from heterogeneous sources. IE from natural language texts includes sub-tasks such as Named Entity Recognition (NER), Relation Extraction (RE), and Event Extraction (EE). Most IE systems require a comprehensive understanding of sentence structure, implied semantics, and domain knowledge to perform well; thus, IE tasks always need adequate external resources and annotations. However, it takes time and effort to obtain more human annotations. Low-Resource Information Extraction (LRIE) strives to use unsupervised data, reducing the required resources and human annotation. In practice, existing systems either use self-training schemes to generate pseudo labels, which causes the gradual drift problem, or leverage consistency regularization methods, which inevitably suffer from confirmation bias. To alleviate the confirmation bias caused by the lack of feedback loops in existing LRIE learning paradigms, we develop a Gradient Imitation Reinforcement Learning (GIRL) method that encourages pseudo-labeled data to imitate the gradient descent direction on labeled data, which pushes pseudo-labeled data toward optimization capabilities similar to those of labeled data. Based on how well the pseudo-labeled data imitates the instructive gradient descent direction obtained from labeled data, we design a reward to quantify the imitation process and bootstrap the optimization capability of pseudo-labeled data through trial and error. Beyond the learning paradigm, GIRL is not limited to specific sub-tasks, and we leverage GIRL to solve all IE sub-tasks (named entity recognition, relation extraction, and event extraction) in low-resource settings (semi-supervised IE and few-shot IE).
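Aspect Sentiment Triplet Extraction (ASTE) aims to extract the spans of aspects and opinions, together with their sentiment relations, as sentiment triplets. Existing works usually formulate span detection as a 1D token tagging problem and model sentiment recognition with a 2D tagging matrix of token pairs. Moreover, by leveraging the token representations of Pretrained Language Encoders (PLEs) such as BERT, they can achieve better performance. However, they simply use PLEs as feature extractors to build their modules and never look closely at what specific knowledge PLEs contain. In this paper, we argue that, instead of further designing modules to capture the inductive bias of ASTE, PLEs themselves contain "enough" features for 1D and 2D tagging: (1) The token representation contains the contextualized meaning of the token itself, so this level of feature carries the information necessary for 1D tagging. (2) The attention matrices of different PLE layers can further capture the multi-level linguistic knowledge present in token pairs, which benefits 2D tagging. (3) Furthermore, with simple transformations, these two features can also be easily converted into a 2D tagging matrix and a 1D tagging sequence, respectively, which further improves the tagging results. In this way, PLEs can serve as natural tagging frameworks and achieve a new state of the art, as verified by extensive experiments and in-depth analyses.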
Aspect Sentiment Triplet Extraction (ASTE) is a new fine-grained sentiment analysis task that aims to extract triplets of aspect terms, sentiments, and opinion terms from review sentences. Recently, span-level models have achieved gratifying results on the ASTE task by taking advantage of predictions over all possible spans. Since considering all possible spans significantly increases the number of potential aspect and opinion candidates, it is crucial and challenging to efficiently extract the triplet elements among them. In this paper, we present a span-level bidirectional network that takes all possible spans as input and extracts triplets from spans bidirectionally. Specifically, we devise both an aspect decoder and an opinion decoder to decode the span representations and extract triplets in the aspect-to-opinion and opinion-to-aspect directions. With these two decoders complementing each other, the whole network can extract triplets from spans more comprehensively. Moreover, considering that mutual exclusion cannot be guaranteed between spans, we design a similar span separation loss that helps the downstream task distinguish the correct span by enlarging the KL divergence between similar spans during training; at inference time, we adopt an inference strategy that removes conflicting triplets from the results based on their confidence scores. Experimental results show that our framework not only significantly outperforms state-of-the-art methods, but also achieves better performance in predicting triplets with multi-token entities and in extracting triplets from sentences that contain multiple triplets.
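As a rough illustration of what "all possible spans" means as model input, the sketch below enumerates candidate spans of a tokenized sentence. It is not the authors' implementation: the function name `enumerate_spans` and the `max_len` cap are assumptions introduced here, since the abstract does not say how, or whether, span length is bounded.

```python
def enumerate_spans(tokens, max_len):
    """Return every contiguous span of at most max_len tokens.

    Each span is (start, end, text) with inclusive token indices.
    max_len is a hypothetical cap, not something stated in the abstract.
    """
    spans = []
    for start in range(len(tokens)):
        # The last allowed end index is limited by both max_len and sentence length.
        last_end = min(start + max_len, len(tokens))
        for end in range(start, last_end):
            spans.append((start, end, " ".join(tokens[start:end + 1])))
    return spans


if __name__ == "__main__":
    for span in enumerate_spans(["the", "battery", "life", "is", "great"], max_len=2):
        print(span)
```

The number of candidates grows roughly with sentence length times `max_len`, which is one way to see why the abstract stresses the cost of choosing aspect and opinion terms among them.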
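Aspect-based sentiment analysis (ABSA) aims at predicting the sentiment polarity (SC) or extracting the opinion span (OE) expressed towards a given aspect. Previous work on ABSA mostly relies on rather complicated aspect-specific feature induction. Recently, pretrained language models (PLMs) such as BERT have been used as context modeling layers to simplify the feature induction structures and achieve state-of-the-art performance. However, such PLM-based context modeling may not be aspect-specific. A key question is therefore left under-explored: how can aspect-specific context be better modeled through PLMs? To answer this question, we attempt to enhance aspect-specific context modeling with PLMs in a non-intrusive manner. We propose three aspect-specific input transformations, namely aspect companion, aspect prompt, and aspect marker. With these transformations, non-intrusive aspect-specific PLMs can be obtained that encourage the PLM to pay more attention to the aspect-specific context in a sentence. Additionally, we craft an adversarial benchmark for ABSA (advABSA) to see how aspect-specific modeling affects model robustness. Extensive experimental results on standard and adversarial benchmarks for SC and OE demonstrate the effectiveness and robustness of the proposed method, yielding new state-of-the-art performance on OE and competitive performance on SC.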
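The aspect-based sentiment analysis (ABSA) task consists of three typical subtasks: aspect term extraction, opinion term extraction, and sentiment polarity classification. These three subtasks are usually performed jointly to save resources and reduce error propagation in the pipeline. However, most existing joint models focus only on the benefits of sharing an encoder between subtasks and ignore the differences between them. We therefore propose a joint ABSA model that not only enjoys the benefits of encoder sharing but also attends to these differences to improve the efficiency of the model. In detail, we introduce a dual-encoder design in which a pair encoder focuses specifically on candidate aspect-opinion pair classification, while the original encoder keeps its attention on sequence labeling. Empirical results show that our proposed model is robust and significantly outperforms the previous state of the art on benchmark datasets.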
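Aspect-based sentiment analysis aims to identify the sentiment polarity of a specific aspect in product reviews. We notice that about 30% of reviews do not contain obvious opinion words but still convey a clear, human-perceivable sentiment orientation, which is known as implicit sentiment. However, recent neural-network-based approaches have paid little attention to the implicit sentiment contained in reviews. To overcome this issue, we adopt supervised contrastive pre-training on large-scale sentiment-annotated corpora retrieved from in-domain language resources. By aligning the representations of implicit sentiment expressions with those of expressions that carry the same sentiment label, the pre-training process better captures both implicit and explicit sentiment orientation towards aspects in reviews. Experimental results show that our method achieves state-of-the-art performance on the SemEval2014 benchmarks, and a comprehensive analysis validates its effectiveness in learning implicit sentiment.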
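Open information extraction (OIE) is an important NLP task that aims to extract structured information from unstructured text without restrictions on the relation types or the text domain. This survey covers open information extraction techniques from 2007 to 2022, with a focus on new models not covered by previous surveys. We propose a new categorization method from the perspective of information sources to accommodate the development of recent OIE techniques. In addition, we summarize three major approaches based on task settings, along with currently popular datasets and model evaluation metrics. Based on this comprehensive review, we present several future directions in terms of datasets, information sources, output forms, methods, and evaluation metrics.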
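Aspect-based sentiment analysis (ABSA) mainly involves three subtasks: aspect term extraction, opinion term extraction, and aspect-level sentiment classification, which are typically handled either separately or jointly. However, previous approaches do not exploit the interactive relations among the three subtasks well and do not fully leverage the readily available document-level labeled domain and sentiment knowledge, which limits their performance. To address these issues, we propose a novel Iterative Multi-Knowledge Transfer Network (IMKTN) for end-to-end ABSA. For one thing, through the interactive correlations between the ABSA subtasks, our IMKTN uses a well-designed routing algorithm to transfer task-specific knowledge from any two of the three subtasks to the remaining one; that is, any two of the three subtasks help the third. For another, our IMKTN transfers document-level knowledge, i.e., domain-specific and sentiment-related knowledge, to the aspect-level subtasks to further improve their performance. Experimental results on three benchmark datasets demonstrate the effectiveness and superiority of our approach.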
Although attention mechanisms have become fundamental components of deep learning models, they are vulnerable to perturbations, which may degrade the prediction performance and model interpretability. Adversarial training (AT) for attention mechanisms has successfully reduced such drawbacks by considering adversarial perturbations. However, this technique requires label information, and thus, its use is limited to supervised settings. In this study, we explore the concept of incorporating virtual AT (VAT) into the attention mechanisms, by which adversarial perturbations can be computed even from unlabeled data. To realize this approach, we propose two general training techniques, namely VAT for attention mechanisms (Attention VAT) and "interpretable" VAT for attention mechanisms (Attention iVAT), which extend AT for attention mechanisms to a semi-supervised setting. In particular, Attention iVAT focuses on the differences in attention; thus, it can efficiently learn clearer attention and improve model interpretability, even with unlabeled data. Empirical experiments based on six public datasets revealed that our techniques provide better prediction performance than conventional AT-based as well as VAT-based techniques, and stronger agreement with evidence that is provided by humans in detecting important words in sentences. Moreover, our proposal offers these advantages without needing to add the careful selection of unlabeled data. That is, even if the model using our VAT-based technique is trained on unlabeled data from a source other than the target task, both the prediction performance and model interpretability can be improved.
Opinion mining is the branch of computation that deals with the opinions, appraisals, attitudes, and emotions of people and their different aspects. This field has attracted substantial research interest in recent years. Aspect-level opinion mining (also called aspect-based opinion mining) is often desired in practical applications because it provides detailed opinions or sentiments about entities and their different aspects, which are usually required for action. Aspect extraction and entity extraction are thus two core tasks of aspect-based opinion mining. This paper presents a framework for aspect-based opinion mining based on the concept of transfer learning, applied to real-world customer reviews available on the Amazon website. The model has yielded quite satisfactory results on the task of aspect-based opinion mining.
Aspect Sentiment Triplet Extraction (ASTE) has become an emerging task in sentiment analysis research, aiming to extract triplets of the aspect term, its corresponding opinion term, and its associated sentiment polarity from a given sentence. Recently, many neural-network-based models with different tagging schemes have been proposed, but almost all of them have limitations: they rely heavily on 1) the prior assumption that each word is associated with only a single role (e.g., aspect term or opinion term) and 2) word-level interactions, treating each opinion or aspect as a set of independent words. As a result, they perform poorly on complex ASTE cases, such as a word associated with multiple roles or an aspect/opinion term with multiple words. We therefore propose a novel approach, Span TAgging and Greedy infErence (STAGE), to extract sentiment triplets at the span level, where each span may consist of multiple words and play different roles simultaneously. To this end, this paper formulates the ASTE task as a multi-class span classification problem. Specifically, STAGE generates more accurate aspect sentiment triplet extractions by exploring span-level information and constraints, and it consists of two components: a span tagging scheme and a greedy inference strategy. The former tags all possible candidate spans based on a newly defined tagging set. The latter retrieves the aspect/opinion term with the maximum length from the candidate sentiment snippet to output sentiment triplets. Furthermore, we propose a simple but effective model based on STAGE, which outperforms the state of the art by a large margin on four widely used datasets. Moreover, STAGE can be easily generalized to other pair/triplet extraction tasks, which also demonstrates the superiority of the proposed scheme.
Aspect Based Sentiment Analysis is a dominant research area with potential applications in social media analytics, business, finance, and health. Prior works in this area are primarily based on supervised methods, with a few techniques using weak supervision limited to predicting a single aspect category per review sentence. In this paper, we present an extremely weakly supervised multi-label Aspect Category Sentiment Analysis framework which does not use any labelled data. We only rely on a single word per class as an initial indicative information. We further propose an automatic word selection technique to choose these seed categories and sentiment words. We explore unsupervised language model post-training to improve the overall performance, and propose a multi-label generator model to generate multiple aspect category-sentiment pairs per review sentence. Experiments conducted on four benchmark datasets showcase our method to outperform other weakly supervised baselines by a significant margin.
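Aspect Sentiment Triplet Extraction (ASTE) aims to extract triplets from a sentence, each consisting of a target entity, the associated sentiment polarity, and the opinion span that rationalizes the polarity. Existing methods fall short in building correlations between target-opinion pairs and neglect the mutual interference among different sentiment triplets. To address these issues, we use a two-stage framework to strengthen the correlation between targets and opinions: in the first stage, targets and opinions are extracted through sequence tagging; then, we append a group of artificial tags named Perceivable Pair, which indicate the span of a specific target-opinion tuple, to the input sentence to obtain more closely correlated target-opinion pair representations. Meanwhile, we reduce the negative interference between triplets by restricting the attention field of tokens. Finally, the polarity is identified according to the representation of the Perceivable Pair. We conduct experiments on four datasets, and the experimental results show the effectiveness of our model.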
Semi-supervised learning has lately shown much promise in improving deep learning models when labeled data is scarce. Common among recent approaches is the use of consistency training on a large amount of unlabeled data to constrain model predictions to be invariant to input noise. In this work, we present a new perspective on how to effectively noise unlabeled examples and argue that the quality of noising, specifically noise produced by advanced data augmentation methods, plays a crucial role in semi-supervised learning. By substituting simple noising operations with advanced data augmentation methods such as RandAugment and back-translation, our method brings substantial improvements across six language and three vision tasks under the same consistency training framework. On the IMDb text classification dataset, with only 20 labeled examples, our method achieves an error rate of 4.20, outperforming the state-of-the-art model trained on 25,000 labeled examples. On a standard semi-supervised learning benchmark, CIFAR-10, our method outperforms all previous approaches and achieves an error rate of 5.43 with only 250 examples. Our method also combines well with transfer learning, e.g., when finetuning from BERT, and yields improvements in the high-data regime, such as ImageNet, both when only 10% of the data is labeled and when a full labeled set with 1.3M extra unlabeled examples is used.
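The consistency training described in this abstract can be sketched in a few lines: the model's prediction on an unlabeled example is compared with its prediction on a noised (augmented) version of the same example, and the mismatch is added to the supervised loss. The code below is a minimal, dependency-free sketch, not the authors' implementation. The use of KL divergence as the mismatch measure, the function names, and the `weight` parameter are assumptions made here for illustration; the logits are taken as given, since the model and the augmentation step (e.g., back-translation) are outside the scope of the sketch.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(logits_original, logits_augmented):
    """Penalize disagreement between predictions on an unlabeled example
    and on its augmented version (assumed here to be a KL term)."""
    p = softmax(logits_original)   # prediction on the clean input, used as the target
    q = softmax(logits_augmented)  # prediction on the noised input
    return kl_divergence(p, q)


def total_loss(supervised_loss, logits_original, logits_augmented, weight=1.0):
    """Supervised loss plus a weighted consistency term (weight is hypothetical)."""
    return supervised_loss + weight * consistency_loss(logits_original, logits_augmented)
```

In this sketch the loss is zero when the two predictions agree and grows as the augmented input pushes the prediction away from the original one, which is the invariance to input noise that the abstract refers to.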
Aspect sentiment triplet extraction (ASTE) aims to extract aspect term, sentiment and opinion term triplets from sentences. Since the initial datasets used to evaluate models on ASTE had flaws, several studies later corrected the initial datasets and released new versions of the datasets independently. As a result, different studies select different versions of datasets to evaluate their methods, which makes ASTE-related works hard to follow. In this paper, we analyze the relation between different versions of datasets and suggest that the entire-space version should be used for ASTE. Besides the sentences containing triplets and the triplets in the sentences, the entire-space version additionally includes the sentences without triplets and the aspect terms which do not belong to any triplets. Hence, the entire-space version is consistent with real-world scenarios and evaluating models on the entire-space version can better reflect the models' performance in real-world scenarios. In addition, experimental results show that evaluating models on non-entire-space datasets inflates the performance of existing models and models trained on the entire-space version can obtain better performance.
Simile recognition involves two subtasks: simile sentence classification that discriminates whether a sentence contains simile, and simile component extraction that locates the corresponding objects (i.e., tenors and vehicles). Recent work ignores features other than surface strings. In this paper, we explore expressive features for this task to achieve more effective data utilization. Particularly, we study two types of features: 1) input-side features that include POS tags, dependency trees and word definitions, and 2) decoding features that capture the interdependence among various decoding decisions. We further construct a model named HGSR, which merges the input-side features as a heterogeneous graph and leverages decoding features via distillation. Experiments show that HGSR significantly outperforms the current state-of-the-art systems and carefully designed baselines, verifying the effectiveness of introduced features. Our code is available at https://github.com/DeepLearnXMU/HGSR.