Smart Paper Notes

Impact of Sentiment Analysis in Fake Review Detection

Amira Yousif , James Buckley

Categories: Natural Language Processing | Machine Learning

2022-12-18

Fake review identification is an important topic and has gained the interest of experts all around the world. Identifying fake reviews is challenging for researchers, and there are several primary challenges to fake review detection. We propose developing an initial research paper for investigating fake reviews by using sentiment analysis. Ten research papers are identified that show fake reviews, and they discuss currently available solutions for predicting or detecting fake reviews. They also show the distribution of fake and truthful reviews through the analysis of sentiment. We summarize and compare previous studies related to fake reviews. We highlight the most significant challenges in the sentiment evaluation process and demonstrate that there is a significant impact on sentiment scores used to identify fake feedback.

Translated by Google Translate

Sentiment analysis and opinion mining on E-commerce site

Fatema Tuz Zohra Anny , Oahidul Islam

Categories: Natural Language Processing | Machine Learning

2022-11-28

Sentiment analysis, or opinion mining, is a representative application of natural language processing (NLP). It has been one of the most significant research topics in recent years. The goal of this study is to address the challenges of sentiment polarity classification. A general technique for categorizing sentiment polarity is presented, along with detailed descriptions of the process. Using the results of the analysis, both sentence-level and review-level classification are performed. Finally, we discuss our plans for future sentiment analysis research.


Cross-Domain Consumer Review Analysis

Aditya Pandey , Kunal Joshi

Categories: Machine Learning

2022-12-23

The paper presents a cross-domain review analysis on four popular review datasets: Amazon, Yelp, Steam, IMDb. The analysis is performed using Hadoop and Spark, which allows for efficient and scalable processing of large datasets. By examining close to 12 million reviews from these four online forums, we hope to uncover interesting trends in sales and customer sentiment over the years. Our analysis will include a study of the number of reviews and their distribution over time, as well as an examination of the relationship between various review attributes such as upvotes, creation time, rating, and sentiment. By comparing the reviews across different domains, we hope to gain insight into the factors that drive customer satisfaction and engagement in different product categories.


Urdu Speech and Text Based Sentiment Analyzer

Waqar Ahmad , Maryam Edalati

Categories: Natural Language Processing

2022-07-19

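Discovering what other people think has always been a key aspect of our information-gathering strategy. People can now actively use information technology to seek out and understand the ideas of others, thanks to the growing availability of opinion-rich resources such as online review sites and personal blogs. Because of its crucial role in understanding people's opinions, sentiment analysis (SA) is an essential task. Existing research, however, focuses primarily on English, with only a small number of studies devoted to low-resource languages. For sentiment analysis, this work presents a new multi-class Urdu dataset based on user reviews. The Urdu data was collected from Twitter. The proposed dataset includes 10,000 reviews that human experts have carefully classified into two categories: positive and negative. The primary purpose of this research is to construct a manually annotated dataset for Urdu sentiment analysis and to establish baseline results. Five different lexicon- and rule-based algorithms, including Naive Bayes, Stanza, TextBlob, VADER, and Flair, are employed, and the experimental results show that Flair, with an accuracy of 70%, outperforms the other tested algorithms.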


Can Machine Learning Tools Support the Identification of Sustainable Design Leads From Product Reviews? Opportunities and Challenges

Michael Saidani , Harrison Kim , Bernard Yannou

Categories: Machine Learning | Artificial Intelligence

2021-12-17

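The growing number of product reviews posted online is a gold mine for designers, who can learn more about the products they develop by capturing the voice of customers and can improve those products accordingly. At the same time, product design and development have an essential role in creating a more sustainable future. Building on recent advances in artificial intelligence techniques for natural language processing, this research aims to develop an integrated machine learning solution for automatically obtaining sustainable design insights from online product reviews. The paper discusses and illustrates the opportunities and challenges offered by existing frameworks, including Python libraries, packages, and state-of-the-art algorithms such as BERT. This contribution discusses the opportunities to reach and the challenges to address in building a machine learning pipeline that extracts insights from product reviews in order to design more sustainable products. The pipeline runs from the identification of sustainability-related reviews to the interpretation of sustainable design leads and comprises five stages: data collection, data formatting, model training, model evaluation, and model deployment. Examples are given of sustainable design insights that can be produced by mining and processing product reviews. Finally, promising lines of future research in the field are provided, including case studies that place standard products side by side with their sustainable alternatives, to compare the features valued by customers and ultimately to generate relevant sustainable design leads.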


The impact of Twitter on political influence on the choice of a running mate: Social Network Analysis and Semantic Analysis -- A Review

Immaculate Wanza , Irad Kamuti , David Gichohi , Kinyua Gikunda

Categories: Natural Language Processing

2022-07-31

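In this new era of social media, social networks are becoming an increasingly important source of user-generated content on the internet. These information resources, which include many people's feelings, opinions, feedback, and comments, are very useful to large enterprises, markets, politics, journalism, and many other fields. Politics is currently one of the most discussed and popular topics on social media networks. Many politicians use microblogging services such as Twitter because they have large numbers of followers and supporters on these networks. Politicians, political parties, political organizations, and foundations use social media networks to communicate with citizens ahead of time. Today, social media is used by thousands of political groups and politicians. On these networks, every politician and party has millions of followers, and politicians find new and innovative ways to urge individuals to participate in politics. Furthermore, social media supports political decision-making by providing recommendations, such as developing policies and strategies based on previous experience, recommending and selecting suitable candidates for a particular constituency, recommending a suitable person for a particular position in the party, and launching political campaigns based on citizens' sentiments on various issues and controversies. This study is a review of the use of social network analysis (SNA) and semantic analysis (SA) on the Twitter platform to study the supporter networks of political leaders, because this can help in decision-making when predicting their political futures.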


BanglaSarc: A Dataset for Sarcasm Detection

Tasnim Sakib Apon , Ramisa Anan , Elizabeth Antora Modhu , Arjun Suter , Ifrit Jamal Sneha , MD. Golam Rabiul Alam

Categories: Natural Language Processing | Artificial Intelligence

2022-09-27

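As one of the most widely spoken languages in the world, Bangla is also seeing growing use on social media. Sarcasm, a positive statement or remark with an underlying negative motive, is widely used on today's social media platforms. Sarcasm detection for English has improved significantly over the past several years, but the situation for Bangla sarcasm detection remains unchanged. As a result, it is still difficult to identify sarcasm in Bangla, and the lack of high-quality data is a major factor. This paper presents BanglaSarc, a dataset constructed specifically for sarcasm detection in Bangla text. The dataset contains 5112 comments/statuses and other content collected from various online social platforms such as Facebook and YouTube, along with a few online blogs. Because the amount of categorized comment data available in Bangla is limited, this dataset will aid research on identifying sarcasm, recognizing people's emotions, detecting various types of Bangla expressions, and other domains. The dataset is publicly available at https://www.kaggle.com/datasets/sakibapon/banglasarc.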


5-Star Hotel Customer Satisfaction Analysis Using Hybrid Methodology

Yongmin Yoo , Yeongjoon Park , Dongjin Lim , Deaho Seo

Categories: Artificial Intelligence | Natural Language Processing

2022-09-26

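Because non-face-to-face services have developed rapidly due to the coronavirus, commerce over the internet, such as sales and reservations, is growing quickly. Consumers also post reviews, suggestions, or judgments about goods or services on websites. Review data written directly by consumers provides positive feedback and beneficial effects, such as creating business value. Analyzing review data is therefore very important from a marketing perspective. Our study proposes a new way to find customer satisfaction factors from review data. We find these factors using a hybrid of data mining techniques, a big-data analysis approach, and natural language processing techniques, a language processing approach. Unlike many past studies on customer satisfaction, the novelty of our work lies in its use of several techniques. According to our analysis, the experimental results are highly accurate.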


Confounds and Overestimations in Fake Review Detection: Experimentally Controlling for Product-Ownership and Data-Origin

Felix Soldner , Bennett Kleinberg , Shane Johnson

Categories: Natural Language Processing

2021-10-28

The popularity of online shopping is steadily increasing. At the same time, fake product reviews are published widely and have the potential to affect consumer purchasing behavior. In response, previous work has developed automated methods for the detection of deceptive product reviews. However, studies vary considerably in terms of classification performance, and many use data that contain potential confounds, which makes it difficult to determine their validity. Two possible confounds are data-origin (i.e., the dataset is composed of more than one source) and product-ownership (i.e., reviews written by individuals who own or do not own the reviewed product). In the present study, we investigate the effect of both confounds for fake review detection. Using an experimental design, we manipulate data-origin, product-ownership, review polarity, and veracity. Supervised learning analysis suggests that review veracity (60.26 - 69.87%) is somewhat detectable but reviews additionally confounded with product-ownership (66.19 - 74.17%), or with data-origin (84.44 - 86.94%) are easier to classify. Review veracity is most easily classified if confounded with product-ownership and data-origin combined (87.78 - 88.12%), suggesting overestimations of the true performance in other work. These findings are moderated by review polarity.


Detecting Spam Reviews on Vietnamese E-commerce Websites

Co Van Dinh , Son T. Luu , Anh Gia-Tuan Nguyen

Categories: Natural Language Processing | Artificial Intelligence

2022-07-27

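Customer reviews play a vital role in online shopping. People often refer to the reviews or comments of previous customers to decide whether to buy a new product. Taking advantage of this behavior, some people create untruthful reviews to deceive customers about the quality of products. These reviews are called spam reviews; they confuse consumers on online shopping platforms and negatively affect online shopping behavior. We propose a dataset called ViSpamReviews, built with a strict annotation procedure, for detecting spam reviews on e-commerce platforms. Our dataset consists of two tasks: a binary classification task for detecting whether a review is spam, and a multi-class classification task for identifying the type of spam. PhoBERT achieved the highest results on both tasks, with macro-averaged F1 scores of 88.93% and 72.17%, respectively.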
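The macro-averaged F1 score reported above is the unweighted mean of the per-class F1 scores, so rare classes count as much as common ones. The following is a minimal sketch of that metric in plain Python; the function name `macro_f1` and the toy labels are illustrative assumptions, not part of the paper's code or data.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, purely illustrative (not taken from ViSpamReviews).
y_true = ["spam", "spam", "ok", "ok", "ok", "ok"]
y_pred = ["spam", "ok", "ok", "ok", "ok", "spam"]
print(macro_f1(y_true, y_pred))
```

Here the "spam" class scores 0.5 and the "ok" class 0.75, so the macro average is 0.625 even though "ok" has twice as many examples.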


Fake or Genuine? Contextualised Text Representation for Fake Review Detection

Rami Mohawesh , Shuxiang Xu , Matthew Springer , Muna Al-Hawawreh , Sumbal Maqsood

Categories: Natural Language Processing | Artificial Intelligence

2021-12-29

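Online reviews have a significant influence on customers' purchasing decisions for any product or service. However, fake reviews can mislead both consumers and companies. Several models have been developed to detect fake reviews using machine learning approaches. Many of these models have limitations that result in low accuracy in distinguishing between fake and genuine reviews. These models focus only on linguistic features to detect fake reviews and fail to capture the semantic meaning of the reviews. To address this, the paper proposes a new ensemble model that employs transformer architectures to discover hidden patterns in a sequence of fake reviews and detect them precisely. The proposed approach combines three transformer models to improve the robustness of fake and genuine behaviour profiling and modelling for detecting fake reviews. Experimental results on semi-real benchmark datasets show the superiority of the proposed model.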
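The abstract does not say how the three models are combined. One common option is majority voting, sketched below under that assumption. The three `model_*` functions are hypothetical rule-based stand-ins for fine-tuned transformer classifiers; they are not the paper's models.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most models."""
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(models, review):
    """Collect one prediction per model and combine them by majority vote."""
    return majority_vote([model(review) for model in models])


# Hypothetical stand-ins for three fine-tuned transformer classifiers.
def model_a(review):
    return "fake" if "best ever" in review.lower() else "genuine"


def model_b(review):
    return "fake" if review.count("!") >= 3 else "genuine"


def model_c(review):
    return "fake" if len(review.split()) < 5 else "genuine"


models = [model_a, model_b, model_c]
print(ensemble_predict(models, "Best ever!!! Buy now"))
print(ensemble_predict(models, "Best ever product, works well after two months of use."))
```

With three voters and two labels a tie cannot occur. The second review trips only one of the three rules, so the ensemble still labels it genuine.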


CovidMis20: COVID-19 Misinformation Detection System on Twitter Tweets using Deep Learning Models

Aos Mulahuwaish , Manish Osti , Kevin Gyorick , Majdi Maabreh , Ajay Gupta , Basheer Qolomany

Categories: Machine Learning | Natural Language Processing

2022-09-13

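Online news and information sources are convenient and accessible ways to learn about current issues. For example, more than 300 million people worldwide engage with posts on Twitter, which creates the possibility of spreading misleading information. In many cases, violent crimes have been committed because of fake news. This study presents the CovidMis20 dataset (COVID-19 Misinformation 2020 dataset), which consists of 1,375,592 tweets collected from February to July 2020. CovidMis20 can be updated automatically to fetch the latest news and is publicly available at https://github.com/everythingguy/covidmis20. The study was conducted using Bi-LSTM deep learning and an ensemble CNN+Bi-GRU for fake news detection. The results show that, with testing accuracies of 92.23% and 90.56%, respectively, the ensemble CNN+Bi-GRU model consistently provided higher accuracy than the Bi-LSTM model.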


Attention-based Bidirectional LSTM for Deceptive Opinion Spam Classification

Ashish Salunkhe

Categories: Natural Language Processing

2021-12-29

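Online reviews play an important role in decision-making in e-commerce. A large proportion of the population decides which places and restaurants to visit, and what to buy and where to buy it, based on reviews posted on the respective platforms. A fraudulent review, or opinion spam, is a review that is untruthful or deceptive. A positive review of a product or restaurant helps attract customers and thereby increases sales, whereas negative reviews may hamper the progress of a restaurant or the sales of a product, leading to a damaged reputation and losses. Fraudulent reviews are deliberately posted on various online review platforms to trick customers into buying, visiting, or turning away from a product or restaurant. They are also written to promote or discredit a product. This work aims to detect reviews and classify them as deceptive or truthful. It applies various deep learning techniques to classify reviews, gives an overview of a proposed approach based on an attention-based bidirectional LSTM that addresses issues related to semantic information in reviews, and presents a comparative study against baseline machine learning techniques for review classification.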


A Deep Convolutional Neural Networks Based Multi-Task Ensemble Model for Aspect and Polarity Classification in Persian Reviews

Milad Vazan , Fatemeh Sadat Masoumi , Sepideh Saeedi Majd

Categories: Natural Language Processing

2022-01-17

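Aspect-based sentiment analysis is important and widely applicable because it can identify all the aspects discussed in a text. However, aspect-based sentiment analysis is most effective when, in addition to identifying all the aspects discussed in the text, it can also identify their polarity. Most previous methods use a pipeline approach: they first identify the aspects and then identify the polarities. Such methods are unsuitable for practical applications because they can lead to model errors. Therefore, in this study, we propose a multi-task learning model based on convolutional neural networks (CNNs) that can simultaneously detect aspect categories and the polarity of each aspect category. A single model may not give the best predictions and can lead to errors such as bias and high variance. To reduce these errors and improve the efficiency of model predictions, combining several models, known as ensemble learning, may give better results. The main purpose of this paper is therefore to create a model based on an ensemble of multi-task deep convolutional neural networks to enhance sentiment analysis in Persian reviews. We evaluated the proposed method using a Persian-language dataset in the movie domain. The Jaccard index and Hamming loss were used to evaluate the performance of the developed models. The results indicate that this new approach increases the efficiency of sentiment analysis models in Persian.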
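The two evaluation measures named above are standard for multi-label problems such as aspect-category detection. The sketch below shows their textbook definitions in plain Python; the function names and the toy movie-review labels are illustrative assumptions, and the paper's exact averaging scheme is not given in the abstract.

```python
def jaccard_index(true_labels, pred_labels):
    """Size of the intersection over size of the union of two label sets."""
    true_set, pred_set = set(true_labels), set(pred_labels)
    union = true_set | pred_set
    if not union:
        return 1.0  # both sets empty: treat as a perfect match
    return len(true_set & pred_set) / len(union)


def hamming_loss(y_true, y_pred):
    """Fraction of label slots that are wrong, over binary indicator rows."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


# Toy example: aspects mentioned in one review (illustrative labels only).
print(jaccard_index({"acting", "plot"}, {"plot", "music"}))

# Toy indicator rows: one row per review, one column per aspect category.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 1, 1], [0, 1, 1]]
print(hamming_loss(y_true, y_pred))
```

A higher Jaccard index is better (1.0 means the predicted and true label sets are identical), while a lower Hamming loss is better (0.0 means no label slot is wrong).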


Dataset of Fake News Detection and Fact Verification: A Survey

Taichi Murayama

Categories: Machine Learning | Natural Language Processing

2021-11-05

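The rapid increase in fake news, which causes significant damage to society, has triggered many fake-news-related studies, including the development of fake news detection and fact verification techniques. The resources for these studies are mainly public datasets taken from web data. We survey 118 datasets related to fake news research from three perspectives: (1) fake news detection, (2) fact verification, and (3) other tasks, for example the analysis of fake news and satire detection. We also describe in detail the tasks they are used for and their characteristics. Finally, we highlight the challenges in constructing fake news datasets and some research opportunities that address these challenges. Our survey facilitates fake news research by helping researchers find suitable datasets without reinventing the wheel, and thereby deepens fake news research.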


Mining and summarizing customer reviews

Categories:

Merchants selling products on the Web often ask their customers to review the products that they have purchased and the associated services. As e-commerce is becoming more and more popular, the number of customer reviews that a product receives grows rapidly. For a popular product, the number of reviews can be in hundreds or even thousands. This makes it difficult for a potential customer to read them to make an informed decision on whether to purchase the product. It also makes it difficult for the manufacturer of the product to keep track and to manage customer opinions. For the manufacturer, there are additional difficulties because many merchant sites may sell the same product and the manufacturer normally produces many kinds of products. In this research, we aim to mine and to summarize all the customer reviews of a product. This summarization task is different from traditional text summarization because we only mine the features of the product on which the customers have expressed their opinions and whether the opinions are positive or negative. We do not summarize the reviews by selecting a subset or rewrite some of the original sentences from the reviews to capture the main points as in the classic text summarization. Our task is performed in three steps: (1) mining product features that have been commented on by customers; (2) identifying opinion sentences in each review and deciding whether each opinion sentence is positive or negative; (3) summarizing the results. This paper proposes several novel techniques to perform these tasks. Our experimental results using reviews of a number of products sold online demonstrate the effectiveness of the techniques.
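The three steps listed in the abstract can be illustrated with a deliberately simplified sketch. It assumes that features are words appearing in at least two sentences and that polarity comes from small hand-written word lists; these heuristics, the word lists, and the sample reviews are all illustrative assumptions and are not the techniques proposed in the paper.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative word lists (assumptions, not from the paper).
POSITIVE = {"great", "good", "excellent", "sharp"}
NEGATIVE = {"poor", "bad", "short", "blurry"}
STOPWORDS = {"the", "is", "a", "and", "but", "its", "has", "very", "this", "i", "it", "too"}


def sentences(reviews):
    """Split each review into non-empty sentences."""
    for review in reviews:
        for sentence in re.split(r"[.!?]+", review):
            if sentence.strip():
                yield sentence.strip()


def tokenize(sentence):
    return re.findall(r"[a-z]+", sentence.lower())


def mine_features(reviews, min_count=2):
    """Step 1: keep non-opinion content words seen in at least min_count sentences."""
    counts = Counter()
    for sentence in sentences(reviews):
        counts.update(set(tokenize(sentence)) - STOPWORDS - POSITIVE - NEGATIVE)
    return {word for word, count in counts.items() if count >= min_count}


def sentence_polarity(tokens):
    """Step 2: label a sentence by counting opinion words; None if neutral."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "positive" if score > 0 else "negative" if score < 0 else None


def summarize(reviews):
    """Step 3: count positive and negative opinion sentences per feature."""
    features = mine_features(reviews)
    summary = defaultdict(lambda: {"positive": 0, "negative": 0})
    for sentence in sentences(reviews):
        tokens = tokenize(sentence)
        polarity = sentence_polarity(tokens)
        if polarity is None:
            continue
        for feature in features & set(tokens):
            summary[feature][polarity] += 1
    return dict(summary)


reviews = [
    "The battery is great. The screen is blurry.",
    "Battery life is too short. Excellent screen!",
    "Good battery and a sharp screen.",
]
print(summarize(reviews))
```

The output groups opinion counts by product feature rather than selecting or rewriting sentences, which is the distinction from classic text summarization that the abstract draws.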


Comparative Study of Sentiment Analysis for Multi-Sourced Social Media Platforms

Keshav Kapur , Rajitha Harikrishnan

Categories: Natural Language Processing | Artificial Intelligence

2022-12-09

There is a vast amount of data generated every second due to the rapidly growing technology in the current world. This area of research attempts to determine the feelings or opinions of people on social media posts. The dataset we used was a multi-source dataset from the comment section of various social networking sites like Twitter, Reddit, etc. Natural Language Processing Techniques were employed to perform sentiment analysis on the obtained dataset. In this paper, we provide a comparative analysis using techniques of lexicon-based, machine learning and deep learning approaches. The Machine Learning algorithm used in this work is Naive Bayes, the Lexicon-based approach used in this work is TextBlob, and the deep-learning algorithm used in this work is LSTM.


Exploring the Distribution Regularities of User Attention and Sentiment toward Product Aspects in Online Reviews

Chenglei Qin , Chengzhi Zhang , Yi Bu

Categories: Natural Language Processing

2022-09-08

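[Purpose] To better understand online reviews and help potential consumers, merchants, and product manufacturers efficiently obtain users' evaluations of product aspects, this paper explores the distribution regularities of user attention and sentiment toward product aspects from the temporal perspective of online reviews. [Design/methodology/approach] Temporal characteristics of online reviews (purchase time, review time, and the time interval between them), similar-attribute clustering, and attribute-level sentiment computation are applied to 340k smartphone reviews of three products on JD.com (a well-known online shopping platform in China) to explore the distribution regularities of user attention and sentiment toward product aspects. [Findings] The empirical results show that a power-law distribution can fit user attention to product aspects, and that reviews posted after short time intervals cover more product aspects. In addition, user sentiment values toward product aspects are significantly higher or lower for short time intervals, which helps in judging a product's strengths and weaknesses. [Research limitations] The paper could not obtain online reviews with temporal characteristics for more products to verify the findings, because of restrictions that shopping platforms place on access to reviews. [Originality/value] This work reveals the distribution regularities of user attention and sentiment toward product aspects, which is of great significance for assisting decision-making, optimizing review presentation, and improving the shopping experience.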
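A power law relates frequency to rank as freq = c * rank^(-alpha), which becomes a straight line on log-log axes. The sketch below estimates alpha and c by ordinary least squares on the logarithms. This is a simple and commonly used approach, but the abstract does not state which fitting method the authors used, and the data here is synthetic rather than taken from the JD.com reviews.

```python
import math


def fit_power_law(ranks, freqs):
    """Least-squares fit of log(freq) = log(c) - alpha * log(rank)."""
    xs = [math.log(r) for r in ranks]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)  # (alpha, c)


# Synthetic attention counts generated from freq = 1000 * rank^(-1.5).
ranks = list(range(1, 11))
freqs = [1000 * r ** -1.5 for r in ranks]
alpha, c = fit_power_law(ranks, freqs)
print(round(alpha, 3), round(c, 1))
```

On real, noisy counts a log-log regression can give biased exponents, so a maximum-likelihood fit is often preferred when the estimate itself matters.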


Using Online Customer Reviews to Classify, Predict, and Learn about Domestic Robot Failures

Shanee Honig , Alon Bartal , Yisrael Parmet , Tal Oron-Gilad

Categories: Robotics

2022-01-10

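A knowledge gap exists regarding which types of failures robots experience in domestic settings and how these failures affect customer experience. We classified 10,072 customer reviews on Amazon by the robot failures described in them, grouping the failures into twelve types and three categories (technical, interaction, and service). We identified sources and types of failures previously overlooked in the literature and combined them into an updated failure taxonomy. We analyzed their frequencies and their relationship with customer star ratings. The results indicate that, for utilitarian domestic robots, technical failures are more detrimental to customer experience than interaction or service failures. Problems with task completion and with robustness and resilience were commonly reported and had the largest negative impact. Future prevention and response strategies should address the robot's technical ability to meet functional goals, operate, and maintain structural integrity over time. Usability and interaction design were less detrimental to customer experience, indicating that customers may be more forgiving of failures in these areas than of failures affecting the robot's practical use. Furthermore, we developed a natural language processing model capable of predicting whether a customer review contains content describing a failure and the type of failure it describes. With this knowledge, designers and researchers of robotic systems can prioritize design and development efforts toward essential issues.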


A Comprehensive Review of Visual-Textual Sentiment Analysis from Social Media Networks

Israa Khalaf Salman Al-Tameemi , Mohammad-Reza Feizi-Derakhshi , Saeed Pashazadeh , Mohammad Asadpour

Categories: Natural Language Processing | Artificial Intelligence

2022-07-05

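Social media networks have become an important aspect of people's lives, serving as a platform for their thoughts, opinions, and emotions. Consequently, automated sentiment analysis (SA) is critical for recognizing people's feelings in ways that other information sources cannot. The analysis of these feelings has a variety of applications, including brand evaluation, YouTube film reviews, and healthcare applications. As social media continues to develop, people post a massive amount of information in different forms, including text, photos, audio, and video. Traditional SA algorithms have thus become limited, as they do not consider the expressiveness of other modalities. By including such characteristics from various material sources, these multimodal data streams provide new opportunities for optimizing the expected results beyond text-based SA. Our study focuses on the forefront field of multimodal SA, which examines visual and textual data posted on social media networks. Many people are more likely to use this kind of information to express themselves on these platforms. To serve as a resource for academics in this rapidly growing field, we present a comprehensive overview of textual and visual SA, including data pre-processing, feature extraction techniques, sentiment benchmark datasets, and the efficacy of multiple classification methodologies suited to each field. We also provide a brief introduction to the most frequently used data fusion strategies and a summary of existing research on visual-textual SA. Finally, we highlight the most significant challenges and investigate several important sentiment applications.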
