Smart Paper Notes

Patent Data for Engineering Design: A Review

Shuo Jiang , Serhad Sarica , Binyang Song , Jie Hu , Jianxi Luo

Category: Artificial Intelligence

2021-11-15

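Patent data have been used in engineering design research because they contain a large amount of design information. Recent advances in artificial intelligence and data science present unprecedented opportunities to analyze and make sense of patent data in order to develop design theory and methodology. Here, we survey the patent-for-design literature by its contributions to design theories, methods, tools, and strategies, as well as by the different forms of patent data and the various methods used. Our review sheds light on future research directions for the field.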

translated by Google Translate

A Review on Method Entities in the Academic Literature: Extraction, Evaluation, and Application

Yuzhuo Wang , Chengzhi Zhang , Kai Li

Category: Natural Language Processing

2022-09-08

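In scientific research, the method is an indispensable means of solving scientific problems and a key research object in its own right. As science advances, many scientific methods are being proposed, modified, and used. Authors describe the details of a method in the abstract and body text, and the key entities in the academic literature that reflect the names of methods are called method entities. Exploring the diverse method entities in a large body of academic literature helps scholars understand existing methods, select an appropriate method for a research task, and propose new methods. In addition, the evolution of method entities can reveal the development of a discipline and facilitate knowledge discovery. This article therefore offers a systematic review of methodological and empirical work, focusing on extracting method entities from full-text academic literature and on efforts to build knowledge services with the extracted entities. Definitions of the key concepts involved in this review are proposed first. Based on these definitions, we systematically review the approaches and indicators used to extract and evaluate method entities, with a focus on the pros and cons of each approach. We also survey how extracted method entities are used to build new applications. Finally, the limitations of existing work and potential next steps are discussed.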


Automated scholarly paper review: Possibility and challenges

Jialiang Lin , Jiaxin Song , Zhangping Zhou , Xiaodong Shi

Category: Artificial Intelligence | Natural Language Processing

2021-11-15

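Peer review is a widely accepted mechanism for research evaluation and plays a key role in academic publishing. However, the mechanism has long been criticized, mainly for its inefficiency and subjectivity. Recent years have seen artificial intelligence (AI) applied to assist the peer review process. Nonetheless, as long as humans are involved, such limitations remain inevitable. In this paper, we propose the concept of automated scholarly paper review (ASPR) and review the relevant literature and technologies to discuss the possibility of achieving a fully computerized review process. We further examine the challenges ASPR faces with existing technologies. On the basis of this review and discussion, we conclude that corresponding research and technologies already exist for each stage of ASPR. This confirms that ASPR can be realized in the long term as the relevant technologies continue to develop. The major difficulties in its realization lie in imperfect document parsing and representation, insufficient data, data defects, human-computer interaction, and flawed deep logical reasoning. In the foreseeable future, ASPR and peer review will coexist in a mutually reinforcing manner until ASPR is able to take over the reviewing workload from humans in full.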


The Development and Applications of Food Knowledge Graphs in the Food Science and Industry

Weiqing Min , Chunlin Liu , Leyi Xu , Shuqiang Jiang

Category: Computer Vision

2021-07-13

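The deployment of various networks (e.g., the Internet of Things (IoT) and mobile networks), databases (e.g., nutrition tables and food composition databases), and social media (e.g., Instagram and Twitter) generates huge amounts of multi-type food data, which play a key role in food science and industry. However, owing to well-known data harmonization problems, these multi-source food data appear as information silos, making it difficult to exploit them fully. A food knowledge graph provides unified and standardized conceptual terminology and its relationships in a structured form, and can therefore transform food information silos into a more reusable, globally and digitally connected Internet of Food that benefits a range of applications. To our knowledge, this is the first comprehensive review of food knowledge graphs in food science and industry. We first give a brief introduction to knowledge graphs and then trace progress from food classification and food ontology to food knowledge graphs. Representative applications of food knowledge graphs are outlined in terms of new recipe development, food traceability, food data visualization, personalized dietary recommendation, food search and question answering, visual food object recognition, and intelligent manufacturing of food machinery. We also discuss future directions in this field, such as food knowledge graphs for food supply chain systems and for human health, which deserve further study. Their great potential will attract more research effort toward applying food knowledge graphs in food science and industry.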


Healthcare Knowledge Graph Construction: State-of-the-art, open issues, and opportunities

Bilal Abu-Salih , Muhammad AL-Qurishi , Mohammed Alweshah , Mohammad AL-Smadi , Reem Alfayez , Heba Saadeh

Category: Artificial Intelligence

2022-07-08

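Driven by the demand for efficient and effective big data analytics solutions, the incorporation of data analytics in the healthcare industry has made significant progress. Knowledge graphs (KGs) have proven their utility in this field and underpin many healthcare applications, providing better data representation and knowledge inference. However, owing to the lack of a representative KG construction taxonomy, several existing approaches in this domain are inadequate and inferior. This paper is the first to provide a comprehensive taxonomy and a bird's-eye view of healthcare KG construction. In addition, it thoroughly examines the state-of-the-art techniques in academic work relevant to various healthcare contexts. These techniques are critically evaluated in terms of the methods used for knowledge extraction, the types of knowledge bases and sources, and the evaluation protocols incorporated. Finally, several research findings and open issues in the literature are reported and discussed, opening horizons for future research in this vibrant area.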


Generative Transformers for Design Concept Generation

Qihao Zhu , Jianxi Luo

Category: Natural Language Processing

2022-11-07

Generating novel and useful concepts is essential during the early design stage to explore a large variety of design opportunities, which usually requires advanced design thinking ability and a wide range of knowledge from designers. Growing works on computer-aided tools have explored the retrieval of knowledge and heuristics from design data. However, they only provide stimuli to inspire designers from limited aspects. This study explores the recent advance of the natural language generation (NLG) technique in the artificial intelligence (AI) field to automate the early-stage design concept generation. Specifically, a novel approach utilizing the generative pre-trained transformer (GPT) is proposed to leverage the knowledge and reasoning from textual data and transform them into new concepts in understandable language. Three concept generation tasks are defined to leverage different knowledge and reasoning: domain knowledge synthesis, problem-driven synthesis, and analogy-driven synthesis. The experiments with both human and data-driven evaluation show good performance in generating novel and useful concepts.


Intent Recognition in Conversational Recommender Systems

Sahar Moradizeyveh

Category: Natural Language Processing | Machine Learning

2022-12-06

Any organization needs to improve its products, services, and processes. In this context, engaging with customers and understanding their journey is essential. Organizations have leveraged various techniques and technologies to support customer engagement, from call centres to chatbots and virtual agents. Recently, these systems have used Machine Learning (ML) and Natural Language Processing (NLP) to analyze large volumes of customer feedback and engagement data. The goal is to understand customers in context and provide meaningful answers across various channels. Despite multiple advances in Conversational Artificial Intelligence (AI) and Recommender Systems (RS), it is still challenging to understand the intent behind customer questions during the customer journey. To address this challenge, in this paper, we study and analyze the recent work in Conversational Recommender Systems (CRS) in general and, more specifically, in chatbot-based CRS. We introduce a pipeline to contextualize the input utterances in conversations. We then take the next step towards leveraging reverse feature engineering to link the contextualized input and learning model to support intent recognition. Since performance evaluation is achieved based on different ML models, we use transformer-based models to evaluate the proposed approach using a labelled dialogue dataset (MSDialogue) of question-answering interactions between information seekers and answer providers.


AI in HCI Design and User Experience

Wei Xu

Category: Artificial Intelligence

2023-01-03

In this chapter, we review and discuss the transformation of AI technology in HCI/UX work and assess how AI technology will change how we do the work. We first discuss how AI can be used to enhance the result of user research and design evaluation. We then discuss how AI technology can be used to enhance HCI/UX design. Finally, we discuss how AI-enabled capabilities can improve UX when users interact with computing systems, applications, and services.


Is Neuro-Symbolic AI Meeting its Promise in Natural Language Processing? A Structured Review

Kyle Hamilton , Aparna Nayak , Bojan Božić , Luca Longo

Category: Artificial Intelligence | Natural Language Processing | Machine Learning

2022-02-24

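Advocates of Neuro-Symbolic Artificial Intelligence (NeSy) assert that combining deep learning with symbolic reasoning will lead to stronger AI than either paradigm on its own. As successful as deep learning has been, it is generally accepted that even our best deep learning systems are not very good at abstract reasoning. And since reasoning is inextricably linked to language, it makes intuitive sense that Natural Language Processing (NLP) would be a particularly well-suited candidate for NeSy. We conduct a structured review of studies implementing NeSy for NLP, with the aim of answering the question of whether NeSy is indeed meeting its promises: reasoning, out-of-distribution generalization, interpretability, learning and reasoning from small data, and transferability to new domains. We examine the impact of knowledge representation, such as rules and semantic networks, of language structure and relational structure, and of whether implicit or explicit reasoning contributes to higher promise scores. We find that systems in which logic is compiled into the neural network satisfy the most NeSy goals, while other factors, such as knowledge representation or the type of neural architecture, show no clear correlation with goals being met. We find many discrepancies in how reasoning is defined, specifically in relation to human-level reasoning, which affect decisions about model architectures and drive conclusions that are not always consistent across studies. We therefore advocate a more methodical approach to applying theories of human reasoning, as well as the development of appropriate benchmarks, which we hope can lead to a better understanding of progress in the field. We make our data and code available on GitHub for further analysis.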


Survey of NLP in Pharmacology: Methodology, Tasks, Resources, Knowledge, and Tools

Dimitar Trajanov , Vangel Trajkovski , Makedonka Dimitrieva , Jovana Dobreva , Milos Jovanovik , Matej Klemen , Aleš Žagar , Marko Robnik-Šikonja

Category: Natural Language Processing | Machine Learning

2022-08-22

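Natural language processing (NLP) is an area of artificial intelligence that applies information technologies to process human language, understand it to a certain degree, and use it in various applications. The area has developed rapidly in the last few years and now employs modern variants of deep neural networks to extract relevant patterns from large text corpora. The main objective of this work is to survey the recent use of NLP in the field of pharmacology. As our work shows, NLP is a highly relevant information extraction and processing approach for pharmacology. It has been used extensively, from intelligent searches through thousands of medical documents to finding traces of adverse drug interactions in social media. We split our coverage into five categories to survey modern NLP methodology, commonly addressed tasks, relevant textual data, knowledge bases, and useful programming libraries. We divide each of the five categories into appropriate subcategories, describe their main properties and ideas, and summarize them in tabular form. The resulting survey presents a comprehensive overview of the area that is useful to practitioners and interested observers.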


Data-Centric Epidemic Forecasting: A Survey

Alexander Rodríguez , Harshavardhan Kamarthi , Pulak Agarwal , Javen Ho , Mira Patel , Suchet Sapre , B. Aditya Prakash

Category: Machine Learning

2022-07-19

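The COVID-19 pandemic has highlighted the importance of epidemic forecasting for decision makers in multiple domains, from public health to the economy as a whole. Although forecasting epidemic progression is often conceptualized as analogous to weather forecasting, it has some key differences and remains a non-trivial task. The spread of disease is subject to multiple confounding factors spanning human behavior, pathogen dynamics, and weather and environmental conditions. Research interest has been fueled by the increased availability of rich data sources that capture previously unobservable facets, and by initiatives from government public health and funding agencies. This has resulted, in particular, in a spate of work on "data-centered" solutions, which have the potential to enhance our forecasting capabilities by leveraging non-traditional data sources as well as recent innovations in AI and machine learning. This survey examines various data-driven methodological and practical advances and introduces a conceptual framework for navigating them. First, we enumerate the large number of epidemiological datasets and novel data streams relevant to epidemic forecasting, which capture factors such as symptomatic online surveys, retail and commerce, mobility, genomics data, and more. Next, we discuss methods and modeling paradigms, focusing on recent data-driven statistical and deep learning methods as well as on the novel class of hybrid models that combine the domain knowledge of mechanistic models with the effectiveness and flexibility of statistical approaches. We also discuss experiences and challenges that arise in the real-world deployment of these forecasting systems, including decision-making informed by forecasts. Finally, we highlight some challenges and open problems found across the forecasting pipeline.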


Artificial Intelligence in Concrete Materials: A Scientometric View

Zhanzhao Li , Aleksandra Radlińska

Category: Artificial Intelligence

2022-09-17

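Artificial intelligence (AI) has emerged as a transformative and versatile tool, breaking new frontiers across scientific domains. Among its most promising applications, AI research is flourishing in concrete science and engineering, where it has offered new insights into mixture design optimization and service life prediction of cementitious systems. This chapter aims to uncover the main research interests and knowledge structure of the existing literature on AI for concrete materials. To begin with, a total of 389 articles published from 1990 to 2020 were retrieved from the Web of Science. Scientometric tools such as keyword co-occurrence analysis and document co-citation analysis were used to quantify the features and characteristics of the research field. The findings bring to light pressing questions in data-driven concrete research and suggest future opportunities for the concrete community to make full use of the capabilities of AI techniques.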


The Harvard USPTO Patent Dataset: A Large-Scale, Well-Structured, and Multi-Purpose Corpus of Patent Applications

Mirac Suzgun , Luke Melas-Kyriazi , Suproteem K. Sarkar , Scott Duke Kominers , Stuart M. Shieber

Category: Natural Language Processing | Machine Learning

2022-07-08

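Innovation is a major driver of economic and social development, and information about many kinds of innovation is embedded in the semi-structured data of patents and patent applications. Although the impact and novelty of innovations expressed in patent data are difficult to measure by traditional means, ML offers a promising set of techniques for evaluating novelty, summarizing contributions, and embedding semantics. In this paper, we introduce the Harvard USPTO Patent Dataset (HUPD), a large-scale, well-structured, and multi-purpose corpus of English-language patent applications filed with the United States Patent and Trademark Office (USPTO) between 2004 and 2018. With more than 4.5 million patent documents, HUPD is two to three times larger than comparable corpora. Unlike patent datasets previously proposed in NLP, HUPD contains the inventor-submitted versions of patent applications (not the final versions of granted patents), which allows us to study patentability at the time of filing using NLP methods for the first time. It is also novel in including rich structured metadata alongside the text of patent filings: by providing each application's metadata along with all of its text fields, the dataset enables researchers to perform a new set of NLP tasks that leverage variation in structured covariates. As a case study of the kinds of research HUPD makes possible, we introduce a new task to the NLP community, namely binary classification of patent decisions. We also show that the structured metadata provided in the dataset enables explicit studies of concept shift for this task. Finally, we demonstrate how HUPD can be used for three additional tasks: multi-class classification of patent subject areas, language modeling, and summarization.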


Biologically Inspired Design Concept Generation Using Generative Pre-Trained Transformers

Qihao Zhu , Xinyu Zhang , Jianxi Luo

Category: Natural Language Processing

2022-12-26

Biological systems in nature have evolved for millions of years to adapt and survive the environment. Many features they developed can be inspirational and beneficial for solving technical problems in modern industries. This leads to a specific form of design-by-analogy called bio-inspired design (BID). Although BID as a design method has been proven beneficial, the gap between biology and engineering continuously hinders designers from effectively applying the method. Therefore, we explore the recent advance of artificial intelligence (AI) for a data-driven approach to bridge the gap. This paper proposes a generative design approach based on the generative pre-trained language model (PLM) to automatically retrieve and map biological analogy and generate BID in the form of natural language. The latest generative pre-trained transformer, namely GPT-3, is used as the base PLM. Three types of design concept generators are identified and fine-tuned from the PLM according to the looseness of the problem space representation. Machine evaluators are also fine-tuned to assess the mapping relevancy between the domains within the generated BID concepts. The approach is evaluated and then employed in a real-world project of designing light-weighted flying cars during its conceptual design phase. The results show our approach can generate BID concepts with good performance.


Analyzing the State of Computer Science Research with the DBLP Discovery Dataset

Lennart Küll

Category: Natural Language Processing

2022-12-01

The number of scientific publications continues to rise exponentially, especially in Computer Science (CS). However, current solutions to analyze those publications restrict access behind a paywall, offer no features for visual analysis, limit access to their data, only focus on niches or sub-fields, and/or are not flexible and modular enough to be transferred to other datasets. In this thesis, we conduct a scientometric analysis to uncover the implicit patterns hidden in CS metadata and to determine the state of CS research. Specifically, we investigate trends of the quantity, impact, and topics for authors, venues, document types (conferences vs. journals), and fields of study (compared to, e.g., medicine). To achieve this we introduce the CS-Insights system, an interactive web application to analyze CS publications with various dashboards, filters, and visualizations. The data underlying this system is the DBLP Discovery Dataset (D3), which contains metadata from 5 million CS publications. Both D3 and CS-Insights are open-access, and CS-Insights can be easily adapted to other datasets in the future. The most interesting findings of our scientometric analysis include that i) there has been a stark increase in publications, authors, and venues in the last two decades, ii) many authors only recently joined the field, iii) the most cited authors and venues focus on computer vision and pattern recognition, while the most productive prefer engineering-related topics, iv) the preference of researchers to publish in conferences over journals dwindles, v) on average, journal articles receive twice as many citations compared to conference papers, but the contrast is much smaller for the most cited conferences and journals, and vi) journals also get more citations in all other investigated fields of study, while only CS and engineering publish more in conferences than journals.


Survey of Generative Methods for Social Media Analysis

Stan Matwin , Aristides Milios , Paweł Prałat , Amilcar Soares , François Théberge

Category: Machine Learning

2021-12-13

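This survey draws a broad, panoramic picture of the state of the art (SoTA) of research on generative methods for the analysis of social media data. It fills a gap, as existing survey articles are either limited in scope or dated. We include two important aspects that are currently gaining importance in mining and modeling social media: dynamics and networks. Social dynamics are important for understanding the spread of influence or disease, the formation of friendships, and so on. Networks, on the other hand, can capture various complex relationships, providing additional insight and identifying important patterns that would otherwise go unnoticed.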


Recent Advances in Automated Question Answering In Biomedical Domain

Krishanu Das Baksi

Category: Artificial Intelligence | Natural Language Processing

2021-11-10

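The objective of automated question answering (QA) systems is to provide answers to user queries in a time-efficient manner. The answers are usually found either in databases (or knowledge bases) or in a collection of documents commonly referred to as the corpus. Over the past few decades, knowledge acquisition has proliferated, and consequently the number of new scientific articles in the biomedical field has grown exponentially. As a result, it has become difficult to keep track of all the information in the domain, even for domain experts. With improvements in commercial search engines, users can type in their queries and get a small set of the most relevant documents, along with relevant snippets from those documents in some cases. However, manually looking for the required information or answers can still be tedious and time-consuming. This has necessitated the development of efficient QA systems that aim to provide exact and precise answers to user-provided natural language questions in the biomedical domain. In this paper, we introduce the basic methodologies used for developing general-domain QA systems and then thoroughly investigate different aspects of biomedical QA systems, including benchmark datasets and several proposed approaches that use structured databases and collections of texts. We also examine the limitations of current systems and explore potential avenues for further advancement.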


Law Informs Code: A Legal Informatics Approach to Aligning Artificial Intelligence with Humans

John J. Nay

Category: Artificial Intelligence | Machine Learning

2022-09-14

We are currently unable to specify human goals and societal values in a way that reliably directs AI behavior. Law-making and legal interpretation form a computational engine that converts opaque human values into legible directives. "Law Informs Code" is the research agenda capturing complex computational legal processes, and embedding them in AI. Similar to how parties to a legal contract cannot foresee every potential contingency of their future relationship, and legislators cannot predict all the circumstances under which their proposed bills will be applied, we cannot ex ante specify rules that provably direct good AI behavior. Legal theory and practice have developed arrays of tools to address these specification problems. For instance, legal standards allow humans to develop shared understandings and adapt them to novel situations. In contrast to more prosaic uses of the law (e.g., as a deterrent of bad behavior through the threat of sanction), leveraged as an expression of how humans communicate their goals, and what society values, Law Informs Code. We describe how data generated by legal processes (methods of law-making, statutory interpretation, contract drafting, applications of legal standards, legal reasoning, etc.) can facilitate the robust specification of inherently vague human goals. This increases human-AI alignment and the local usefulness of AI. Toward society-AI alignment, we present a framework for understanding law as the applied philosophy of multi-agent alignment. Although law is partly a reflection of historically contingent political power - and thus not a perfect aggregation of citizen preferences - if properly parsed, its distillation offers the most legitimate computational comprehension of societal values available. If law eventually informs powerful AI, engaging in the deliberative political process to improve law takes on even more meaning.


Automatic Related Work Generation: A Meta Study

Xiangci Li , Jessica Ouyang

Category: Natural Language Processing

2022-01-06

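Academic research is an exploratory activity aimed at solving problems that have never been solved before. By its nature, every piece of academic research needs a literature review to distinguish its novelties from what prior work has already addressed. In natural language processing, this literature review is usually presented in the "Related Work" section. Given the rest of a research paper and a list of cited papers, the task of automatic related work generation aims to generate the "Related Work" section automatically. Although this task was proposed 10 years ago, only recently has it come to be viewed as a variant of the scientific multi-document summarization problem. Even today, however, the problems of automatic related work generation and citation text generation have not been standardized. In this survey, we conduct a meta-study that compares the existing literature on related work generation from the perspectives of problem formulation, dataset collection, methodological approach, performance evaluation, and future prospects, to give readers insight into the progress of state-of-the-art studies and into how future studies can be conducted. We also survey relevant fields of study that we suggest future work consider integrating.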


A Comprehensive Review of Visual-Textual Sentiment Analysis from Social Media Networks

Israa Khalaf Salman Al-Tameemi , Mohammad-Reza Feizi-Derakhshi , Saeed Pashazadeh , Mohammad Asadpour

Category: Natural Language Processing | Artificial Intelligence

2022-07-05

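Social media networks have become an important part of people's lives, serving as a platform for their ideas, opinions, and emotions. Consequently, automated sentiment analysis (SA) is critical for recognizing people's feelings in ways that other information sources cannot. The analysis of these feelings has a variety of applications, including brand evaluation, YouTube film reviews, and healthcare applications. As social media continues to develop, people post massive amounts of information in different forms, including text, photos, audio, and video. Traditional SA algorithms have therefore become limited, as they do not consider the expressiveness of other modalities. By including such features from various sources, these multimodal data streams provide new opportunities for improving the expected results beyond text-based SA. Our study focuses on the cutting-edge field of multimodal SA, which examines visual and textual data posted on social media networks. Many people are more likely to use this kind of information to express themselves on these platforms. To serve as a resource for scholars in this rapidly growing field, we present a comprehensive overview of textual and visual SA, including data preprocessing, feature extraction techniques, sentiment benchmark datasets, and the efficacy of multiple classification methods suited to each field. We also briefly introduce the most commonly used data fusion strategies and summarize existing research on visual-textual SA. Finally, we highlight the most significant challenges and examine several important sentiment applications.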
