In our pursuit of generic criteria for decidable ontology-based querying, we introduce finite-cliquewidth sets (FCS) of existential rules, a model-theoretically defined class of rule sets inspired by the cliquewidth measure from graph theory. By a generic argument, we show that FCS ensures decidability of entailment for a sizable class of queries (dubbed DaMSOQs) subsuming conjunctive queries (CQs). The FCS class properly generalizes the class of finite-expansion sets (FES) and, for signatures of arity at most 2, the class of bounded-treewidth sets (BTS). For higher arities, BTS is subsumed by FCS only indirectly, by means of reification. Despite the generality of FCS, we exhibit a rule set with decidable CQ entailment (by virtue of first-order rewritability) that falls outside FCS, thus demonstrating the incomparability of FCS and the class of finite-unification sets (FUS). Even so, we show that if we restrict ourselves to single-headed rule sets over signatures of arity at most 2, then FCS subsumes FUS.
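To make the setting concrete, here is a textbook-style example of an existential rule (our illustration, not quoted from the paper) whose chase is infinite, so the rule set is not FES, while the generated model is still tame enough for decidable querying:

```latex
% Illustrative existential rule: every person has a parent who is
% again a person.
\forall x\, \bigl(\mathrm{Person}(x) \rightarrow \exists y\, (\mathrm{hasParent}(x,y) \wedge \mathrm{Person}(y))\bigr)
% Starting from the fact Person(a), the chase produces an infinite
% chain a, y_1, y_2, ...; the rule set is therefore not FES, but the
% chain has bounded treewidth (and bounded cliquewidth), so CQ
% entailment remains decidable.
```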
The importance of taking individual, potentially conflicting perspectives into account when dealing with knowledge has been widely recognised. Many existing ontology management approaches fully merge knowledge perspectives, which may require weakening in order to maintain consistency; others represent the distinct views in an entirely detached way. As an alternative, we propose Standpoint Logic, a simple yet versatile multi-modal logic "add-on" for existing KR languages, intended for the integrated representation of domain knowledge relative to diverse, possibly conflicting standpoints, which can be hierarchically organised, combined, and put in relation to each other. Starting from the generic framework of First-Order Standpoint Logic (FOSL), we subsequently focus our attention on the fragment of sentential formulas, for which we provide a polytime translation into the standpoint-free version. This result yields decidability and favourable complexities for a variety of highly expressive decidable fragments of first-order logic. Using some elaborate encoding tricks, we then establish a similar translation for the logic SROIQb_s underlying the OWL 2 DL ontology language. By virtue of this result, existing highly optimised OWL reasoners can be used to provide practical reasoning support for ontology languages extended by standpoint modelling.
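As a schematic illustration (our notation and example, not verbatim from the paper): in the sentential fragment, a standpoint modality \Box_s asserts that a sentence holds under every precisification compatible with standpoint s, so conflicting views can coexist consistently:

```latex
% Illustrative sentential standpoint formula: the culinary and the
% botanical standpoint disagree about tomatoes, yet the conjunction
% is satisfiable because the claims are indexed by different
% standpoints.
\Box_{\mathit{culinary}}\, \mathrm{Vegetable}(\mathit{tomato})
  \;\wedge\;
\Box_{\mathit{botanical}}\, \neg\mathrm{Vegetable}(\mathit{tomato})
```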
Language models demonstrate both quantitative improvements and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; and social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
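For a sense of the evaluation format: many BIG-bench tasks are JSON files with an "examples" list of input/target pairs. The following minimal sketch scores a callable model on such a task under that assumption; `model_fn` is a placeholder, and tasks with multiple targets or "target_scores" would need extra handling.

```python
import json

def exact_match_score(model_fn, task_path):
    """Exact-match accuracy of a callable model on a BIG-bench-style
    JSON task (assumes the simple "examples"/"input"/"target" layout).
    """
    with open(task_path) as f:
        examples = json.load(f)["examples"]
    hits = sum(
        model_fn(ex["input"]).strip() == ex["target"].strip()
        for ex in examples
    )
    return hits / len(examples)
```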
The AGM postulates by Alchourrón, Gärdenfors, and Makinson continue to represent a cornerstone in research related to belief change. Katsuno and Mendelzon (K&M) adopted the AGM postulates for changing belief bases and characterized AGM belief base revision in propositional logic over finite signatures. We generalize K&M's approach to the setting of (multiple) base revision in arbitrary Tarskian logics, covering all logics with a classical model-theoretic semantics and hence a wide variety of logics used in knowledge representation and beyond. Our generic formulation applies to various notions of "bases" (such as belief sets, arbitrary or finite sets of sentences, or single sentences). The core result is a representation theorem showing a two-way correspondence between AGM base revision operators and certain "assignments": functions mapping belief bases to total, yet not necessarily transitive, "preference" relations between interpretations. Alongside, we present a companion result for the case when the AGM postulate of syntax-independence is abandoned. We also provide a characterization of all logics for which our result can be strengthened to assignments producing transitive preference relations (as in K&M's original work), giving rise to two more representation theorems for such logics, according to syntax dependence vs. independence.
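In K&M-style notation (simplified here for illustration; the paper's formal statement is more general), the correspondence takes the familiar shape:

```latex
% A revision operator * satisfies the AGM base revision postulates
% iff there is an assignment B \mapsto \preceq_B of total (yet not
% necessarily transitive) preference relations over interpretations
% such that
\mathrm{Mod}(B * \mu) \;=\; \min\bigl(\mathrm{Mod}(\mu),\, \preceq_B\bigr)
% i.e., the models of the revised base are exactly the
% \preceq_B-minimal models of the new information \mu.
```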
The AGM postulates by Alchourrón, Gärdenfors, and Makinson continue to represent a cornerstone in research related to belief change. We generalize the approach of Katsuno and Mendelzon (KM) for characterizing AGM belief base revision from propositional logic to the setting of (multiple) base revision in arbitrary monotonic logics. Our core result is a representation theorem using assignments of total, yet not necessarily transitive, "preference" relations to belief bases. We also provide a characterization of all logics for which our result can be strengthened to preorder assignments (as in KM's original work).
Logic Mill is a scalable and openly accessible software system that identifies semantically similar documents within either one domain-specific corpus or multi-domain corpora. It uses advanced Natural Language Processing (NLP) techniques to generate numerical representations of documents. Currently it leverages a large pre-trained language model to generate these document representations. The system focuses on scientific publications and patent documents and contains more than 200 million documents. It is easily accessible via a simple Application Programming Interface (API) or via a web interface. Moreover, it is continuously being updated and can be extended to text corpora from other domains. We see this system as a general-purpose tool for future research applications in the social sciences and other domains.
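A hypothetical sketch of how such a service is typically queried over HTTP; the endpoint, authentication header, and response shape below are illustrative placeholders, not Logic Mill's documented API:

```python
import requests

API_URL = "https://api.example.org/similar-documents"  # placeholder endpoint

def find_similar(text: str, top_k: int = 10) -> list[dict]:
    """Embed `text` server-side and return the top_k most similar
    documents (assumed request/response shape)."""
    response = requests.post(
        API_URL,
        json={"text": text, "top_k": top_k},
        headers={"Authorization": "Bearer <YOUR-TOKEN>"},  # placeholder token
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["results"]  # assumed response field

# Example (illustrative): hits = find_similar("A method for ranking patents ...")
```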
The analysis of network structure is essential to many scientific areas, ranging from biology to sociology. As the computational task of clustering these networks into partitions, i.e., solving the community detection problem, is generally NP-hard, heuristic solutions are indispensable. The exploration of expedient heuristics has led to particularly promising approaches in the emerging technology of quantum computing. Motivated by the substantial hardware demands of all established quantum community detection approaches, we introduce a novel QUBO-based approach that needs only number-of-nodes many qubits and is represented by a QUBO matrix as sparse as the input graph's adjacency matrix. This substantial improvement in the sparsity of the QUBO matrix, which is typically very dense in related work, is achieved through the novel concept of separation nodes. Instead of assigning every node to a community directly, the approach relies on identifying a separation-node set which, upon its removal from the graph, yields a set of connected components representing the core components of the communities. A greedy heuristic then assigns the nodes of the separation-node set to the identified community cores, and subsequent experimental results yield a proof of concept. This work hence presents a promising approach to NISQ-ready quantum community detection, catalyzing the application of quantum computers to the network structure analysis of large-scale, real-world problem instances.
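To convey the structural claim, here is a toy construction (ours, with placeholder coefficients; the paper's actual objective is more involved): one binary variable per node marks it as a separation node, and off-diagonal QUBO entries exist only where the adjacency matrix is nonzero.

```python
import networkx as nx

def toy_separation_qubo(graph: nx.Graph, node_cost: float = 1.0,
                        edge_weight: float = 2.0) -> dict:
    """Toy QUBO with binary variables x_i (x_i = 1: node i is a
    separation node). Illustrative only: the coefficients are
    placeholders, but the sketch reproduces the abstract's sparsity
    property, using number-of-nodes many variables and off-diagonal
    entries only for edges of the input graph.
    """
    Q = {}
    for i in graph.nodes:
        Q[(i, i)] = node_cost  # linear cost: keep the separation set small
    for i, j in graph.edges:
        # quadratic terms only on existing edges -> QUBO as sparse as
        # the adjacency matrix
        Q[(i, j)] = -edge_weight * graph[i][j].get("weight", 1.0)
    return Q

# Example (illustrative): Q = toy_separation_qubo(nx.karate_club_graph())
```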
This article presents a memetic algorithm that applies deep reinforcement learning (DRL) to solve practically oriented dual resource constrained flexible job shop scheduling problems (DRC-FJSSP). In recent years, there has been extensive research on DRL techniques, but without considering realistic, flexible, and human-centered shopfloors. A research gap can be identified in the context of make-to-order oriented discontinuous manufacturing, as it is often found in medium-sized companies with high service levels. From practical industry projects in this domain, we recognize requirements to depict flexible machines, human workers and capabilities, setup and processing operations, material arrival times, complex job paths with parallel tasks for bill-of-material (BOM) manufacturing, sequence-dependent setup times, and (partially) automated tasks. On the other hand, intensive research has been done on metaheuristics in the context of DRC-FJSSP. However, there is a lack of suitable and generic scheduling methods that can be applied holistically in sociotechnical production and assembly processes. In this paper, we first formulate an extended DRC-FJSSP induced by the practical requirements mentioned. Then we present our proposed hybrid framework with parallel computing for multicriteria optimization. Through numerical experiments with real-world data, we confirm that the framework generates feasible schedules efficiently and reliably. Using DRL instead of random operations leads to better results and outperforms traditional approaches.
The acquisition of high-quality human annotations through crowdsourcing platforms like Amazon Mechanical Turk (MTurk) is more challenging than expected. The annotation quality might be affected by various aspects, such as annotation instructions, Human Intelligence Task (HIT) design, and the wages paid to annotators. To avoid potentially low-quality annotations, which could mislead the evaluation of automatic summarization system outputs, we investigate the recruitment of high-quality MTurk workers via a three-step qualification pipeline. We show that we can successfully filter out bad workers before they carry out the evaluations and obtain high-quality annotations while optimizing the use of resources. This paper can serve as a basis for the recruitment of qualified annotators in other challenging annotation tasks.
We present NusaCrowd, a collaborative initiative to collect and unite existing resources for Indonesian languages, including opening access to previously non-public resources. Through this initiative, we have brought together 137 datasets and 117 standardized data loaders. The quality of the datasets has been assessed manually and automatically, and their effectiveness has been demonstrated in multiple experiments. NusaCrowd's data collection enables the creation of the first zero-shot benchmarks for natural language understanding and generation in Indonesian and its local languages. Furthermore, NusaCrowd enables the creation of the first multilingual automatic speech recognition benchmark in Indonesian and its local languages. Our work is intended to help advance natural language processing research in under-represented languages.
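For context, standardized loaders of this kind are usually consumed roughly as follows (a hypothetical sketch; the dataset identifier is a placeholder, and the actual entry points should be checked against the NusaCrowd repository):

```python
from datasets import load_dataset  # HuggingFace `datasets` library

# Placeholder identifier: NusaCrowd's catalogue lists the actual
# dataset names and configurations.
dataset = load_dataset("some_indonesian_corpus", split="train")
for example in dataset.select(range(3)):  # peek at a few examples
    print(example)
```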