RDF knowledge graphs (KGs) are powerful data structures that represent factual statements created from heterogeneous data sources. KG creation is laborious and demands data management techniques to be executed efficiently. This paper tackles the problem of automatically generating KG creation processes; it proposes techniques for planning and transforming heterogeneous data into RDF triples following mapping assertions specified in the RDF Mapping Language (RML). Given a set of mapping assertions, the planner provides an optimized execution plan by partitioning and scheduling the execution of the assertions. First, the planner assesses an optimized number of partitions considering the number of data sources, the type of mapping assertions, and the associations between different assertions. After providing a list of partitions and the assertions that belong to each partition, the planner determines their execution order. A greedy algorithm is implemented to generate bushy tree plans for the execution of the partitions. Bushy tree plans are translated into operating system commands that guide the execution of the partitions of mapping assertions in the order indicated by the bushy tree. The proposed optimization approach is evaluated over state-of-the-art RML-compliant engines and existing benchmarks of data sources and RML triples maps. Our experimental results show that the performance of the studied engines can be considerably improved, particularly in complex settings with a large number of triples maps and large data sources. As a result, engines that time out in complex cases can produce at least a portion of the KG when the planner is applied.
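The greedy step described above can be sketched as repeatedly merging the two cheapest subplans into a parent node, which yields a bushy (rather than left-deep) execution tree. This is a minimal illustration under assumed per-partition costs, not the paper's implementation:

```python
import heapq
import itertools

def bushy_plan(partitions):
    """Greedily combine the two cheapest subplans into a parent node.
    `partitions` maps partition names to estimated execution costs;
    the returned nested tuple is the root of the bushy execution tree."""
    counter = itertools.count()  # tie-breaker so heap never compares plans
    heap = [(cost, next(counter), name) for name, cost in partitions.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        c1, _, p1 = heapq.heappop(heap)   # cheapest subplan
        c2, _, p2 = heapq.heappop(heap)   # second-cheapest subplan
        heapq.heappush(heap, (c1 + c2, next(counter), (p1, p2)))
    return heap[0][2]

plan = bushy_plan({"P1": 4, "P2": 1, "P3": 2, "P4": 3})
```

Merging cheap partitions first keeps intermediate results small, which is the usual rationale for bushy plans over left-deep ones.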
Although existing data sources encode a vast amount of rich and valuable data, they have mainly been created independently, which poses significant challenges for their integration. Mapping languages, e.g., RML and R2RML, facilitate the declarative specification of meta-data and of the process of integrating data into a knowledge graph. Besides expressing correspondences between data sources and a unified schema, mapping rules can also include knowledge extraction functions. Combining mapping rules and functions represents a powerful formalism for specifying pipelines that transparently integrate data into a knowledge graph. Surprisingly, these formalisms are not fully adopted, and many knowledge graphs are created by executing ad-hoc programs to pre-process and integrate data. In this paper, we present EABlock, an approach that integrates entity alignment (EA) as part of RML mapping rules. EABlock includes a block of functions that perform entity recognition over textual attributes and link the recognized entities to the corresponding resources in Wikidata, DBpedia, and domain-specific thesauri, e.g., UMLS. EABlock provides agile and efficient techniques to evaluate the functions and transfer the mappings so as to facilitate their application in any RML-compliant engine. We have empirically evaluated EABlock's performance, and the results show that EABlock speeds up knowledge graph creation pipelines that require entity recognition and linking in state-of-the-art RML-compliant engines. EABlock is also publicly available as a tool through a GitHub repository (https://github.com/sdm-tib/eablock) and a DOI (https://doi.org/10.5281/zenodo.5779777).
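As a purely illustrative stand-in for the kind of function block EABlock encapsulates, the sketch below "recognizes" entities by dictionary lookup and links them to placeholder IRIs; the actual block uses NER models and links against Wikidata, DBpedia, and domain thesauri:

```python
def link_entities(text, thesaurus):
    """Toy entity recognition and linking: find known surface forms in a
    textual attribute and map them to knowledge-graph IRIs. All IRIs here
    are illustrative placeholders, not real Wikidata/DBpedia resources."""
    links = {}
    for surface, iri in thesaurus.items():
        if surface in text:
            links[surface] = iri
    return links

# hypothetical mini-thesaurus with a placeholder IRI
kb = {"Aspirin": "http://example.org/entity/Aspirin"}
links = link_entities("Patient was given Aspirin daily.", kb)
```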
In this paper, we present a multi-objective approach to the EEG inverse problem. This formulation does not require unknown parameters that involve empirical procedures. Due to the combinatorial characteristics of the problem, this alternative incorporates evolutionary strategies into the solution method. The result is a Multi-objective Evolutionary Algorithm based on Anatomical Restrictions (MOEAAR) to estimate distributed solutions. Comparative tests were carried out between this approach and three classical regularization methods: LASSO, Ridge-L, and ENET-L. In the experimental phase, regression models were selected to obtain sparse and distributed solutions. The analysis involved simulated data with different signal-to-noise ratios (SNR). The quality-control metrics were localization error, spatial resolution, and visibility. MOEAAR showed better stability than the classical methods in the reconstruction and localization of maximum activations. The L0 norm was used to estimate sparse solutions with the evolutionary approach, and the results were relevant.
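The two competing criteria can be made concrete with a tiny sketch: each candidate source configuration is scored by a data-fit residual and by the L0 norm (number of active sources), and Pareto dominance compares candidates. This is a stand-in for the objective evaluation inside a multi-objective evolutionary loop, not MOEAAR itself:

```python
import numpy as np

def objectives(x, A, b):
    """Two objectives to minimise for the inverse problem b = A x:
    data-fit residual and L0 sparsity (number of active sources)."""
    return (float(np.linalg.norm(A @ x - b)), int(np.count_nonzero(x)))

def dominates(f, g):
    """Pareto dominance (minimisation): f is no worse in both objectives
    and strictly better in at least one."""
    return f[0] <= g[0] and f[1] <= g[1] and f != g

# toy lead-field matrix and measurement, purely illustrative
A, b = np.eye(2), np.array([1.0, 0.0])
f_sparse = objectives(np.array([1.0, 0.0]), A, b)   # perfect fit, 1 source
f_dense = objectives(np.array([1.0, 1.0]), A, b)    # worse fit, 2 sources
```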
In this paper, we propose an end-to-end framework that jointly learns keypoint detection, descriptor representation, and cross-frame matching for the task of image-based 3D localization. Prior art has tackled each of these components individually, purportedly to alleviate the difficulty of effectively training a holistic network. We design a self-supervised image warping correspondence loss for both feature detection and matching, a weakly-supervised epipolar constraint loss on relative camera pose learning, and a directional matching scheme that detects keypoint features in a source image and performs coarse-to-fine correspondence search on the target image. We leverage this framework to enforce cycle consistency in our matching module. In addition, we propose a new loss to robustly handle both definite inlier/outlier matches and less-certain matches. The integration of these learning mechanisms enables end-to-end training of a single network performing all three localization components. Benchmarking our approach on public datasets exemplifies how such an end-to-end framework is able to yield more accurate localization that outperforms both traditional methods and state-of-the-art weakly supervised methods.
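Cycle consistency in matching can be sketched as mutual nearest-neighbour filtering: a putative match (i, j) survives only if the backward search from j returns to i. This is a simplified, non-learned stand-in for the idea, assuming precomputed unit-norm descriptors:

```python
import numpy as np

def mutual_matches(desc_a, desc_b):
    """Cycle-consistent matching: keep a pair (i, j) only if j is i's
    nearest neighbour in B AND i is j's nearest neighbour back in A.
    desc_a, desc_b: rows are unit-norm descriptor vectors."""
    sim = desc_a @ desc_b.T          # cosine similarity matrix
    fwd = sim.argmax(axis=1)         # forward match  A -> B
    bwd = sim.argmax(axis=0)         # backward match B -> A
    return [(i, int(j)) for i, j in enumerate(fwd) if bwd[j] == i]
```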
Quantifying the perceptual similarity of two images is a long-standing problem in low-level computer vision. The natural image domain commonly relies on supervised learning, e.g., a pre-trained VGG, to obtain a latent representation. However, due to domain shift, pre-trained models from the natural image domain might not apply to other image domains, such as medical imaging. Notably, in medical imaging, evaluating the perceptual similarity is exclusively performed by specialists trained extensively in diverse medical fields. Thus, medical imaging remains devoid of task-specific, objective perceptual measures. This work answers the question: Is it necessary to rely on supervised learning to obtain an effective representation that could measure perceptual similarity, or is self-supervision sufficient? To understand whether recent contrastive self-supervised representation (CSR) may come to the rescue, we start with natural images and systematically evaluate CSR as a metric across numerous contemporary architectures and tasks and compare them with existing methods. We find that in the natural image domain, CSR behaves on par with the supervised one on several perceptual tests as a metric, and in the medical domain, CSR better quantifies perceptual similarity concerning the experts' ratings. We also demonstrate that CSR can significantly improve image quality in two image synthesis tasks. Finally, our extensive results suggest that perceptuality is an emergent property of CSR, which can be adapted to many image domains without requiring annotations.
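Using a learned representation as a perceptual metric typically reduces to a distance between feature embeddings. A minimal sketch, assuming feature vectors from some (here unspecified) contrastive encoder; the encoder itself is not shown:

```python
import numpy as np

def perceptual_distance(feat_a, feat_b):
    """1 - cosine similarity between two feature vectors. In the paper's
    setting the features would come from a contrastive self-supervised
    encoder (CSR); here any nonzero vectors stand in."""
    a = feat_a / np.linalg.norm(feat_a)
    b = feat_b / np.linalg.norm(feat_b)
    return float(1.0 - a @ b)
```

Identical directions give distance 0 and orthogonal features give distance 1, so lower values mean "perceptually closer" under the chosen representation.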
In constrained reinforcement learning (C-RL), an agent seeks to learn from the environment a policy that maximizes the expected cumulative reward while satisfying minimum requirements in secondary cumulative reward constraints. Several algorithms rooted in sampled-based primal-dual methods have been recently proposed to solve this problem in policy space. However, such methods are based on stochastic gradient descent ascent algorithms whose trajectories are connected to the optimal policy only after a mixing output stage that depends on the algorithm's history. As a result, there is a mismatch between the behavioral policy and the optimal one. In this work, we propose a novel algorithm for constrained RL that does not suffer from these limitations. Leveraging recent results on regularized saddle-flow dynamics, we develop a novel stochastic gradient descent-ascent algorithm whose trajectories converge to the optimal policy almost surely.
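The primal-dual idea behind the sampled-based methods the abstract refers to can be illustrated with a plain projected gradient descent-ascent loop on a toy Lagrangian. This shows the classical scheme, not the proposed regularized saddle-flow algorithm, and the objective is purely illustrative:

```python
# Toy constrained problem standing in for constrained RL:
#   maximize f(x) = -(x - 2)^2   subject to   x <= 1
# Lagrangian: L(x, lam) = f(x) + lam * (1 - x), with lam >= 0.
x, lam, eta = 0.0, 0.0, 0.05
for _ in range(2000):
    grad_x = -2.0 * (x - 2.0) - lam       # ascend L in the primal variable
    grad_lam = 1.0 - x                    # descend L in the dual variable
    x += eta * grad_x
    lam = max(0.0, lam - eta * grad_lam)  # project multiplier onto lam >= 0
# the iterates approach the constrained optimum x = 1 with multiplier lam = 2
```

The multiplier settles at the value that exactly cancels the objective's pull across the constraint boundary, which is the saddle point of the Lagrangian.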
We propose a structure-preserving model-reduction methodology for large-scale dynamic networks with tightly-connected components. First, the coherent groups are identified by a spectral clustering algorithm on the graph Laplacian matrix that models the network feedback. Then, a reduced network is built, where each node represents the aggregate dynamics of each coherent group, and the reduced network captures the dynamic coupling between the groups. We provide an upper bound on the approximation error when the network graph is randomly generated from a weighted stochastic block model. Finally, numerical experiments align with and validate our theoretical findings.
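The group-identification step can be sketched in its simplest two-group form: the sign pattern of the Fiedler vector (eigenvector of the second-smallest Laplacian eigenvalue) separates two tightly connected components joined by a weak edge. This is illustrative only; general k-way clustering uses more eigenvectors:

```python
import numpy as np

def coherent_groups(A):
    """Split nodes into two coherent groups via the sign of the Fiedler
    vector of the graph Laplacian L = D - A."""
    L = np.diag(A.sum(axis=1)) - A   # graph Laplacian
    w, V = np.linalg.eigh(L)         # eigenvalues in ascending order
    fiedler = V[:, 1]                # second-smallest eigenvalue's vector
    return fiedler >= 0              # boolean group membership per node

# two tight triangles joined by one weak edge (weights are illustrative)
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    A[i, j] = A[j, i] = 1.0
A[2, 3] = A[3, 2] = 0.1              # weak inter-group coupling
groups = coherent_groups(A)
```

The weakly coupled triangles land in opposite sign groups, matching the intuition that coherency follows tight connectivity.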
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
In this research work, we have demonstrated the application of Mask-RCNN (Regional Convolutional Neural Network), a deep-learning algorithm for computer vision and specifically object detection, to the semiconductor defect inspection domain. Stochastic defect detection and classification during semiconductor manufacturing has become a challenging task as circuit pattern dimensions continue to shrink (e.g., for pitches below 32 nm). Defect inspection and analysis by state-of-the-art optical and e-beam inspection tools is generally driven by rule-based techniques, which often leads to misclassification and thereby necessitates human expert intervention. In this work, we have revisited and extended our previous deep learning-based defect classification and detection method towards improved defect instance segmentation in SEM images, capturing the precise extent of each defect and generating a mask for each defect category/instance. This also enables extracting and calibrating each segmented mask and quantifying the pixels that make it up, which in turn allows us to count the instances of each defect category as well as to calculate the surface area in terms of pixels. We aim at detecting and segmenting different types of inter-class stochastic defect patterns such as bridge, break, and line collapse, as well as at accurately differentiating between intra-class multi-categorical defect bridge scenarios (thin/single/multi-line/horizontal/non-horizontal) for aggressive pitches as well as thin resists (high-NA applications). Our proposed approach demonstrates its effectiveness both quantitatively and qualitatively.
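The per-category counting and pixel-area computation from predicted instance masks can be sketched as follows; mask shapes and class names are illustrative, not from the paper's data:

```python
import numpy as np

def defect_stats(masks, labels):
    """masks: list of boolean HxW arrays, one per detected instance.
    labels: parallel list of defect class names.
    Returns {class: (instance_count, total_pixel_area)}."""
    stats = {}
    for m, lab in zip(masks, labels):
        cnt, area = stats.get(lab, (0, 0))
        stats[lab] = (cnt + 1, area + int(m.sum()))  # pixels per mask
    return stats

# three illustrative instance masks on a 4x4 image
m1 = np.zeros((4, 4), bool); m1[:2, :2] = True   # 4-pixel "bridge"
m2 = np.zeros((4, 4), bool); m2[3, :] = True     # 4-pixel "break"
m3 = np.zeros((4, 4), bool); m3[0, 3] = True     # 1-pixel "bridge"
stats = defect_stats([m1, m2, m3], ["bridge", "break", "bridge"])
```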
This work is on training generative action/video recognition models whose output is a free-form, action-specific caption describing the video (rather than an action class label). Generative approaches have practical advantages, such as producing more fine-grained and human-readable output, and being naturally open-world. To this end, we propose to adapt a pre-trained generative Vision & Language (V&L) foundation model for video/action recognition. While there have been a few recent attempts to adapt V&L models trained with contrastive learning (e.g., CLIP) to video/action, to the best of our knowledge we propose the very first method to set out to accomplish this goal with a generative model. We first show that direct fine-tuning of a generative model to produce action classes suffers from severe overfitting. To alleviate this, we introduce REST, a training framework consisting of two key components: (a) an unsupervised method for adapting the generative model to action/video via pseudo-caption generation and self-training, i.e., without using any action-specific labels; (b) a CLIP-based retrieval approach for discovering a diverse set of pseudo-captions for each video to train the model with. Importantly, we show that both components are necessary to obtain high accuracy. We evaluate REST on the problem of zero-shot action recognition, where we show that our approach is very competitive compared to contrastive learning-based methods. Code will be made available.
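At its core, the CLIP-based retrieval component amounts to ranking caption embeddings by similarity to a video embedding in a joint space. The sketch below assumes precomputed unit-norm embeddings and illustrates only the ranking step, not the V&L models themselves:

```python
import numpy as np

def retrieve_pseudo_captions(video_emb, caption_embs, k=2):
    """Rank captions by dot-product similarity to the video embedding
    (unit-norm embeddings in a CLIP-like joint space assumed) and
    return the indices of the top-k as pseudo-captions."""
    sims = caption_embs @ video_emb          # one similarity per caption
    return [int(i) for i in np.argsort(-sims)[:k]]
```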