Smart Paper Notes

Robust Bayesian Subspace Identification for Small Data Sets

Alexandre Rodrigues Mesquita

Category: Machine Learning (Statistics)

2022-12-29

Model estimates obtained from traditional subspace identification methods may be subject to significant variance. This elevated variance is aggravated in the cases of large models or of a limited sample size. Common solutions to reduce the effect of variance are regularized estimators, shrinkage estimators and Bayesian estimation. In the current work we investigate the latter two solutions, which have not yet been applied to subspace identification. Our experimental results show that our proposed estimators may reduce the estimation risk up to $40\%$ of that of traditional subspace methods.

Zero-shot hashtag segmentation for multilingual sentiment analysis

Ruan Chaves Rodrigues , Marcelo Akira Inuzuka , Juliana Resplande Sant'Anna Gomes , Acquila Santos Rocha , Iacer Calixto , Hugo Alexandre Dantas do Nascimento

Category: Natural Language Processing

2021-12-06

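Hashtag segmentation, also known as hashtag decomposition, is a common step in preprocessing pipelines for social media datasets. It usually precedes tasks such as sentiment analysis and hate speech detection. For sentiment analysis in medium- to low-resource languages, previous research has shown that a multilingual approach that relies on machine translation can be competitive with or superior to earlier approaches to the task. We develop a zero-shot hashtag segmentation framework and demonstrate how it can be used to improve the accuracy of multilingual sentiment analysis pipelines. Our zero-shot framework establishes a new state of the art on hashtag segmentation datasets, surpassing even previous approaches that relied on feature engineering and language models trained on in-domain data.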
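To make the task concrete, here is a minimal sketch of dictionary-based hashtag segmentation. It is not the zero-shot framework described in the abstract: the vocabulary `VOCAB` and the function `segment` are hypothetical names introduced for illustration, and a fixed word list stands in for the language-model scoring a real system would use.

```python
from functools import lru_cache

# Hypothetical toy vocabulary (an assumption for this sketch); a real
# segmenter would score candidate splits with a language model instead.
VOCAB = {"we", "love", "machine", "learning", "web", "a"}


def segment(hashtag, vocab=VOCAB):
    """Split a hashtag into vocabulary words, preferring the fewest words.

    Returns a list of words, or None when no full segmentation exists.
    """
    text = hashtag.lstrip("#").lower()

    @lru_cache(maxsize=None)
    def best(start):
        # Base case: the whole string has been consumed.
        if start == len(text):
            return ()
        candidates = []
        for end in range(start + 1, len(text) + 1):
            word = text[start:end]
            if word in vocab:
                rest = best(end)
                if rest is not None:
                    candidates.append((word,) + rest)
        # Fewest words wins; None signals that this suffix cannot be split.
        return min(candidates, key=len) if candidates else None

    result = best(0)
    return list(result) if result is not None else None


print(segment("#WeLoveMachineLearning"))
```

The segmented words can then be passed to a downstream sentiment classifier in place of the raw hashtag, which is the role the abstract assigns to this preprocessing step.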

Inverting brain grey matter models with likelihood-free inference: a tool for trustable cytoarchitecture measurements

Maëliss Jallais , Pedro Rodrigues , Alexandre Gramfort , Demian Wassermann

Category: Machine Learning

2021-11-15

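Effective characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in diffusion MRI (dMRI). Relating the dMRI signal to cytoarchitectural characteristics calls for a mathematical model that describes brain tissue via a handful of physiologically relevant parameters, together with an algorithm for inverting that model. To address this issue, we propose a new forward model, specifically a new system of equations, that requires only a few relatively sparse b-shells. We then apply modern tools from Bayesian analysis, known as likelihood-free inference (LFI), to invert our proposed model. In contrast to other approaches in the literature, our algorithm yields not only an estimate of the parameter vector $\theta$ that best describes a given observed data point $x_0$, but also a full posterior distribution $p(\theta|x_0)$ over the parameter space. This enables a richer description of the model inversion, providing indicators such as credible intervals for the estimated parameters and a complete characterisation of the parameter regions where the model may present indeterminacies. We approximate the posterior distribution using deep neural density estimators, known as normalizing flows, and fit them using a set of repeated simulations from the forward model. We validate our approach on simulations using dmipy and then apply the whole pipeline to two publicly available datasets.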

Contextual-Lexicon Approach for Abusive Language Detection

Francielle Vargas , Fabiana Rodrigues de Góes , Isabelle Carvalho , Fabrício Benevenuto , Thiago Alexandre Salgueiro Pardo

Category: Natural Language Processing

2021-04-25

Since a lexicon-based approach is more elegant scientifically, explaining the solution components and being easier to generalize to other applications, this paper provides a new approach for offensive language and hate speech detection on social media. Our approach embodies a lexicon of implicit and explicit offensive and swearing expressions annotated with contextual information. Due to the severity of the social media abusive comments in Brazil, and the lack of research in Portuguese, Brazilian Portuguese is the language used to validate the models. Nevertheless, our method may be applied to any other language. The conducted experiments show the effectiveness of the proposed approach, outperforming the current baseline methods for the Portuguese language.

HateBR: A Large Expert Annotated Corpus of Brazilian Instagram Comments for Offensive Language and Hate Speech Detection

Francielle Alves Vargas , Isabelle Carvalho , Fabiana Rodrigues de Góes , Fabrício Benevenuto , Thiago Alexandre Salgueiro Pardo

Category: Natural Language Processing

2021-03-27

Due to the severity of the social media offensive and hateful comments in Brazil, and the lack of research in Portuguese, this paper provides the first large-scale expert annotated corpus of Brazilian Instagram comments for hate speech and offensive language detection. The HateBR corpus was collected from the comment section of Brazilian politicians' accounts on Instagram and manually annotated by specialists, reaching a high inter-annotator agreement. The corpus consists of 7,000 documents annotated according to three different layers: a binary classification (offensive versus non-offensive comments), offensiveness-level classification (highly, moderately, and slightly offensive), and nine hate speech groups (xenophobia, racism, homophobia, sexism, religious intolerance, partyism, apology for the dictatorship, antisemitism, and fatphobia). We also implemented baseline experiments for offensive language and hate speech detection and compared them with a literature baseline. Results show that the baseline experiments on our corpus outperform the current state-of-the-art for the Portuguese language.

HNPE: Leveraging Global Parameters for Neural Posterior Estimation

Pedro L. C. Rodrigues , Thomas Moreau , Gilles Louppe , Alexandre Gramfort

Category: Machine Learning (Statistics) | Machine Learning

2021-02-12

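Inferring the parameters of a stochastic model from experimental observations is central to the scientific method. A particularly challenging setting is when the model is strongly indeterminate, i.e. when distinct sets of parameters yield identical observations. This arises in many practical situations, such as when inferring the distance and power of a radio source (is the source close and weak, or far and strong?) or when estimating the amplifier gain and the underlying brain activity in an electrophysiological experiment. In this work, we present a new method for resolving such indeterminacy by exploiting the additional information conveyed by an auxiliary set of observations that share global parameters. Our method extends recent developments in simulation-based inference (SBI) based on normalizing flows to Bayesian hierarchical models. We validate our proposal quantitatively on a motivating example amenable to analytical solutions, and then apply it to invert a well-known non-linear model from computational neuroscience, using both simulated and real EEG data.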
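The radio-source example can be made explicit with a short worked equation. This is an illustration under an assumed idealized inverse-square propagation law, which the abstract itself does not specify; the symbols $P$ (emitted power), $d$ (distance), and $x$ (received intensity) are introduced here for illustration only.

$$
x = \frac{P}{4\pi d^{2}}
\quad\Longrightarrow\quad
\frac{4P}{4\pi (2d)^{2}} = \frac{4P}{16\pi d^{2}} = \frac{P}{4\pi d^{2}} = x .
$$

Under this assumption, the parameter pairs $(P, d)$ and $(4P, 2d)$ produce the same observation, so a single measurement of $x$ cannot separate a close, weak source from a far, strong one.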

Multi-Coil MRI Reconstruction Challenge -- Assessing Brain MRI Reconstruction Models and their Generalizability to Varying Coil Configurations

Youssef Beauferris , Jonas Teuwen , Dimitrios Karkalousos , Nikita Moriakov , Mattha Caan , George Yiasemis , Lívia Rodrigues , Alexandre Lopes , Hélio Pedrini , Letícia Rittner

Category: Machine Learning

2020-11-10

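Deep-learning-based brain magnetic resonance imaging (MRI) reconstruction methods have the potential to accelerate the MRI acquisition process. Nevertheless, the scientific community lacks appropriate benchmarks to assess the MRI reconstruction quality of high-resolution brain images and to evaluate how the proposed algorithms behave in the presence of small but expected data distribution shifts. The Multi-Coil Magnetic Resonance Image (MC-MRI) Reconstruction Challenge provides a benchmark that aims to address these issues, using a large dataset of high-resolution, three-dimensional, T1-weighted MRI scans. The challenge has two primary goals: 1) to compare different MRI reconstruction models on this dataset, and 2) to assess the generalizability of these models to data acquired with a different number of receiver coils. In this paper, we describe the experimental design of the challenge and summarize the results of a set of baseline and state-of-the-art brain MRI reconstruction models. We provide relevant comparative information on the current state of the art in MRI reconstruction and highlight the challenges of obtaining the generalizable models required before broader clinical adoption. The MC-MRI benchmark data, evaluation code, and current challenge leaderboard are publicly available. They provide an objective performance assessment for future developments in the field of brain MRI reconstruction.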

3DSGrasp: 3D Shape-Completion for Robotic Grasp

Seyed S. Mohammadi , Nuno F. Duarte , Dimitris Dimou , Yiming Wang , Matteo Taiana , Pietro Morerio , Atabak Dehban , Plinio Moreno , Alexandre Bernardino , Alessio Del Bue

Category: Robotics | Artificial Intelligence

2023-01-02

Real-world robotic grasping can be done robustly if complete 3D Point Cloud Data (PCD) of an object is available. However, in practice, PCDs are often incomplete when objects are viewed from few and sparse viewpoints before the grasping action, leading to the generation of wrong or inaccurate grasp poses. We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry from the partial PCD to produce reliable grasp poses. Our proposed PCD completion network is a Transformer-based encoder-decoder network with an Offset-Attention layer. Our network is inherently invariant to object pose and point permutation, so it generates PCDs that are geometrically consistent and properly completed. Experiments on a wide range of partial PCDs show that 3DSGrasp outperforms the best state-of-the-art method on PCD completion tasks and largely improves the grasping success rate in real-world scenarios. The code and dataset will be made available upon acceptance.

Anxolotl, an Anxiety Companion App -- Stress Detection

Nuno Gomes , Matilde Pato , Pedro Santos , André Lourenço , Lourenço Rodrigues

Category: Machine Learning

2022-12-28

The effect of stress on people's lives cannot be overstated. While it can be beneficial, since it helps humans adapt to new and different situations, it can also be harmful when not dealt with properly, leading to chronic stress. The objective of this paper is to develop a stress monitoring solution that can be used in real life and that tackles this challenge in a positive way. The SMILE data set was provided to team Anxolotl, and all that was needed was to develop a robust model. We developed a supervised learning model for classification in Python, achieving a final accuracy of 64.1% and an F1-score of 54.96%. The resulting solution passed the robustness test, showing low variation between runs, which was a major point in favor of its possible future integration into the Anxolotl app.
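For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how accuracy and F1-score are computed for binary labels. The abstract does not say which F1 averaging the authors used, so the binary form here is an assumption, and the function name `accuracy_and_f1` and the sample labels are hypothetical.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels.

    Assumption for this sketch: F1 is computed for the `positive` class only.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, f1


# Made-up labels for illustration only (1 = stressed, 0 = not stressed).
acc, f1 = accuracy_and_f1([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(f"accuracy={acc:.3f}, f1={f1:.3f}")
```

An F1-score lower than the accuracy, as in the reported results, is common when classes are imbalanced, because accuracy rewards correct predictions on the majority class while F1 depends only on precision and recall for the class of interest.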

Automatic Text Simplification of News Articles in the Context of Public Broadcasting

Diego Maupomé , Fanny Rancourt , Thomas Soulas , Alexandre Lachance , Marie-Jean Meurs , Desislava Aleksandrova , Olivier Brochu Dufour , Igor Pontes , Rémi Cardon , Michel Simard

Category: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-26

This report summarizes the work carried out by the authors during the Twelfth Montreal Industrial Problem Solving Workshop, held at Université de Montréal in August 2022. The team tackled a problem submitted by CBC/Radio-Canada on the theme of Automatic Text Simplification (ATS).