Smart Paper Notes

Large-Margin Representation Learning for Texture Classification

Jonathan de Matos, Luiz Eduardo Soares de Oliveira, Alceu de Souza Britto Junior, Alessandro Lameiras Koerich

Categories: Computer Vision | Machine Learning

2022-06-17

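This paper presents a novel approach that combines convolutional layers (CLs) and large-margin metric learning to train supervised models on small datasets for texture classification. The core of the approach is a loss function that computes the distances between instances of interest and support vectors. The objective is to update the weights of the CLs iteratively to learn a representation with a large margin between classes. Each iteration yields a large-margin discriminant model, represented by support vectors, based on that representation. The advantage of the proposed approach over convolutional neural networks (CNNs) is twofold. First, it allows representation learning with a small amount of data, because it has fewer parameters than an equivalent CNN. Second, it has a low training cost, since backpropagation considers only the support vectors. Experimental results on texture and histopathologic image datasets show that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence than equivalent CNNs.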


Pattern Spotting and Image Retrieval in Historical Documents using Deep Hashing

Caio da S. Dias, Alceu de S. Britto Jr., Jean P. Barddal, Laurent Heutte, Alessandro L. Koerich

Categories: Computer Vision | Machine Learning

2022-08-04

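This paper presents a deep learning approach for image retrieval and pattern spotting in digital collections of historical documents. First, a region proposal algorithm detects object candidates in the document page images. Next, deep learning models are used for feature extraction, in two distinct variants that provide either real-valued or binary code representations. Finally, candidate images are ranked by computing their feature similarity to a given input query. A robust experimental protocol evaluates the proposed approach on the DocExplore image database for each representation scheme (real-valued and binary code). The experimental results show that the proposed deep models compare favorably with state-of-the-art image retrieval approaches for historical document images, outperforming other deep models by 2.56 percentage points when the same techniques are used for pattern spotting. In addition, compared with related works based on real-valued representations, the proposed approach reduces search time by up to 200x and storage cost by up to 6,000x.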
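The ranking step for the binary-code variant can be illustrated with a minimal sketch. It assumes that candidates are compared to the query by Hamming distance, which is a common choice for hashing-based retrieval but is not stated in the abstract; the function name `rank_by_hamming` and the toy codes are hypothetical.

```python
import numpy as np

def rank_by_hamming(query_code: np.ndarray, candidate_codes: np.ndarray):
    """Rank candidates by Hamming distance to the query (assumed similarity measure).

    query_code: shape (n_bits,), values 0/1.
    candidate_codes: shape (n_candidates, n_bits), values 0/1.
    Returns candidate indices sorted from most to least similar, and their distances.
    """
    # Hamming distance = number of bit positions where the two codes differ.
    distances = np.count_nonzero(candidate_codes != query_code, axis=1)
    # A stable sort keeps the original order among ties.
    order = np.argsort(distances, kind="stable")
    return order, distances[order]

# Toy 4-bit codes, purely illustrative.
query = np.array([1, 0, 1, 1], dtype=np.uint8)
candidates = np.array(
    [[1, 0, 1, 1],
     [0, 1, 0, 0],
     [1, 0, 1, 0]],
    dtype=np.uint8,
)
order, dists = rank_by_hamming(query, candidates)
print(order, dists)
```

Binary codes can also be stored compactly (for example with `np.packbits`), which is one plausible reason such representations need far less storage than real-valued feature vectors.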


Evaluation of Different Annotation Strategies for Deployment of Parking Spaces Classification Systems

Andre G. Hochuli, Alceu S. Britto Jr., Paulo R. L. de Almeida, Williams B. S. Alves, Fabio M. C. Cagni

Categories: Computer Vision

2022-07-22

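When vision-based approaches are used to classify individual parking spaces as occupied or empty, human experts often need to annotate the locations of the spaces and label a training set of images collected in the target parking lot to fine-tune the system. We propose investigating three annotation types (polygons, bounding boxes, and fixed-size squares), which provide different data representations of the parking spaces. The rationale is to elucidate the best trade-off between manual annotation precision and model performance. We also investigate the number of annotated parking spaces needed to fine-tune a pre-trained model to the target parking lot. Experiments on the PKLot dataset show that a model can be fine-tuned to the target parking lot with fewer than 1,000 labeled samples, using low-precision annotations such as fixed-size squares.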
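To make the three annotation types concrete, the sketch below derives a bounding box and a fixed-size square from a polygon. This is only an illustration under stated assumptions: the abstract does not say how the coarser annotations were produced, so centering the square on the bounding-box center, the function names, and the sample coordinates are all hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)

def polygon_to_bbox(polygon: List[Point]) -> Box:
    """Smallest axis-aligned box that contains every polygon vertex."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))

def bbox_to_square(bbox: Box, side: float) -> Box:
    """Fixed-size square centered on the box center (an assumed convention)."""
    x_min, y_min, x_max, y_max = bbox
    cx = (x_min + x_max) / 2
    cy = (y_min + y_max) / 2
    half = side / 2
    return (cx - half, cy - half, cx + half, cy + half)

# Hypothetical parking-space polygon in pixel coordinates.
space = [(10, 20), (50, 25), (45, 60), (5, 55)]
bbox = polygon_to_bbox(space)
square = bbox_to_square(bbox, side=32)
print(bbox, square)
```

Each step discards shape information: the polygon follows the space outline, the box keeps only its extent, and the square keeps only its position, which is what makes the square the cheapest annotation to produce.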


Towards fully automated deep-learning-based brain tumor segmentation: is brain extraction still necessary?

Bruno Machado Pacheco, Guilherme de Souza e Cassia, Danilo Silva

Categories: Computer Vision

2022-12-14

State-of-the-art brain tumor segmentation is based on deep learning models applied to multi-modal MRIs. Currently, these models are trained on images after a preprocessing stage that involves registration, interpolation, brain extraction (BE, also known as skull-stripping) and manual correction by an expert. However, for clinical practice, this last step is tedious and time-consuming and, therefore, not always feasible, resulting in skull-stripping faults that can negatively impact the tumor segmentation quality. Still, the extent of this impact has never been measured for any of the many different BE methods available. In this work, we propose an automatic brain tumor segmentation pipeline and evaluate its performance with multiple BE methods. Our experiments show that the choice of a BE method can compromise up to 15.7% of the tumor segmentation performance. Moreover, we propose training and testing tumor segmentation models on non-skull-stripped images, effectively discarding the BE step from the pipeline. Our results show that this approach leads to a competitive performance at a fraction of the time. We conclude that, in contrast to the current paradigm, training tumor segmentation models on non-skull-stripped images can be the best option when high performance in clinical practice is desired.


Embedding generation for text classification of Brazilian Portuguese user reviews: from bag-of-words to transformers

Frederico Dias Souza, João Baptista de Oliveira e Souza Filho

Categories: Natural Language Processing | Artificial Intelligence

2022-12-01

Text classification is a natural language processing (NLP) task relevant to many commercial applications, like e-commerce and customer service. Naturally, classifying such excerpts accurately often represents a challenge, due to intrinsic language aspects, like irony and nuance. To accomplish this task, one must provide a robust numerical representation for documents, a process known as embedding. Embedding is a key NLP field today and has advanced significantly in the last decade, especially after the introduction of the word-to-vector concept and the popularization of Deep Learning models for solving NLP tasks, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Transformer-based Language Models (TLMs). Despite the impressive achievements in this field, the literature on generating embeddings for Brazilian Portuguese texts is scarce, especially for commercial user reviews. Therefore, this work aims to provide a comprehensive experimental study of embedding approaches targeting binary sentiment classification of user reviews in Brazilian Portuguese. The study covers models ranging from classical (Bag-of-Words) to state-of-the-art (Transformer-based). The methods are evaluated on five open-source databases with pre-defined data partitions, made available in an open digital repository to encourage reproducibility. The fine-tuned TLMs achieved the best results in all cases, followed by the feature-based TLM, LSTM, and CNN, whose ranks alternate depending on the database under analysis.


MonoByte: A Pool of Monolingual Byte-level Language Models

Hugo Abonizio, Leandro Rodrigues de Souza, Roberto Lotufo, Rodrigo Nogueira

Categories: Natural Language Processing

2022-09-22

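The zero-shot cross-lingual ability of models pretrained on multilingual and even monolingual corpora has spurred many hypotheses to explain this intriguing empirical result. However, because of the cost of pretraining, most research uses public models whose pretraining methodology (such as the choice of tokenization, corpus size, and computational budget) may differ drastically. When researchers pretrain their own models, they often do so under a constrained budget, and the resulting models may significantly underperform SOTA models. These experimental differences have led to various inconsistent conclusions about the nature of the cross-lingual ability of these models. To help further research on the topic, we release 10 monolingual byte-level models, rigorously pretrained under the same configuration with a large compute budget (equivalent to 420 days on a V100) and corpora 4 times larger than the original BERT's. Because the models are tokenizer-free, the problem of unseen token embeddings is eliminated, which allows researchers to try a wider range of cross-lingual experiments in languages with different scripts. Additionally, we release two models pretrained on non-natural-language texts that can be used in sanity-check experiments. Experiments on QA and NLI tasks show that our monolingual models achieve performance competitive with the multilingual one, and hence can serve to strengthen our understanding of cross-lingual transferability in language models.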
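The tokenizer-free property can be illustrated with a minimal sketch: when the input IDs are simply the UTF-8 bytes of the text, every string in every script maps into the same 256-value vocabulary, so no token is ever unseen. The helper names below are hypothetical, and the special-token IDs that real byte-level models typically reserve are omitted for simplicity.

```python
from typing import List

def encode_bytes(text: str) -> List[int]:
    """Map text to a sequence of byte IDs in the range 0..255."""
    return list(text.encode("utf-8"))

def decode_bytes(ids: List[int]) -> str:
    """Invert encode_bytes; raises UnicodeDecodeError on invalid byte sequences."""
    return bytes(ids).decode("utf-8")

# Texts in different scripts share the same ID space.
for sample in ["olá", "日本"]:
    ids = encode_bytes(sample)
    print(sample, ids, decode_bytes(ids) == sample)
```

The trade-off is sequence length: characters outside ASCII take two or more bytes each, so byte-level inputs are longer than subword-tokenized ones.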


Mapless Navigation of a Hybrid Aerial Underwater Vehicle with Deep Reinforcement Learning Through Environmental Generalization

Ricardo B. Grando, Junior C. de Jesus, Victor A. Kich, Alisson H. Kolling, Rodrigo S. Guerra, Paulo L. J. Drews-Jr

Categories: Robotics | Artificial Intelligence

2022-09-13

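Previous work has shown that Deep-RL can be applied to mapless navigation, including the medium transition of Hybrid Unmanned Aerial Underwater Vehicles (HUAUVs). This paper presents new approaches based on state-of-the-art actor-critic algorithms to address the navigation and medium-transition problems of a HUAUV. We show that a double-critic Deep-RL approach with recurrent neural networks improves the navigation performance of HUAUVs using only range data and relative localization. Our Deep-RL approaches achieve better navigation and transition capabilities, with solid generalization of learning across distinct simulated scenarios, and outperform previous approaches.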


Deterministic and Stochastic Analysis of Deep Reinforcement Learning for Low Dimensional Sensing-based Navigation of Mobile Robots

Ricardo B. Grando, Junior C. de Jesus, Victor A. Kich, Alisson H. Kolling, Rodrigo S. Guerra, Paulo L. J. Drews-Jr

Categories: Robotics | Artificial Intelligence

2022-09-13

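Deterministic and stochastic techniques in deep reinforcement learning (Deep-RL) have become a promising solution for improving motion control and decision-making tasks across a wide variety of robots. Previous work has shown that these Deep-RL algorithms can be applied to mapless navigation of mobile robots in general. However, they tend to use simple sensing strategies, because they have been shown to perform poorly with high-dimensional state spaces, such as those produced by image-based sensing. This paper presents a comparative analysis of two Deep-RL techniques, Deep Deterministic Policy Gradient (DDPG) and Soft Actor-Critic (SAC), on mapless navigation tasks for mobile robots. We aim to contribute by showing how the neural network architecture influences the learning itself, and we present quantitative results based on the navigation time and distance of aerial mobile robots for each approach. Overall, our analysis of six distinct architectures highlights that the stochastic approach (SAC) is better suited to deeper architectures, while the opposite holds for the deterministic approach (DDPG).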


CometKiwi: IST-Unbabel 2022 Submission for the Quality Estimation Shared Task

Ricardo Rei, Marcos Treviso, Nuno M. Guerreiro, Chrysoula Zerva, Ana C. Farinha, Christine Maroti, José G. C. de Souza, Taisiya Glushkova, Duarte M. Alves, Alon Lavie

Categories: Natural Language Processing | Machine Learning

2022-09-13

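We present the joint contribution of IST and Unbabel to the WMT 2022 Shared Task on Quality Estimation (QE). Our team participated in all three subtasks: (i) sentence- and word-level quality prediction; (ii) explainable QE; and (iii) critical error detection. For all tasks we build on top of the COMET framework, connecting it with the predictor-estimator architecture of OpenKiwi and equipping it with a word-level sequence tagger and an explanation extractor. Our results suggest that incorporating references during pretraining improves performance on downstream tasks across several language pairs, and that jointly training with sentence- and word-level objectives yields a further boost. Furthermore, combining attention and gradient information proved to be the top strategy for extracting good explanations from sentence-level QE models. Overall, our submissions achieved the best results on all three tasks for almost all language pairs.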


Virtual Reality Platform to Develop and Test Applications on Human-Robot Social Interaction

Jair A. Bottega, Raul Steinmetz, Alisson H. Kolling, Victor A. Kich, Junior C. de Jesus, Ricardo B. Grando, Daniel F. T. Gamarra

Categories: Robotics

2022-08-13

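Robotics simulation has been an integral part of research and development in robotics. Simulation eliminates the possibility of harm to the sensors, motors, and physical structure of a real robot by allowing robotics applications to be tested quickly and affordably, without exposure to mechanical or electronic faults. Simulation through virtual reality (VR) offers a more immersive experience by providing better visual cues of the environment, which makes it an appealing alternative for interacting with simulated robots. This immersion is crucial, particularly for social robots, a subarea of the human-robot interaction (HRI) field. The widespread use of robots in daily life depends on HRI. In the future, robots will be able to interact effectively with people to perform a variety of tasks in human society. As robots begin to proliferate in personal workspaces, it is crucial to develop simple and understandable interfaces for them. Therefore, in this study, we implement a VR robotics framework with ready-to-use tools and packages to enhance research and application development in social HRI. Since the whole VR interface is an open-source project, tests can be conducted in an immersive environment without a physical robot.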
