Image rasterization is a mature technique in computer graphics, while image vectorization, the reverse path of rasterization, remains a major challenge. Recent advanced deep-learning-based models achieve vectorization and semantic interpolation of vector graphics, and demonstrate better topology when generating new figures. However, deep models do not generalize easily to out-of-domain test data. The generated SVGs also contain complex and redundant shapes that are not convenient for further editing. Specifically, the crucial layer-wise topology and fundamental semantics in images are still not well understood and thus not fully explored. In this work, we propose Layer-wise Image Vectorization, namely LIVE, to convert raster images to SVGs while simultaneously maintaining their image topology. LIVE can generate compact SVG forms with layer-wise structures that are consistent with human perception. We progressively add new Bézier paths and optimize these paths via a layer-wise framework, newly designed loss functions, and a component-wise path initialization technique. Our experiments demonstrate that LIVE produces more plausible vectorized forms than prior works and generalizes to new images. With the help of this newly learned topology, LIVE yields human-editable SVGs for both designers and other downstream applications. Code is available at https://github.com/picsart-ai-research/live-layerwise-image-vectorization
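As a rough illustration of the component-wise path initialization idea (a simplified stand-in for LIVE's actual procedure, which optimizes differentiable Bézier paths against the raster target), the sketch below finds the connected region with the largest accumulated reconstruction error and seeds a closed cubic Bézier path at its centroid. The grid-based error map and all function names here are hypothetical:

```python
import math

def largest_error_component(err, thresh=0.5):
    """Find the 4-connected region of the error map with the largest summed error."""
    h, w = len(err), len(err[0])
    seen = [[False] * w for _ in range(h)]
    best, best_cells = 0.0, []
    for i in range(h):
        for j in range(w):
            if err[i][j] > thresh and not seen[i][j]:
                stack, cells, total = [(i, j)], [], 0.0
                seen[i][j] = True
                while stack:
                    y, x = stack.pop()
                    cells.append((y, x))
                    total += err[y][x]
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and err[ny][nx] > thresh and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                if total > best:
                    best, best_cells = total, cells
    return best_cells

def init_path(cells, n_segments=4, radius=2.0):
    """Seed control points of a closed cubic Bezier path as a small circle
    centered at the region's centroid (3 control points per segment)."""
    cy = sum(y for y, _ in cells) / len(cells)
    cx = sum(x for _, x in cells) / len(cells)
    pts = []
    for k in range(3 * n_segments):
        a = 2 * math.pi * k / (3 * n_segments)
        pts.append((cx + radius * math.cos(a), cy + radius * math.sin(a)))
    return pts
```

In the actual method, these control points would then be optimized by gradient descent through a differentiable rasterizer; this sketch only covers the initialization step.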
Determining and predicting reservoir formation properties for newly drilled wells represents a significant challenge. One way to evaluate these properties is via well-interval similarity. Many methodologies for similarity learning exist, from rule-based approaches to deep neural networks. Recently, since we deal with sequential data, articles have adopted, e.g., recurrent neural networks to build similarity models. Such an approach suffers from short-term memory, as it pays more attention to the end of a sequence. Neural networks with the Transformer architecture instead cast their attention over the whole sequence to make a decision. To make them more efficient in terms of computational time, we introduce a limited attention mechanism similar to the Informer and Performer architectures. We conduct experiments on open datasets with more than 20 wells, making our experiments reliable and suitable for industrial usage. The best results were obtained with our adaptation of the Informer variant of the Transformer, with ROC AUC 0.982. It outperforms classical approaches with ROC AUC 0.824, recurrent neural networks with ROC AUC 0.934, and straightforward usage of Transformers with ROC AUC 0.961.
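A minimal sketch of the limited-attention idea, in which each query attends only to its top-u keys rather than the full sequence. This is a simplified stand-in, not the paper's actual Informer-style mechanism, and the function name is hypothetical:

```python
import math

def limited_attention(Q, K, V, u):
    """For each query, keep only the top-u keys by dot-product score and
    apply softmax attention over that restricted set (quadratic scoring here;
    real sparse-attention variants also cheapen the scoring step)."""
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in K]
        top = sorted(range(len(K)), key=lambda i: scores[i], reverse=True)[:u]
        m = max(scores[i] for i in top)          # for numerical stability
        w = [math.exp(scores[i] - m) for i in top]
        z = sum(w)
        out.append([sum((wi / z) * V[i][d] for wi, i in zip(w, top))
                    for d in range(len(V[0]))])
    return out
```

With u much smaller than the sequence length, the aggregation step touches only u values per query, which is the source of the efficiency gain such architectures target.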
Chromosome analysis is essential for diagnosing genetic disorders. For hematologic malignancies, identification of somatic clonal aberrations by karyotype analysis remains the standard of care. However, karyotyping is costly and time-consuming because of the largely manual process and the expertise required in identifying and annotating aberrations. Efforts to date to automate karyotype analysis have fallen short in aberration detection. Using a training set of ~10k patient specimens and ~50k karyograms from over 5 years from the Fred Hutchinson Cancer Center, we created a labeled set of images representing individual chromosomes. These individual chromosomes were used to train and assess deep learning models for classifying the 24 human chromosomes and identifying chromosomal aberrations. The top-accuracy models utilized the recently introduced Topological Vision Transformers (TopViTs) with 2-level-block-Toeplitz masking to incorporate structural inductive bias. TopViT outperformed CNN (Inception) models with >99.3% accuracy for chromosome identification, and exhibited accuracies >99% for most aberrations. Notably, we were able to show high-quality performance even in "few shot" learning scenarios. Incorporating the definition of clonality substantially improved both precision and recall (sensitivity). When applied to "zero shot" scenarios, the model captured aberrations without training, with perfect precision at >50% recall. Together, these results show that modern deep learning models can approach expert-level performance for chromosome aberration detection. To our knowledge, this is the first study demonstrating the downstream effectiveness of TopViTs. These results open up exciting opportunities for not only expediting patient results but also providing a scalable technology for early screening of low-abundance chromosomal lesions.
Similarity learning problems in the oil and gas industry aim to construct a model that estimates the similarity between interval measurements of logging data. Previous attempts were mainly based on empirical rules, so our goal is to automate this process and eliminate expensive and time-consuming expert labeling. One approach to similarity learning is self-supervised learning (SSL). In contrast to the supervised paradigm, it requires little to no labeling for the data. Thus, we can learn such models even when labels are lacking or scarce. Nowadays, most SSL approaches are either contrastive or non-contrastive. Contrastive methods, however, scale poorly with the number of objects because positive and negative samples may be mislabeled. Non-contrastive methods do not rely on negative samples; such approaches are actively used in computer vision. We introduce non-contrastive SSL for time series data. In particular, we build on the BYOL and Barlow Twins methods, which avoid negative pairs and focus solely on matching positive pairs. A key component of these methods is the augmentation strategy. Different augmentations exist for time series, and their effect on performance can be either positive or negative. Our augmentation strategies and adaptations of BYOL and Barlow Twins allow us to achieve higher quality (ARI $= 0.49$) than other self-supervised methods (ARI $= 0.34$ at best), demonstrating the usefulness of the proposed non-contrastive self-supervised approaches for the well-interval similarity problem and for time series representation learning in general.
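To make the non-contrastive objective concrete, here is a minimal pure-Python sketch of the Barlow Twins loss: it drives the cross-correlation matrix between the embeddings of two augmented views toward the identity, needing no negative pairs. This is a generic illustration, not the paper's adapted version for time series:

```python
def barlow_twins_loss(za, zb, lam=5e-3):
    """Barlow Twins objective on two batches of embeddings (one per augmented
    view): diagonal of the cross-correlation matrix is pushed to 1 (invariance),
    off-diagonals to 0 (redundancy reduction), weighted by lam."""
    n, d = len(za), len(za[0])

    def normalize(z):
        # standardize each embedding dimension across the batch; returns d x n
        cols = []
        for j in range(d):
            col = [z[i][j] for i in range(n)]
            mu = sum(col) / n
            std = (sum((x - mu) ** 2 for x in col) / n) ** 0.5 or 1.0
            cols.append([(x - mu) / std for x in col])
        return cols

    A, B = normalize(za), normalize(zb)
    loss = 0.0
    for i in range(d):
        for j in range(d):
            c = sum(A[i][k] * B[j][k] for k in range(n)) / n
            loss += (1 - c) ** 2 if i == j else lam * c * c
    return loss
```

In the time-series setting described above, `za` and `zb` would be encoder outputs for two differently augmented versions of the same well interval.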
The action governor is an add-on scheme to a nominal control loop that monitors and adjusts the control actions to enforce safety specifications expressed as pointwise-in-time state and control constraints. In this paper, we introduce the Robust Action Governor (RAG) for systems whose dynamics can be represented by a discrete-time piecewise-affine (PWA) model with parametric and additive uncertainties, subject to non-convex constraints. We develop the theoretical properties and computational approaches for the RAG. We then introduce the use of the RAG for realizing safe reinforcement learning (RL), i.e., ensuring all-time constraint satisfaction during online RL exploration and exploitation. This development enables safe real-time evolution of the control policy and adaptation to changes in the operating environment and system parameters (due to aging, damage, etc.). We illustrate the effectiveness of the RAG in constraint enforcement and safe RL by applying it to a soft-landing problem for a mass-spring-damper system.
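A minimal sketch of the action-governor principle: accept the nominal action if the next state is safe under every modeled disturbance, and otherwise substitute the nearest admissible action. This is a one-step, finite-candidate caricature of the RAG (the actual method reasons over invariant sets for PWA dynamics); all names and the example dynamics are hypothetical:

```python
def action_governor(x, u_nom, f, safe, candidates, disturbances):
    """Return u_nom if it keeps the next state safe for all modeled
    disturbances; otherwise return the admissible candidate closest to u_nom."""
    def admissible(u):
        return all(safe(f(x, u, w)) for w in disturbances)

    if admissible(u_nom):
        return u_nom
    feasible = [u for u in candidates if admissible(u)]
    if not feasible:
        raise ValueError("no admissible action at this state")
    return min(feasible, key=lambda u: abs(u - u_nom))
```

In a safe-RL loop, the RL policy proposes `u_nom` at every step and the governor's (possibly modified) output is what is actually applied to the plant.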
Significant progress has been made in the sensing, perception, and localization aspects of automated driving. However, due to the wide variety of traffic/road structure scenarios and the long-tail distribution of human driver behaviors, it remains an open challenge for an intelligent vehicle to always know how to make and execute the best decision on the road, given the available sensing/perception/localization information. In this chapter, we discuss how artificial intelligence, and more specifically reinforcement learning, can leverage operational knowledge and safety reflexes to make strategic and tactical decisions. We discuss several challenging problems related to the robustness of reinforcement learning solutions and their practical design for autonomous-vehicle driving strategies. We focus on automated driving on highways and on the integration of reinforcement learning, vehicle motion control, and control barrier functions, enabling a reliable AI driving strategy that can learn and adapt safely.
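To illustrate the role of a control barrier function (CBF) as a safety reflex, here is a minimal car-following filter: a time-headway barrier $h = \text{gap} - T v$ together with the condition $\dot h + \alpha h \ge 0$ yields an upper bound on acceleration, which clips whatever the learned policy proposes. The specific barrier, parameters, and function name are illustrative assumptions, not the chapter's actual design:

```python
def cbf_filter(u_nom, gap, v, v_lead, T=1.5, alpha=0.5):
    """CBF-style safety filter for longitudinal car following.
    Barrier: h = gap - T*v (time-headway margin). With dh/dt = (v_lead - v) - T*u,
    enforcing dh/dt + alpha*h >= 0 gives the acceleration bound
        u <= ((v_lead - v) + alpha*h) / T,
    so the nominal (e.g. RL-proposed) acceleration is clipped to that bound."""
    h = gap - T * v
    u_max = ((v_lead - v) + alpha * h) / T
    return min(u_nom, u_max)
```

The learned policy remains free to choose any action below the bound, so the filter constrains rather than replaces the RL decision.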
Can we train a single transformer model capable of processing multiple modalities and datasets, whilst sharing almost all of its learnable parameters? We present PolyViT, a model trained on images, audio, and video to answer this question. By co-training on different tasks of a single modality, we are able to improve each task's accuracy and achieve state-of-the-art results on 5 standard video- and audio-classification datasets. Co-training PolyViT on multiple modalities and tasks leads to a more parameter-efficient model and learns representations that generalize across multiple domains. Moreover, we show that co-training is simple and practical to implement, as we do not need to tune hyperparameters for each combination of datasets, but can simply adapt those from standard single-task training.
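Co-training requires deciding which task's minibatch to feed the shared encoder at each step. As one hypothetical illustration (not necessarily the schedule PolyViT uses), a deterministic order can be built so that each task appears in proportion to its dataset size:

```python
def cotraining_schedule(dataset_sizes, total_steps):
    """Build a deterministic task order in which each task's share of steps
    is proportional to its dataset size: at every step, pick the task that
    is furthest behind its proportional quota."""
    total = sum(dataset_sizes.values())
    quota = {t: n / total * total_steps for t, n in dataset_sizes.items()}
    done = {t: 0 for t in dataset_sizes}
    order = []
    for _ in range(total_steps):
        t = max(quota, key=lambda t: quota[t] - done[t])
        done[t] += 1
        order.append(t)
    return order
```

At each scheduled step, one would draw a minibatch from the chosen task, run it through the shared transformer with that task's head, and update; only the heads are task-specific.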
In this paper, we provide, to the best of our knowledge, the first comprehensive approach for incorporating various masking mechanisms into Transformer architectures in a scalable way. We show that recent results on linear causal attention (Choromanski et al., 2021) and log-linear RPE attention (Luo et al., 2021) are special cases of this general mechanism. However, by casting the problem as a topological (graph-based) modulation of unmasked attention, we obtain several previously unknown results, including efficient d-dimensional RPE masking and graph-kernel masking. We leverage many mathematical techniques, ranging from spectral analysis through dynamic programming and random walks to new algorithms for solving Markov processes on graphs. We also provide a corresponding empirical evaluation.
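A small sketch of what "graph-based modulation of unmasked attention" can mean: attention weights are multiplied elementwise by a mask derived from a graph over the tokens, here a simple kernel of shortest-path distance. This is a brute-force illustration of the object being computed, not the paper's scalable algorithms, and the decay kernel is an assumed example:

```python
import math
from collections import deque

def graph_mask(adj, decay=0.5):
    """Mask M[i][j] = decay ** d(i, j), with d the shortest-path (BFS)
    distance between tokens i and j on the given graph; unreachable pairs
    get mask 0. One example of a graph-induced attention modulation."""
    n = len(adj)
    M = [[0.0] * n for _ in range(n)]
    for s in range(n):
        dist, q = {s: 0}, deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        for v, d in dist.items():
            M[s][v] = decay ** d
    return M

def masked_attention(Q, K, V, M):
    """Softmax-style attention whose unnormalized weights exp(q.k) are
    modulated elementwise by the mask M before normalization."""
    out = []
    for i, q in enumerate(Q):
        w = [M[i][j] * math.exp(sum(a * b for a, b in zip(q, k)))
             for j, k in enumerate(K)]
        z = sum(w) or 1.0
        out.append([sum(wi * V[j][d] for j, wi in enumerate(w)) / z
                    for d in range(len(V[0]))])
    return out
```

The paper's contribution is precisely to compute such masked attention without materializing the full n-by-n mask; the quadratic version above only defines the target quantity.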
We introduce Performers, Transformer architectures which can estimate regular (softmax) full-rank-attention Transformers with provable accuracy, but using only linear (as opposed to quadratic) space and time complexity, without relying on any priors such as sparsity or low-rankness. To approximate softmax attention kernels, Performers use a novel Fast Attention Via positive Orthogonal Random features approach (FAVOR+), which may be of independent interest for scalable kernel methods. FAVOR+ can also be used to efficiently model kernelizable attention mechanisms beyond softmax. This representational power is crucial to accurately compare softmax with other kernels for the first time on large-scale tasks, beyond the reach of regular Transformers, and to investigate optimal attention kernels. Performers are linear architectures fully compatible with regular Transformers and with strong theoretical guarantees: unbiased or nearly-unbiased estimation of the attention matrix, uniform convergence, and low estimation variance. We tested Performers on a rich set of tasks stretching from pixel prediction through text models to protein sequence modeling. We demonstrate competitive results with other examined efficient sparse and dense attention methods, showcasing the effectiveness of the novel attention-learning paradigm leveraged by Performers.
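The core of FAVOR+ is a positive random-feature map whose inner products give an unbiased estimate of the softmax kernel exp(q·k). The sketch below uses i.i.d. Gaussian features rather than the orthogonal features FAVOR+ prescribes (orthogonality reduces variance but does not change the expectation), so it illustrates the estimator, not the full method:

```python
import math

def positive_feature(x, ws):
    """Positive random-feature map phi(x)_i = exp(w_i . x - ||x||^2 / 2) / sqrt(m),
    with w_i drawn from a standard Gaussian; all entries are strictly positive,
    which is what makes the resulting attention estimates stable."""
    m = len(ws)
    half_sq_norm = sum(v * v for v in x) / 2.0
    return [math.exp(sum(wi * xi for wi, xi in zip(w, x)) - half_sq_norm)
            / math.sqrt(m) for w in ws]

def softmax_kernel_estimate(q, k, ws):
    """Unbiased Monte Carlo estimate of exp(q . k) as phi(q) . phi(k)."""
    pq, pk = positive_feature(q, ws), positive_feature(k, ws)
    return sum(a * b for a, b in zip(pq, pk))
```

Because the kernel factorizes as phi(q)·phi(k), attention can be computed by first contracting the keys with the values, giving the linear (rather than quadratic) complexity claimed above.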