Smart Paper Notes

Aztec curve: proposal for a new space-filling curve

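Diego Ayala, Daniel Durini, Jose Rangel-Magdaleno

Category: Computer Vision

2022-07-28

This paper briefly reviews different space-filling curves (SFCs) and proposes a new one. A century has passed since curves of this kind were first introduced, and since then they have proved useful in computer science, particularly in data storage and indexing, owing to their clustering properties; the Hilbert curve is the best-known member of this family of fractals. The paper introduces the proposed Aztec curve, which has characteristics similar to those of the Hilbert curve, together with a grammatical description of it. The Aztec curve makes it possible to create two-dimensional clusters, which is not possible with the Hilbert or Peano curves. In addition, an application in the scope of compressed sensing is implemented, in which the Hilbert curve is compared with the Aztec curve; the two perform similarly, which positions the Aztec curve as a viable new alternative for applications that use SFCs.

Translated by Google Translate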

Neural Space-filling Curves

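Hanyu Wang, Kamal Gupta, Larry Davis, Abhinav Shrivastava

Category: Computer Vision

2022-04-18

We present Neural Space-filling Curves (SFCs), a data-driven approach for inferring a context-based scan order for a set of images. A linear ordering of pixels forms the basis of many applications, such as video scrambling, compression, and the autoregressive models used in generative modeling of images. Existing algorithms resort to a fixed scanning scheme such as raster scan or Hilbert scan. Instead, our work uses a graph-based neural network to learn a spatially coherent linear ordering of pixels from a dataset of images. The resulting Neural SFC is optimized for an objective suited to the downstream task when the image is traversed in scan-line order. We show the advantage of using Neural SFCs in downstream applications such as image compression. Code and additional results will be made available at https://hywang66.github.io/publication/neuralsfc.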

Hilbert Flattening: a Locality-Preserving Matrix Unfolding Method

Qingsong Zhao, Zhipeng Zhou, Yi Wang, Yu Qiao, Cairong Zhao

Category: Computer Vision

2022-02-21

Zigzag flattening (ZF) is commonly used as the default way to order image patches in deep models, e.g. vision transformers (ViTs). Notably, when decomposing multi-scale images, ZF cannot maintain the invariance of feature point positions. To this end, we investigate Hilbert flattening (HF) as an alternative for sequence ordering in vision tasks. HF has proven to be superior to other flattening approaches in maintaining spatial locality when performing multi-scale transformations of dimensional space. In applications, we design a position encoding method based on HF that beats absolute position encoding non-trivially in the Transformer architecture. It can also be used for feature down-sampling and feature/image interpolation. Extensive experiments demonstrate that it yields consistent performance boosts for several popular architectures and applications. The code will be released upon acceptance.
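
To make the idea of Hilbert flattening concrete, here is a minimal sketch of ordering the cells of a square grid along a Hilbert curve. It is not the authors' code: the function names `hilbert_d2xy` and `hilbert_flatten` are hypothetical, and the sketch assumes a square input whose side is a power of two. It uses the standard iterative index-to-coordinate conversion for the Hilbert curve.

```python
def hilbert_d2xy(order, d):
    """Map a Hilbert index d to (x, y) on a 2**order by 2**order grid."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the current quadrant so sub-curves join end to end.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_flatten(matrix):
    """Flatten a square matrix (list of rows) in Hilbert-curve order.

    Assumption of this sketch: the side length is a power of two.
    """
    n = len(matrix)
    if n == 0 or n & (n - 1) or any(len(row) != n for row in matrix):
        raise ValueError("expected a square matrix whose side is a power of two")
    order = n.bit_length() - 1
    coords = (hilbert_d2xy(order, d) for d in range(n * n))
    return [matrix[y][x] for x, y in coords]


print(hilbert_flatten([[1, 2], [3, 4]]))  # prints [1, 3, 4, 2]
```

Consecutive elements of the flattened sequence always come from grid cells that share an edge, which is the locality property the abstract refers to; a row-by-row scan loses that adjacency at the end of every row.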

K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation

Category:

In recent years there has been a growing interest in the study of sparse representation of signals. Using an overcomplete dictionary that contains prototype signal-atoms, signals are described by sparse linear combinations of these atoms. Applications that use sparse representation are many and include compression, regularization in inverse problems, feature extraction, and more. Recent activity in this field has concentrated mainly on the study of pursuit algorithms that decompose signals with respect to a given dictionary. Designing dictionaries to better fit the above model can be done by either selecting one from a prespecified set of linear transforms or adapting the dictionary to a set of training signals. Both of these techniques have been considered, but this topic is largely still open. In this paper we propose a novel algorithm for adapting dictionaries in order to achieve sparse signal representations. Given a set of training signals, we seek the dictionary that leads to the best representation for each member in this set, under strict sparsity constraints. We present a new method, the K-SVD algorithm, generalizing the K-means clustering process. K-SVD is an iterative method that alternates between sparse coding of the examples based on the current dictionary and a process of updating the dictionary atoms to better fit the data. The update of the dictionary columns is combined with an update of the sparse representations, thereby accelerating convergence. The K-SVD algorithm is flexible and can work with any pursuit method (e.g., basis pursuit, FOCUSS, or matching pursuit). We analyze this algorithm and demonstrate its results both on synthetic tests and in applications on real image data.
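
The dictionary-learning problem described above is usually written as a constrained least-squares objective. The notation here is an assumption, since the abstract introduces no symbols: $Y$ holds the training signals as columns, $D$ is the dictionary, $X$ holds the sparse coefficient vectors $x_i$ as columns, and $T_0$ is the sparsity bound.

```latex
\min_{D,\,X} \; \lVert Y - D X \rVert_F^2
\quad \text{subject to} \quad
\lVert x_i \rVert_0 \le T_0 \quad \text{for all } i
```

Under this reading, K-means corresponds to the extreme case in which each $x_i$ has exactly one nonzero entry, equal to 1, so that every signal is represented by a single atom.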

Moment Transform-Based Compressive Sensing in Image Processing

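T. Kalampokas, G. A. Papakostas

Category: Computer Vision

2021-11-14

Over the past decade, images have become an important source of information in many domains, so high image quality is necessary for obtaining better information. One important issue that arises is image denoising, which means recovering a signal from inaccurately and/or partially measured samples. This interpretation is closely related to compressive sensing theory, a revolutionary technique which implies that, if a signal is sparse, the original signal can be recovered from a few measured values, far fewer than suggested by other commonly used theories such as Shannon's sampling theory. A key factor in compressive sensing (CS) theory, both for reaching the sparsest solution and for removing noise from a corrupted image, is the choice of the basis dictionary. In this paper, the discrete cosine transform (DCT) and moment transforms (Tchebichef, Krawtchouk) are compared for denoising images corrupted by additive white Gaussian noise, based on compressive sensing and sparse approximation theory. The experimental results show that basis dictionaries built from the moment transforms perform competitively with the traditional DCT. The latter transform achieves a PSNR of 30.82 dB and the same SSIM value of 0.91 as the Tchebichef transform. Moreover, from the point of view of sparsity, Krawtchouk moments give results that are approximately 20-30% sparser than those of the DCT.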

Inpainting in discrete Sobolev spaces: structural information for uncertainty reduction

Marco Seracini, Stephen R. Brown

Category: Computer Vision

2022-11-07

In this article, using an exemplar-based approach, we investigate the inpainting problem and introduce a new mathematical functional whose minimization determines the quality of the reconstructions. The new functional takes finite-difference terms into account, in a fashion similar to what happens in theoretical Sobolev spaces. Moreover, we introduce a new priority index to determine the scanning order of the points to inpaint, prioritizing uncertainty reduction in the choice. The results highlight important theory-related aspects of the inpainting-by-patch procedure.

An analysis of reconstruction noise from undersampled 4D flow MRI

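Lauren Partin, Daniele E. Schiavazzi, Carlos A. Sing Long

Category: Computer Vision

2022-01-11

Novel magnetic resonance (MR) imaging modalities can quantify hemodynamics but require long acquisition times, which hinders their widespread use for the early diagnosis of cardiovascular disease. To reduce acquisition times, reconstruction methods that work from undersampled measurements are routinely used; these exploit representations designed to increase image compressibility. The reconstructed anatomical and hemodynamic images may contain visual artifacts. Although some of these artifacts are essentially reconstruction errors, and thus a consequence of undersampling, others may be due to measurement noise or to the random choice of the sampled frequencies. Put differently, a reconstructed image becomes a random variable, and both its bias and its covariance can lead to visual artifacts; the latter produces spatial correlations that may be mistaken for visual information. Although the nature of the former has been studied in the literature, the latter has received less attention. In this study, we investigate the theoretical properties of the random perturbations arising from the reconstruction process and perform a number of numerical experiments on simulated and aortic flow data. Our results show that the correlation length remains limited to two to three pixels when a Gaussian undersampling pattern is combined with recovery algorithms based on $\ell_1$-norm minimization. However, the correlation length can increase significantly for other undersampling patterns, higher undersampling factors (i.e., 8x or 16x compression), and different reconstruction methods.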

Functional additive models on manifolds of planar shapes and forms

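Almond Stöcker, Sonja Greven

Category: Machine Learning (Statistics)

2021-09-06

Defining shape and form as equivalence classes under translation, rotation and, for shapes, also scale, we extend generalized additive regression to models for the shape/form of planar curves and/or landmark configurations. The model respects the resulting quotient geometry of the response, employing the squared geodesic distance as the loss function and a geodesic response function to map the additive predictor to the shape/form space. To fit the model, we propose a Riemannian $L_2$-Boosting algorithm that is well suited to a potentially large number of possibly parameter-intensive model terms and that also yields automated model selection. Through a suitable tensor-product factorization, we provide novel, intuitively interpretable visualizations of (even non-linear) covariate effects in the shape/form space. The usefulness of the proposed framework is illustrated in an analysis of 1) shapes from wild and domesticated sheep and 2) cell forms generated in a biophysical model, as well as 3) in a realistic simulation study with response shapes and forms derived from a dataset of bottle outlines.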

Wavelets on Graphs via Spectral Graph Theory

David K Hammond, Pierre Vandergheynst, Rémi Gribonval

Category:

2009-12-19

We propose a novel method for constructing wavelet transforms of functions defined on the vertices of an arbitrary finite weighted graph. Our approach is based on defining scaling using the graph analogue of the Fourier domain, namely the spectral decomposition of the discrete graph Laplacian $L$. Given a wavelet generating kernel $g$ and a scale parameter $t$, we define the scaled wavelet operator $T_g^t = g(tL)$. The spectral graph wavelets are then formed by localizing this operator by applying it to an indicator function. Subject to an admissibility condition on $g$, this procedure defines an invertible transform. We explore the localization properties of the wavelets in the limit of fine scales. Additionally, we present a fast Chebyshev polynomial approximation algorithm for computing the transform that avoids the need for diagonalizing $L$. We highlight potential applications of the transform through examples of wavelets on graphs corresponding to a variety of different problem domains.
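
The construction in the abstract can be spelled out through the eigendecomposition of the Laplacian. The symbols below go beyond the abstract and are assumptions of this sketch: $\lambda_\ell$ and $\chi_\ell$ denote the eigenvalues and orthonormal eigenvectors of $L$ on a graph with $N$ vertices, and $\delta_n$ is the indicator function of vertex $n$.

```latex
\begin{aligned}
T_g^t &= g(tL) = \sum_{\ell=0}^{N-1} g(t\lambda_\ell)\, \chi_\ell \chi_\ell^{*}, \\
\psi_{t,n} &= T_g^t\, \delta_n, \qquad
\psi_{t,n}(m) = \sum_{\ell=0}^{N-1} g(t\lambda_\ell)\, \chi_\ell^{*}(n)\, \chi_\ell(m), \\
W_f(t,n) &= \langle \psi_{t,n}, f \rangle = \bigl(T_g^t f\bigr)(n).
\end{aligned}
```

Written this way, computing the coefficients directly requires the full spectrum of $L$, which is the cost the polynomial approximation mentioned in the abstract is meant to avoid.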

Fully Adaptive Bayesian Algorithm for Data Analysis, FABADA

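Pablo M Sanchez-Alarcon, Yago Ascasibar Sequeiros

Category: Computer Vision

2022-01-13

The aim of this paper is to describe, from the point of view of Bayesian inference, a novel non-parametric noise reduction technique that can automatically improve the signal-to-noise ratio of one- and two-dimensional data, such as astronomical images and spectra. The algorithm iteratively evaluates possible smoothed versions of the data, the smooth models, and obtains an estimate of the underlying signal that is statistically compatible with the noisy measurements. The iterations rely on the evidence and the $\chi^2$ statistic of the last smooth model, and we compute the expected value of the signal as a weighted average over the whole set of smooth models. In this paper, we explain the mathematical formalism and numerical implementation of the algorithm, and we evaluate its performance in terms of the peak signal-to-noise ratio, the structural similarity index, and the time payload, using a battery of real astronomical observations. Our Fully Adaptive Bayesian Algorithm for Data Analysis (FABADA) yields results that, without any parameter tuning, are comparable to those of standard image processing algorithms whose parameters have been optimized based on the true signal to be recovered, something that is impossible in a real application. State-of-the-art non-parametric methods such as BM3D offer slightly better performance at high signal-to-noise ratios, while our algorithm is significantly more accurate for extremely noisy data (relative errors above 20-40%, a situation of particular interest in astronomy). In this range, the standard deviation of the residuals obtained with our reconstruction can become an order of magnitude lower than that of the original measurements. The source code needed to reproduce all the results shown in this report, including the implementation of the method, is publicly available at https://github.com/pablolyanala/fabada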

Subtle Data Crimes: Naively training machine learning algorithms could lead to overly-optimistic results

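Efrat Shimron, Jonathan I. Tamir, Ke Wang, Michael Lustig

Category: Machine Learning

2021-09-16

While open databases are an important resource in the deep learning (DL) era, they are sometimes used "off-label": data published for one task are used for a different one. This work aims to highlight that, in some cases, this common practice may lead to biased, overly optimistic results. We demonstrate this phenomenon for inverse problem solvers and show how their biased performance stems from hidden data preprocessing pipelines. We describe two preprocessing pipelines typical of open-access databases and study their effects on three well-established algorithms developed for magnetic resonance imaging (MRI) reconstruction: compressed sensing (CS), dictionary learning (DictL), and DL. We performed extensive computations in this large-scale study. Our results show that the CS, DictL, and DL algorithms yield systematically biased results when naively trained on seemingly appropriate data: the normalized root mean square error (NRMSE) improves consistently with the extent of preprocessing, showing an artificial improvement of 25%-48% in some cases. Since this phenomenon is generally unknown, biased results are sometimes published as state of the art; we refer to this as subtle data crimes. This work therefore raises a red flag regarding naive off-label usage of big data and reveals the vulnerability of modern inverse problem solvers to the resulting bias.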
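
For readers unfamiliar with the metric, the following is a minimal sketch of one common NRMSE definition, in which the root of the squared error is normalized by the energy of the reference signal. The abstract does not state which normalization the authors use, so this convention, like the function name `nrmse`, is an assumption.

```python
import math


def nrmse(reference, estimate):
    """Error norm divided by the norm of the reference (one common convention)."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("inputs must be non-empty and of equal length")
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    reference_energy = sum(r ** 2 for r in reference)
    if reference_energy == 0:
        raise ValueError("reference signal has zero energy")
    return math.sqrt(error_energy / reference_energy)


# A perfect reconstruction scores 0; an all-zero estimate scores 1.
print(nrmse([3.0, 4.0], [3.0, 4.0]))  # prints 0.0
print(nrmse([3.0, 4.0], [0.0, 0.0]))  # prints 1.0
```

Lower values are better under this definition, so a drop in NRMSE caused by preprocessing alone looks like an improvement in the reconstruction method even though the method has not changed.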

Shining light on data: Geometric data analysis through quantum dynamics

Akshat Kumar, Mohan Sarovar

Category: Machine Learning | Machine Learning (Statistics)

2022-12-01

Experimental sciences have come to depend heavily on our ability to organize, interpret and analyze high-dimensional datasets produced from observations of a large number of variables governed by natural processes. Natural laws, conservation principles, and dynamical structure introduce intricate inter-dependencies among these observed variables, which in turn yield geometric structure, with fewer degrees of freedom, on the dataset. We show how fine-scale features of this structure in data can be extracted from discrete approximations to quantum mechanical processes given by data-driven graph Laplacians and localized wavepackets. This data-driven quantization procedure leads to a novel, yet natural uncertainty principle for data analysis induced by limited data. We illustrate the new approach with algorithms and several applications to real-world data, including the learning of patterns and anomalies in social distancing and mobility behavior during the COVID-19 pandemic.

OSLO: On-the-Sphere Learning for Omnidirectional images and its application to 360-degree image compression

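Navid Mahmoudian Bidgoli, Roberto G. de A. Azevedo, Thomas Maugey, Aline Roumy, Pascal Frossard

Category: Computer Vision

2021-07-19

State-of-the-art 2D image compression schemes rely on the power of convolutional neural networks (CNNs). Although CNNs offer promising perspectives for 2D image compression, extending such models to omnidirectional images is not straightforward. First, omnidirectional images have specific spatial and statistical properties that cannot be fully captured by current CNN models. Second, the basic mathematical operations that make up a CNN architecture, such as translation and sampling, are not well defined on the sphere. In this paper, we study the learning of representation models for omnidirectional images and propose to use the properties of the HEALPix uniform sampling of the sphere to redefine the mathematical tools used in deep learning models for omnidirectional images. In particular, we: i) propose a definition of a new convolution operation on the sphere that keeps the high expressiveness and low complexity of a classical 2D convolution; ii) adapt standard CNN techniques such as stride, iterative aggregation, and pixel shuffling to the spherical domain; and then iii) apply our new framework to the task of omnidirectional image compression. Our experiments show that the proposed on-the-sphere solution leads to a better compression gain, saving 13.7% of the bit rate compared with similar learned models applied to equirectangular images. Also, compared with learning models based on graph convolutional networks, our solution supports more expressive filters that can preserve high frequencies and provide better perceptual quality of the compressed images. These results demonstrate the efficiency of the proposed framework, which opens new research avenues for other omnidirectional vision tasks to be implemented effectively on the sphere manifold.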

The whole and the parts: the MDL principle and the a-contrario framework

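Rafael Grompone von Gioi, Ignacio Ramírez Paulino, Gregory Randall

Category: Computer Vision

2021-12-13

This work explores the connections between the Minimum Description Length (MDL) principle developed by Rissanen and the a-contrario framework for structure detection proposed by Desolneux, Moisan and Morel. The MDL principle focuses on the best interpretation of the data as a whole, while the a-contrario approach concentrates on detecting parts of the data with anomalous statistics. Although framed in different theoretical formalisms, the two methodologies share many common concepts and tools in their machinery and yield very similar formulations in a number of interesting scenarios, ranging from simple toy examples to practical applications such as polygonal approximation of curves and line segment detection in images. We also formulate the conditions under which the two approaches are formally equivalent.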

Gradient-based learning applied to document recognition

Category:

Multilayer Neural Networks trained with the backpropagation algorithm constitute the best example of a successful Gradient-Based Learning technique. Given an appropriate network architecture, Gradient-Based Learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional Neural Networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called Graph Transformer Networks (GTN), allows such multi-module systems to be trained globally using Gradient-Based methods so as to minimize an overall performance measure. Two systems for on-line handwriting recognition are described. Experiments demonstrate the advantage of global training and the flexibility of Graph Transformer Networks. A Graph Transformer Network for reading bank checks is also described. It uses Convolutional Neural Network character recognizers combined with global training techniques to provide record accuracy on business and personal checks. It is deployed commercially and reads several million checks per day.

Pictorial and apictorial polygonal jigsaw puzzles: The lazy caterer model, properties, and solvers

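Peleg Harel, Ohad Ben-Shahar

Category: Computer Vision | Artificial Intelligence

2020-08-17

Jigsaw puzzle solving, the problem of constructing a coherent whole from a set of non-overlapping, unordered visual fragments, is fundamental to numerous applications, yet most of the literature of the past two decades has focused on less realistic puzzles with square pieces. Here we formalize a new type of jigsaw puzzle in which the pieces are general convex polygons generated by cutting a global polygonal shape/image with an arbitrary number of straight cuts, a generation model inspired by the celebrated lazy caterer's sequence. We analyze the theoretical properties of such puzzles, including the inherent challenges of solving them once the pieces are contaminated with geometric noise. To cope with these difficulties and obtain tractable solutions, we abstract the problem as a multi-body spring-mass dynamical system endowed with hierarchical loop constraints and a layered reconstruction process. We define evaluation metrics and present experimental results on both apictorial and pictorial puzzles to show that they can be solved completely automatically.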
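
The lazy caterer's sequence mentioned above gives the maximum number of pieces obtainable from a convex shape with a given number of straight cuts, namely $n(n+1)/2 + 1$ for $n$ cuts. The short sketch below evaluates it; the function name `lazy_caterer` is hypothetical and is not taken from the paper.

```python
def lazy_caterer(cuts):
    """Maximum number of pieces produced by `cuts` straight cuts of a convex shape."""
    if cuts < 0:
        raise ValueError("number of cuts must be non-negative")
    return cuts * (cuts + 1) // 2 + 1


print([lazy_caterer(n) for n in range(6)])  # prints [1, 2, 4, 7, 11, 16]
```

The count is an upper bound: it is reached only when every cut crosses all previous cuts inside the shape at distinct points, so a puzzle generated with arbitrary cuts may have fewer pieces.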

3D Labeling Tool

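John Rachwan, Charbel Zalaket

Category: Computer Vision | Artificial Intelligence

2022-07-23

Training and testing supervised object detection models requires a large number of images with ground-truth labels. Labels define the object classes in an image as well as their locations, shapes, and possibly other information such as pose. The labeling process is extremely time consuming, even when manpower is available. We introduce a new labeling tool for 2D images as well as 3D triangular meshes: the 3D Labeling Tool (3DLT). It is a standalone, feature-rich, cross-platform application that requires no installation and runs on Windows, macOS, and Linux-based distributions. Instead of labeling the same object separately on every image, as current tools do, we use depth information to reconstruct a triangular mesh from the images and label the object only once, on that mesh. We use registration to simplify 3D labeling, outlier detection to improve the computation of 2D bounding boxes, and surface reconstruction to extend labeling to large point clouds. Our tool is tested against state-of-the-art methods and greatly outperforms them while preserving accuracy and ease of use.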

Real-Time GPU-Accelerated Machine Learning Based Multiuser Detection for 5G and Beyond

Matthias Mehlhose, Daniel Schäufele, Daniyal Amir Awan, Guillermo Marcus, Nikolaus Binder, Martin Kasparick, Renato L. G. Cavalcante, Sławomir Stańczak, Alexander Keller

Category: Machine Learning | Machine Learning (Statistics)

2022-01-13

Adaptive partial linear beamforming meets the need of 5G and future 6G applications for high flexibility and adaptability. Choosing an appropriate tradeoff between conflicting goals opens the recently proposed multiuser (MU) detection method. Due to their high spatial resolution, nonlinear beamforming filters can significantly outperform linear approaches in stationary scenarios with massive connectivity. However, a dramatic decrease in performance can be expected in high mobility scenarios because they are very susceptible to changes in the wireless channel. The robustness of linear filters is required, considering these changes. One way to respond appropriately is to use online machine learning algorithms. The theory of algorithms based on the adaptive projected subgradient method (APSM) is rich, and they promise accurate tracking capabilities in dynamic wireless environments. However, one of the main challenges comes from the real-time implementation of these algorithms, which involve projections on time-varying closed convex sets. While the projection operations are relatively simple, their vast number poses a challenge in ultralow latency (ULL) applications where latency constraints must be satisfied in every radio frame. Taking non-orthogonal multiple access (NOMA) systems as an example, this paper explores the acceleration of APSM-based algorithms through massive parallelization. The result is a GPU-accelerated real-time implementation of an orthogonal frequency-division multiplexing (OFDM)-based transceiver that enables detection latency of less than one millisecond and therefore complies with the requirements of 5G and beyond. To meet the stringent physical layer latency requirements, careful co-design of hardware and software is essential, especially in virtualized wireless systems with hardware accelerators.

Parametric Level-sets Enhanced To Improve Reconstruction (PaLEnTIR)

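Ege Ozsar, Misha Kilmer, Eric Miller, Eric de Sturler, Arvind Saibaba

Category: Computer Vision

2022-04-21

In this paper, we consider the restoration and reconstruction of piecewise constant objects in two and three dimensions using PaLEnTIR, a parametric level set (PaLS) model that is significantly enhanced relative to the current state of the art. The primary contribution of this paper is a new PaLS formulation that requires only a single level set function to recover a scene containing piecewise constant objects with multiple unknown contrasts. Our model offers distinct advantages over current approaches to the multi-contrast, multi-object problem, all of which require multiple level sets and explicit estimation of the contrast magnitudes. Given upper and lower bounds on the contrast, our approach can recover objects with any distribution of contrasts and removes the need to know either the number of contrasts in a given scene or their values. We provide an iterative process for finding these spatially varying contrast limits. In contrast to most PaLS methods, which use radial basis functions (RBFs), our model uses non-isotropic basis functions, thereby expanding the class of shapes that a PaLS model of a given complexity can approximate. Finally, PaLEnTIR improves the conditioning of the Jacobian matrix required as part of the parameter identification process, and thus accelerates the optimization methods, by controlling the magnitude of the PaLS expansion coefficients, fixing the centers of the basis functions, and through the uniqueness of the parameter-to-image mapping provided by the new parameterization. We demonstrate the performance of the new approach on 2D and 3D variants of X-ray computed tomography, diffuse optical tomography (DOT), denoising, and deconvolution problems. The method is applied to experimental sparse CT data and to simulated data with different types of noise to further validate it.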

eGHWT: The extended Generalized Haar-Walsh Transform

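Naoki Saito, Yiqun Shao

Category: Computer Vision

2021-07-11

Extending computational harmonic analysis tools from the classical setting of regular lattices to the more general setting of graphs and networks is very important, and much research has been done on this recently. The Generalized Haar-Walsh Transform (GHWT), developed by Irion and Saito (2014), is a multiscale transform for signals on graphs that generalizes the classical Haar and Walsh-Hadamard transforms. We propose the extended Generalized Haar-Walsh Transform (eGHWT), which is a generalization of the adapted time-frequency tilings of Thiele and Villemoes (1996). The eGHWT examines not only the efficiency of graph-domain partitions but also that of "sequency-domain" partitions at the same time. Consequently, the eGHWT and its associated best-basis selection algorithm for graph signals significantly improve on the performance of the earlier GHWT at a computational cost of $O(N \log N)$, where $N$ is the number of nodes of the input graph. While the GHWT best-basis algorithm seeks the most suitable orthonormal basis for a given task among the possible orthonormal bases in $\mathbb{R}^N$, the eGHWT best-basis algorithm can find a better one by searching through more than $0.618 \cdot (1.84)^N$ possible orthonormal bases in $\mathbb{R}^N$. This article describes the details of the eGHWT best-basis algorithm and demonstrates its superiority using several examples, including genuine graph signals as well as conventional digital images viewed as graph signals. Furthermore, we show how the eGHWT can be extended to 2D signals and matrix-form data by viewing them as a tensor product of graphs generated from their columns and rows, and we demonstrate its effectiveness in applications such as image approximation.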
