图像生物标准化倡议(IBSI)旨在通过标准化从图像中提取图像生物标志物(特征)的计算过程来提高射致研究的再现性。我们之前建立了169个常用特征的参考值,创建了标准的射频图像处理方案,并开发了用于垄断研究的报告指南。但是,若干方面没有标准化。在这里,我们提出了在射频中使用卷积图像过滤器的参考手册的初步版本。滤波器,例如高斯滤波器的小波或拉普拉斯,在强调特定图像特征(如边缘和Blob)中发挥重要组成部分。已发现从过滤滤波器响应图派生的功能可重复差。此参考手册构成了持续工作的基础,用于标准化卷积滤波器中的覆盖物中的持续工作,并在这项工作进行时更新。
translated by 谷歌翻译
从随机字段或纹理中提取信息是科学中无处不在的任务,从探索性数据分析到分类和参数估计。从物理学到生物学,它往往通过功率谱分析来完成,这通常过于有限,或者使用需要大型训练的卷积神经网络(CNNS)并缺乏解释性。在本文中,我们倡导使用散射变换(Mallat 2012),这是一种强大的统计数据,它来自CNNS的数学思想,但不需要任何培训,并且是可解释的。我们表明它提供了一种相对紧凑的汇总统计数据,具有视觉解释,并在广泛的科学应用中携带大多数相关信息。我们向该估算者提供了非技术性介绍,我们认为它可以使数据分析有利于多种科学领域的模型和参数推断。有趣的是,了解散射变换的核心操作允许人们解读CNN的内部工作的许多关键方面。
translated by 谷歌翻译
We observe that despite their hierarchical convolutional nature, the synthesis process of typical generative adversarial networks depends on absolute pixel coordinates in an unhealthy manner. This manifests itself as, e.g., detail appearing to be glued to image coordinates instead of the surfaces of depicted objects. We trace the root cause to careless signal processing that causes aliasing in the generator network. Interpreting all signals in the network as continuous, we derive generally applicable, small architectural changes that guarantee that unwanted information cannot leak into the hierarchical synthesis process. The resulting networks match the FID of StyleGAN2 but differ dramatically in their internal representations, and they are fully equivariant to translation and rotation even at subpixel scales. Our results pave the way for generative models better suited for video and animation. * This work was done during an internship at NVIDIA. 35th Conference on Neural Information Processing Systems (NeurIPS 2021).
translated by 谷歌翻译
人类生理学中的各种结构遵循特异性形态,通常在非常细的尺度上表达复杂性。这种结构的例子是胸前气道,视网膜血管和肝血管。可以观察到可以观察到可以观察到可以观察到可以观察到空间排列的磁共振成像(MRI),计算机断层扫描(CT),光学相干断层扫描(OCT)等医学成像模式(MRI),计算机断层扫描(CT),可以观察到空间排列的大量2D和3D图像的集合。这些结构在医学成像中的分割非常重要,因为对结构的分析提供了对疾病诊断,治疗计划和预后的见解。放射科医生手动标记广泛的数据通常是耗时且容易出错的。结果,在过去的二十年中,自动化或半自动化的计算模型已成为医学成像的流行研究领域,迄今为止,许多计算模型已经开发出来。在这项调查中,我们旨在对当前公开可用的数据集,细分算法和评估指标进行全面审查。此外,讨论了当前的挑战和未来的研究方向。
translated by 谷歌翻译
The principle of equivariance to symmetry transformations enables a theoretically grounded approach to neural network architecture design. Equivariant networks have shown excellent performance and data efficiency on vision and medical imaging problems that exhibit symmetries. Here we show how this principle can be extended beyond global symmetries to local gauge transformations. This enables the development of a very general class of convolutional neural networks on manifolds that depend only on the intrinsic geometry, and which includes many popular methods from equivariant and geometric deep learning.We implement gauge equivariant CNNs for signals defined on the surface of the icosahedron, which provides a reasonable approximation of the sphere. By choosing to work with this very regular manifold, we are able to implement the gauge equivariant convolution using a single conv2d call, making it a highly scalable and practical alternative to Spherical CNNs. Using this method, we demonstrate substantial improvements over previous methods on the task of segmenting omnidirectional images and global climate patterns.
translated by 谷歌翻译
浑浊或形成是工业常规用于行业的概念,以解决非织造物和论文的均匀性的偏差。基于图像数据测量浑浊索引是工业质量保证的常见任务。量化云的两个最流行的方式是基于一方面的功率谱或相关功能,另一方面是拉普拉斯金字塔。在这里,我们全面地回顾了第一种方法的数学基础,得出了浑浊指数,并证明了其实际估计。我们证明了拉普拉斯金字塔以及表征云的其他量,如相互作用范围和小角度散射的强度与功率谱非常密切相关。最后,我们表明,功率谱易于分析图像,并且具有比替代方案更多的信息。
translated by 谷歌翻译
本文介绍了一个固定的厄贡点过程的统计模型,该模型是根据在方形窗口中观察到的单个实现估计的。使用随机几何形状中的现有方法,很难用大量颗粒形成复杂的几何形状进行建模。受到采样最大渗透模型的梯度下降算法的最新作品的启发,我们描述了一个模型,该模型允许快速采样新的配置,从而再现了给定观察的统计数据。从初始随机配置开始,其粒子根据能量的梯度移动,以匹配一组规定的矩(功能)。我们的矩是通过相谐波操作员在点模式的小波变换上定义的。它们允许一个人捕获粒子之间的多尺度相互作用,同时按照模型的结构的尺度明确控制矩数。我们介绍了具有各种几何结构的点过程的数值实验,并通过光谱和拓扑数据分析评估模型的质量。
translated by 谷歌翻译
我们介绍了一种确定全局特征解耦的方法,并显示其适用于提高数据分析性能的适用性,并开放了新的场所以进行功能传输。我们提出了一种新的形式主义,该形式主义是基于沿特征梯度遵循轨迹来定义对子曼群的转换的。通过这些转换,我们定义了一个归一化,我们证明,它允许解耦可区分的特征。通过将其应用于采样矩,我们获得了用于正骨的准分析溶液,正尾肌肉是峰度的归一化版本,不仅与平均值和方差相关,而且还与偏度相关。我们将此方法应用于原始数据域和过滤器库的输出中,以基于全局描述符的回归和分类问题,与使用经典(未删除)描述符相比,性能得到一致且显着的改进。
translated by 谷歌翻译
In photoacoustic tomography (PAT) with flat sensor, we routinely encounter two types of limited data. The first is due to using a finite sensor and is especially perceptible if the region of interest is large relative to the sensor or located farther away from the sensor. In this paper, we focus on the second type caused by a varying sensitivity of the sensor to the incoming wavefront direction which can be modelled as binary i.e. by a cone of sensitivity. Such visibility conditions result, in the Fourier domain, in a restriction of both the image and the data to a bow-tie, akin to the one corresponding to the range of the forward operator. The visible wavefrontsets in image and data domains, are related by the wavefront direction mapping. We adapt the wedge restricted Curvelet decomposition, we previously proposed for the representation of the full PAT data, to separate the visible and invisible wavefronts in the image. We optimally combine fast approximate operators with tailored deep neural network architectures into efficient learned reconstruction methods which perform reconstruction of the visible coefficients and the invisible coefficients are learned from a training set of similar data.
translated by 谷歌翻译
We propose a novel method for constructing wavelet transforms of functions defined on the vertices of an arbitrary finite weighted graph. Our approach is based on defining scaling using the the graph analogue of the Fourier domain, namely the spectral decomposition of the discrete graph Laplacian L. Given a wavelet generating kernel g and a scale parameter t, we define the scaled wavelet operator T t g = g(tL). The spectral graph wavelets are then formed by localizing this operator by applying it to an indicator function. Subject to an admissibility condition on g, this procedure defines an invertible transform. We explore the localization properties of the wavelets in the limit of fine scales. Additionally, we present a fast Chebyshev polynomial approximation algorithm for computing the transform that avoids the need for diagonalizing L. We highlight potential applications of the transform through examples of wavelets on graphs corresponding to a variety of different problem domains.
translated by 谷歌翻译
图像表示是计算机视觉和模式识别中的一个重要主题。它在一系列应用中扮演了了解视觉内容的基本作用。据报道,基于矩的图像表示在满足其由于其有益的数学特性而满足语义描述的核心条件,特别是几何不变性和独立性。本文介绍了对图像表示的正交矩的全面调查,涵盖了快速/准确计算,鲁棒性/不变性优化,定义扩展和应用程序的最新进步。我们还为各种广泛使用的正交瞬间创建一个软件包,并在同一基地中评估此类方法。提出的理论分析,软件实施和评估结果可以支持社区,特别是在开发新颖的技术和促进现实世界的应用方面。
translated by 谷歌翻译
图像取证是一个上升的话题,因为值得信赖的多媒体内容对于现代社会至关重要。像其他与视觉相关的应用一样,法医分析在很大程度上依赖于适当的图像表示。尽管很重要,但目前对这种表示的理论理解仍然有限,其关键作用的忽视程度不同。对于这个差距,我们试图从理论,实施和应用的角度研究以法医为导向的图像表示作为一个独特的问题。我们的工作始于基本原则的抽象,法医的表示应该满足,尤其是揭示了鲁棒性,可解释性和覆盖范围的关键性。在理论层面上,我们为取证的新表示框架提出了一个称为密​​集不变表示(dir)的新表示框架,该框架的特征是具有数学保证的稳定描述。在实现级别上,讨论了DIR的离散计算问题,并设计相应的准确和快速解决方案具有通用性质和恒定的复杂性。我们在密集域模式检测和匹配实验上演示了上述论点,从而提供了与最新描述符的比较结果。此外,在应用程序级别上,提出的DIR最初是在被动和主动取证中探索的,即复制移动伪造的检测和感知散列,在满足此类法医任务的要求方面表现出了好处。
translated by 谷歌翻译
Many scientific fields study data with an underlying structure that is a non-Euclidean space. Some examples include social networks in computational social sciences, sensor networks in communications, functional networks in brain imaging, regulatory networks in genetics, and meshed surfaces in computer graphics. In many applications, such geometric data are large and complex (in the case of social networks, on the scale of billions), and are natural targets for machine learning techniques. In particular, we would like to use deep neural networks, which have recently proven to be powerful tools for a broad range of problems from computer vision, natural language processing, and audio analysis. However, these tools have been most successful on data with an underlying Euclidean or grid-like structure, and in cases where the invariances of these structures are built into networks used to model them.Geometric deep learning is an umbrella term for emerging techniques attempting to generalize (structured) deep neural models to non-Euclidean domains such as graphs and manifolds. The purpose of this paper is to overview different examples of geometric deep learning problems and present available solutions, key difficulties, applications, and future research directions in this nascent field.
translated by 谷歌翻译
最新的2D图像压缩方案依赖于卷积神经网络(CNN)的力量。尽管CNN为2D图像压缩提供了有希望的观点,但将此类模型扩展到全向图像并不简单。首先,全向图像具有特定的空间和统计特性,这些特性无法通过当前CNN模型完全捕获。其次,在球体上,基本的数学操作组成了CNN体系结构,例如翻译和采样。在本文中,我们研究了全向图像的表示模型的学习,并建议使用球体的HealPix均匀采样的属性来重新定义用于全向图像的深度学习模型中使用的数学工具。特别是,我们:i)提出了在球体上进行新的卷积操作的定义,以保持经典2D卷积的高表现力和低复杂性; ii)适应标准的CNN技术,例如步幅,迭代聚集和像素改组到球形结构域;然后iii)将我们的新框架应用于全向图像压缩的任务。我们的实验表明,与应用于等应角图像的类似学习模型相比,我们提出的球形溶液可带来更好的压缩增益,可以节省比特率的13.7%。同样,与基于图形卷积网络的学习模型相比,我们的解决方案支持更具表现力的过滤器,这些过滤器可以保留高频并提供压缩图像的更好的感知质量。这样的结果证明了拟议框架的效率,该框架为其他全向视觉任务任务打开了新的研究场所,以在球体歧管上有效实施。
translated by 谷歌翻译
兴趣点检测是计算机视觉和图像处理中最根本,最关键的问题之一。在本文中,我们对图像特征信息(IFI)提取技术进行了全面综述,以进行利益点检测。为了系统地介绍现有的兴趣点检测方法如何从输入图像中提取IFI,我们提出了IFI提取技术的分类学检测。根据该分类法,我们讨论了不同类型的IFI提取技术以进行兴趣点检测。此外,我们确定了与现有的IFI提取技术有关的主要未解决的问题,以及以前尚未讨论过的任何兴趣点检测方法。提供了现有的流行数据集和评估标准,并评估和讨论了18种最先进方法的性能。此外,还详细阐述了有关IFI提取技术的未来研究方向。
translated by 谷歌翻译
Training generative adversarial networks (GAN) using too little data typically leads to discriminator overfitting, causing training to diverge. We propose an adaptive discriminator augmentation mechanism that significantly stabilizes training in limited data regimes. The approach does not require changes to loss functions or network architectures, and is applicable both when training from scratch and when fine-tuning an existing GAN on another dataset. We demonstrate, on several datasets, that good results are now possible using only a few thousand training images, often matching StyleGAN2 results with an order of magnitude fewer images. We expect this to open up new application domains for GANs. We also find that the widely used CIFAR-10 is, in fact, a limited data benchmark, and improve the record FID from 5.59 to 2.42.
translated by 谷歌翻译
Translating or rotating an input image should not affect the results of many computer vision tasks. Convolutional neural networks (CNNs) are already translation equivariant: input image translations produce proportionate feature map translations. This is not the case for rotations. Global rotation equivariance is typically sought through data augmentation, but patch-wise equivariance is more difficult. We present Harmonic Networks or H-Nets, a CNN exhibiting equivariance to patch-wise translation and 360-rotation. We achieve this by replacing regular CNN filters with circular harmonics, returning a maximal response and orientation for every receptive field patch.H-Nets use a rich, parameter-efficient and fixed computational complexity representation, and we show that deep feature maps within the network encode complicated rotational invariants. We demonstrate that our layers are general enough to be used in conjunction with the latest architectures and techniques, such as deep supervision and batch normalization. We also achieve state-of-the-art classification on rotated-MNIST, and competitive results on other benchmark challenges.
translated by 谷歌翻译
我们介绍了CheBlieset,一种对(各向异性)歧管的组成的方法。对基于GRAP和基于组的神经网络的成功进行冲浪,我们利用了几何深度学习领域的最新发展,以推导出一种新的方法来利用数据中的任何各向异性。通过离散映射的谎言组,我们开发由各向异性卷积层(Chebyshev卷积),空间汇集和解凝层制成的图形神经网络,以及全球汇集层。集团的标准因素是通过具有各向异性左不变性的黎曼距离的图形上的等级和不变的运算符来实现的。由于其简单的形式,Riemannian公制可以在空间和方向域中模拟任何各向异性。这种对Riemannian度量的各向异性的控制允许平衡图形卷积层的不变性(各向异性度量)的平衡(各向异性指标)。因此,我们打开大门以更好地了解各向异性特性。此外,我们经验证明了在CIFAR10上的各向异性参数的存在(数据依赖性)甜点。这一关键的结果是通过利用数据中的各向异性属性来获得福利的证据。我们还评估了在STL10(图像数据)和ClimateNet(球面数据)上的这种方法的可扩展性,显示了对不同任务的显着适应性。
translated by 谷歌翻译
A wavelet scattering network computes a translation invariant image representation, which is stable to deformations and preserves high frequency information for classification. It cascades wavelet transform convolutions with non-linear modulus and averaging operators. The first network layer outputs SIFT-type descriptors whereas the next layers provide complementary invariant information which improves classification. The mathematical analysis of wavelet scattering networks explain important properties of deep convolution networks for classification.A scattering representation of stationary processes incorporates higher order moments and can thus discriminate textures having same Fourier power spectrum. State of the art classification results are obtained for handwritten digits and texture discrimination, with a Gaussian kernel SVM and a generative PCA classifier.
translated by 谷歌翻译
定义网格上卷积的常用方法是将它们作为图形解释并应用图形卷积网络(GCN)。这种GCNS利用各向同性核,因此对顶点的相对取向不敏感,从而对整个网格的几何形状。我们提出了规范的等分性网状CNN,它概括了GCNS施加各向异性仪表等级核。由于产生的特征携带方向信息,我们引入了通过网格边缘并行传输特征来定义的几何消息传递方案。我们的实验验证了常规GCN和其他方法的提出模型的显着提高的表达性。
translated by 谷歌翻译