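Convolutional neural networks have revolutionized vision applications. There are, however, image domains and representations that standard CNNs cannot handle (e.g., spherical images, superpixels). Such data are usually processed with networks and algorithms specialized for each type. In this work, we show that it may not always be necessary to use specialized neural networks to operate on such spaces. Instead, we introduce a new structured convolution operator that can copy 2D convolution weights, transferring the capabilities of already trained traditional CNNs to our new graph network. This network can then operate on any data that can be represented as a positional graph. By converting non-rectilinear data to a graph, we can apply these convolutions on these irregular image domains without training on large domain-specific datasets. Results of transferring pre-trained image networks for segmentation, stylization, and depth prediction are demonstrated for a variety of such data forms.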
translated by 谷歌翻译
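To make the weight-copying idea in the abstract above concrete, here is a minimal sketch on the simplest possible case, a regular pixel grid. It is not the authors' operator: the function names, the edge layout `(src, dst, offset_index)`, and the choice of nine relative-position edge types are all illustrative assumptions. It only shows that a graph aggregation whose edges are labeled by relative position can reuse a 3x3 kernel unchanged and reproduce a zero-padded 2D convolution.

```python
# Illustrative sketch only: all names and the graph layout are assumptions,
# not taken from the paper.

# The nine relative positions of a 3x3 kernel, in row-major order.
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def conv2d_same(image, kernel):
    """Zero-padded 3x3 cross-correlation, as computed by CNN 'conv' layers."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in OFFSETS:
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    out[y][x] += kernel[dy + 1][dx + 1] * image[yy][xx]
    return out


def grid_to_graph(h, w):
    """Build edges (src, dst, k): pixel src sits at offset OFFSETS[k] from dst."""
    edges = []
    for y in range(h):
        for x in range(w):
            for k, (dy, dx) in enumerate(OFFSETS):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    edges.append((yy * w + xx, y * w + x, k))
    return edges


def graph_conv(features, edges, weights):
    """Sum neighbor features, each scaled by the weight of its edge type."""
    out = [0.0] * len(features)
    for src, dst, k in edges:
        out[dst] += weights[k] * features[src]
    return out


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
kernel = [[0.0, 1.0, 0.0], [1.0, -4.0, 1.0], [0.0, 1.0, 0.0]]  # Laplacian

# "Copy" the 2D kernel into per-edge-type graph weights (row-major flatten).
flat_weights = [value for row in kernel for value in row]
features = [value for row in image for value in row]

graph_out = graph_conv(features, grid_to_graph(3, 3), flat_weights)
dense_out = [value for row in conv2d_same(image, kernel) for value in row]
```

On an irregular domain the same `graph_conv` would run unchanged; only the graph construction, that is, how each neighbor is assigned to one of the position-labeled edge types, would differ.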
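Convolution on 3D point clouds has been widely studied, but is far from perfect in geometric deep learning. Conventional convolution establishes feature correspondences among 3D points indistinguishably, an intrinsic limitation that leads to poor distinctive feature learning. In this paper, we propose Adaptive Graph Convolution (AGConv) for wide applications of point cloud analysis. AGConv generates adaptive kernels for points according to their dynamically learned features. Compared with solutions that use fixed/isotropic kernels, AGConv improves the flexibility of point cloud convolutions, effectively and precisely capturing the diverse relations between points from different semantic parts. Unlike popular attentional weight schemes, AGConv implements the adaptiveness inside the convolution operation instead of simply assigning different weights to the neighboring points. Extensive evaluations clearly show that our method outperforms state-of-the-art methods for point cloud classification and segmentation on various benchmark datasets. Meanwhile, AGConv can be flexibly adopted by more point cloud analysis approaches to boost their performance. To validate its flexibility and effectiveness, we explore AGConv-based paradigms for completion, denoising, upsampling, registration, and circle extraction, which are comparable or even superior to their competitors. Our code is available at https://github.com/hrzhou2/adaptconv-master.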
Point cloud learning has lately attracted increasing attention due to its wide applications in many areas, such as computer vision, autonomous driving, and robotics. As a dominant technique in AI, deep learning has been successfully used to solve various 2D vision problems. However, deep learning on point clouds is still in its infancy due to the unique challenges of processing point clouds with deep neural networks. Recently, deep learning on point clouds has been thriving, with numerous methods being proposed to address different problems in this area. To stimulate future research, this paper presents a comprehensive review of recent progress in deep learning methods for point clouds. It covers three major tasks: 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation. It also presents comparative results on several publicly available datasets, together with insightful observations and inspiring future research directions.
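Predicting the evolution of a representative sample of a material with microstructure is a fundamental problem in homogenization. In this work, we propose a graph convolutional neural network that uses the discretized representation of the initial microstructure directly, without segmentation or clustering. Compared to feature-based and pixel-based convolutional neural network models, the proposed method has a number of advantages: (a) it is deep in that it does not require featurization but can benefit from it, (b) it has a simple implementation with standard convolutional filters and layers, (c) it works natively on unstructured and structured grid data without interpolation (unlike pixel-based convolutional neural networks), and (d) it preserves rotational invariance like other graph-based convolutional neural networks. We demonstrate the performance of the proposed network and compare it to traditional pixel-based convolutional neural network models and feature-based graph convolutional neural networks on multiple large datasets.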
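Convolutional neural networks (CNNs) have made great breakthroughs in 2D computer vision. However, the irregular structure of meshes makes it hard to exploit the power of CNNs on them directly. A subdivision surface provides a hierarchical multi-resolution structure in which each face of a closed 2-manifold triangle mesh is adjacent to exactly three faces. Motivated by these two observations, this paper introduces an innovative and versatile CNN framework for 3D triangle meshes with Loop subdivision sequence connectivity. Making an analogy between mesh faces and pixels in a 2D image allows us to present a mesh convolution operator that aggregates local features from nearby faces. By exploiting face neighborhoods, this convolution can support standard 2D convolutional network concepts, e.g., variable kernel size, stride, and dilation. Based on the multi-resolution hierarchy, we use a pooling layer that uniformly merges four faces into one and an upsampling method that splits one face into four. As a result, many popular 2D CNN architectures can be readily adapted to process 3D meshes. Meshes with arbitrary connectivity can be remeshed via self-parameterization to have Loop subdivision sequence connectivity, making SubdivNet a general approach. Extensive evaluation and various applications demonstrate the effectiveness and efficiency of SubdivNet.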
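The four-to-one pooling and one-to-four upsampling described above can be sketched in a few lines. This is a hedged illustration, not the SubdivNet implementation: it assumes a hypothetical storage layout in which the four child faces of coarse face `f` sit at indices `4*f` to `4*f + 3`, and the function names and the choice of `max` as the reducer are also assumptions.

```python
# Illustrative sketch only: the "children of face f are stored at 4f..4f+3"
# layout and all names here are assumptions, not the library's API.

def pool_faces(face_features, reducer=max):
    """Merge each group of four child faces into one coarse face, per channel."""
    if len(face_features) % 4 != 0:
        raise ValueError("expected four child faces per coarse face")
    pooled = []
    for start in range(0, len(face_features), 4):
        group = face_features[start:start + 4]
        # zip(*group) iterates over channels across the four faces.
        pooled.append([reducer(channel) for channel in zip(*group)])
    return pooled


def upsample_faces(coarse_features):
    """Split each coarse face into four children that inherit its features."""
    return [list(feat) for feat in coarse_features for _ in range(4)]


# Eight fine faces with two channels each, i.e. two coarse faces.
fine = [
    [1.0, 0.0], [2.0, 5.0], [3.0, 1.0], [0.0, 2.0],
    [4.0, 4.0], [1.0, 7.0], [6.0, 0.0], [2.0, 2.0],
]
coarse = pool_faces(fine)
restored = upsample_faces(coarse)
```

With a fixed child ordering like this, pooling and upsampling become simple reshapes, which is why the regular subdivision structure makes it straightforward to port 2D CNN designs to meshes.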
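State-of-the-art 2D image compression schemes rely on the power of convolutional neural networks (CNNs). Although CNNs offer promising perspectives for 2D image compression, extending such models to omnidirectional images is not straightforward. First, omnidirectional images have specific spatial and statistical properties that cannot be fully captured by current CNN models. Second, the basic mathematical operations composing a CNN architecture, such as translation and sampling, are not well defined on the sphere. In this paper, we study the learning of representation models for omnidirectional images and propose to use the properties of HEALPix uniform sampling of the sphere to redefine the mathematical tools used in deep learning models for omnidirectional images. In particular, we: i) propose the definition of a new convolution operation on the sphere that keeps the high expressiveness and low complexity of a classical 2D convolution; ii) adapt standard CNN techniques such as stride, iterative aggregation, and pixel shuffling to the spherical domain; and iii) apply our new framework to the task of omnidirectional image compression. Our experiments show that, compared to similar learned models applied to equirectangular images, our proposed on-the-sphere solution leads to a better compression gain and can save 13.7% of the bit rate. Also, compared to learning models based on graph convolutional networks, our solution supports more expressive filters that can preserve high frequencies and provide better perceptual quality of the compressed images. These results demonstrate the efficiency of the proposed framework, which opens new research avenues for other omnidirectional vision tasks to be implemented effectively on the sphere manifold.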
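Convolutional neural networks (CNNs) have been widely used in various vision tasks, such as image classification and semantic segmentation. Unfortunately, standard 2D CNNs are not well suited to spherical signals such as panorama images or spherical projections, because the sphere is an unstructured grid. In this paper, we present the Spherical Transformer, which transforms spherical signals into vectors that can be processed directly by standard CNNs, so that many well-designed CNN architectures can be reused across tasks and datasets through pretraining. To this end, the proposed method first uses a locally structured sampling method such as HEALPix to construct a transformer grid from the information of spherical points and their adjacent points, and then transforms the spherical signals into vectors through the grid. By building the Spherical Transformer module, we can use multiple CNN architectures directly. We evaluate our approach on spherical MNIST recognition, 3D object classification, and omnidirectional image semantic segmentation. For 3D object classification, we further propose a rendering-based projection method to improve performance and a rotation-equivariant model to improve robustness to rotation. Experimental results on the three tasks show that our approach achieves superior performance over state-of-the-art methods.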
Point clouds are characterized by irregularity and unstructuredness, which pose challenges in efficient data exploitation and discriminative feature extraction. In this paper, we present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology as a completely regular 2D point geometry image (PGI) structure, in which coordinates of spatial points are captured in colors of image pixels. Intuitively, Flattening-Net implicitly approximates a locally smooth 3D-to-2D surface flattening process while effectively preserving neighborhood consistency. As a generic representation modality, PGI inherently encodes the intrinsic property of the underlying manifold structure and facilitates surface-style point feature aggregation. To demonstrate its potential, we construct a unified learning framework directly operating on PGIs to achieve diverse types of high-level and low-level downstream applications driven by specific task networks, including classification, segmentation, reconstruction, and upsampling. Extensive experiments demonstrate that our methods perform favorably against the current state-of-the-art competitors. We will make the code and data publicly available at https://github.com/keeganhk/Flattening-Net.
A number of problems can be formulated as prediction on graph-structured data. In this work, we generalize the convolution operator from regular grids to arbitrary graphs while avoiding the spectral domain, which allows us to handle graphs of varying size and connectivity. To move beyond a simple diffusion, filter weights are conditioned on the specific edge labels in the neighborhood of a vertex. Together with the proper choice of graph coarsening, we explore constructing deep neural networks for graph classification. In particular, we demonstrate the generality of our formulation in point cloud classification, where we set the new state of the art, and on a graph classification dataset, where we outperform other deep learning approaches. The source code is available at https://github.com/mys007/ecc.
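The phrase "filter weights are conditioned on the specific edge labels" can be illustrated with a small sketch. The code below is not the released ECC implementation: it assumes a purely linear, hypothetical filter generator (`filter_from_label`), mean aggregation over neighbors, and an additive bias, and all names are made up for illustration.

```python
# Illustrative sketch only: the linear filter generator, mean aggregation,
# and every name below are assumptions, not the authors' released code.

def filter_from_label(label, gen_weights, d_out, d_in):
    """Map an edge label to a d_out x d_in filter matrix (linear generator)."""
    flat = [sum(w * l for w, l in zip(row, label)) for row in gen_weights]
    return [flat[r * d_in:(r + 1) * d_in] for r in range(d_out)]


def edge_conditioned_conv(x, neighbors, edge_labels, gen_weights, bias):
    """For each node, average neighbor features after an edge-specific filter."""
    d_in, d_out = len(x[0]), len(bias)
    out = []
    for i in range(len(x)):
        acc = [0.0] * d_out
        for j in neighbors[i]:
            theta = filter_from_label(edge_labels[(j, i)], gen_weights, d_out, d_in)
            for r in range(d_out):
                acc[r] += sum(t * v for t, v in zip(theta[r], x[j]))
        n = max(len(neighbors[i]), 1)  # guard against isolated nodes
        out.append([a / n + b for a, b in zip(acc, bias)])
    return out


# Path graph 0 - 1 - 2 with scalar features; the label is the signed offset j - i.
x = [[1.0], [2.0], [3.0]]
neighbors = [[1], [0, 2], [1]]
edge_labels = {(j, i): [float(j - i)] for i, js in enumerate(neighbors) for j in js}
gen_weights = [[2.0]]  # one row per filter entry (d_out * d_in), one column per label dim
result = edge_conditioned_conv(x, neighbors, edge_labels, gen_weights, bias=[0.0])
```

Because the filter depends on the label, the left and right neighbors of node 1 are weighted with opposite signs, which a plain diffusion (one shared weight for all neighbors) could not express.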
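Seismic phase association connects earthquake arrival time measurements to their causative sources. Effective association must determine the number of discrete events, their locations and origin times, and must distinguish real arrivals from measurement artifacts. The advent of deep learning pickers, which provide high rates of picks from closely overlapping small earthquakes, motivates revisiting the association problem and approaching it with deep learning methods. We have developed a graph neural network associator that simultaneously predicts source space-time localization and discrete source-arrival association likelihoods. The method is applicable to time-varying seismic networks of arbitrary geometry with hundreds of stations, and to high rates of sources and input picks with variable noise and quality. Our Graph Earthquake Neural Interpretation Engine (GENIE) uses one graph to represent the stations and another to represent the spatial source region. GENIE learns relationships from data that enable it to determine robust source and source-arrival associations. We train on synthetic data, and test our method on real data from the Northern California (NC) seismic network using input generated by the PhaseNet deep learning phase picker. We successfully re-detect 96% of all events M>1 reported by the USGS during 500 days between 2000–2022. Over a 100-day continuous interval of processing in 2017–2018, we detect ~4.2x the number of events reported by the USGS. Our new events have magnitude estimates below the magnitude of completeness of the USGS catalog, and are located close to the active faults and quarries in the region. Our results demonstrate that GENIE can effectively solve the association problem under complex seismic monitoring conditions.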
We consider the prediction of interfaces between proteins, a challenging problem with important applications in drug discovery and design, and examine the performance of existing and newly proposed spatial graph convolution operators for this task. By performing convolution over a local neighborhood of a node of interest, we are able to stack multiple layers of convolution and learn effective latent representations that integrate information across the graph that represents the three-dimensional structure of a protein of interest. An architecture that combines the learned features across pairs of proteins is then used to classify pairs of amino acid residues as part of an interface or not. In our experiments, several graph convolution operators yielded accuracy that is better than the state-of-the-art SVM method in this task. (31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.)
Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this survey, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art graph neural networks into four categories, namely recurrent graph neural networks, convolutional graph neural networks, graph autoencoders, and spatial-temporal graph neural networks. We further discuss the applications of graph neural networks across various domains and summarize the open source codes, benchmark data sets, and model evaluation of graph neural networks. Finally, we propose potential research directions in this rapidly growing field.
Intelligent mesh generation (IMG) refers to a technique to generate mesh by machine learning, which is a relatively new and promising research field. Within its short life span, IMG has greatly expanded the generalizability and practicality of mesh generation techniques and brought many breakthroughs and potential possibilities for mesh generation. However, there is a lack of surveys focusing on IMG methods covering recent works. In this paper, we are committed to a systematic and comprehensive survey describing the contemporary IMG landscape. Focusing on 110 preliminary IMG methods, we conducted an in-depth analysis and evaluation from multiple perspectives, including the core technique and application scope of the algorithm, agent learning goals, data types, targeting challenges, advantages and limitations. With the aim of literature collection and classification based on content extraction, we propose three different taxonomies from three views of key technique, output mesh unit element, and applicable input data types. Finally, we highlight some promising future research directions and challenges in IMG. To maximize the convenience of readers, a project page of IMG is provided at https://github.com/xzb030/IMG_Survey.
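Standard spatial convolutions assume input data with a regular neighborhood structure. Existing methods typically generalize convolution to the irregular point cloud domain by fixing a regular "view", for example a fixed neighborhood size, where the convolution kernel size remains the same for every point. However, since point clouds are not as structured as images, a fixed number of neighbors introduces an unfortunate inductive bias. We present a novel graph convolution named Difference Graph Convolution (diffConv), which does not rely on a regular view. diffConv operates on spatially varying and density-dilated neighborhoods, which are further adapted by a learned masked attention mechanism. We validate our model on the ModelNet40 point cloud classification benchmark, obtaining state-of-the-art performance and greater robustness to noise, along with faster inference speed.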
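We introduce a new general-purpose approach to deep learning on 3D surfaces, based on the insight that a simple diffusion layer is highly effective for spatial communication. The resulting networks are automatically robust to changes in the resolution and sampling of a surface, a basic property that is crucial for practical applications. Our networks can be discretized on various geometric representations, such as triangle meshes or point clouds, and can even be trained on one representation and then applied to another. We optimize the spatial support of diffusion as a continuous network parameter, ranging from purely local to totally global, which removes the burden of manually choosing neighborhood sizes. The only other ingredients in the method are a multi-layer perceptron applied independently at each point, and spatial gradient features to support directional filters. The resulting networks are simple, robust, and efficient. Here, we focus primarily on triangle mesh surfaces, and demonstrate state-of-the-art results for a variety of tasks, including surface classification, segmentation, and non-rigid correspondence.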
Unlike images which are represented in regular dense grids, 3D point clouds are irregular and unordered, hence applying convolution on them can be difficult. In this paper, we extend the dynamic filter to a new convolution operation, named PointConv. PointConv can be applied on point clouds to build deep convolutional networks. We treat convolution kernels as nonlinear functions of the local coordinates of 3D points comprised of weight and density functions. With respect to a given point, the weight functions are learned with multi-layer perceptron networks and density functions through kernel density estimation. The most important contribution of this work is a novel reformulation proposed for efficiently computing the weight functions, which allowed us to dramatically scale up the network and significantly improve its performance. The learned convolution kernel can be used to compute translation-invariant and permutation-invariant convolution on any point set in the 3D space. Besides, PointConv can also be used as deconvolution operators to propagate features from a subsampled point cloud back to its original resolution. Experiments on ModelNet40, ShapeNet, and ScanNet show that deep convolutional neural networks built on PointConv are able to achieve state-of-the-art on challenging semantic segmentation benchmarks on 3D point clouds. Besides, our experiments converting CIFAR-10 into a point cloud showed that networks built on PointConv can match the performance of convolutional networks in 2D images of a similar structure.
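State-of-the-art parametric and non-parametric style transfer approaches are prone to either distorted local style patterns due to global statistics alignment, or unpleasing artifacts resulting from patch mismatching. In this paper, we study a novel semi-parametric neural style transfer framework that alleviates the deficiencies of both parametric and non-parametric stylization. The core idea of our approach is to establish accurate and fine-grained content-style correspondences using graph neural networks (GNNs). To this end, we develop an elaborated GNN model with content and style local patches as the graph vertices. The style transfer procedure is then modeled as attention-based heterogeneous message passing between the style and content nodes in a learnable manner, leading to adaptive many-to-one style-content correlations at the local patch level. In addition, an elaborated deformable graph convolutional operation is introduced for cross-scale style-content matching. Experimental results demonstrate that the proposed semi-parametric image stylization approach yields encouraging results on challenging style patterns, preserving both global appearance and exquisite details. Furthermore, by controlling the number of edges at the inference stage, the proposed method also enables novel functionalities such as diversified patch-based stylization with a single model.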
Physically based rendering of complex scenes can be prohibitively costly with a potentially unbounded and uneven distribution of complexity across the rendered image. The goal of an ideal level of detail (LoD) method is to make rendering costs independent of the 3D scene complexity, while preserving the appearance of the scene. However, current prefiltering LoD methods are limited in the appearances they can support due to their reliance on approximate models and other heuristics. We propose the first comprehensive multi-scale LoD framework for prefiltering 3D environments with complex geometry and materials (e.g., the Disney BRDF), while maintaining the appearance with respect to the ray-traced reference. Using a multi-scale hierarchy of the scene, we perform a data-driven prefiltering step to obtain an appearance phase function and directional coverage mask at each scale. At the heart of our approach is a novel neural representation that encodes this information into a compact latent form that is easy to decode inside a physically based renderer. Once a scene is baked out, our method requires no original geometry, materials, or textures at render time. We demonstrate that our approach compares favorably to state-of-the-art prefiltering methods and achieves considerable savings in memory for complex scenes.
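Synthesizing photo-realistic images and videos is at the heart of computer graphics and has been the focus of decades of research. Traditionally, synthetic images of a scene are generated using rendering algorithms such as rasterization or ray tracing, which take representations of geometry and material properties as input. Collectively, these inputs define the actual scene and what is rendered, and are referred to as the scene representation (where a scene consists of one or more objects). Example scene representations are triangle meshes with accompanying textures (e.g., created by an artist), point clouds (e.g., from a depth sensor), volumetric grids (e.g., from a CT scan), or implicit surface functions (e.g., truncated signed distance fields). The reconstruction of such a scene representation from observations using differentiable rendering losses is known as inverse graphics or inverse rendering. Neural rendering is closely related, and combines ideas from classical computer graphics and machine learning to create algorithms for synthesizing images from real-world observations. Neural rendering is a leap forward towards the goal of synthesizing photo-realistic image and video content. In recent years, we have seen immense progress in this field through hundreds of publications that show different ways to inject learnable components into the rendering pipeline. This state-of-the-art report on advances in neural rendering focuses on methods that combine classical rendering principles with learned 3D scene representations, now often referred to as neural scene representations. A key advantage of these methods is that they are 3D-consistent by design, enabling applications such as novel viewpoint synthesis of a captured scene. In addition to methods that handle static scenes, we cover neural scene representations for modeling non-rigid deforming objects...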