We advocate the use of implicit fields for learning generative models of shapes and introduce an implicit field decoder, called IM-NET, for shape generation, aimed at improving the visual quality of the generated shapes. An implicit field assigns a value to each point in 3D space, so that a shape can be extracted as an iso-surface. IM-NET is trained to perform this assignment by means of a binary classifier. Specifically, it takes a point coordinate, along with a feature vector encoding a shape, and outputs a value which indicates whether the point is outside the shape or not. By replacing conventional decoders with our implicit decoder for representation learning (via IM-AE) and shape generation (via IM-GAN), we demonstrate superior results for tasks such as generative shape modeling, interpolation, and single-view 3D reconstruction, particularly in terms of visual quality. Code and supplementary material are available at https://github.com/czq142857/implicit-decoder.
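To make the decoder interface described above concrete, here is a minimal sketch of a function that takes a point coordinate plus a shape feature vector and returns a value in [0, 1]. It is not the IM-NET architecture: the layer sizes, the random untrained weights, the names `make_implicit_decoder` and `decode`, and the choice that values above 0.5 mean "inside" are all assumptions made for illustration.

```python
import numpy as np

def make_implicit_decoder(latent_dim, hidden=32, seed=0):
    """Build a tiny UNTRAINED MLP f(point, code) -> value in [0, 1].

    Hypothetical stand-in for an implicit field decoder; weights are random.
    """
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, size=(3 + latent_dim, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, size=(hidden, 1))
    b2 = np.zeros(1)

    def decode(points, code):
        # points: (N, 3) coordinates; code: (latent_dim,) shape feature vector.
        tiled = np.broadcast_to(code, (points.shape[0], code.shape[0]))
        x = np.concatenate([points, tiled], axis=1)   # one input row per point
        h = np.maximum(x @ w1 + b1, 0.0)              # ReLU hidden layer
        logits = (h @ w2 + b2)[:, 0]
        return 1.0 / (1.0 + np.exp(-logits))          # sigmoid -> [0, 1]

    return decode

latent_dim = 8
decode = make_implicit_decoder(latent_dim)
code = np.zeros(latent_dim)                           # placeholder shape code

# Query the field on a regular 16^3 grid covering [-1, 1]^3.
axis = np.linspace(-1.0, 1.0, 16)
grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)
values = decode(grid, code)

# Assumed convention: values above 0.5 are labelled "inside".
inside_mask = values > 0.5
```

Because the decoder is queried point by point, the sampling resolution is chosen at inference time; a marching-cubes style extractor could then be run on `values` reshaped to the grid to obtain the 0.5 iso-surface.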
Figure 1. Given input as either a 2D image or a 3D point cloud (a), we automatically generate a corresponding 3D mesh (b) and its atlas parameterization (c). We can use the recovered mesh and atlas to apply texture to the output shape (d) as well as 3D print the results (e).
Three-dimensional geometric data offer an excellent domain for studying representation learning and generative modeling. In this paper, we look at geometric data represented as point clouds. We introduce a deep AutoEncoder (AE) network with state-of-the-art reconstruction quality and generalization ability. The learned representations outperform existing methods on 3D recognition tasks and enable shape editing via simple algebraic manipulations, such as semantic part editing, shape analogies and shape interpolation, as well as shape completion. We perform a thorough study of different generative models including GANs operating on the raw point clouds, significantly improved GANs trained in the fixed latent space of our AEs, and Gaussian Mixture Models (GMMs). To quantitatively evaluate generative models we introduce measures of sample fidelity and diversity based on matchings between sets of point clouds. Interestingly, our evaluation of generalization, fidelity and diversity reveals that GMMs trained in the latent space of our AEs yield the best results overall.
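As an illustration of what a distance between two point clouds can look like, the sketch below computes a symmetric Chamfer distance with squared Euclidean nearest-neighbor terms. The abstract only says that the measures are based on matchings between sets of point clouds; the choice of Chamfer distance, the squared-distance variant, and the function name `chamfer_distance` are assumptions made here.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point clouds a (N, 3) and b (M, 3).

    Uses squared Euclidean distances and averages the nearest-neighbor
    terms in each direction (one common variant; others exist).
    """
    # Pairwise squared distances, shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    a_to_b = d2.min(axis=1).mean()   # each point in a to its nearest in b
    b_to_a = d2.min(axis=0).mean()   # each point in b to its nearest in a
    return a_to_b + b_to_a

cloud_a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
cloud_b = cloud_a + np.array([0.0, 0.0, 1.0])   # same cloud shifted by 1 along z

same = chamfer_distance(cloud_a, cloud_a)       # identical clouds
shifted = chamfer_distance(cloud_a, cloud_b)    # unit shift
```

With a per-pair distance like this, set-level measures can be built by matching every generated cloud to its nearest reference cloud, or the reverse.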
Figure 1: DeepSDF represents signed distance functions (SDFs) of shapes via latent code-conditioned feed-forward decoder networks. Above images are raycast renderings of DeepSDF interpolating between two shapes in the learned shape latent space. Best viewed digitally.
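The idea of a latent-code-conditioned signed distance function, and of interpolating between two shapes in latent space, can be sketched with a toy decoder. Here the "decoder" is an analytic sphere SDF rather than a trained network, the 4-vector code (center and radius) is a hypothetical stand-in for a learned latent code, and negative values are taken to mean "inside" by convention.

```python
import numpy as np

def toy_sdf_decoder(points, code):
    """Toy stand-in for a learned decoder: code = (cx, cy, cz, radius).

    Returns the signed distance to a sphere: negative inside, zero on the
    surface, positive outside (sign convention assumed here).
    """
    center, radius = code[:3], code[3]
    return np.linalg.norm(points - center, axis=1) - radius

def interpolate_codes(code_a, code_b, t):
    """Linear interpolation between two latent codes, t in [0, 1]."""
    return (1.0 - t) * code_a + t * code_b

code_a = np.array([0.0, 0.0, 0.0, 0.5])   # small sphere
code_b = np.array([0.0, 0.0, 0.0, 1.0])   # large sphere
code_mid = interpolate_codes(code_a, code_b, 0.5)

probe = np.array([[0.0, 0.0, 0.0],        # center
                  [0.75, 0.0, 0.0],       # on the interpolated surface
                  [2.0, 0.0, 0.0]])       # far outside
sdf_mid = toy_sdf_decoder(probe, code_mid)
```

The surface of each intermediate shape is the zero level set of the decoded function, so rendering an interpolation amounts to decoding a sequence of blended codes.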
Generative models, an important family of statistical models, aim to learn the observed data distribution by generating new instances. With the rise of neural networks, deep generative models, such as variational autoencoders (VAEs) and generative adversarial networks (GANs), have made tremendous progress in 2D image synthesis. Recently, researchers have shifted their attention from 2D to 3D space, since 3D data aligns better with our physical world and hence holds great practical potential. However, unlike a 2D image, which has an efficient natural representation (i.e., the pixel grid), 3D data is far more challenging to represent. Concretely, an ideal 3D representation should be expressive enough to model shapes and appearances in detail, and efficient enough to model high-resolution data quickly and with low memory cost. However, existing 3D representations, such as point clouds, meshes, and recent neural fields, usually fail to meet these requirements simultaneously. In this survey, we thoroughly review the development of 3D generation, including 3D shape generation and 3D-aware image synthesis, from the perspectives of both algorithms and, more importantly, representations. We hope that our discussion helps the community track the evolution of this field and sparks innovative ideas to advance this challenging task.
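Reconstructing 3D shapes from single-view images has been a long-standing research problem. In this paper, we present DISN, a Deep Implicit Surface Network that can generate a high-quality, detail-rich 3D mesh from a 2D image by predicting the underlying signed distance field. In addition to utilizing global image features, DISN predicts the projected location of each 3D point on the 2D image and extracts local features from the image feature maps. Combining global and local features significantly improves the accuracy of the signed distance field prediction, especially for detail-rich areas. To the best of our knowledge, DISN is the first method that consistently captures details such as holes and thin structures present in 3D shapes from single-view images. DISN achieves state-of-the-art single-view reconstruction performance on a variety of shape categories reconstructed from both synthetic and real images. Code is available at https://github.com/xharlie/disn and the supplementary material can be found at https://xharlie.github.io/images/neUrips_2019_Supp.pdf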
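The step of projecting each 3D point onto the image and reading a local feature at that location can be sketched as follows. This is an illustration under stated assumptions, not the DISN pipeline: the camera is a simple pinhole with known `focal`, `cx`, `cy` (hypothetical parameters), points are already in camera coordinates, and features are read with bilinear interpolation from a single `(H, W, C)` map.

```python
import numpy as np

def project_points(points_cam, focal, cx, cy):
    """Pinhole projection of (N, 3) camera-space points to pixel (u, v)."""
    x, y, z = points_cam[:, 0], points_cam[:, 1], points_cam[:, 2]
    return np.stack([focal * x / z + cx, focal * y / z + cy], axis=1)

def sample_bilinear(feat, uv):
    """Bilinearly sample an (H, W, C) feature map at (N, 2) pixel coords.

    u indexes columns, v indexes rows; coordinates are clamped to the map.
    """
    h, w, _ = feat.shape
    u = np.clip(uv[:, 0], 0, w - 1)
    v = np.clip(uv[:, 1], 0, h - 1)
    u0, v0 = np.floor(u).astype(int), np.floor(v).astype(int)
    u1, v1 = np.minimum(u0 + 1, w - 1), np.minimum(v0 + 1, h - 1)
    du, dv = (u - u0)[:, None], (v - v0)[:, None]
    top = feat[v0, u0] * (1 - du) + feat[v0, u1] * du
    bottom = feat[v1, u0] * (1 - du) + feat[v1, u1] * du
    return top * (1 - dv) + bottom * dv

# Synthetic 8x8 feature map whose two channels store (column, row),
# so a correct sampler returns the query coordinates themselves.
rows, cols = np.meshgrid(np.arange(8.0), np.arange(8.0), indexing="ij")
feature_map = np.stack([cols, rows], axis=-1)

points = np.array([[0.0, 0.0, 2.0], [0.5, -0.25, 2.0]])
uv = project_points(points, focal=4.0, cx=3.5, cy=3.5)
local_feat = sample_bilinear(feature_map, uv)

# Per-point input to a distance predictor: global feature + local feature.
global_feat = np.ones(4)                                  # placeholder vector
combined = np.concatenate(
    [np.broadcast_to(global_feat, (len(points), 4)), local_feat], axis=1)
```

The local feature varies with where the point lands in the image, while the global feature is shared by all points; concatenating them gives the predictor both kinds of context.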
Point cloud completion is the task of generating and estimating a complete point cloud from a partial one, and it plays a vital role in 3D computer vision applications. Progress in deep learning (DL) has markedly improved the capability and robustness of point cloud completion. However, the quality of completed point clouds still needs to be improved to meet practical requirements. This work therefore presents a comprehensive survey of existing methods, including point-based, convolution-based, graph-based, and generative model-based approaches. The survey compares these methods to stimulate further research. It also summarizes the commonly used datasets and illustrates applications of point cloud completion. Finally, we discuss possible research trends in this rapidly expanding field.
Computer graphics, 3D computer vision and robotics communities have produced multiple approaches to represent and generate 3D shapes, as well as a vast number of use cases. However, single-view reconstruction remains a challenging topic that can unlock various interesting use cases such as interactive design. In this work, we propose a novel framework that leverages the intermediate latent spaces of Vision Transformer (ViT) and a joint image-text representational model, CLIP, for fast and efficient Single View Reconstruction (SVR). More specifically, we propose a novel mapping network architecture that learns a mapping between deep features extracted from ViT and CLIP, and the latent space of a base 3D generative model. Unlike previous work, our method enables view-agnostic reconstruction of 3D shapes, even in the presence of large occlusions. We use the ShapeNetV2 dataset and perform extensive experiments with comparisons to SOTA methods to demonstrate our method's effectiveness.
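This paper presents a new approach to 3D shape generation that enables direct generative modeling on a continuous implicit representation in the wavelet domain. Specifically, we propose a compact wavelet representation with a pair of coarse and detail coefficient volumes that implicitly represents 3D shapes via truncated signed distance functions and multi-scale biorthogonal wavelets, and we formulate a pair of neural networks: a generator based on the diffusion model that produces diverse shapes in the form of coarse coefficient volumes, and a detail predictor that further produces compatible detail coefficient volumes to enrich the generated shapes with fine structures and details. Both quantitative and qualitative experimental results demonstrate the advantages of our approach in generating diverse, high-quality shapes with complex topology and structures, clean surfaces, and fine details, exceeding the 3D generation capabilities of state-of-the-art models.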
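Recent advances in localized implicit functions have enabled neural implicit representations to scale to large scenes. However, the regular subdivision of 3D space employed by these approaches fails to take into account the sparsity of surface occupancy and the varying granularity of geometric details. As a result, their memory footprint grows cubically with the input volume, leading to a prohibitive computational cost even at a moderately dense decomposition. In this work, we present a learnable hierarchical implicit representation for 3D surfaces, called OctField, that allows high-precision encoding of intricate surfaces with a low memory and computational budget. The key to our approach is an adaptive decomposition of 3D scenes that distributes local implicit functions only around the surface of interest. We achieve this by introducing a hierarchical octree structure that adaptively subdivides 3D space according to the surface occupancy and the richness of part geometry. As the octree is discrete and non-differentiable, we further propose a novel hierarchical network that models the subdivision of octree cells as a probabilistic process and recursively encodes and decodes both the octree structure and the surface geometry in a differentiable manner. We demonstrate the value of OctField on a range of shape modeling and reconstruction tasks, showing superiority over alternative approaches.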
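We present a StyleGAN2-based deep learning approach for 3D shape generation, called SDF-StyleGAN, with the aim of reducing the visual and geometric dissimilarity between generated shapes and a shape collection. We extend StyleGAN2 to 3D generation, use the implicit signed distance function (SDF) as the 3D shape representation, and introduce two novel global and local shape discriminators that distinguish real and fake SDF values and gradients to significantly improve shape geometry and visual quality. We further complement the evaluation metrics of 3D generative models with shading-image-based Fréchet inception distance (FID) scores to better assess the visual quality and shape distribution of the generated shapes. Experiments on shape generation demonstrate the superior performance of SDF-StyleGAN over the state of the art. We further demonstrate the efficacy of SDF-StyleGAN in various tasks based on GAN inversion, including shape reconstruction, shape completion from partial point clouds, single-view image-based shape generation, and shape style editing. Extensive ablation studies justify the efficacy of our framework design. Our code and trained models are available at https://github.com/zhengxinyang/sdf-stylegan.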
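This paper presents a novel framework called DT-Net for 3D mesh reconstruction and generation via disentangled topology. Going beyond previous work, we learn a topology-aware neural template specific to each input and then deform the template to reconstruct a detailed mesh while preserving the learned topology. One key insight is to decouple complex mesh reconstruction into two sub-tasks: topology formulation and shape deformation. Thanks to this decoupling, DT-Net implicitly learns a disentangled representation of topology and shape in the latent space. Hence, it enables novel disentangled controls that support various shape generation applications, e.g., remixing the topologies of 3D objects, which previous reconstruction works cannot achieve. Extensive experimental results demonstrate that our method produces high-quality meshes, particularly with diverse topologies, compared with state-of-the-art methods.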
Intelligent mesh generation (IMG) refers to a technique to generate mesh by machine learning, which is a relatively new and promising research field. Within its short life span, IMG has greatly expanded the generalizability and practicality of mesh generation techniques and brought many breakthroughs and potential possibilities for mesh generation. However, there is a lack of surveys focusing on IMG methods covering recent works. In this paper, we are committed to a systematic and comprehensive survey describing the contemporary IMG landscape. Focusing on 110 preliminary IMG methods, we conducted an in-depth analysis and evaluation from multiple perspectives, including the core technique and application scope of the algorithm, agent learning goals, data types, targeting challenges, advantages and limitations. With the aim of literature collection and classification based on content extraction, we propose three different taxonomies from three views of key technique, output mesh unit element, and applicable input data types. Finally, we highlight some promising future research directions and challenges in IMG. To maximize the convenience of readers, a project page of IMG is provided at https://github.com/xzb030/IMG_Survey.
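Recent advances in deep generative models have led to immense progress in 3D shape synthesis. While existing models are able to synthesize shapes represented as voxels, point clouds, or implicit functions, these methods only indirectly enforce the plausibility of the final 3D shape surface. Here we present a 3D shape synthesis framework (SurfGen) that applies adversarial training directly to the object surface. Our approach uses a differentiable spherical projection layer to capture and represent the explicit zero isosurface of an implicit 3D generator as functions defined on the unit sphere. By processing the spherical representation of 3D object surfaces with a spherical CNN in an adversarial setting, our generator can better learn the statistics of natural shape surfaces. We evaluate our model on large-scale shape datasets and demonstrate that the end-to-end trained model is capable of generating high-fidelity 3D shapes with diverse topology.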
With the advent of deep neural networks, learning-based approaches for 3D reconstruction have gained popularity. However, unlike for images, in 3D there is no canonical representation which is both computationally and memory efficient yet allows for representing high-resolution geometry of arbitrary topology. Many of the state-of-the-art learning-based 3D reconstruction approaches can hence only represent very coarse 3D geometry or are limited to a restricted domain. In this paper, we propose Occupancy Networks, a new representation for learning-based 3D reconstruction methods. Occupancy networks implicitly represent the 3D surface as the continuous decision boundary of a deep neural network classifier. In contrast to existing approaches, our representation encodes a description of the 3D output at infinite resolution without excessive memory footprint. We validate that our representation can efficiently encode 3D structure and can be inferred from various kinds of input. Our experiments demonstrate competitive results, both qualitatively and quantitatively, for the challenging tasks of 3D reconstruction from single images, noisy point clouds and coarse discrete voxel grids. We believe that occupancy networks will become a useful tool in a wide variety of learning-based 3D tasks.
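We introduce UNIST, the first deep neural implicit model for general-purpose, unpaired shape-to-shape translation, in both 2D and 3D domains. Our model is built on autoencoding implicit fields rather than on point clouds, which represent the state of the art. Furthermore, our translation network is trained to perform the task over a latent grid representation that combines the merits of latent-space processing and position awareness, which not only enables drastic shape transforms but also preserves spatial features and fine local details for natural shape translations. With the same network architecture, and dictated only by the input domain pairs, our model can learn both style-preserving content alteration and content-preserving style transfer. We demonstrate the generality and quality of the translation results and compare them to well-known baselines.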
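As several industries move towards modeling massive 3D virtual worlds, the need for content creation tools that can scale in terms of the quantity, quality, and diversity of 3D content is becoming evident. In our work, we aim to train performant 3D generative models that synthesize textured meshes which can be directly consumed by 3D rendering engines and are thus immediately usable in downstream applications. Prior works on 3D generative modeling either lack geometric details, are limited in the mesh topology they can produce, typically do not support textures, or use neural renderers in the synthesis process, which makes their use in common 3D software non-trivial. In this work, we introduce GET3D, a generative model that directly generates explicit textured 3D meshes with complex topology, rich geometric details, and high-fidelity textures. We bridge recent successes in differentiable surface modeling, differentiable rendering, and 2D generative adversarial networks to train our model from 2D image collections. GET3D is able to generate high-quality 3D textured meshes, ranging from cars, chairs, animals, motorbikes, and human characters to buildings, achieving significant improvements over previous methods.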
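Existing generative models for 3D shapes are typically trained on a large 3D dataset, often of a specific object category. In this paper, we investigate a deep generative model that learns from only a single reference 3D shape. Specifically, we present a multi-scale GAN-based model designed to capture the geometric features of the input shape across a range of spatial scales. To avoid the large memory and computational cost of operating on a 3D volume, we build our generator on a tri-plane hybrid representation, which requires only 2D convolutions. We train our generative model on a voxel pyramid of the reference shape, without any external supervision or manual annotation. Once trained, our model can generate diverse, high-quality 3D shapes of different sizes and aspect ratios. The resulting shapes present variations across different scales while retaining the global structure of the reference shape. Through extensive qualitative and quantitative evaluation, we demonstrate that our model can generate 3D shapes of various types.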
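The memory argument for a tri-plane representation can be illustrated with a small sketch: a 3D point is projected onto three axis-aligned feature planes, one feature is read from each, and the three are combined. Everything specific here is an assumption for illustration (the function name `query_triplane`, nearest-neighbor lookup, summation as the combination rule, points normalized to [-1, 1]); the abstract does not specify these details.

```python
import numpy as np

def query_triplane(planes, points):
    """Look up per-point features from three axis-aligned feature planes.

    planes: dict with keys "xy", "xz", "yz", each an (R, R, C) array.
    points: (N, 3) coordinates in [-1, 1]^3.
    Uses nearest-neighbor lookup and sums the three features (assumed rule).
    """
    res = planes["xy"].shape[0]
    # Map [-1, 1] to integer cell indices in [0, res - 1].
    idx = np.rint((points + 1.0) * 0.5 * (res - 1)).astype(int)
    idx = np.clip(idx, 0, res - 1)
    ix, iy, iz = idx[:, 0], idx[:, 1], idx[:, 2]
    return planes["xy"][ix, iy] + planes["xz"][ix, iz] + planes["yz"][iy, iz]

res, channels = 32, 4
planes = {
    "xy": np.full((res, res, channels), 1.0),   # constant planes make the
    "xz": np.full((res, res, channels), 2.0),   # result easy to check by hand
    "yz": np.full((res, res, channels), 3.0),
}
points = np.random.default_rng(0).uniform(-1.0, 1.0, size=(10, 3))
features = query_triplane(planes, points)

# Storage comparison at the same resolution and channel count.
triplane_floats = 3 * res * res * channels
dense_grid_floats = res ** 3 * channels
```

Storage for the three planes grows with the square of the resolution, whereas a dense feature volume grows with its cube, and each plane can be produced by ordinary 2D convolutions.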
In recent years, substantial progress has been achieved in learning-based reconstruction of 3D objects. At the same time, generative models were proposed that can generate highly realistic images. However, despite this success in these closely related tasks, texture reconstruction of 3D objects has received little attention from the research community and state-of-the-art methods are either limited to comparably low resolution or constrained experimental setups. A major reason for these limitations is that common representations of texture are inefficient or hard to interface for modern deep learning techniques. In this paper, we propose Texture Fields, a novel texture representation which is based on regressing a continuous 3D function parameterized with a neural network. Our approach circumvents limiting factors like shape discretization and parameterization, as the proposed texture representation is independent of the shape representation of the 3D object. We show that Texture Fields are able to represent high frequency texture and naturally blend with modern deep learning techniques. Experimentally, we find that Texture Fields compare favorably to state-of-the-art methods for conditional texture reconstruction of 3D objects and enable learning of probabilistic generative models for texturing unseen 3D models. We believe that Texture Fields will become an important building block for the next generation of generative 3D models.