We present PhoMoH, a neural network methodology to construct generative models of photorealistic 3D geometry and appearance of human heads including hair, beards, clothing and accessories. In contrast to prior work, PhoMoH models the human head using neural fields, thus supporting complex topology. Instead of learning a head model from scratch, we propose to augment an existing expressive head model with new features. Concretely, we learn a highly detailed geometry network layered on top of a mid-resolution head model together with a detailed, local geometry-aware, and disentangled color field. Our proposed architecture allows us to learn photorealistic human head models from relatively little data. The learned generative geometry and appearance networks can be sampled individually and allow the creation of diverse and realistic human heads. Extensive experiments validate our method qualitatively and across different metrics.
We introduce Structured 3D Features, a model based on a novel implicit 3D representation that pools pixel-aligned image features onto dense 3D points sampled from a parametric, statistical human mesh surface. The 3D points have associated semantics and can move freely in 3D space. This allows for optimal coverage of the person of interest, beyond just the body shape, which in turn helps model accessories, hair, and loose clothing. Owing to this, we present a complete 3D transformer-based attention framework which, given a single image of a person in an unconstrained pose, generates an animatable 3D reconstruction with albedo and illumination decomposition, using a single end-to-end model, trained in a semi-supervised manner, with no additional postprocessing. We show that our S3F model surpasses the previous state-of-the-art on various tasks, including monocular 3D reconstruction, as well as albedo and shading estimation. Moreover, we show that the proposed methodology allows novel view synthesis, relighting, and re-posing of the reconstruction, and can naturally be extended to handle multiple input images (e.g. different views of a person, or the same view in different poses, as in video). Finally, we demonstrate the editing capabilities of our model for 3D virtual try-on applications.
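We present neural radiance fields for rendering and temporal (4D) reconstruction of humans, captured by a sparse set of cameras or even from monocular video. Our approach combines ideas from neural scene representation, novel view synthesis, and implicit statistical geometric human representations, coupled using novel loss functions. Instead of learning a radiance field with a uniform occupancy prior, we constrain it with a structured implicit human body model represented using signed distance functions. This allows us to robustly fuse information from sparse views and to generalize beyond the poses or views observed in training. Moreover, we apply geometric constraints to jointly learn the structure of the observed subject, including both body and clothing, and to regularize the radiance field toward geometrically plausible solutions. Extensive experiments on multiple datasets demonstrate the robustness and accuracy of our approach, its ability to generalize well beyond the observed set of poses and views, and its statistical extrapolation beyond the observed shape.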
Recent breakthroughs in machine learning (ML) and deep learning (DL) have enabled many new capabilities across a wide range of application domains. While most existing machine learning models require large amounts of memory and computing power, efforts have been made to deploy some models on resource-constrained devices as well. Several systems perform inference on the device, while direct training on the device remains a challenge. On-device training, however, is attracting more and more interest because: (1) it enables training models on local data without needing to share data over the cloud, thus enabling privacy-preserving computation by design; (2) models can be refined on devices to provide personalized services and cope with model drift in order to adapt to changes in the real-world environment; and (3) it enables the deployment of models in remote, hard-to-reach locations or places without stable internet connectivity. We summarize and analyze state-of-the-art systems research to provide the first survey of on-device training from a systems perspective.
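Natural language processing (NLP) is increasingly used to provide adaptivity in educational applications. However, recent research has highlighted a variety of biases in pre-trained language models. While existing studies investigate bias in different domains, they are limited in addressing fine-grained analysis of educational and multilingual corpora. In this work, we analyze bias across text and through multiple architectures on a corpus of 9,165 German peer reviews collected from university students over five years. Notably, our corpus includes labels such as helpfulness, quality, and critical aspect ratings from the peer-review recipient, as well as demographic attributes. We conduct a Word Embedding Association Test (WEAT) analysis on (1) our collected corpus in connection with the clustered labels, (2) the most common pre-trained German language models (T5, BERT, and GPT-2) and GloVe embeddings, and (3) the language models after fine-tuning on our collected dataset. Contrary to our initial expectations, we find that our collected corpus does not reveal many biases in the co-occurrence analysis or in the GloVe embeddings. However, the pre-trained German language models exhibit substantial conceptual, racial, and gender bias, and their bias along the conceptual and racial axes changes significantly during fine-tuning on the peer-review data. With our research, we aim to contribute to the fourth UN Sustainable Development Goal (quality education) through a novel dataset, an understanding of biases in natural language education data, and an account of the potential harms of not counteracting biases in language models for educational tasks.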
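As a rough illustration of the kind of quantity a WEAT analysis reports, the sketch below computes a WEAT-style effect size between two target sets and two attribute sets of embedding vectors. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names and the toy 2-D vectors are hypothetical, and the use of the sample standard deviation as the normalizer is one common convention that the abstract does not specify.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, attrs_a, attrs_b):
    """Mean similarity of w to attribute set A minus its mean similarity to B."""
    sim_a = mean(cosine(w, a) for a in attrs_a)
    sim_b = mean(cosine(w, b) for b in attrs_b)
    return sim_a - sim_b


def weat_effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """Difference of mean associations of X and Y, normalized by the
    sample standard deviation over all target words (assumed convention)."""
    s_x = [association(x, attrs_a, attrs_b) for x in targets_x]
    s_y = [association(y, attrs_a, attrs_b) for y in targets_y]
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Hypothetical toy embeddings, not taken from any real model or corpus.
X = [[1.0, 0.1], [0.9, 0.2]]   # target set X, close to attribute A
Y = [[0.1, 1.0], [0.2, 0.9]]   # target set Y, close to attribute B
A = [[1.0, 0.0]]               # attribute set A
B = [[0.0, 1.0]]               # attribute set B

effect = weat_effect_size(X, Y, A, B)
print(f"WEAT-style effect size: {effect:.3f}")
```

A positive value indicates that the X targets sit closer to attribute set A than the Y targets do; swapping X and Y flips the sign. With real models, the vectors would come from the embedding layer or pooled hidden states, which is a further design choice not covered here.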
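Recent backscatter communication techniques enable ultra-low-power wireless devices that operate without batteries while interoperating directly with unmodified commodity wireless devices. Commodity devices cooperate in providing the unmodulated carrier that battery-free nodes need to communicate while harvesting energy from their environment to perform sensing, computation, and communication tasks. The optimal provision of the unmodulated carrier limits the size of the network because it is an NP-hard combinatorial optimization problem. Consequently, previous works either ignore carrier optimization altogether or resort to suboptimal heuristics, wasting valuable energy and spectral resources. We present DeepGantt, a deep learning scheduler for battery-free devices interoperating with wireless commodity devices. DeepGantt leverages graph neural networks to overcome the variable input and output size challenges inherent to this problem. We train our deep learning scheduler with optimal schedules of relatively small size obtained from a constraint optimization solver. DeepGantt not only outperforms a carefully crafted heuristic solution but also performs within 3% of the optimal scheduler on the problem sizes it was trained on. Finally, DeepGantt generalizes to problems more than four times larger than the largest used for training, thereby breaking the scalability limitations of the optimal scheduler and paving the way for more efficient backscatter networks.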