Smart Paper Notes

Pixel2ISDF: Implicit Signed Distance Fields based Human Body Model from Multi-view and Multi-pose Images

Jianchuan Chen , Wentao Yi , Tiantian Wang , Xing Li , Liqian Ma , Yangyu Fan , Huchuan Lu

Category: Computer Vision

2022-12-06

In this report, we focus on reconstructing clothed humans in the canonical space, given multiple views and poses of a human as input. To achieve this, we use the geometric prior of the SMPLX model in the canonical space to learn an implicit representation for geometry reconstruction. Based on the observation that the posed mesh and the mesh in the canonical space share the same topology, we propose to learn latent codes on the posed mesh from multiple input images and then assign those latent codes to the mesh in the canonical space. Specifically, we first use normal and geometry networks to extract a feature vector for each vertex on the SMPLX mesh. Normal maps are adopted because they generalize better to unseen images than 2D images do. Then, the features for each vertex on the posed mesh from multiple images are integrated by MLPs. The integrated features, acting as the latent code, are anchored to the SMPLX mesh in the canonical space. Finally, the latent code for each 3D point is extracted and used to compute the SDF. Our work on reconstructing the human shape in the canonical pose ranks 3rd in the WCPA MVP-Human Body Challenge.

translated by Google Translate

FreeEnricher: Enriching Face Landmarks without Additional Cost

Yangyu Huang , Xi Chen , Jongyoo Kim , Hao Yang , Chong Li , Jiaolong Yang , Dong Chen

Category: Computer Vision

2022-12-19

Recent years have witnessed significant progress in face alignment. Though dense facial landmarks are in high demand in various scenarios, e.g., cosmetic medicine and facial beautification, most works only consider sparse face alignment. To address this problem, we present a framework that enriches landmark density using existing sparse landmark datasets, e.g., 300W with 68 points and WFLW with 98 points. First, we observe that the local patches along each semantic contour are highly similar in appearance. Then, we propose a weakly-supervised idea: learn the refinement ability on the original sparse landmarks and adapt this ability to the enriched dense landmarks. Several operators are devised and organized together to implement the idea. Finally, the trained model is applied as a plug-and-play module to existing face alignment networks. To evaluate our method, we manually label dense landmarks on the 300W test set. Our method yields state-of-the-art accuracy not only on the newly constructed dense 300W test set but also on the original sparse 300W and WFLW test sets, without additional cost.

BlindFL: Vertical Federated Machine Learning without Peeking into Your Data

Fangcheng Fu , Huanran Xue , Yong Cheng , Yangyu Tao , Bin Cui

Category: Machine Learning

2022-06-16

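Due to rising concerns about privacy protection, how to build machine learning (ML) models over different data sources with security guarantees is gaining popularity. Vertical federated learning (VFL) describes such a case, where ML models are built upon the private data of different participating parties that own disjoint features for the same set of instances, which fits many real-world collaborative tasks. However, we find that existing solutions for VFL either support limited kinds of input features or suffer from potential data leakage during federated execution. To this end, this paper investigates both the functionality and the security of ML models in the VFL scenario. Specifically, we introduce BlindFL, a novel framework for VFL training and inference. First, to address the functionality of VFL models, we propose federated source layers to unite the data from different parties. The federated source layers efficiently support various kinds of features, including dense, sparse, numerical, and categorical features. Second, we carefully analyze security during federated execution and formalize the privacy requirements. Based on this analysis, we devise secure and accurate algorithm protocols and further prove their security guarantees under the ideal-real simulation paradigm. Extensive experiments show that BlindFL supports diverse datasets and models while achieving robust privacy guarantees.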

Graph Attention Multi-Layer Perceptron

Wentao Zhang , Ziqi Yin , Zeang Sheng , Yang Li , Wen Ouyang , Xiaosen Li , Yangyu Tao , Zhi Yang , Bin Cui

Category: Machine Learning | Artificial Intelligence

2022-06-09

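Graph neural networks (GNNs) have achieved great success in many graph-based applications. However, the enormous size and high sparsity of graphs hinder their application in industrial scenarios. Although some scalable GNNs have been proposed for large-scale graphs, they adopt a fixed $K$-hop neighborhood for each node and therefore face the over-smoothing issue when large propagation depths are used for nodes within sparse regions. To tackle this issue, we propose a new GNN architecture, Graph Attention Multi-Layer Perceptron (GAMLP), which can capture the underlying correlations between different scales of graph knowledge. We have deployed GAMLP with the Angel platform, and we further evaluate GAMLP on both real-world datasets and large-scale industrial datasets. Extensive experiments on these 14 graph datasets demonstrate that GAMLP achieves state-of-the-art performance while enjoying high scalability and efficiency. Specifically, it achieves a 1.3% gain in predictive accuracy on our large-scale Tencent Video dataset while delivering up to a $50\times$ training speedup. In addition, it ranks first on the leaderboards of the largest homogeneous and heterogeneous graphs of the Open Graph Benchmark (i.e., ogbn-papers100M and ogbn-mag).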

Learning Robust and Lightweight Model through Separable Structured Transformations

Yanhui Huang , Yangyu Xu , Xian Wei

Category: Computer Vision

2021-12-27

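With the proliferation of mobile devices and the Internet of Things, deep learning models are increasingly deployed on devices with limited computing resources and memory, and are exposed to the threat of adversarial noise. Such devices need deep models that are both lightweight and robust. However, current deep learning solutions struggle to learn a model that has both properties without degrading one or the other. It is well known that fully-connected layers contribute most of the parameters of convolutional neural networks. We apply a separable structured transformation to the fully-connected layer to reduce its parameters: the large weight matrix of the fully-connected layer is decoupled into the tensor product of several small separable matrices. Note that data such as images no longer need to be flattened before being fed to the fully-connected layer, which preserves the valuable spatial geometric information of the data. Moreover, to further improve both lightness and robustness, we propose a joint constraint of sparsity and a differentiable condition number, imposed on these separable matrices. We evaluate the proposed approach on MLP, VGG-16, and Vision Transformer. Experimental results on datasets such as ImageNet, SVHN, CIFAR-100, and CIFAR10 show that we reduce the number of network parameters by 90% while the loss in robust accuracy is less than 1.5%, which is better than the SOTA methods based on the original fully-connected layer. Interestingly, the approach retains an overwhelming advantage even at a high compression ratio, e.g., 200 times.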
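
The factorization described in this abstract can be illustrated with a small sketch. It assumes only two factor matrices, A and B (the abstract speaks of several), and uses the standard identity that applying the Kronecker product of A and B to a row-major flattened input X gives the same result as computing A X B^T on the unflattened input. All function names here are hypothetical and chosen for illustration; this is not the authors' implementation, and the sparsity and condition-number constraints are not modeled.

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Plain matrix product of two lists-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(row) for row in zip(*m)]


def separable_layer(x, a, b):
    """Compute A X B^T directly on the 2-D input, without flattening it."""
    return matmul(matmul(a, x), transpose(b))


def dense_layer(x, a, b):
    """Same linear map, but through the full (A kron B) weight on flattened X."""
    weight = kron(a, b)
    flat = [v for row in x for v in row]  # row-major flattening
    out = [sum(w * v for w, v in zip(row, flat)) for row in weight]
    cols = len(b)  # the output has len(a) rows and len(b) columns
    return [out[i:i + cols] for i in range(0, len(out), cols)]


def param_counts(m, n, p, q):
    """Parameters of the full weight vs. the two factors A (m x n) and B (p x q)."""
    return (m * p) * (n * q), m * n + p * q


if __name__ == "__main__":
    a = [[1, 2], [0, 1], [3, 1]]   # 3 x 2 factor
    b = [[1, 1, 0], [0, 2, 1]]     # 2 x 3 factor
    x = [[1, 0, 2], [4, 1, 0]]     # 2 x 3 input, kept two-dimensional
    print(separable_layer(x, a, b) == dense_layer(x, a, b))
    print(param_counts(64, 64, 64, 64))
```

The two layers compute the same output, but the separable form stores only the two small factors, which is where the parameter saving comes from.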

K-Core Decomposition on Super Large Graphs with Limited Resources

Shicheng Gao , Jie Xu , Xiaosen Li , Fangcheng Fu , Wentao Zhang , Wen Ouyang , Yangyu Tao , Bin Cui

Category: Machine Learning

2021-12-26

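K-core decomposition is a commonly used metric for analyzing graph structure or studying the relative importance of nodes in complex graphs. In recent years, the scale of graphs has grown rapidly, especially in industrial settings. For example, our industrial partner runs popular social applications with billions of users and is able to gather rich user data. As a result, applying K-core decomposition to large graphs has attracted more and more attention from academia and industry. A simple but effective way to handle large graphs is to process them in a distributed setting, and several distributed K-core decomposition algorithms have been proposed. Despite their effectiveness, we observe both experimentally and theoretically that these algorithms consume too many resources and become unstable on super-large-scale graphs, especially when the given resources are limited. In this paper, we deal with such super-large-scale graphs and propose a divide-and-conquer strategy on top of the distributed K-core decomposition algorithm. We evaluate our approach on three large graphs. The experimental results show that resource consumption can be significantly reduced and that computation on large-scale graphs becomes more stable than with existing methods. For example, with our divide-and-conquer technique, the distributed K-core decomposition algorithm can scale to a large graph with 136 billion edges without losing correctness.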
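
For readers unfamiliar with the metric itself, the following is a minimal single-machine sketch of K-core decomposition using the classic peeling procedure: repeatedly remove a node of minimum remaining degree and record the largest such minimum seen so far as its core number. It is not the distributed algorithm or the divide-and-conquer strategy from the paper, and the function name `core_numbers` is a hypothetical choice. The linear scan for the minimum keeps the code short at the cost of quadratic running time.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core number} for an undirected graph given as edge pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adjacency[u].add(v)
            adjacency[v].add(u)

    degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Peel off a node with the smallest remaining degree.
        node = min(remaining, key=degree.__getitem__)
        # The core number never decreases along the peeling order.
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for neighbor in adjacency[node]:
            if neighbor in remaining:
                degree[neighbor] -= 1
    return core


if __name__ == "__main__":
    # A triangle (a, b, c) with one pendant node d attached to c.
    print(core_numbers([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]))
```

In the example, the triangle nodes end up in the 2-core while the pendant node only belongs to the 1-core.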

HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework

Xupeng Miao , Hailin Zhang , Yining Shi , Xiaonan Nie , Zhi Yang , Yangyu Tao , Bin Cui

Category: Machine Learning

2021-12-14

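Embedding models are an effective learning paradigm for high-dimensional data. However, one open issue of embedding models is that their representations (latent factors) often result in a large parameter space. We observe that existing distributed training frameworks face a scalability issue with embedding models, because updating and retrieving the shared embedding parameters from servers usually dominates the training cycle. In this paper, we propose HET, a new system framework that significantly improves the scalability of huge embedding model training. We embrace the skewed popularity distribution of embeddings as a performance opportunity and leverage it to address the communication bottleneck with an embedding cache. To ensure consistency across the caches, we incorporate a new consistency model into the HET design, which provides fine-grained consistency guarantees on a per-embedding basis. Compared with previous work that only allows staleness for read operations, HET also exploits staleness for write operations. Evaluations on six representative tasks show that HET achieves up to 88% reduction in embedding communication and up to 20.68x performance speedup over state-of-the-art baselines.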

General Facial Representation Learning in a Visual-Linguistic Manner

Yinglin Zheng , Hao Yang , Ting Zhang , Jianmin Bao , Dongdong Chen , Yangyu Huang , Lu Yuan , Dong Chen , Ming Zeng , Fang Wen

Category: Computer Vision | Natural Language Processing

2021-12-06

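How can we learn a universal facial representation that boosts all face analysis tasks? This paper takes one step toward this goal. We study the transfer performance of pre-trained models on face analysis tasks and introduce a framework, called FaRL, for general facial representation learning in a visual-linguistic manner. On the one hand, the framework involves a contrastive loss to learn high-level semantic meaning from image-text pairs. On the other hand, we propose to simultaneously explore low-level information by adding masked image modeling, which further enhances the face representation. We perform pre-training on LAION-FACE, a dataset containing a large number of face image-text pairs, and evaluate the representation capability on multiple downstream tasks. We show that FaRL achieves better transfer performance than previous pre-trained models. We also verify its superiority in the low-data regime. More importantly, our model surpasses state-of-the-art methods on face analysis tasks, including face parsing and face alignment.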

ADNet: Leveraging Error-Bias Towards Normal Direction in Face Alignment

Yangyu Huang , Hao Yang , Chong Li , Jongyoo Kim , Fangyun Wei

Category: Computer Vision

2021-09-13

Recent progress in CNNs has dramatically improved face alignment performance. However, few works have paid attention to the error-bias with respect to the error distribution of facial landmarks. In this paper, we investigate the error-bias issue in face alignment, where the distributions of landmark errors tend to spread along the tangent line to landmark curves. This error-bias is not trivial, since it is closely connected to the ambiguity of the landmark labeling task. Inspired by this observation, we seek a way to leverage the error-bias property for better convergence of the CNN model. To this end, we propose an anisotropic direction loss (ADL) and an anisotropic attention module (AAM) for coordinate and heatmap regression, respectively. ADL imposes a strong binding force in the normal direction for each landmark point on facial boundaries. AAM, on the other hand, is an attention module that produces an anisotropic attention mask focusing on the region of a point and its local edge connected by adjacent points. The mask has a stronger response in the tangent direction than in the normal direction, which means relaxed constraints along the tangent. These two methods work in a complementary manner to learn both facial structures and texture details. Finally, we integrate them into an optimized end-to-end training pipeline named ADNet. ADNet achieves state-of-the-art results on the 300W, WFLW, and COFW datasets, which demonstrates its effectiveness and robustness.
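
The idea of penalizing landmark error more strongly along the normal than along the tangent can be sketched as follows. This is an illustrative simplification, not the ADL formulation from the paper: the tangent is estimated from the two neighboring landmarks on the same contour, the error is split into tangent and normal components, and the two squared components are weighted differently. The function name and the weights `normal_weight` and `tangent_weight` are hypothetical.

```python
import math


def anisotropic_loss(pred, target, prev_pt, next_pt,
                     normal_weight=2.0, tangent_weight=0.5):
    """Weighted squared error split into normal and tangent components.

    prev_pt and next_pt are the neighboring ground-truth landmarks on the
    same contour; they define the local tangent direction (an assumption).
    """
    tx, ty = next_pt[0] - prev_pt[0], next_pt[1] - prev_pt[1]
    length = math.hypot(tx, ty)
    if length == 0.0:
        raise ValueError("neighboring landmarks must be distinct")
    tx, ty = tx / length, ty / length   # unit tangent
    nx, ny = -ty, tx                    # unit normal (tangent rotated by 90 degrees)

    ex, ey = pred[0] - target[0], pred[1] - target[1]
    tangent_error = ex * tx + ey * ty
    normal_error = ex * nx + ey * ny
    return normal_weight * normal_error ** 2 + tangent_weight * tangent_error ** 2


if __name__ == "__main__":
    prev_pt, target, next_pt = (0.0, 0.0), (1.0, 0.0), (2.0, 0.0)
    along = anisotropic_loss((2.0, 0.0), target, prev_pt, next_pt)   # slides along the contour
    across = anisotropic_loss((1.0, 1.0), target, prev_pt, next_pt)  # leaves the contour
    print(along, across)
```

With these weights, a prediction that drifts off the contour is penalized more than one that slides along it by the same distance, which mirrors the relaxed tangent constraint described in the abstract.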

Graph Attention MLP with Reliable Label Utilization

Wentao Zhang , Ziqi Yin , Zeang Sheng , Wen Ouyang , Xiaosen Li , Yangyu Tao , Zhi Yang , Bin Cui

Category: Machine Learning

2021-08-23

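Graph neural networks (GNNs) have recently achieved state-of-the-art performance in many graph-based applications. Despite their high expressive power, they typically need to perform an expensive recursive neighborhood expansion over multiple training epochs and therefore face a scalability issue. Moreover, most of them are inflexible, since they are restricted to fixed-hop neighborhoods and are insensitive to the actual receptive field demands of different nodes. We circumvent these limitations by introducing a scalable and flexible Graph Attention Multilayer Perceptron (GAMLP). By separating the non-linear transformation from feature propagation, GAMLP significantly improves scalability and efficiency, because the propagation procedure is performed in a pre-computed manner. With three principled receptive field attention mechanisms, each node in GAMLP is flexible and adaptive in leveraging the propagated features over receptive fields of different sizes. We conduct extensive evaluations on three large open graph benchmarks (ogbn-papers100M, ogbn-products, and ogbn-mag), demonstrating that GAMLP not only achieves state-of-the-art performance but also provides high scalability and efficiency.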
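
The separation of feature propagation from the learned transformation can be sketched as below. This is a simplified illustration under stated assumptions, not the GAMLP architecture: propagation uses a row-normalized adjacency with self-loops, and a single softmax over hops, scored against a fixed hypothetical `query` vector, stands in for the paper's three attention mechanisms. In a real model the query would be learned and the combined features would be fed to an MLP.

```python
import math


def propagate(adj, feats, hops):
    """Precompute [X, A_hat X, ..., A_hat^K X] once, before any training.

    adj maps each node index to its neighbors (assumed symmetric);
    A_hat is the row-normalized adjacency with self-loops (an assumption).
    """
    n = len(feats)
    neigh = [sorted(set(adj.get(i, ())) | {i}) for i in range(n)]
    hop_feats = [feats]
    for _ in range(hops):
        prev = hop_feats[-1]
        dim = len(prev[0])
        hop_feats.append([
            [sum(prev[j][d] for j in neigh[i]) / len(neigh[i]) for d in range(dim)]
            for i in range(n)
        ])
    return hop_feats


def hop_attention(hop_feats, query):
    """Per node, combine the hop features with softmax weights over hops."""
    combined, weights = [], []
    for i in range(len(hop_feats[0])):
        scores = [sum(f * q for f, q in zip(h[i], query)) for h in hop_feats]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        w = [e / total for e in exps]
        weights.append(w)
        combined.append([
            sum(w[k] * hop_feats[k][i][d] for k in range(len(hop_feats)))
            for d in range(len(query))
        ])
    return combined, weights


if __name__ == "__main__":
    adj = {0: [1], 1: [0, 2], 2: [1]}               # path graph 0 - 1 - 2
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    hop_feats = propagate(adj, feats, hops=2)
    combined, weights = hop_attention(hop_feats, query=[1.0, -1.0])
    print(weights)
```

Because the hop features depend only on the graph and the input features, they can be computed once and reused in every epoch, while the per-node weights let different nodes emphasize different neighborhood sizes.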