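Vertical federated learning (VFL) has attracted much attention because it enables cross-silo data cooperation in a privacy-preserving manner. While most research on VFL focuses on linear and tree models, deep models (e.g., neural networks) have not been well studied in VFL. In this paper, we focus on SplitNN, a well-known neural network framework in VFL, and identify a trade-off between data security and model performance in SplitNN. Briefly, SplitNN trains the model by exchanging gradients and transformed data. On the one hand, SplitNN suffers a loss of model performance because multiple parties jointly train the model on transformed data rather than raw data, and a large amount of low-level feature information is discarded. On the other hand, the naive solution of improving model performance by aggregating at lower layers of SplitNN (i.e., the data is less transformed and more low-level features are preserved) makes raw data vulnerable to inference attacks. To mitigate this trade-off, we propose a new neural network protocol for VFL called Secure Forward Aggregation (SFA). It changes the way transformed data is aggregated and adopts removable masks to protect the raw data. Experimental results show that networks with SFA achieve both data security and high model performance.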
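The abstract does not spell out how SFA's removable masks work. As a rough illustration of the general idea only, the sketch below uses generic additive masking: each party adds a random mask to its integer-encoded transformed data, and the masks are built to cancel when all masked vectors are summed. The modulus, the function names, and the single trusted mask generator are all assumptions made for this example, not details of the SFA protocol.

```python
import random

MODULUS = 2**32  # hypothetical modulus for integer-encoded values


def make_zero_sum_masks(num_parties, length, rng):
    """Return one mask per party; the masks sum to zero modulo MODULUS."""
    if num_parties < 2:
        raise ValueError("masking needs at least two parties")
    masks = [[rng.randrange(MODULUS) for _ in range(length)]
             for _ in range(num_parties - 1)]
    # The last mask cancels the sum of all the others, column by column.
    masks.append([(-sum(col)) % MODULUS for col in zip(*masks)])
    return masks


def mask_vector(values, mask):
    """Hide a party's vector by adding its mask element-wise."""
    return [(v + m) % MODULUS for v, m in zip(values, mask)]


def aggregate(vectors):
    """Element-wise sum modulo MODULUS; zero-sum masks drop out here."""
    return [sum(col) % MODULUS for col in zip(*vectors)]


rng = random.Random(0)
# Hypothetical integer-encoded outputs of three parties' bottom models.
party_outputs = [[3, 1, 4], [1, 5, 9], [2, 6, 5]]
masks = make_zero_sum_masks(len(party_outputs), 3, rng)
masked = [mask_vector(v, m) for v, m in zip(party_outputs, masks)]
total = aggregate(masked)
print(total)  # [6, 12, 18]
```

The aggregator sees only the masked vectors, yet their sum equals the sum of the unmasked ones. A real protocol would also have to specify how the masks are generated without a trusted party, which this sketch leaves out.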
Translated by Google Translate
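The evaluation of 3D face reconstruction results typically relies on a rigid shape alignment between the estimated 3D model and the ground-truth scan. We observe that aligning two shapes with different reference points can largely affect the evaluation results. This makes it difficult to precisely diagnose and improve a 3D face reconstruction method. In this paper, we propose a novel evaluation approach with a new benchmark consisting of 100 globally aligned face scans with accurate facial keypoints, high-quality region masks, and topology-consistent meshes. Our approach performs region-wise shape alignment and leads to more accurate, bidirectional correspondences when computing the shape errors. The fine-grained, region-wise evaluation results give us a detailed understanding of the performance of state-of-the-art 3D face reconstruction methods. For example, our experiments on single-image-based reconstruction methods reveal that DECA performs best on the nose region, while GANFit performs better on the cheek region. In addition, a new and high-quality 3DMM basis, HIFI3D++, is derived using the same procedure used to construct our benchmark, aligning and retopologizing several 3D face datasets. We will release the benchmark (REALY), HIFI3D++, and our new evaluation pipeline at https://realy3dface.com.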
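To make the role of bidirectional correspondences concrete, here is a minimal sketch of a symmetric nearest-neighbor shape error between two point sets. It is a generic, Chamfer-style measure chosen for illustration, not the benchmark's actual region-wise pipeline. The function names and the toy points are assumptions, and both shapes are assumed to be already aligned.

```python
import math


def nearest_distance(point, cloud):
    """Distance from one point to its closest point in another set."""
    return min(math.dist(point, q) for q in cloud)


def directed_error(source, target):
    """Mean nearest-neighbor distance from source points to target."""
    return sum(nearest_distance(p, target) for p in source) / len(source)


def bidirectional_error(pred, scan):
    """Average of both directions, so unmatched regions on either side count."""
    return 0.5 * (directed_error(pred, scan) + directed_error(scan, pred))


# Toy data: the prediction misses one region present in the scan.
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
scan = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0)]

print(directed_error(pred, scan))       # 0.0: every predicted point has a match
print(bidirectional_error(pred, scan))  # about 0.333: the missing region is penalized
```

In this toy case the one-directional error is zero although part of the scan is not covered at all. Measuring in both directions exposes that gap.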
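The widely applied density peak clustering (DPC) algorithm makes an intuitive cluster formation assumption: cluster centers are usually surrounded by data points with lower local density and lie far away from other data points with higher local density. However, this assumption has one limitation: it is often problematic when identifying clusters with lower density, because they can easily be merged into other clusters with higher density. As a result, DPC may fail to identify clusters with variational density. To address this issue, we propose a variational density peak clustering (VDPC) algorithm, which is designed to systematically and autonomously perform clustering on datasets with various types of density distributions. Specifically, we first propose a novel method to identify representatives among all data points and construct initial clusters based on the identified representatives for further analysis of the clusters' properties. Furthermore, we divide all data points into different levels according to their local density and propose a unified clustering framework that combines the advantages of DPC and DBSCAN. All identified initial clusters spanning different density levels are thus systematically processed to form the final clusters. To evaluate the effectiveness of the proposed VDPC algorithm, we conduct extensive experiments on 20 datasets, including eight synthetic, six real-world, and six image datasets. The experimental results show that VDPC outperforms two classical algorithms (i.e., DPC and DBSCAN) and four state-of-the-art extended DPC algorithms.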
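The DPC assumption described above can be sketched with the two quantities classical DPC assigns to each point: a local density (here, the number of neighbors within a cutoff) and the distance to the nearest point of higher density. This is a minimal sketch of plain DPC scoring only, not of VDPC. The cutoff value, the toy points, and the function name are assumptions made for illustration.

```python
import math


def dpc_scores(points, cutoff):
    """Return (rho, delta): local density and distance to nearest denser point."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho: number of other points closer than the cutoff distance.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # By convention, the densest point gets its largest distance instead.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Toy data: a denser cluster near the origin and a sparser one near (5, 5).
points = [(0.0, 0.0), (0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1),
          (5.0, 5.0), (5.1, 5.0), (4.9, 5.0)]
rho, delta = dpc_scores(points, cutoff=0.15)

# Candidate centers have both high density and high delta.
centers = sorted(range(len(points)), key=lambda i: rho[i] * delta[i],
                 reverse=True)[:2]
print(rho)      # [4, 3, 3, 3, 3, 2, 1, 1]
print(centers)  # [0, 5]
```

Both cluster centers are found here, but the center of the sparser cluster scores only about half as high as the denser one. This hints at why low-density clusters can be absorbed by denser neighbors on harder data.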
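Federated recommendation addresses both the data silo and the privacy problems of recommender systems. Current federated recommender systems mainly use cryptographic or obfuscation methods to protect the original ratings from leakage. However, the former incur extra communication and computation costs, and the latter damage model accuracy. Neither can simultaneously satisfy the real-time feedback and accurate personalization requirements of recommender systems. In this paper, we propose federated masked matrix factorization (FedMMF) to protect data privacy in federated recommender systems without sacrificing efficiency or effectiveness. In more detail, we introduce the new idea of a personalized mask generated only from local data and apply it in FedMMF. On the one hand, the personalized mask protects participants' private data without any loss of effectiveness. On the other hand, combined with an adaptive secure aggregation protocol, the personalized mask can further improve efficiency. Theoretically, we provide a security analysis for the personalized mask. Empirically, we also show the superiority of the designed model on different real-world datasets.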
Federated Learning (FL) has been widely accepted as the solution for privacy-preserving machine learning without collecting raw data. Although new techniques proposed in the past few years have advanced the FL area, the evaluation results presented in these works fall short in integrity and are hardly comparable because of inconsistent evaluation metrics and experimental settings. In this paper, we propose a holistic evaluation framework for FL called FedEval, and present a benchmarking study of seven state-of-the-art FL algorithms. Specifically, we first introduce the core evaluation taxonomy model, called FedEval-Core, which covers four essential evaluation aspects for FL: Privacy, Robustness, Effectiveness, and Efficiency, with various well-defined metrics and experimental settings. Based on FedEval-Core, we further develop an FL evaluation platform with standardized evaluation settings and easy-to-use interfaces. We then provide an in-depth benchmarking study of seven well-known FL algorithms: FedSGD, FedAvg, FedProx, FedOpt, FedSTC, SecAgg, and HEAgg. We comprehensively analyze the advantages and disadvantages of these algorithms and identify the practical scenarios suited to each of them, which is rarely done in prior work. Lastly, we distill a set of take-away insights and future research directions that are helpful for researchers in the FL area.
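The Spatio-Temporal Crowd Flow Prediction (STCFP) problem is a classical problem with plenty of prior research efforts that benefit from traditional statistical learning and recent deep learning approaches. While STCFP can refer to many real-world problems, most existing studies focus on quite specific applications, such as predicting taxi demand or ridesharing orders. This hinders STCFP research because approaches designed for different applications are hardly comparable, so it is unclear how an application-driven approach generalizes to other scenarios. To fill this gap, this paper makes two efforts: (i) we propose an analytic framework, called STAnalytic, to qualitatively investigate STCFP approaches with regard to their design considerations on various spatial and temporal factors, aiming to make different application-driven approaches comparable; (ii) we construct a large-scale STCFP benchmark dataset covering four different scenarios (ridesharing, bikesharing, metro, and electric vehicle charging) with up to hundreds of millions of flow records, to quantitatively measure the generalizability of STCFP approaches. Furthermore, to demonstrate the effectiveness of STAnalytic in helping design generalizable STCFP approaches, we propose a spatio-temporal meta-model, called STMeta, that integrates the generalizable temporal and spatial knowledge identified by STAnalytic. We implement three variants of STMeta with different deep learning techniques. On these datasets, we show that the STMeta variants can outperform state-of-the-art STCFP approaches by 5%.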
Existing 3D-aware image synthesis approaches mainly focus on generating a single canonical object and show limited capacity in composing a complex scene containing a variety of objects. This work presents DisCoScene: a 3D-aware generative model for high-quality and controllable scene synthesis. The key ingredient of our method is a very abstract object-level representation (i.e., 3D bounding boxes without semantic annotation) as the scene layout prior, which is simple to obtain, general to describe various scene contents, and yet informative to disentangle objects and background. Moreover, it serves as an intuitive user control for scene editing. Based on such a prior, the proposed model spatially disentangles the whole scene into object-centric generative radiance fields by learning on only 2D images with the global-local discrimination. Our model obtains the generation fidelity and editing flexibility of individual objects while being able to efficiently compose objects and the background into a complete scene. We demonstrate state-of-the-art performance on many scene datasets, including the challenging Waymo outdoor dataset. Project page: https://snap-research.github.io/discoscene/
Modern autonomous driving systems are characterized as modular tasks in sequential order, i.e., perception, prediction, and planning. As sensors and hardware improve, there is a growing trend toward devising a system that can perform a wide diversity of tasks to achieve higher-level intelligence. Contemporary approaches resort to either deploying standalone models for individual tasks or designing a multi-task paradigm with separate heads. These may suffer from accumulated errors or negative transfer effects. Instead, we argue that a favorable algorithmic framework should be devised and optimized in pursuit of the ultimate goal, i.e., planning for the self-driving car. With this goal in mind, we revisit the key components within perception and prediction. We analyze each module and prioritize the tasks hierarchically, such that all these tasks contribute to planning (the goal). To this end, we introduce Unified Autonomous Driving (UniAD), the first comprehensive framework to date that incorporates full-stack driving tasks in one network. It is carefully designed to leverage the advantages of each module and to provide complementary feature abstractions for agent interaction from a global perspective. Tasks communicate through a unified query design so that they facilitate each other toward planning. We instantiate UniAD on the challenging nuScenes benchmark. Extensive ablations show that this philosophy surpasses previous state-of-the-art methods by a large margin in all aspects. The full codebase and models will be made available to facilitate future research in the community.
As a powerful representation of 3D scenes, the neural radiance field (NeRF) enables high-quality novel view synthesis from multi-view images. Stylizing NeRF, however, remains challenging, especially when simulating a text-guided style in which both the appearance and the geometry are altered simultaneously. In this paper, we present NeRF-Art, a text-guided NeRF stylization approach that manipulates the style of a pre-trained NeRF model with a simple text prompt. Unlike previous approaches that either lack sufficient geometry deformations and texture details or require meshes to guide the stylization, our method can shift a 3D scene to the target style characterized by desired geometry and appearance variations without any mesh guidance. This is achieved by introducing a novel global-local contrastive learning strategy, combined with a directional constraint, to simultaneously control both the trajectory and the strength of the target style. Moreover, we adopt a weight regularization method to effectively suppress the cloudy artifacts and geometric noise that arise easily when the density field is transformed during geometry stylization. Through extensive experiments on various styles, we demonstrate that our method is effective and robust regarding both single-view stylization quality and cross-view consistency. The code and more results can be found on our project page: https://cassiepython.github.io/nerfart/.
Software engineers working with the same programming language (PL) may speak different natural languages (NLs) and vice versa, erecting huge barriers to communication and working efficiency. Recent studies have demonstrated the effectiveness of generative pre-training in computer programs, yet they are always English-centric. In this work, we step towards bridging the gap between multilingual NLs and multilingual PLs for large language models (LLMs). We release ERNIE-Code, a unified pre-trained language model for 116 NLs and 6 PLs. We employ two methods for universal cross-lingual pre-training: span-corruption language modeling that learns patterns from monolingual NL or PL; and pivot-based translation language modeling that relies on parallel data of many NLs and PLs. Extensive results show that ERNIE-Code outperforms previous multilingual LLMs for PL or NL across a wide range of end tasks of code intelligence, including multilingual code-to-text, text-to-code, code-to-code, and text-to-text generation. We further show its advantage of zero-shot prompting on multilingual code summarization and text-to-text translation. We will make our code and pre-trained models publicly available.