We consider a federated representation learning framework where, with the assistance of a central server, a group of $n$ distributed clients collaboratively trains representations (or embeddings) of a set of entities (e.g., users in a social network) over their private data. Under this framework, for the key step of aggregating the local embeddings trained at the clients in a private manner, we develop a secure embedding aggregation protocol named SecEA, which provides information-theoretic privacy guarantees for the set of entities and the corresponding embeddings at each client simultaneously, against a curious server and up to $t < n/2$ colluding clients. As the first step of SecEA, the federated learning system performs a private entity union, so that each client learns all entities in the system without knowing which entities belong to which clients. In each aggregation round, the local embeddings are secretly shared among the clients using Lagrange interpolation, and each client then constructs coded queries to retrieve the aggregated embeddings of its intended entities. We conduct comprehensive experiments on various representation learning tasks to evaluate the utility and efficiency of SecEA, and empirically demonstrate that, compared with embedding aggregation protocols without (or with weaker) privacy guarantees, SecEA incurs negligible performance loss (within 5%), and that the additional computation latency of SecEA diminishes when training deeper models on larger datasets.
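The Lagrange-interpolation secret sharing at the core of the aggregation step can be illustrated with a minimal sketch. The field size, the quantization of real-valued embeddings, and the coded-query retrieval layer are all omitted, and every name below is illustrative rather than taken from the SecEA implementation:

```python
# Minimal sketch of Lagrange-interpolation secret sharing for one embedding
# coordinate, in the spirit of SecEA's aggregation step (all names illustrative).
import random

PRIME = 2**31 - 1  # a Mersenne prime; the protocol's actual field may differ

def share(secret: int, n: int, t: int) -> list[tuple[int, int]]:
    """Split `secret` into n shares; any t+1 shares reconstruct it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t)]
    return [(x, sum(c * pow(x, k, PRIME) for k, c in enumerate(coeffs)) % PRIME)
            for x in range(1, n + 1)]

def reconstruct(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return total

# Shares are additively homomorphic, so clients can sum shares of their local
# embeddings and only the aggregate is ever reconstructed.
emb_a, emb_b = 12345, 67890
shares_a, shares_b = share(emb_a, n=5, t=2), share(emb_b, n=5, t=2)
summed = [(x, (ya + yb) % PRIME) for (x, ya), (_, yb) in zip(shares_a, shares_b)]
assert reconstruct(summed[:3]) == (emb_a + emb_b) % PRIME
```

With $n = 5$ and $t = 2$, the collusion threshold $t < n/2$ from the abstract is respected, and any $t + 1 = 3$ summed shares recover the aggregate while fewer reveal nothing.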
This paper proposes a novel application system for the generation of three-dimensional (3D) character animation driven by markerless human body motion capture. The entire pipeline of the system consists of five stages: 1) capturing motion data using multiple cameras, 2) detecting the two-dimensional (2D) human body joints, 3) estimating the 3D joints, 4) calculating the bone transformation matrices, and 5) generating the character animation. The main objective of this study is to generate a 3D skeleton and animation for 3D characters using multi-view images captured by ordinary cameras. The computational complexity of the 3D-vision-based skeleton reconstruction has been reduced to achieve frame-by-frame motion capture. The experimental results reveal that our system can effectively and efficiently capture human actions and use them to animate 3D cartoon characters in real time.
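As a concrete illustration of stage 3, 3D joints can be estimated from multi-view 2D detections by direct linear transformation (DLT) triangulation, a common choice for calibrated multi-camera setups; the paper's exact reconstruction method is not specified here, so treat this as a hedged sketch:

```python
# Hedged sketch of stage 3 (3D joint estimation) via DLT triangulation.
import numpy as np

def triangulate(P1: np.ndarray, P2: np.ndarray,
                pt1: np.ndarray, pt2: np.ndarray) -> np.ndarray:
    """Triangulate one joint from two views given 3x4 projection matrices."""
    A = np.stack([
        pt1[0] * P1[2] - P1[0],
        pt1[1] * P1[2] - P1[1],
        pt2[0] * P2[2] - P2[0],
        pt2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)          # null space of A gives the point
    X = vt[-1]
    return X[:3] / X[3]                  # dehomogenize to a 3D point

# Example: two synthetic cameras recover a known 3D point.
K = np.eye(3)
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.2, -0.1, 4.0, 1.0])
p1 = (P1 @ X_true)[:2] / (P1 @ X_true)[2]
p2 = (P2 @ X_true)[:2] / (P2 @ X_true)[2]
assert np.allclose(triangulate(P1, P2, p1, p2), X_true[:3], atol=1e-6)
```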
For applied intelligence, utility-driven pattern discovery algorithms can identify insightful and useful patterns in databases. However, with these pattern discovery techniques, the number of patterns can be huge, and users are often interested in only some of them. Hence, targeted high-utility itemset mining has become a key research topic, whose goal is to find a subset of patterns that satisfy a target-pattern constraint, rather than all patterns. This is a challenging task, because efficiently finding the tailored patterns in a very large search space requires a targeted mining algorithm. A first algorithm named TargetUM has been proposed, which adopts an approach similar to post-processing using a tree structure, but its runtime and memory consumption are unsatisfactory in many situations. In this paper, we address this issue by proposing a novel list-based algorithm with a pattern matching mechanism, named THUIM (Targeted High-Utility Itemset Mining), which can quickly match high-utility itemsets during the mining process to select the target patterns. Extensive experiments were conducted on different datasets to compare the performance of the proposed algorithm with state-of-the-art algorithms. The results show that THUIM performs very well in terms of runtime and memory consumption, and has good scalability compared to TargetUM.
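The targeted constraint itself is easy to state in code: among the high-utility itemsets, keep only those containing the query pattern. The brute-force sketch below merely pins down that goal; the actual THUIM algorithm avoids this enumeration by pruning with utility lists and pattern matching during mining, and all data and thresholds here are invented:

```python
# Brute-force illustration of targeted high-utility itemset mining: enumerate
# itemsets, keep those that contain the target pattern and meet min_util.
from itertools import combinations

transactions = [  # item -> purchased quantity; unit profits are given below
    {"a": 2, "b": 1, "c": 3},
    {"a": 1, "c": 2},
    {"b": 4, "c": 1},
]
profit = {"a": 5, "b": 2, "c": 1}

def utility(itemset: frozenset, db) -> int:
    """Sum of (quantity * unit profit) over transactions containing the itemset."""
    return sum(sum(q * profit[i] for i, q in t.items() if i in itemset)
               for t in db if itemset <= t.keys())

def targeted_huis(db, target: frozenset, min_util: int):
    items = {i for t in db for i in t}
    for r in range(1, len(items) + 1):
        for combo in map(frozenset, combinations(sorted(items), r)):
            if target <= combo and utility(combo, db) >= min_util:
                yield combo, utility(combo, db)

# Yields {a, c} (utility 20) and {a, b, c} (utility 15) for this toy database.
print(list(targeted_huis(transactions, target=frozenset({"c"}), min_util=15)))
```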
The existence of completely aligned and paired multi-modal neuroimaging data has proved its effectiveness in the diagnosis of brain diseases. However, collecting the full set of well-aligned and paired data is expensive or even impractical, since the practical difficulties may include high cost, long acquisition time, image corruption, and privacy issues. A realistic solution is to explore either unsupervised or semi-supervised learning to synthesize the absent neuroimaging data. In this paper, we are the first to comprehensively approach the cross-modality neuroimage synthesis task from different perspectives, including the level of supervision (especially weakly-supervised and unsupervised), the loss function, evaluation metrics, the range of modality synthesis, datasets (aligned, private, and public), and synthesis-based downstream tasks. To begin with, we highlight several open challenges for cross-modality neuroimage synthesis. Then we summarize the architectures for cross-modality synthesis under various levels of supervision. In addition, we provide an in-depth analysis of how cross-modality neuroimage synthesis can improve the performance of different downstream tasks. Finally, we re-evaluate the open challenges and point out future directions for the remaining ones. All resources are available at https://github.com/M-3LAB/awesome-multimodal-brain-image-systhesis
The existence of completely aligned and paired multi-modal neuroimaging data has proved its effectiveness in the diagnosis of brain diseases. However, collecting the full set of well-aligned and paired data is impractical, since the practical difficulties may include high cost, long acquisition time, image corruption, and privacy issues. Previously, misaligned and unpaired neuroimaging data (termed MUD) were generally treated as noisy labels. However, such noisy-label-based approaches fail when the data are severely distorted, e.g., by differing rotation angles. In this paper, we propose a novel federated self-supervised learning framework (FedMed) for brain image synthesis. An affine transformation loss (ATL) is formulated to make use of severely distorted images without violating hospitals' privacy legislation. We then introduce a new data augmentation procedure for self-supervised training and feed it into three auxiliary heads, namely auxiliary rotation, auxiliary translation, and auxiliary scaling heads. The proposed method demonstrates advanced performance in terms of the quality of our synthesized results under severely misaligned and unpaired data settings, outperforming other GAN-based algorithms. It also reduces the demand for deformable registration while encouraging the utilization of misaligned and unpaired data. Experimental results verify the outstanding performance of our learning paradigm compared with other state-of-the-art methods.
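One plausible reading of the affine transformation loss is an affine-consistency term: the generator's output for an affinely transformed input should match the transformed output of the original input. The PyTorch sketch below follows that reading (rotation-only for brevity); the exact FedMed formulation may differ, and `generator` is a placeholder module:

```python
# Hedged sketch of an affine-consistency term in the spirit of ATL.
import torch
import torch.nn.functional as F

def random_affine_grid(batch: int, size, max_deg: float = 30.0):
    """Random rotation grids; a full affine would add scale/shear/translation."""
    theta = torch.empty(batch).uniform_(-max_deg, max_deg).deg2rad()
    cos, sin = theta.cos(), theta.sin()
    mat = torch.stack([torch.stack([cos, -sin, torch.zeros_like(cos)], dim=1),
                       torch.stack([sin, cos, torch.zeros_like(cos)], dim=1)],
                      dim=1)                     # (batch, 2, 3) affine matrices
    return F.affine_grid(mat, size, align_corners=False)

def affine_consistency_loss(generator, x):
    """L1 gap between 'generate then warp' and 'warp then generate'."""
    grid = random_affine_grid(x.size(0), x.size())
    x_warp = F.grid_sample(x, grid, align_corners=False)
    y_warp = F.grid_sample(generator(x), grid, align_corners=False)
    return F.l1_loss(generator(x_warp), y_warp)
```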
Due to limited computational cost and energy consumption, most neural network models deployed on mobile devices are tiny. However, tiny neural networks are usually easy to attack. Current research has proven that a larger model size can improve robustness, but little research focuses on how to enhance the robustness of tiny neural networks. Our work focuses on improving the robustness of tiny neural networks without seriously deteriorating clean accuracy under mobile-level resources. To this end, we propose a multi-objective one-shot network architecture search (NAS) algorithm to obtain the best trade-off networks in terms of adversarial accuracy, clean accuracy, and model size. Specifically, we design a novel search space based on new tiny blocks and channels to balance model size and adversarial performance. Furthermore, since the supernet significantly affects the performance of subnets in our NAS algorithm, we reveal insights into how the supernet helps to obtain the best subnets under white-box adversarial attacks. Concretely, we explore a new adversarial training paradigm by analyzing adversarial transferability, the width of the supernet, and the difference between training subnets from scratch and fine-tuning them. Finally, we make a statistical analysis of the layer-wise combinations of certain blocks and channels on the first non-dominated front, which can serve as a guideline for designing tiny neural network architectures that are robust to adversarial perturbations.
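The multi-objective selection step can be made concrete with a non-dominated (Pareto) filter over the three stated objectives. The sketch below covers only that filter; the supernet training and subnet sampling of the actual NAS algorithm are out of scope, and all candidate numbers are invented:

```python
# Sketch of the multi-objective selection: keep subnets not dominated on
# (adversarial accuracy, clean accuracy, model size).
def pareto_front(subnets):
    """subnets: dicts with 'adv_acc', 'clean_acc' (higher better), 'size' (lower better)."""
    def dominates(a, b):
        no_worse = (a["adv_acc"] >= b["adv_acc"] and a["clean_acc"] >= b["clean_acc"]
                    and a["size"] <= b["size"])
        strictly = (a["adv_acc"] > b["adv_acc"] or a["clean_acc"] > b["clean_acc"]
                    or a["size"] < b["size"])
        return no_worse and strictly
    return [s for s in subnets
            if not any(dominates(o, s) for o in subnets if o is not s)]

candidates = [
    {"adv_acc": 0.41, "clean_acc": 0.71, "size": 3.2},
    {"adv_acc": 0.39, "clean_acc": 0.74, "size": 2.9},
    {"adv_acc": 0.35, "clean_acc": 0.70, "size": 3.5},  # dominated by both above
]
print(pareto_front(candidates))  # the first non-dominated front (two subnets)
```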
Benefiting from its ability to exploit intrinsic supervision information, contrastive learning has recently achieved promising performance in the field of deep graph clustering. However, we observe that two drawbacks of the positive and negative sample construction mechanisms prevent existing algorithms from improving further. 1) The quality of positive samples heavily depends on carefully designed data augmentations, and inappropriate data augmentations easily lead to semantic drift and indiscriminative positive samples. 2) The constructed negative samples are unreliable, since they ignore important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC), which mines the intrinsic supervision information in high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct positive samples from the same high-confidence cluster in the two views. Moreover, to construct semantically meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function that pulls together samples from the same cluster and pushes away those from other clusters by maximizing and minimizing the cross-view cosine similarity between positive and negative samples, respectively. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with existing state-of-the-art algorithms.
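The stated objective can be sketched directly: maximize cross-view cosine similarity for positive pairs from the same high-confidence cluster and minimize it between the centers of different clusters. Tensor shapes and the way the two terms are combined are assumptions, not the paper's exact loss:

```python
# Minimal sketch of a cluster-guided cross-view contrastive objective.
import torch
import torch.nn.functional as F

def cluster_guided_loss(z1, z2, centers1, centers2):
    """z1, z2: (N, d) cross-view embeddings of positive pairs;
    centers1, centers2: (K, d) high-confidence cluster centers per view."""
    pos = F.cosine_similarity(z1, z2, dim=1).mean()
    # (K, K) pairwise similarities between centers across the two views
    sim = F.cosine_similarity(centers1.unsqueeze(1), centers2.unsqueeze(0), dim=2)
    K = sim.size(0)
    neg = sim[~torch.eye(K, dtype=torch.bool)].mean()  # off-diagonal center pairs
    return neg - pos  # minimizing pulls positives together, pushes centers apart
```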
To generate high-quality rendered images for real-time applications, it is common to trace only a few samples per pixel (spp) at a lower resolution and then supersample to the high resolution. Based on the observation that rendered pixels at a low resolution are typically highly aliased, we present a novel method for neural supersampling based on ray tracing 1/4-spp samples at the high resolution. Our key insight is that the ray-traced samples at the target resolution are accurate and reliable, which turns supersampling into an interpolation problem. We present a mask-reinforced neural network to reconstruct and interpolate high-quality image sequences. First, a novel temporal accumulation network is introduced to compute the correlation between current and previous features, significantly improving their temporal stability. Then a reconstruction network based on a multi-scale U-Net with skip connections is adopted to reconstruct and generate the desired high-resolution image. Experimental results and comparisons show that our proposed method generates higher-quality supersampling results than current state-of-the-art methods without increasing the total number of ray-tracing samples.
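The temporal-accumulation idea can be sketched as blending history with the current frame using a per-pixel correlation between current and motion-warped previous features. In the paper a network learns this blending; the fixed cosine correlation below is only illustrative:

```python
# Hedged sketch: correlation-gated temporal accumulation of rendered frames.
import torch
import torch.nn.functional as F

def temporal_accumulate(curr_feat, prev_feat_warped, curr_rgb, prev_rgb_warped):
    """All tensors are (N, C, H, W); prev_* are already warped by motion vectors."""
    corr = F.cosine_similarity(curr_feat, prev_feat_warped, dim=1, eps=1e-6)
    alpha = corr.clamp(0, 1).unsqueeze(1)   # high correlation -> trust history
    return alpha * prev_rgb_warped + (1 - alpha) * curr_rgb
```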
Temporal sentence grounding (TSG) aims to identify the temporal boundary of a specific segment in an untrimmed video given a sentence query. All existing works first utilize a sparse sampling strategy to extract a fixed number of video frames and then conduct multi-modal interactions with the query sentence for reasoning. However, we argue that these methods have overlooked two indispensable issues: 1) Boundary bias: the annotated target segment generally refers to two specific frames as the corresponding start and end timestamps. The video downsampling process may lose these two frames and take adjacent irrelevant frames as the new boundaries. 2) Reasoning bias: such incorrect new boundary frames also lead to biased reasoning during frame-query interaction, reducing the generalization ability of the model. To alleviate the above limitations, in this paper we propose a novel Siamese Sampling and Reasoning Network (SSRN) for TSG, which introduces a siamese sampling mechanism to generate additional contextual frames that enrich and refine the new boundaries. Specifically, a reasoning strategy is developed to learn the inter-relationships among these frames and generate soft labels on the boundaries for more accurate frame-query reasoning. Such a mechanism is also able to supplement the absent consecutive visual semantics of the sampled sparse frames for fine-grained activity understanding. Extensive experiments demonstrate the effectiveness of SSRN on three challenging datasets.
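Soft boundary labels of the kind described can be sketched by spreading the probability mass of a hard start/end index over neighboring sampled frames, e.g., with a Gaussian. The width `sigma` is an assumption; SSRN produces its soft labels through the learned reasoning strategy rather than a fixed kernel:

```python
# Illustrative Gaussian soft labels over sampled frame indices for one boundary.
import numpy as np

def soft_boundary_labels(num_frames: int, boundary_idx: int, sigma: float = 1.0):
    idx = np.arange(num_frames)
    labels = np.exp(-0.5 * ((idx - boundary_idx) / sigma) ** 2)
    return labels / labels.sum()  # normalized distribution over frames

print(np.round(soft_boundary_labels(8, boundary_idx=3), 3))
```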
Representing and synthesizing novel views of real-world dynamic scenes from casual monocular videos is a long-standing problem. Existing solutions typically approach dynamic scenes by applying geometry techniques or utilizing temporal information between several adjacent frames, without considering the underlying background distribution of the entire scene or the transmittance along the ray dimension, which limits their performance in static and occluded areas. Our approach, $\textbf{D}$istribution-$\textbf{D}$riven neural radiance fields, offers high-quality view synthesis and a 3D solution to $\textbf{D}$etach the background from the entire $\textbf{D}$ynamic scene, and is called $\text{D}^4$NeRF. Specifically, it employs a neural representation to capture the scene distribution of the static background and a 6D-input NeRF to represent the dynamic objects, respectively. Each ray sample is given an additional occlusion weight to indicate the transmittance attributable to the static and dynamic components. We evaluate $\text{D}^4$NeRF on public dynamic scenes and on our urban driving scenes acquired from an autonomous-driving dataset. Extensive experiments demonstrate that our approach outperforms previous methods in rendering texture details and motion areas while also producing a clean static background. Our code will be released at https://github.com/Luciferbobo/D4NeRF.
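The described composition can be sketched as each ray sample carrying an occlusion weight $w$ that splits its density and color between the static background branch and the dynamic branch before standard volume rendering. This is a simplified reading of the abstract, not the released $\text{D}^4$NeRF code:

```python
# Hedged sketch: occlusion-weighted blend of static/dynamic branches, then
# standard alpha compositing along each ray.
import torch

def composite(rgb_static, rgb_dynamic, sigma_static, sigma_dynamic, w, deltas):
    """rgb_*: (rays, samples, 3); sigma_*, w, deltas: (rays, samples); w in [0, 1]."""
    sigma = w * sigma_static + (1 - w) * sigma_dynamic
    rgb = w.unsqueeze(-1) * rgb_static + (1 - w).unsqueeze(-1) * rgb_dynamic
    alpha = 1 - torch.exp(-sigma * deltas)
    # Transmittance up to (but excluding) each sample along the ray.
    trans = torch.cumprod(torch.cat([torch.ones_like(alpha[:, :1]),
                                     1 - alpha + 1e-10], dim=1), dim=1)[:, :-1]
    return (trans * alpha).unsqueeze(-1).mul(rgb).sum(dim=1)  # (rays, 3)
```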