The interplay between quantum physics and machine learning gives rise to the emergent frontier of quantum machine learning, where advanced quantum learning models may outperform their classical counterparts in solving certain challenging problems. However, quantum learning systems are vulnerable to adversarial attacks: adding tiny, carefully crafted perturbations to legitimate input samples can cause misclassifications. To address this issue, we propose a general scheme to protect quantum learning systems from adversarial attacks by randomly encoding the legitimate data samples through unitary or quantum error correction encoders. In particular, we rigorously prove that both global and local random unitary encoders lead to exponentially vanishing gradients (i.e., barren plateaus) for any variational quantum circuit that aims to add adversarial perturbations, independent of the input data and the inner structures of adversarial circuits and quantum classifiers. In addition, we prove a rigorous bound on the vulnerability of quantum classifiers under local unitary adversarial attacks. We show that random black-box quantum error correction encoders can protect quantum classifiers against local adversarial noise and that their robustness increases as we concatenate error correction codes. To quantify the robustness enhancement, we adapt quantum differential privacy as a measure of the prediction stability for quantum classifiers. Our results establish versatile defense strategies for quantum classifiers against adversarial perturbations, which provide valuable guidance to enhance the reliability and security of both near-term and future quantum learning technologies.
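Over the past decade, machine learning has achieved tremendous success, with applications ranging from face recognition to natural language processing. Meanwhile, rapid progress has been made in the field of quantum computing, including the development of powerful quantum algorithms and advanced quantum devices. The interplay between machine learning and quantum physics holds intriguing potential for bringing practical applications to modern society. Here, we focus on quantum neural networks in the form of parameterized quantum circuits. We mainly discuss different structures and encoding strategies of quantum neural networks for supervised learning tasks, and benchmark their performance using Yao.jl, a quantum simulation package written in the Julia language. The codes are efficient and are intended to help beginners in scientific work, such as developing powerful variational quantum learning models and assisting the corresponding experimental demonstrations.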
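Capsule networks, which incorporate the paradigms of connectionism and symbolism, have brought fresh insights into artificial intelligence. The capsule, the building block of a capsule network, is a group of neurons represented by a vector that encodes different features of an entity. Information is extracted hierarchically through the capsule layers by a routing algorithm. Here, we introduce a quantum capsule network (dubbed QCapsNet) together with a quantum dynamic routing algorithm. Our model enjoys an exponential speedup in the dynamic routing process and exhibits enhanced representation power. To benchmark the performance of QCapsNet, we carry out extensive numerical simulations on the classification of handwritten digits and symmetry-protected topological phases, and show that QCapsNet can achieve state-of-the-art accuracy and clearly outperform conventional quantum classifiers. We further unpack the output capsule state and find that a particular subspace may correspond to a human-understandable feature of the input data, which indicates the potential explainability of such networks. Our work reveals an intriguing prospect for quantum capsule networks in quantum machine learning, which may provide valuable guidance towards explainable quantum artificial intelligence.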
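We propose a general and systematic strategy to compile arbitrary quantum channels without using ancillary qubits, based on proximal policy optimization, a powerful deep reinforcement learning algorithm. We rigorously prove that, in sharp contrast to the case of compiling unitary gates, it is impossible to compile an arbitrary channel to arbitrary precision with any given finite set of elementary channels, regardless of the length of the decomposition sequence. However, for a fixed precision $\epsilon$, one can construct a universal set with a constant number of $\epsilon$-dependent elementary channels, such that an arbitrary quantum channel can be decomposed into a sequence of these elementary channels followed by a unitary gate, with the sequence length bounded by $O(\frac{1}{\epsilon} \log \frac{1}{\epsilon})$. Through a concrete example concerning the topological compiling of Majorana fermions, we show that our proposed algorithm can conveniently and effectively reduce the use of expensive elementary gates by adding a weighted cost to the reward function of proximal policy optimization.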
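Quantum computers hold unprecedented potential for machine learning applications. Here, we prove that physical quantum circuits are PAC (probably approximately correct) learnable on a quantum computer via empirical risk minimization: to learn a parametric quantum circuit with at most $n^c$ gates, each acting on a constant number of qubits, the sample complexity is bounded by $\tilde{O}(n^{c+1})$. In particular, we explicitly construct a family of variational quantum circuits with $O(n^{c+1})$ elementary gates arranged in a fixed pattern, which can represent all physical quantum circuits consisting of at most $n^c$ elementary gates. Our results provide valuable guidance for quantum machine learning in both theory and practice.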
Different people speak with diverse personalized speaking styles. Although existing one-shot talking head methods have made significant progress in lip sync, natural facial expressions, and stable head motions, they still cannot generate diverse speaking styles in the final talking head videos. To tackle this problem, we propose a one-shot style-controllable talking face generation framework. In a nutshell, we aim to attain a speaking style from an arbitrary reference speaking video and then drive the one-shot portrait to speak with the reference speaking style and another piece of audio. Specifically, we first develop a style encoder to extract dynamic facial motion patterns of a style reference video and then encode them into a style code. Afterward, we introduce a style-controllable decoder to synthesize stylized facial animations from the speech content and style code. In order to integrate the reference speaking style into generated videos, we design a style-aware adaptive transformer, which enables the encoded style code to adjust the weights of the feed-forward layers accordingly. Thanks to the style-aware adaptation mechanism, the reference speaking style can be better embedded into synthesized videos during decoding. Extensive experiments demonstrate that our method is capable of generating talking head videos with diverse speaking styles from only one portrait image and an audio clip while achieving authentic visual effects. Project Page: https://github.com/FuxiVirtualHuman/styletalk.
Masked image modeling (MIM) has shown great promise for self-supervised learning (SSL), yet it has been criticized for learning inefficiency. We attribute this inefficiency to insufficient utilization of training signals. To alleviate this issue, we introduce a conceptually simple yet learning-efficient MIM training scheme, termed Disjoint Masking with Joint Distillation (DMJD). For disjoint masking (DM), we sequentially sample multiple masked views per image in a mini-batch with the disjoint regulation to raise the usage of tokens for reconstruction in each image while keeping the masking rate of each view. For joint distillation (JD), we adopt a dual-branch architecture to respectively predict invisible (masked) and visible (unmasked) tokens with superior learning targets. Rooted in orthogonal perspectives on training efficiency improvement, DM and JD cooperatively accelerate training convergence without sacrificing the model's generalization ability. Concretely, DM can train ViT with half of the effective training epochs (3.7 times less time-consuming) to report competitive performance. With JD, our DMJD clearly improves the linear probing classification accuracy over ConvMAE by 5.8%. On fine-grained downstream tasks such as semantic segmentation and object detection, our DMJD also presents superior generalization compared with state-of-the-art SSL methods. The code and model will be made public at https://github.com/mx-mark/DMJD.
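As a rough illustration of the disjoint-masking idea, the sketch below draws several masked views of one image so that no token is masked in more than one view while every view keeps the same masking rate. This is only one possible reading of the "disjoint regulation" and is not taken from the DMJD code: the function name `disjoint_masks`, its parameters, and the requirement that `num_views * mask_rate <= 1` are all assumptions made for the example.

```python
import random


def disjoint_masks(num_tokens, mask_rate, num_views, seed=0):
    """Return `num_views` sets of masked token indices that never overlap.

    Hypothetical helper: each view masks int(num_tokens * mask_rate) tokens,
    and the masked sets are pairwise disjoint (an assumption of this sketch).
    """
    per_view = int(num_tokens * mask_rate)
    if per_view * num_views > num_tokens:
        raise ValueError("num_views * mask_rate must not exceed 1 for disjoint masks")
    rng = random.Random(seed)
    order = list(range(num_tokens))
    rng.shuffle(order)  # one random permutation shared by all views
    # Consecutive, non-overlapping slices of the permutation become the masks.
    return [set(order[v * per_view:(v + 1) * per_view]) for v in range(num_views)]


# Example: a 14 x 14 patch grid (196 tokens), four views, 25% masked per view.
masks = disjoint_masks(num_tokens=196, mask_rate=0.25, num_views=4)
covered = set().union(*masks)  # tokens used for reconstruction in some view
```

Slicing a single shared permutation makes the disjointness hold by construction, and with these illustrative numbers the four views together use every token of the image as a reconstruction target.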
Recently, great progress has been made in single-image super-resolution (SISR) based on deep learning technology. However, existing methods usually incur a large computational cost. Meanwhile, the activation function causes some features of the intermediate layers to be lost. Therefore, it is a challenge to make the model lightweight while reducing the impact of intermediate feature loss on the reconstruction quality. In this paper, we propose a Feature Interaction Weighted Hybrid Network (FIWHN) to alleviate this problem. Specifically, FIWHN consists of a series of novel Wide-residual Distillation Interaction Blocks (WDIB) as the backbone, where every three WDIBs form a Feature shuffle Weighted Group (FSWG) by mutual information mixing and fusion. In addition, to mitigate the adverse effects of intermediate feature loss on the reconstruction results, we introduce well-designed Wide Convolutional Residual Weighting (WCRW) and Wide Identical Residual Weighting (WIRW) units in WDIB, and effectively cross-fuse features of different levels of fineness through a Wide-residual Distillation Connection (WRDC) framework and a Self-Calibrating Fusion (SCF) unit. Finally, to complement the global features lacking in the CNN model, we introduce the Transformer into our model and explore a new way of combining the CNN and Transformer. Extensive quantitative and qualitative experiments on low-level and high-level tasks show that our proposed FIWHN achieves a good balance between performance and efficiency, and is better suited to downstream tasks in low-pixel scenarios.
Rigorous guarantees about the performance of predictive algorithms are necessary in order to ensure their responsible use. Previous work has largely focused on bounding the expected loss of a predictor, but this is not sufficient in many risk-sensitive applications where the distribution of errors is important. In this work, we propose a flexible framework to produce a family of bounds on quantiles of the loss distribution incurred by a predictor. Our method takes advantage of the order statistics of the observed loss values rather than relying on the sample mean alone. We show that a quantile is an informative way of quantifying predictive performance, and that our framework applies to a variety of quantile-based metrics, each targeting important subsets of the data distribution. We analyze the theoretical properties of our proposed method and demonstrate its ability to rigorously control loss quantiles on several real-world datasets.
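To make the role of order statistics concrete, the following sketch computes a classical distribution-free upper confidence bound on a single loss quantile. It is a textbook construction, shown only for intuition, and is not claimed to be the framework proposed in the abstract above. The names `binom_cdf` and `quantile_upper_bound` and the parameters `beta` (quantile level) and `delta` (failure probability) are hypothetical. The reasoning is that the k-th smallest of n i.i.d. losses falls below the true `beta`-quantile only if at least k losses fall below it, which happens with probability at most P(Binomial(n, beta) >= k).

```python
import math


def binom_cdf(k, n, p):
    """P(Binomial(n, p) <= k), summed directly (fine for moderate n)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def quantile_upper_bound(losses, beta, delta):
    """Order statistic that upper-bounds the beta-quantile with prob. >= 1 - delta.

    Picks the smallest k with P(Binomial(n, beta) <= k - 1) >= 1 - delta and
    returns the k-th smallest loss; returns None if n is too small for any k.
    """
    n = len(losses)
    ordered = sorted(losses)
    for k in range(1, n + 1):
        if binom_cdf(k - 1, n, beta) >= 1 - delta:
            return ordered[k - 1]
    return None


# Example: 100 held-out losses, bound the median loss at 95% confidence.
example_losses = [float(i) for i in range(1, 101)]
median_bound = quantile_upper_bound(example_losses, beta=0.5, delta=0.05)
```

Unlike a bound built on the sample mean, this one depends only on the rank of a single observed loss, so it requires no assumption on the range or tails of the loss distribution.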
Recently, large-scale pre-trained models have shown their advantages in many tasks. However, due to their huge computational complexity and storage requirements, it is challenging to apply large-scale models to real scenes. A common solution is knowledge distillation, which regards the large-scale model as a teacher model and helps to train a small student model to obtain competitive performance. Cross-task knowledge distillation expands the application scenarios of large-scale pre-trained models. Existing knowledge distillation work focuses on directly mimicking the final prediction or the intermediate layers of the teacher model, which represent global-level characteristics and are task-specific. To alleviate the constraint of different label spaces, capturing invariant intrinsic local object characteristics (such as the shape characteristics of the leg and tail of the cattle and horse) plays a key role. Considering the complexity and variability of real scene tasks, we propose a Prototype-guided Cross-task Knowledge Distillation (ProC-KD) approach to transfer the intrinsic local-level object knowledge of a large-scale teacher network to various task scenarios. First, to better transfer the generalized knowledge in the teacher model in cross-task scenarios, we propose a prototype learning module to learn from the essential feature representation of objects in the teacher model. Second, for diverse downstream tasks, we propose a task-adaptive feature augmentation module to enhance the features of the student model with the learned generalization prototype features and guide the training of the student model to improve its generalization ability. The experimental results on various visual tasks demonstrate the effectiveness of our approach in large-scale model cross-task knowledge distillation scenarios.