The ubiquity of camera-embedded devices and the advances in deep learning have stimulated various intelligent mobile video applications. These applications often demand on-device processing of video streams to deliver real-time, high-quality services, for reasons of privacy and robustness. However, the performance of these applications is constrained by the raw video streams, which tend to be captured in dim light with the small-aperture cameras of ubiquitous mobile platforms. Although many low-light video enhancement solutions exist, they are unfit for deployment on mobile devices because of their complex models and their disregard for system dynamics such as energy budgets. In this paper, we propose AdaEnlight, an energy-aware low-light video stream enhancement system on mobile devices. It achieves real-time video enhancement with competitive visual quality while allowing runtime behavior adaptation to the platform-imposed dynamic energy budgets. We report extensive experiments on diverse datasets, scenarios, and platforms and demonstrate the superiority of AdaEnlight compared with state-of-the-art low-light image and video enhancement solutions.
Translated by Google Translate
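The cost efficiency of model inference is critical to real-world machine learning (ML) applications, especially for latency-sensitive tasks and resource-limited devices. A typical dilemma is that, to provide complex intelligent services (e.g., smart city), we need the inference results of multiple ML models, but the cost budget (e.g., GPU memory) is not enough to run all of them. In this work, we study the underlying relationships among black-box ML models and propose a novel learning task, model linking, which aims to bridge the knowledge of different black-box models by learning mappings (dubbed model links) between their output spaces. We propose a design of model links that supports linking heterogeneous black-box ML models. To address the distribution discrepancy challenge, we also present adaptation and aggregation methods for model links. Based on the proposed model links, we develop a scheduling algorithm named MLink. Through collaborative multi-model inference enabled by model links, MLink can improve the accuracy of the inference results obtained under the cost budget. We evaluate MLink on a multi-modal dataset with seven different ML models, and on two real-world video analytics systems with 3,264 hours of video. Experimental results show that the proposed model links can be built effectively among various black-box models. Under a GPU memory budget, MLink saves 66.7% of inference computations while preserving 94% of inference accuracy, outperforming multi-task learning, a reinforcement-learning-based scheduler, and frame-filtering baselines.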
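As a rough illustration of what a mapping between two models' output spaces could look like, the sketch below fits an affine map from one model's outputs to another's by least squares. This is only a minimal stand-in: the abstract does not specify the architecture of a model link, and the function names `fit_model_link` and `apply_model_link` and the synthetic data are assumptions made for this example.

```python
import numpy as np

def fit_model_link(src_outputs, tgt_outputs):
    """Fit an affine map from source-model outputs to target-model outputs.

    Hypothetical helper: a least-squares stand-in for a learned model link.
    """
    # Append a column of ones so the fitted map includes a bias term.
    X = np.hstack([src_outputs, np.ones((src_outputs.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X, tgt_outputs, rcond=None)
    return W

def apply_model_link(W, src_outputs):
    """Predict target-model outputs from source-model outputs."""
    X = np.hstack([src_outputs, np.ones((src_outputs.shape[0], 1))])
    return X @ W

# Synthetic paired outputs of two black-box models (assumed data).
rng = np.random.default_rng(0)
src = rng.normal(size=(200, 4))      # outputs of the source model
true_map = rng.normal(size=(4, 3))
tgt = src @ true_map + 0.5           # target outputs, exactly affine in src

W = fit_model_link(src, tgt)
pred = apply_model_link(W, src[:5])  # target predictions without running the target model
```

Under a budget that allows only the source model to run, the link supplies an estimate of the target model's output; real model outputs are rarely exactly affine in one another, so a learned nonlinear map would usually replace the least-squares fit.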
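We introduce a framework of equivariant convolutional algorithms tailored to a number of machine learning tasks on physical systems with arbitrary SU($d$) symmetries. It allows us to enhance a natural model of quantum computation, permutational quantum computing (PQC) [Quantum Inf. Comput., 10, 470-497 (2010)], and to define a more powerful model: PQC+. While PQC was shown to be efficiently classically simulable, we exhibit a problem that can be solved efficiently on a PQC+ machine, whereas the best known classical algorithm runs in $O(n!\,n^2)$ time, thus providing strong evidence against PQC+ being classically simulable. We further discuss practical quantum machine learning algorithms that can be carried out in the PQC+ paradigm.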
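We develop a theoretical framework for $S_n$-equivariant quantum convolutional circuits, building on and significantly generalizing Jordan's permutational quantum computing (PQC) formalism. We show that quantum circuits are a natural choice for Fourier-space neural architectures, affording a super-exponential speedup in computing the matrix elements of $S_n$-Fourier coefficients compared with the best known classical fast Fourier transform (FFT) over the symmetric group. In particular, we use the Okounkov-Vershik approach to prove Harrow's statement (Ph.D. thesis, 2005, p. 160) on the equivalence between the $\operatorname{SU}(d)$- and $S_n$-irrep bases, and to establish the $S_n$-equivariant convolutional quantum alternating ansätze ($S_n$-CQA) using Young-Jucys-Murphy (YJM) elements. We prove that $S_n$-CQA are dense, and thus expressive, within each $S_n$-irrep block, so they may serve as a universal model for potential future quantum machine learning and optimization applications. Our method provides another way to prove the universality of the quantum approximate optimization algorithm (QAOA), from a representation-theoretic point of view. Our framework applies naturally to a wide range of problems with global $\operatorname{SU}(d)$ symmetry. We present numerical simulations that showcase the effectiveness of the ansätze in finding the sign structure of the ground state of the $J_1$-$J_2$ antiferromagnetic Heisenberg model on the rectangular and Kagome lattices. Our work identifies a quantum advantage for a specific machine learning problem and provides the first application of the celebrated Okounkov-Vershik representation theory to machine learning and quantum physics.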
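The rapid development of quantum information technologies offers promising opportunities for simulating quantum field theories on near-term quantum devices. In this work, we formulate the theory of (time-dependent) variational quantum simulation of the 1+1 dimensional $\lambda\phi^4$ quantum field theory, including encoding, state preparation, and time evolution, together with several numerical simulation results. These algorithms can be understood as near-term variational analogs of the Jordan-Lee-Preskill algorithm, the basic algorithm for simulating quantum field theories on universal quantum devices. In addition, we highlight the advantages of encoding in the harmonic oscillator basis, based on the LSZ reduction formula, and several gains in computational efficiency, for example when implementing a bosonic version of the unitary coupled cluster ansatz to prepare initial states. We also discuss how to circumvent the "spectral crowding" problem in quantum field theory simulation, and we assess our algorithm by both state and subspace fidelities.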
The optimal stopping problem is one of the core problems in financial markets, with broad applications such as pricing American and Bermudan options. The deep BSDE method [Han, Jentzen and E, PNAS, 115(34):8505-8510, 2018] has shown great power in solving high-dimensional forward-backward stochastic differential equations (FBSDEs) and has inspired many applications. However, the method solves backward stochastic differential equations (BSDEs) in a forward manner, so it cannot be used for optimal stopping problems, which in general require running the BSDE backward in time. To overcome this difficulty, a recent paper [Wang, Chen, Sudjianto, Liu and Shen, arXiv:1807.06622, 2018] proposed the backward deep BSDE method to solve the optimal stopping problem. In this paper, we provide a rigorous theory for the backward deep BSDE method. Specifically, (1) we derive an a posteriori error estimate, i.e., the error of the numerical solution can be bounded by the training loss function; and (2) we give an upper bound on the loss function, which can be made sufficiently small subject to universal approximation. We give two numerical examples whose performance is consistent with the proved theory.
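Data collected by IoT devices are often private and highly diverse across users. Learning therefore requires pre-training a model on available representative data samples, deploying the pre-trained model on IoT devices, and adapting the deployed model on the device with local data. Such on-device adaptation for deep-learning-powered applications demands both data efficiency and memory efficiency. However, existing gradient-based meta-learning schemes fail to support memory-efficient adaptation. To this end, we propose P-Meta, a new meta-learning method that enforces structure-wise partial parameter updates while ensuring fast generalization to unseen tasks. P-Meta is evaluated on few-shot image classification and reinforcement learning tasks against state-of-the-art few-shot adaptation methods.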
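To make the idea of a partial parameter update concrete, the sketch below applies a gradient step only to a selected subset of layers and leaves the rest untouched. It is a generic illustration, not P-Meta itself: the abstract does not say how the updated structures are chosen, so the `update_mask`, the layer names, and the helper `partial_sgd_step` are all assumptions made for this example.

```python
import numpy as np

def partial_sgd_step(params, grads, update_mask, lr=0.1):
    """Apply one SGD step only to the layers selected in update_mask.

    Hypothetical helper: layers that are not selected are returned unchanged,
    so no updated copy of their weights needs to be produced.
    """
    return {
        name: (w - lr * grads[name]) if update_mask.get(name, False) else w
        for name, w in params.items()
    }

# Toy model with three layers (assumed names and shapes).
params = {"conv1": np.ones((2, 2)), "conv2": np.ones((2, 2)), "fc": np.ones(3)}
grads = {name: np.full_like(w, 0.5) for name, w in params.items()}

# Hypothetical selection: freeze conv1, adapt conv2 and fc.
update_mask = {"conv1": False, "conv2": True, "fc": True}

new_params = partial_sgd_step(params, grads, update_mask)
```

In a real training loop the memory saving comes mainly from not storing the activations needed to compute gradients for frozen layers; this sketch shows only the selective update rule.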
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive: CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains robust even when the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
Automatic music generation with artificial intelligence typically requires a large amount of data, which is hard to obtain for many less common genres and musical instruments. To tackle this issue, we present ongoing work and preliminary findings on the possibility for deep models to transfer knowledge from language to music, by fine-tuning large language models pre-trained on a massive text corpus on only hundreds of MIDI files of drum performances. We show that, by doing so, one of the largest state-of-the-art models (GPT3) is capable of generating reasonable drum grooves, while models that are not pre-trained (Transformer) show no such ability beyond naive repetition. Evaluating generated music is a challenging task, and evaluating drum grooves, which has little precedent in the literature, is even more so. Hence, we propose a tailored structural evaluation method and analyze drum grooves produced by GPT3 compared with those played by human professionals, exposing the strengths and weaknesses of such generation by language-to-music transfer. Our findings suggest that language-to-music transfer learning with large language models is viable and promising.