Efficient energy consumption is crucial for achieving sustainable energy goals in the era of climate change and grid modernization. It is therefore vital to understand how energy is consumed at finer resolutions, such as the household level, in order to plan demand-response events or analyze the impacts of weather, electricity prices, electric vehicles, solar, and occupancy schedules on energy consumption. However, detailed energy-use data that would enable such studies have rarely been available or accessible. In this paper, we release a unique, large-scale, synthetic energy-use dataset for the residential sector across the contiguous United States, covering millions of households. The data comprise hourly energy-use profiles for synthetic households, disaggregated into Thermostatically Controlled Loads (TCLs) and appliance use. The underlying framework is constructed using a bottom-up approach. Diverse open-source surveys and first-principles models are used for end-use modeling. Extensive validation of the synthetic dataset has been conducted through comparisons with reported energy-use data. We present a detailed, open, high-resolution, residential energy-use dataset for the United States.
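For concreteness, here is a minimal Python sketch of how such hourly, end-use-disaggregated profiles might be loaded and aggregated. The file name and column names are hypothetical assumptions, not the released dataset's actual schema.

```python
import pandas as pd

# Assumed layout: one row per (household, hour), with the hourly total
# disaggregated into TCL and appliance end uses, in kWh.
df = pd.read_csv(
    "household_hourly_profiles.csv",  # hypothetical file name
    parse_dates=["timestamp"],
)

# Sanity check: the end-use components should sum to the reported total.
components = df["tcl_kwh"] + df["appliance_kwh"]
assert (components - df["total_kwh"]).abs().max() < 1e-6

# Aggregate to a daily load shape per household, a common first step when
# planning demand-response events.
daily = (
    df.set_index("timestamp")
      .groupby("household_id")["total_kwh"]
      .resample("D")
      .sum()
)
print(daily.head())
```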
While witnessing the noisy intermediate-scale quantum (NISQ) era and beyond, quantum federated learning (QFL) has recently become an emerging field of study. In QFL, each quantum computer or device locally trains its quantum neural network (QNN) with trainable gates, and communicates only these gate parameters over classical channels, without costly quantum communications. Towards enabling QFL under various channel conditions, in this article we develop a depth-controllable architecture of entangled slimmable quantum neural networks (eSQNNs), and propose an entangled slimmable QFL (eSQFL) that communicates the superposition-coded parameters of eSQNNs. Compared to existing depth-fixed QNNs, training the depth-controllable eSQNN architecture is more challenging due to high entanglement entropy and inter-depth interference, which are mitigated by introducing entanglement controlled universal (CU) gates and an in-place fidelity distillation (IPFD) regularizer penalizing inter-depth quantum state differences, respectively. Furthermore, we optimize the superposition-coding power allocation by deriving and minimizing the convergence bound of eSQFL. In an image classification task, extensive simulations corroborate the effectiveness of eSQFL in terms of prediction accuracy, fidelity, and entropy, compared to vanilla QFL, under different channel conditions and various data distributions.
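To make the superposition-coding step concrete, the following Python sketch encodes two depth configurations' gate parameters with a power split and recovers them by successive decoding. The AWGN channel, the power split, and the assumption that the high-power base message decodes correctly are illustrative simplifications, not the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
theta_base = rng.normal(size=64)    # parameters of the low-depth configuration
theta_refine = rng.normal(size=64)  # extra parameters of the high-depth configuration

alpha = 0.8  # power share allocated to the base message
x = np.sqrt(alpha) * theta_base + np.sqrt(1 - alpha) * theta_refine

snr_db = 15.0
y = x + rng.normal(scale=10 ** (-snr_db / 20), size=x.shape)

# Successive decoding: the high-power base message is decoded first; here we
# assume it decodes correctly, then cancel it to expose the refinement message.
theta_base_hat = theta_base
theta_refine_hat = (y - np.sqrt(alpha) * theta_base_hat) / np.sqrt(1 - alpha)

print(f"refinement MSE: {np.mean((theta_refine - theta_refine_hat) ** 2):.4f}")
```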
With the beginning of the noisy intermediate-scale quantum (NISQ) era, quantum neural networks (QNNs) have recently emerged as a solution to problems that classical neural networks cannot solve. Furthermore, the quantum convolutional neural network (QCNN) has attracted attention as the next generation of QNNs, since it can process high-dimensional vector inputs. However, due to the nature of quantum computing, it is difficult for a conventional QCNN to extract a sufficient number of features. Motivated by this, we propose a new version of the QCNN, called the scalable quantum convolutional neural network (SQCNN). Furthermore, using the fidelity of quantum computing, we propose an SQCNN training algorithm, named reverse fidelity training (RF-Train), that maximizes the performance of the SQCNN.
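The quantity RF-Train builds on is the fidelity F = |⟨ψ|φ⟩|² between quantum states. The Python sketch below computes it for toy state vectors and forms a diversity penalty that pushes different filters' states apart; this reading of "reverse fidelity" is inferred from the abstract, not a verified reproduction of the algorithm.

```python
import numpy as np

def fidelity(psi: np.ndarray, phi: np.ndarray) -> float:
    """Fidelity F = |<psi|phi>|^2 between two normalized pure states."""
    return float(np.abs(np.vdot(psi, phi)) ** 2)

def random_state(dim: int, rng) -> np.ndarray:
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
states = [random_state(8, rng) for _ in range(4)]  # states of 4 quantum filters

# Penalize pairwise fidelity so filters extract diverse, distinguishable features.
penalty = sum(
    fidelity(states[i], states[j])
    for i in range(len(states)) for j in range(i + 1, len(states))
)
print(f"diversity penalty: {penalty:.4f}")
```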
This paper presents the Persistent Weisfeiler-Lehman Random walk scheme (abbreviated PWLR) for graph representations, a novel mathematical framework that generates interpretable, low-dimensional representations of graphs with discrete and continuous node features. The proposed scheme effectively combines the normalized Weisfeiler-Lehman procedure, random walks on graphs, and persistent homology. We thereby integrate three distinct properties of graphs, namely local topological features, node degrees, and global topological invariants, while preserving stability under graph perturbations. This generalizes many variants of the Weisfeiler-Lehman procedure, which are primarily used for embedding graphs with discrete node labels. Empirical results show that these representations can be efficiently utilized to produce results comparable to state-of-the-art techniques in classifying graphs with discrete node labels, and enhanced performance in classifying graphs with continuous node features.
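As a reference point for the first ingredient, the Python sketch below performs one refinement step of the discrete-label Weisfeiler-Lehman procedure on a toy graph; the normalization, random walks, and persistent homology of the full PWLR scheme are omitted.

```python
# One WL refinement step: each node's label is replaced by a compressed hash
# of its own label and the sorted multiset of its neighbors' labels.
adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
labels = {0: "A", 1: "A", 2: "B", 3: "A"}

def wl_step(adjacency, labels):
    signatures = {}
    for node, neighbors in adjacency.items():
        signatures[node] = (labels[node], tuple(sorted(labels[n] for n in neighbors)))
    # Compress signatures back to compact integer labels.
    palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
    return {node: palette[sig] for node, sig in signatures.items()}

print(wl_step(adjacency, labels))  # {0: 0, 1: 0, 2: 2, 3: 1}
```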
Federated learning (FL) is a key enabler for efficient communication and computing, leveraging devices' distributed computing capabilities. However, applying FL in practice is challenging due to the local devices' heterogeneous energy, wireless channel conditions, and non-independently and identically distributed (non-IID) data distributions. To cope with these issues, this paper proposes a novel learning framework by integrating FL and width-adjustable slimmable neural networks (SNNs). Integrating FL with SNNs is challenging due to time-varying channel conditions and data distributions. In addition, existing multi-width SNN training algorithms are sensitive to the data distributions across devices, which makes SNNs ill-suited for FL. Motivated by this, we propose a communication- and energy-efficient SNN-based FL (named SlimFL) that jointly utilizes superposition coding (SC) for global model aggregation and superposition training (ST) for updating local models. By applying SC, SlimFL exchanges a superposition of multiple width configurations, which is decoded as many times as possible for a given communication throughput. Leveraging ST, SlimFL aligns the forward propagation of different width configurations while avoiding inter-width interference during backpropagation. We formally prove the convergence of SlimFL. The result reveals that SlimFL is not only communication-efficient but also copes with non-IID data distributions and poor channel conditions, which is further corroborated by data-intensive simulations.
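The following PyTorch sketch shows superposition training as read from the abstract: the half-width and full-width configurations share parameters, both are run forward, and one weighted loss is backpropagated so the shared "base" channels serve both configurations. The widths, model sizes, and equal loss weighting are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

class SlimmableMLP(torch.nn.Module):
    def __init__(self, d_in=20, d_hidden=64, d_out=10):
        super().__init__()
        self.w1 = torch.nn.Parameter(torch.randn(d_hidden, d_in) * 0.1)
        self.w2 = torch.nn.Parameter(torch.randn(d_out, d_hidden) * 0.1)

    def forward(self, x, width_mult=1.0):
        h = int(self.w1.shape[0] * width_mult)      # active hidden channels
        z = F.relu(F.linear(x, self.w1[:h]))        # half-width reuses the
        return F.linear(z, self.w2[:, :h])          # first h channels

model = SlimmableMLP()
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(32, 20)
y = torch.randint(0, 10, (32,))

opt.zero_grad()
# One superposed update: both width configurations contribute to one loss.
loss = 0.5 * F.cross_entropy(model(x, 0.5), y) \
     + 0.5 * F.cross_entropy(model(x, 1.0), y)
loss.backward()
opt.step()
print(float(loss))
```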
Mobile devices are an indispensable source of big data. Federated learning (FL) has great potential to leverage these private data by exchanging locally trained models instead of their raw data. However, mobile devices are often energy-limited and wirelessly connected, and FL cannot flexibly cope with their heterogeneous and time-varying energy capacities and communication throughputs, limiting its adoption. Motivated by these issues, we propose a novel energy- and communication-efficient FL framework, coined SlimFL. To address the heterogeneous energy capacity problem, each device in SlimFL runs a width-adjustable slimmable neural network (SNN). To address the heterogeneous communication throughput problem, each full-width (1.0x) SNN model and its half-width (0.5x) model are superposition-coded before transmission, and successively decoded after reception into the 0.5x or 1.0x model depending on the channel quality. Simulation results show that SlimFL can simultaneously train the 0.5x and 1.0x models with reasonable accuracy and convergence speed, using 2x less communication resources than training these two models separately. Surprisingly, SlimFL achieves even higher accuracy with a lower energy footprint than vanilla FL for poor channels and non-IID data distributions, under which vanilla FL converges slowly.
This paper aims to integrate two synergetic technologies, federated learning (FL) and width-adjustable slimmable neural network (SNN) architectures. FL preserves data privacy by exchanging the locally trained models of mobile devices. By adopting SNNs as local models, FL can flexibly cope with the time-varying energy capacities of mobile devices. Combining FL and SNNs is however non-trivial, particularly under wireless connections with time-varying channel conditions. Furthermore, existing multi-width SNN training algorithms are sensitive to the data distributions across devices, and are thus not applicable to FL. Motivated by this, we propose a communication- and energy-efficient SNN-based FL (named SlimFL) that jointly utilizes superposition coding (SC) for global model aggregation and superposition training (ST) for updating local models. By applying SC, SlimFL exchanges a superposition of multiple width configurations, which is decoded as many times as possible for a given communication throughput. Leveraging ST, SlimFL aligns the forward propagation of different width configurations while avoiding inter-width interference during backpropagation. We formally prove the convergence of SlimFL. The result reveals that SlimFL is not only communication-efficient but also copes with non-IID data distributions and poor channel conditions, which is also corroborated by simulations.
The problem with class-imbalanced data is that the generalization performance of a classifier deteriorates due to the lack of data from minority classes. In this paper, we propose a novel minority oversampling method that augments diverse minority samples by leveraging the rich context of majority classes as background images. To diversify the minority samples, our key idea is to paste a foreground patch from a minority class onto a background image from a majority class with rich context. Our method is simple and can easily be combined with existing long-tailed recognition methods. We demonstrate the effectiveness of the proposed oversampling method through extensive experiments and ablation studies. Without any architectural changes or complex algorithms, our method achieves state-of-the-art performance on various long-tailed classification benchmarks. Our code will be made publicly available at the link.
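The pasting step can be sketched in a few lines of Python: cut a random box from a minority-class image and paste it onto a majority-class background, labeling the result with the minority class. The CutMix-style box sampling below is an assumption; the paper's exact recipe may differ.

```python
import numpy as np

def paste_minority(background: np.ndarray, foreground: np.ndarray,
                   lam: float, rng) -> np.ndarray:
    """Paste a (1 - lam)-area patch of `foreground` onto `background` (H, W, C)."""
    h, w = background.shape[:2]
    cut = np.sqrt(1.0 - lam)
    ch, cw = int(h * cut), int(w * cut)
    cy, cx = rng.integers(h), rng.integers(w)
    y1, y2 = np.clip(cy - ch // 2, 0, h), np.clip(cy + ch // 2, 0, h)
    x1, x2 = np.clip(cx - cw // 2, 0, w), np.clip(cx + cw // 2, 0, w)
    out = background.copy()
    out[y1:y2, x1:x2] = foreground[y1:y2, x1:x2]
    return out  # labeled with the minority (foreground) class

rng = np.random.default_rng(0)
majority_img = rng.random((32, 32, 3))  # context-rich background
minority_img = rng.random((32, 32, 3))  # scarce-class foreground
augmented = paste_minority(majority_img, minority_img, lam=rng.beta(1, 1), rng=rng)
print(augmented.shape)
```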
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
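Below is a minimal Python sketch of the NAIVEATTACK idea as described, assuming a fixed corner patch as the trigger and label flipping toward a target class; the trigger shape, position, and poisoning rate are illustrative assumptions.

```python
import numpy as np

def stamp_trigger(images: np.ndarray, labels: np.ndarray,
                  target_class: int, poison_rate: float, rng) -> None:
    """In-place: add a 3x3 white patch in the bottom-right corner of a
    random fraction of images and flip their labels to the target class."""
    n = len(images)
    idx = rng.choice(n, size=int(poison_rate * n), replace=False)
    images[idx, -3:, -3:, :] = 1.0  # the trigger
    labels[idx] = target_class      # backdoor label flipping

rng = np.random.default_rng(0)
images = rng.random((1000, 32, 32, 3)).astype(np.float32)
labels = rng.integers(0, 10, size=1000)
stamp_trigger(images, labels, target_class=0, poison_rate=0.1, rng=rng)
# The poisoned set is then handed to the dataset distillation procedure;
# DOORPING would instead re-optimize the trigger throughout distillation.
```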
We propose a new causal inference framework to learn causal effects from multiple, decentralized data sources in a federated setting. We introduce an adaptive transfer algorithm that learns the similarities among the data sources by utilizing Random Fourier Features to disentangle the loss function into multiple components, each of which is associated with a data source. The data sources may have different distributions; the causal effects are independently and systematically incorporated. The proposed method estimates the similarities among the sources through transfer coefficients, and hence requires no prior information about the similarity measures. The heterogeneous causal effects can be estimated without sharing the raw training data among the sources, thus minimizing the risk of privacy leakage. We also provide minimax lower bounds to assess the quality of the parameters learned from the disparate sources. The proposed method is empirically shown to outperform the baselines on decentralized data sources with dissimilar distributions.
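The Random Fourier Features ingredient can be sketched directly in Python: the map z(x) below approximates the RBF kernel via k(x, x') ≈ z(x)ᵀz(x'), which is what allows a kernel-based loss to be split into separable per-source components. The dimensions and kernel bandwidth are arbitrary choices for illustration.

```python
import numpy as np

def rff_map(X: np.ndarray, n_features: int, gamma: float, rng) -> np.ndarray:
    """Random Fourier Features for k(x, x') = exp(-gamma * ||x - x'||^2)."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
Z = rff_map(X, n_features=2000, gamma=0.5, rng=rng)

# The inner product of feature maps approximates the exact RBF kernel value.
exact = np.exp(-0.5 * np.sum((X[0] - X[1]) ** 2))
approx = Z[0] @ Z[1]
print(f"exact kernel: {exact:.4f}  RFF approximation: {approx:.4f}")
```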