Gaussian processes (GPs) are highly expressive probabilistic models. A major limitation is their computational complexity: naively, exact GP inference requires $\mathcal{O}(N^3)$ computations, where $N$ denotes the number of modeled points. Current approaches to overcoming this limitation rely on sparse, structured, or stochastic representations of the data or the kernel, respectively, and usually involve nested optimizations to evaluate a GP. We present a new, generative method named Iterative Charted Refinement (ICR) to model GPs with decaying kernels in $\mathcal{O}(N)$ time, without nested optimizations. ICR represents long- as well as short-range correlations by combining views of the modeled locations at varying resolutions with a user-provided coordinate chart. In our experiments with points whose spacings vary over two orders of magnitude, ICR's accuracy is comparable to state-of-the-art GP methods. ICR outperforms existing methods in computational speed by one order of magnitude on both CPU and GPU, and has already been applied successfully to a GP model with 12.2 billion parameters.
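For reference, the $\mathcal{O}(N^3)$ baseline that ICR avoids can be made concrete with a minimal NumPy sketch of textbook exact GP regression; the Cholesky factorization of the $N \times N$ covariance matrix is the cubic-cost step. This illustrates the baseline only, not ICR itself, and the kernel, length scale, and noise level are arbitrary choices for the example.

```python
import numpy as np

def exact_gp_posterior_mean(x_train, y_train, x_test, lengthscale=1.0, noise=1e-3):
    """Textbook exact GP regression; the O(N^3) Cholesky solve is the bottleneck."""
    def k(a, b):
        # squared-exponential (decaying) kernel
        d = a[:, None] - b[None, :]
        return np.exp(-0.5 * (d / lengthscale) ** 2)

    K = k(x_train, x_train) + noise * np.eye(len(x_train))  # N x N covariance
    L = np.linalg.cholesky(K)                                # O(N^3) step
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    return k(x_test, x_train) @ alpha                        # posterior mean at x_test

# toy usage with points whose spacings vary over two orders of magnitude
x = np.sort(np.concatenate([np.linspace(0, 1, 50), np.linspace(10, 100, 50)]))
y = np.sin(x / 10.0)
print(exact_gp_posterior_mean(x, y, np.array([0.5, 55.0])))
```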
Information field theory (IFT), the information theory for fields, is a mathematical framework for signal reconstruction and non-parametric inverse problems. Here, fields denote physical quantities that vary continuously as functions of space (and time), and information theory refers to Bayesian probabilistic logic equipped with the associated entropic information measures. Reconstructing a signal with IFT is a computational problem similar to training a generative neural network (GNN). In this paper, inference in IFT is discussed, along with the cross-fertilization between the numerical variational inference methods used in IFT and in machine learning. The discussion suggests that IFT inference can be regarded as a specific form of artificial intelligence. In contrast to classical neural networks, IFT-based GNNs can operate without pre-training, since expert knowledge is incorporated into their architecture.
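The analogy between IFT inference and training a generative network can be illustrated with a schematic NumPy sketch: a latent standard-normal vector is mapped through an assumed correlation structure to a field, and reconstructing the field from noisy data amounts to optimizing the latent parameters, the same loop one would use to train a generative model on a single data set. The operator, noise level, and step size below are arbitrary stand-ins, not part of the IFT formalism.

```python
import numpy as np

rng = np.random.default_rng(0)

# Schematic generative model: latent standard-normal parameters xi are mapped to a
# spatially correlated field s = A @ xi via an assumed Gaussian smoothing operator.
n_pix = 64
grid = np.arange(n_pix)
A = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / 4.0) ** 2)
A /= A.sum(axis=1, keepdims=True)

truth = A @ rng.standard_normal(n_pix)
noise_std = 0.1
data = truth + noise_std * rng.standard_normal(n_pix)

# Reconstructing the signal = MAP inference of xi by gradient descent, structurally
# the same as training a generative network without any pre-training.
xi = np.zeros(n_pix)
for _ in range(500):
    residual = A @ xi - data
    grad = A.T @ residual / noise_std**2 + xi   # likelihood term + standard-normal prior
    xi -= 0.01 * grad

reconstruction = A @ xi
print("rms reconstruction error:", np.sqrt(np.mean((reconstruction - truth) ** 2)))
```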
Neural networks play an increasingly important role in many scientific disciplines, including physics. Variational autoencoders (VAEs) are neural networks with a probabilistic interpretation that can represent the essential information of a high-dimensional data set in a low-dimensional latent space. In particular, the so-called encoder network, the first part of the VAE, which maps its input onto a position in latent space, additionally provides uncertainty information in terms of a variance around this position. In this work, an extension of the autoencoder architecture is introduced: the FisherNet. In this architecture, the latent-space uncertainty is not generated via an additional information channel in the encoder, but is instead derived from the decoder by means of the Fisher information metric. This architecture has advantages from a theoretical point of view, as it provides a direct uncertainty quantification derived from the model and also accounts for uncertainty cross-correlations. We show experimentally that the FisherNet produces more accurate data reconstructions than a comparable VAE, and that its learning performance also scales noticeably better with the number of latent-space dimensions.
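A minimal sketch of the underlying idea, assuming a Gaussian decoder with fixed noise level: for $x \sim \mathcal{N}(D(z), \sigma^2 I)$, the Fisher information metric in latent space is $F(z) = J(z)^\top J(z)/\sigma^2$, where $J$ is the decoder Jacobian, and its inverse gives a local uncertainty estimate including cross-correlations. The linear decoder below is a toy choice so that the Jacobian is exact; this is not the FisherNet implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy decoder-derived latent uncertainty for a Gaussian decoder x ~ N(D(z), sigma^2 I).
latent_dim, data_dim, sigma = 2, 10, 0.1
W = rng.standard_normal((data_dim, latent_dim))   # linear decoder: D(z) = W @ z, so J = W

def fisher_metric(jacobian, sigma):
    # Fisher information metric in latent space for a fixed-noise Gaussian decoder
    return jacobian.T @ jacobian / sigma**2

F = fisher_metric(W, sigma)
# The inverse Fisher metric acts as a local covariance (uncertainty) estimate,
# including cross-correlations between latent dimensions.
latent_covariance = np.linalg.inv(F)
print(latent_covariance)
```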
The viral load of patients infected with SARS-CoV-2 varies on logarithmic scales and possibly with age. Controversial claims have been made in the literature regarding whether the viral load distribution actually depends on the age of the patients. Such a dependence would have implications for the COVID-19 spreading mechanism, the age-dependent immune system reaction, and thus for policymaking. We hereby develop a method to analyze viral-load distribution data as a function of the patients' age within a flexible, non-parametric, hierarchical, Bayesian, and causal model. The causal nature of the developed reconstruction additionally makes it possible to test for bias in the data. This could be due to, e.g., bias in patient-testing and data collection or systematic errors in the measurement of the viral load. We perform these tests by calculating the Bayesian evidence for each implied possible causal direction. The possibility of testing for bias in data collection and identifying causal directions can be very useful in other contexts as well. For this reason we make our model freely available. When applied to publicly available age and SARS-CoV-2 viral load data, we find a statistically significant increase in the viral load with age, but only for one of the two analyzed datasets. If we consider this dataset, and based on the current understanding of viral load's impact on patients' infectivity, we expect a non-negligible difference in the infectivity of different age groups. This difference is nonetheless too small to justify considering any age group as noninfectious.
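As a toy illustration of evidence-based model comparison (not the hierarchical causal model of this work), the sketch below numerically integrates the likelihood of simulated viral-load data over a prior on an age slope and compares the resulting evidence with that of an age-independent model; all numbers, the fixed intercept, and the known noise level are made up for the example.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# Simulated data: does log10 viral load depend linearly on age, or not?
age = rng.uniform(10, 80, size=200)
log_load = 6.0 + 0.02 * age + rng.normal(0.0, 1.5, size=200)
sigma = 1.5  # assumed known measurement scatter

def log_evidence_with_slope(slope_grid, prior_std=0.05):
    # p(D) = integral of p(D | slope) p(slope) d slope, evaluated on a grid
    log_like = np.array([norm.logpdf(log_load, loc=6.0 + s * age, scale=sigma).sum()
                         for s in slope_grid])
    log_joint = log_like + norm.logpdf(slope_grid, scale=prior_std)
    m = log_joint.max()
    dx = slope_grid[1] - slope_grid[0]
    return m + np.log(np.exp(log_joint - m).sum() * dx)

slopes = np.linspace(-0.2, 0.2, 2001)
log_Z_age = log_evidence_with_slope(slopes)
log_Z_flat = norm.logpdf(log_load, loc=6.0, scale=sigma).sum()  # slope fixed at zero
print("log evidence ratio (age-dependent vs. age-independent):", log_Z_age - log_Z_flat)
```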
As the accuracy of machine learning models increases at a fast rate, so does their demand for energy and compute resources. On a low level, the major part of these resources is consumed by data movement between different memory units. Modern hardware architectures contain a form of fast memory (e.g., cache, registers), which is small, and a slow memory (e.g., DRAM), which is larger but expensive to access. We can only process data that is stored in fast memory, which incurs data movement (input/output-operations, or I/Os) between the two units. In this paper, we provide a rigorous theoretical analysis of the I/Os needed in sparse feedforward neural network (FFNN) inference. We establish bounds that determine the optimal number of I/Os up to a factor of 2 and present a method that uses a number of I/Os within that range. Much of the I/O-complexity is determined by a few high-level properties of the FFNN (number of inputs, outputs, neurons, and connections), but if we want to get closer to the exact lower bound, the instance-specific sparsity patterns need to be considered. Departing from the 2-optimal computation strategy, we show how to reduce the number of I/Os further with simulated annealing. Complementing this result, we provide an algorithm that constructively generates networks with maximum I/O-efficiency for inference. We test the algorithms and empirically verify our theoretical and algorithmic contributions. In our experiments on real hardware we observe speedups of up to 45$\times$ relative to the standard way of performing inference.
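The two-level memory model can be made concrete with a rough back-of-envelope count of the I/Os incurred by naive layer-by-layer streaming of a sparse FFNN: every nonzero weight is read once, and each layer's input and output cross the fast/slow memory boundary. This only illustrates the quantities being bounded, not the paper's 2-optimal analysis; the layer sizes and fast-memory size are arbitrary.

```python
from scipy.sparse import random as sparse_random

def naive_layerwise_ios(layers, fast_memory_size):
    """Count reads/writes if each layer is processed by streaming its nonzeros."""
    ios = 0
    for W in layers:
        n_out, n_in = W.shape
        ios += W.nnz          # every nonzero weight is read once from slow memory
        ios += n_in           # layer input read from slow memory
        ios += n_out          # layer output written back to slow memory
        assert n_out <= fast_memory_size, "partial output tiles are not modeled here"
    return ios

# a toy sparse FFNN: 784 -> 256 -> 10 with random sparsity patterns
layers = [sparse_random(256, 784, density=0.05, format="csr"),
          sparse_random(10, 256, density=0.2, format="csr")]
print("layer-wise streaming I/Os:", naive_layerwise_ios(layers, fast_memory_size=4096))
```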
In this paper a global reactive motion planning framework for robotic manipulators in complex dynamic environments is presented. In particular, the circular field predictions (CFP) planner from Becker et al. (2021) is extended to ensure obstacle avoidance of the whole structure of a robotic manipulator. Towards this end, a motion planning framework is developed that leverages global information about promising avoidance directions from arbitrary configuration space motion planners, resulting in improved global trajectories while reactively avoiding dynamic obstacles and decreasing the required computational power. The resulting motion planning framework is tested in multiple simulations with complex and dynamic obstacles and demonstrates great potential compared to existing motion planning approaches.
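To give a flavor of circular (vortex-like) fields in general, the 2D sketch below steers a point robot around a single obstacle by adding a force perpendicular to the obstacle direction to a goal-attracting term. This generic toy is not the CFP planner and ignores the manipulator structure and configuration-space planning discussed above; all gains and positions are arbitrary.

```python
import numpy as np

def circular_field_step(position, goal, obstacle, gain_goal=1.0, gain_obs=0.5, dt=0.05):
    """One Euler step of a goal-attracting force plus a circulating avoidance force."""
    to_goal = goal - position
    attract = gain_goal * to_goal / (np.linalg.norm(to_goal) + 1e-9)

    from_obs = position - obstacle
    dist = np.linalg.norm(from_obs) + 1e-9
    # rotate the obstacle direction by 90 degrees -> circulating (vortex-like) component
    tangent = np.array([-from_obs[1], from_obs[0]]) / dist
    avoid = gain_obs * tangent / dist**2

    return position + dt * (attract + avoid)

pos, goal, obs = np.array([0.0, 0.0]), np.array([5.0, 0.0]), np.array([2.5, 0.3])
for _ in range(200):
    pos = circular_field_step(pos, goal, obs)
print("final position:", pos)
```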
Federated Learning (FL) is a scheme for collaboratively training Deep Neural Networks (DNNs) with multiple data sources from different clients. Instead of sharing the data, each client trains the model locally, resulting in improved privacy. However, recently so-called targeted poisoning attacks have been proposed that allow individual clients to inject a backdoor into the trained model. Existing defenses against these backdoor attacks either rely on techniques like Differential Privacy to mitigate the backdoor, or analyze the weights of the individual models and apply outlier detection methods that restrict these defenses to certain data distributions. However, adding noise to the models' parameters or excluding benign outliers might also reduce the accuracy of the collaboratively trained model. Additionally, allowing the server to inspect the clients' models creates a privacy risk due to existing knowledge extraction methods. We propose CrowdGuard, a model filtering defense, that mitigates backdoor attacks by leveraging the clients' data to analyze the individual models before the aggregation. To prevent data leaks, the server sends the individual models to secure enclaves, running in client-located Trusted Execution Environments. To effectively distinguish benign and poisoned models, even if the data of different clients are not independently and identically distributed (non-IID), we introduce a novel metric called HLBIM to analyze the outputs of the DNN's hidden layers. We show that the applied significance-based detection algorithm can effectively detect poisoned models, even in non-IID scenarios. We show in our extensive evaluation that CrowdGuard can effectively mitigate targeted poisoning attacks and achieve a True-Positive-Rate of 100% and a True-Negative-Rate of 100% in various scenarios.
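The general idea of validating the submitted models on client data by inspecting hidden-layer outputs can be sketched as follows. The robust outlier score used here is a simple stand-in chosen for illustration; it is not the HLBIM metric, and the toy "models" are just first-layer weight matrices.

```python
import numpy as np

rng = np.random.default_rng(3)

def hidden_activations(W1, x):
    return np.maximum(0.0, x @ W1)            # first hidden layer of a toy MLP (ReLU)

def flag_suspicious(first_layer_weights, local_data, threshold=3.5):
    """Flag models whose hidden-layer responses on local data are statistical outliers."""
    summaries = np.stack([hidden_activations(W, local_data).mean(axis=0)
                          for W in first_layer_weights])
    dists = np.linalg.norm(summaries - np.median(summaries, axis=0), axis=1)
    mad = np.median(np.abs(dists - np.median(dists))) + 1e-9
    robust_z = 0.6745 * (dists - np.median(dists)) / mad
    return robust_z > threshold               # True = model looks poisoned

benign = [rng.normal(0.0, 0.1, (20, 16)) for _ in range(9)]
poisoned = [rng.normal(1.0, 0.1, (20, 16))]   # simulates a heavily manipulated model
local_data = rng.normal(size=(64, 20))
print(flag_suspicious(benign + poisoned, local_data))
```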
Safety-critical systems are typically subjected to hazard analysis before commissioning in order to identify and analyze potentially hazardous system states that may arise during operation. Currently, hazard analysis is mainly based on human reasoning, past experience, and simple tools such as checklists and spreadsheets. Increasing system complexity makes such approaches less and less suitable. Moreover, testing-based hazard analysis is often unsuitable due to high costs or the dangers posed by physical defects. A remedy for this are model-based hazard analysis methods, which rely either on formal models or on simulation models, each with its own benefits and drawbacks. This paper proposes a two-layer approach that combines the benefits of exhaustive analysis using formal methods with detailed analysis using simulation. Unsafe behaviors leading to unsafe states are first synthesized from a formal model of the system using supervisory control theory. The result is then used as input to a simulation, in which detailed analyses are performed using domain-specific risk metrics. Although the proposed approach is generally applicable, this paper demonstrates its benefits on an industrial human-robot collaboration system.
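The exhaustive first layer can be caricatured with a small example: a breadth-first search over a hand-written formal model of a human-robot workcell enumerates action sequences that reach an unsafe state, which would then be handed to the simulation layer for risk scoring. A real implementation would synthesize these behaviors with supervisory control theory; the states and actions below are invented for illustration.

```python
from collections import deque

# Tiny hand-written transition system: (robot state, human state) pairs.
transitions = {
    ("idle", "human_far"):    [("robot_move", ("moving", "human_far"))],
    ("moving", "human_far"):  [("human_enter", ("moving", "human_near")),
                               ("reach_goal", ("idle", "human_far"))],
    ("moving", "human_near"): [("robot_stop", ("idle", "human_near"))],
}
unsafe = {("moving", "human_near")}   # robot moving while the human is near

def unsafe_traces(initial, max_depth=4):
    """Exhaustively enumerate action sequences that reach an unsafe state."""
    traces, queue = [], deque([(initial, [])])
    while queue:
        state, path = queue.popleft()
        if state in unsafe:
            traces.append(path)
            continue
        if len(path) >= max_depth:
            continue
        for action, nxt in transitions.get(state, []):
            queue.append((nxt, path + [action]))
    return traces

# These traces would be fed into the simulation layer for domain-specific risk metrics.
print(unsafe_traces(("idle", "human_far")))
```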
Graph databases (GDBs) enable the processing and analysis of unstructured, complex, rich, and often vast graph datasets. Despite the great significance of GDBs in both academia and industry, little effort has been made to integrate them with the predictive power of graph neural networks (GNNs). In this work, we show how to seamlessly combine nearly any GNN model with the computational capabilities of GDBs. To this end, we observe that most of these systems are based on, or support, a graph data model called the labeled property graph (LPG), in which vertices and edges can have arbitrarily complex sets of labels and properties. We then develop LPG2vec, an encoder that transforms an arbitrary LPG dataset into a representation that can be used directly with a broad class of GNNs, including convolutional, attentional, message-passing, and even higher-order or spectral models. In our evaluation, we show that LPG2vec correctly preserves the rich information represented by LPG labels and properties, and that it increases the accuracy of predictions by up to 34% compared to graphs without LPG labels/properties, regardless of the targeted learning task or the GNN model used. In general, LPG2vec enables combining the predictive power of the most powerful GNNs with the full scope of information encoded in the LPG model, paving the way for neural graph databases, a class of systems in which the vast complexity of the maintained data will benefit from modern and future graph machine learning methods.
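A simplified illustration of the general encoding idea (not LPG2vec itself): one-hot encode the vertex labels and append numeric properties, padding missing properties with a default, to obtain a node feature matrix that any GNN can consume alongside the edge list. The labels and properties below are invented.

```python
import numpy as np

# A tiny labeled property graph (LPG): vertices carry label sets and property maps.
vertices = [
    {"labels": {"Person"},             "properties": {"age": 29.0}},
    {"labels": {"Person", "Employee"}, "properties": {"age": 41.0}},
    {"labels": {"Company"},            "properties": {"founded": 1998.0}},
]

label_vocab = sorted({l for v in vertices for l in v["labels"]})
prop_vocab = sorted({p for v in vertices for p in v["properties"]})

def encode(vertex):
    # one-hot labels, then numeric properties (missing ones padded with 0.0)
    labels = [1.0 if l in vertex["labels"] else 0.0 for l in label_vocab]
    props = [vertex["properties"].get(p, 0.0) for p in prop_vocab]
    return np.array(labels + props)

node_features = np.stack([encode(v) for v in vertices])
print(label_vocab + prop_vocab)
print(node_features)   # ready to be passed to a GNN together with the edge list
```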
Exponentially growing model sizes drive the continued success of deep learning, but they incur prohibitive computation and memory costs. From the algorithmic perspective, model sparsification and quantization have been studied to alleviate the problem. From the architecture perspective, hardware vendors provide Tensor Cores for acceleration. However, it is very challenging to obtain practical speedups from sparse, low-precision matrix operations on Tensor Cores because of strict data-layout requirements and the lack of support for efficiently manipulating low-precision integers. We propose Magicube, a high-performance sparse-matrix library for low-precision integers on Tensor Cores. Magicube supports SpMM and SDDMM, two major sparse operations in deep learning. Experimental results on an NVIDIA A100 GPU show that Magicube achieves on average a 1.44x (up to 2.37x) speedup over vendor-optimized libraries for sparse kernels, and a 1.43x speedup over the state of the art, with comparable accuracy, for end-to-end sparse Transformer inference.
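For readers unfamiliar with the two kernels, the sketch below pins down the reference semantics of SpMM (sparse times dense) and SDDMM (a dense product sampled at the nonzero positions of a sparse matrix) using NumPy/SciPy; it says nothing about Magicube's low-precision Tensor Core implementation, and all shapes are arbitrary.

```python
import numpy as np
from scipy.sparse import random as sparse_random, coo_matrix

rng = np.random.default_rng(4)

A = sparse_random(128, 256, density=0.1, format="csr", random_state=7)  # sparsity pattern
B = rng.standard_normal((256, 64))
C = rng.standard_normal((128, 64))

# SpMM: sparse matrix times dense matrix -> dense matrix
spmm_out = A @ B                                   # shape (128, 64)

# SDDMM: dense product C @ B^T, sampled only at the nonzero positions of A
rows, cols = A.nonzero()
values = np.einsum("ik,ik->i", C[rows], B[cols])   # entries of (C @ B.T)[rows, cols]
sddmm_out = coo_matrix((values, (rows, cols)), shape=A.shape)

print(spmm_out.shape, sddmm_out.nnz)
```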