The evolution of wireless communications into 6G and beyond is expected to rely on new machine learning (ML)-based capabilities. These can enable proactive decisions and actions from wireless-network components to sustain quality-of-service (QoS) and user experience. Moreover, new use cases in the area of vehicular and industrial communications will emerge. Specifically in the area of vehicle communication, vehicle-to-everything (V2X) schemes will benefit strongly from such advances. With this in mind, we have conducted a detailed measurement campaign with the purpose of enabling a plethora of diverse ML-based studies. The resulting datasets offer GPS-located wireless measurements across diverse urban environments for both cellular (with two different operators) and sidelink radio access technologies, thus enabling a variety of different studies towards V2X. The datasets are labeled and sampled with a high time resolution. Furthermore, we make the data publicly available with all the necessary information to support the on-boarding of new researchers. We provide an initial analysis of the data showing some of the challenges that ML needs to overcome and the features that ML can leverage, as well as some hints at potential research studies.
translated by 谷歌翻译
A generalized understanding of protein dynamics is an unsolved scientific problem, the solution of which is critical to the interpretation of the structure-function relationships that govern essential biological processes. Here, we approach this problem by constructing coarse-grained molecular potentials based on artificial neural networks and grounded in statistical mechanics. For training, we build a unique dataset of unbiased all-atom molecular dynamics simulations of approximately 9 ms for twelve different proteins with multiple secondary structure arrangements. The coarse-grained models are capable of accelerating the dynamics by more than three orders of magnitude while preserving the thermodynamics of the systems. Coarse-grained simulations identify relevant structural states in the ensemble with comparable energetics to the all-atom systems. Furthermore, we show that a single coarse-grained potential can integrate all twelve proteins and can capture experimental structural features of mutated proteins. These results indicate that machine learning coarse-grained potentials could provide a feasible approach to simulate and understand protein dynamics.
translated by 谷歌翻译
高斯工艺(GPS)是高度表达的概率模型。一个主要的限制是他们的计算复杂性。天真,精确的GP推理需要$ \ MATHCAL {o}(n^3)$计算$ n $表示建模点的数量。当前克服此限制的方法分别依赖于数据或内核的稀疏,结构化或随机表示形式,并且通常涉及嵌套的优化以评估GP。我们提出了一种名为迭代图表改进(ICR)的新的,生成的方法,以在$ \ Mathcal {o}(n)$ ntive ntime ntime ntime ntake ntige内核中衰减内核,而无需嵌套优化的时间。 ICR通过将不同分辨率的建模位置的视图与用户提供的坐标图组合在一起,代表长期和短距离相关性。在我们对两个数量级的间距有所不同的点的实验中,ICR的准确性与最新的GP方法相当。 ICR在CPU和GPU上以一个数量级的计算速度来优于现有方法,并且已经成功地应用于具有122亿美元参数的GP模型。
translated by 谷歌翻译
神经网络在许多科学学科中发挥着越来越大的作用,包括物理学。变形AutoEncoders(VAE)是能够表示在低维潜空间中的高维数据的基本信息,该神经网络具有概率解释。特别是所谓的编码器网络,VAE的第一部分,其将其输入到潜伏空间中的位置,另外在该位置的方差方面提供不确定性信息。在这项工作中,介绍了对AutoEncoder架构的扩展,渔民。在该架构中,借助于Fisher信息度量,不使用编码器中的附加信息信道生成潜在空间不确定性,而是从解码器导出。这种架构具有来自理论观点的优点,因为它提供了从模型的直接不确定性量化,并且还考虑不确定的交叉相关。我们可以通过实验表明,渔民生产比可比较的VAE更准确的数据重建,并且其学习性能也明显较好地缩放了潜伏空间尺寸的数量。
translated by 谷歌翻译
为了实现峰值预测性能,封路计优化(HPO)是机器学习的重要组成部分及其应用。在过去几年中,HPO的有效算法和工具的数量大幅增加。与此同时,社区仍缺乏现实,多样化,计算廉价和标准化的基准。这是多保真HPO方法的情况。为了缩短这个差距,我们提出了HPoBench,其中包括7个现有和5个新的基准家庭,共有100多个多保真基准问题。 HPobench允许以可重复的方式运行该可扩展的多保真HPO基准,通过隔离和包装容器中的各个基准。它还提供了用于计算实惠且统计数据的评估的代理和表格基准。为了展示HPoBench与各种优化工具的广泛兼容性,以及其有用性,我们开展了一个来自6个优化工具的13个优化器的示例性大规模研究。我们在这里提供HPobench:https://github.com/automl/hpobench。
translated by 谷歌翻译
The viral load of patients infected with SARS-CoV-2 varies on logarithmic scales and possibly with age. Controversial claims have been made in the literature regarding whether the viral load distribution actually depends on the age of the patients. Such a dependence would have implications for the COVID-19 spreading mechanism, the age-dependent immune system reaction, and thus for policymaking. We hereby develop a method to analyze viral-load distribution data as a function of the patients' age within a flexible, non-parametric, hierarchical, Bayesian, and causal model. The causal nature of the developed reconstruction additionally allows to test for bias in the data. This could be due to, e.g., bias in patient-testing and data collection or systematic errors in the measurement of the viral load. We perform these tests by calculating the Bayesian evidence for each implied possible causal direction. The possibility of testing for bias in data collection and identifying causal directions can be very useful in other contexts as well. For this reason we make our model freely available. When applied to publicly available age and SARS-CoV-2 viral load data, we find a statistically significant increase in the viral load with age, but only for one of the two analyzed datasets. If we consider this dataset, and based on the current understanding of viral load's impact on patients' infectivity, we expect a non-negligible difference in the infectivity of different age groups. This difference is nonetheless too small to justify considering any age group as noninfectious.
translated by 谷歌翻译
由于它可能对粮食安全,可持续性,资源利用效率,化学处理的降低以及人类努力和产量的优化,因此,自主机器人在农业中的应用正在越来越受欢迎。有了这一愿景,蓬勃发展的研究项目旨在开发一种适应性的机器人解决方案,用于精确耕作,该解决方案结合了小型自动无人驾驶飞机(UAV)(UAV)的空中调查能力以及由多功能无人驾驶的无人接地车(UGV)执行的针对性干预措施。本文概述了该项目中获得的科学和技术进步和结果。我们引入了多光谱感知算法以及空中和地面系统,用于监测农作物密度,杂草压力,作物氮营养状况,并准确地对杂草进行分类和定位。然后,我们介绍了针对我们在农业环境中机器人身份量身定制的导航和映射系统,以及用于协作映射的模块。我们最终介绍了我们在不同的现场条件和不同农作物中实施和测试的地面干预硬件,软件解决方案以及接口。我们描述了一个真正的用例,在该案例中,无人机与UGV合作以监视该领域并进行选择性喷涂而无需人工干预。
translated by 谷歌翻译
The release of ChatGPT, a language model capable of generating text that appears human-like and authentic, has gained significant attention beyond the research community. We expect that the convincing performance of ChatGPT incentivizes users to apply it to a variety of downstream tasks, including prompting the model to simplify their own medical reports. To investigate this phenomenon, we conducted an exploratory case study. In a questionnaire, we asked 15 radiologists to assess the quality of radiology reports simplified by ChatGPT. Most radiologists agreed that the simplified reports were factually correct, complete, and not potentially harmful to the patient. Nevertheless, instances of incorrect statements, missed key medical findings, and potentially harmful passages were reported. While further studies are needed, the initial insights of this study indicate a great potential in using large language models like ChatGPT to improve patient-centered care in radiology and other medical domains.
translated by 谷歌翻译
We consider a semi-supervised $k$-clustering problem where information is available on whether pairs of objects are in the same or in different clusters. This information is either available with certainty or with a limited level of confidence. We introduce the PCCC algorithm, which iteratively assigns objects to clusters while accounting for the information provided on the pairs of objects. Our algorithm can include relationships as hard constraints that are guaranteed to be satisfied or as soft constraints that can be violated subject to a penalty. This flexibility distinguishes our algorithm from the state-of-the-art in which all pairwise constraints are either considered hard, or all are considered soft. Unlike existing algorithms, our algorithm scales to large-scale instances with up to 60,000 objects, 100 clusters, and millions of cannot-link constraints (which are the most challenging constraints to incorporate). We compare the PCCC algorithm with state-of-the-art approaches in an extensive computational study. Even though the PCCC algorithm is more general than the state-of-the-art approaches in its applicability, it outperforms the state-of-the-art approaches on instances with all hard constraints or all soft constraints both in terms of running time and various metrics of solution quality. The source code of the PCCC algorithm is publicly available on GitHub.
translated by 谷歌翻译
Linear partial differential equations (PDEs) are an important, widely applied class of mechanistic models, describing physical processes such as heat transfer, electromagnetism, and wave propagation. In practice, specialized numerical methods based on discretization are used to solve PDEs. They generally use an estimate of the unknown model parameters and, if available, physical measurements for initialization. Such solvers are often embedded into larger scientific models or analyses with a downstream application such that error quantification plays a key role. However, by entirely ignoring parameter and measurement uncertainty, classical PDE solvers may fail to produce consistent estimates of their inherent approximation error. In this work, we approach this problem in a principled fashion by interpreting solving linear PDEs as physics-informed Gaussian process (GP) regression. Our framework is based on a key generalization of a widely-applied theorem for conditioning GPs on a finite number of direct observations to observations made via an arbitrary bounded linear operator. Crucially, this probabilistic viewpoint allows to (1) quantify the inherent discretization error; (2) propagate uncertainty about the model parameters to the solution; and (3) condition on noisy measurements. Demonstrating the strength of this formulation, we prove that it strictly generalizes methods of weighted residuals, a central class of PDE solvers including collocation, finite volume, pseudospectral, and (generalized) Galerkin methods such as finite element and spectral methods. This class can thus be directly equipped with a structured error estimate and the capability to incorporate uncertain model parameters and observations. In summary, our results enable the seamless integration of mechanistic models as modular building blocks into probabilistic models.
translated by 谷歌翻译