智能论文笔记

Evaluating vaccine allocation strategies using simulation-assisted causal modelling

Armin Kekić , Jonas Dehning , Luigi Gresele , Julius von Kügelgen , Viola Priesemann , Bernhard Schölkopf

分类：人工智能

2022-12-14

Early on during a pandemic, vaccine availability is limited, requiring prioritisation of different population groups. Evaluating vaccine allocation is therefore a crucial element of pandemics response. In the present work, we develop a model to retrospectively evaluate age-dependent counterfactual vaccine allocation strategies against the COVID-19 pandemic. To estimate the effect of allocation on the expected severe-case incidence, we employ a simulation-assisted causal modelling approach which combines a compartmental infection-dynamics simulation, a coarse-grained, data-driven causal model and literature estimates for immunity waning. We compare Israel's implemented vaccine allocation strategy in 2021 to counterfactual strategies such as no prioritisation, prioritisation of younger age groups or a strict risk-ranked approach; we find that Israel's implemented strategy was indeed highly effective. We also study the marginal impact of increasing vaccine uptake for a given age group and find that increasing vaccinations in the elderly is most effective at preventing severe cases, whereas additional vaccinations for middle-aged groups reduce infections most effectively. Due to its modular structure, our model can easily be adapted to study future pandemics. We demonstrate this flexibility by investigating vaccine allocation strategies for a pandemic with characteristics of the Spanish Flu. Our approach thus helps evaluate vaccination strategies under the complex interplay of core epidemic factors, including age-dependent risk profiles, immunity waning, vaccine availability and spreading rates.

translated by 谷歌翻译

Probable Domain Generalization via Quantile Risk Minimization

Cian Eastwood , Alexander Robey , Shashank Singh , Julius von Kügelgen , Hamed Hassani , George J. Pappas , Bernhard Schölkopf

分类： (统计)机器学习 | 人工智能 | 计算机视觉 | 机器学习

2022-07-20

域的概括（DG）通过利用来自多个相关分布或域的标记培训数据在看不见的测试分布上表现良好的预测因子。为了实现这一目标，标准公式优化了所有可能域的最差性能。但是，由于最糟糕的转变在实践中的转变极不可能，这通常会导致过度保守的解决方案。实际上，最近的一项研究发现，没有DG算法在平均性能方面优于经验风险最小化。在这项工作中，我们认为DG既不是最坏的问题，也不是一个普通的问题，而是概率问题。为此，我们为DG提出了一个概率框架，我们称之为可能的域概括，其中我们的关键想法是在训练期间看到的分配变化应在测试时告诉我们可能的变化。为了实现这一目标，我们将培训和测试域明确关联为从同一基础元分布中获取的，并提出了一个新的优化问题 - 分数风险最小化（QRM） - 要求该预测因子以很高的概率概括。然后，我们证明了QRM：（i）产生的预测因子，这些预测因素将具有所需概率的新域（给定足够多的域和样本）；（ii）随着概括的所需概率接近一个，恢复因果预测因子。在我们的实验中，我们引入了针对DG的更全面的以分位数评估协议，并表明我们的算法在真实和合成数据上的最先进基准都优于最先进的基准。

translated by 谷歌翻译

Causal Inference Through the Structural Causal Marginal Problem

Luigi Gresele , Julius von Kügelgen , Jonas M. Kübler , Elke Kirschbaum , Bernhard Schölkopf , Dominik Janzing

分类：人工智能 | 机器学习

2022-02-02

我们基于从多个数据集的合并信息介绍了一种反事实推断的方法。我们考虑了统计边际问题的因果重新重新制定：鉴于边际结构因果模型（SCM）的集合在不同但重叠的变量集上，请确定与边际相反一致的关节SCMS集。我们使用响应函数配方对分类SCM进行了形式化这种方法，并表明它降低了允许的边际和关节SCM的空间。因此，我们的工作通过其他变量突出了一种通过其他变量的新模式，与统计数据相反。

translated by 谷歌翻译

RL and Fingerprinting to Select Moving Target Defense Mechanisms for Zero-day Attacks in IoT

Alberto Huertas Celdrán , Pedro Miguel Sánchez Sánchez , Jan von der Assen , Timo Schenk , Gérôme Bovet , Gregorio Martínez Pérez , Burkhard Stiller

分类：人工智能

2022-12-30

Cybercriminals are moving towards zero-day attacks affecting resource-constrained devices such as single-board computers (SBC). Assuming that perfect security is unrealistic, Moving Target Defense (MTD) is a promising approach to mitigate attacks by dynamically altering target attack surfaces. Still, selecting suitable MTD techniques for zero-day attacks is an open challenge. Reinforcement Learning (RL) could be an effective approach to optimize the MTD selection through trial and error, but the literature fails when i) evaluating the performance of RL and MTD solutions in real-world scenarios, ii) studying whether behavioral fingerprinting is suitable for representing SBC's states, and iii) calculating the consumption of resources in SBC. To improve these limitations, the work at hand proposes an online RL-based framework to learn the correct MTD mechanisms mitigating heterogeneous zero-day attacks in SBC. The framework considers behavioral fingerprinting to represent SBCs' states and RL to learn MTD techniques that mitigate each malicious state. It has been deployed on a real IoT crowdsensing scenario with a Raspberry Pi acting as a spectrum sensor. More in detail, the Raspberry Pi has been infected with different samples of command and control malware, rootkits, and ransomware to later select between four existing MTD techniques. A set of experiments demonstrated the suitability of the framework to learn proper MTD techniques mitigating all attacks (except a harmfulness rootkit) while consuming <1 MB of storage and utilizing <55% CPU and <80% RAM.

translated by 谷歌翻译

Mixture of von Mises-Fisher distribution with sparse prototypes

Fabrice Rossi , Florian Barbaro

分类：机器学习 | (统计)机器学习

2022-12-30

Mixtures of von Mises-Fisher distributions can be used to cluster data on the unit hypersphere. This is particularly adapted for high-dimensional directional data such as texts. We propose in this article to estimate a von Mises mixture using a l 1 penalized likelihood. This leads to sparse prototypes that improve clustering interpretability. We introduce an expectation-maximisation (EM) algorithm for this estimation and explore the trade-off between the sparsity term and the likelihood one with a path following algorithm. The model's behaviour is studied on simulated data and, we show the advantages of the approach on real data benchmark. We also introduce a new data set on financial reports and exhibit the benefits of our method for exploratory analysis.

translated by 谷歌翻译

Robust Cross-vendor Mammographic Texture Models Using Augmentation-based Domain Adaptation for Long-term Breast Cancer Risk

Andreas D. Lauritzen , My Catarina von Euler-Chelpin , Elsebeth Lynge , Ilse Vejborg , Mads Nielsen , Nico Karssemeijer , Martin Lillholm

分类：计算机视觉

2022-12-27

The future of population-based breast cancer screening is likely personalized strategies based on clinically relevant risk models. Mammography-based risk models should remain robust to domain shifts caused by different populations and mammographic devices. Modern risk models do not ensure adaptation across vendor-domains and are often conflated to unintentionally rely on both precursors of cancer and systemic/global mammographic information associated with short- and long-term risk, respectively, which might limit performance. We developed a robust, cross-vendor model for long-term risk assessment. An augmentation-based domain adaption technique, based on flavorization of mammographic views, ensured generalization to an unseen vendor-domain. We trained on samples without diagnosed/potential malignant findings to learn systemic/global breast tissue features, called mammographic texture, indicative of future breast cancer. However, training so may cause erratic convergence. By excluding noise-inducing samples and designing a case-control dataset, a robust ensemble texture model was trained. This model was validated in two independent datasets. In 66,607 Danish women with flavorized Siemens views, the AUC was 0.71 and 0.65 for prediction of interval cancers within two years (ICs) and from two years after screening (LTCs), respectively. In a combination with established risk factors, the model's AUC increased to 0.68 for LTCs. In 25,706 Dutch women with Hologic-processed views, the AUCs were not different from the AUCs in Danish women with flavorized views. The results suggested that the model robustly estimated long-term risk while adapting to an unseen processed vendor-domain. The model identified 8.1% of Danish women accounting for 20.9% of ICs and 14.2% of LTCs.

translated by 谷歌翻译

StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation

Jean-Marie Lemercier , Julius Richter , Simon Welker , Timo Gerkmann

分类：机器学习

2022-12-22

Diffusion models have shown a great ability at bridging the performance gap between predictive and generative approaches for speech enhancement. We have shown that they may even outperform their predictive counterparts for non-additive corruption types or when they are evaluated on mismatched conditions. However, diffusion models suffer from a high computational burden, mainly as they require to run a neural network for each reverse diffusion step, whereas predictive approaches only require one pass. As diffusion models are generative approaches they may also produce vocalizing and breathing artifacts in adverse conditions. In comparison, in such difficult scenarios, predictive models typically do not produce such artifacts but tend to distort the target speech instead, thereby degrading the speech quality. In this work, we present a stochastic regeneration approach where an estimate given by a predictive model is provided as a guide for further diffusion. We show that the proposed approach uses the predictive model to remove the vocalizing and breathing artifacts while producing very high quality samples thanks to the diffusion model, even in adverse conditions. We further show that this approach enables to use lighter sampling schemes with fewer diffusion steps without sacrificing quality, thus lifting the computational burden by an order of magnitude. Source code and audio examples are available online (https://uhh.de/inf-sp-storm).

translated by 谷歌翻译

ECG-Based Electrolyte Prediction: Evaluating Regression and Probabilistic Methods

Philipp Von Bachmann , Daniel Gedon , Fredrik K. Gustafsson , Antônio H. Ribeiro , Erik Lampa , Stefan Gustafsson , Johan Sundström , Thomas B. Schön

分类：计算机视觉 | 机器学习

2022-12-21

Objective: Imbalances of the electrolyte concentration levels in the body can lead to catastrophic consequences, but accurate and accessible measurements could improve patient outcomes. While blood tests provide accurate measurements, they are invasive and the laboratory analysis can be slow or inaccessible. In contrast, an electrocardiogram (ECG) is a widely adopted tool which is quick and simple to acquire. However, the problem of estimating continuous electrolyte concentrations directly from ECGs is not well-studied. We therefore investigate if regression methods can be used for accurate ECG-based prediction of electrolyte concentrations. Methods: We explore the use of deep neural networks (DNNs) for this task. We analyze the regression performance across four electrolytes, utilizing a novel dataset containing over 290000 ECGs. For improved understanding, we also study the full spectrum from continuous predictions to binary classification of extreme concentration levels. To enhance clinical usefulness, we finally extend to a probabilistic regression approach and evaluate different uncertainty estimates. Results: We find that the performance varies significantly between different electrolytes, which is clinically justified in the interplay of electrolytes and their manifestation in the ECG. We also compare the regression accuracy with that of traditional machine learning models, demonstrating superior performance of DNNs. Conclusion: Discretization can lead to good classification performance, but does not help solve the original problem of predicting continuous concentration levels. While probabilistic regression demonstrates potential practical usefulness, the uncertainty estimates are not particularly well-calibrated. Significance: Our study is a first step towards accurate and reliable ECG-based prediction of electrolyte concentration levels.

translated by 谷歌翻译

Lessons from Robot-Assisted Disaster Response Deployments by the German Rescue Robotics Center Task Force

Hartmut Surmann , Ivana Kruijff-Korbayova , Kevin Daun , Marius Schnaubelt , Oskar von Stryk , Manuel Patchou , Stefan Boecker , Christian Wietfeld , Jan Quenzel , Daniel Schleich

分类：机器人

2022-12-19

Earthquakes, fire, and floods often cause structural collapses of buildings. The inspection of damaged buildings poses a high risk for emergency forces or is even impossible, though. We present three recent selected missions of the Robotics Task Force of the German Rescue Robotics Center, where both ground and aerial robots were used to explore destroyed buildings. We describe and reflect the missions as well as the lessons learned that have resulted from them. In order to make robots from research laboratories fit for real operations, realistic test environments were set up for outdoor and indoor use and tested in regular exercises by researchers and emergency forces. Based on this experience, the robots and their control software were significantly improved. Furthermore, top teams of researchers and first responders were formed, each with realistic assessments of the operational and practical suitability of robotic systems.

translated by 谷歌翻译

Transformers learn in-context by gradient descent

Johannes von Oswald , Eyvind Niklasson , Ettore Randazzo , João Sacramento , Alexander Mordvintsev , Andrey Zhmoginov , Max Vladymyrov

分类：机器学习 | 人工智能 | 自然语言处理

2022-12-15

Transformers have become the state-of-the-art neural network architecture across numerous domains of machine learning. This is partly due to their celebrated ability to transfer and to learn in-context based on few examples. Nevertheless, the mechanisms by which Transformers become in-context learners are not well understood and remain mostly an intuition. Here, we argue that training Transformers on auto-regressive tasks can be closely related to well-known gradient-based meta-learning formulations. We start by providing a simple weight construction that shows the equivalence of data transformations induced by 1) a single linear self-attention layer and by 2) gradient-descent (GD) on a regression loss. Motivated by that construction, we show empirically that when training self-attention-only Transformers on simple regression tasks either the models learned by GD and Transformers show great similarity or, remarkably, the weights found by optimization match the construction. Thus we show how trained Transformers implement gradient descent in their forward pass. This allows us, at least in the domain of regression problems, to mechanistically understand the inner workings of optimized Transformers that learn in-context. Furthermore, we identify how Transformers surpass plain gradient descent by an iterative curvature correction and learn linear models on deep data representations to solve non-linear regression tasks. Finally, we discuss intriguing parallels to a mechanism identified to be crucial for in-context learning termed induction-head (Olsson et al., 2022) and show how it could be understood as a specific case of in-context learning by gradient descent learning within Transformers.

translated by 谷歌翻译