translated by 谷歌翻译
We provide a unifying approximate dynamic programming framework that applies to a broad variety of problems involving sequential estimation. We consider first the construction of surrogate cost functions for the purposes of optimization, and we focus on the special case of Bayesian optimization, using the rollout algorithm and some of its variations. We then discuss the more general case of sequential estimation of a random vector using optimal measurement selection, and its application to problems of stochastic and adaptive control. We finally consider related search and sequential decoding problems, and a rollout algorithm for the approximate solution of the Wordle and Mastermind puzzles, recently developed in the paper [BBB22].
In 2021, 300 mm of rain, nearly half the average annual rainfall, fell near Catania (Sicily, Italy). Such events took place in just a few hours, with dramatic consequences for the environmental, social, economic, and health systems of the region. This is why detecting extreme rainfall events is a crucial prerequisite for planning actions able to reverse possibly intensified dramatic future scenarios. In this paper, the Affinity Propagation algorithm, a clustering algorithm grounded in machine learning, was applied, to the best of our knowledge for the first time, to identify excess rain events in Sicily. This was made possible by a large, high-frequency dataset that we collected, ranging from 2009 to 2021, which we named RSE (the Rainfall Sicily Extreme dataset). Weather indicators were then employed to validate the results, confirming the presence of recent anomalous rainfall events in eastern Sicily. We believe that easy-to-use and multi-modal data science techniques, such as the one proposed in this study, could give rise to significant improvements in policy-making for successfully counteracting climate change.
Because of the considerable heterogeneity and complexity of the technological landscape, building accurate forecasting models is a challenging endeavor. Due to their high prevalence in many complex systems, S-curves have been a popular forecasting approach in previous work. However, their forecasting performance has not been directly compared to other technology forecasting approaches. Additionally, recent developments in time series forecasting that claim to improve forecasting accuracy are yet to be applied to technological development data. This work addresses both research gaps by comparing the forecasting performance of S-curves to a baseline and by developing an autoencoder approach that employs recent advances in machine learning and time series forecasting. S-curve forecasts largely exhibit a mean absolute percentage error (MAPE) comparable to a simple ARIMA baseline. However, for a minority of emerging technologies, the MAPE increases by two orders of magnitude. Our autoencoder approach improves the MAPE by 13.5% on average over the second-best result. It forecasts established technologies with the same accuracy as the other approaches. However, it is especially strong at forecasting emerging technologies, with a mean MAPE 18% lower than the next best result. Our results imply that a simple ARIMA model is preferable over the S-curve for technology forecasting. Practitioners looking for more accurate forecasts should opt for the presented autoencoder approach.
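For readers unfamiliar with the metric used in this comparison, the following is a minimal sketch of how a MAPE is commonly computed. The function name `mape` and the sample numbers are illustrative assumptions, not taken from the paper, and the sketch assumes no observed value is zero.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Illustrative helper (not from the paper); assumes every actual value is non-zero.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average of |actual - forecast| / |actual| over all points, scaled to percent.
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical series: each forecast is off by 10% of the observed value.
error = mape([100.0, 200.0], [110.0, 180.0])
```

Because each error is divided by the observed value, series that start near zero (as emerging technologies often do) can produce very large MAPE values, which may partly explain why the metric separates emerging from established technologies so sharply.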
We derive a learning framework to generate routing/pickup policies for a fleet of vehicles tasked with servicing stochastically appearing requests on a city map. We focus on policies that 1) give rise to coordination amongst the vehicles, thereby reducing wait times for servicing requests, 2) are non-myopic, considering a-priori unknown potential future requests, and 3) can adapt to changes in the underlying demand distribution. Specifically, we are interested in adapting to fluctuations of actual demand conditions in urban environments, such as on-peak vs. off-peak hours. We achieve this through a combination of (i) online play, a lookahead optimization method that improves the performance of rollout methods via an approximate policy iteration step, and (ii) an offline approximation scheme that allows for adapting to changes in the underlying demand model. In particular, we achieve adaptivity of our learned policy to different demand distributions by quantifying a region of validity using the q-valid radius of a Wasserstein Ambiguity Set. We propose a mechanism for switching the originally trained offline approximation when the current demand is outside the original validity region. In this case, we propose to use an offline architecture, trained on a historical demand model that is closer to the current demand in terms of Wasserstein distance. We learn routing and pickup policies over real taxicab requests in downtown San Francisco with high variability between on-peak and off-peak hours, demonstrating the ability of our method to adapt to real fluctuation in demand distributions. Our numerical results demonstrate that our method outperforms rollout-based reinforcement learning, as well as several benchmarks based on classical methods from the field of operations research.
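The switching mechanism described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: demand is reduced to one-dimensional samples of equal size, the names `wasserstein_1d` and `select_model` are hypothetical, and `radius` merely stands in for the validity radius of the ambiguity set.

```python
def wasserstein_1d(xs, ys):
    """W1 distance between two equal-size empirical samples on the real line.

    For equal-size samples this is the mean absolute difference of the sorted values.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def select_model(current_sample, models, active, radius):
    """Keep the active offline model while current demand lies within `radius`;
    otherwise switch to the historical model closest in Wasserstein distance.

    `models` maps a (hypothetical) model name to the demand sample it was trained on.
    """
    if wasserstein_1d(current_sample, models[active]) <= radius:
        return active
    return min(models, key=lambda name: wasserstein_1d(current_sample, models[name]))


# Hypothetical request counts per interval for two historical demand models.
models = {"off_peak": [1, 2, 3, 4], "on_peak": [6, 7, 8, 9]}
chosen = select_model([5, 7, 8, 10], models, active="off_peak", radius=1.0)
```

In this toy setting the current sample is far from the off-peak model and close to the on-peak one, so the selector switches. A realistic version would compare distributions over locations on the city map rather than scalar counts.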
In this paper we address the solution of the popular Wordle puzzle, using new reinforcement learning methods, which apply more generally to adaptive control of dynamic systems and to classes of Partially Observable Markov Decision Process (POMDP) problems. These methods are based on approximation in value space and the rollout approach, admit a straightforward implementation, and provide improved performance over various heuristic approaches. For the Wordle puzzle, they yield on-line solution strategies that are very close to optimal at relatively modest computational cost. Our methods are viable for more complex versions of Wordle and related search problems, for which an optimal strategy would be impossible to compute. They are also applicable to a wide range of adaptive sequential decision problems that involve an unknown or frequently changing environment whose parameters are estimated on-line.
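To make the rollout idea concrete, here is a minimal sketch on a toy Wordle-like puzzle. It is not the paper's implementation: the word list, the uniform belief over remaining candidates, and the base heuristic (guess the alphabetically first remaining candidate) are all assumptions chosen for brevity.

```python
from collections import Counter


def feedback(guess, secret):
    """Wordle-style pattern: 2 = correct position, 1 = present elsewhere, 0 = absent."""
    result = [0] * len(guess)
    unmatched = Counter()
    for i, (g, s) in enumerate(zip(guess, secret)):
        if g == s:
            result[i] = 2
        else:
            unmatched[s] += 1
    for i, g in enumerate(guess):
        if result[i] == 0 and unmatched[g] > 0:
            result[i] = 1
            unmatched[g] -= 1
    return tuple(result)


def consistent(candidates, guess, pattern):
    """Candidates that would have produced `pattern` for `guess`."""
    return [w for w in candidates if feedback(guess, w) == pattern]


def base_heuristic(candidates):
    """Hypothetical base policy: guess the alphabetically first remaining candidate."""
    return min(candidates)


def guesses_with_base(candidates, secret):
    """Number of guesses the base heuristic needs to find `secret`."""
    count = 0
    while True:
        guess = base_heuristic(candidates)
        count += 1
        if guess == secret:
            return count
        candidates = consistent(candidates, guess, feedback(guess, secret))


def rollout_guess(candidates, guess_pool):
    """One-step lookahead: pick the guess minimizing the expected number of guesses
    when the base heuristic is followed afterwards (uniform belief over candidates)."""
    best_guess, best_cost = None, float("inf")
    for guess in guess_pool:
        total = 0
        for secret in candidates:
            if guess == secret:
                total += 1
            else:
                remaining = consistent(candidates, guess, feedback(guess, secret))
                total += 1 + guesses_with_base(remaining, secret)
        cost = total / len(candidates)
        if cost < best_cost:
            best_guess, best_cost = guess, cost
    return best_guess, best_cost


words = ["crane", "crate", "trace", "react", "caret", "cater"]
first_guess, expected_guesses = rollout_guess(words, words)
base_average = sum(guesses_with_base(words, s) for s in words) / len(words)
```

Since the base heuristic's own first guess is among the guesses evaluated, the rollout choice can never have a higher expected cost than the base heuristic in this sketch, which mirrors the cost improvement property usually associated with rollout.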
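We address the consistency of a kernel ridge regression estimate of the conditional mean embedding (CME), which is an embedding of the conditional distribution of $Y$ given $X$ into a target reproducing kernel Hilbert space (RKHS) $\mathcal{H}_Y$. The CME allows us to take conditional expectations of target RKHS functions, and has been used in nonparametric causal and Bayesian inference. We address the misspecified setting, where the target CME lies in the space of Hilbert-Schmidt operators acting from an input interpolation space between $\mathcal{H}_X$ and $L_2$ to $\mathcal{H}_Y$. This space of operators is shown to be isomorphic to a newly defined vector-valued interpolation space. Using this isomorphism, we derive a novel and adaptive statistical learning rate for the empirical CME estimator under the misspecified setting. Our analysis shows that our rates match the optimal $O(\log n / n)$ rates without assuming $\mathcal{H}_Y$ to be finite dimensional. We further establish a lower bound on the learning rate, which shows that the obtained upper bound is optimal.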
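For orientation, the kernel ridge regression estimate of the CME is usually written in the following standard form. The notation is an assumption introduced here rather than taken from the abstract: $\phi_X$ and $\phi_Y$ denote the feature maps of $\mathcal{H}_X$ and $\mathcal{H}_Y$, $(x_i, y_i)_{i=1}^n$ the sample, $\lambda > 0$ the regularization parameter, $K_X$ the $n \times n$ input Gram matrix, and $k_X(x) = (k_X(x_1, x), \dots, k_X(x_n, x))^\top$.

$$
\hat{C}_\lambda = \operatorname*{arg\,min}_{C} \; \frac{1}{n} \sum_{i=1}^{n} \left\| \phi_Y(y_i) - C \phi_X(x_i) \right\|_{\mathcal{H}_Y}^2 + \lambda \left\| C \right\|_{\mathrm{HS}}^2
$$

$$
\hat{\mu}_{Y \mid X = x} = \hat{C}_\lambda \phi_X(x) = \sum_{i=1}^{n} \beta_i(x) \, \phi_Y(y_i), \qquad \beta(x) = \left( K_X + n \lambda I \right)^{-1} k_X(x)
$$

The estimate of a conditional expectation of $f \in \mathcal{H}_Y$ then takes the form $\langle f, \hat{\mu}_{Y \mid X = x} \rangle_{\mathcal{H}_Y} = \sum_{i=1}^{n} \beta_i(x) f(y_i)$.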
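Closed-loop architectures are widely used in automatic control systems and achieve outstanding performance. However, classical compressive sensing systems employ an open-loop architecture with separate sampling and reconstruction units. Therefore, an iterative compensation recovery method for image compressive sensing (ICRIC) is proposed by introducing a closed-loop framework into traditional compressive sensing systems. The proposed method builds on any existing approach and upgrades its reconstruction performance by adding a negative feedback structure. A theoretical analysis of negative feedback in compressive sensing systems is carried out. An approximate mathematical proof of the effectiveness of the proposed method is also provided. Simulation experiments on more than 3 image datasets show that the proposed method outperforms 10 competing approaches in reconstruction performance. The maximum increment in average peak signal-to-noise ratio is 4.36 dB, and the maximum increment in average structural similarity is 0.034, on one dataset. The proposed method, based on a negative feedback mechanism, can effectively correct the recovery error in existing image compressive sensing systems.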
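One generic way to wrap a negative feedback loop around an existing reconstructor is sketched below. This is an illustration only and not necessarily the scheme used in the paper: the current estimate is re-measured, the measurement error is passed through the same reconstructor, and the result is added back as a compensation term. The crude back-projection reconstructor, the matrix `A`, the step size, and all function names are assumptions.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def crude_reconstruct(A, y, step):
    """Stand-in for an existing open-loop method: scaled back-projection step * A^T y."""
    return [step * v for v in matvec(transpose(A), y)]


def feedback_recover(A, y, step, iterations):
    """Closed-loop recovery: repeatedly reconstruct the measurement residual
    and add it to the current estimate as compensation."""
    x = crude_reconstruct(A, y, step)  # open-loop estimate
    residual_norms = []
    for _ in range(iterations):
        # Feedback path: re-measure the estimate and compare with the observation.
        residual = [yi - pi for yi, pi in zip(y, matvec(A, x))]
        residual_norms.append(sum(r * r for r in residual) ** 0.5)
        # Compensation: reconstruct the error and correct the estimate.
        correction = crude_reconstruct(A, residual, step)
        x = [xi + ci for xi, ci in zip(x, correction)]
    return x, residual_norms


# Hypothetical 2x3 sampling matrix (fewer measurements than unknowns).
A = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
y = [2.0, 3.0]
estimate, residual_norms = feedback_recover(A, y, step=0.3, iterations=30)
```

With this linear stand-in the loop reduces to a Landweber-type iteration, so the measurement residual shrinks at every pass as long as the step size is small enough relative to the sampling matrix. With a learned or nonlinear reconstructor no such guarantee follows from this sketch.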
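We consider some classical optimization problems in path planning and network transport, and we introduce new auction-based algorithms for their optimal and suboptimal solution. The algorithms are based on mathematical ideas related to competitive bidding by persons for objects and the attendant market equilibrium, which underlie auction processes. However, the starting point of our algorithms is different, namely weighted and unweighted path construction in directed graphs, rather than the assignment of persons to objects. The new algorithms have several potential advantages over existing methods: they are empirically faster in some important contexts, such as max-flow, they are well suited to on-line replanning, and they can be adapted to distributed asynchronous operation. Moreover, they allow arbitrary initial prices, without complementary slackness restrictions, and are therefore well suited to take advantage of reinforcement learning methods that use off-line training with data, as well as on-line training during real-time operation. The new algorithms may also find use in reinforcement learning contexts involving approximation, such as multistep lookahead and tree search schemes, and/or rollout algorithms.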