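Score-based generative models achieve state-of-the-art performance on density estimation and generative modeling tasks. These models typically assume that the data geometry is flat, yet recent extensions have been developed to synthesize data living on Riemannian manifolds. Existing methods for accelerating the sampling of diffusion models are typically not applicable in the Riemannian setting, and Riemannian score-based methods have not yet been adapted to the important task of dataset interpolation. To overcome these issues, we introduce the \emph{Riemannian Diffusion Schr\"odinger Bridge}. Our proposed method generalizes the Diffusion Schr\"odinger Bridge introduced in \cite{debortoli2021neurips} to the non-Euclidean setting and extends Riemannian score-based models beyond the first time reversal. We validate our proposed method on synthetic data as well as real Earth and climate data.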
translated by Google Translate
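Computing an optimal transport (OT) coupling between distributions plays an increasingly important role in machine learning. While OT problems can be solved as linear programs, adding an entropic smoothing term yields solvers that are faster, more robust to outliers, differentiable, and easier to parallelize. The Sinkhorn fixed-point algorithm is the cornerstone of these approaches and, as a result, multiple attempts have been made to shorten its runtime using, for instance, annealing, momentum, or acceleration. The premise of this paper is that the \textit{initialization} of the Sinkhorn algorithm has received comparatively little attention, possibly because of two preconceptions. First, since the regularized OT problem is convex, it may not seem worth crafting a tailored initialization, since \textit{any} initialization is guaranteed to work. Second, because the Sinkhorn algorithm is often differentiated in end-to-end pipelines, a data-dependent initialization could bias gradient estimates obtained by unrolling iterations. We challenge this conventional wisdom and show that carefully chosen initializations can yield dramatic speed-ups without biasing gradients, provided the gradients are computed with implicit differentiation. We detail how initializations can be recovered from closed-form or approximate OT solutions, using known results in the 1D or Gaussian settings. We show empirically that these initializations can be used off the shelf, with little to no tuning, and yield consistent speed-ups across a variety of OT problems.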
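To make the role of initialization concrete, the following is a minimal pure-Python sketch of Sinkhorn scaling that accepts an optional starting column scaling. The function name `sinkhorn`, its parameters, and the stopping rule are illustrative assumptions, not the paper's implementation. The warm start in the usage note reuses an already converged scaling as an idealized stand-in for a good initializer; it is not one of the closed-form 1D or Gaussian initializers the abstract describes.

```python
import math

def sinkhorn(a, b, cost, eps=1.0, init_v=None, tol=1e-9, max_iter=10_000):
    """Entropy-regularized OT by Sinkhorn scaling (illustrative sketch).

    Returns (coupling, v, iterations). `init_v` is an optional initial
    column scaling; None gives the usual all-ones start.
    """
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps).
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    v = list(init_v) if init_v is not None else [1.0] * m
    for it in range(1, max_iter + 1):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
        # Columns now match b exactly; stop once rows match a as well.
        row_err = sum(
            abs(u[i] * sum(K[i][j] * v[j] for j in range(m)) - a[i])
            for i in range(n)
        )
        if row_err < tol:
            break
    coupling = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    return coupling, v, it
```

Calling `sinkhorn(a, b, cost)` and then `sinkhorn(a, b, cost, init_v=v)` with the returned `v` shows the effect in its most extreme form: the second call starts at the fixed point and stops almost immediately, whereas the all-ones start needs several sweeps.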
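We consider the problem of simulating diffusion bridges, i.e. diffusion processes that are conditioned to initialize and terminate at two given states. Diffusion bridge simulation has applications in diverse scientific fields and plays a crucial role in statistical inference for discretely observed diffusions. This is known to be a challenging problem that has received much attention over the last two decades. In this work, we first show that the time-reversed diffusion bridge process can be simulated if one can time-reverse the unconditioned diffusion process. We introduce a variational formulation to learn this time reversal, which relies on a score matching method to circumvent intractability. We then consider another iteration of our proposed methodology to approximate the Doob's $h$-transform defining the diffusion bridge process. As our approach is generally applicable under mild assumptions on the underlying diffusion process, it can easily be used to improve the proposal bridge process within existing methods and frameworks. We discuss algorithmic considerations and extensions, and present some numerical results.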
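Progressively applying Gaussian noise transforms complex data distributions into approximately Gaussian ones. Reversing this dynamic defines a generative model. When the forward noising process is given by a stochastic differential equation (SDE), Song et al. (2021) demonstrate how the time-inhomogeneous drift of the associated reverse-time SDE may be estimated using score matching. A limitation of this approach is that the forward-time SDE must be run long enough for the final distribution to be approximately Gaussian. In contrast, solving the Schr\"odinger Bridge (SB) problem, i.e. an entropy-regularized optimal transport problem on path spaces, yields diffusions that generate samples from the data distribution in finite time. We present Diffusion SB (DSB), an original approximation of the Iterative Proportional Fitting (IPF) procedure to solve the SB problem, and provide theoretical analysis along with generative modeling experiments. The first DSB iteration recovers the methodology proposed by Song et al. (2021), with the flexibility of using shorter time intervals, as subsequent DSB iterations reduce the discrepancy between the final-time marginal of the forward (resp. backward) SDE and the prior (resp. data) distribution. Beyond generative modeling, DSB offers a widely applicable computational optimal transport tool as the continuous state-space analogue of the popular Sinkhorn algorithm (Cuturi, 2013).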
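The limitation mentioned above, that the forward SDE must run long enough before its marginal looks Gaussian, can be seen in a small simulation. This is only a sketch under stated assumptions: the Ornstein-Uhlenbeck SDE dX = -X dt + sqrt(2) dW (stationary law N(0, 1)), the Euler-Maruyama discretization, the starting point, and the function name `simulate_forward` are all choices made here for illustration and are not taken from the abstract.

```python
import math
import random

def simulate_forward(x0, t_end, dt=0.01, n_paths=1000, seed=0):
    """Euler-Maruyama simulation of dX = -X dt + sqrt(2) dW from X_0 = x0.

    Returns the empirical (mean, variance) of X at time t_end.
    """
    rng = random.Random(seed)
    n_steps = round(t_end / dt)
    noise_scale = math.sqrt(2.0 * dt)
    finals = []
    for _path in range(n_paths):
        x = x0
        for _step in range(n_steps):
            # Drift pulls x toward 0; the noise term injects Gaussian increments.
            x += -x * dt + noise_scale * rng.gauss(0.0, 1.0)
        finals.append(x)
    mean = sum(finals) / n_paths
    var = sum((x - mean) ** 2 for x in finals) / n_paths
    return mean, var

if __name__ == "__main__":
    for horizon in (0.1, 5.0):
        mean, var = simulate_forward(5.0, horizon)
        print(f"T={horizon}: mean={mean:.3f}, var={var:.3f}")
```

With a short horizon the samples still sit close to the starting point, so the terminal marginal is far from N(0, 1); with a long horizon the mean and variance approach 0 and 1. This gap at short horizons is what the later DSB iterations are described as reducing.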
Designing experiments often requires balancing between learning about the true treatment effects and earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can simultaneously attain optimality and computational efficiency, and it has recently been used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially distributed rewards. We report its performance in simulated 2-armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel GI-modified design shows operating characteristics that are comparable in learning (e.g. statistical power) but substantially better in earning (e.g. direct benefits). This illustrates the potential of designs that use a GI approach to allocate participants to improve participant benefits, increase efficiency, and reduce experimental costs in adaptive multi-armed experiments with exponential rewards.
We introduce Argoverse 2 (AV2) - a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras, and two stereo cameras in addition to lidar point clouds, and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently-sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry - sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.
Extracting complex structures from grid-based data is a common key step in automated medical image analysis. The conventional solution to recovering tree-structured geometries typically involves computing the minimal cost path through intermediate representations derived from segmentation masks. However, this methodology has significant limitations in the context of projective imaging of tree-structured 3D anatomical data such as coronary arteries, since there are often overlapping branches in the 2D projection. In this work, we propose a novel approach to predicting tree connectivity structure which reformulates the task as an optimization problem over individual steps of a recursive process. We design and train a two-stage model which leverages the UNet and Transformer architectures and introduces an image-based prompting technique. Our proposed method achieves compelling results on a pair of synthetic datasets, and outperforms a shortest-path baseline.
Cashews are grown by over 3 million smallholders in more than 40 countries worldwide as a principal source of income. As the third largest cashew producer in Africa, Benin has nearly 200,000 smallholder cashew growers contributing 15% of the country's national export earnings. However, a lack of information on where and how cashew trees grow across the country hinders decision-making that could support increased cashew production and poverty alleviation. By leveraging 2.4-m Planet Basemaps and 0.5-m aerial imagery, newly developed deep learning algorithms, and large-scale ground truth datasets, we successfully produced the first national map of cashew in Benin and characterized the expansion of cashew plantations between 2015 and 2021. In particular, we developed a SpatioTemporal Classification with Attention (STCA) model to map the distribution of cashew plantations, which can fully capture texture information from discriminative time steps during a growing season. We further developed a Clustering Augmented Self-supervised Temporal Classification (CASTC) model to distinguish high-density versus low-density cashew plantations by automatic feature extraction and optimized clustering. Results show that the STCA model has an overall accuracy of 80% and the CASTC model achieved an overall accuracy of 77.9%. We found that the cashew area in Benin has doubled from 2015 to 2021 with 60% of new plantation development coming from cropland or fallow land, while encroachment of cashew plantations into protected areas has increased by 70%. Only half of cashew plantations were high-density in 2021, suggesting high potential for intensification. Our study illustrates the power of combining high-resolution remote sensing imagery and state-of-the-art deep learning algorithms to better understand tree crops in the heterogeneous smallholder landscape.
Grasping is a remarkable ability that animals exercise with their arms and limbs in daily life. The human hand is an especially astonishing multi-fingered tool for precise grasping, which helped humans develop the modern world. Transferring the human grasp to virtual reality and telerobotics is both interesting and challenging. In this work, the authors survey, study, and analyze human hand-grasping behavior to assess the possibilities for haptic grasping in virtual and remote environments. The work focuses on the motion and force analysis of the fingers in human hand-grasping scenarios, and the paper describes the transition from human hand grasping to a tripod haptic grasp model for effective interaction in virtual reality.
Multivariate time series forecasting with hierarchical structure is pervasive in real-world applications. It demands not only predicting each level of the hierarchy, but also reconciling all forecasts to ensure coherency, i.e., the forecasts should satisfy the hierarchical aggregation constraints. Moreover, the disparities in statistical characteristics between levels can be huge, worsened by non-Gaussian distributions and non-linear correlations. To this end, we propose a novel end-to-end hierarchical time series forecasting model, based on a conditioned normalizing flow-based autoregressive transformer with reconciliation, to represent complex data distributions while simultaneously reconciling the forecasts to ensure coherency. Unlike other state-of-the-art methods, we achieve forecasting and reconciliation simultaneously without requiring any explicit post-processing step. In addition, by harnessing the power of deep models, we do not rely on assumptions such as unbiased estimates or Gaussian distributions. Our evaluation experiments are conducted on four real-world hierarchical datasets from different industrial domains (three public ones and a dataset from the application servers of Alipay's data center), and the preliminary results demonstrate the efficacy of our proposed method.
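The coherency requirement can be illustrated with a toy hierarchy. The sketch below only shows what the aggregation constraint means; it is not the end-to-end model from the abstract, and the hierarchy (one total over two bottom series), the summing matrix, and the helper names `aggregate` and `is_coherent` are hypothetical.

```python
def aggregate(summing_matrix, bottom):
    """Build forecasts for every level from bottom-level forecasts."""
    return [sum(w * x for w, x in zip(row, bottom)) for row in summing_matrix]

def is_coherent(summing_matrix, all_levels, n_bottom, tol=1e-9):
    """Check that forecasts for all levels satisfy the aggregation constraints.

    `all_levels` lists the aggregate levels first and the bottom series last.
    """
    bottom = all_levels[-n_bottom:]
    expected = aggregate(summing_matrix, bottom)
    return all(abs(e - y) <= tol for e, y in zip(expected, all_levels))

# Hypothetical hierarchy: total = series_a + series_b.
# Rows are [total, series_a, series_b]; columns are the bottom series.
S = [
    [1, 1],
    [1, 0],
    [0, 1],
]

if __name__ == "__main__":
    independent = [10.0, 4.0, 5.0]      # total forecast made separately
    bottom_up = aggregate(S, [4.0, 5.0])  # total derived from the bottom level
    print(is_coherent(S, independent, n_bottom=2))
    print(is_coherent(S, bottom_up, n_bottom=2))
```

Forecasts produced independently per level generally violate the constraint, which is why a separate reconciliation step is commonly applied; the abstract's contribution is to fold that step into the forecasting model itself.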