For sequence generation, both autoregressive and non-autoregressive models have been developed in recent years. Autoregressive models achieve high generation quality, but their sequential decoding scheme makes decoding slow. Non-autoregressive models accelerate inference with parallel decoding, but their generation quality still needs improvement because multi-modality in the data is difficult to model. To address the multi-modality issue, we propose Diff-Glat, a non-autoregressive model featuring a modality diffusion process and residual glancing training. The modality diffusion process decomposes the modalities and reduces the number of modalities to learn at each transition. Residual glancing sampling further smooths the modality learning procedure. Experiments demonstrate that, without using knowledge distillation data, Diff-Glat achieves superior performance in both decoding efficiency and accuracy compared with the autoregressive Transformer.
Visual odometry is crucial for many robotic tasks such as autonomous exploration and path planning. Despite much progress, existing methods are still not robust enough in environments with dynamic illumination. In this paper, we present AirVO, an illumination-robust and accurate stereo visual odometry system based on point and line features. To be robust to illumination variation, we introduce a learning-based feature extraction and matching method and design a novel VO pipeline that includes feature tracking, triangulation, key-frame selection, and graph optimization. We also employ long line features in the environment to improve the accuracy of the system. Unlike the traditional line processing pipelines in visual odometry systems, we propose an illumination-robust line tracking method in which point feature tracking and the distribution of point and line features are used to match lines. In the experiments, the proposed system is extensively evaluated in environments with dynamic illumination, and the results show that it outperforms state-of-the-art algorithms.
Semantic segmentation usually benefits from global contexts, fine localisation information, multi-scale features, etc. To advance Transformer-based segmenters in these respects, we present a simple yet powerful semantic segmentation architecture, termed IncepFormer. IncepFormer makes two critical contributions. First, it introduces a novel pyramid-structured Transformer encoder that harvests global context and fine localisation features simultaneously. These features are concatenated and fed into a convolution layer for the final per-pixel prediction. Second, IncepFormer integrates an Inception-like architecture with depth-wise convolutions and a light-weight feed-forward module in each self-attention layer, efficiently obtaining rich local multi-scale object features. Extensive experiments on five benchmarks show that IncepFormer is superior to state-of-the-art methods in both accuracy and speed. For example, (1) IncepFormer-S achieves 47.7% mIoU on ADE20K, outperforming the existing best method by 1% while using only half the parameters and fewer FLOPs; (2) IncepFormer-B achieves 82.0% mIoU on the Cityscapes dataset with 39.6M parameters. Code is available at github.com/shendu0321/IncepFormer.
Despite recent progress on trajectory planning of multiple robots and path planning of a single tethered robot, planning of multiple tethered robots to reach their individual targets without entanglements remains a challenging problem. In this paper, we present a complete approach to address this problem. Firstly, we propose a multi-robot tether-aware representation of homotopy, using which we can efficiently evaluate the feasibility and safety of a potential path in terms of (1) the cable length required to reach a target following the path, and (2) the risk of entanglements with the cables of other robots. Then, the proposed representation is applied in a decentralized and online planning framework that includes a graph-based kinodynamic trajectory finder and an optimization-based trajectory refinement, to generate entanglement-free, collision-free and dynamically feasible trajectories. The efficiency of the proposed homotopy representation is compared against existing single and multiple tethered robot planning approaches. Simulations with up to 8 UAVs show the effectiveness of the approach in entanglement prevention and its real-time capabilities. Flight experiments using 3 tethered UAVs verify the practicality of the presented approach.
While feature association to a global map has significant benefits, to keep the computations from growing exponentially, most lidar-based odometry and mapping methods opt to associate features with local maps at one voxel scale. Taking advantage of the fact that surfels (surface elements) at different voxel scales can be organized in a tree-like structure, we propose an octree-based global map of multi-scale surfels that can be updated incrementally. This alleviates the need for recalculating, for example, a k-d tree of the whole map repeatedly. The system can also take input from a single or a number of sensors, reinforcing the robustness in degenerate cases. We also propose a point-to-surfel (PTS) association scheme, continuous-time optimization on PTS and IMU preintegration factors, along with loop closure and bundle adjustment, making a complete framework for Lidar-Inertial continuous-time odometry and mapping. Experiments on public and in-house datasets demonstrate the advantages of our system compared to other state-of-the-art methods. To benefit the community, we release the source code and dataset at https://github.com/brytsknguyen/slict.
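In recent years, WiFi-based smart human sensing technology has been enabled by channel state information (CSI). However, CSI-based sensing systems suffer from performance degradation when deployed in different environments. Existing works address this problem through domain adaptation using large amounts of unlabeled, high-quality data from the new environment, which is usually unavailable in practice. In this paper, we propose a novel augmented, environment-invariant, robust WiFi-based recognition system named Airfi, which approaches the environment dependency problem from a new perspective. Airfi is a novel domain generalization framework that learns the critical part of CSI regardless of the environment and generalizes the model to unseen scenarios, without requiring any data to be collected for adaptation to the new environment. Airfi extracts common features from several training environment settings and minimizes the distribution differences among them. The features are then further augmented to be more robust to the environment. Moreover, the system can be further improved with few-shot learning techniques. Compared with state-of-the-art methods, Airfi is able to work in different environment settings without acquiring any CSI data from the new environment. Experimental results show that our system remains robust in new environments and outperforms the compared systems.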
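The empirical risk minimization approach to data-driven decision making assumes that we can learn a decision rule from data drawn under the same conditions as those in which we want to deploy it. However, in many settings, we may be concerned that our training sample is biased, and that some groups (characterized by either observable or unobservable attributes) may be under- or over-represented relative to the general population; in this case, empirical risk minimization over the training set may fail to yield rules that perform well at deployment. Building on concepts from distributionally robust optimization and sensitivity analysis, we propose a method for learning a decision rule that minimizes the worst-case risk over a family of test distributions whose conditional distribution of the outcome $Y$ given the covariates $X$ differs from the conditional training distribution by at most a constant factor, and whose covariate distribution is absolutely continuous with respect to the covariate distribution of the training data. We apply a result of Rockafellar and Uryasev to show that this problem is equivalent to an augmented convex risk minimization problem. We provide statistical guarantees for learning a robust model using the method of sieves, and propose a deep learning algorithm whose loss function captures our robustness objective. We empirically validate our proposed method in simulations and in a case study using the MIMIC-III dataset.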
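For context, the classical Rockafellar and Uryasev result can be stated as follows. This is the standard textbook form for an integrable loss $Z$ at level $\alpha \in (0, 1)$, not necessarily the exact objective used in the paper; the symbols $Z$, $\alpha$, $\eta$, $P$, and $Q$ are notation assumed here for illustration only.

$$
\mathrm{CVaR}_\alpha(Z)
\;=\; \min_{\eta \in \mathbb{R}} \left\{ \eta + \frac{1}{1-\alpha}\, \mathbb{E}_P\!\left[ (Z - \eta)_+ \right] \right\}
\;=\; \sup \left\{ \mathbb{E}_Q[Z] \;:\; Q \ll P,\ \frac{dQ}{dP} \le \frac{1}{1-\alpha} \right\}
$$

The right-hand side is a worst case over distributions $Q$ whose density ratio to $P$ is bounded by a constant factor, while the middle expression is a convex minimization with one extra scalar variable $\eta$. This is the general sense in which a worst-case risk of this kind can be rewritten as a convex risk minimization problem augmented with an additional variable.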