This paper presents a safety-critical locomotion control framework for quadrupedal robots. Our goal is to enable quadrupedal robots to navigate safely in cluttered environments. To this end, we introduce exponential Discrete Control Barrier Functions (exponential DCBFs) with duality-based obstacle avoidance constraints into a Nonlinear Model Predictive Control (NMPC) with Whole-Body Control (WBC) framework for quadrupedal locomotion control. This allows us to describe the shapes of the robot and the obstacles as polytopes for collision avoidance during locomotion control. In contrast to most prior work, especially work using CBFs, which relies on conservative spherical approximations for obstacle avoidance, this work demonstrates a quadrupedal robot autonomously and safely navigating through very tight spaces in the real world. (Our open-source code is available at github.com/HybridRobotics/quadruped_nmpc_dcbf_duality, and the video is available at youtu.be/p1gSQjwXm1Q.)
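For readers unfamiliar with discrete-time control barrier functions, the kind of constraint involved can be sketched as follows. This is only a minimal illustration, assuming the commonly used discrete-time condition h(x_{k+1}) >= (1 - gamma) * h(x_k) with 0 < gamma <= 1 and a simple distance-based barrier around a circular obstacle. It does not reproduce the polytope and duality-based formulation of the paper, and all names (`barrier`, `dcbf_satisfied`, `trajectory_is_safe`, `gamma`) are hypothetical.

```python
import math


def barrier(state, obstacle_center, obstacle_radius):
    """h(x): distance from the state to the obstacle centre minus its radius.

    Positive values mean the state is outside the obstacle (assumed 2D point robot).
    """
    dx = state[0] - obstacle_center[0]
    dy = state[1] - obstacle_center[1]
    return math.hypot(dx, dy) - obstacle_radius


def dcbf_satisfied(h_now, h_next, gamma):
    """Check the assumed condition h(x_{k+1}) >= (1 - gamma) * h(x_k)."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    return h_next >= (1.0 - gamma) * h_now


def trajectory_is_safe(states, obstacle_center, obstacle_radius, gamma):
    """Verify the condition between every pair of consecutive states."""
    hs = [barrier(s, obstacle_center, obstacle_radius) for s in states]
    return all(dcbf_satisfied(hs[k], hs[k + 1], gamma) for k in range(len(hs) - 1))
```

A smaller `gamma` limits how quickly the barrier value may shrink per step, so the same approach trajectory can pass with `gamma = 0.5` and fail with `gamma = 0.1`. In an MPC setting the condition would enter as a constraint on the predicted states rather than as an after-the-fact check.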
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about common practice or about the bottlenecks the community faces in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, and algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the prospect of prize money played only a minor role (16%). Although a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for it. 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based either on multiple identical models (61%) or on heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Distantly-Supervised Named Entity Recognition (DS-NER) effectively alleviates the data scarcity problem in NER by automatically generating training samples. Unfortunately, distant supervision may induce noisy labels, undermining the robustness of the learned models and restricting their practical application. To relieve this problem, recent works adopt self-training teacher-student frameworks to gradually refine the training labels and improve the generalization ability of NER models. However, we argue that the performance of current self-training frameworks for DS-NER is severely limited by their plain designs, including both inadequate student learning and coarse-grained teacher updating. Therefore, in this paper, we make the first attempt to alleviate these issues by proposing: (1) adaptive teacher learning, which jointly trains two teacher-student networks and considers both consistent and inconsistent predictions between the two teachers, thus promoting comprehensive student learning; and (2) a fine-grained student ensemble, which updates each fragment of the teacher model with a temporal moving average of the corresponding fragment of the student, enhancing consistent predictions on each model fragment against noise. To verify the effectiveness of our proposed method, we conduct experiments on four DS-NER datasets. The experimental results demonstrate that our method significantly surpasses previous SOTA methods.
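The fragment-wise teacher update described above can be illustrated with a short sketch. It assumes that a model is represented as a dictionary mapping fragment names to flat lists of weights and that each fragment has its own momentum value; both the representation and the per-fragment momentum are assumptions made for illustration, not details taken from the paper, and `ema_update` is a hypothetical name.

```python
def ema_update(teacher, student, momentum):
    """Blend each teacher fragment with the matching student fragment.

    teacher, student: dict mapping fragment name -> list of float weights.
    momentum: dict mapping fragment name -> value in [0, 1]; a higher value
    keeps more of the old teacher weights (assumed per-fragment setting).
    """
    updated = {}
    for name, t_weights in teacher.items():
        m = momentum[name]
        s_weights = student[name]
        # Temporal moving average: new_teacher = m * teacher + (1 - m) * student
        updated[name] = [m * t + (1.0 - m) * s for t, s in zip(t_weights, s_weights)]
    return updated
```

Calling `ema_update` once per training step makes each teacher fragment a running average of the student's history for that fragment, which is the general idea behind moving-average teachers.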
Deep learning (DL)-based tomographic SAR (tomoSAR) imaging algorithms are receiving growing attention. Typically, they use an unfolding network to mimic the iterative calculation of classical compressive sensing (CS)-based methods and process each range-azimuth unit individually. However, only one-dimensional features are effectively utilized in this way, and the correlation between adjacent resolution units is ignored. To address this, we propose a new model-data-driven network that achieves tomoSAR imaging based on multi-dimensional features. Guided by the deep unfolding methodology, a two-dimensional deep unfolding imaging network is constructed. On this basis, we add two 2D processing modules, both convolutional encoder-decoder structures, to effectively enhance multi-dimensional features of the imaging scene. To train the proposed multifeature-based imaging network, we construct a tomoSAR simulation dataset consisting entirely of simulated building data. Experiments verify the effectiveness of the model. Compared with the conventional CS-based FISTA method and the DL-based gamma-Net method, our proposed method achieves better completeness while maintaining decent imaging accuracy.
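To make the "iterative calculation" that unfolding networks imitate more concrete, the sketch below runs plain ISTA (the non-accelerated relative of FISTA) on a small real-valued sparse recovery problem. It is an illustration under simplifying assumptions: real-valued data instead of complex SAR measurements, a fixed step size that the caller must choose small enough for convergence, and hypothetical function names. An unfolding network would replace the fixed `step` and `lam` with learned per-layer parameters.

```python
def soft_threshold(v, tau):
    """Shrink every entry of v toward zero by tau (proximal step for the L1 norm)."""
    return [max(abs(x) - tau, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def ista(A, y, lam, step, n_iters):
    """Approximately minimise 0.5 * ||A x - y||^2 + lam * ||x||_1 with ISTA."""
    At = transpose(A)
    x = [0.0] * len(A[0])
    for _ in range(n_iters):
        residual = [yi - ri for yi, ri in zip(y, matvec(A, x))]
        # Gradient step on the data-fidelity term
        grad_step = [xi + step * gi for xi, gi in zip(x, matvec(At, residual))]
        # Shrinkage step that promotes sparsity
        x = soft_threshold(grad_step, step * lam)
    return x
```

Each loop iteration corresponds to one "layer" in an unfolded network, which is why such networks typically have a small, fixed number of layers.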
Benefiting from a relatively large aperture angle in combination with a wide transmitting bandwidth, near-field synthetic aperture radar (SAR) provides a high-resolution image of a target's scattering distribution, i.e., its hot spots. However, the imaging result inevitably suffers degradation from sidelobes, clutter, and noise, hindering information retrieval about the target. To restore the image, current methods make simplifying assumptions; for example, that the point spread function (PSF) is spatially consistent or that the target consists of sparse point scatterers. As a result, they achieve limited restoration performance in terms of the target's shape, especially for complex targets. To address these issues, this work conducts a preliminary study on restoration with recent, promising deep learning inverse techniques. We reformulate the degradation model as a spatially variable complex-convolution model that takes the near-field SAR system response into account. Based on this model, a model-based deep learning network is designed to restore the image. A simulated degraded image dataset built from multiple complex target models is constructed to validate the network. All the images are generated using an electromagnetic simulation tool. Experiments on the dataset demonstrate the network's effectiveness. Compared with current methods, superior performance is achieved in estimating the target's shape and energy.
This work focuses on 3D radar imaging inverse problems. Current methods produce undifferentiated results that suffer from task-dependent information retrieval loss and thus do not meet each task's specific demands well. For example, biased scattering energy may be acceptable for screen imaging but not for scattering diagnosis. To address this issue, we propose a new task-oriented imaging framework. The imaging principle is made task-oriented through an analysis phase that identifies the task's demands. The imaging model is multi-cognition regularized to embed and fulfill these demands. The imaging method is designed to be generalized: couplings between cognitions are decoupled and solved individually with approximation and variable-splitting techniques. Tasks including scattering diagnosis, person screen imaging, and parcel screening imaging are given as examples. Experiments on data from two systems indicate that the proposed framework outperforms current ones in task-dependent information retrieval.
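Face images typically appear at a wide range of visual scales. Existing face representations pursue the capacity to handle scale variation through multi-scale schemes that assemble a finite series of predefined scales. Such multi-shot schemes impose an inference burden, and the predefined scales inevitably deviate from real data. Learning the scale parameters from data and using them for one-shot feature inference is a better solution. To this end, we reform the conv layer by resorting to scale-space theory and achieve two benefits: 1) the conv layer learns a set of scales from the real data distribution, each of which is realized by a conv kernel; 2) the layer automatically highlights features at the proper channels and locations corresponding to the scale of the input pattern and its presence. We then achieve hierarchical scale attention by stacking the reformed layers, building a novel architecture named Scale Attention Conv Neural Network (SCAN-CNN). We apply SCAN-CNN to the face recognition task and push the frontier of SOTA performance. The accuracy gain is more evident when the face images are blurry. Meanwhile, as a single-shot scheme, its inference is more efficient than multi-shot fusion. A set of tools is developed to ensure fast training of SCAN-CNN and zero increase in inference cost compared with a plain CNN.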
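Existing transformer-based image backbones typically propagate feature information in one direction, from lower to higher levels. This may not be ideal, because the localization ability needed to delineate accurate object boundaries is most prominent in the lower, high-resolution feature maps, while the semantics that can disambiguate image signals belonging to one object from those of another typically emerge at higher levels of processing. We present Hierarchical Inter-Level Attention (HILA), an attention-based method that captures bottom-up and top-down updates between features of different levels. HILA extends hierarchical vision transformer architectures by adding local connections between higher- and lower-level features to the backbone encoder. In each iteration, we construct a hierarchy by having higher-level features compete for assignments in order to update the lower-level features belonging to them, iteratively resolving object-part relationships. These improved lower-level features are then used to update the higher-level features in turn. HILA can be integrated into most hierarchical architectures without requiring any changes to the base model. We add HILA to SegFormer and the Swin Transformer and show notable improvements in semantic segmentation accuracy with fewer parameters and FLOPs. Project website and code: https://www.cs.toronto.edu/~garyleung/hila/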
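This paper tackles the problem of robots collaboratively towing a load with cables to a specified goal location while avoiding collisions in real time. The introduction of cables (as opposed to rigid links) enables the robot team to change its intrinsic dimensions through slack/taut switches of the cables, allowing it to travel through narrow spaces. However, this is a challenging problem because of the hybrid mode switches and the dynamic coupling among the multiple robots and the load. Previous attempts at such problems were performed offline and did not consider online obstacle avoidance. In this paper, we introduce a cascaded planning scheme with a parallelized centralized trajectory optimization that handles the hybrid mode switches. We additionally develop a set of decentralized planners, one per robot, which enables our approach to solve the collaborative load manipulation problem online. We develop and demonstrate one of the first collaborative autonomy frameworks able to move a cable-towed load that is too heavy for a single robot through narrow spaces, with real-time feedback and reactive planning in experiments.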
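Can we combine heterogeneous graph structure with text to learn high-quality semantic and behavioral representations? Graph neural networks (GNNs) encode numerical node attributes and graph structure to achieve impressive performance in a variety of supervised learning tasks. Current GNN approaches are challenged by textual features, which typically need to be encoded as numerical vectors before being fed to the GNN, and this may incur some information loss. In this paper, we propose an efficient and effective framework, termed language model GNN (LM-GNN), to jointly train large-scale language models and graph neural networks. The effectiveness of our framework is achieved by stage-wise fine-tuning of the BERT model, first with heterogeneous graph information and then with a GNN model. Several system and design optimizations are proposed to enable scalable and efficient training. LM-GNN accommodates node and edge classification as well as link prediction tasks. We evaluate the LM-GNN framework on different datasets and demonstrate the effectiveness of the proposed approach. LM-GNN provides competitive results in an Amazon query-purchase application.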