Smart Paper Notes

Exploiting Expert Knowledge for Assigning Firms to Industries: A Novel Deep Learning Method

Xiaohang Zhao, Xiao Fang, Jing He, Lihua Huang

Categories: Machine Learning | Artificial Intelligence

2022-09-11

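Industry assignment, which assigns firms to industries according to a predefined Industry Classification System (ICS), is fundamental to a large number of critical business practices, ranging from operations and strategic decision making by firms to economic analyses by government agencies. Three types of expert knowledge are essential to effective industry assignment: definition-based knowledge (i.e., the expert definition of each industry), structure-based knowledge (i.e., the structural relationships among industries specified in an ICS), and assignment-based knowledge (i.e., prior firm-industry assignments performed by domain experts). Existing industry assignment methods use only assignment-based knowledge to learn a model that classifies unassigned firms into industries, and they overlook definition-based and structure-based knowledge. Moreover, these methods consider only which industry a firm has been assigned to and ignore the time-specificity of assignment-based knowledge, i.e., when the assignment occurred. To address these limitations, we propose a novel deep learning-based method that not only seamlessly integrates the three types of knowledge for industry assignment but also accounts for the time-specificity of assignment-based knowledge. Methodologically, our method features two innovations: dynamic industry representation and hierarchical assignment. The former represents an industry as a sequence of time-specific vectors by integrating the three types of knowledge through our proposed temporal and spatial aggregation mechanisms. The latter takes industry and firm representations as inputs, computes the probability of assigning a firm to each industry, and assigns the firm to the industry with the highest probability.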


MetaFi: Device-Free Pose Estimation via Commodity WiFi for Metaverse Avatar Simulation

Jianfei Yang, Yunjiao Zhou, He Huang, Han Zou, Lihua Xie

Categories: Computer Vision | Artificial Intelligence

2022-08-22

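An avatar is the representation of a physical user in the virtual world; it can engage in different activities and interact with other objects in the metaverse. Simulating an avatar requires accurate human pose estimation. Although camera-based solutions deliver remarkable performance, they raise privacy concerns and suffer performance degradation under varying illumination, especially in smart homes. In this paper, we propose MetaFi, a WiFi-based, IoT-enabled human pose estimation scheme for metaverse avatar simulation. Specifically, a deep neural network with customized convolutional layers and residual blocks is designed to map channel state information to human pose landmarks. It is trained to learn from annotations produced by an accurate computer vision model, thereby achieving cross-modal supervision. WiFi is ubiquitous and robust to illumination, which makes it a feasible solution for avatar applications in smart homes. Experiments conducted in the real world show that MetaFi achieves very high performance, with a PCK@50 of 95.23%.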
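The PCK@50 figure quoted above is a Percentage of Correct Keypoints score. As a rough illustration, the sketch below counts a predicted landmark as correct when its distance to the reference landmark is at most 0.5 times a reference length. The abstract does not say which normalization length is used, so `ref_length`, the function name `pck`, and the toy coordinates are all assumptions made for illustration.

```python
import math

def pck(pred, truth, ref_length, alpha=0.5):
    """Fraction of keypoints whose error is at most alpha * ref_length.

    pred, truth: equally long lists of (x, y) landmark coordinates.
    ref_length:  normalization length (hypothetical choice, e.g. a body-box size).
    alpha:       threshold ratio; 0.5 corresponds to PCK@50.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and equally long")
    threshold = alpha * ref_length
    hits = sum(
        1
        for (px, py), (tx, ty) in zip(pred, truth)
        if math.hypot(px - tx, py - ty) <= threshold
    )
    return hits / len(pred)

# Toy data: errors are 1, 3, 6 and 0 pixels against a threshold of 5.
truth = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
pred = [(1.0, 0.0), (10.0, 3.0), (16.0, 10.0), (0.0, 10.0)]
score = pck(pred, truth, ref_length=10.0, alpha=0.5)
print(score)  # 0.75: three of the four landmarks fall within the threshold
```

In practice the score is usually averaged over all frames and landmarks of a test set; the single-frame version here only shows the thresholding rule.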


SECP-Net: SE-Connection Pyramid Network of Organ At Risk Segmentation for Nasopharyngeal Carcinoma

Zexi Huang, Lihua Guo, Xin Yang, Sijuan Huang

Categories: Computer Vision

2021-12-28

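Nasopharyngeal carcinoma (NPC) is a malignant tumor. Accurate and automatic segmentation of organs at risk (OAR) in computed tomography (CT) images is clinically significant. In recent years, deep learning models represented by U-Net have been widely applied to medical image segmentation tasks, helping doctors reduce their workload and obtain accurate results more quickly. In OAR segmentation for NPC, the sizes of the organs vary, and some of them are small. Traditional deep neural networks underperform in this setting because they make little use of global and multi-size information. This paper proposes a new SE-Connection Pyramid Network (SECP-Net). SECP-Net extracts global and multi-size information flows using SE-connection (SEC) modules and a pyramid network structure to improve segmentation performance, especially for small organs. SECP-Net also includes an auto-context cascaded network to further improve segmentation performance. Comparative experiments between SECP-Net and other recent methods are conducted on a dataset of head and neck CT images. Five-fold cross-validation is used to evaluate performance with two metrics, Dice and Jaccard similarity. The experimental results show that SECP-Net achieves state-of-the-art performance on this challenging task.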
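For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch of Dice and Jaccard similarity on flattened binary masks. It is not the paper's evaluation code; the function name `dice_and_jaccard`, the empty-mask convention, and the toy masks are assumptions for illustration.

```python
def dice_and_jaccard(pred_mask, true_mask):
    """Return (Dice, Jaccard) for two flat binary masks of equal length."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    pred_size = sum(1 for p in pred_mask if p)
    true_size = sum(1 for t in true_mask if t)
    union = pred_size + true_size - intersection
    if union == 0:
        # Both masks empty: treated as perfect agreement (an assumed convention).
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_size + true_size)
    jaccard = intersection / union
    return dice, jaccard

# Toy masks: 3 predicted voxels, 3 true voxels, 2 shared.
pred = [1, 1, 1, 0, 0, 0]
true = [0, 1, 1, 1, 0, 0]
dice, jaccard = dice_and_jaccard(pred, true)
print(dice, jaccard)
```

The two scores are monotonically related by J = D / (2 - D), so they rank methods identically on a single mask, although their averages over many organs or cases can differ.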


Diff-Glat: Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning

Lihua Qian, Mingxuan Wang, Yang Liu, Hao Zhou

Categories: Natural Language Processing

2022-12-20

For sequence generation, both autoregressive and non-autoregressive models have been developed in recent years. Autoregressive models achieve high generation quality, but their sequential decoding is slow. Non-autoregressive models speed up inference with parallel decoding, but their generation quality still lags because multi-modality in the data is hard to model. To address the multi-modality issue, we propose Diff-Glat, a non-autoregressive model featuring a modality diffusion process and residual glancing training. The modality diffusion process decomposes the modalities and reduces the number of modalities to learn at each transition, and residual glancing sampling further smooths the modality learning procedure. Experiments demonstrate that, without using knowledge distillation data, Diff-Glat achieves superior performance in both decoding efficiency and accuracy compared with the autoregressive Transformer.


AirVO: An Illumination-Robust Point-Line Visual Odometry

Kuan Xu, Yuefan Hao, Chen Wang, Lihua Xie

Categories: Robotics

2022-12-15

Visual odometry is crucial for many robotic tasks such as autonomous exploration and path planning. Despite much progress, existing methods are still not robust enough in environments with dynamic illumination. In this paper, we present AirVO, an illumination-robust and accurate stereo visual odometry system based on point and line features. To be robust to illumination variation, we introduce a learning-based feature extraction and matching method and design a novel VO pipeline that includes feature tracking, triangulation, key-frame selection, and graph optimization. We also employ long line features in the environment to improve the accuracy of the system. Unlike traditional line processing pipelines in visual odometry systems, we propose an illumination-robust line tracking method in which point feature tracking and the distribution of point and line features are used to match lines. In the experiments, the proposed system is extensively evaluated in environments with dynamic illumination, and the results show that it outperforms state-of-the-art algorithms.


IncepFormer: Efficient Inception Transformer with Pyramid Pooling for Semantic Segmentation

Lihua Fu, Haoyue Tian, Xiangping Bryce Zhai, Pan Gao, Xiaojiang Peng

Categories: Computer Vision

2022-12-06

Semantic segmentation usually benefits from global context, fine localisation information, multi-scale features, etc. To advance Transformer-based segmenters in these respects, we present a simple yet powerful semantic segmentation architecture, termed IncepFormer. IncepFormer makes two critical contributions. First, it introduces a novel pyramid-structured Transformer encoder that harvests global context and fine localisation features simultaneously. These features are concatenated and fed into a convolution layer for the final per-pixel prediction. Second, IncepFormer integrates an Inception-like architecture with depth-wise convolutions and a light-weight feed-forward module in each self-attention layer, efficiently obtaining rich local multi-scale object features. Extensive experiments on five benchmarks show that IncepFormer is superior to state-of-the-art methods in both accuracy and speed. For example, 1) IncepFormer-S achieves 47.7% mIoU on ADE20K, outperforming the existing best method by 1% while using only half the parameters and fewer FLOPs; 2) IncepFormer-B achieves 82.0% mIoU on the Cityscapes dataset with 39.6M parameters. Code is available at github.com/shendu0321/IncepFormer.


NEPTUNE: Non-Entangling Planning for Multiple Tethered Unmanned Vehicles

Muqing Cao, Kun Cao, Shenghai Yuan, Thien-Minh Nguyen, Lihua Xie

Categories: Robotics

2022-12-03

Despite recent progress on trajectory planning of multiple robots and path planning of a single tethered robot, planning of multiple tethered robots to reach their individual targets without entanglements remains a challenging problem. In this paper, we present a complete approach to address this problem. Firstly, we propose a multi-robot tether-aware representation of homotopy, using which we can efficiently evaluate the feasibility and safety of a potential path in terms of (1) the cable length required to reach a target following the path, and (2) the risk of entanglements with the cables of other robots. Then, the proposed representation is applied in a decentralized and online planning framework that includes a graph-based kinodynamic trajectory finder and an optimization-based trajectory refinement, to generate entanglement-free, collision-free and dynamically feasible trajectories. The efficiency of the proposed homotopy representation is compared against existing single and multiple tethered robot planning approaches. Simulations with up to 8 UAVs show the effectiveness of the approach in entanglement prevention and its real-time capabilities. Flight experiments using 3 tethered UAVs verify the practicality of the presented approach.


SLICT: Multi-input Multi-scale Surfel-Based Lidar-Inertial Continuous-Time Odometry and Mapping

Thien-Minh Nguyen, Daniel Duberg, Patric Jensfelt, Shenghai Yuan, Lihua Xie

Categories: Robotics

2022-11-07

While feature association to a global map has significant benefits, to keep the computations from growing exponentially, most lidar-based odometry and mapping methods opt to associate features with local maps at one voxel scale. Taking advantage of the fact that surfels (surface elements) at different voxel scales can be organized in a tree-like structure, we propose an octree-based global map of multi-scale surfels that can be updated incrementally. This alleviates the need for recalculating, for example, a k-d tree of the whole map repeatedly. The system can also take input from a single or a number of sensors, reinforcing the robustness in degenerate cases. We also propose a point-to-surfel (PTS) association scheme, continuous-time optimization on PTS and IMU preintegration factors, along with loop closure and bundle adjustment, making a complete framework for Lidar-Inertial continuous-time odometry and mapping. Experiments on public and in-house datasets demonstrate the advantages of our system compared to other state-of-the-art methods. To benefit the community, we release the source code and dataset at https://github.com/brytsknguyen/slict.


AirFi: Empowering WiFi-based Passive Human Gesture Recognition to Unseen Environment via Domain Generalization

Dazhuo Wang, Jianfei Yang, Wei Cui, Lihua Xie, Sumei Sun

Categories: Computer Vision

2022-09-21

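In recent years, channel state information (CSI) has enabled WiFi-based smart human sensing. However, CSI-based sensing systems suffer performance degradation when deployed in different environments. Existing works address this problem through domain adaptation using massive amounts of unlabeled, high-quality data from the new environment, which is usually unavailable in practice. In this paper, we propose AirFi, a novel augmented, environment-invariant, robust WiFi gesture recognition system that tackles environment dependency from a new perspective. AirFi is a novel domain generalization framework that learns the critical part of CSI regardless of the environment and generalizes the model to unseen scenarios, without requiring any data collection for adaptation to the new environment. AirFi extracts common features from several training environment settings and minimizes the distribution differences among them. The features are further augmented to make them more robust to the environment. Moreover, the system can be further improved with few-shot learning techniques. Compared with state-of-the-art methods, AirFi works across different environment settings without acquiring any CSI data from the new environment. The experimental results show that our system remains robust in new environments and outperforms the compared systems.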


Learning from a Biased Sample

Roshni Sahoo, Lihua Lei, Stefan Wager

Categories: Machine Learning | Machine Learning (Statistics)

2022-09-05

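The empirical risk minimization approach to data-driven decision making assumes that we can learn a decision rule from training data drawn under the same conditions as those in which we want to deploy it. However, in a number of settings, we may be concerned that our training sample is biased, and that some groups (characterized by either observable or unobservable attributes) may be under- or over-represented relative to the general population; in this setting, empirical risk minimization over the training set may fail to yield rules that perform well at deployment. Building on concepts from distributionally robust optimization and sensitivity analysis, we propose a method for learning a decision rule that minimizes the worst-case risk incurred under a family of test distributions whose conditional distributions of outcomes $Y$ given covariates $X$ differ from the conditional training distribution by at most a constant factor, and whose covariate distributions are absolutely continuous with respect to the covariate distribution of the training data. We apply a result of Rockafellar and Uryasev to show that this problem is equivalent to an augmented convex risk minimization problem. We give statistical guarantees for learning a robust model using the method of sieves and propose a deep learning algorithm whose loss function captures our robustness target. We empirically validate the proposed method in simulations and in a case study with the MIMIC-III dataset.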
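The abstract does not say which Rockafellar and Uryasev result is applied. One well-known result of theirs is the variational formula for conditional value-at-risk, $\mathrm{CVaR}_\alpha(L) = \min_\eta \{ \eta + \frac{1}{1-\alpha} E[(L - \eta)_+] \}$, which rewrites a worst-case tail average as a convex minimization over one extra scalar $\eta$. Assuming that this is the kind of result meant, the sketch below evaluates the formula on an empirical sample of losses. The function name `empirical_cvar` and the toy losses are hypothetical, and this is not the paper's learning algorithm.

```python
def empirical_cvar(losses, alpha):
    """Empirical CVaR at level alpha via the Rockafellar-Uryasev formula.

    Minimizes eta + mean(max(loss - eta, 0)) / (1 - alpha) over eta.
    The objective is convex and piecewise linear with breakpoints at the
    sample values, so it is enough to try each sample value as eta.
    """
    if not losses or not 0.0 <= alpha < 1.0:
        raise ValueError("need a non-empty sample and 0 <= alpha < 1")
    n = len(losses)

    def objective(eta):
        excess = sum(max(loss - eta, 0.0) for loss in losses)
        return eta + excess / ((1.0 - alpha) * n)

    return min(objective(eta) for eta in losses)

# Toy sample: at alpha = 0.8 the worst 20% of losses are 9 and 10.
losses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
cvar = empirical_cvar(losses, alpha=0.8)
print(cvar)  # approximately 9.5, the mean of the two largest losses
```

Because the minimization over $\eta$ is convex, it can be carried out jointly with the model parameters, which is one way an "augmented" convex risk minimization problem can arise.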
