Smart Paper Notes

Digital Engineering Transformation with Trustworthy AI towards Industry 4.0: Emerging Paradigm Shifts

Jingwei Huang

Category: Artificial Intelligence

2023-01-03

Digital engineering transformation is a crucial process for the engineering paradigm shifts in the fourth industrial revolution (4IR), and artificial intelligence (AI) is a critical enabling technology in digital engineering transformation. This article discusses the following research questions: What are the fundamental changes in the 4IR? More specifically, what are the fundamental changes in engineering? What is digital engineering? What are the main uncertainties there? What is trustworthy AI? Why is it important today? What are emerging engineering paradigm shifts in the 4IR? What is the relationship between the data-intensive paradigm and digital engineering transformation? What should we do for digitalization? From investigating the pattern of industrial revolutions, this article argues that ubiquitous machine intelligence (uMI) is the defining power brought by the 4IR. Digitalization is a condition to leverage ubiquitous machine intelligence. Digital engineering transformation towards Industry 4.0 has three essential building blocks: digitalization of engineering, leveraging ubiquitous machine intelligence, and building digital trust and security. The engineering design community at large is facing an excellent opportunity to bring the new capabilities of ubiquitous machine intelligence and trustworthy AI principles, as well as digital trust, together in various engineering systems design to ensure the trustworthiness of systems in Industry 4.0.

Translated by Google Translate

Shoupa: An AI System for Early Diagnosis of Parkinson's Disease

Jingwei Li , Ruitian Wu , Tzu-liang Huang , Zian Pan , Ming-chun Huang

Category: Artificial Intelligence

2022-11-28

Parkinson's Disease (PD) is a progressive nervous system disorder that has affected more than 5.8 million people, especially the elderly. Because its symptoms are complex and resemble those of other neurological disorders, early detection requires neurologists or PD specialists, who are not accessible to most older people. Therefore, we integrate smart mobile devices with AI technologies. In this paper, we introduce the framework of our PD early detection system, which combines different tasks that evaluate both motor and non-motor symptoms. With the developed model, we help users detect PD promptly in non-clinical settings and identify their most severe symptoms. The results are expected to be further used for PD rehabilitation guidance and for the detection of other neurological disorders.

Coupled Modeling and Fusion Control for a Multi-modal Deformable Land-air Robot

Xinyu Zhang , Yuanhao Huang , Kangyao Huang , Ziqi Zhao , Jingwei Li , Huaping Liu , Jun Li

Category: Robotics

2022-11-08

This paper introduces a structure-deformable land-air robot that combines excellent ground driving and flying ability with a smooth switching mechanism between the two modes. An elaborate coupled dynamics model of the proposed robot is established, covering the rotors, the chassis, and especially the deformable structures. Furthermore, taking fusion locomotion and complex near-ground situations into consideration, a model-based controller is designed for landing and mode switching under various harsh conditions, in which we realise cooperation between the two fused motion modes. The entire system is implemented both in ADAMS/Simulink simulation and in practice. We conduct experiments under various complex scenarios. The results show that our robot can accomplish land-air switching swiftly and smoothly, and that the designed controller can effectively improve landing flexibility and reliability.

Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding

Hao Wen , Yunze Liu , Jingwei Huang , Bo Duan , Li Yi

Category: Computer Vision

2022-07-30

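This paper proposes a 4D backbone for long-term point cloud video understanding. A typical way to capture spatial context is to use 4D convolutions or transformers without a hierarchy. However, because of camera motion, scene changes, sampling patterns, and the complexity of 4D data, these methods are neither effective nor efficient enough. To address these issues, we use primitive planes as a mid-level representation to capture long-term spatial context in 4D point cloud videos, and we propose a novel hierarchical backbone named Point Primitive Transformer (PPTR), which consists mainly of point transformers and primitive transformers. Extensive experiments show that PPTR outperforms the previous state of the art on different tasks.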

Primitive Graph Learning for Unified Vector Mapping

Lei Wang , Min Dai , Jianan He , Jingwei Huang , Mingwei Sun

Category: Computer Vision

2022-06-28

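Large-scale vector mapping is important for transportation, city planning, surveying, and census work. We propose GraphMapper, a unified framework for end-to-end vector map extraction from satellite images. Our key idea is a novel unified representation for shapes of different topologies, called a "primitive graph", which is a set of shape primitives together with their pairwise relationship matrix. We then cast vector shape prediction, regularization, and topology reconstruction as a single primitive graph learning problem. Specifically, GraphMapper is a generic primitive graph learning network that models global shape context with multi-head attention. An embedding space sorting method is developed for accurate modeling of primitive relationships. We empirically demonstrate the effectiveness of GraphMapper on two challenging mapping tasks: building footprint regularization and road network topology reconstruction. Our model outperforms state-of-the-art methods on both tasks on public benchmarks. All code will be made publicly available.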
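The "primitive graph" representation described above (a set of shape primitives plus a pairwise relationship matrix) can be illustrated with a toy example. The sketch below is not GraphMapper's network. It only shows, under simplifying assumptions, how an unordered set of point primitives and a symmetric 0/1 relation matrix are enough to recover an ordered building outline. The function name `trace_polygon` and the assumption that every vertex has exactly two neighbours are illustrative choices, not details taken from the paper.

```python
def trace_polygon(vertices, relation):
    """Recover an ordered closed ring from unordered vertices.

    vertices: list of (x, y) point primitives in arbitrary order.
    relation: symmetric 0/1 matrix; relation[i][j] == 1 means vertices
              i and j are joined by an edge.
    Assumes a single closed ring in which every vertex has exactly
    two neighbours (a simplification made for this sketch).
    """
    n = len(vertices)
    order = [0]
    prev, cur = None, 0
    while True:
        # Candidate next vertices: connected to cur, but not the one we came from.
        neighbours = [j for j in range(n) if relation[cur][j] and j != prev]
        nxt = neighbours[0]
        if nxt == 0:          # back at the start: the ring is closed
            break
        order.append(nxt)
        prev, cur = cur, nxt
    return [vertices[i] for i in order]


# A unit square whose corners are stored in scrambled order.
square_vertices = [(0, 0), (1, 1), (1, 0), (0, 1)]
square_relation = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
ring = trace_polygon(square_vertices, square_relation)
```

In this toy setting the matrix alone carries the topology, which is why prediction of the primitives and prediction of their relations can be treated as one learning problem.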

MVLayoutNet: 3D layout reconstruction with multi-view panoramas

Zhihua Hu , Bo Duan , Yanfeng Zhang , Mingwei Sun , Jingwei Huang

Category: Computer Vision

2021-12-12

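We present MVLayoutNet, an end-to-end network for holistic 3D reconstruction from multi-view panoramas. Our core contribution is to seamlessly combine learned monocular layout estimation with multi-view stereo (MVS) for accurate reconstruction in both 3D and image space. We jointly train a layout module that produces an initial layout and a novel MVS module that recovers accurate layout geometry. Unlike the standard MVSNet [33], our MVS module uses a newly proposed layout cost volume, which aggregates the multi-view costs at the same depth layer into the corresponding layout elements. We also provide an attention-based scheme that guides the MVS module to focus on structural regions. This design takes both local pixel-level costs and global holistic information into account for better reconstruction. Experiments show that our method outperforms the state of the art in depth RMSE by 21.7% and 20.6% on the 2D-3D-S [1] and ZInD [5] datasets. Finally, our method produces coherent layout geometry that enables the reconstruction of an entire scene.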

Local Implicit Grid Representations for 3D Scenes

Chiyu Max Jiang , Avneesh Sud , Ameesh Makadia , Jingwei Huang , Matthias Nießner , Thomas Funkhouser

Category:

2020-03-19

Figure 1: (a) Training parts from ShapeNet. (b) t-SNE plot of part embeddings. (c) Reconstructing entire scenes with Local Implicit Grids. We learn an embedding of parts from objects in ShapeNet [3] using a part autoencoder with an implicit decoder. We show that this representation of parts is generalizable across object categories, and easily scalable to large scenes. By localizing implicit functions in a grid, we are able to reconstruct entire scenes from points via optimization of the latent grid.

Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation

He Wang , Srinath Sridhar , Jingwei Huang , Julien Valentin , Shuran Song , Leonidas J. Guibas

Category:

2019-01-09

The goal of this paper is to estimate the 6D pose and dimensions of unseen object instances in an RGB-D image. Contrary to "instance-level" 6D pose estimation tasks, our problem assumes that no exact object CAD models are available during either training or testing time. To handle different and unseen object instances in a given category, we introduce Normalized Object Coordinate Space (NOCS), a shared canonical representation for all possible object instances within a category. Our region-based neural network is then trained to directly infer the correspondence from observed pixels to this shared object representation (NOCS) along with other object information such as class label and instance mask. These predictions can be combined with the depth map to jointly estimate the metric 6D pose and dimensions of multiple objects in a cluttered scene. To train our network, we present a new context-aware technique to generate large amounts of fully annotated mixed reality data. To further improve our model and evaluate its performance on real data, we also provide a fully annotated real-world dataset with large environment and instance variation. Extensive experiments demonstrate that the proposed method is able to robustly estimate the pose and size of unseen object instances in real environments while also achieving state-of-the-art performance on standard 6D pose estimation benchmarks.
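To make the step "combined with the depth map to jointly estimate the metric 6D pose and dimensions" more concrete, here is a minimal sketch of one standard way to align two sets of corresponding 3D points: the least-squares similarity fit of Umeyama (1991). The abstract does not say which alignment method the authors use, so treat this as an assumption. The inputs are also assumed: `nocs_points` stands for predicted normalized coordinates, and `camera_points` for the same pixels back-projected from depth. Both arrays here are synthetic.

```python
import numpy as np


def umeyama_similarity(src, dst):
    """Least-squares scale s, rotation R, translation t with dst ~= s * R @ src + t."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # 3x3 cross-covariance between the centered point sets.
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0

    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    scale = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - scale * R @ mu_src
    return scale, R, t


# Synthetic check: points in a unit-sized normalized cube, moved by a known transform.
rng = np.random.default_rng(0)
nocs_points = rng.uniform(-0.5, 0.5, size=(50, 3))
angle = np.deg2rad(30.0)
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle),  np.cos(angle), 0.0],
    [0.0,            0.0,           1.0],
])
s_true = 0.2
t_true = np.array([0.1, -0.3, 1.5])
camera_points = s_true * nocs_points @ R_true.T + t_true

scale, R, t = umeyama_similarity(nocs_points, camera_points)
```

The recovered rotation and translation give the 6D pose, and the recovered scale converts normalized extents into metric size. With real predictions the correspondences are noisy, so a robust wrapper such as RANSAC around this fit would usually be needed.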

Local Learning on Transformers via Feature Reconstruction

Priyank Pathak , Jingwei Zhang , Dimitris Samaras

Category: Computer Vision

2022-12-29

Transformers are becoming increasingly popular due to their superior performance over conventional convolutional neural networks (CNNs). However, transformers usually require a much larger amount of memory to train than CNNs, which prevents their application in many low-resource settings. Local learning, which divides the network into several distinct modules and trains them individually, is a promising alternative to the end-to-end (E2E) training approach to reduce the amount of memory for training and to increase parallelism. This paper is the first to apply local learning to transformers for this purpose. The standard CNN-based local learning method, InfoPro [32], reconstructs the input images for each module in a CNN. However, reconstructing the entire image does not generalize well. In this paper, we propose a new mechanism for each local module, where instead of reconstructing the entire image, we reconstruct its input features, generated from previous modules. We evaluate our approach on 4 commonly used datasets and 3 commonly used decoder structures on Swin-Tiny. The experiments show that our approach outperforms InfoPro-Transformer, the InfoPro with Transformer backbone we introduced, by up to 0.58% on the CIFAR-10, CIFAR-100, STL-10 and SVHN datasets, while using up to 12% less memory. Compared to the E2E approach, we require 36% less GPU memory when the network is divided into 2 modules and 45% less GPU memory when the network is divided into 4 modules.

Precise Location Matching Improves Dense Contrastive Learning in Digital Pathology

Jingwei Zhang , Saarthak Kapse , Ke Ma , Prateek Prasanna , Maria Vakalopoulou , Joel Saltz , Dimitris Samaras

Category: Computer Vision

2022-12-23

Dense prediction tasks such as segmentation and detection of pathological entities hold crucial clinical value in the digital pathology workflow. However, obtaining dense annotations on large cohorts is usually tedious and expensive. Contrastive learning (CL) is thus often employed to leverage large volumes of unlabeled data to pre-train the backbone network. To boost CL for dense prediction, some studies have proposed variations of dense matching objectives in pre-training. However, our analysis shows that employing existing dense matching strategies on histopathology images enforces invariance among incorrect pairs of dense features and, thus, is imprecise. To address this, we propose a precise location-based matching mechanism that utilizes the overlapping information between geometric transformations to precisely match regions in two augmentations. Extensive experiments on two pretraining datasets (TCGA-BRCA, NCT-CRC-HE) and three downstream datasets (GlaS, CRAG, BCSS) highlight the superiority of our method in semantic and instance segmentation tasks. Our method outperforms previous dense matching methods by up to 7.2 % in average precision for detection and 5.6 % in average precision for instance segmentation tasks. Additionally, by using our matching mechanism in the three popular contrastive learning frameworks, MoCo-v2, VICRegL and ConCL, the average precision in detection is improved by 0.7 % to 5.2 % and the average precision in segmentation is improved by 0.7 % to 4.0 %, demonstrating its generalizability.
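The idea of matching regions through the overlap of two geometric transformations can be sketched for the simplest case. The code below assumes two axis-aligned crops of the same image, each encoded into a square feature grid, with no flips, rotations, or scaling beyond the crop itself. The function name `match_cells`, the crop format `(x0, y0, width, height)`, and the grid size are hypothetical choices for illustration, and the abstract does not specify the authors' actual procedure.

```python
def match_cells(crop_a, crop_b, grid=7):
    """Map feature cells of view A to the cells of view B that cover the same pixels.

    crop_a, crop_b: (x0, y0, width, height) in source-image pixel coordinates.
    Returns {(row_a, col_a): (row_b, col_b)} for every cell of A whose center
    falls inside crop B. Cells outside the overlap get no match.
    """
    xa, ya, wa, ha = crop_a
    xb, yb, wb, hb = crop_b
    matches = {}
    for r in range(grid):
        for c in range(grid):
            # Center of cell (r, c) of view A, expressed in source-image pixels.
            px = xa + (c + 0.5) * wa / grid
            py = ya + (r + 0.5) * ha / grid
            # Same location in crop B's normalized [0, 1) coordinates.
            u = (px - xb) / wb
            v = (py - yb) / hb
            if 0.0 <= u < 1.0 and 0.0 <= v < 1.0:
                matches[(r, c)] = (int(v * grid), int(u * grid))
    return matches


# Two 70x70 crops, the second shifted right by one cell width (10 px).
shifted = match_cells((0, 0, 70, 70), (10, 0, 70, 70), grid=7)
# Two crops with no overlap at all.
disjoint = match_cells((0, 0, 70, 70), (100, 100, 70, 70), grid=7)
```

Only pairs returned by such a mapping would be treated as positives in a dense contrastive loss, so features are pulled together only where the two views actually show the same tissue.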
