Smart Paper Notes

Image-specific Convolutional Kernel Modulation for Single Image Super-resolution

Yuanfei Huang , Jie Li , Yanting Hu , Xinbo Gao , Hua Huang

Category: Computer Vision

2021-11-16

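Recently, deep-learning-based super-resolution methods have achieved excellent performance, but they mainly focus on training a single generalized deep network by feeding it numerous samples. Intuitively, however, each image has its own representation and should be handled by an adaptive model. To address this, we propose a novel image-specific convolutional kernel modulation (IKM), which exploits the global contextual information of an image or feature to generate attention weights that adaptively modulate the convolutional kernels. IKM outperforms vanilla convolution and several existing attention mechanisms when embedded into state-of-the-art architectures, without any additional parameters. In particular, to optimize IKM in mini-batch training, we introduce an image-specific optimization (ISO) algorithm, which is more effective than conventional mini-batch SGD optimization. Furthermore, we investigate the effect of IKM on state-of-the-art architectures and exploit a new backbone with U-style residual learning and hourglass dense block learning, termed the U-Hourglass Dense Network (U-HDN), an architecture that, both theoretically and experimentally, maximizes the effectiveness of IKM. Extensive experiments on single image super-resolution show that the proposed methods achieve superior performance over existing methods. Code is available at github.com/yuanfeihuang/ikm.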
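
The kernel-modulation idea described above can be illustrated with a minimal sketch. The abstract does not say how the attention weights are computed, so the choice below (a sigmoid of each channel's global average, applied to a 1x1 convolution) is an assumption made purely for illustration, and all function names are hypothetical; this is not the authors' implementation.

```python
import math
from typing import List

def global_context_attention(features: List[List[float]]) -> List[float]:
    """One weight per input channel: sigmoid of that channel's global average.

    Assumption: the paper's actual attention function is not given in the abstract.
    """
    return [1.0 / (1.0 + math.exp(-sum(ch) / len(ch))) for ch in features]

def modulate_kernel(kernel: List[List[float]], attention: List[float]) -> List[List[float]]:
    """Scale kernel[out][in] by the attention weight of input channel `in`."""
    return [[w * a for w, a in zip(row, attention)] for row in kernel]

def conv1x1(kernel: List[List[float]], features: List[List[float]]) -> List[List[float]]:
    """1x1 convolution: mix the channels independently at every position."""
    positions = len(features[0])
    return [
        [sum(w * ch[p] for w, ch in zip(row, features)) for p in range(positions)]
        for row in kernel
    ]

# Toy input: 2 channels, 3 positions each; 1 output channel.
features = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]]
kernel = [[1.0, 1.0]]

attention = global_context_attention(features)  # depends on this input only
modulated = modulate_kernel(kernel, attention)   # same shape, no new parameters
output = conv1x1(modulated, features)
```

The modulated kernel has the same shape as the original one, so the sketch adds no learnable parameters; only the effective kernel changes from input to input.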


Transitional Learning: Exploring the Transition States of Degradation for Blind Super-resolution

Yuanfei Huang , Jie Li , Yanting Hu , Xinbo Gao , Hua Huang

Category: Computer Vision

2021-03-29

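Because they depend heavily on iterative estimation of the degradation prior or on optimizing the model from scratch, existing blind super-resolution (SR) methods are generally time-consuming and less effective: the degradation estimate starts from a blind initialization and lacks interpretable degradation priors. To address this, this paper proposes a transitional learning method for blind SR that uses an end-to-end network without any additional iterations at inference, and explores an effective representation for unknown degradations. First, we analyze and demonstrate the transitionality of degradations as interpretable prior information for indirectly inferring the unknown degradation model, including the widely used additive and convolutive degradations. We then propose a novel Transitional Learning method for blind Super-Resolution (TLSR), which adaptively infers a transitional transformation function to handle unknown degradations without any iterative operations at inference. Specifically, the end-to-end TLSR network consists of a degree of transitionality (DoT) estimation network, a homogeneous feature extraction network, and a transitional learning module. Quantitative and qualitative evaluations on blind SR tasks show that the proposed TLSR achieves superior performance with lower complexity than state-of-the-art blind SR methods. The code is available at github.com/yuanfeihuang/tlsr.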


Joint Open Knowledge Base Canonicalization and Linking

Yinan Liu , Wei Shen , Yuanfei Wang , Jianyong Wang , Zhenglu Yang , Xiaojie Yuan

Category: Natural Language Processing | Artificial Intelligence

2022-12-02

Open Information Extraction (OIE) methods extract a large number of OIE triples (noun phrase, relation phrase, noun phrase) from text, which compose large Open Knowledge Bases (OKBs). However, noun phrases (NPs) and relation phrases (RPs) in OKBs are not canonicalized and often appear in different paraphrased textual variants, which leads to redundant and ambiguous facts. To address this problem, there are two related tasks: OKB canonicalization (i.e., converting NPs and RPs to canonicalized forms) and OKB linking (i.e., linking NPs and RPs with their corresponding entities and relations in a curated Knowledge Base, e.g., DBPedia). These two tasks are tightly coupled, and one task can benefit significantly from the other. However, they have been studied in isolation so far. In this paper, we explore the task of joint OKB canonicalization and linking for the first time, and propose a novel framework, JOCL, based on a factor graph model to make them reinforce each other. JOCL is flexible enough to combine different signals from both tasks and can be extended to fit new signals. A thorough experimental study over two large-scale OIE triple data sets shows that our framework outperforms all the baseline methods for the task of OKB canonicalization (OKB linking) in terms of average F1 (accuracy).


Model Predictive Robustness of Signal Temporal Logic Predicates

Yuanfei Lin , Haoxuan Li , Matthias Althoff

Category: Robotics | Artificial Intelligence | Machine Learning

2022-09-16

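The robustness of signal temporal logic not only assesses whether a signal adheres to a specification but also provides a measure of how much a formula is satisfied or violated. The calculation of robustness is based on evaluating the robustness of the underlying predicates. However, the robustness of predicates is usually defined in a model-free way, i.e., without including the system dynamics. Moreover, precisely defining the robustness of complicated predicates is often nontrivial. To address these issues, we propose the notion of model predictive robustness, which, by considering model-based predictions, provides a more systematic way of evaluating robustness than previous approaches. In particular, we use Gaussian process regression to learn the robustness from precomputed predictions so that robustness values can be computed efficiently online. We evaluate our approach on an autonomous driving use case, using predicates from formalized traffic rules on a recorded dataset, which highlights the advantage of our approach over traditional approaches in terms of expressiveness. By incorporating our robustness definition into a trajectory planner, autonomous vehicles obey traffic rules more robustly than the human drivers in the dataset.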
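
For context, the conventional model-free robustness that the abstract contrasts against can be sketched as follows. This shows only the standard quantitative semantics for a threshold predicate and the "always" and "eventually" operators on a finite trace; it is not the model predictive robustness the paper proposes, and the distance values and thresholds are hypothetical.

```python
from typing import Sequence

def predicate_robustness(value: float, threshold: float) -> float:
    """Robustness of the predicate `value > threshold`.

    Positive when satisfied, negative when violated; the magnitude is the margin.
    """
    return value - threshold

def always(robustness_trace: Sequence[float]) -> float:
    """Robustness of 'always phi' over a finite trace: the worst-case value."""
    return min(robustness_trace)

def eventually(robustness_trace: Sequence[float]) -> float:
    """Robustness of 'eventually phi' over a finite trace: the best-case value."""
    return max(robustness_trace)

# Hypothetical data: distance to a lead vehicle in metres, sampled over time.
distances = [12.0, 9.5, 7.0, 8.0]

# "Always keep more than 5 m distance": satisfied with a margin of 2 m.
always_safe = always([predicate_robustness(d, 5.0) for d in distances])

# "Eventually be more than 10 m away": satisfied with a margin of 2 m.
eventually_far = eventually([predicate_robustness(d, 10.0) for d in distances])
```

Note that nothing in this computation refers to the vehicle dynamics, which is the limitation the abstract points out.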


Rethinking Mobile Block for Efficient Neural Models

Jiangning Zhang , Xiangtai Li , Jian Li , Liang Liu , Zhucun Xue , Boshen Zhang , Zhengkai Jiang , Tianxin Huang , Yabiao Wang , Chengjie Wang

Category: Computer Vision

2023-01-03

This paper focuses on designing efficient models with few parameters and low FLOPs for dense predictions. Even though CNN-based lightweight methods have achieved stunning results after years of research, the trade-off between model accuracy and constrained resources still needs further improvement. This work rethinks the essential unity of the efficient Inverted Residual Block in MobileNetv2 and the effective Transformer in ViT, inductively abstracting a general concept of Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance even when the instantiations share the same framework. Motivated by this phenomenon, we deduce a simple yet efficient modern Inverted Residual Mobile Block (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependency and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase Efficient MOdel (EMO) based only on a series of iRMBs for dense applications. Extensive experiments on the ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of our EMO over state-of-the-art methods, e.g., our EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1, surpassing SoTA CNN-/Transformer-based models while trading off model accuracy and efficiency well.


PIE-QG: Paraphrased Information Extraction for Unsupervised Question Generation from Small Corpora

Dinesh Nagumothu , Bahadorreza Ofoghi , Guangyan Huang , Peter W. Eklund

Category: Natural Language Processing | Artificial Intelligence

2023-01-03

Supervised Question Answering systems (QA systems) rely on domain-specific human-labeled data for training. Unsupervised QA systems generate their own question-answer training pairs, typically using secondary knowledge sources to achieve this outcome. Our approach (called PIE-QG) uses Open Information Extraction (OpenIE) to generate synthetic training questions from paraphrased passages and uses the question-answer pairs as training data for a language model for a state-of-the-art QA system based on BERT. Triples in the form of <subject, predicate, object> are extracted from each passage, and questions are formed with subjects (or objects) and predicates while objects (or subjects) are considered as answers. Experimenting on five extractive QA datasets demonstrates that our technique achieves on-par performance with existing state-of-the-art QA systems with the benefit of being trained on an order of magnitude fewer documents and without any recourse to external reference data sources.
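
The triple-to-question step described above can be sketched roughly as below. The templates, the `Triple` type, and the example triple are hypothetical and chosen only for illustration; the abstract does not specify the actual templates or the paraphrasing and OpenIE steps that PIE-QG uses.

```python
from typing import List, NamedTuple, Tuple

class Triple(NamedTuple):
    """A <subject, predicate, object> triple, as produced by an OpenIE system."""
    subject: str
    predicate: str
    obj: str

def questions_from_triple(
    t: Triple, subject_wh: str = "Who", object_wh: str = "what"
) -> List[Tuple[str, str]]:
    """Return (question, answer) pairs by masking one argument of the triple.

    Assumption: simple string templates stand in for the paper's question formation.
    """
    # Subject + predicate form the question; the object is the answer.
    ask_object = (f"{t.subject} {t.predicate} {object_wh}?", t.obj)
    # Predicate + object form the question; the subject is the answer.
    ask_subject = (f"{subject_wh} {t.predicate} {t.obj}?", t.subject)
    return [ask_object, ask_subject]

pairs = questions_from_triple(Triple("Marie Curie", "discovered", "polonium"))
for question, answer in pairs:
    print(question, "->", answer)
```

Each resulting pair, together with the source passage, could then serve as one synthetic training example for an extractive QA model.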


A New Perspective to Boost Vision Transformer for Medical Image Classification

Yuexiang Li , Yawen Huang , Nanjun He , Kai Ma , Yefeng Zheng

Category: Computer Vision | Artificial Intelligence

2023-01-03

The Transformer has achieved impressive success in various computer vision tasks. However, most existing studies require pretraining the Transformer backbone on a large-scale labeled dataset (e.g., ImageNet) to achieve satisfactory performance, and such a dataset is usually unavailable for medical images. Additionally, due to the gap between medical and natural images, the improvement provided by ImageNet-pretrained weights degrades significantly when the weights are transferred to medical image processing tasks. In this paper, we propose Bootstrap Own Latent of Transformer (BOLT), a self-supervised learning approach specifically for medical image classification with the Transformer backbone. Our BOLT consists of two networks, namely online and target branches, for self-supervised representation learning. Concretely, the online network is trained to predict the target network's representation of the same patch embedding tokens under a different perturbation. To make the most of the Transformer on limited medical data, we propose an auxiliary difficulty ranking task, in which the Transformer must identify which branch (i.e., online/target) is processing the more difficult perturbed tokens. Overall, the Transformer learns to distill transformation-invariant features from the perturbed tokens so as to simultaneously measure difficulty and maintain the consistency of the self-supervised representations. The proposed BOLT is evaluated on three medical image processing tasks, i.e., skin lesion classification, knee fatigue fracture grading, and diabetic retinopathy grading. The experimental results validate the superiority of our BOLT for medical image classification compared to ImageNet-pretrained weights and state-of-the-art self-supervised learning approaches.


Analogical Inference Enhanced Knowledge Graph Embedding

Yao Zhen , Zhang Wen , Chen Mingyang , Huang Yufeng , Yang Yi , Chen Huajun

Category: Artificial Intelligence | Natural Language Processing

2023-01-03

Knowledge graph embedding (KGE), which maps entities and relations in a knowledge graph into continuous vector spaces, has achieved great success in predicting missing links in knowledge graphs. However, knowledge graphs often contain incomplete triples that are difficult for KGEs to infer inductively. To address this challenge, we resort to analogical inference and propose a novel and general self-supervised framework, AnKGE, to enhance KGE models with analogical inference capability. We propose an analogical object retriever that retrieves appropriate analogical objects at the entity level, relation level, and triple level. In AnKGE, we train an analogy function for each level of analogical inference; it takes the original element embedding from a well-trained KGE model as input and outputs the analogical object embedding. To combine the inductive inference capability of the original KGE model with the analogical inference capability added by AnKGE, we interpolate the analogy score with the base model score and introduce adaptive weights in the score function for prediction. Through extensive experiments on the FB15k-237 and WN18RR datasets, we show that AnKGE achieves competitive results on the link prediction task and performs analogical inference well.
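
The score interpolation mentioned above might look roughly like the sketch below. The abstract does not give the actual score function or how the adaptive weights are obtained, so the convex combination, the fixed weight, and the candidate scores here are all assumptions for illustration only.

```python
from typing import Dict

def interpolate_score(base_score: float, analogy_score: float, weight: float) -> float:
    """Blend a base KGE score with an analogy score.

    Assumption: a convex combination with `weight` in [0, 1]; the paper's
    adaptive weighting scheme is not described in the abstract.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - weight) * base_score + weight * analogy_score

# Hypothetical scores for two candidate tail entities of one query.
base_scores: Dict[str, float] = {"Paris": 0.5, "Lyon": 0.75}
analogy_scores: Dict[str, float] = {"Paris": 1.0, "Lyon": 0.25}

combined = {
    entity: interpolate_score(base_scores[entity], analogy_scores[entity], weight=0.5)
    for entity in base_scores
}
best = max(combined, key=combined.get)  # candidate with the highest blended score
```

With a weight of 0 the ranking falls back to the base model alone, so the blend can only change predictions where the analogy score carries weight.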


Digital Engineering Transformation with Trustworthy AI towards Industry 4.0: Emerging Paradigm Shifts

Jingwei Huang

Category: Artificial Intelligence

2023-01-03

Digital engineering transformation is a crucial process for the engineering paradigm shifts in the fourth industrial revolution (4IR), and artificial intelligence (AI) is a critical enabling technology in digital engineering transformation. This article discusses the following research questions: What are the fundamental changes in the 4IR? More specifically, what are the fundamental changes in engineering? What is digital engineering? What are the main uncertainties there? What is trustworthy AI? Why is it important today? What are emerging engineering paradigm shifts in the 4IR? What is the relationship between the data-intensive paradigm and digital engineering transformation? What should we do for digitalization? From investigating the pattern of industrial revolutions, this article argues that ubiquitous machine intelligence (uMI) is the defining power brought by the 4IR. Digitalization is a condition to leverage ubiquitous machine intelligence. Digital engineering transformation towards Industry 4.0 has three essential building blocks: digitalization of engineering, leveraging ubiquitous machine intelligence, and building digital trust and security. The engineering design community at large is facing an excellent opportunity to bring the new capabilities of ubiquitous machine intelligence and trustworthy AI principles, as well as digital trust, together in various engineering systems design to ensure the trustworthiness of systems in Industry 4.0.


Human-in-the-loop Embodied Intelligence with Interactive Simulation Environment for Surgical Robot Learning

Yonghao Long , Wang Wei , Tao Huang , Yuehao Wang , Qi Dou

Category: Robotics | Artificial Intelligence | Computer Vision | Machine Learning

2023-01-01

Surgical robot automation has attracted increasing research interest over the past decade, given its huge potential to benefit surgeons, nurses, and patients. Recently, the learning paradigm of embodied AI has demonstrated a promising ability to learn good control policies for various complex tasks, where embodied AI simulators play an essential role in facilitating research. However, existing open-source simulators for surgical robots still do not sufficiently support human interaction through physical input devices, which limits effective investigation of how human demonstrations affect policy learning. In this paper, we study human-in-the-loop embodied intelligence with a new interactive simulation platform for surgical robot learning. Specifically, we build our platform on our previously released SurRoL simulator, with several new features co-developed to allow high-quality human interaction via an input device. With these, we further propose to collect human demonstrations and imitate the action patterns to achieve more effective policy learning. We showcase the improvement of our simulation environment with the newly designed features and tasks, and validate state-of-the-art reinforcement learning algorithms using the interactive environment. Promising results are obtained, with which we hope to pave the way for future research on surgical embodied intelligence. Our platform is released and will be continuously updated on the website: https://med-air.github.io/SurRoL/
