Smart Paper Notes

TextBox 2.0: A Text Generation Library with Pre-trained Language Models

Tianyi Tang, Junyi Li, Zhipeng Chen, Yiwen Hu, Zhuohao Yu, Wenxun Dai, Zican Dong, Xiaoxue Cheng, Yuhao Wang, Wayne Xin Zhao

Category: Natural Language Processing

2022-12-26

To facilitate research on text generation, this paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs). To be comprehensive, our library covers 13 common text generation tasks and their corresponding 83 datasets and further incorporates 45 PLMs covering general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight PLMs. We also implement 4 efficient training strategies and provide 4 generation objectives for pre-training new PLMs from scratch. To be unified, we design the interfaces to support the entire research pipeline (from data loading to training and evaluation), ensuring that each step can be fulfilled in a unified way. Despite the rich functionality, it is easy to use our library, either through the friendly Python API or command line. To validate the effectiveness of our library, we conduct extensive experiments and exemplify four types of research scenarios. The project is released at the link: https://github.com/RUCAIBox/TextBox.


Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE

Qihuang Zhong, Liang Ding, Yibing Zhan, Yu Qiao, Yonggang Wen, Li Shen, Juhua Liu, Baosheng Yu, Bo Du, Yixin Chen

Category: Natural Language Processing

2022-12-04

This technical report briefly describes our JDExplore d-team's Vega v2 submission on the SuperGLUE leaderboard. SuperGLUE is more challenging than the widely used general language understanding evaluation (GLUE) benchmark, containing eight difficult language understanding tasks, including question answering, natural language inference, word sense disambiguation, coreference resolution, and reasoning. [Method] Instead of arbitrarily increasing the size of a pretrained language model (PLM), our aim is to 1) fully extract knowledge from the input pretraining data given a certain parameter budget, e.g., 6B, and 2) effectively transfer this knowledge to downstream tasks. To achieve goal 1), we propose self-evolution learning for PLMs to wisely predict the informative tokens that should be masked, and supervise the masked language modeling (MLM) process with rectified smooth labels. For goal 2), we leverage the prompt transfer technique to improve the low-resource tasks by transferring the knowledge from the foundation model and related downstream tasks to the target task. [Results] According to our submission record (Oct. 2022), with our optimized pretraining and fine-tuning strategies, our 6B Vega method achieved new state-of-the-art performance on 4/8 tasks, sitting atop the SuperGLUE leaderboard on Oct. 8, 2022, with an average score of 91.3.
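The abstract does not spell out how the "rectified smooth labels" are built, so the following is only a minimal sketch of plain uniform label smoothing for a single masked position, the standard technique that smoothed MLM supervision builds on. The function names, the `epsilon` value, and the toy four-word vocabulary are all assumptions for illustration, not the Vega v2 recipe.

```python
import math

def smoothed_targets(true_index, vocab_size, epsilon=0.1):
    """Soften a one-hot target by spreading `epsilon` uniformly over the vocabulary."""
    base = epsilon / vocab_size
    targets = [base] * vocab_size
    targets[true_index] += 1.0 - epsilon
    return targets

def cross_entropy(targets, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(targets, probs) if t > 0)

# Toy example: the masked token is vocabulary entry 2 out of 4 (hypothetical values).
targets = smoothed_targets(true_index=2, vocab_size=4, epsilon=0.1)
predicted = [0.05, 0.05, 0.85, 0.05]
print(targets)
print(cross_entropy(targets, predicted))
```

With smoothing, the model is still rewarded most for the true token, but it is no longer pushed toward assigning zero probability to every alternative.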


Deep Learning in Single-Cell Analysis

Dylan Molho, Jiayuan Ding, Zhaoheng Li, Hongzhi Wen, Wenzhuo Tang, Yixin Wang, Julian Venegas, Wei Jin, Renming Liu, Runze Su

Category: Artificial Intelligence

2022-10-22

Single-cell technologies are revolutionizing the entire field of biology. The large volumes of data generated by single-cell technologies are high-dimensional, sparse, heterogeneous, and have complicated dependency structures, making analyses using conventional machine learning approaches challenging and impractical. In tackling these challenges, deep learning often demonstrates superior performance compared to traditional machine learning methods. In this work, we give a comprehensive survey on deep learning in single-cell analysis. We first introduce background on single-cell technologies and their development, as well as fundamental concepts of deep learning including the most popular deep architectures. We present an overview of the single-cell analytic pipeline pursued in research applications while noting divergences due to data sources or specific applications. We then review seven popular tasks spanning through different stages of the single-cell analysis pipeline, including multimodal integration, imputation, clustering, spatial domain identification, cell-type deconvolution, cell segmentation, and cell-type annotation. Under each task, we describe the most recent developments in classical and deep learning methods and discuss their advantages and disadvantages. Deep learning tools and benchmark datasets are also summarized for each task. Finally, we discuss the future directions and the most recent challenges. This survey will serve as a reference for biologists and computer scientists, encouraging collaborations.


End-to-End Entity Detection with Proposer and Regressor

Xueru Wen, Changjiang Zhou, Haotian Tang, Luguang Liang, Yu Jiang, Hong Qi

Category: Natural Language Processing

2022-10-19

Named entity recognition is a traditional task in natural language processing. In particular, nested entity recognition receives extensive attention for the widespread existence of the nesting scenario. The latest research migrates the well-established paradigm of set prediction in object detection to cope with entity nesting. However, the manual creation of query vectors, which fail to adapt to the rich semantic information in the context, limits these approaches. An end-to-end entity detection approach with proposer and regressor is presented in this paper to tackle the issues. First, the proposer utilizes the feature pyramid network to generate high-quality entity proposals. Then, the regressor refines the proposals for generating the final prediction. The model adopts encoder-only architecture and thus obtains the advantages of the richness of query semantics, high precision of entity localization, and easiness of model training. Moreover, we introduce the novel spatially modulated attention and progressive refinement for further improvement. Extensive experiments demonstrate that our model achieves advanced performance in flat and nested NER, achieving a new state-of-the-art F1 score of 80.74 on the GENIA dataset and 72.38 on the WeiboNER dataset.
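For readers unfamiliar with how NER F1 scores of this kind are usually computed, here is a minimal sketch of exact-match span F1. It assumes entities are represented as `(start, end, type)` tuples and that a prediction counts only if all three fields match; the function name and the toy spans are hypothetical, and the paper's own evaluation script may differ in detail. The function returns a fraction in [0, 1], whereas the abstract reports scores on a 0 to 100 scale.

```python
def entity_f1(gold, predicted):
    """Exact-match F1 over (start, end, type) entity spans."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)

# Toy nested example: the span (3, 5) lies inside the span (3, 8).
gold = {(0, 2, "PER"), (3, 5, "ORG"), (3, 8, "ORG")}
pred = {(0, 2, "PER"), (3, 8, "ORG"), (9, 10, "LOC")}
print(round(entity_f1(gold, pred), 4))
```

Because spans are compared as whole tuples, nested entities need no special handling: an inner and an outer span are simply two distinct set members.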


A Survey on Physical Adversarial Attack in Computer Vision

Donghua Wang, Wen Yao, Tingsong Jiang, Guijiang Tang, Xiaoqian Chen

Category: Computer Vision

2022-09-28

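Over the past decade, deep learning, with its strong feature-learning capability, has dramatically changed the traditional hand-crafted feature approach and greatly improved performance on conventional tasks. However, deep neural networks (DNNs) have recently been shown to be vulnerable to adversarial examples: malicious samples crafted with small, carefully designed noise that mislead DNNs into wrong decisions while remaining imperceptible to humans. Adversarial examples can be divided into digital adversarial attacks and physical adversarial attacks. Digital adversarial attacks are mostly performed in laboratory settings and focus on improving the performance of adversarial attack algorithms. In contrast, physical adversarial attacks target DNN systems deployed in the physical world, which is a more challenging task because of the complex physical environment (e.g., brightness and occlusion). Although the difference between digital and physical adversarial examples is small, physical adversarial examples are specifically designed to overcome the effects of the complex physical environment. In this paper, we review the development of physical adversarial attacks in DNN-based computer vision tasks, including image recognition, object detection, and semantic segmentation. For completeness of the algorithmic evolution, we also briefly introduce works that do not involve physical adversarial attacks. We first present a categorization scheme to summarize current physical adversarial attacks. We then discuss the advantages and disadvantages of existing physical adversarial attacks, focusing on the techniques used to maintain adversarial effectiveness when applied in physical environments. Finally, we point out the open problems of current physical adversarial attacks and provide promising research directions.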


A Comprehensive Benchmark for COVID-19 Predictive Modeling Using Electronic Health Records in Intensive Care: Choosing the Best Model for COVID-19 Prognosis

Junyi Gao, Yinghao Zhu, Wenqing Wang, Yasha Wang, Wen Tang, Liantao Ma

Category: Machine Learning

2022-09-16

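The COVID-19 pandemic has placed a heavy burden on healthcare systems worldwide and caused huge social disruption and economic loss. Many deep learning models have been proposed for clinical prediction tasks, such as mortality prediction for COVID-19 patients in intensive care units using electronic health record (EHR) data. Despite their initial success in certain clinical applications, there is currently a lack of benchmarking results that would allow a fair comparison and thus the selection of the best model for clinical use. Furthermore, there is a discrepancy between the formulation of traditional prediction tasks and real-world clinical practice in intensive care. To fill these gaps, we propose two clinical prediction tasks: outcome-specific length-of-stay prediction and early mortality prediction for COVID-19 patients in intensive care units. The two tasks are adapted from the naive length-of-stay and mortality prediction tasks to accommodate clinical practice for COVID-19 patients. We propose fair, detailed, open-source data-preprocessing pipelines and evaluate 17 state-of-the-art predictive models on the two tasks, including 5 machine learning models, 6 basic deep learning models, and 6 deep learning predictive models specifically designed for EHR data. We provide benchmarking results using data from two real-world COVID-19 EHR datasets. Both datasets are publicly available without any inquiry, and one dataset can be accessed on request. We provide fair, reproducible benchmarking results for the two tasks. We deploy all experimental results and models on an online platform. We also allow clinicians and researchers to upload their data to the platform and quickly obtain predictions from our trained models. We hope our efforts can further facilitate deep learning and machine learning research for COVID-19 predictive modeling.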


Explicitly Controllable 3D-Aware Portrait Generation

Junshu Tang, Bo Zhang, Binxin Yang, Ting Zhang, Dong Chen, Lizhuang Ma, Fang Wen

Category: Computer Vision

2022-09-12

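In contrast to the traditional avatar creation pipeline, which is a costly process, contemporary generative approaches learn the data distribution directly from photographs, and the state of the art can now produce highly photorealistic images. Although many works attempt to extend unconditional generative models and achieve some degree of controllability, ensuring multi-view consistency remains challenging, especially for large poses. In this work, we propose a 3D portrait generation network that produces 3D-consistent portraits while being controllable through semantic parameters for pose, identity, expression, and lighting. The generative network uses a neural scene representation to model portraits in 3D, and its generation is guided by a parametric face model that supports explicit control. Although latent disentanglement can be further enhanced by contrasting images with partially different attributes, noticeable inconsistency remains in non-face regions when animating expressions. We solve this by proposing a volume blending strategy in which we form a composite output by blending the dynamic and static radiance fields, with the two parts segmented from a jointly learned semantic field. Our method outperforms prior art in extensive experiments, producing realistic portraits in natural lighting when viewed from free viewpoints. The proposed method also generalizes to real images as well as out-of-domain cartoon faces, showing great promise for real applications. Additional video results and code will be available on the project webpage.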


MVP: Multi-task Supervised Pre-training for Natural Language Generation

Tianyi Tang, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen

Category: Natural Language Processing

2022-06-24

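Pre-trained language models (PLMs) have achieved remarkable success in natural language generation (NLG) tasks. Up to now, most PLMs have been pre-trained in an unsupervised manner on large-scale general corpora. Meanwhile, a growing number of models pre-trained with labeled data show superior performance compared with unsupervised models. Motivated by the success of supervised pre-training, we propose Multi-task superVised Pre-training (MVP) for natural language generation. To pre-train the text generation model MVP, we collect a labeled pre-training corpus from 45 datasets over seven generation tasks. For each task, we further pre-train specific soft prompts to stimulate the model's capacity to perform that task. Extensive experiments demonstrate the effectiveness of our supervised pre-training on a number of NLG tasks, and our general method achieves state-of-the-art performance on 12 of 17 datasets.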


MMMNA-Net for Overall Survival Time Prediction of Brain Tumor Patients

Wen Tang, Haoyue Zhang, Pengxin Yu, Han Kang, Rongguo Zhang

Category: Computer Vision

2022-06-13

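Overall survival (OS) time is one of the most important evaluation indices for glioma cases. Multimodal magnetic resonance imaging (MRI) scans play an important role in the study of OS time in glioma prognosis. Several learning-based methods have been proposed for OS time prediction from multimodal MRI. However, these methods usually fuse multimodal information at the beginning or the end of the deep learning network and lack fusion of features from different scales. In addition, fusion at the end of the network always combines global with global features (e.g., a fully connected layer after concatenating global average pooling outputs) or local with local features (e.g., bilinear pooling), which loses the local-with-global information. In this paper, we propose a novel method for multimodal OS time prediction of brain tumor patients, which contains an improved non-local feature fusion module introduced at different scales. Our method obtains a relative improvement of 8.76% over the current state-of-the-art method (accuracy of 0.6989 vs. 0.6426). Extensive testing shows that our method can adapt to situations with missing modalities. The code is available at https://github.com/tangwen920812/mmmna-net.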
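The reported relative improvement can be checked directly from the two accuracy figures in the abstract. The helper name below is hypothetical; the sketch only assumes the usual definition, (new - baseline) / baseline.

```python
def relative_improvement(new: float, baseline: float) -> float:
    """Return the relative gain of `new` over `baseline` as a percentage."""
    return (new - baseline) / baseline * 100.0

# Accuracy values quoted in the abstract.
ours, previous = 0.6989, 0.6426
print(f"{relative_improvement(ours, previous):.2f}%")  # prints 8.76%
```

The absolute gain is 0.0563 accuracy points; dividing by the baseline of 0.6426 gives the relative figure of about 8.76%.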


RPLHR-CT Dataset and Transformer Baseline for Volumetric Super-Resolution from CT Scans

Pengxin Yu, Haoyue Zhang, Han Kang, Wen Tang, Corey W. Arnold, Rongguo Zhang

Category: Computer Vision | Machine Learning

2022-06-13

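In clinical practice, anisotropic volumetric medical images with low through-plane resolution are commonly used because of their shorter acquisition time and lower storage cost. However, the coarse resolution may make medical diagnosis difficult for physicians or computer-aided diagnosis algorithms. Deep learning-based volumetric super-resolution (SR) methods, with convolutional neural networks (CNNs) at their core, are a feasible way to improve resolution. Despite recent progress, these methods are limited by the inherent properties of convolution operators, which ignore content relevance and cannot effectively model long-range dependencies. In addition, most existing methods use pseudo-paired volumes for training and evaluation, where pseudo low-resolution (LR) volumes are generated by a simple degradation of their high-resolution (HR) counterparts. However, the domain gap between pseudo-LR and real LR volumes leads to poor performance of these methods in practice. In this paper, we build the first public real-world dataset, RPLHR-CT, as a benchmark for volumetric SR, and provide baseline results by re-implementing four state-of-the-art CNN-based methods. Considering the inherent shortcomings of CNNs, we also propose the Transformer Volumetric Super-Resolution Network (TVSRN), which is based on attention mechanisms and dispenses with convolutions entirely. This is the first study to use a pure Transformer for CT volumetric SR. Experimental results show that TVSRN significantly outperforms all baselines on both PSNR and SSIM. Moreover, TVSRN achieves a better trade-off between image quality, number of parameters, and running time. Data and code are available at https://github.com/smilenaxx/rplhr-ct.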
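As a reminder of what the PSNR metric mentioned above measures, here is a minimal sketch over flat sequences of intensities. The function name is hypothetical, and `max_value` (the peak intensity, here assumed to be 1.0 for volumes normalized to [0, 1]) is an assumption; the benchmark's own normalization of CT intensities may differ.

```python
import math

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length intensity sequences."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of voxels")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy example: every voxel of the estimate is off by 0.1 (hypothetical values).
hr_voxels = [0.0, 0.2, 0.4, 0.6]
sr_voxels = [0.1, 0.3, 0.5, 0.7]
print(psnr(hr_voxels, sr_voxels))
```

Higher PSNR means the super-resolved volume is closer to the real high-resolution one in a mean-squared-error sense; SSIM complements it by comparing local structure rather than raw voxel differences.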
