Intelligent Paper Notes

GeneFormer: Learned Gene Compression using Transformer-based Context Modeling

Zhanbei Cui , Yu Liao , Tongda Xu , Yan Wang

Category: Machine Learning

2022-12-16

With the development of gene sequencing technology, gene data have grown explosively, and storing them has become an important issue. Traditional gene data compression methods rely on general-purpose software such as gzip, which fails to exploit the interrelation within nucleotide sequences. Recently, many researchers have begun to investigate deep-learning-based gene data compression methods. In this paper, we propose a transformer-based gene compression method named GeneFormer. Specifically, we first introduce a modified transformer structure to fully explore the nucleotide sequence dependency. Then, we propose fixed-length parallel grouping to accelerate the decoding of our autoregressive model. Experimental results on real-world datasets show that our method saves 29.7% bit rate compared with the state-of-the-art method, and that its decoding speed is significantly faster than that of all existing learning-based gene compression methods.

translated by Google Translate

Spatial Moment Pooling Improves Neural Image Assessment

Tongda Xu , Yifan Shao , Yan Wang , Hongwei Qin

Category: Computer Vision

2022-09-29

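In recent years, blind image quality assessment (IQA) based on convolutional neural networks (CNNs) has drawn widespread attention. A large number of works first extract deep features from a CNN. These features are then processed through spatial average pooling (SAP) and fully connected layers to predict quality. In this paper, inspired by full-reference IQA and texture features, we extend SAP ($1^{st}$ moment) into spatial moment pooling (SMP) by incorporating higher-order moments (such as variance and skewness). Moreover, we provide a learning-friendly normalization to circumvent numerical issues when computing the gradients of higher moments. Experimental results show that simply upgrading SAP to SMP significantly enhances CNN-based blind IQA methods and achieves state-of-the-art performance.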
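
The pooling idea in this abstract can be illustrated with a minimal sketch. It pools one channel of a feature map into its mean (what SAP computes), variance, and skewness. The function name `spatial_moment_pooling` and the `eps` stabilizer are assumptions made for illustration; the abstract does not specify the paper's learning-friendly normalization, and this sketch does not reproduce it.

```python
def spatial_moment_pooling(feature_map, eps=1e-6):
    """Pool one channel (a 2-D list of floats) into [mean, variance, skewness].

    The mean alone corresponds to spatial average pooling (1st moment);
    variance and skewness are the higher-order moments added on top of it.
    """
    values = [v for row in feature_map for v in row]
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    variance = sum(c ** 2 for c in centered) / n
    third_moment = sum(c ** 3 for c in centered) / n
    # eps is an illustrative assumption that avoids division by zero on
    # constant feature maps; it is not the normalization from the paper.
    skewness = third_moment / (variance + eps) ** 1.5
    return [mean, variance, skewness]


if __name__ == "__main__":
    channel = [[0.0, 0.0], [0.0, 4.0]]
    print(spatial_moment_pooling(channel))
```

In a full model, the pooled vector per channel would be concatenated across channels and passed to the fully connected layers in place of the SAP output.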

Correcting the Sub-optimal Bit Allocation

Tongda Xu , Han Gao , Yuanyuan Wang , Hongwei Qin , Yan Wang , Jingjing Liu , Ya-Qin Zhang

Category: Computer Vision

2022-09-29

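In this paper, we investigate the problem of bit allocation in neural video compression (NVC). First, we reveal that a recent bit allocation approach claimed to be optimal is in fact sub-optimal because of its implementation. Specifically, we find that its sub-optimality lies in the improper application of semi-amortized variational inference (SAVI) to latents with a non-factorized variational posterior. Then, we show that the corrected version of SAVI on non-factorized latents requires recursively applying back-propagation through gradient ascent, based on which we derive the corrected optimal bit allocation algorithm. Because the corrected bit allocation is computationally infeasible, we design an efficient approximation to make it practical. Empirical results show that our proposed correction significantly improves the incorrect bit allocation in terms of R-D performance and bitrate error, and outperforms all other bit allocation methods by a large margin. The source code is provided in the supplementary material.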

Multi-Sample Training for Neural Image Compression

Tongda Xu , Yan Wang , Dailan He , Chenjian Gao , Han Gao , Kunzan Liu , Hongwei Qin

Category: Computer Vision

2022-09-28

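This paper considers the problem of lossy neural image compression (NIC). Current state-of-the-art (SOTA) methods adopt a uniform posterior to approximate quantization noise and a single-sample pathwise estimator to approximate the gradient of the evidence lower bound (ELBO). In this paper, we propose to train NIC with a multiple-sample importance weighted autoencoder (IWAE) target, which is tighter than the ELBO and converges to the log likelihood as the sample size increases. First, we identify that the uniform posterior of NIC has special properties, which affect the variance and bias of the pathwise and score function estimators of the IWAE target. Moreover, from the perspective of gradient variance, we provide insights into a trick commonly adopted in NIC. Based on these analyses, we further propose multiple-sample NIC (MS-NIC), an enhanced IWAE target for NIC. Experimental results show that it improves SOTA NIC methods. Our MS-NIC is plug-and-play and can be easily extended to other neural compression tasks.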
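
The relationship between the ELBO and the IWAE target mentioned above can be sketched numerically. Given K log importance weights, log w_k = log p(x, z_k) - log q(z_k | x), the averaged ELBO estimate is the mean of the log weights, while the IWAE bound is the log of the mean weight. The function names and the example weights below are assumptions for illustration; this is the generic IWAE bound, not the MS-NIC objective itself.

```python
import math


def elbo_estimate(log_weights):
    """Average of the log importance weights (K single-sample ELBO estimates)."""
    return sum(log_weights) / len(log_weights)


def iwae_bound(log_weights):
    """log((1/K) * sum_k exp(log_w_k)), computed with the log-sum-exp trick."""
    k = len(log_weights)
    peak = max(log_weights)  # subtract the maximum for numerical stability
    total = sum(math.exp(lw - peak) for lw in log_weights)
    return peak + math.log(total) - math.log(k)


if __name__ == "__main__":
    # Hypothetical log weights for K = 3 samples.
    log_w = [-3.0, -1.0, -2.0]
    print("ELBO estimate:", elbo_estimate(log_w))
    print("IWAE bound:   ", iwae_bound(log_w))
```

By Jensen's inequality, the IWAE value is never below the averaged ELBO estimate for the same samples, and the two coincide when K = 1.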

Bit Allocation using Optimization

Tongda Xu , Han Gao , Chenjian Gao , Jinyong Pi , Yanghao Li , Yuanyuan Wang , Ziyu Zhu , Dailan He , Mao Ye , Hongwei Qin

Category: Computer Vision

2022-09-20

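In this paper, we consider the problem of bit allocation in neural video compression (NVC). Because of the frame reference structure, current NVC methods that use the same R-D (rate-distortion) trade-off parameter $\lambda$ for all frames are suboptimal, which creates the need for bit allocation. Unlike previous methods based on heuristic and empirical R-D models, we propose to solve this problem by gradient-based optimization. Specifically, we first propose a continuous bit implementation method based on semi-amortized variational inference (SAVI). Then, by changing the SAVI target, we propose a pixel-level implicit bit allocation method using iterative optimization. Moreover, we derive a precise R-D model based on the differentiable nature of NVC. We show the optimality of our method by proving its equivalence to bit allocation with the precise R-D model. Experimental results show that our approach significantly improves NVC methods and outperforms existing bit allocation methods. Our approach is plug-and-play for all differentiable NVC methods and can be directly adopted on existing pre-trained models.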

Flexible Neural Image Compression via Code Editing

Chenjian Gao , Tongda Xu , Dailan He , Hongwei Qin , Yan Wang

Category: Computer Vision | Machine Learning

2022-09-19

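Neural image compression (NIC) has outperformed traditional image codecs in rate-distortion (R-D) performance. However, it usually requires a dedicated encoder-decoder pair for each point on the R-D curve, which greatly hinders its practical deployment. Although some recent works have enabled bitrate control via conditional coding, they impose a strong prior during training and provide limited flexibility. In this paper, we propose Code Editing, a highly flexible coding method for NIC based on semi-amortized inference and adaptive quantization. Our work is a new paradigm for variable-bitrate NIC. Furthermore, experimental results show that our method surpasses existing variable-rate methods and achieves ROI coding and multi-distortion trade-off with a single decoder.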

A multi-stage semi-supervised improved deep embedded clustering method for bearing fault diagnosis under the situation of insufficient labeled samples

Tongda Sun , Gang Yu

Category: Machine Learning | Artificial Intelligence

2021-09-28

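Although data-driven fault diagnosis methods have been widely applied, model training requires large-scale labeled data. The difficulty of obtaining such data in real industrial settings hinders the application of these methods. An effective diagnostic approach that works well in this situation is therefore urgently needed. In this study, a multi-stage semi-supervised improved deep embedded clustering (MS-SSIDEC) method, which combines semi-supervised learning with improved deep embedded clustering (IDEC), is proposed to jointly exploit scarce labeled data and large-scale unlabeled data. In the first stage, a skip-connection-based convolutional auto-encoder (SCCAE) that can automatically map unlabeled data into a low-dimensional feature space is proposed and pre-trained as a fault feature extractor. In the second stage, a semi-supervised improved deep embedded clustering (SSIDEC) network is proposed for clustering. It is first initialized with the available labeled data and then used to simultaneously optimize the cluster label assignment and make the feature space more clustering-friendly. To address overfitting, virtual adversarial training (VAT) is introduced as a regularization term in this stage. In the third stage, pseudo-labels are obtained from the high-quality results of SSIDEC. The labeled dataset can be augmented with these pseudo-labeled data and then used to train a bearing fault diagnosis model. Two vibration datasets from rolling bearings are used to evaluate the performance of the proposed method. Experimental results show that the proposed method achieves promising performance in both semi-supervised and unsupervised fault diagnosis tasks. By effectively exploiting unlabeled data, this method provides an approach to fault diagnosis when labeled samples are limited.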
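
The third stage described above, in which pseudo-labels are taken from high-quality clustering results, can be illustrated with a small sketch. It keeps only samples whose highest soft cluster assignment reaches a confidence threshold. The function name `select_pseudo_labels`, the threshold value, and the confidence criterion itself are assumptions for illustration; the abstract does not state how the paper defines a high-quality result.

```python
def select_pseudo_labels(soft_assignments, threshold=0.9):
    """Return (sample_index, cluster_index) pairs for confident assignments.

    soft_assignments: one list of cluster probabilities per sample.
    threshold: hypothetical confidence cut-off, not taken from the paper.
    """
    selected = []
    for index, probs in enumerate(soft_assignments):
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            selected.append((index, best))
    return selected


if __name__ == "__main__":
    # Hypothetical soft assignments for three unlabeled samples, two clusters.
    assignments = [[0.95, 0.05], [0.60, 0.40], [0.02, 0.98]]
    print(select_pseudo_labels(assignments))
```

The selected pairs would then be appended to the labeled set before training the final diagnosis model.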

Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation

Yue Han , Jiangning Zhang , Zhucun Xue , Chao Xu , Xintian Shen , Yabiao Wang , Chengjie Wang , Yong Liu , Xiangtai Li

Category: Computer Vision

2023-01-03

Few-Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes given only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots; for example, we boost nAP by +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few-Shot Object Detection. Code and model will be available.

AI in HCI Design and User Experience

Wei Xu

Category: Artificial Intelligence

2023-01-03

In this chapter, we review and discuss the transformation of AI technology in HCI/UX work and assess how AI technology will change how we do the work. We first discuss how AI can be used to enhance the result of user research and design evaluation. We then discuss how AI technology can be used to enhance HCI/UX design. Finally, we discuss how AI-enabled capabilities can improve UX when users interact with computing systems, applications, and services.

More is Better: A Database for Spontaneous Micro-Expression with High Frame Rates

Sirui Zhao , Huaying Tang , Xinglong Mao , Shifeng Liu , Hanqing Tao , Hao Wang , Tong Xu , Enhong Chen

Category: Computer Vision

2023-01-03

As one of the most important psychological stress reactions, micro-expressions (MEs) are spontaneous and transient facial expressions that can reveal the genuine emotions of human beings. Automatic ME recognition (MER) is therefore becoming increasingly crucial in the field of affective computing, and provides essential technical support for lie detection, psychological analysis, and other areas. However, the lack of abundant ME data seriously restricts the development of cutting-edge data-driven MER models. Although several spontaneous ME datasets have recently been released to alleviate this problem, the amount of available data remains small. To solve the problem of ME data hunger, we construct a dynamic spontaneous ME dataset with the largest current ME data scale, called DFME (Dynamic Facial Micro-expressions), which includes 7,526 well-labeled ME videos elicited from 671 participants and annotated by more than 20 annotators over three years. We then apply four classical spatiotemporal feature learning models to DFME in MER experiments to objectively verify the validity of the DFME dataset. In addition, we explore different solutions to the class imbalance and key-frame sequence sampling problems in dynamic MER on DFME, so as to provide a valuable reference for future research. The comprehensive experimental results show that our DFME dataset can facilitate research on automatic MER and provide a new benchmark for MER. DFME will be published via https://mea-lab-421.github.io.
