Smart Paper Notes

Bridging the Gap Between Patient-specific and Patient-independent Seizure Prediction via Knowledge Distillation

Di Wu , Jie Yang , Mohamad Sawan

Categories: Machine Learning | Artificial Intelligence

2022-02-25

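Objective. Deep neural networks (DNNs) have shown unprecedented success in various brain-machine interface applications, such as epileptic seizure prediction. However, because epileptic signals are highly personalized, existing approaches typically train models in a patient-specific fashion. Consequently, only a limited number of labeled recordings from each subject can be used for training, and current DNN-based methods generalize poorly to some extent because of this shortage of training data. Patient-independent models, on the other hand, attempt to use more data by pooling recordings from all patients to train one universal model. Despite the different techniques applied, results show that patient-independent models perform worse than patient-specific models because of the high variation across individuals. A substantial gap therefore exists between patient-specific and patient-independent models. Approach. In this paper, we propose a novel training scheme based on knowledge distillation that makes use of a large amount of data from multiple subjects. It first distills information from the signals of all available subjects with a pre-trained general model. A patient-specific model can then be obtained with the help of the distilled knowledge and additional personalized data. Main results. Four state-of-the-art seizure prediction methods are trained on the Children's Hospital Boston-MIT (CHB-MIT) sEEG database with our proposed scheme. The resulting accuracy, sensitivity, and false prediction rate show that our proposed training scheme consistently improves the prediction performance of state-of-the-art methods. Significance. The proposed training scheme significantly improves the performance of patient-specific seizure predictors and bridges the gap between patient-specific and patient-independent predictors.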
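
The abstract does not spell out its distillation objective, so the following is only a minimal sketch of the classic soft-label formulation of knowledge distillation, in which a student is trained against both the true label and the temperature-softened output of a pre-trained teacher. The function names, the `temperature` and `alpha` parameters, and the two-class toy logits are illustrative assumptions, not the authors' implementation.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and soft-label KL divergence.

    `alpha` and `temperature` are hypothetical hyperparameters chosen for
    illustration only.
    """
    # Hard-label term: cross-entropy of the student against the true class.
    p_student = softmax(student_logits)
    ce = -math.log(p_student[label])
    # Soft-label term: KL(teacher || student) on temperature-softened outputs.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(q_teacher, q_student) if t > 0)
    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return (1 - alpha) * ce + alpha * temperature ** 2 * kl

# Toy usage: a general (teacher) model guiding a patient-specific (student) model.
loss = distillation_loss(student_logits=[0.5, 1.5],
                         teacher_logits=[2.0, 0.0], label=0)
```

In a setting like the one described, the teacher would presumably be the general model trained on pooled patients and the student the patient-specific model, but that mapping is an interpretation of the abstract.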

Enhancing Low-Density EEG-Based Brain-Computer Interfaces with Similarity-Keeping Knowledge Distillation

Xin-Yao Huang , Sung-Yu Chen , Chun-Shu Wei

Categories: Machine Learning

2022-12-06

Electroencephalogram (EEG) has been one of the common neuromonitoring modalities for real-world brain-computer interfaces (BCIs) because of its non-invasiveness, low cost, and high temporal resolution. Recently, light-weight and portable EEG wearable devices based on low-density montages have increased the convenience and usability of BCI applications. However, loss of EEG decoding performance is often inevitable because a low-density EEG montage has fewer electrodes and covers less of the scalp. To address this issue, we introduce knowledge distillation (KD), a learning mechanism developed for transferring knowledge/information between neural network models, to enhance the performance of low-density EEG decoding. Our framework includes a newly proposed similarity-keeping (SK) teacher-student KD scheme that encourages a low-density EEG student model to acquire the inter-sample similarity found in a pre-trained teacher model trained on high-density EEG data. The experimental results validate that our SK-KD framework consistently improves motor-imagery EEG decoding accuracy when the number of electrodes in the input EEG data decreases. For both common low-density headphone-like and headband-like montages, our method outperforms state-of-the-art KD methods across various EEG decoding model architectures. As the first KD scheme developed for enhancing EEG decoding, we foresee the proposed SK-KD framework facilitating the practicality of low-density EEG-based BCI in real-world applications.
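
The exact SK loss is not given in the abstract. As a hedged illustration of what "acquiring the inter-sample similarity of a teacher" can mean, the sketch below compares the batch-by-batch cosine-similarity matrices of student and teacher embeddings. The choice of cosine similarity, the mean-squared penalty, and all names are assumptions made for this example.

```python
import math

def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between the samples of one batch."""
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # avoid division by zero
        normed.append([x / norm for x in vec])
    return [[sum(a * b for a, b in zip(u, v)) for v in normed] for u in normed]

def similarity_keeping_loss(student_embeddings, teacher_embeddings):
    """Mean squared difference between student and teacher similarity matrices.

    Both matrices are batch x batch, so the student and teacher embedding
    dimensions do not need to match.
    """
    s = cosine_similarity_matrix(student_embeddings)
    t = cosine_similarity_matrix(teacher_embeddings)
    n = len(s)
    return sum((s[i][j] - t[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)

# Toy usage: the teacher separates the two samples, the student collapses them.
loss = similarity_keeping_loss([[1.0, 0.0], [1.0, 0.0]],
                               [[1.0, 0.0], [0.0, 1.0]])
```

Because only sample-to-sample relations are compared, a loss of this kind does not require the low-density student and high-density teacher to share an input montage or a feature size.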

TASKED: Transformer-based Adversarial learning for human activity recognition using wearable sensors via Self-KnowledgE Distillation

Sungho Suh , Vitor Fortes Rey , Paul Lukowicz

Categories: Computer Vision | Machine Learning

2022-09-14

Wearable sensor-based human activity recognition (HAR) has emerged as a principal research area and is utilized in a variety of applications. Recently, deep learning-based methods have achieved significant improvement in the HAR field with the development of human-computer interaction applications. However, a standard convolutional neural network operates only within a local neighborhood, so correlations between sensors at different body positions are ignored. In addition, these methods still suffer from performance degradation caused by large gaps between the distributions of training and test data and by behavioral differences between subjects. In this work, we propose a novel Transformer-based Adversarial learning framework for human activity recognition using wearable sensors via Self-KnowledgE Distillation (TASKED), which accounts for individual sensor orientations and spatial and temporal features. The proposed method is capable of learning cross-domain embedding feature representations from multiple subject datasets using adversarial learning and maximum mean discrepancy (MMD) regularization to align the data distribution over multiple domains. In the proposed method, we adopt teacher-free self-knowledge distillation to improve the stability of the training procedure and the performance of human activity recognition. Experimental results show that TASKED not only outperforms state-of-the-art methods on four real-world public HAR datasets (alone or combined) but also improves subject generalization effectively.

Toward Open-World Electroencephalogram Decoding Via Deep Learning: A Comprehensive Survey

Xun Chen , Chang Li , Aiping Liu , Martin J. McKeown , Ruobing Qian , Z. Jane Wang

Categories: Machine Learning

2021-12-08

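Electroencephalogram (EEG) decoding aims to identify the perceptual, semantic, and cognitive content of neural processing based on non-invasively measured brain activity. Traditional EEG decoding methods have achieved moderate success when applied to data acquired in static, well-controlled laboratory environments. An open-world environment is a more realistic setting, however: conditions that affect EEG recordings can arise unexpectedly, which significantly weakens the robustness of existing methods. In recent years, deep learning (DL) has emerged as a potential solution because of its superior capacity for feature extraction. It overcomes the limitations of "handcrafted" features or features extracted with shallow architectures, but it typically requires large amounts of costly, expertly labeled data, which are not always obtainable. Combining DL with domain-specific knowledge may allow the development of robust approaches for decoding brain activity even with small-sample data. Although various DL methods have been proposed to tackle some of the challenges in EEG decoding, a systematic tutorial overview, particularly for open-world applications, is currently lacking. This article therefore provides a comprehensive survey of DL methods for open-world EEG decoding and identifies promising research directions to inspire future studies of EEG decoding in real-world applications.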

Cross-Subject Domain Adaptation for Classifying Working Memory Load with Multi-Frame EEG Images

Junfu Chen , Xiaoyi Jiang , Yang Chen , Bi Wang

Categories: Machine Learning | Computer Vision

2021-06-12

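Working memory (WM), which denotes information temporarily stored in the mind, is a fundamental research topic in the field of human cognition. Electroencephalography (EEG), which can monitor the electrical activity of the brain, has been widely used to measure the level of WM. One of the critical challenges, however, is that individual differences may lead to ineffective results, especially when an established model meets an unfamiliar subject. In this work, we propose a cross-subject deep adaptation model with spatial attention (CS-DASA) to generalize workload classification across subjects. First, we transform EEG time series into multi-frame EEG images that incorporate spatial, spectral, and temporal information. Next, the subject-shared module in CS-DASA receives multi-frame EEG image data from both source and target subjects and learns common feature representations. Then, in the subject-specific module, the maximum mean discrepancy is used to measure the domain distribution divergence in a reproducing kernel Hilbert space, which adds an effective penalty loss for domain adaptation. In addition, a subject-to-subject spatial attention mechanism is employed to focus on the discriminative spatial features of the target image data. Experiments conducted on a public WM EEG dataset containing 13 subjects show that the proposed model achieves better performance than existing state-of-the-art methods.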
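
The maximum mean discrepancy mentioned above can be illustrated with a small, self-contained sketch. It computes the biased empirical estimate of the squared MMD with a Gaussian (RBF) kernel, which corresponds to comparing mean embeddings in a reproducing kernel Hilbert space. The kernel choice, the `gamma` bandwidth, and the toy feature vectors are assumptions for illustration; the abstract does not state which kernel or estimator CS-DASA uses.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of the squared MMD between two sets of feature vectors."""
    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))
    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))

# Toy usage: features from a source subject and a shifted target subject.
source_features = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1]]
target_features = [[1.0, 1.1], [1.2, 0.9], [0.9, 1.0]]
penalty = mmd_squared(source_features, target_features)
```

Added to a classification loss with a weighting factor, a term like `penalty` pushes the feature extractor toward representations whose source and target distributions are hard to tell apart.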

ADAST: Attentive Cross-domain EEG-based Sleep Staging Framework with Iterative Self-Training

Emadeldeen Eldele , Mohamed Ragab , Zhenghua Chen , Min Wu , Chee-Keong Kwoh , Xiaoli Li , Cuntai Guan

Categories: Machine Learning

2021-07-09

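Sleep staging is of great importance in the diagnosis and treatment of sleep disorders. Recently, numerous data-driven deep learning models have been proposed for automatic sleep staging. They mainly train the model on a large public labeled sleep dataset and test it on a smaller one with the subjects of interest. However, they usually assume that the training and test data are drawn from the same distribution, which may not hold in real-world scenarios. Unsupervised domain adaptation (UDA) has recently been developed to handle this domain shift problem. Previous UDA methods applied to sleep staging have two main limitations, however. First, they rely on a fully shared model for domain alignment, which may lose domain-specific information during feature extraction. Second, they align the source and target distributions only globally, without considering the class information in the target domain, which hinders the classification performance of the model at test time. In this work, we propose a novel adversarial learning framework called ADAST to tackle the domain shift problem in the unlabeled target domain. First, we develop an unshared attention mechanism to preserve the domain-specific features in both domains. Second, we design an iterative self-training strategy to improve the classification performance on the target domain via target-domain pseudo labels. We also propose dual distinct classifiers to increase the robustness and quality of the pseudo labels. Experimental results on six cross-domain scenarios validate the efficacy of our proposed framework and its advantage over state-of-the-art UDA methods. The source code is available at https://github.com/emadeldeen24/adast.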
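
The abstract does not describe how the two classifiers are combined, so the sketch below shows only one plausible way dual classifiers could filter pseudo labels during self-training: a target sample keeps its pseudo label when both classifiers predict the same class and their averaged confidence clears a threshold. The agreement rule, the `threshold` value, and the function name are hypothetical and are not taken from the ADAST code.

```python
def select_pseudo_labels(probs_a, probs_b, threshold=0.9):
    """Return {sample_index: label} for confident, agreed-upon predictions.

    probs_a and probs_b hold per-sample class probabilities from two
    classifiers evaluated on the same unlabeled target samples.
    """
    selected = {}
    for i, (pa, pb) in enumerate(zip(probs_a, probs_b)):
        label_a = max(range(len(pa)), key=pa.__getitem__)
        label_b = max(range(len(pb)), key=pb.__getitem__)
        if label_a != label_b:
            continue  # the classifiers disagree, so the sample is skipped
        confidence = (pa[label_a] + pb[label_b]) / 2
        if confidence >= threshold:
            selected[i] = label_a
    return selected

# Toy usage with three target-domain epochs and two sleep-stage classes.
classifier_a = [[0.95, 0.05], [0.60, 0.40], [0.20, 0.80]]
classifier_b = [[0.90, 0.10], [0.30, 0.70], [0.05, 0.95]]
pseudo = select_pseudo_labels(classifier_a, classifier_b)
```

In an iterative scheme, the selected samples would be added to the training set with their pseudo labels, the model retrained, and the selection repeated.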

Seizure Detection and Prediction by Parallel Memristive Convolutional Neural Networks

Chenqi Li , Corey Lammie , Xuening Dong , Amirali Amirsoleimani , Mostafa Rahimi Azghadi , Roman Genov

Categories: Artificial Intelligence

2022-06-20

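Over the past two decades, epileptic seizure detection and prediction algorithms have evolved rapidly. However, despite significant performance improvements, their hardware implementation using conventional technologies, such as complementary metal-oxide-semiconductor (CMOS), remains a challenging task in power- and area-constrained settings, especially when many recording channels are used. In this paper, we propose a novel low-latency parallel convolutional neural network (CNN) architecture that has 2-2,800x fewer network parameters than SOTA CNN architectures and achieves 5-fold cross-validation accuracy of 99.84% for epileptic seizure detection, and 99.01% and 97.54% for epileptic seizure prediction, when evaluated using the University of Bonn electroencephalogram (EEG), CHB-MIT, and SWEC-ETHZ seizure datasets, respectively. We subsequently implement our network on analog crossbar arrays comprising resistive random-access memory (RRAM) devices, and provide a comprehensive benchmark by simulating, laying out, and determining the hardware requirements of the CNN component of our system. To the best of our knowledge, we are the first to parallelize the execution of convolution-layer kernels on separate analog crossbars, which reduces latency by two orders of magnitude compared to SOTA hybrid memristive-CMOS DL accelerators. Furthermore, we investigate the effects of non-idealities on our system and investigate quantization-aware training (QAT) to mitigate the performance degradation caused by low ADC/DAC resolution. Finally, we propose a stuck-weight offsetting methodology to mitigate the performance degradation caused by stuck R_ON/R_OFF memristor weights, recovering up to 32% accuracy without requiring retraining. The CNN component of our platform is estimated to consume approximately 2.791 W of power while occupying an area of 31.255 mm$^2$ in a 22nm FDSOI CMOS process.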

RRWaveNet: A Compact End-to-End Multi-Scale Residual CNN for Robust PPG Respiratory Rate Estimation

Pongpanut Osathitporn , Guntitat Sawadwuthikul , Punnawish Thuwajit , Kawisara Ueafuea , Thee Mateepithaktham , Narin Kunaseth , Tanut Choksatchawathi , Proadpran Punyabukkana , Emmanuel Mignot , Theerawit Wilaiprasitporn

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2022-08-18

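Respiratory rate (RR) is an important biomarker, as RR changes can reflect severe medical events such as heart disease, lung disease, and sleep disorders. Unfortunately, standard manual RR counting is prone to human error and cannot be performed continuously. This study proposes RRWaveNet, a method for continuously estimating RR. The method is a compact end-to-end deep learning model that requires no feature engineering and can use low-cost raw photoplethysmography (PPG) as its input signal. RRWaveNet was tested subject-independently and compared with baselines on three datasets (BIDMC, CapnoBase, and WESAD) using three window sizes (16, 32, and 64 seconds). RRWaveNet outperformed current state-of-the-art methods, with mean absolute errors (MAE) at the optimal window size of 1.66 $\pm$ 1.01, 1.59 $\pm$ 1.08, and … $\pm$ 0.96 breaths per minute for the respective datasets. In remote monitoring settings such as the WESAD dataset, we apply transfer learning from the other two ICU datasets, reducing the MAE to 1.52 $\pm$ 0.50 breaths per minute, which shows that this model allows accurate and practical RR estimation on affordable wearable devices. Our study demonstrates the feasibility of remote RR monitoring in telemedicine and at home.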
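
For readers unfamiliar with the "mean $\pm$ spread" notation used for the errors above, here is a minimal sketch of how a mean absolute error and its standard deviation can be computed from paired estimates. Whether the paper's $\pm$ value is taken across windows, subjects, or folds is not stated in the abstract, so treating it as the population standard deviation over individual errors is an assumption, and the breathing-rate values are made up.

```python
import math

def mae_with_std(predicted, reference):
    """Mean absolute error and the population standard deviation of the errors."""
    errors = [abs(p - r) for p, r in zip(predicted, reference)]
    mean = sum(errors) / len(errors)
    variance = sum((e - mean) ** 2 for e in errors) / len(errors)
    return mean, math.sqrt(variance)

# Toy usage: estimated vs. reference respiratory rate in breaths per minute.
estimated = [16.0, 18.0, 20.0]
reference = [15.0, 20.0, 20.0]
mae, spread = mae_with_std(estimated, reference)
```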

Distilling and Transferring Knowledge via cGAN-generated Samples for Image Classification and Regression

Xin Ding , Yongwei Wang , Zuheng Xu , Z. Jane Wang , William J. Welch

Categories: Computer Vision | Machine Learning (Statistics)

2021-04-07

Knowledge distillation (KD) has been actively studied for image classification tasks in deep learning, aiming to improve the performance of a student based on the knowledge from a teacher. However, applying KD in image regression with a scalar response variable has been rarely studied, and there exists no KD method applicable to both classification and regression tasks yet. Moreover, existing KD methods often require a practitioner to carefully select or adjust the teacher and student architectures, making these methods less flexible in practice. To address the above problems in a unified way, we propose a comprehensive KD framework based on cGANs, termed cGAN-KD. Fundamentally different from existing KD methods, cGAN-KD distills and transfers knowledge from a teacher model to a student model via cGAN-generated samples. This novel mechanism makes cGAN-KD suitable for both classification and regression tasks, compatible with other KD methods, and insensitive to the teacher and student architectures. An error bound for a student model trained in the cGAN-KD framework is derived in this work, providing a theory for why cGAN-KD is effective as well as guiding the practical implementation of cGAN-KD. Extensive experiments on CIFAR-100 and ImageNet-100 show that we can combine state of the art KD methods with the cGAN-KD framework to yield a new state of the art. Moreover, experiments on Steering Angle and UTKFace demonstrate the effectiveness of cGAN-KD in image regression tasks, where existing KD methods are inapplicable.

PURSUhInT: In Search of Informative Hint Points Based on Layer Clustering for Knowledge Distillation

Reyhan Kevser Keser , Aydin Ayanzadeh , Omid Abdollahi Aghdam , Caglar Kilcioglu , Behcet Ugur Toreyin , Nazim Kemal Ure

Categories: Machine Learning | Computer Vision

2021-02-26

One of the most efficient methods for model compression is hint distillation, where the student model is injected with information (hints) from several different layers of the teacher model. Although the selection of hint points can drastically alter the compression performance, conventional distillation approaches overlook this fact and use the same hint points as in the early studies. Therefore, we propose a clustering-based hint selection methodology, where the layers of the teacher model are clustered with respect to several metrics and the cluster centers are used as the hint points. Once it has been applied to a chosen teacher network, our method is applicable to any student network. The proposed approach is validated on the CIFAR-100 and ImageNet datasets, using various teacher-student pairs and numerous hint distillation methods. Our results show that hint points selected by our algorithm result in superior compression performance compared to state-of-the-art knowledge distillation algorithms on the same student models and datasets.

MP-SeizNet: A Multi-Path CNN Bi-LSTM Network for Seizure-Type Classification Using EEG

Hezam Albaqami , Ghulam Mubashar Hassan , Amitava Datta

Categories: Machine Learning

2022-11-09

Seizure type identification is essential for the treatment and management of epileptic patients. However, it is a difficult process known to be time consuming and labor intensive. Automated diagnosis systems, with the advancement of machine learning algorithms, have the potential to accelerate the classification process, alert patients, and support physicians in making quick and accurate decisions. In this paper, we present a novel multi-path seizure-type classification deep learning network (MP-SeizNet), consisting of a convolutional neural network (CNN) and a bidirectional long short-term memory neural network (Bi-LSTM) with an attention mechanism. The objective of this study was to classify specific types of seizures, including complex partial, simple partial, absence, tonic, and tonic-clonic seizures, using only electroencephalogram (EEG) data. The EEG data is fed to our proposed model in two different representations. The CNN was fed with wavelet-based features extracted from the EEG signals, while the Bi-LSTM was fed with raw EEG signals, so that MP-SeizNet jointly learns from different representations of seizure data and extracts more accurate information. The proposed MP-SeizNet was evaluated using the largest available EEG epilepsy database, the Temple University Hospital EEG Seizure Corpus, TUSZ v1.5.2. We evaluated our proposed model across different patient data using three-fold cross-validation and across seizure data using five-fold cross-validation, achieving F1 scores of 87.6% and 98.1%, respectively.

MIN2Net: End-to-End Multi-Task Learning for Subject-Independent Motor Imagery EEG Classification

Phairot Autthasan , Rattanaphon Chaisaen , Thapanun Sudhawiyangkul , Phurin Rangpong , Suktipol Kiatthaveephong , Nat Dilokthanakul , Gun Bhakdisongkhram , Huy Phan , Cuntai Guan , Theerawit Wilaiprasitporn

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2021-02-07

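Motor imagery (MI)-based brain-computer interfaces (BCIs) allow the control of several applications by decoding neurophysiological phenomena, which are usually recorded by electroencephalography (EEG) using a non-invasive technique. Despite great advances in MI-based BCI, EEG rhythms are subject-specific and vary over time. These issues pose significant challenges to improving classification performance, especially in a subject-independent manner. To overcome these challenges, we propose MIN2Net, a novel end-to-end multi-task learning approach to this task. We integrate deep metric learning into a multi-task autoencoder to learn a compact and discriminative latent representation from EEG and perform classification simultaneously. This approach reduces the complexity of pre-processing and results in a significant performance improvement in EEG classification. Experimental results in a subject-independent manner show that MIN2Net outperforms state-of-the-art techniques, improving the F1-score by 6.72% and 2.23% on the SMR-BCI and OpenBMI datasets, respectively. We demonstrate that MIN2Net improves the discriminative information in the latent representation. This study indicates the possibility and practicality of using this model to develop MI-based BCI applications for new users without the need for calibration.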

Learning Realistic Patterns from Unrealistic Stimuli: Generalization and Data Anonymization

Konstantinos Nikolaidis , Stein Kristiansen , Thomas Plagemann , Vera Goebel , Knut Liestøl , Mohan Kankanhalli , Gunn Marit Traaen , Britt Øverland , Harriet Akre , Lars Aakerøy

Categories: Machine Learning | Machine Learning (Statistics)

2020-09-21

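Good training data is a prerequisite for developing useful ML applications. In many domains, however, existing datasets cannot be shared because of privacy regulations (e.g., data from medical studies). This work investigates a simple yet unconventional approach to anonymized data synthesis that enables third parties to benefit from such private data. We explore the feasibility of learning implicitly from unrealistic, task-relevant stimuli, which are synthesized by exciting the neurons of a trained deep neural network (DNN). Neuronal excitation thus serves as a pseudo-generative model. The stimuli data is used to train new classification models. Furthermore, we extend this framework to inhibit representations that are associated with specific individuals. We use sleep monitoring data from both an open and a large closed clinical study and evaluate whether (1) end users can create and successfully use customized classification models for sleep apnea detection, and (2) the identity of the participants in the study is protected. An extensive comparative empirical investigation shows that different algorithms trained on the stimuli are able to generalize successfully on the same task as the original model. However, architectural and algorithmic similarity between the new and original models plays an important role in performance. For similar architectures, the performance is close to that obtained with the true data (e.g., an accuracy difference of 0.56% and a Kappa coefficient difference of 0.03-0.04). Further experiments show that the stimuli can, to a large extent, successfully anonymize the participants of the clinical studies.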

U-Sleep: resilient to AASM guidelines

Luigi Fiorillo , Giuliana Monachino , Julia van der Meer , Marco Pesce , Jan Warncke , Markus H. Schmidt , Claudio L. A. Bassetti , Athina Tzovara , Paolo Favaro , Francesca D. Faraci

Categories: Machine Learning

2022-09-19

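The AASM guidelines are the result of decades of effort aimed at standardizing the sleep scoring procedure, with the goal of establishing a commonly used methodology. The guidelines cover several aspects, from technical/digital specifications (e.g., recommended EEG derivations) to detailed sleep scoring rules according to age. In the context of sleep scoring automation, deep learning has demonstrated better performance than many other approaches. Clinical expertise and official guidelines are typically considered essential to support automated sleep scoring algorithms in solving the task. In this paper, we show that a deep learning-based sleep scoring algorithm may not need to fully exploit clinical knowledge or to strictly follow the AASM guidelines. Specifically, we demonstrate that U-Sleep, a state-of-the-art sleep scoring algorithm, can solve the scoring task even when using clinically non-recommended or non-conventional derivations, and without exploiting information about the chronological age of the subjects. Finally, we strengthen a well-known finding: using data from multiple data centers always results in a better-performing model than training on a single cohort. Indeed, we show that this statement remains valid even when the size and heterogeneity of the single data cohort are increased. In all our experiments, we used 28528 polysomnography studies from 13 different clinical studies.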

Few-Shot Learning with a Strong Teacher

Han-Jia Ye , Lu Ming , De-Chuan Zhan , Wei-Lun Chao

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2021-07-01

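Few-shot learning (FSL) aims to generate a classifier using limited labeled examples. Many existing works take the meta-learning approach, constructing a few-shot learner that can learn from few-shot examples to generate a classifier. Typically, the few-shot learner is constructed, or meta-trained, by sampling multiple few-shot tasks in turn and optimizing the few-shot learner's performance in generating classifiers for those tasks. The performance is measured by how well the resulting classifiers classify the test (i.e., query) examples of those tasks. In this paper, we point out two potential weaknesses of this approach. First, the sampled query examples may not provide sufficient supervision for meta-training the few-shot learner. Second, the effectiveness of meta-learning diminishes sharply as the number of shots increases. To resolve these issues, we propose a novel meta-training objective for the few-shot learner, which encourages it to generate classifiers that perform like strong classifiers. Concretely, we associate each sampled few-shot task with a strong classifier, which is trained with ample labeled examples. The strong classifiers can be seen as the target classifiers that we hope the few-shot learner will generate given few-shot examples, and we use them to supervise the few-shot learner. We present an efficient way to construct the strong classifier, making our proposed objective an easy plug-and-play term for existing meta-learning-based FSL methods. We validate our approach, LastShot, in combination with many representative meta-learning methods. On several benchmark datasets, our approach leads to a notable improvement across a variety of tasks. More importantly, with our approach, meta-learning-based FSL methods can outperform non-meta-learning-based methods at different numbers of shots.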

Applications of Unsupervised Deep Transfer Learning to Intelligent Fault Diagnosis: A Survey and Comparative Study

Zhibin Zhao , Qiyang Zhang , Xiaolei Yu , Chuang Sun , Shibin Wang , Ruqiang Yan , Xuefeng Chen

Categories: Machine Learning

2019-12-28

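Recent progress in intelligent fault diagnosis (IFD) has depended greatly on deep representation learning and plenty of labeled data. However, machines often operate under various working conditions, or the target task has a distribution different from that of the data collected for training (the domain shift problem). In addition, newly collected test data in the target domain are usually unlabeled, which leads to the unsupervised deep transfer learning-based (UDTL-based) IFD problem. Although the field has developed considerably, a standard and open-source code framework and a comparative study for UDTL-based IFD have not yet been established. In this paper, we construct a new taxonomy and perform a comprehensive review of UDTL-based IFD according to different tasks. A comparative analysis of some typical methods and datasets reveals several open and essential issues in UDTL-based IFD that are rarely studied, including the transferability of features, the influence of backbones, negative transfer, and physical priors. To emphasize the importance and reproducibility of UDTL-based IFD, the whole test framework will be released to the research community to facilitate future research. In summary, the released framework and comparative study can serve as an extended interface and as baseline results for new studies on UDTL-based IFD. The code framework is available at https://github.com/zhaozhibin/udtl.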

Embracing Annotation Efficient Learning (AEL) for Digital Pathology and Natural Images

Eu Wern Teh

Categories: Computer Vision

2022-12-01

Jitendra Malik once said, "Supervision is the opium of the AI researcher". Most deep learning techniques heavily rely on extreme amounts of human labels to work effectively. In today's world, the rate of data creation greatly surpasses the rate of data annotation. Full reliance on human annotations is just a temporary means to solve current closed problems in AI. In reality, only a tiny fraction of data is annotated. Annotation Efficient Learning (AEL) is a study of algorithms to train models effectively with fewer annotations. To thrive in AEL environments, we need deep learning techniques that rely less on manual annotations (e.g., image, bounding-box, and per-pixel labels), but learn useful information from unlabeled data. In this thesis, we explore five different techniques for handling AEL.

Occlusion-Robust FAU Recognition by Mining Latent Space of Masked Autoencoders

Minyang Jiang , Yongwei Wang , Martin J. McKeown , Z. Jane Wang

Categories: Computer Vision

2022-12-08

Facial action units (FAUs) are critical for fine-grained facial expression analysis. Although FAU detection has been actively studied using ideally high-quality images, it has not been thoroughly studied under heavily occluded conditions. In this paper, we propose the first occlusion-robust FAU recognition method to maintain FAU detection performance under heavy occlusions. Our novel approach takes advantage of rich information from the latent space of a masked autoencoder (MAE) and transforms it into FAU features. Bypassing the occlusion reconstruction step, our model efficiently extracts FAU features of occluded faces by mining the latent space of a pretrained masked autoencoder. Both node-level and edge-level knowledge distillation are also employed to guide our model to find a mapping between latent space vectors and FAU features. Facial occlusion conditions, including random small patches and large blocks, are thoroughly studied. Experimental results on the BP4D and DISFA datasets show that our method can achieve state-of-the-art performance under the studied facial occlusions, significantly outperforming existing baseline methods. In particular, even under heavy occlusion, the proposed method can achieve performance comparable to that of state-of-the-art methods under normal conditions.

Conditional Generative Data-Free Knowledge Distillation based on Attention Transfer

Xinyi YU , Ling Yan , Linlin Ou

Categories: Computer Vision

2021-12-31

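Knowledge distillation has made remarkable achievements in model compression. However, most existing methods require the original training data, while real data are often unavailable in practice because of privacy, security, and transmission limitations. To address this problem, we propose a conditional generative data-free knowledge distillation (CGDD) framework for training efficient portable networks without any real data. In this framework, in addition to the knowledge extracted from the teacher model, we introduce preset labels as auxiliary information to train the generator. The trained generator can then produce meaningful training samples of a specified category on demand. To facilitate the distillation process, in addition to the conventional distillation loss, we treat the preset label as a ground-truth label so that the student network is directly supervised by the category of the synthetic training sample. Moreover, we force the student network to mimic the attention maps of the teacher model, which further improves its performance. To verify the superiority of our method, we design a new evaluation metric called relative accuracy, which allows the effectiveness of different distillation methods to be compared directly. Portable networks trained with the proposed data-free distillation method obtain relative accuracies of 99.63%, 99.07%, and 99.84% on CIFAR10, CIFAR100, and Caltech101, respectively. The experimental results demonstrate the superiority of the proposed method.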
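
The abstract says the student mimics the teacher's attention maps but does not define them. The sketch below follows one widely used activation-based definition, offered here as an assumption rather than as the CGDD implementation: the attention map is the channel-wise sum of squared activations at each spatial position, L2-normalized, and the loss is the Euclidean distance between the student and teacher maps. Feature maps are represented as plain nested lists (channels by flattened spatial positions) to keep the example dependency-free.

```python
import math

def attention_map(feature_maps):
    """Collapse channels x positions activations into one normalized spatial map."""
    n_positions = len(feature_maps[0])
    # Sum of squared activations over channels at each spatial position.
    raw = [sum(channel[p] ** 2 for channel in feature_maps)
           for p in range(n_positions)]
    norm = math.sqrt(sum(v * v for v in raw)) or 1.0  # avoid division by zero
    return [v / norm for v in raw]

def attention_transfer_loss(student_features, teacher_features):
    """Euclidean distance between normalized student and teacher attention maps."""
    s = attention_map(student_features)
    t = attention_map(teacher_features)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)))

# Toy usage: a 2-channel student layer and a 3-channel teacher layer,
# both over the same 4 spatial positions.
student = [[0.1, 0.9, 0.2, 0.0], [0.0, 0.8, 0.1, 0.1]]
teacher = [[0.0, 1.0, 0.0, 0.0], [0.1, 0.9, 0.1, 0.0], [0.0, 0.7, 0.2, 0.0]]
loss = attention_transfer_loss(student, teacher)
```

Because the channel dimension is summed out, the student and teacher layers may have different widths as long as their spatial sizes match.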

Pixel Distillation: A New Knowledge Distillation Scheme for Low-Resolution Image Recognition

Guangyu Guo , Longfei Han , Junwei Han , Dingwen Zhang

Categories: Computer Vision | Machine Learning

2021-12-17

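The great success of deep learning is mainly due to large-scale network architectures and high-quality training data. However, deploying recent deep models on portable devices with limited memory and imaging ability remains challenging. Some existing works compress the model via knowledge distillation. Unfortunately, these methods cannot handle images of reduced quality, such as low-resolution (LR) images. To this end, we make a pioneering effort to distill helpful knowledge from a heavy network model learned from high-resolution (HR) images into a compact network model that handles LR images, thus advancing current knowledge distillation techniques with the novel pixel distillation. To achieve this goal, we propose a Teacher-Assistant-Student (TAS) framework, which decomposes knowledge distillation into a model compression stage and a high-resolution representation transfer stage. By incorporating a novel Feature Super Resolution (FSR) module, our approach can learn a lightweight network model that achieves accuracy similar to that of the heavy teacher model but with fewer parameters, faster inference speed, and lower-resolution inputs. Comprehensive experiments on three widely used benchmarks, i.e., CUB-200-2011, PASCAL VOC 2007, and ImageNetSub, demonstrate the effectiveness of our approach.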
