Smart Paper Notes

Axial-LOB: High-Frequency Trading with Axial Attention

Damian Kisiel , Denise Gorse

Category: Machine Learning

2022-12-04

Previous attempts to predict stock price from limit order book (LOB) data are mostly based on deep convolutional neural networks. Although convolutions offer efficiency by restricting their operations to local interactions, it is at the cost of potentially missing out on the detection of long-range dependencies. Recent studies address this problem by employing additional recurrent or attention layers that increase computational complexity. In this work, we propose Axial-LOB, a novel fully-attentional deep learning architecture for predicting price movements of stocks from LOB data. By utilizing gated position-sensitive axial attention layers our architecture is able to construct feature maps that incorporate global interactions, while significantly reducing the size of the parameter space. Unlike previous works, Axial-LOB does not rely on hand-crafted convolutional kernels and hence has stable performance under input permutations and the capacity to incorporate additional LOB features. The effectiveness of Axial-LOB is demonstrated on a large benchmark dataset, containing time series representations of millions of high-frequency trading events, where our model establishes a new state of the art, achieving an excellent directional classification performance at all tested prediction horizons.
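
The axial idea in the abstract above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: it uses plain scaled dot-product attention and leaves out the gating and position-sensitive terms the abstract mentions, and the sizes `T`, `F`, `C` and the weight lists `w_time` and `w_feat` are hypothetical names chosen here, not taken from the paper. Attending along one axis at a time needs score matrices of size T x T and F x F instead of one (T·F) x (T·F) matrix.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def axial_attention(x, wq, wk, wv, axis):
    """Plain self-attention along one axis of a (T, F, C) feature map.

    axis=0 attends over time steps, axis=1 over LOB feature columns.
    """
    if axis == 0:
        x = np.swapaxes(x, 0, 1)          # (F, T, C): each column is a sequence over time
    q, k, v = x @ wq, x @ wk, x @ wv      # per-position linear projections
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    out = softmax(scores, axis=-1) @ v    # mixes positions only within each sequence
    return np.swapaxes(out, 0, 1) if axis == 0 else out

rng = np.random.default_rng(0)
T, F, C = 8, 6, 4                         # hypothetical: time steps, LOB columns, channels
x = rng.normal(size=(T, F, C))
w_time = [rng.normal(size=(C, C)) for _ in range(3)]  # assumed weights for the time pass
w_feat = [rng.normal(size=(C, C)) for _ in range(3)]  # assumed weights for the feature pass

# Two 1-D passes let every output position depend on the whole T x F grid.
y = axial_attention(axial_attention(x, *w_time, axis=0), *w_feat, axis=1)
print(y.shape)
```

Because each pass treats the other axis as a batch dimension, a single pass along the feature axis never mixes information between different time steps; it is the composition of the two passes that yields global interactions.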

translated by Google Translate

Augmented Bilinear Network for Incremental Multi-Stock Time-Series Classification

Mostafa Shabani , Dat Thanh Tran , Juho Kanniainen , Alexandros Iosifidis

Category: Machine Learning

2022-07-23

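Deep learning models have become dominant in tackling financial time-series analysis problems, overturning conventional machine learning and statistical methods. Most often, a model trained for one market or security cannot be directly applied to another market or security because of differences inherent in market conditions. In addition, as the market evolves over time, it is necessary to update existing models or train new ones when new data becomes available. This scenario, which is inherent in most financial forecasting applications, naturally raises the following research question: how can a pre-trained model be efficiently adapted to a new set of data while retaining performance on the old data, especially when the old data is not accessible? In this paper, we propose a method to efficiently retain the knowledge available in a neural network pre-trained on a set of securities and adapt it to achieve high performance on new ones. In our method, the prior knowledge encoded in a pre-trained neural network is maintained by keeping existing connections fixed, and this knowledge is adjusted for the new securities by a set of augmented connections, which are optimized using the new data. The auxiliary connections are constrained to be of low rank. This not only allows us to rapidly optimize for the new task but also reduces the storage and run-time complexity during the deployment phase. The efficiency of our approach is empirically validated on the stock mid-price movement prediction problem using a large-scale limit order book dataset. Experimental results show that our approach enhances prediction performance and reduces the overall number of network parameters.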
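
The fixed-plus-low-rank idea described above can be illustrated with a short NumPy sketch. It makes several assumptions that are not from the paper: it uses an ordinary linear layer rather than the bilinear layers named in the title, and the sizes `d_in`, `d_out`, `rank` and the factor names `A` and `B` are hypothetical. It only shows why a low-rank augmentation adds few parameters and leaves the frozen weights untouched.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 64, 32, 4             # hypothetical layer sizes and rank

W_old = rng.normal(size=(d_out, d_in))    # pre-trained weights, kept fixed
A = np.zeros((d_out, rank))               # trainable low-rank factor, starts at zero
B = rng.normal(size=(rank, d_in)) * 0.01  # trainable low-rank factor

def forward(x, W_old, A, B):
    # Frozen path plus low-rank augmented path, equal to (W_old + A @ B) @ x.
    return W_old @ x + A @ (B @ x)

x = rng.normal(size=d_in)
y = forward(x, W_old, A, B)

full_params = W_old.size                  # parameters of the frozen layer
extra_params = A.size + B.size            # parameters added for the new securities
print(full_params, extra_params)
```

With `A` initialised to zero the augmented layer reproduces the pre-trained output exactly, so adaptation starts from the old model; only `A` and `B` would be updated on the new data.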

Trading with the Momentum Transformer: An Intelligent and Interpretable Architecture

Kieran Wood , Sven Giegerich , Stephen Roberts , Stefan Zohren

Category: Machine Learning | Machine Learning (Statistics)

2021-12-16

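Deep learning architectures, specifically Deep Momentum Networks (DMNs) [1904.04912], have been found to be an effective approach to momentum and mean-reversion trading. However, some of the key challenges in recent years involve learning long-term dependencies, degradation of performance when considering returns net of transaction costs, and adapting to new market regimes, notably during the SARS-CoV-2 crisis. Attention mechanisms, or Transformer-based architectures, are a solution to such challenges because they allow the network to focus on significant time steps in the past and on longer-term patterns. We introduce the Momentum Transformer, an attention-based architecture that outperforms the benchmarks and is inherently interpretable, giving us greater insight into our deep learning trading strategy. Our model is an extension of the LSTM-based DMN, which directly outputs position sizing by optimising the network on a risk-adjusted performance metric such as the Sharpe ratio. We find that an attention-LSTM hybrid, decoder-only Temporal Fusion Transformer (TFT) style architecture is the best performing model. In terms of interpretability, we observe remarkable structure in the attention patterns, with significant peaks of importance at momentum turning points. The time series is thus segmented into regimes, and the model tends to focus on previous time steps in similar regimes. We find that changepoint detection (CPD) [2105.13727], another technique for responding to regime change, can complement multi-headed attention, especially when we run CPD at multiple timescales. Through the addition of an interpretable variable selection network, we observe how CPD helps our model move away from trading predominantly on daily returns data. We note that the model can intelligently switch between, and blend, classical strategies, basing its decision on patterns in the data.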
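
The abstract above mentions optimising directly on a risk-adjusted metric such as the Sharpe ratio. A minimal sketch of such an objective follows; it is an assumption-laden illustration, not the authors' loss: it ignores transaction costs and the risk-free rate, assumes `returns` are already aligned as next-period returns, and the function name `negative_sharpe` and the annualisation factor of 252 are choices made here.

```python
import numpy as np

def negative_sharpe(positions, returns, periods_per_year=252):
    """Negative annualised Sharpe ratio of the strategy returns.

    positions: position sizes output by a model, one per period (assumed in [-1, 1])
    returns:   asset returns realised over the period each position is held
    """
    strat = positions * returns           # per-period strategy return
    return -np.sqrt(periods_per_year) * strat.mean() / strat.std()

positions = np.array([1.0, 1.0, -1.0, 1.0])   # hypothetical model outputs
returns = np.array([0.01, 0.02, -0.01, 0.0])  # hypothetical next-period returns
loss = negative_sharpe(positions, returns)
print(loss)
```

Minimising this quantity with respect to the positions rewards a high mean return relative to its volatility, which is why the position sizes can be trained end to end instead of being derived from a separate price forecast.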

Novel Modelling Strategies for High-frequency Stock Trading Data

Xuekui Zhang , Yuying Huang , Ke Xu , Li Xing

Category: Machine Learning

2022-11-30

Full electronic automation in stock exchanges has recently become popular, generating high-frequency intraday data and motivating the development of near real-time price forecasting methods. Machine learning algorithms are widely applied to mid-price stock predictions. Processing raw data as inputs for prediction models (e.g., data thinning and feature engineering) can primarily affect the performance of the prediction methods. However, researchers rarely discuss this topic. This motivated us to propose three novel modelling strategies for processing raw data. We illustrate how our novel modelling strategies improve forecasting performance by analyzing high-frequency data of the Dow Jones 30 component stocks. In these experiments, our strategies often lead to statistically significant improvement in predictions. The three strategies improve the F1 scores of the SVM models by 0.056, 0.087, and 0.016, respectively.

ConTraNet: A single end-to-end hybrid network for EEG-based and EMG-based human machine interfaces

Omair Ali , Muhammad Saif-ur-Rehman , Tobias Glasmachers , Ioannis Iossifidis , Christian Klaes

Category: Machine Learning

2022-06-21

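Objective: Electroencephalography (EEG) and electromyography (EMG) are two non-invasive bio-signals that are widely used in human machine interface (HMI) technologies (the EEG-HMI and EMG-HMI paradigms) for the rehabilitation of physically disabled people. Successfully decoding EEG and EMG signals into their respective control commands is a pivotal step in the rehabilitation process. Recently, several architectures based on convolutional neural networks (CNNs) have been proposed that map the raw time-series signal directly into decision space, performing feature extraction and classification simultaneously. However, these networks are tailored to learn the expected characteristics of a given bio-signal and are limited to a single paradigm. In this work, we address the question of whether we can build a single architecture that is able to learn distinct features from different HMI paradigms and still classify them successfully. Approach: We introduce a single hybrid model called ConTraNet, based on CNN and Transformer architectures, that is equally useful for the EEG-HMI and EMG-HMI paradigms. ConTraNet uses a CNN block to introduce inductive bias into the model and learn local dependencies, whereas the Transformer block uses the self-attention mechanism to learn the long-range dependencies in the signal, which are crucial for the classification of EEG and EMG signals. Main results: We evaluated ConTraNet and compared it with state-of-the-art methods on three publicly available datasets belonging to the EEG-HMI and EMG-HMI paradigms. ConTraNet outperformed its counterparts in all category tasks (2-class, 3-class, 4-class, and 10-class decoding tasks). Significance: The results suggest that ConTraNet robustly learns distinct features from different HMI paradigms and generalizes well compared to current state-of-the-art algorithms.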

Paying Attention to Astronomical Transients: Introducing the Time-series Transformer for Photometric Classification

Tarek Allam Jr. , Jason D. McEwen

Category: Machine Learning

2021-05-13

Future surveys such as the Legacy Survey of Space and Time (LSST) of the Vera C. Rubin Observatory will observe an order of magnitude more astrophysical transient events than any previous survey. With this deluge of photometric data, it will be impossible for all such events to be classified by humans alone. Recent efforts have sought to leverage machine learning methods to tackle the challenge of astronomical transient classification, with ever improving success. Transformers are a recently developed deep learning architecture, first proposed for natural language processing, that have shown a great deal of recent success. In this work we develop a new transformer architecture, which uses multi-head self attention at its core, for general multi-variate time-series data. Furthermore, the proposed time-series transformer architecture supports the inclusion of an arbitrary number of additional features, while also offering interpretability. We apply the time-series transformer to the task of photometric classification, minimising the reliance on expert domain knowledge for feature selection, while achieving results comparable to state-of-the-art photometric classification methods. We achieve a logarithmic-loss of 0.507 on imbalanced data in a representative setting using data from the Photometric LSST Astronomical Time-Series Classification Challenge (PLAsTiCC). Moreover, we achieve a micro-averaged receiver operating characteristic area under curve of 0.98 and micro-averaged precision-recall area under curve of 0.87.

A transformer-based model for default prediction in mid-cap corporate markets

Kamesh Korangi , Christophe Mues , Cristián Bravo

Category: Machine Learning

2021-11-18

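In this paper, we study mid-cap companies, i.e. publicly traded companies with less than US $10 billion in market capitalisation. Using a large dataset of US mid-cap companies observed over 30 years, we look to predict the default probability term structure over the medium term and to understand which data sources (i.e. fundamental, market or pricing data) contribute most to default risk. Whereas existing methods typically require that data from different time periods first be aggregated and turned into cross-sectional features, we frame the problem as a multi-label time-series classification problem. We adapt transformer models, a state-of-the-art deep learning model emanating from the natural language processing domain, to the credit risk modelling setting. We also interpret the predictions of these models using attention heat maps. To optimise the model further, we present a custom loss function for multi-label classification and a novel multi-channel architecture with differential training that gives the model the ability to use all input data efficiently. Our results show the superior performance of the proposed deep learning architecture, resulting in a 13% improvement in AUC (area under the receiver operating characteristic curve) over traditional models. We also demonstrate how to produce an importance ranking of the different data sources and the temporal relationships using a Shapley approach specific to these models.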

Attention Mechanisms in Computer Vision: A Survey

Meng-Hao Guo , Tian-Xing Xu , Jiang-Jiang Liu , Zheng-Ning Liu , Peng-Tao Jiang , Tai-Jiang Mu , Song-Hai Zhang , Ralph R. Martin , Ming-Ming Cheng , Shi-Min Hu

Category: Computer Vision

2021-11-15

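Humans can naturally and effectively find salient regions in complex scenes. Motivated by this observation, attention mechanisms were introduced into computer vision with the aim of imitating this aspect of the human visual system. Such an attention mechanism can be regarded as a dynamic weight adjustment process based on features of the input image. Attention mechanisms have achieved great success in many visual tasks, including image classification, object detection, semantic segmentation, video understanding, image generation, 3D vision, multi-modal tasks and self-supervised learning. In this survey, we provide a comprehensive review of the various attention mechanisms in computer vision and categorize them by approach: channel attention, spatial attention, temporal attention and branch attention. A related repository, https://github.com/menghaoguo/awesome-vision-tions, is dedicated to collecting related work. We also suggest future directions for attention mechanism research.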

Price graphs: Utilizing the structural information of financial time series for stock prediction

Junran Wu , Ke Xu , Xueyuan Chen , Shangzhe Li , Jichang Zhao

Category: Machine Learning

2021-06-04

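Great research efforts have been devoted to exploiting deep neural networks in stock prediction. However, long-range dependencies and the chaotic property remain two major issues that lower the performance of state-of-the-art deep learning models in forecasting future price trends. In this study, we propose a novel framework to address both issues. Specifically, following the approach of transforming time series into complex networks, we convert market price series into graphs. Structural information, namely the associations among temporal points and the node weights, is then extracted from the mapped graphs to resolve the problems of long-range dependencies and the chaotic property. We use graph embeddings that represent the associations among temporal points as inputs to the prediction model. Node weights are used as a priori knowledge to enhance the learning of temporal attention. The effectiveness of our proposed framework is validated using real-world stock data, and our approach obtains the best performance among several state-of-the-art benchmarks. Moreover, in the trading simulations conducted, our framework also obtains the highest cumulative profits. Our results supplement the existing applications of complex network methods in the financial realm and provide insightful implications for investment applications regarding decision support in financial markets.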

Transformers in Vision: A Survey

Salman Khan , Muzammal Naseer , Munawar Hayat , Syed Waqas Zamir , Fahad Shahbaz Khan , Mubarak Shah

Category:

2021-01-04

Astounding results from Transformer models on natural language tasks have intrigued the vision community to study their application to computer vision problems. Among their salient benefits, Transformers enable modeling long dependencies between input sequence elements and support parallel processing of sequence as compared to recurrent networks e.g., Long short-term memory (LSTM). Different from convolutional networks, Transformers require minimal inductive biases for their design and are naturally suited as set-functions. Furthermore, the straightforward design of Transformers allows processing multiple modalities (e.g., images, videos, text and speech) using similar processing blocks and demonstrates excellent scalability to very large capacity networks and huge datasets. These strengths have led to exciting progress on a number of vision tasks using Transformer networks. This survey aims to provide a comprehensive overview of the Transformer models in the computer vision discipline. We start with an introduction to fundamental concepts behind the success of Transformers i.e., self-attention, large-scale pre-training, and bidirectional feature encoding. We then cover extensive applications of transformers in vision including popular recognition tasks (e.g., image classification, object detection, action recognition, and segmentation), generative modeling, multi-modal tasks (e.g., visual-question answering, visual reasoning, and visual grounding), video processing (e.g., activity recognition, video forecasting), low-level vision (e.g., image super-resolution, image enhancement, and colorization) and 3D analysis (e.g., point cloud classification and segmentation). We compare the respective advantages and limitations of popular techniques both in terms of architectural design and their experimental value. Finally, we provide an analysis on open research directions and possible future works. 
We hope this effort will ignite further interest in the community to solve current challenges towards the application of transformer models in computer vision.

Stock Market Prediction via Deep Learning Techniques: A Survey

Jinan Zou , Qingying Zhao , Yang Jiao , Haiyao Cao , Yanxi Liu , Qingsen Yan , Ehsan Abbasnejad , Lingqiao Liu , Javen Qinfeng Shi

Category: Artificial Intelligence

2022-12-24

The stock market prediction has been a traditional yet complex problem researched within diverse research areas and application domains due to its non-linear, highly volatile and complex nature. Existing surveys on stock market prediction often focus on traditional machine learning methods instead of deep learning methods. Deep learning has dominated many domains, gained much success and popularity in recent years in stock market prediction. This motivates us to provide a structured and comprehensive overview of the research on stock market prediction focusing on deep learning techniques. We present four elaborated subtasks of stock market prediction and propose a novel taxonomy to summarize the state-of-the-art models based on deep neural networks from 2011 to 2022. In addition, we also provide detailed statistics on the datasets and evaluation metrics commonly used in the stock market. Finally, we highlight some open issues and point out several future directions by sharing some new perspectives on stock market prediction.

On the Integration of Self-Attention and Convolution

Xuran Pan , Chunjiang Ge , Rui Lu , Shiji Song , Guanfu Chen , Zeyi Huang , Gao Huang

Category: Computer Vision

2021-11-29

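Convolution and self-attention are two powerful techniques for representation learning, and they are usually considered two peer approaches that are distinct from each other. In this paper, we show that there exists a strong underlying relation between them, in the sense that the bulk of the computation in these two paradigms is in fact done with the same operation. Specifically, we first show that a traditional convolution with kernel size k x k can be decomposed into k^2 individual 1x1 convolutions, followed by shift and summation operations. Then, we interpret the projections of queries, keys, and values in the self-attention module as multiple 1x1 convolutions, followed by the computation of attention weights and the aggregation of the values. Therefore, the first stage of both modules comprises a similar operation. More importantly, the first stage contributes the dominant computational complexity (quadratic in the channel size) compared to the second stage. This observation naturally leads to an elegant integration of these two seemingly distinct paradigms, i.e., a mixed model that enjoys the benefits of both self-attention and convolution (ACmix) while having minimal computational overhead compared to its pure convolution or self-attention counterparts. Extensive experiments show that our model achieves consistently improved results over competitive baselines on image recognition and downstream tasks. Code and pre-trained models will be released at https://github.com/panxuran/acmix and https://gitee.com/mindspore/models.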
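
The decomposition claimed in the abstract above (a k x k convolution equals k^2 separate 1x1 convolutions followed by shifts and a sum) can be checked numerically. The sketch below is an illustration written for this note, not code from the ACmix repository; it assumes stride 1, zero padding, an odd kernel size, and the cross-correlation convention used by deep learning libraries. The function names `conv_direct` and `conv_shift_sum` are hypothetical.

```python
import numpy as np

def conv_direct(x, K):
    """Reference k x k convolution over a (C_in, H, W) input, sliding a window."""
    c_out, _, k, _ = K.shape
    p = k // 2
    _, H, W = x.shape
    xpad = np.pad(x, ((0, 0), (p, p), (p, p)))      # zero padding keeps the H x W size
    out = np.zeros((c_out, H, W))
    for row in range(H):
        for col in range(W):
            window = xpad[:, row:row + k, col:col + k]
            out[:, row, col] = np.einsum("ocij,cij->o", K, window)
    return out

def conv_shift_sum(x, K):
    """Same result via k^2 1x1 convolutions, each shifted and then summed."""
    c_out, _, k, _ = K.shape
    p = k // 2
    _, H, W = x.shape
    out = np.zeros((c_out, H, W))
    for i in range(k):
        for j in range(k):
            f = np.einsum("oc,chw->ohw", K[:, :, i, j], x)  # 1x1 convolution
            fpad = np.pad(f, ((0, 0), (p, p), (p, p)))
            out += fpad[:, i:i + H, j:j + W]                # shift by (i - p, j - p)
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 6, 7))        # hypothetical input: 3 channels, 6 x 7 pixels
K = rng.normal(size=(5, 3, 3, 3))     # hypothetical kernel: 5 output channels, 3 x 3
print(np.allclose(conv_direct(x, K), conv_shift_sum(x, K)))
```

The 1x1 convolutions are where the channel mixing happens, so their cost grows with the product of input and output channels, while the shift-and-sum stage involves no channel mixing; this is the sense in which the first stage dominates the computation.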

Efficient deep learning models for land cover image classification

Ioannis Papoutsis , Nikolaos-Ioannis Bountos , Angelos Zavras , Dimitrios Michail , Christos Tryfonopoulos

Category: Computer Vision

2021-11-18

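The availability of the sheer volume of Copernicus Sentinel imagery has created new opportunities for land use land cover (LULC) mapping at large scales using deep learning. Training on such large datasets, though, is a non-trivial task. In this work we experiment with the BigEarthNet dataset for LULC image classification and benchmark different state-of-the-art models, including convolutional neural networks, multi-layer perceptrons, vision transformers, EfficientNets and Wide Residual Network (WRN) architectures. Our aim is to balance classification accuracy, training time and inference rate. We propose a framework based on EfficientNets for compound scaling of WRNs in terms of network depth, width and input data resolution, in order to train and test different model setups efficiently. We design a novel scaled WRN architecture enhanced with an Efficient Channel Attention mechanism. Our proposed lightweight model has fewer trainable parameters, achieves a 4.5% higher averaged F-score classification accuracy over all 19 LULC classes, and trains two times faster than the state-of-the-art ResNet50 model that we use as a baseline. We provide access to more than 50 trained models, along with our code for distributed training on multiple GPU nodes.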

Defect Transformer: An Efficient Hybrid Transformer Architecture for Surface Defect Detection

Junpu Wang , Guili Xu , Fuju Yan , Jinjin Wang , Zhengsheng Wang

Category: Computer Vision

2022-07-17

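Surface defect detection is an extremely crucial step in ensuring the quality of industrial products. Nowadays, convolutional neural networks (CNNs) based on an encoder-decoder architecture have achieved tremendous success in various defect detection tasks. However, due to the intrinsic locality of convolution, they commonly exhibit a limitation in explicitly modeling long-range interactions, which are critical for pixel-wise defect detection in complex cases such as cluttered backgrounds and illegible pseudo-defects. Recent transformers are especially skilled at learning global image dependencies, but carry limited local structural information of the kind needed for detailed defect localization. To overcome these limitations, we propose an efficient hybrid transformer architecture, termed Defect Transformer (DefT), for surface defect detection, which incorporates CNN and transformer into a unified model to capture local and non-local relationships collaboratively. Specifically, in the encoder module, a convolutional stem block is first adopted to retain more detailed spatial information. Patch aggregation blocks are then used to generate a multi-scale representation with four hierarchies, each followed by a series of DefT blocks. Each DefT block includes a locally position block for local position encoding, a lightweight multi-pooling self-attention that models multi-scale global contextual relationships with good computational efficiency, and a convolutional feed-forward network for feature transformation and further location information learning. Finally, a simple but effective decoder module is proposed to gradually recover spatial details from the skip connections in the encoder. Extensive experiments on three datasets demonstrate the superiority and efficiency of our method compared with other CNN-based networks.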

Learned Queries for Efficient Local Attention

Moab Arar , Ariel Shamir , Amit H. Bermano

Category: Computer Vision

2021-12-21

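Vision Transformers (ViT) serve as powerful vision models. Unlike convolutional neural networks, which dominated vision research in previous years, vision transformers enjoy the ability to capture long-range dependencies in the data. Nonetheless, the self-attention mechanism, an integral part of any transformer architecture, suffers from high latency and inefficient memory utilization, making it less suitable for high-resolution input images. To alleviate these shortcomings, hierarchical vision models apply self-attention locally on non-interleaving windows. This relaxation reduces the complexity with respect to the input size; however, it limits cross-window interaction, hurting model performance. In this paper, we propose a new shift-invariant local attention layer, called query and attend (QnA), that aggregates the input locally in an overlapping manner, much like convolutions. The key idea behind QnA is to introduce learned queries, which allow a fast and efficient implementation. We verify the effectiveness of our layer by incorporating it into a hierarchical vision transformer model. We show improvements in speed and memory complexity while achieving accuracy comparable to state-of-the-art models. Finally, our layer scales especially well with window size, requiring up to 10x less memory while being faster than existing methods.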

Attention Augmented Convolutional Networks

Irwan Bello , Barret Zoph , Ashish Vaswani , Jonathon Shlens , Quoc V. Le

Category:

2019-04-22

Convolutional networks have been the paradigm of choice in many computer vision applications. The convolution operation however has a significant weakness in that it only operates on a local neighborhood, thus missing global information. Self-attention, on the other hand, has emerged as a recent advance to capture long range interactions, but has mostly been applied to sequence modeling and generative modeling tasks. In this paper, we consider the use of self-attention for discriminative visual tasks as an alternative to convolutions. We introduce a novel two-dimensional relative self-attention mechanism that proves competitive in replacing convolutions as a stand-alone computational primitive for image classification. We find in control experiments that the best results are obtained when combining both convolutions and self-attention. We therefore propose to augment convolutional operators with this self-attention mechanism by concatenating convolutional feature maps with a set of feature maps produced via self-attention. Extensive experiments show that Attention Augmentation leads to consistent improvements in image classification on ImageNet and object detection on COCO across many different models and scales, including ResNets and a state-of-the-art mobile constrained network, while keeping the number of parameters similar. In particular, our method achieves a 1.3% top-1 accuracy improvement on ImageNet classification over a ResNet50 baseline and outperforms other attention mechanisms for images such as . It also achieves an improvement of 1.4 mAP in COCO Object Detection on top of a RetinaNet baseline.

Deep Learning for Time Series Anomaly Detection: A Survey

Zahra Zamanzadeh Darban , Geoffrey I. Webb , Shirui Pan , Charu C. Aggarwal , Mahsa Salehi

Category: Machine Learning | Artificial Intelligence

2022-11-09

Time series anomaly detection has applications in a wide range of research fields and applications, including manufacturing and healthcare. The presence of anomalies can indicate novel or unexpected events, such as production faults, system defects, or heart fluttering, and is therefore of particular interest. The large size and complex patterns of time series have led researchers to develop specialised deep learning models for detecting anomalous patterns. This survey provides a structured and comprehensive overview of state-of-the-art deep learning models for time series anomaly detection. It provides a taxonomy based on the factors that divide anomaly detection models into different categories. Aside from describing the basic anomaly detection technique for each category, the advantages and limitations are also discussed. Furthermore, this study includes examples of deep anomaly detection in time series across various application domains in recent years. It finally summarises open issues in research and challenges faced while adopting deep anomaly detection models.

Convolution-enhanced Evolving Attention Networks

Yujing Wang , Yaming Yang , Zhuo Li , Jiangang Bai , Mingliang Zhang , Xiangtai Li , Jing Yu , Ce Zhang , Gao Huang , Yunhai Tong

Category: Machine Learning | Natural Language Processing | Computer Vision | Neural and Evolutionary Computing

2022-12-16

Attention-based neural networks, such as Transformers, have become ubiquitous in numerous applications, including computer vision, natural language processing, and time-series analysis. In all kinds of attention networks, the attention maps are crucial as they encode semantic dependencies between input tokens. However, most existing attention networks perform modeling or reasoning based on representations, wherein the attention maps of different layers are learned separately without explicit interactions. In this paper, we propose a novel and generic evolving attention mechanism, which directly models the evolution of inter-token relationships through a chain of residual convolutional modules. The major motivations are twofold. On the one hand, the attention maps in different layers share transferable knowledge, thus adding a residual connection can facilitate the information flow of inter-token relationships across layers. On the other hand, there is naturally an evolutionary trend among attention maps at different abstraction levels, so it is beneficial to exploit a dedicated convolution-based module to capture this process. Equipped with the proposed mechanism, the convolution-enhanced evolving attention networks achieve superior performance in various applications, including time-series representation, natural language understanding, machine translation, and image classification. Especially on time-series representation tasks, Evolving Attention-enhanced Dilated Convolutional (EA-DC-) Transformer outperforms state-of-the-art models significantly, achieving an average of 17% improvement compared to the best SOTA. To the best of our knowledge, this is the first work that explicitly models the layer-wise evolution of attention maps. Our implementation is available at https://github.com/pkuyym/EvolvingAttention

A Deep Learning Approach for the Segmentation of Electroencephalography Data in Eye Tracking Applications

Lukas Wolf , Ard Kastrati , Martyna Beata Płomecka , Jie-Ming Li , Dustin Klebe , Alexander Veicht , Roger Wattenhofer , Nicolas Langer

Category: Machine Learning | Artificial Intelligence

2022-06-17

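The collection of eye gaze information provides a window into many critical aspects of human cognition, health and behaviour. Additionally, many neuroscientific studies complement the behavioural information gained from eye tracking with the high temporal resolution and neurophysiological markers provided by electroencephalography (EEG). One of the essential eye-tracking software processing steps is the segmentation of the continuous data stream into events relevant to eye-tracking applications, such as saccades, fixations, and blinks. Here, we introduce DETRtime, a novel framework for time-series segmentation that creates ocular event detectors which do not require an additionally recorded eye-tracking modality and rely solely on EEG data. Our end-to-end deep learning-based framework brings recent advances in computer vision to the forefront of time-series segmentation of EEG data. DETRtime achieves state-of-the-art performance in ocular event detection across diverse eye-tracking experiment paradigms. In addition, we provide evidence that our model generalizes well to the task of EEG sleep stage segmentation.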

EAN: Event Adaptive Network for Enhanced Action Recognition

Yuan Tian , Yichao Yan , Guangtao Zhai , Guodong Guo , Zhiyong Gao

Category: Computer Vision

2021-07-22

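Efficiently modeling spatial-temporal information in videos is crucial for action recognition. To achieve this goal, state-of-the-art methods typically employ the convolution operator and dense interaction modules such as non-local blocks. However, these methods cannot accurately fit the diverse events in videos. On the one hand, the adopted convolutions have fixed scales and thus struggle with events of various scales. On the other hand, the dense interaction modeling paradigm achieves only sub-optimal performance, because action-irrelevant parts bring additional noise into the final prediction. In this paper, we propose a unified action recognition framework that investigates the dynamic nature of video content by introducing the following designs. First, when extracting local cues, we generate spatial-temporal kernels of dynamic scale to adaptively fit the diverse events. Second, to accurately aggregate these cues into a global video representation, we propose to mine the interactions only among a few selected foreground objects with a Transformer, which yields a sparse paradigm. We call the proposed framework the Event Adaptive Network (EAN) because both key designs adapt to the input video content. To exploit the short-term motions within local segments, we propose a novel and efficient Latent Motion Code (LMC) module, which further improves the performance of the framework. Extensive experiments on several large-scale video datasets, e.g., Something-Something, Kinetics, and Diving48, verify that our models achieve state-of-the-art or competitive performance at low FLOPs. Code is available at: https://github.com/tianyuan168326/ean-pytorch.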
