The current success of machine learning on image-based combustion monitoring is based on massive data, which is costly even impossible for industrial applications. To address this conflict, we introduce few-shot learning in order to achieve combustion monitoring and classification for the first time. Two algorithms, Siamese Network coupled with k Nearest Neighbors (SN-kNN) and Prototypical Network (PN), were tested. Rather than utilizing solely visible images as discussed in previous studies, we also used Infrared (IR) images. We analyzed the training process, test performance and inference speed of two algorithms on both image formats, and also used t-SNE to visualize learned features. The results demonstrated that both SN-kNN and PN were capable to distinguish flame states from learning with merely 20 images per flame state. The worst performance, which was realized by PN on IR images, still possessed precision, accuracy, recall, and F1-score above 0.95. We showed that visible images demonstrated more substantial differences between classes and presented more consistent patterns inside the class, which made the training speed and model performance better compared to IR images. In contrast, the relatively low quality of IR images made it difficult for PN to extract distinguishable prototypes, which caused relatively weak performance. With the entrire training set supporting classification, SN-kNN performed well with IR images. On the other hand, benefitting from the architecture design, PN has a much faster speed in training and inference than SN-kNN. The presented work analyzed the characteristics of both algorithms and image formats for the first time, thus providing guidance for their future utilization in combustion monitoring tasks.
translated by 谷歌翻译
很少有图像分类是一个具有挑战性的问题,旨在仅基于少量培训图像来达到人类的识别水平。少数图像分类的一种主要解决方案是深度度量学习。这些方法是,通过将看不见的样本根据距离的距离进行分类,可在强大的深神经网络中学到的嵌入空间中看到的样品,可以避免以少数图像分类的少数训练图像过度拟合,并实现了最新的图像表现。在本文中,我们提供了对深度度量学习方法的最新审查,以进行2018年至2022年的少量图像分类,并根据度量学习的三个阶段将它们分为三组,即学习功能嵌入,学习课堂表示和学习距离措施。通过这种分类法,我们确定了他们面临的不同方法和问题的新颖性。我们通过讨论当前的挑战和未来趋势进行了少量图像分类的讨论。
translated by 谷歌翻译
Metric-based meta-learning is one of the de facto standards in few-shot learning. It composes of representation learning and metrics calculation designs. Previous works construct class representations in different ways, varying from mean output embedding to covariance and distributions. However, using embeddings in space lacks expressivity and cannot capture class information robustly, while statistical complex modeling poses difficulty to metric designs. In this work, we use tensor fields (``areas'') to model classes from the geometrical perspective for few-shot learning. We present a simple and effective method, dubbed hypersphere prototypes (HyperProto), where class information is represented by hyperspheres with dynamic sizes with two sets of learnable parameters: the hypersphere's center and the radius. Extending from points to areas, hyperspheres are much more expressive than embeddings. Moreover, it is more convenient to perform metric-based classification with hypersphere prototypes than statistical modeling, as we only need to calculate the distance from a data point to the surface of the hypersphere. Following this idea, we also develop two variants of prototypes under other measurements. Extensive experiments and analysis on few-shot learning tasks across NLP and CV and comparison with 20+ competitive baselines demonstrate the effectiveness of our approach.
translated by 谷歌翻译
在深度学习研究中,自学学习(SSL)引起了极大的关注,引起了计算机视觉和遥感社区的兴趣。尽管计算机视觉取得了很大的成功,但SSL在地球观测领域的大部分潜力仍然锁定。在本文中,我们对在遥感的背景下为计算机视觉的SSL概念和最新发展提供了介绍,并回顾了SSL中的概念和最新发展。此外,我们在流行的遥感数据集上提供了现代SSL算法的初步基准,从而验证了SSL在遥感中的潜力,并提供了有关数据增强的扩展研究。最后,我们确定了SSL未来研究的有希望的方向的地球观察(SSL4EO),以铺平了两个领域的富有成效的相互作用。
translated by 谷歌翻译
随着深度学习技术的快速发展和计算能力的提高,深度学习已广泛应用于高光谱图像(HSI)分类领域。通常,深度学习模型通常包含许多可训练参数,并且需要大量标记的样品来实现最佳性能。然而,关于HSI分类,由于手动标记的难度和耗时的性质,大量标记的样本通常难以获取。因此,许多研究工作侧重于建立一个少数标记样本的HSI分类的深层学习模型。在本文中,我们专注于这一主题,并对相关文献提供系统审查。具体而言,本文的贡献是双重的。首先,相关方法的研究进展根据学习范式分类,包括转移学习,积极学习和少量学习。其次,已经进行了许多具有各种最先进的方法的实验,总结了结果以揭示潜在的研究方向。更重要的是,虽然深度学习模型(通常需要足够的标记样本)和具有少量标记样本的HSI场景之间存在巨大差距,但是通过深度学习融合,可以很好地表征小样本集的问题方法和相关技术,如转移学习和轻量级模型。为了再现性,可以在HTTPS://github.com/shuguoj/hsi-classification中找到纸张中评估的方法的源代码.git。
translated by 谷歌翻译
无线电星系的连续排放通常可以分为不同的形态学类,如FRI,Frii,弯曲或紧凑。在本文中,我们根据使用深度学习方法使用小规模数据集的深度学习方法来探讨基于形态的无线电星系分类的任务($ \ SIM 2000 $ Samples)。我们基于双网络应用了几次射击学习技术,并使用预先培训的DENSENET模型进行了先进技术的传输学习技术,如循环学习率和歧视性学习迅速训练模型。我们使用最佳表演模型实现了超过92 \%的分类准确性,其中最大的混乱来源是弯曲和周五型星系。我们的结果表明,专注于一个小但策划数据集随着使用最佳实践来训练神经网络可能会导致良好的结果。自动分类技术对于即将到来的下一代无线电望远镜的调查至关重要,这预计将在不久的将来检测数十万个新的无线电星系。
translated by 谷歌翻译
Recent years witnessed the breakthrough of face recognition with deep convolutional neural networks. Dozens of papers in the field of FR are published every year. Some of them were applied in the industrial community and played an important role in human life such as device unlock, mobile payment, and so on. This paper provides an introduction to face recognition, including its history, pipeline, algorithms based on conventional manually designed features or deep learning, mainstream training, evaluation datasets, and related applications. We have analyzed and compared state-of-the-art works as many as possible, and also carefully designed a set of experiments to find the effect of backbone size and data distribution. This survey is a material of the tutorial named The Practical Face Recognition Technology in the Industrial World in the FG2023.
translated by 谷歌翻译
Sunquakes are seismic emissions visible on the solar surface, associated with some solar flares. Although discovered in 1998, they have only recently become a more commonly detected phenomenon. Despite the availability of several manual detection guidelines, to our knowledge, the astrophysical data produced for sunquakes is new to the field of Machine Learning. Detecting sunquakes is a daunting task for human operators and this work aims to ease and, if possible, to improve their detection. Thus, we introduce a dataset constructed from acoustic egression-power maps of solar active regions obtained for Solar Cycles 23 and 24 using the holography method. We then present a pedagogical approach to the application of machine learning representation methods for sunquake detection using AutoEncoders, Contrastive Learning, Object Detection and recurrent techniques, which we enhance by introducing several custom domain-specific data augmentation transformations. We address the main challenges of the automated sunquake detection task, namely the very high noise patterns in and outside the active region shadow and the extreme class imbalance given by the limited number of frames that present sunquake signatures. With our trained models, we find temporal and spatial locations of peculiar acoustic emission and qualitatively associate them to eruptive and high energy emission. While noting that these models are still in a prototype stage and there is much room for improvement in metrics and bias levels, we hypothesize that their agreement on example use cases has the potential to enable detection of weak solar acoustic manifestations.
translated by 谷歌翻译
Recently, webly supervised learning (WSL) has been studied to leverage numerous and accessible data from the Internet. Most existing methods focus on learning noise-robust models from web images while neglecting the performance drop caused by the differences between web domain and real-world domain. However, only by tackling the performance gap above can we fully exploit the practical value of web datasets. To this end, we propose a Few-shot guided Prototypical (FoPro) representation learning method, which only needs a few labeled examples from reality and can significantly improve the performance in the real-world domain. Specifically, we initialize each class center with few-shot real-world data as the ``realistic" prototype. Then, the intra-class distance between web instances and ``realistic" prototypes is narrowed by contrastive learning. Finally, we measure image-prototype distance with a learnable metric. Prototypes are polished by adjacent high-quality web images and involved in removing distant out-of-distribution samples. In experiments, FoPro is trained on web datasets with a few real-world examples guided and evaluated on real-world datasets. Our method achieves the state-of-the-art performance on three fine-grained datasets and two large-scale datasets. Compared with existing WSL methods under the same few-shot settings, FoPro still excels in real-world generalization. Code is available at https://github.com/yuleiqin/fopro.
translated by 谷歌翻译
如果没有巨大的数据集,许多现代的深度学习技术就无法正常工作。同时,几个领域要求使用稀缺数据的方法。当样本具有变化的结构时,此问题甚至更为复杂。图表示学习技术最近已证明在各种领域中都成功。然而,当面对数据稀缺时,就业的体系结构表现不佳。另一方面,很少的学习允许在稀缺的数据制度中采用现代深度学习模型,而不会放弃其有效性。在这项工作中,我们解决了几乎没有图形分类的问题,这表明将简单的距离度量学习基线配备了最新的图形嵌入式嵌入者,可以在任务上获得竞争性结果。虽然体系结构的简单性足以超越更复杂的功能,它还可以直接添加。为此,我们表明可以通过鼓励任务条件的嵌入空间来获得其他改进。最后,我们提出了一种基于混合的在线数据增强技术,该技术在潜在空间中起作用,并显示其对任务的有效性。
translated by 谷歌翻译
X-ray imaging technology has been used for decades in clinical tasks to reveal the internal condition of different organs, and in recent years, it has become more common in other areas such as industry, security, and geography. The recent development of computer vision and machine learning techniques has also made it easier to automatically process X-ray images and several machine learning-based object (anomaly) detection, classification, and segmentation methods have been recently employed in X-ray image analysis. Due to the high potential of deep learning in related image processing applications, it has been used in most of the studies. This survey reviews the recent research on using computer vision and machine learning for X-ray analysis in industrial production and security applications and covers the applications, techniques, evaluation metrics, datasets, and performance comparison of those techniques on publicly available datasets. We also highlight some drawbacks in the published research and give recommendations for future research in computer vision-based X-ray analysis.
translated by 谷歌翻译
我们解决了几次拍摄语义分割(FSS)的问题,该问题旨在通过一些带有一些注释的样本分段为目标图像中的新型类对象。尽管通过结合基于原型的公制学习来进行最近的进步,但由于其特征表示差,现有方法仍然显示出在极端内部对象变化和语义相似的类别对象下的有限性能。为了解决这个问题,我们提出了一种针对FSS任务定制的双重原型对比学习方法,以有效地捕获代表性的语义。主要思想是通过增加阶级距离来鼓励原型更差异,同时减少了原型特征空间中的课堂距离。为此,我们首先向类别特定的对比丢失丢失具有动态原型字典,该字典字典存储在训练期间的类感知原型,从而实现相同的类原型和不同的类原型是不同的。此外,我们通过压缩每集内语义类的特征分布来提高类别无话的对比损失,以提高未经看不见的类别的概念能力。我们表明,所提出的双重原型对比学习方法优于Pascal-5i和Coco-20i数据集的最先进的FSS方法。该代码可用于:https://github.com/kwonjunn01/dpcl1。
translated by 谷歌翻译
机器学习模型通常会遇到与训练分布不同的样本。无法识别分布(OOD)样本,因此将该样本分配给课堂标签会显着损害模​​型的可靠性。由于其对在开放世界中的安全部署模型的重要性,该问题引起了重大关注。由于对所有可能的未知分布进行建模的棘手性,检测OOD样品是具有挑战性的。迄今为止,一些研究领域解决了检测陌生样本的问题,包括异常检测,新颖性检测,一级学习,开放式识别识别和分布外检测。尽管有相似和共同的概念,但分别分布,开放式检测和异常检测已被独立研究。因此,这些研究途径尚未交叉授粉,创造了研究障碍。尽管某些调查打算概述这些方法,但它们似乎仅关注特定领域,而无需检查不同领域之间的关系。这项调查旨在在确定其共同点的同时,对各个领域的众多著名作品进行跨域和全面的审查。研究人员可以从不同领域的研究进展概述中受益,并协同发展未来的方法。此外,据我们所知,虽然进行异常检测或单级学习进行了调查,但没有关于分布外检测的全面或最新的调查,我们的调查可广泛涵盖。最后,有了统一的跨域视角,我们讨论并阐明了未来的研究线,打算将这些领域更加紧密地融为一体。
translated by 谷歌翻译
从有限的数据学习是一个具有挑战性的任务,因为数据的稀缺导致训练型模型的较差。经典的全局汇总表示可能会失去有用的本地信息。最近,许多射击学习方法通​​过使用深度描述符和学习像素级度量来解决这一挑战。但是,使用深描述符作为特征表示可能丢失图像的上下文信息。这些方法中的大多数方法独立地处理支持集中的每个类,这不能充分利用鉴别性信息和特定于特定的嵌入。在本文中,我们提出了一种名为稀疏空间变压器(SSFormers)的新型变压器的神经网络架构,可以找到任务相关的功能并抑制任务无关的功能。具体地,我们首先将每个输入图像划分为不同大小的几个图像斑块,以获得密集的局部特征。这些功能在表达本地信息时保留上下文信息。然后,提出了一种稀疏的空间变压器层以在查询图像和整个支持集之间找到空间对应关系,以选择任务相关的图像斑块并抑制任务 - 无关的图像斑块。最后,我们建议使用图像补丁匹配模块来计算密集的本地表示之间的距离,从而确定查询图像属于支持集中的哪个类别。广泛的少量学习基准测试表明,我们的方法实现了最先进的性能。
translated by 谷歌翻译
We propose prototypical networks for the problem of few-shot classification, where a classifier must generalize to new classes not seen in the training set, given only a small number of examples of each new class. Prototypical networks learn a metric space in which classification can be performed by computing distances to prototype representations of each class. Compared to recent approaches for few-shot learning, they reflect a simpler inductive bias that is beneficial in this limited-data regime, and achieve excellent results. We provide an analysis showing that some simple design decisions can yield substantial improvements over recent approaches involving complicated architectural choices and meta-learning. We further extend prototypical networks to zero-shot learning and achieve state-of-theart results on the CU-Birds dataset.
translated by 谷歌翻译
epiSodic学习是对几枪学习感兴趣的研究人员和从业者的流行练习。它包括在一系列学习问题(或剧集)中组织培训,每个人分为小型训练和验证子集,以模仿评估期间遇到的情况。但这总是必要吗?在本文中,我们调查了在集发作的级别使用非参数方法,例如最近邻居等方法的焦点学习的有用性。对于这些方法,我们不仅展示了广州学习的限制是如何不必要的,而是他们实际上导致利用培训批次的数据低效方式。我们通过匹配和原型网络进行广泛的消融实验,其中两个最流行的方法在集中的级别使用非参数方法。他们的“非焦化”对应物具有很大的更简单,具有较少的近似参数,并在多个镜头分类数据集中提高它们的性能。
translated by 谷歌翻译
我们提出了一种新的四管齐下的方法,在文献中首次建立消防员的情境意识。我们构建了一系列深度学习框架,彼此之叠,以提高消防员在紧急首次响应设置中进行的救援任务的安全性,效率和成功完成。首先,我们使用深度卷积神经网络(CNN)系统,以实时地分类和识别来自热图像的感兴趣对象。接下来,我们将此CNN框架扩展了对象检测,跟踪,分割与掩码RCNN框架,以及具有多模级自然语言处理(NLP)框架的场景描述。第三,我们建立了一个深入的Q学习的代理,免受压力引起的迷失方向和焦虑,能够根据现场消防环境中观察和存储的事实来制定明确的导航决策。最后,我们使用了一种低计算无监督的学习技术,称为张量分解,在实时对异常检测进行有意义的特征提取。通过这些临时深度学习结构,我们建立了人工智能系统的骨干,用于消防员的情境意识。要将设计的系统带入消防员的使用,我们设计了一种物理结构,其中处理后的结果被用作创建增强现实的投入,这是一个能够建议他们所在地的消防员和周围的关键特征,这对救援操作至关重要在手头,以及路径规划功能,充当虚拟指南,以帮助迷彩的第一个响应者恢复安全。当组合时,这四种方法呈现了一种新颖的信息理解,转移和综合方法,这可能会大大提高消防员响应和功效,并降低寿命损失。
translated by 谷歌翻译
海洋生态系统及其鱼类栖息地越来越重要,因为它们在提供有价值的食物来源和保护效果方面的重要作用。由于它们的偏僻且难以接近自然,因此通常使用水下摄像头对海洋环境和鱼类栖息地进行监测。这些相机产生了大量数字数据,这些数据无法通过当前的手动处理方法有效地分析,这些方法涉及人类观察者。 DL是一种尖端的AI技术,在分析视觉数据时表现出了前所未有的性能。尽管它应用于无数领域,但仍在探索其在水下鱼类栖息地监测中的使用。在本文中,我们提供了一个涵盖DL的关键概念的教程,该教程可帮助读者了解对DL的工作原理的高级理解。该教程还解释了一个逐步的程序,讲述了如何为诸如水下鱼类监测等挑战性应用开发DL算法。此外,我们还提供了针对鱼类栖息地监测的关键深度学习技术的全面调查,包括分类,计数,定位和细分。此外,我们对水下鱼类数据集进行了公开调查,并比较水下鱼类监测域中的各种DL技术。我们还讨论了鱼类栖息地加工深度学习的新兴领域的一些挑战和机遇。本文是为了作为希望掌握对DL的高级了解,通过遵循我们的分步教程而为其应用开发的海洋科学家的教程,并了解如何发展其研究,以促进他们的研究。努力。同时,它适用于希望调查基于DL的最先进方法的计算机科学家,以进行鱼类栖息地监测。
translated by 谷歌翻译
在过去的几年里,几年枪支学习(FSL)引起了极大的关注,以最大限度地减少标有标记的训练示例的依赖。FSL中固有的困难是处理每个课程的培训样本太少的含糊不清的歧义。为了在FSL中解决这一基本挑战,我们的目标是培训可以利用关于新颖类别的先前语义知识来引导分类器合成过程的元学习模型。特别是,我们提出了语义调节的特征注意力和样本注意机制,估计表示尺寸和培训实例的重要性。我们还研究了FSL的样本噪声问题,以便在更现实和不完美的环境中利用Meta-Meverys。我们的实验结果展示了所提出的语义FSL模型的有效性,而没有样品噪声。
translated by 谷歌翻译
传统的细颗粒图像分类通常依赖于带注释的地面真相的大规模训练样本。但是,某些子类别在实际应用中可能几乎没有可用的样本。在本文中,我们建议使用多频邻域(MFN)和双交叉调制(DCM)提出一个新颖的几弹性细颗粒图像分类网络(FICNET)。采用模块MFN来捕获空间域和频域中的信息。然后,提取自相似性和多频成分以产生多频结构表示。 DCM使用分别考虑全球环境信息和类别之间的微妙关系来调节嵌入过程。针对两个少量任务的三个细粒基准数据集进行的综合实验验证了FICNET与最先进的方法相比具有出色的性能。特别是,在两个数据集“ Caltech-UCSD鸟”和“ Stanford Cars”上进行的实验分别可以获得分类精度93.17 \%和95.36 \%。它们甚至高于一般的细粒图像分类方法可以实现的。
translated by 谷歌翻译