Smart Paper Notes

Image Feature Information Extraction for Interest Point Detection: A Review

Junfeng Jing, Tian Gao, Weichuan Zhang, Yongsheng Gao, Changming Sun

Category: Computer Vision

2021-06-15

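Interest point detection is one of the most fundamental and critical problems in computer vision and image processing. In this paper, we present a comprehensive review of image feature information (IFI) extraction techniques for interest point detection. To systematically introduce how existing interest point detection methods extract IFI from an input image, we propose a taxonomy of IFI extraction techniques for interest point detection. Following this taxonomy, we discuss the different types of IFI extraction techniques. Furthermore, we identify the main unresolved issues related to existing IFI extraction techniques, as well as interest point detection methods that have not been discussed before. Existing popular datasets and evaluation standards are provided, and the performance of 18 state-of-the-art methods is evaluated and discussed. Finally, future research directions for IFI extraction techniques are elaborated.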

Surf: Speeded up robust features

Category:

In this paper, we present a novel scale- and rotation-invariant interest point detector and descriptor, coined SURF (Speeded Up Robust Features). It approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster. This is achieved by relying on integral images for image convolutions; by building on the strengths of the leading existing detectors and descriptors (in casu, using a Hessian matrix-based measure for the detector, and a distribution-based descriptor); and by simplifying these methods to the essential. This leads to a combination of novel detection, description, and matching steps. The paper presents experimental results on a standard evaluation set, as well as on imagery obtained in the context of a real-life object recognition application. Both show SURF's strong performance.
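
The abstract credits much of SURF's speed to integral images. As a rough illustration only (not the authors' implementation), the sketch below builds a summed-area table in plain Python and uses it to sum any rectangular region with four lookups, which is what makes box-filter responses cheap regardless of filter size. The function names `integral_image` and `box_sum` are hypothetical.

```python
def integral_image(img):
    """Return a summed-area table padded with one leading zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Entry (y+1, x+1) holds the sum of all pixels above and to the left, inclusive.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of rows top..bottom-1 and columns left..right-1 using four lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(image)
total = box_sum(table, 0, 0, 3, 3)         # whole image
center_block = box_sum(table, 1, 1, 3, 3)  # lower-right 2x2 block
```

The cost of `box_sum` does not depend on the size of the rectangle, so larger filters are no more expensive than small ones once the table is built.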

Color Image Edge Detection using Multi-scale and Multi-directional Gabor filter

Yunhong Li, Yuandong Bi, Weichuan Zhang, Jie Ren, Jinni Chen

Category: Computer Vision

2022-08-16

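In this paper, a color edge detection method is proposed in which multi-scale Gabor filters are used to obtain edges from input color images. The main advantage of the proposed method is that it attains high edge detection accuracy while maintaining good noise robustness. The method consists of three parts. First, the RGB color image is converted to the CIE L*a*b* space because of its wide color gamut and uniform color distribution. Second, a set of Gabor filters is used to smooth the input image, and color edge strength maps (ESMs) are extracted and fused into a new ESM that combines noise robustness with accurate edge extraction. Third, embedding the fused ESM in the pipeline of the Canny detector yields a noise-robust color edge detector. The results show that the proposed detector performs better in both detection accuracy and noise robustness.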
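
To make the idea of a multi-scale, multi-directional Gabor filter bank concrete, here is a minimal sketch that uses the standard real-valued Gabor function (a Gaussian envelope multiplied by a cosine carrier). This is the textbook form, not necessarily the formulation used in the paper; the function name `gabor_kernel`, the kernel size, the two scales, the four orientations, and the choice of wavelength equal to twice sigma are all assumptions made for illustration.

```python
import math


def gabor_kernel(half_size, sigma, theta, wavelength, gamma=1.0, psi=0.0):
    """Real Gabor kernel of shape (2*half_size+1) x (2*half_size+1)."""
    kernel = []
    for y in range(-half_size, half_size + 1):
        row = []
        for x in range(-half_size, half_size + 1):
            # Rotate the coordinates so the carrier oscillates along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


# Hypothetical bank: two scales times four orientations.
bank = [
    gabor_kernel(3, sigma, k * math.pi / 4, wavelength=2 * sigma)
    for sigma in (1.0, 2.0)
    for k in range(4)
]
```

Convolving each color channel with every kernel in such a bank gives one response per scale and direction; how those responses are combined into an edge strength map is specific to the paper and is not reproduced here.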

R2FD2: Fast and Robust Matching of Multimodal Remote Sensing Image via Repeatable Feature Detector and Rotation-invariant Feature Descriptor

Bai Zhu, Chao Yang, Jinkun Dai, Jianwei Fan, Yuanxin Ye

Category: Computer Vision

2022-12-05

Automatically identifying feature correspondences between multimodal images faces enormous challenges because of significant differences in both radiation and geometry. To address these problems, we propose a novel feature matching method, named R2FD2, that is robust to radiation and rotation differences. R2FD2 rests on two critical contributions: a repeatable feature detector and a rotation-invariant feature descriptor. In the first stage, a repeatable feature detector called the Multi-channel Auto-correlation of the Log-Gabor is presented, which combines a multi-channel auto-correlation strategy with Log-Gabor wavelets to detect interest points with high repeatability and uniform distribution. In the second stage, a rotation-invariant feature descriptor is constructed, named the Rotation-invariant Maximum index map of the Log-Gabor (RMLG), which consists of two components: fast assignment of dominant orientation and construction of feature representation. In the process of fast assignment of dominant orientation, a Rotation-invariant Maximum Index Map (RMIM) is built to address rotation deformations. Then, the proposed RMLG incorporates the rotation-invariant RMIM with the spatial configuration of DAISY to produce a more discriminative feature representation, which improves RMLG's resistance to radiation and rotation variances.

Skin feature point tracking using deep feature encodings

Jose Ramon Chang, Torbjörn E. M. Nordling

Category: Computer Vision | Artificial Intelligence | Machine Learning

2021-12-28

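Facial feature tracking is a key component of imaging ballistocardiography (BCG), where accurate quantification of the displacement of facial keypoints is needed for good heart rate estimation. Skin feature tracking enables video-based quantification of motor degradation in Parkinson's disease. Traditional computer vision algorithms include Scale Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), and the Lucas-Kanade method (LK). These have long represented the state of the art in efficiency and accuracy, but they fail when common deformations are present. Over the past five years, deep convolutional neural networks have outperformed traditional methods for most computer vision tasks. We propose a pipeline for feature tracking that applies a convolutional stacked autoencoder to identify the crop in an image that is most similar to a reference crop containing the feature of interest. The autoencoder learns to represent image crops as deep feature encodings specific to the object category it is trained on. We train the autoencoder on facial images and validate its ability to track skin features in general using manually labeled face and hand videos. The tracking errors for distinctive skin features (moles) are so small that, based on a $\chi^2$ test, we cannot exclude that they stem from the manual labeling. With a mean error of 0.6-4.2 pixels, our method outperformed the other methods in all cases. More importantly, our method was the only one that did not diverge. We conclude that our method creates better feature descriptors for feature tracking, feature matching, and image registration than the traditional algorithms.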

Copy-Move Image Forgery Detection Based on Evolving Circular Domains Coverage

Shilin Lu, Xinghong Hu, Chengyou Wang, Lu Chen, Shulu Han, Yuejia Han

Category: Computer Vision

2021-09-09

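The aim of this paper is to improve the accuracy of copy-move forgery detection (CMFD) in image forensics by proposing a novel scheme whose main contribution is the evolving circular domains coverage (ECDC) algorithm. The proposed scheme integrates both block-based and keypoint-based forgery detection methods. First, speeded-up robust features (SURF) in log-polar space and scale-invariant feature transform (SIFT) features are extracted from the entire image. Second, generalized 2 nearest neighbor (g2NN) matching is employed to obtain a large number of matched pairs. Then, the random sample consensus (RANSAC) algorithm is employed to filter out mismatched pairs, allowing rough localization of the forged regions. To delineate these regions more accurately, we propose the efficient and accurate ECDC algorithm. This algorithm finds satisfactory threshold areas by extracting block features from jointly evolving circular domains, which are centered on the matched pairs. Finally, morphological operations are applied to refine the detected forged regions. Experimental results indicate that, compared with other state-of-the-art CMFD schemes, the proposed scheme achieves better detection performance under various attacks.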
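
The g2NN matching step mentioned above can be sketched as follows. This follows the generalized 2NN test as it is commonly described in the copy-move literature: sort the distances from one keypoint to all others, then keep accepting the nearest candidates while the ratio of consecutive distances stays below a threshold. It is an illustrative sketch, not this paper's code; the function name `g2nn_matches`, the threshold of 0.5, and the assumption that the query keypoint itself has already been excluded from `distances` are all choices made here.

```python
def g2nn_matches(distances, threshold=0.5):
    """Return indices of candidates accepted by the generalized 2NN ratio test.

    distances[i] is the descriptor distance from one query keypoint to
    candidate i (the query itself is assumed to be excluded).
    """
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    accepted = []
    for rank in range(len(order) - 1):
        d_near = distances[order[rank]]
        d_next = distances[order[rank + 1]]
        # Stop at the first pair of consecutive distances that are too similar.
        if d_next == 0 or d_near / d_next >= threshold:
            break
        accepted.append(order[rank])
    return accepted


matches = g2nn_matches([0.9, 0.05, 1.0, 0.2, 0.95])
```

Unlike the plain 2NN ratio test, which keeps at most one match per keypoint, this variant can return several, which matters when a region has been copied more than once.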

Human Treelike Tubular Structure Segmentation: A Comprehensive Review and Future Perspectives

Hao Li, Zeyu Tang, Yang Nan, Guang Yang

Category: Computer Vision | Machine Learning

2022-07-12

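Various structures in human physiology follow a treelike morphology, which often expresses complexity at very fine scales. Examples of such structures are the intrathoracic airways, retinal blood vessels, and hepatic blood vessels. Medical imaging modalities such as magnetic resonance imaging (MRI), computed tomography (CT), and optical coherence tomography (OCT) have made available large collections of 2D and 3D images in which the spatial arrangement of these structures can be observed. Segmentation of these structures in medical imaging is of great importance, since analysis of the structure provides insights into disease diagnosis, treatment planning, and prognosis. Manual labeling of extensive data by radiologists is often time-consuming and error-prone. As a result, automated or semi-automated computational models have become a popular research field in medical imaging over the past two decades, and many have been developed to date. In this survey, we aim to provide a comprehensive review of currently publicly available datasets, segmentation algorithms, and evaluation metrics. In addition, current challenges and future research directions are discussed.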

BALF: Simple and Efficient Blur Aware Local Feature Detector

Zhenjun Zhao, Yu Zhai, Ben M. Chen, Peidong Liu

Category: Computer Vision

2022-11-27

Local feature detection is a key ingredient of many image processing and computer vision applications, such as visual odometry and localization. Most existing algorithms focus on feature detection from a sharp image. Their performance therefore degrades once the image is blurred, which can easily happen under low-lighting conditions. To address this issue, we propose a simple yet efficient and effective keypoint detection method that is able to accurately localize the salient keypoints in a blurred image. Our method takes advantage of a novel multi-layer perceptron (MLP) based architecture that significantly improves the detection repeatability for a blurred image. The network is also lightweight and able to run in real time, which enables its deployment in time-constrained applications. Extensive experimental results demonstrate that our detector improves the detection repeatability on blurred images while keeping performance comparable to existing state-of-the-art detectors on sharp images.

A comprehensive survey on computer-aided diagnostic systems in diabetic retinopathy screening

Meysam Tavakoli, Patrick Kelley

Category: Computer Vision

2022-08-03

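Diabetes mellitus (DM) can lead to significant microvasculature disruption that eventually causes diabetic retinopathy (DR), or complications in the eye due to diabetes. If left unchecked, the disease can progress over time and eventually cause complete vision loss. The general method for detecting such developments is to examine the blood vessels, optic nerve head, microaneurysms, hemorrhages, exudates, and so on in retinal images. Ultimately, this is limited by the number of experienced ophthalmologists and the large number of DM cases. To enable early and efficient DR diagnosis, the field of ophthalmology requires robust computer-aided diagnosis (CAD) systems. Our review is intended for anyone, from student to established researcher, who wants to understand what can be accomplished with CAD systems and their algorithms, through to modeling, and where the field of retinal image processing in computer vision and pattern recognition is headed. For those just getting started, we place special emphasis on the logic, strengths, and shortcomings of different databases and algorithmic frameworks, with a focus on recent approaches.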

Vision-Based Environmental Perception for Autonomous Driving

Fei Liu, Zihao Lu, Xianke Lin

Category: Computer Vision

2022-12-22

Visual perception plays an important role in autonomous driving. One of the primary tasks is object detection and identification. Since vision sensors are rich in color and texture information, they can quickly and accurately identify various road information. The commonly used technique is based on extracting and calculating various features of the image. Recently developed deep learning-based methods offer better reliability and processing speed and have a greater advantage in recognizing complex elements. For depth estimation, vision sensors are also used for ranging due to their small size and low cost. A monocular camera uses image data from a single viewpoint as input to estimate object depth. In contrast, stereo vision is based on parallax and on matching feature points across different views, and the application of deep learning further improves its accuracy. In addition, Simultaneous Localization and Mapping (SLAM) can establish a model of the road environment, thus helping the vehicle perceive its surroundings and complete its tasks. In this paper, we introduce and compare various methods of object detection and identification, then explain the development of depth estimation and compare various methods based on monocular, stereo, and RGB-D sensors, next review and compare various SLAM methods, and finally summarize the current problems and present the future development trends of vision technologies.

Extracting Deformation-Aware Local Features by Learning to Deform

Guilherme Potje, Renato Martins, Felipe Cadar, Erickson R. Nascimento

Category: Computer Vision | Machine Learning

2021-11-20

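Despite the advances in local feature extraction achieved by handcrafted and learning-based descriptors, these descriptors are still limited by their lack of invariance to non-rigid transformations. In this paper, we present a new approach for computing features from still images that are robust to non-rigid deformations, in order to circumvent the problem of matching deformable surfaces and objects. Our deformation-aware local descriptor, named DEAL, leverages polar sampling and spatial transformer warping to provide invariance to rotation, scale, and image deformations. We train the model architecture end to end by applying isometric non-rigid deformations to objects in a simulated environment as guidance for producing highly discriminative local features. The experiments show that our method outperforms state-of-the-art handcrafted, learning-based image, and RGB-D descriptors on different datasets containing both real and realistic synthetic deformable objects in still images. The source code and trained model of the descriptor are publicly available at https://www.verlab.dcc.ufmg.br/descriptors/neUrips2021.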

Object recognition from local scale-invariant features

Category:

An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest-neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low-residual least-squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered, partially occluded images with a computation time of under 2 seconds.

Self-Supervised Endoscopic Image Key-Points Matching

Manel Farhat, Houda Chaabouni-Chouayakh, Achraf Ben-Hamadou

Category: Computer Vision

2022-08-24

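Feature matching and finding correspondences between endoscopic images is a key step in many clinical applications that rely on clinical sequences for fast anomaly localization. Nonetheless, due to the high texture variability present in endoscopic images, developing robust and accurate feature matching is a challenging task. Recently, deep learning techniques that deliver learned features extracted via convolutional neural networks (CNNs) have gained traction in a wide range of computer vision tasks. However, they all follow a supervised learning scheme in which a large amount of annotated data is required to reach good performance, and such data is often unavailable for medical databases. To overcome this limitation related to the scarcity of labeled data, the self-supervised learning paradigm has recently shown great success in a number of applications. This paper proposes a novel self-supervised approach for endoscopic image matching based on deep learning techniques. Compared with standard handcrafted local feature descriptors, our method performs better in terms of precision and recall. Furthermore, our self-supervised descriptor provides competitive performance in terms of precision and matching score when compared with a selection of state-of-the-art supervised deep learning methods.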

Visual Object Tracking with Discriminative Filters and Siamese Networks: A Survey and Outlook

Sajid Javed, Martin Danelljan, Fahad Shahbaz Khan, Muhammad Haris Khan, Michael Felsberg, Jiri Matas

Category: Computer Vision

2021-12-06

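Accurate and robust visual object tracking is one of the most challenging and fundamental computer vision problems. It entails estimating the trajectory of a target in an image sequence, given only its initial location and segmentation, or a rough approximation of these in the form of a bounding box. Discriminative Correlation Filters (DCFs) and deep Siamese Networks (SNs) have emerged as the dominant tracking paradigms and have led to significant progress. Following the rapid evolution of visual object tracking over the last decade, this survey presents a systematic and thorough review of more than 90 DCF and Siamese trackers, based on results on nine tracking benchmarks. First, we present the background theory of the core formulations of both DCF and Siamese tracking. Then, we distinguish and comprehensively review the shared as well as the specific open research challenges in these tracking paradigms. Furthermore, we thoroughly analyze the performance of DCF and Siamese trackers on nine benchmarks, covering different experimental aspects of visual tracking: datasets, evaluation metrics, performance, and speed comparisons. We conclude the survey with recommendations and suggestions for the identified open challenges, based on our analysis.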

AstroVision: Towards Autonomous Feature Detection and Description for Missions to Small Bodies Using Deep Learning

Travis Driver, Katherine Skinner, Mehregan Dor, Panagiotis Tsiotras

Category: Computer Vision | Robotics

2022-08-03

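Missions to small celestial bodies rely heavily on optical feature tracking for characterization and relative navigation. While deep learning has led to great advances in feature detection and description, training and validating data-driven models for space applications is challenging due to the limited availability of large-scale, annotated datasets. This paper introduces AstroVision, a large-scale dataset comprising 115,970 densely annotated, real images of 16 different small bodies captured during past and ongoing missions. We leverage AstroVision to develop a set of standardized benchmarks and conduct an exhaustive evaluation of both handcrafted and data-driven feature detection and description methods. Next, we employ AstroVision for end-to-end training of a state-of-the-art deep feature detection and description network and demonstrate improved performance on multiple benchmarks. The full benchmarking pipeline and the dataset will be made publicly available to facilitate the development of computer vision algorithms for space applications.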

A Survey of Orthogonal Moments for Image Representation: Theory, Implementation, and Evaluation

Shuren Qi, Yushu Zhang, Chao Wang, Jiantao Zhou, Xiaochun Cao

Category: Computer Vision

2021-03-27

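Image representation is an important topic in computer vision and pattern recognition. It plays a fundamental role in a range of applications aimed at understanding visual content. Moment-based image representation has been reported to be effective in satisfying the core conditions of semantic description owing to its beneficial mathematical properties, especially geometric invariance and independence. This paper presents a comprehensive survey of orthogonal moments for image representation, covering recent advances in fast and accurate calculation, robustness and invariance optimization, definition extension, and applications. We also create a software package for a variety of widely used orthogonal moments and evaluate these methods on a common basis. The presented theoretical analysis, software implementation, and evaluation results can support the community, particularly in developing novel techniques and promoting real-world applications.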

Visual and Object Geo-localization: A Comprehensive Survey

Daniel Wilson, Xiaohan Zhang, Waqas Sultani, Safwan Wshah

Category: Computer Vision

2021-12-30

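The concept of geo-localization refers to the process of determining where on Earth some "entity" is located, typically using Global Positioning System (GPS) coordinates. The entity of interest may be an image, a sequence of images, a video, a satellite image, or even objects visible within an image. As massive datasets of GPS-tagged media have rapidly become available thanks to smartphones and the internet, and as deep learning has risen to enhance the performance of machine learning models, the fields of visual and object geo-localization have emerged, driven by their significant impact on a wide range of applications such as augmented reality, robotics, self-driving vehicles, road maintenance, and 3D reconstruction. This paper provides a comprehensive survey of geo-localization involving images, which covers either determining where an image was captured (image geo-localization) or geo-locating objects within an image (object geo-localization). We provide an in-depth study, including a summary of popular algorithms, a description of proposed datasets, and an analysis of performance results, to illustrate the current state of each field.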

Geometric and Learning-based Mesh Denoising: A Comprehensive Survey

Honghua Chen, Mingqiang Wei, Jun Wang

Category: Computer Vision

2022-09-02

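Mesh denoising is a fundamental problem in digital geometry processing. It seeks to remove surface noise while preserving the intrinsic surface signal as accurately as possible. While traditional approaches rely on specialized priors to smooth surfaces, learning-based approaches have achieved great success in generalization and automation. In this work, we provide a comprehensive review of the advances in mesh denoising, covering both traditional geometric approaches and recent learning-based methods. First, to familiarize readers with the denoising task, we summarize four common issues in mesh denoising. We then provide two categorizations of the existing denoising methods. Furthermore, three important categories, namely optimization-based, filter-based, and data-driven techniques, are introduced and analyzed in detail. Both qualitative and quantitative comparisons are presented to demonstrate the effectiveness of state-of-the-art denoising methods. Finally, potential directions for future work are pointed out to address the common problems of these approaches. This work also establishes a mesh denoising benchmark, with which future researchers can easily and conveniently evaluate their methods against state-of-the-art approaches.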

Two Decades of Bengali Handwritten Digit Recognition: A Survey

A. B. M. Ashikur Rahman, Md. Bakhtiar Hasan, Sabbir Ahmed, Tasnim Ahmed, Md. Hamjajul Ashmafee, Mohammad Ridwan Kabir, Md. Hasanul Kabir

Category: Computer Vision

2022-06-05

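Handwritten Digit Recognition (HDR) is one of the most challenging tasks in the domain of Optical Character Recognition (OCR). Irrespective of language, HDR has some inherent challenges, which mostly arise from variations in writing style across individuals, variations in writing medium and environment, the inability to maintain the same strokes when writing a digit repeatedly, and so on. In addition, the structural complexity of the digits of a particular language may lead to ambiguous cases in HDR. Over the years, researchers have developed numerous offline and online HDR pipelines in which different image processing techniques are combined with traditional Machine Learning (ML)-based and/or Deep Learning (DL)-based architectures. Although extensive review studies on HDR exist in the literature for languages such as English, Arabic, Indian, Farsi, and Chinese, few surveys on Bengali HDR (BHDR) can be found, and those lack a comprehensive analysis of the challenges, the underlying recognition process, and possible future directions. In this paper, the characteristics and inherent ambiguities of Bengali handwritten digits are analyzed, along with a comprehensive overview of two decades of state-of-the-art datasets and approaches for offline BHDR. Furthermore, several real-life, application-specific studies involving BHDR are discussed in detail. This paper will also serve as a compendium for researchers interested in the science behind offline BHDR, encouraging the exploration of new avenues of relevant research that may lead to better offline recognition of Bengali handwritten digits in different application areas.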

SuperPoint: Self-Supervised Interest Point Detection and Description

Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich

Category:

2017-12-20

This paper presents a self-supervised framework for training interest point detectors and descriptors suitable for a large number of multiple-view geometry problems in computer vision. As opposed to patch-based neural networks, our fully-convolutional model operates on full-sized images and jointly computes pixel-level interest point locations and associated descriptors in one forward pass. We introduce Homographic Adaptation, a multi-scale, multi-homography approach for boosting interest point detection repeatability and performing cross-domain adaptation (e.g., synthetic-to-real). Our model, when trained on the MS-COCO generic image dataset using Homographic Adaptation, is able to repeatedly detect a much richer set of interest points than the initial pre-adapted deep model and any other traditional corner detector. The final system gives rise to state-of-the-art homography estimation results on HPatches when compared to LIFT, SIFT and ORB.
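
Repeatability under a known homography, which this abstract and several others above refer to, can be illustrated with a small sketch. It warps the points detected in one image into a second image with a 3x3 homography and counts how many land within a pixel tolerance of a detection there. This is a simplified, one-directional measure under stated assumptions, not the evaluation protocol of any paper listed here; published benchmarks typically also restrict the count to the region visible in both images. The names `warp_point` and `repeatability` and the 3-pixel tolerance are hypothetical.

```python
def warp_point(H, point):
    """Apply a 3x3 homography (nested lists) to a 2D point."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (xh / w, yh / w)


def repeatability(points_a, points_b, H, eps=3.0):
    """Fraction of points in A whose warped position is within eps of a point in B."""
    if not points_a:
        return 0.0
    repeated = 0
    for p in points_a:
        wx, wy = warp_point(H, p)
        if any((wx - bx) ** 2 + (wy - by) ** 2 <= eps ** 2 for bx, by in points_b):
            repeated += 1
    return repeated / len(points_a)


# Toy example: image B is image A shifted by (10, 5) pixels.
H_shift = [[1, 0, 10], [0, 1, 5], [0, 0, 1]]
detections_a = [(0, 0), (20, 20), (50, 10)]
detections_b = [(10, 5), (31, 26), (100, 100)]
score = repeatability(detections_a, detections_b, H_shift)
```

In the toy example two of the three warped points fall near a detection in the second image, so the score is two thirds.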
