Humans effortlessly detect salient objects, a phenomenon studied in several fields, including computer vision, because of its many applications. Salient object detection nevertheless remains a challenge for many computational models dealing with color and texture images. Here, we propose a novel and efficient strategy, implemented with a simple model with almost no internal parameters, which generates robust saliency maps for natural images. The strategy consists of integrating color information into local texture patterns to characterize color micro-textures. Most models in the literature that use both color and texture features treat them separately; in our case, a simple yet powerful LTP (Local Ternary Patterns) texture descriptor applied to opposing color pairs of a color space allows us to achieve this end. Each color micro-texture is represented by a vector built from a superpixel obtained with the SLICO (Simple Linear Iterative Clustering with zero parameter) algorithm, which is simple, fast, and exhibits state-of-the-art boundary adherence. The degree of dissimilarity between each pair of color micro-textures is computed with FastMap, a fast version of MDS (Multi-dimensional Scaling), which accounts for the non-linearity of the color micro-textures while preserving their distances. These dissimilarity degrees yield an intermediate saliency map for each of the RGB, HSL, LUV, and CMY color spaces. The final saliency map is their combination, exploiting the strengths of each. MAE (Mean Absolute Error) and F$_{\beta}$ measures of our saliency maps on the complex ECSSD dataset show that our model is both simple and efficient, outperforming several state-of-the-art models.
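The LTP descriptor at the heart of this pipeline can be sketched as follows. This is a minimal grayscale version (the paper applies LTP to opposing color-pair channels of several color spaces); the threshold value and the neighborhood bit ordering are illustrative assumptions:

```python
import numpy as np

def ltp_codes(gray, t=5):
    """Local Ternary Pattern over the 8-neighborhood of each interior pixel.

    Each neighbor is compared to the center with a tolerance t, and the
    ternary pattern is split, as usual, into two binary LBP-style code maps:
    an "upper" pattern (neighbor >= center + t) and a "lower" pattern
    (neighbor <= center - t).
    """
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]                      # center pixels
    # clockwise 8-neighborhood; the ordering is an arbitrary convention
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    upper = np.zeros_like(c)
    lower = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        n = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        upper |= (n >= c + t).astype(np.int32) << bit
        lower |= (n <= c - t).astype(np.int32) << bit
    return upper, lower

# 3x3 toy image: one bright neighbor to the right of the center
img = np.array([[10, 10, 10],
                [10, 10, 40],
                [10, 10, 10]])
up, lo = ltp_codes(img)   # a single interior pixel; bit 3 set in "up"
```

The split into two binary maps is what makes LTP usable wherever LBP histograms are used, while the tolerance t gives robustness to noise around the center value.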
translated by Google Translate
Reliable estimation of visual saliency allows appropriate processing of images without prior knowledge of their contents, and thus remains an important step in many computer vision tasks including image segmentation, object recognition, and adaptive compression. We propose a regional contrast based saliency extraction algorithm, which simultaneously evaluates global contrast differences and spatial coherence. The proposed algorithm is simple, efficient, and yields full resolution saliency maps. Our algorithm consistently outperformed existing saliency detection methods, yielding higher precision and better recall rates, when evaluated using one of the largest publicly available data sets. We also demonstrate how the extracted saliency map can be used to create high quality segmentation masks for subsequent image processing.
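The global-contrast idea, where a color's saliency grows with its frequency-weighted distance to all other colors in the image, can be sketched on quantized colors. This is a simplified histogram-based contrast sketch, not the full region-based algorithm with spatial coherence described above:

```python
import numpy as np

def histogram_contrast_saliency(labels, colors):
    """Global-contrast saliency per quantized color.

    labels: per-pixel index into `colors` (flattened image).
    colors: (k, 3) array of quantized colors.
    Each color's saliency is the sum of its distances to every other color,
    weighted by how often that other color occurs.
    """
    counts = np.bincount(labels, minlength=len(colors)).astype(float)
    freq = counts / counts.sum()
    dist = np.linalg.norm(colors[:, None, :] - colors[None, :, :], axis=2)
    sal = dist @ freq          # saliency per quantized color
    return sal[labels]         # back-project to pixels

# toy image: 90 dark pixels, 10 bright ones -> the rare color is salient
colors = np.array([[0.0, 0.0, 0.0], [255.0, 255.0, 255.0]])
labels = np.array([0] * 90 + [1] * 10)
sal_map = histogram_contrast_saliency(labels, colors)
```

The region-based variant additionally weights each pairwise contrast by spatial distance between regions, which is what yields the spatial coherence the abstract refers to.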
We investigate the properties of a metric between two distributions, the Earth Mover's Distance (EMD), for content-based image retrieval. The EMD is based on the minimal cost that must be paid to transform one distribution into the other, in a precise sense, and was first proposed for certain vision problems by Peleg, Werman, and Rom. For image retrieval, we combine this idea with a representation scheme for distributions that is based on vector quantization. This combination leads to an image comparison framework that often accounts for perceptual similarity better than other previously proposed methods. The EMD is based on a solution to the transportation problem from linear optimization, for which efficient algorithms are available, and also allows naturally for partial matching. It is more robust than histogram matching techniques, in that it can operate on variable-length representations of the distributions that avoid quantization and other binning problems typical of histograms. When used to compare distributions with the same overall mass, the EMD is a true metric. In this paper we focus on applications to color and texture, and we compare the retrieval performance of the EMD with that of other distances.
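In one dimension with equal total mass, the EMD has a closed form: the L1 distance between the two cumulative distributions. A minimal sketch of that special case (the general signature-based EMD used in the paper is solved as a transportation problem from linear optimization):

```python
import numpy as np

def emd_1d(p, q):
    """Earth Mover's Distance between two 1-D histograms of equal mass
    defined on the same unit-spaced bins: the L1 distance between CDFs."""
    p = np.asarray(p, float)
    q = np.asarray(q, float)
    p /= p.sum()
    q /= q.sum()
    return np.abs(np.cumsum(p - q)).sum()

# shifting all mass by one bin costs exactly 1 unit of "work"
d = emd_1d([0, 1, 0, 0], [0, 0, 1, 0])
```

This cross-bin behavior is exactly what bin-wise histogram distances lack: moving mass to an adjacent bin costs little, moving it far costs a lot, which tracks perceptual similarity much better.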
Saliency detection is one of the most challenging problems in image analysis and computer vision. Many approaches propose different architectures based on the psychological and biological properties of the human visual attention system. However, there is still no abstract framework that summarizes the existing methods. In this paper, we offer a general framework for saliency models, which consists of five main steps: pre-processing, feature extraction, saliency map generation, saliency map combination, and post-processing. We also study different saliency models in terms of these steps and compare their performance. This framework helps researchers gain a comprehensive view when studying new methods.
Salient object detection (SOD) has several applications in image analysis. Deep-learning-based SOD methods are among the most effective, but they may miss foreground parts whose colors are similar to the background. To circumvent the problem, we introduce a post-processing method, named SESS, which alternately performs two operations for saliency completion: object-based superpixel segmentation and superpixel-based saliency estimation. SESS uses the input saliency map to estimate seeds for superpixel delineation and to define superpixel queries in the foreground and background. A new saliency map results from color similarities between queries and superpixels. The process is repeated for a given number of iterations, and all resulting saliency maps are combined into a single one by cellular automata. Finally, the post-processed and the initial maps are merged using their average. We demonstrate that SESS can consistently and considerably improve the results of three deep-learning-based SOD methods on five image datasets.
In this paper, the conversion of color RGB images to grayscale is covered through a characterization of the mathematical operators that project the three color channels onto a single one. Based on the fact that most operators assign each of the $256^3$ colors a single gray level ranging from 0 to 255, they are clustering algorithms that partition the color population into 256 clusters of increasing brightness. To visualize how an operator works, the cluster sizes and the mean brightness of each cluster are plotted. The equalization (EQ) patterns introduced in this work focus on cluster sizes, while the brightness mapping (BM) patterns describe the distribution of CIE L* lightness within each cluster. Three classes of EQ patterns and two classes of BM patterns were found among linear operators, defining a six-class taxonomy. This theoretical/methodological framework is applied in a case study considering the equal-weight uniform operator, the NTSC standard operator, and an algorithm chosen as ideal for lightening the faces of Black people in order to improve currently biased face-recognition classifiers. Most current metrics for assessing the quality of color-to-gray conversions were found to favor one of the two BM pattern classes, whereas the ideal operator chosen by the human team belongs to the other class. This warns against using such general-purpose metrics for special-purpose color-to-gray conversions. It should be noted that the eventual application of this framework to nonlinear operators may reveal new EQ and BM patterns. The main contribution of this paper is to provide a tool for better understanding color-to-gray converters, even machine-learning-based ones, in line with the current trend toward better model explainability.
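The clustering view of a linear operator can be made concrete: mapping a sample of the RGB cube through the NTSC luma weights and counting how many colors land in each of the 256 gray levels gives the cluster-size (EQ-pattern) profile. A minimal sketch, under the assumption of a uniform random color sample rather than the full $256^3$ cube:

```python
import numpy as np

# NTSC/Rec.601 luma weights, one example of a linear color-to-gray operator
W = np.array([0.299, 0.587, 0.114])

rng = np.random.default_rng(0)
colors = rng.integers(0, 256, size=(100_000, 3))      # sample of the RGB cube

# the operator assigns every color one gray level in 0..255
gray = np.clip(np.rint(colors @ W), 0, 255).astype(int)

# each gray level is a "cluster"; the vector of cluster sizes is the
# raw material of the EQ patterns discussed in the abstract
sizes = np.bincount(gray, minlength=256)
```

Plotting `sizes` against the gray level would reproduce the kind of cluster-size curve the paper uses to classify operators; repeating with other weight vectors shows how the EQ profile changes.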
Recent progress on salient object detection is substantial, benefiting mostly from the explosive development of Convolutional Neural Networks (CNNs). Semantic segmentation and salient object detection algorithms developed lately have been mostly based on Fully Convolutional Neural Networks (FCNs). There is still a large room for improvement over the generic FCN models that do not explicitly deal with the scale-space problem. Holistically-Nested Edge Detector (HED) provides a skip-layer structure with deep supervision for edge and boundary detection, but the performance gain of HED on saliency detection is not obvious. In this paper, we propose a new salient object detection method by introducing short connections to the skip-layer structures within the HED architecture. Our framework takes full advantage of multi-level and multi-scale features extracted from FCNs, providing more advanced representations at each layer, a property that is critically needed to perform segment detection. Our method produces state-of-the-art results on 5 widely tested salient object detection benchmarks, with advantages in terms of efficiency (0.08 seconds per image), effectiveness, and simplicity over the existing algorithms. Beyond that, we conduct an exhaustive analysis on the role of training data on performance. Our experimental results provide a more reasonable and powerful training set for future research and fair comparisons.
A visual attention system, inspired by the behavior and the neuronal architecture of the early primate visual system, is presented. Multiscale image features are combined into a single topographical saliency map. A dynamical neural network then selects attended locations in order of decreasing saliency. The system breaks down the complex problem of scene understanding by rapidly selecting, in a computationally efficient manner, conspicuous locations to be analyzed in detail.
Colorization is a computer-aided process that aims to give color to gray images or videos. It can be used to enhance black-and-white material, including black-and-white photos, vintage films, and scientific imaging results. Conversely, decolorization converts a color image or video to grayscale. A grayscale image or video carries only luminance information, without color. It is the basis of some downstream image-processing applications, such as pattern recognition, image segmentation, and image enhancement. Unlike image decolorization, video decolorization should not only preserve the image contrast in each video frame but also respect the temporal and spatial consistency between frames. Researchers have been devoted to developing decolorization methods that balance spatio-temporal consistency and algorithmic efficiency. With the prevalence of digital cameras and mobile phones, researchers pay increasing attention to image and video colorization and decolorization. This paper gives an overview of the progress of image and video colorization and decolorization methods over the last two decades.
Facial feature tracking is a key component of imaging ballistocardiography (BCG), where the displacement of facial keypoints needs to be quantified accurately to obtain good heart-rate estimates. Skin feature tracking also enables video-based quantification of motor degradation in Parkinson's disease. Traditional computer vision algorithms include the Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), and the Lucas-Kanade method (LK). These have long represented the state of the art in efficiency and accuracy, but they still fail when common deformations are present. Over the past five years, deep convolutional neural networks have outperformed traditional methods on most computer vision tasks. We propose a pipeline for feature tracking that applies a convolutional stacked autoencoder to identify the crop in an image that is most similar to a reference crop containing the feature of interest. The autoencoder learns to represent image crops as deep feature encodings specific to the object category. We train the autoencoder on facial images and validate its ability to track skin features in general on manually labeled face and hand videos. The tracking errors on distinctive skin features (moles) are so small that we cannot exclude, based on a $\chi^2$-test, that they coincide with the manual labels. With mean errors of 0.6-4.2 pixels, our method outperformed the other methods in all cases. More importantly, ours was the only method that did not diverge. We conclude that our method creates better feature descriptors for feature tracking, feature matching, and image registration than the traditional algorithms.
The human visual brain uses three main components, namely color, texture, and shape, to detect or identify objects and the environment. Texture analysis has therefore attracted much attention from scientific researchers over the past two decades. Texture features can be used in many different applications of computer vision and machine learning, and many different approaches have been proposed for texture classification, most of which regard classification accuracy as the main challenge to be improved. In this paper, a new approach is proposed based on the combination of two efficient texture descriptors, the co-occurrence matrix and Local Ternary Patterns (LTP). First, basic Local Binary Patterns and LTP are computed to extract local texture information. Next, a subset of statistical features is extracted from the gray-level co-occurrence matrix. Finally, the concatenated features are used to train classifiers. The performance is evaluated in terms of accuracy on the Brodatz benchmark dataset. Experimental results show that the proposed approach achieves higher classification rates compared with some state-of-the-art approaches.
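The co-occurrence half of such a feature vector can be sketched as follows; the number of gray levels, the displacement, and the two Haralick statistics shown (contrast, homogeneity) are illustrative choices, not necessarily the paper's exact subset:

```python
import numpy as np

def glcm(gray, levels=4, dx=1, dy=0):
    """Gray-level co-occurrence matrix for one displacement (dx, dy),
    normalized into a joint probability table over gray-level pairs."""
    g = np.asarray(gray)
    m = np.zeros((levels, levels))
    h, w = g.shape
    for y in range(max(0, -dy), h - max(0, dy)):
        for x in range(max(0, -dx), w - max(0, dx)):
            m[g[y, x], g[y + dy, x + dx]] += 1
    return m / m.sum()

def glcm_stats(m):
    """Two classic Haralick statistics computed from a normalized GLCM."""
    i, j = np.indices(m.shape)
    contrast = ((i - j) ** 2 * m).sum()
    homogeneity = (m / (1.0 + np.abs(i - j))).sum()
    return contrast, homogeneity

# a strictly alternating texture: every horizontal pair differs by 1 level
tex = np.array([[0, 1, 0, 1],
                [0, 1, 0, 1]])
c, h = glcm_stats(glcm(tex))
```

In a full pipeline these statistics would be concatenated with the LBP/LTP histogram of the same patch before training the classifier.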
Most existing bottom-up methods measure the foreground saliency of a pixel or region based on its contrast within a local context or the entire image, whereas a few methods focus on segmenting out background regions and thereby salient objects. Instead of considering the contrast between the salient objects and their surrounding regions, we consider both foreground and background cues in a different way. We rank the similarity of the image elements (pixels or regions) with foreground cues or background cues via graph-based manifold ranking. The saliency of the image elements is defined based on their relevances to the given seeds or queries. We represent the image as a closed-loop graph with superpixels as nodes. These nodes are ranked based on the similarity to background and foreground queries, using affinity matrices. Saliency detection is carried out in a two-stage scheme to extract background regions and foreground salient objects efficiently. Experimental results on two large benchmark databases demonstrate that the proposed method performs well against the state-of-the-art methods in terms of accuracy and speed. We also create a more difficult benchmark database containing 5,172 images to test the proposed saliency model and make this database publicly available with this paper for further studies in the saliency field.
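The ranking step at the core of this scheme has a closed form. A toy sketch on a hand-built chain graph, using the unnormalized variant f = (D - alpha*W)^-1 * y rather than the paper's superpixel affinity matrix:

```python
import numpy as np

def manifold_rank(W, y, alpha=0.99):
    """Closed-form graph-based ranking against query seeds.

    W: symmetric affinity matrix of the graph.
    y: indicator vector of the query nodes (seeds).
    Returns relevance scores f solving (D - alpha*W) f = y, where D is the
    degree matrix; this is the unnormalized manifold-ranking variant.
    """
    D = np.diag(W.sum(axis=1))
    return np.linalg.solve(D - alpha * W, y)

# chain graph 0-1-2-3 with unit affinities; seed node 0 as the query
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], float)
y = np.array([1.0, 0.0, 0.0, 0.0])
f = manifold_rank(W, y)   # relevance decays with graph distance from the seed
```

In the two-stage scheme above, the seeds are first the boundary superpixels (background queries) and then the thresholded foreground of the first stage.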
Bilateral filtering smooths images while preserving edges, by means of a nonlinear combination of nearby image values. The method is noniterative, local, and simple. It combines gray levels or colors based on both their geometric closeness and their photometric similarity, and prefers near values to distant values in both domain and range. In contrast with filters that operate on the three bands of a color image separately, a bilateral filter can enforce the perceptual metric underlying the CIE-Lab color space, and smooth colors and preserve edges in a way that is tuned to human perception. Also, in contrast with standard filtering, bilateral filtering produces no phantom colors along edges in color images, and reduces phantom colors where they appear in the original image.
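A brute-force grayscale version of the filter makes the two weights, geometric closeness and photometric similarity, explicit; the sigma values and the window radius here are illustrative:

```python
import numpy as np

def bilateral(gray, sigma_s=1.0, sigma_r=10.0, radius=1):
    """Brute-force grayscale bilateral filter.

    Each output pixel is a normalized average of its window, weighted by
    spatial closeness (Gaussian in position) AND intensity similarity
    (Gaussian in value), so averaging does not cross strong edges.
    """
    g = gray.astype(float)
    h, w = g.shape
    out = np.zeros_like(g)
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            patch = g[y0:y1, x0:x1]
            yy, xx = np.mgrid[y0:y1, x0:x1]
            ws = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma_s ** 2))
            wr = np.exp(-((patch - g[y, x]) ** 2) / (2 * sigma_r ** 2))
            k = ws * wr
            out[y, x] = (k * patch).sum() / k.sum()
    return out

# a step edge survives: range weights suppress averaging across the edge
step = np.array([[0, 0, 100, 100]] * 4, float)
sm = bilateral(step)
```

The color version described above applies the range weight to the full color vector (ideally in a perceptual space such as CIE-Lab), which is what avoids the phantom colors that per-channel filtering produces along edges.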
In recent years, person re-identification (re-ID) has received increasing attention owing to its importance for both science and society. Machine learning, particularly deep learning (DL), has become the main re-ID tool, enabling research to achieve unprecedented accuracy levels on benchmark datasets. However, there is a known problem of poor generalization of DL models: a model trained to achieve high accuracy on one dataset performs poorly on another and requires re-training. To address this issue, we present a model without trainable parameters that shows great potential for high generalization. It combines a fully analytical feature extraction and similarity ranking scheme with DL-based human parsing used to obtain the initial subregion classification. We show that this combination, to a large extent, eliminates the drawbacks of existing analytical methods. We use interpretable color and texture features that have human-readable similarity measures associated with them. To verify the proposed method, we conduct experiments on the Market1501 and CUHK03 datasets, achieving a competitive rank-1 accuracy comparable to that of DL models. Most importantly, we show that our method achieves 63.9% and 93.5% cross-domain accuracy when applied to transfer learning tasks, which is significantly higher than the previously reported 30-50% transfer accuracy. We discuss potential ways of adding new features to further improve the model. We also show the advantage of interpretable features for constructing human-generated queries from verbal descriptions, allowing search without a query image.
Fully convolutional neural networks (FCNs) have shown their advantages in the salient object detection task. However, most existing FCNs-based methods still suffer from coarse object boundaries. In this paper, to solve this problem, we focus on the complementarity between salient edge information and salient object information. Accordingly, we present an edge guidance network (EGNet) for salient object detection with three steps to simultaneously model these two kinds of complementary information in a single network. In the first step, we extract the salient object features by a progressive fusion way. In the second step, we integrate the local edge information and global location information to obtain the salient edge features. Finally, to sufficiently leverage these complementary features, we couple the same salient edge features with salient object features at various resolutions. Benefiting from the rich edge information and location information in salient edge features, the fused features can help locate salient objects, especially their boundaries more accurately. Experimental results demonstrate that the proposed method performs favorably against the state-of-the-art methods on six widely used datasets without any pre-processing and post-processing. The source code is available at http: //mmcheng.net/egnet/.
Interest point detection is one of the most fundamental and critical problems in computer vision and image processing. In this paper, we carry out a comprehensive review of image feature information (IFI) extraction techniques for interest point detection. To systematically introduce how existing interest point detection methods extract IFI from an input image, we propose a taxonomy of IFI extraction techniques. Following this taxonomy, we discuss the different types of IFI extraction techniques for interest point detection. Furthermore, we identify the main unresolved issues related to existing IFI extraction techniques, as well as interest point detection methods that have not been discussed before. Existing popular datasets and evaluation standards are provided, and the performance of eighteen state-of-the-art approaches is evaluated and discussed. Future research directions on IFI extraction techniques are also elaborated.
We present a novel approach that combines machine-learning-based interactive image segmentation using superpixels with a clustering method for the automated identification of similarly colored images in large datasets, enabling a guided reuse of classifiers. Our approach solves the problem of significant color variability, which is prevalent and often unavoidable in biological and medical images and typically degrades segmentation and quantification accuracy, thereby greatly reducing the necessary training effort. This increase in efficiency facilitates the quantification of much larger numbers of images, allowing interactive image analysis to keep pace with recent technological advances in high-throughput imaging. The presented approach is applicable to almost any image type and represents a useful tool for image analysis tasks in general.
Objective methods for assessing perceptual image quality have traditionally attempted to quantify the visibility of errors between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a Structural Similarity Index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.
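The SSIM formula applied globally to a whole image can be sketched as follows (in practice the index is computed over local windows and averaged; the constants below are the standard k1/k2 defaults):

```python
import numpy as np

def ssim_global(x, y, L=255, k1=0.01, k2=0.03):
    """Single-window SSIM: luminance, contrast, and structure terms
    collapsed into the standard two-factor formula."""
    x = x.astype(float)
    y = y.astype(float)
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2   # stabilizing constants
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

a = np.tile(np.arange(8), (8, 1)) * 30.0    # simple gradient test image
s_same = ssim_global(a, a)                  # identical images score 1.0
s_diff = ssim_global(a, 255.0 - a)          # inverted image scores lower
```

Because the covariance term rewards matching structure rather than matching pixel values, SSIM degrades gracefully under distortions that error-visibility metrics like MSE treat as equally severe.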
Skin lesion segmentation is one of the key steps in efficient non-invasive computer-aided early diagnosis of melanoma. This paper investigates how color information, in addition to saliency, can be used for the automatic delineation of pigmented lesion areas. Unlike most existing segmentation methods, which use saliency only to distinguish the skin lesion from the surrounding regions, we propose a novel method employing a binarization process coupled with new perceptual criteria, inspired by human visual perception, related to the properties of saliency and color of the input image data distribution. As a means of improving the accuracy of the proposed method, the segmentation step is preceded by pre-processing aimed at reducing the computational burden, removing artifacts, and improving contrast. We have evaluated the method on two public databases comprising 1497 dermoscopic images. We have also compared its performance with that of classical and recent saliency-based methods designed explicitly for dermoscopic images. Qualitative and quantitative evaluations indicate that the approach is promising, as it produces accurate skin lesion segmentation and performs satisfactorily compared with other saliency-based segmentation methods.
We solve the problem of salient object detection by investigating how to expand the role of pooling in convolutional neural networks. Based on the U-shape architecture, we first build a global guidance module (GGM) upon the bottom-up pathway, aiming at providing layers at different feature levels the location information of potential salient objects. We further design a feature aggregation module (FAM) to make the coarse-level semantic information well fused with the fine-level features from the top-down pathway. By adding FAMs after the fusion operations in the topdown pathway, coarse-level features from the GGM can be seamlessly merged with features at various scales. These two pooling-based modules allow the high-level semantic features to be progressively refined, yielding detail enriched saliency maps. Experiment results show that our proposed approach can more accurately locate the salient objects with sharpened details and hence substantially improve the performance compared to the previous state-of-the-arts. Our approach is fast as well and can run at a speed of more than 30 FPS when processing a 300 × 400 image. Code can be found at http://mmcheng.net/poolnet/.