Employing part-level features for pedestrian image description offers fine-grained information and has been verified as beneficial for person retrieval in very recent literature. A prerequisite of part discovery is that each part should be well located. Instead of using external cues, e.g., pose estimation, to directly locate parts, this paper lays emphasis on the content consistency within each part. Specifically, we aim to learn discriminative part-informed features for person retrieval and make two contributions. (i) A network named Part-based Convolutional Baseline (PCB). Given an image input, it outputs a convolutional descriptor consisting of several part-level features. With a uniform partition strategy, PCB achieves results competitive with state-of-the-art methods, proving itself a strong convolutional baseline for person retrieval. (ii) A refined part pooling (RPP) method. Uniform partition inevitably incurs outliers in each part, which are in fact more similar to other parts. RPP re-assigns these outliers to the parts they are closest to, resulting in refined parts with enhanced within-part consistency. Experiments confirm that RPP allows PCB to gain another round of performance boost. For instance, on the Market-1501 dataset, we achieve (77.4+4.2)% mAP and (92.3+1.5)% rank-1 accuracy, surpassing the state of the art by a large margin.
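The uniform partition strategy mentioned above can be illustrated with a minimal sketch. It assumes the backbone output is a feature map shaped [channels][height][width] (plain nested lists here) and that each horizontal stripe is average-pooled into one part-level vector. The function name `uniform_part_features`, the pooling choice, and the number of parts are illustrative assumptions, not details taken from the abstract.

```python
def uniform_part_features(feature_map, num_parts):
    """Average-pool equal horizontal stripes of a [C][H][W] feature map.

    Returns a list of num_parts vectors, each of length C.
    Hypothetical helper for illustration only.
    """
    height = len(feature_map[0])
    if height % num_parts != 0:
        raise ValueError("height must be divisible by num_parts")
    stripe_height = height // num_parts

    parts = []
    for p in range(num_parts):
        rows = range(p * stripe_height, (p + 1) * stripe_height)
        part_vector = []
        for channel in feature_map:
            # Collect every activation of this channel inside the stripe.
            values = [v for r in rows for v in channel[r]]
            part_vector.append(sum(values) / len(values))
        parts.append(part_vector)
    return parts
```

Each returned vector describes one stripe. A refinement step such as RPP would then re-assign outlier locations between stripes, which this sketch does not attempt.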
Translated by Google Translate
The combination of global and partial features has been an essential solution for improving discriminative performance in person re-identification (Re-ID) tasks. Previous part-based methods mainly focus on locating regions with specific pre-defined semantics to learn local representations, which increases learning difficulty and is neither efficient nor robust in scenarios with large variances. In this paper, we propose an end-to-end feature learning strategy integrating discriminative information with various granularities. We carefully design the Multiple Granularity Network (MGN), a multi-branch deep network architecture consisting of one branch for global feature representations and two branches for local feature representations. Instead of learning on semantic regions, we uniformly partition the images into several stripes, and vary the number of parts in different local branches to obtain local feature representations with multiple granularities. Comprehensive experiments on the mainstream evaluation datasets, including Market-1501, DukeMTMC-reid and CUHK03, indicate that our method robustly achieves state-of-the-art performance and outperforms existing approaches by a large margin. For example, on the Market-1501 dataset in single query mode, we obtain a top result of Rank-1/mAP=96.6%/94.2% with this method after re-ranking.
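Person re-identification (ReID) aims to retrieve a person from images captured by different cameras. For deep-learning-based ReID methods, it has been shown that using local features together with the global features of a person image helps produce robust feature representations for person retrieval. Human pose information can provide the locations of the human skeleton, effectively guiding the network to pay more attention to these key areas, and may also help reduce noise distraction from background or occlusions. However, the methods proposed in previous pose-related works may not fully exploit the benefits of pose information and do not consider the different contributions of different local features. In this paper, we propose a pose-guided graph attention network, a multi-branch architecture consisting of one branch for global features, one branch for mid-granularity features, and one branch for fine-grained keypoint features. We use a pre-trained pose estimator to generate keypoint heatmaps for local feature learning, and carefully design a graph convolutional layer to re-evaluate the contribution weights of the extracted local features by modeling their similarity relations. Experimental results demonstrate the effectiveness of our approach for discriminative feature learning, and we show that our model achieves state-of-the-art performance on several mainstream evaluation datasets. We also conduct extensive ablation studies and design different kinds of comparison experiments to demonstrate the effectiveness and robustness of our network, covering holistic datasets, partial datasets, occluded datasets, and cross-domain tests.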
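Learning representative, robust, and discriminative information from images is essential for effective person re-identification (Re-ID). In this paper, we propose a compound approach for end-to-end discriminative deep feature learning for person Re-ID based on both body and hand images. We carefully design the Local-Aware Global Attention Network (LAGA-Net), a multi-branch deep network architecture that includes one branch for spatial attention and one branch for channel attention. The attention branches focus on the relevant features of the image while suppressing the irrelevant background. To overcome a weakness of attention mechanisms, namely that they are equivariant to pixel shuffling, we integrate relative positional encodings into the spatial attention module to capture the spatial positions of pixels. The global branch is intended to preserve global context or structural information. For the local branch, which is intended to capture fine-grained information, we perform uniform partitioning to generate horizontal stripes on the conv-layer. We retrieve the parts by conducting a soft partition, without explicitly partitioning the images or requiring external cues such as pose estimation. A set of ablation studies shows that each component contributes to the improved performance of LAGA-Net. Extensive evaluations on four popular body-based person Re-ID benchmarks and two publicly available hand datasets demonstrate that our proposed method consistently outperforms existing state-of-the-art methods.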
The main contribution of this paper is a simple semi-supervised pipeline that only uses the original training set without collecting extra data. The challenges are 1) how to obtain more training data only from the training set and 2) how to use the newly generated data. In this work, the generative adversarial network (GAN) is used to generate unlabeled samples. We propose the label smoothing regularization for outliers (LSRO). This method assigns a uniform label distribution to the unlabeled images, which regularizes the supervised model and improves the baseline. We verify the proposed method on a practical problem: person re-identification (re-ID). This task aims to retrieve a query person from other cameras. We adopt the deep convolutional generative adversarial network (DCGAN) for sample generation, and a baseline convolutional neural network (CNN) for representation learning. Experiments show that adding the GAN-generated data effectively improves the discriminative ability of learned CNN embeddings. On three large-scale datasets, Market-1501, CUHK03 and DukeMTMC-reID, we obtain +4.37%, +1.6% and +2.46% improvement in rank-1 precision over the baseline CNN, respectively. We additionally apply the proposed method to fine-grained bird recognition and achieve a +0.6% improvement over a strong baseline. The code is available at https://github.com/layumi/Person-reID_GAN .
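The idea of assigning a uniform label distribution to generated images can be sketched as a per-sample cross-entropy. This is a minimal illustration under stated assumptions: real images use a one-hot target, generated images use a target of 1/K over all K training identities, and the function name `lsro_loss` and its signature are hypothetical rather than taken from the released code.

```python
import math


def lsro_loss(logits, label, is_generated):
    """Cross-entropy for one sample with an LSRO-style target.

    Real image: one-hot target on `label`.
    Generated image: uniform target 1/K over all K classes (`label` is ignored).
    Hypothetical helper for illustration only.
    """
    # Numerically stable log-softmax.
    max_logit = max(logits)
    log_norm = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    log_probs = [x - log_norm for x in logits]

    if is_generated:
        # Uniform target: average of the negative log-probabilities.
        return -sum(log_probs) / len(log_probs)
    return -log_probs[label]
```

With a uniform target, the loss is smallest when the network spreads its prediction evenly over all classes, so generated samples discourage over-confident predictions instead of pulling toward any single identity.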
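In cases of serious crime, including sexual abuse, the only available information with demonstrated potential for identification is often images of the hands. Since this evidence is captured in uncontrolled situations, it is difficult to analyse. As global approaches to feature comparison are limited in this setting, it is important to consider local information. In this work, we propose hand-based person identification by learning both global and local deep feature representations. Our proposed method, the Global and Part-Aware Network (GPA-Net), creates global and local branches on the conv-layer to learn robust, discriminative global and part-level features. To learn the local (part-level) features, we perform uniform partitioning on the conv-layer in both horizontal and vertical directions. We retrieve the parts by conducting a soft partition, without explicitly partitioning the images or requiring external cues such as pose estimation. Extensive evaluations on two large, multi-ethnic, publicly available hand datasets demonstrate that our proposed method significantly outperforms competing approaches.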
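Visible-infrared person re-identification (VI-ReID) is challenging because of the large discrepancy between the visible and infrared modalities. Most pioneering approaches reduce intra-class variations and inter-modality discrepancies by learning modality-shared and ID-related features. However, an explicit modality-shared cue has not been fully exploited in VI-ReID. In addition, existing feature learning paradigms impose constraints on either global features or partitioned feature stripes, neglecting the prediction consistency of global and part features. To address these problems, we use pose estimation as an auxiliary learning task to assist the VI-ReID task in an end-to-end framework. By jointly training these two tasks in a mutually beneficial manner, our model learns higher-quality modality-shared and ID-related features. On top of this, the learning of global features and local features is seamlessly synchronized by a Hierarchical Feature Constraint (HFC), in which the former supervises the latter using a knowledge distillation strategy. Experimental results on two benchmark VI-ReID datasets show that the proposed method consistently improves on state-of-the-art methods by significant margins. Specifically, our method achieves a nearly 20% mAP improvement over the state-of-the-art method on the RegDB dataset. Our findings highlight the use of auxiliary task learning in VI-ReID.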
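Many existing person re-identification (Re-ID) methods rely on feature maps that are either partitioned to localize parts of a person or reduced to create a global representation. Although part localization has shown significant success, it uses either location-based partitions or static feature templates. These approaches assume the prior existence of parts in a given image, or of their positions, and ignore image-specific information, which limits their usability in challenging scenarios such as Re-ID with partial occlusions and partial probe images. In this paper, we introduce a spatial-attention-based dynamic part template initialization module that dynamically generates part templates using mid-level semantic features from the early layers of the backbone. Following a self-attention layer, a simplified cross-attention scheme uses the human part-level features of the backbone to extract template features for diverse human body parts, improving the discriminative ability of the entire model. We further explore adaptive weighting of part descriptors to quantify the absence or occlusion of local attributes and to suppress the contribution of the corresponding part descriptors to the matching criteria. Extensive experiments on holistic, occluded, and partial Re-ID benchmarks demonstrate that our proposed architecture achieves competitive performance. Code will be included in the supplementary material and will be made publicly available.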
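Occlusion poses a major challenge for person re-identification (ReID). Existing approaches typically rely on external tools to infer visible body parts, which may be suboptimal in terms of both computational efficiency and ReID accuracy. In particular, they may fail when facing complex occlusions, such as occlusions between pedestrians. Accordingly, in this paper we propose a novel method named Quality-aware Part Models (QPM) for occlusion-robust ReID. First, we propose to jointly learn part features and predict part quality scores. Since no quality annotation is available, we introduce a strategy that automatically assigns low scores to occluded body parts, thereby weakening the impact of occluded body parts on ReID results. Second, based on the predicted part quality scores, we propose a novel identity-aware spatial attention (ISA) module. In this module, a coarse identity-aware feature is used to highlight the pixels of the target pedestrian, so as to handle occlusion between pedestrians. Third, we design an adaptive and efficient approach for obtaining global features from the common non-occluded regions of each image pair. This design is crucial but often ignored by existing methods. QPM has three key advantages: 1) it does not rely on any external tools in either the training or inference stages; 2) it handles occlusions caused by both objects and other pedestrians; 3) it is highly computationally efficient. Experimental results on four popular databases for occluded ReID demonstrate that QPM consistently outperforms state-of-the-art methods by significant margins. The code of QPM will be released.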
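Vehicle re-identification (Re-ID) aims to retrieve images with the same vehicle ID across different cameras. Current part-level feature learning methods typically detect vehicle parts via uniform division, external tools, or attention modeling. However, such part features often require expensive additional annotations and lead to suboptimal performance when part mask predictions are unreliable. In this paper, we propose a weakly supervised Part-Attention Network (PANet) and Part-Mentored Network (PMNet) for vehicle Re-ID. First, PANet localizes vehicle parts via part-relevant channel recalibration and cluster-based mask generation, without vehicle part supervisory information. Second, PMNet leverages teacher-guided learning to distill vehicle part-specific features from PANet and performs multi-scale global-part feature extraction. During inference, PMNet can adaptively extract discriminative part features without part localization by PANet, preventing unstable part mask predictions. We treat Re-ID as a multi-task problem and adopt homoscedastic uncertainty to learn the optimal weighting of ID losses. Experiments on two public benchmarks show that our approach outperforms recent methods that require no extra annotations, with an average increase of 3.0% in CMC@5 and over 1.4% in mAP on Veri776. Moreover, our method can be extended to the occluded vehicle Re-ID task and exhibits good generalization ability.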
Person re-identification (Re-ID) aims at retrieving a person of interest across multiple non-overlapping cameras. With the advancement of deep neural networks and the increasing demand for intelligent video surveillance, it has gained significantly increased interest in the computer vision community. By dissecting the components involved in developing a person Re-ID system, we categorize it into the closed-world and open-world settings. The widely studied closed-world setting is usually applied under various research-oriented assumptions, and has achieved inspiring success using deep learning techniques on a number of datasets. We first conduct a comprehensive overview with in-depth analysis for closed-world person Re-ID from three different perspectives: deep feature representation learning, deep metric learning and ranking optimization. With performance saturating under the closed-world setting, the research focus for person Re-ID has recently shifted to the open-world setting, which faces more challenging issues. This setting is closer to practical applications under specific scenarios. We summarize open-world Re-ID in terms of five different aspects. By analyzing the advantages of existing methods, we design a powerful AGW baseline, achieving state-of-the-art or at least comparable performance on twelve datasets for four different Re-ID tasks. Meanwhile, we introduce a new evaluation metric (mINP) for person Re-ID, indicating the cost of finding all the correct matches, which provides an additional criterion for evaluating Re-ID systems in real applications. Finally, some important yet under-investigated open issues are discussed.
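The mINP metric mentioned above can be sketched as follows. The abstract only says that it reflects the cost of finding all correct matches; the concrete definition used here is an assumption. For each query, the number of correct matches is divided by the rank of the hardest (last-ranked) correct match, and the result is averaged over queries. The function name `mean_inp` and the input layout are hypothetical.

```python
def mean_inp(ranked_matches):
    """Mean inverse negative penalty over queries (assumed definition).

    ranked_matches: one sequence of booleans per query, ordered by decreasing
    similarity; True marks a gallery item sharing the query's identity.
    """
    scores = []
    for matches in ranked_matches:
        num_correct = sum(1 for m in matches if m)
        if num_correct == 0:
            continue  # queries without any correct match are skipped
        # 1-based rank of the last (hardest) correct match.
        hardest_rank = max(i for i, m in enumerate(matches) if m) + 1
        scores.append(num_correct / hardest_rank)
    return sum(scores) / len(scores) if scores else 0.0
```

A query scores 1.0 only when all of its correct matches come before any incorrect gallery item, so the metric penalizes systems that retrieve the easy matches early but bury a hard one deep in the list.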
Person re-identification (re-ID) models trained on one domain often fail to generalize well to another. In our attempt, we present a "learning via translation" framework. In the baseline, we translate the labeled images from the source to the target domain in an unsupervised manner. We then train re-ID models with the translated images by supervised methods. Yet, being an essential part of this framework, unsupervised image-image translation suffers from the information loss of source-domain labels during translation. Our motivation is two-fold. First, for each image, the discriminative cues contained in its ID label should be maintained after translation. Second, given the fact that two domains have entirely different persons, a translated image should be dissimilar to any of the target IDs. To this end, we propose to preserve two types of unsupervised similarities: 1) self-similarity of an image before and after translation, and 2) domain-dissimilarity of a translated source image and a target image. Both constraints are implemented in the similarity preserving generative adversarial network (SPGAN), which consists of a Siamese network and a CycleGAN. Through domain adaptation experiments, we show that images generated by SPGAN are more suitable for domain adaptation and yield consistent and competitive re-ID accuracy on two large-scale datasets.
This paper contributes a new high-quality dataset for person re-identification, named "Market-1501". Generally, current datasets: 1) are limited in scale; 2) consist of hand-drawn bboxes, which are unavailable under realistic settings; 3) have only one ground truth and one query image for each identity (closed environment). To tackle these problems, the proposed Market-1501 dataset has three distinguishing features. First, it contains over 32,000 annotated bboxes, plus a distractor set of over 500K images, making it the largest person re-id dataset to date. Second, images in the Market-1501 dataset are produced using the Deformable Part Model (DPM) as pedestrian detector. Third, our dataset is collected in an open system, where each identity has multiple images under each camera. As a minor contribution, inspired by recent advances in large-scale image search, this paper proposes an unsupervised Bag-of-Words descriptor. We view person re-identification as a special task of image search. In experiments, we show that the proposed descriptor yields competitive accuracy on the VIPeR, CUHK03, and Market-1501 datasets, and is scalable on the large-scale 500k dataset.
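Recently, unsupervised person re-identification (Re-ID) has received increasing attention because of its open-world scenario settings, in which annotated data is limited. Existing supervised methods often fail to generalize well to unseen domains, while unsupervised methods mostly lack multi-granularity information and are prone to confirmation bias. In this paper, we aim to find better feature representations on the unseen target domain from two aspects: 1) performing unsupervised domain adaptation on the labeled source domain and 2) mining potential similarities on the unlabeled target domain. In addition, a collaborative pseudo-relabeling strategy is proposed to alleviate the influence of confirmation bias. First, a generative adversarial network is used to transfer images from the source domain to the target domain. Moreover, person identity-preserving and identity-mapping losses are introduced to improve the quality of the generated images. Second, we propose a novel collaborative multiple feature clustering framework (CMFC) to learn the internal data structure of the target domain, comprising a global feature branch and a partial feature branch. The global feature branch (GB) applies unsupervised clustering to the global features of person images, while the partial feature branch (PB) mines similarities within different body regions. Finally, extensive experiments on two benchmark datasets show the competitive performance of our method under unsupervised person Re-ID settings.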
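Unsupervised domain adaptation (UDA) for person re-identification is challenging because of the huge gap between the source and target domains. A typical self-training method uses pseudo-labels generated by clustering algorithms to iteratively optimize the model on the target domain. A drawback, however, is that noisy pseudo-labels generally cause trouble in learning. To address this problem, mutual learning methods based on dual networks have been developed to produce reliable soft labels. However, as the two neural networks gradually converge, their complementarity is weakened and they are likely to become biased towards the same kind of noise. This paper proposes a novel lightweight module, the Attentive WaveBlock (AWB), which can be integrated into the dual networks of mutual learning to enhance their complementarity and further suppress noise in the pseudo-labels. Specifically, we first introduce a parameter-free module, the WaveBlock, which creates a difference between the features learned by the two networks by waving blocks of feature maps differently. Then, an attention mechanism is leveraged to enlarge the created difference and discover more complementary features. Furthermore, two kinds of combination strategies, pre-attention and post-attention, are explored. Experiments demonstrate that the proposed method achieves state-of-the-art performance with significant improvements on multiple UDA person re-identification tasks. We also demonstrate the generality of the proposed method by applying it to vehicle re-identification and image classification tasks. Our code and models are available at https://github.com/wangwenhao0716/attentive-waveblock .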
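Occluded person re-identification (ReID) aims to match occluded person images to holistic ones across different camera views. Target Pedestrians (TP) are usually disturbed by Non-Pedestrian Occlusions (NPO) and Non-Target Pedestrians (NTP). Previous methods mainly focus on increasing model robustness against NPO while ignoring feature contamination from NTP. In this paper, we propose a novel Feature Erasing and Diffusion Network (FED) to handle NPO and NTP simultaneously. Specifically, our proposed Occlusion Erasing Module (OEM) eliminates NPO features, aided by an NPO augmentation strategy that simulates NPO on holistic pedestrian images and generates precise occlusion masks. Subsequently, we diffuse the pedestrian representations with other memorized features to synthesize NTP characteristics in the feature space, which is achieved by a novel Feature Diffusion Module (FDM) through a learnable cross-attention mechanism. Guided by the occlusion scores from the OEM, the feature diffusion process is conducted mainly on visible body parts, which guarantees the quality of the synthesized NTP characteristics. By jointly optimizing the OEM and FDM in our proposed FED network, we can greatly improve the model's ability to perceive TP and alleviate the influence of NPO and NTP. Furthermore, the proposed FDM works only as an auxiliary module for training and is discarded in the inference phase, thus introducing little inference computational overhead. Experiments on occluded and holistic person ReID benchmarks demonstrate the superiority of FED over state-of-the-art methods; in particular, FED achieves 86.3% Rank-1 accuracy on Occluded-REID, surpassing others by at least 4.7%.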
Occluded person re-identification (ReID) is a person retrieval task which aims at matching occluded person images with holistic ones. For addressing occluded ReID, part-based methods have been shown to be beneficial as they offer fine-grained information and are well suited to represent partially visible human bodies. However, training a part-based model is a challenging task for two reasons. Firstly, individual body part appearance is not as discriminative as global appearance (two distinct IDs might have the same local appearance); this means that standard ReID training objectives using identity labels are not adapted to local feature learning. Secondly, ReID datasets are not provided with human topographical annotations. In this work, we propose BPBreID, a body part-based ReID model for solving the above issues. We first design two modules for predicting body part attention maps and producing body part-based features of the ReID target. We then propose GiLt, a novel training scheme for learning part-based representations that is robust to occlusions and non-discriminative local appearance. Extensive experiments on popular holistic and occluded datasets show the effectiveness of our proposed method, which outperforms state-of-the-art methods by 0.7% mAP and 5.6% rank-1 accuracy on the challenging Occluded-Duke dataset. Our code is available at https://github.com/VlSomers/bpbreid.
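In recent years, a vast amount of visual content has been generated and shared in many fields, such as social media platforms, medical imaging, and robotics. This abundance of content creation and sharing has introduced new challenges, particularly that of searching databases for similar content. This task, Content-Based Image Retrieval (CBIR), is a long-established research area in which improved efficiency and accuracy are needed for real-time retrieval. Artificial intelligence has made progress in CBIR and has significantly facilitated the process of instance search. In this survey, we review recent instance retrieval works developed on the basis of deep learning algorithms and techniques, organizing the survey by deep network architecture type, deep features, feature embedding methods, and network fine-tuning strategies. Our survey considers a wide variety of recent methods: we identify milestone work, reveal connections among various methods, present the commonly used benchmarks, evaluation results, and common challenges, and propose promising future directions.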
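Unsupervised video person re-identification (ReID) methods usually depend on global-level features, whereas many supervised ReID methods employ local-level features and achieve significant performance improvements. However, applying local-level features to unsupervised methods may introduce unstable performance. To improve the performance stability of unsupervised video ReID, this paper introduces a general scheme that fuses part models and unsupervised learning. In this scheme, the global-level feature is divided into equal local-level features. A local-aware module is employed to explore local-level features for unsupervised learning. A global-aware module is proposed to overcome the disadvantages of local-level features. Features from these two modules are fused to form a robust feature representation for each input image. This feature representation has the advantages of local-level features without suffering from their disadvantages. Comprehensive experiments are conducted on three benchmarks, PRID2011, iLIDS-VID, and DukeMTMC-VideoReID, and the results demonstrate that the proposed approach achieves state-of-the-art performance. Extensive ablation studies demonstrate the effectiveness and robustness of the proposed scheme, the local-aware module, and the global-aware module.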
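In this paper, we build on a weakly supervised generation mechanism for intermediate attention maps in any convolutional neural network, and disclose the effectiveness of attention modules more directly in order to fully exploit their potential. Given an existing neural network equipped with arbitrary attention modules, we introduce a meta critic network to evaluate the quality of the attention maps in the main network. Because the reward we design is discrete, the proposed learning method is arranged in a reinforcement learning setting, where the attention actors and recurrent critics are optimized alternately to provide instant critique and revision of the temporary attention representation; the method is hence named Deep REinforced Attention Learning (DREAL). It can be applied generally to network architectures with different types of attention modules, and it promotes their expressive ability by maximizing the relative gain in final recognition performance arising from each individual attention module, as demonstrated by extensive experiments on both category and instance recognition benchmarks.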