Smart Paper Notes

KartalOl: Transfer learning using deep neural network for iris segmentation and localization: New dataset for iris segmentation

Jalil Nourmohammadi Khiarak , Samaneh Salehi Nasab , Farhang Jaryani , Seyed Naeim Moafinejad , Rana Pourmohamad , Yasin Amini , Morteza Noshad

Category: Computer Vision

2021-12-09

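Iris segmentation and localization in unconstrained environments is challenging because of long distances, illumination variations, limited user cooperation, and moving subjects. To address this problem, we present a U-Net with a pre-trained MobileNetV2 deep neural network. We take the MobileNetV2 weights pre-trained on the ImageNet dataset and fine-tune them on the iris recognition and localization domain. In addition, we introduce a new dataset, called KartalOl, to better evaluate detectors in iris recognition scenarios. To provide domain adaptation, we fine-tune the MobileNetV2 model on CASIA-Iris-Asia, CASIA-Iris-M1, CASIA-Iris-Africa, and our dataset. We also augment the data with left-right flips, rotation, zoom, and brightness changes. We choose the binarization threshold for the binary masks by iterating over the images in the provided dataset. The method is trained on the KartalOl dataset together with CASIA-Iris-Asia and CASIA-Iris-M1. The experimental results highlight that our method surpasses state-of-the-art methods on mobile-based benchmarks. Code and evaluation results are publicly available at https://github.com/jalilnkh/kartalol-nir-isl2021031301.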
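
The abstract does not say which criterion drives the threshold search. As a minimal sketch, assuming the threshold is the candidate that maximises mean intersection-over-union (IoU) against ground-truth masks, the search could look like the following. The function names, the IoU criterion, and the toy data are illustrative assumptions, not taken from the paper or its repository.

```python
# Hypothetical threshold search for turning probability maps into binary masks.
# Masks and probability maps are flat lists here to keep the sketch dependency-free.

def iou(pred, truth):
    """Intersection over union of two flat binary masks (lists of 0/1)."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    return inter / union if union else 1.0


def binarize(prob_map, threshold):
    """Mark a pixel as iris (1) when its probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in prob_map]


def pick_threshold(prob_maps, truth_masks, candidates):
    """Return the candidate threshold with the highest mean IoU over all images."""
    def mean_iou(threshold):
        scores = [iou(binarize(p, threshold), t)
                  for p, t in zip(prob_maps, truth_masks)]
        return sum(scores) / len(scores)
    return max(candidates, key=mean_iou)


# Toy data: two 4-pixel "images" with their ground-truth masks.
prob_maps = [[0.9, 0.6, 0.4, 0.1], [0.8, 0.7, 0.3, 0.2]]
truth_masks = [[1, 1, 0, 0], [1, 1, 0, 0]]
best = pick_threshold(prob_maps, truth_masks, [0.25, 0.5, 0.75])
print(best)
```

Iterating over a held-out set rather than the training images would be the safer choice if the selected threshold is later reported alongside test results.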

Translated by Google Translate

Ocular Recognition Databases and Competitions: A Survey

Luiz A. Zanlorensi , Rayson Laroca , Eduardo Luz , Alceu S. Britto Jr. , Luiz S. Oliveira , David Menotti

Category: Computer Vision

2019-11-21

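The use of the iris and the periocular region as biometric traits has been extensively investigated, mainly because of the singularity of iris features and because the periocular region can be used when the image resolution is not sufficient to extract iris information. Besides providing information about an individual's identity, features extracted from these traits can also be explored to obtain other information, such as the individual's gender, the influence of drug use, the use of contact lenses, and spoofing. This work presents a survey of the databases created for ocular recognition, detailing their protocols and how their images were acquired. We also describe and discuss the most popular ocular recognition competitions (contests), highlighting the submitted algorithms that achieved the best results using only iris traits and also fusing iris and periocular region information. Finally, we describe some relevant works that apply deep learning techniques to ocular recognition and point out new challenges and future directions. Considering that there is a large number of ocular databases, and that each one is usually designed for a specific problem, we believe this survey can provide a broad overview of the challenges in ocular biometrics.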

Interpretable Deep Learning-Based Forensic Iris Segmentation and Recognition

Andrey Kuehlkamp , Aidan Boyd , Adam Czajka , Kevin Bowyer , Patrick Flynn , Dennis Chute , Eric Benjamin

Category: Computer Vision

2021-12-01

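Iris recognition of living individuals is a mature biometric modality that has been adopted in applications ranging from governmental ID programs, border crossing, and voter registration and de-duplication to unlocking mobile phones. On the other hand, the possibility of recognizing deceased subjects by their iris patterns has emerged only recently. In this paper, we present an end-to-end deep learning-based method for post-mortem iris segmentation and recognition, with a special visualization technique intended to support forensic human examiners in their efforts. The proposed post-mortem iris segmentation approach outperforms the state of the art and, in addition to the iris annulus detected by classical iris segmentation methods, detects abnormal regions caused by eye decomposition processes, such as furrows or irregular specular highlights on the drying and wrinkling cornea. The method was trained and validated with data acquired from 171 cadavers kept in mortuary conditions, and tested on subject-disjoint data acquired from 259 deceased subjects. To our knowledge, this is the largest corpus of data used in post-mortem iris recognition research to date. The source code of the proposed method is offered with the paper. The test data will be available through the National Archive of Criminal Justice Data (NACJD) archives.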

Human Saliency-Driven Patch-based Matching for Interpretable Post-mortem Iris Recognition

Aidan Boyd , Daniel Moreira , Andrey Kuehlkamp , Kevin Bowyer , Adam Czajka

Category: Computer Vision

2022-08-03

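Forensic iris recognition, as opposed to live iris recognition, is an emerging research area that leverages the discriminative power of iris biometrics to help human examiners identify deceased persons. As a machine learning-based technique in a predominantly human-controlled task, forensic recognition serves as a "back-up" to human expertise in post-mortem identification. The machine learning model must therefore be (a) interpretable and (b) post-mortem-specific, to account for changes in decaying eye tissue. In this work, we propose a method that satisfies both requirements and that approaches post-mortem-specific feature extraction in a novel way that employs human perception. First, we train a deep learning-based feature detector using annotations of the image regions that humans highlighted as salient for their decision making. In effect, the method learns interpretable features directly from humans rather than purely data-driven features. Second, regional iris codes (again, with human-driven filtering kernels) are used to pair the detected iris patches, and these pairings are translated into patch-based comparison scores. In this way, our method presents human examiners with human-understandable visual cues that justify the identification decision and the corresponding confidence score. When tested on a dataset of post-mortem iris images collected from 259 deceased subjects, the proposed method places among the three best iris matchers and gives better results than the commercial (non-human-interpretable) VeriEye approach. We propose a unique post-mortem iris recognition method, trained with human saliency, that gives fully interpretable comparison outcomes for use in forensic examination while achieving state-of-the-art recognition performance.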

Artificial Pupil Dilation for Data Augmentation in Iris Semantic Segmentation

Daniel P. Benalcazar , David A. Benalcazar , Andres Valenzuela

Category: Computer Vision

2022-12-24

Biometrics is the science of identifying an individual based on their intrinsic anatomical or behavioural characteristics, such as fingerprints, face, iris, gait, and voice. Iris recognition is one of the most successful methods because it exploits the rich texture of the human iris, which is unique even for twins and does not degrade with age. Modern approaches to iris recognition utilize deep learning to segment the valid portion of the iris from the rest of the eye, so it can then be encoded, stored and compared. This paper aims to improve the accuracy of iris semantic segmentation systems by introducing a novel data augmentation technique. Our method can transform an iris image with a certain dilation level into any desired dilation level, thus augmenting the variability and number of training examples from a small dataset. The proposed method is fast and does not require training. The results indicate that our data augmentation method can improve segmentation accuracy up to 15% for images with high pupil dilation, which creates a more reliable iris recognition pipeline, even under extreme dilation.

Experimental analysis regarding the influence of iris segmentation on the recognition rate

Heinz Hofbauer , Fernando Alonso-Fernandez , Josef Bigun , Andreas Uhl

Category: Computer Vision

2022-11-10

In this study the authors will look at the detection and segmentation of the iris and its influence on the overall performance of the iris-biometric tool chain. The authors will examine whether the segmentation accuracy, based on conformance with a ground truth, can serve as a predictor for the overall performance of the iris-biometric tool chain. That is: If the segmentation accuracy is improved will this always improve the overall performance? Furthermore, the authors will systematically evaluate the influence of segmentation parameters, pupillary and limbic boundary and normalisation centre (based on Daugman's rubbersheet model), on the rest of the iris-biometric tool chain. The authors will investigate if accurately finding these parameters is important and how consistency, that is, extracting the same exact region of the iris during segmenting, influences the overall performance.
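
For context, Daugman's rubber-sheet model maps the annulus between the pupillary and limbic boundaries onto a fixed-size rectangle by linear interpolation along each angle. The sketch below assumes both boundaries are circles given as (cx, cy, radius) and only computes the sampling coordinates. The function name and grid sizes are illustrative assumptions, and this is not the authors' tool chain.

```python
import math


def rubber_sheet_coords(pupil, limbus, n_radial, n_angular):
    """Sampling grid of the rubber-sheet model for circular boundaries.

    pupil, limbus: (cx, cy, radius) tuples (an assumed parameterisation).
    Returns grid[i][j] = (x, y), where row i runs from the pupillary
    boundary (r = 0) to the limbic boundary (r = 1) and column j is the angle.
    """
    grid = []
    for i in range(n_radial):
        r = i / (n_radial - 1)
        row = []
        for j in range(n_angular):
            theta = 2 * math.pi * j / n_angular
            # Boundary points at this angle.
            xp = pupil[0] + pupil[2] * math.cos(theta)
            yp = pupil[1] + pupil[2] * math.sin(theta)
            xl = limbus[0] + limbus[2] * math.cos(theta)
            yl = limbus[1] + limbus[2] * math.sin(theta)
            # Linear interpolation between the two boundaries.
            row.append(((1 - r) * xp + r * xl, (1 - r) * yp + r * yl))
        grid.append(row)
    return grid


# Concentric toy boundaries: pupil radius 10, limbus radius 30.
grid = rubber_sheet_coords((50, 50, 10), (50, 50, 30), n_radial=3, n_angular=4)
print(grid[0][0], grid[1][0], grid[2][0])
```

Because every sampled point depends on both boundaries and on the two centres, small changes in any segmentation parameter shift the whole normalised texture, which is why the study treats consistency of these parameters as a separate question from raw segmentation accuracy.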

Near-infrared and visible-light periocular recognition with Gabor features using frequency-adaptive automatic eye detection

Fernando Alonso-Fernandez , Josef Bigun

Category: Computer Vision

2022-11-10

Periocular recognition has gained attention recently due to demands of increased robustness of face or iris in less controlled scenarios. We present a new system for eye detection based on complex symmetry filters, which has the advantage of not needing training. Also, separability of the filters allows faster detection via one-dimensional convolutions. This system is used as input to a periocular algorithm based on retinotopic sampling grids and Gabor spectrum decomposition. The evaluation framework is composed of six databases acquired both with near-infrared and visible sensors. The experimental setup is complemented with four iris matchers, used for fusion experiments. The eye detection system presented shows very high accuracy with near-infrared data, and reasonably good accuracy with one visible database. Regarding the periocular system, it exhibits great robustness to small errors in locating the eye centre, as well as to scale changes of the input image. The density of the sampling grid can also be reduced without sacrificing accuracy. Lastly, despite the poorer performance of the iris matchers with visible data, fusion with the periocular system can provide an improvement of more than 20%. The six databases used have been manually annotated, with the annotation made publicly available.
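
The speed-up from separability can be illustrated with a small sketch: when a 2-D kernel is the outer product of a column vector and a row vector, one 2-D filtering pass gives the same result as a 1-D pass along rows followed by a 1-D pass along columns. The kernel below is a small real-valued stand-in chosen for the example, not one of the paper's complex symmetry filters, and all function names are illustrative.

```python
def filter2d(img, kernel):
    """Direct 2-D filtering over the 'valid' region (no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]


def filter_rows(img, k):
    """1-D filtering along each row."""
    n = len(k)
    return [[sum(row[x + j] * k[j] for j in range(n))
             for x in range(len(row) - n + 1)]
            for row in img]


def filter_cols(img, k):
    """1-D filtering along each column."""
    n = len(k)
    return [[sum(img[y + i][x] * k[i] for i in range(n))
             for x in range(len(img[0]))]
            for y in range(len(img) - n + 1)]


col = [1, 2, 1]
row = [1, 0, -1]
kernel = [[c * r for r in row] for c in col]  # outer product: a separable kernel

image = [[3, 1, 4, 1, 5],
         [9, 2, 6, 5, 3],
         [5, 8, 9, 7, 9],
         [3, 2, 3, 8, 4],
         [6, 2, 6, 4, 3]]

direct = filter2d(image, kernel)
separable = filter_cols(filter_rows(image, row), col)
print(direct == separable)
```

For a k-by-k kernel the direct pass costs k*k multiplications per output pixel, while the two 1-D passes cost about 2*k; the same code runs unchanged on Python complex numbers.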

Complex-valued Iris Recognition Network

Kien Nguyen , Clinton Fookes , Sridha Sridharan , Arun Ross

Category: Computer Vision

2020-11-23

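In this work, we design a fully complex-valued neural network for the task of iris recognition. Unlike general object recognition, where real-valued neural networks can be used to extract the relevant features, iris recognition depends on extracting both phase and amplitude information from the input iris texture in order to better represent its biometric content. This requires extracting and processing phase information, which a real-valued neural network cannot handle effectively. We therefore design a fully complex-valued neural network that can better capture the multi-scale, multi-resolution, and multi-orientation phase and amplitude features of the iris texture. We show a strong correspondence between the proposed complex-valued iris recognition network and the Gabor wavelets used to generate the classical IrisCode; however, the proposed method adds a new capability of automatic complex-valued feature learning tailored to iris recognition. We conduct experiments on three benchmark datasets (ND-CrossSensor-2013, CASIA-Iris-Thousand, and UBIRIS.v2) and show the benefit of the proposed network for iris recognition. We use visualization schemes to convey how the complex-valued network, compared with standard real-valued networks, extracts fundamentally different features from the iris texture.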
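
As background for the IrisCode correspondence mentioned above: the classical IrisCode keeps only the phase quadrant of each complex Gabor response, giving two bits per response, and compares codes with a normalised Hamming distance. The sketch below illustrates that classical encoding only; it is not the paper's network, it omits occlusion masks and rotation compensation, and the function names and toy responses are illustrative assumptions.

```python
def phase_bits(responses):
    """Quantise each complex filter response to two bits: the signs of its real and imaginary parts."""
    bits = []
    for z in responses:
        bits.append(1 if z.real >= 0 else 0)
        bits.append(1 if z.imag >= 0 else 0)
    return bits


def hamming_distance(code_a, code_b):
    """Fraction of disagreeing bits between two equal-length codes."""
    return sum(a != b for a, b in zip(code_a, code_b)) / len(code_a)


# Toy complex responses, one per phase quadrant.
responses = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
code = phase_bits(responses)

# Scaling the amplitude leaves the code unchanged: only phase is encoded.
scaled = phase_bits([5 * z for z in responses])
print(code, hamming_distance(code, scaled))
```

The amplitude invariance shown in the last lines is what a real-valued feature extractor has no direct way to express, which is the motivation the abstract gives for complex-valued layers.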

Computer Vision on X-ray Data in Industrial Production and Security Applications: A survey

Mehdi Rafiei , Jenni Raitoharju , Alexandros Iosifidis

Category: Computer Vision

2022-11-10

X-ray imaging technology has been used for decades in clinical tasks to reveal the internal condition of different organs, and in recent years, it has become more common in other areas such as industry, security, and geography. The recent development of computer vision and machine learning techniques has also made it easier to automatically process X-ray images and several machine learning-based object (anomaly) detection, classification, and segmentation methods have been recently employed in X-ray image analysis. Due to the high potential of deep learning in related image processing applications, it has been used in most of the studies. This survey reviews the recent research on using computer vision and machine learning for X-ray analysis in industrial production and security applications and covers the applications, techniques, evaluation metrics, datasets, and performance comparison of those techniques on publicly available datasets. We also highlight some drawbacks in the published research and give recommendations for future research in computer vision-based X-ray analysis.

REFUGE2 Challenge: A Treasure Trove for Multi-Dimension Analysis and Evaluation in Glaucoma Screening

Huihui Fang , Fei Li , Junde Wu , Huazhu Fu , Xu Sun , Jaemin Son , Shuang Yu , Menglu Zhang , Chenglang Yuan , Cheng Bian

Category: Computer Vision

2022-02-18

With the rapid development of artificial intelligence (AI) in medical image processing, deep learning in color fundus photography (CFP) analysis is also evolving. Although there are some open-source, labeled datasets of CFPs in the ophthalmology community, large-scale datasets for screening only have labels of disease categories, and datasets with annotations of fundus structures are usually small in size. In addition, labeling standards are not uniform across datasets, and there is no clear information on the acquisition device. Here we release a multi-annotation, multi-quality, and multi-device color fundus image dataset for glaucoma analysis on an original challenge -- Retinal Fundus Glaucoma Challenge 2nd Edition (REFUGE2). The REFUGE2 dataset contains 2000 color fundus images with annotations of glaucoma classification, optic disc/cup segmentation, as well as fovea localization. Meanwhile, the REFUGE2 challenge sets three sub-tasks of automatic glaucoma diagnosis and fundus structure analysis and provides an online evaluation framework. Based on the characteristics of multi-device and multi-quality data, some methods with strong generalizations are provided in the challenge to make the predictions more robust. This shows that REFUGE2 brings attention to the characteristics of real-world multi-domain data, bridging the gap between scientific research and clinical application.

DeformIrisNet: An Identity-Preserving Model of Iris Texture Deformation

Siamul Karim Khan , Patrick Tinsley , Adam Czajka

Category: Computer Vision

2022-07-18

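Nonlinear iris texture deformations due to pupil size variations are one of the main factors responsible for the within-class variance of genuine comparison scores in iris recognition. In the dominant approaches to iris recognition, the ring-shaped iris region is linearly scaled to a canonical rectangle, which is then used in encoding and matching. However, the biological complexity of the iris sphincter and dilator muscles causes iris features to move nonlinearly as a function of pupil size, and not only along radial paths. As an alternative to existing theoretical models based on the biomechanics of the iris musculature, in this paper we propose a novel deep autoencoder-based model that can effectively learn the complex movements of iris texture features directly from data. The proposed model takes two inputs: (a) an ISO-compliant near-infrared iris image with the initial pupil size, and (b) a binary mask defining the target shape of the iris. The model applies all the nonlinear deformations to the iris texture that are needed to match the shape of the iris in image (a) to the shape given by the target mask (b). The identity-preservation component of the loss function helps the model find deformations that preserve identity, and not only the visual realism of the generated samples. We also demonstrate two immediate applications of this model: better compensation for iris texture deformations in iris recognition algorithms, compared with linear models, and the creation of a generative algorithm that can help human forensic examiners who may need to compare iris images with large differences in pupil dilation. We offer the source code and model weights along with this paper.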

Multistream Gaze Estimation with Anatomical Eye Region Isolation by Synthetic to Real Transfer Learning

Zunayed Mahmud , Paul Hungler , Ali Etemad

Category: Computer Vision | Machine Learning

2022-06-18

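We propose a novel neural pipeline, MSGazeNet, that learns gaze representations by exploiting eye anatomy information through a multistream framework. Our proposed solution comprises two components: first, a network for isolating anatomical eye regions, and second, a network for multistream gaze estimation. The eye region isolation is performed with a U-Net style network, which we train on a synthetic dataset containing eye region masks for the visible eyeball and the iris region. The synthetic dataset used in this stage is a new dataset of 60,000 eye images, which we create with the eye-gaze simulator UnityEyes. The eye region isolation network is then transferred to the real domain to generate masks for real-world images. To make the transfer successful, we exploit domain randomization during training, which lets the synthetic images benefit from a larger variance with the help of augmentations that resemble artifacts. The generated eye region masks are then used together with the raw eye images as a multistream input to our gaze estimation network. We evaluate our framework on three benchmark gaze estimation datasets (MPIIGaze, Eyediap, and UTMultiview), where we set a new state of the art on the Eyediap and UTMultiview datasets with performance gains of 7.57% and 1.85% respectively, while achieving competitive performance on MPIIGaze. We also study the robustness of our method to noise in the data and demonstrate that our model is less sensitive to noisy data. Lastly, we perform a variety of experiments, including ablation studies, to evaluate the contribution of the different components and design choices in our solution.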

Presentation Attack Detection Methods based on Gaze Tracking and Pupil Dynamic: A Comprehensive Survey

Jalil Nourmohammadi Khiarak

Category: Computer Vision

2021-12-07

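Purpose of research: In the biometric community, visible human characteristics are popular and viable for verification and identification on mobile devices. However, impostors are able to spoof such characteristics by creating fake and artificial biometrics to fool the system. Visible biometric systems therefore face a high security risk from presentation attacks. Methods: Meanwhile, challenge-based methods, in particular gaze tracking and pupil dynamics, appear to be more secure than other methods for contactless biometric systems. We review the existing work that explores liveness detection based on gaze tracking and pupil dynamics. Principal results: This study analyzes various aspects of gaze tracking and pupil dynamic presentation attacks, such as state-of-the-art liveness detection algorithms, the various kinds of artifacts, the accessibility of public databases, and a summary of standardization in this area. In addition, we discuss future work and the open challenges in creating secure liveness detection with challenge-based systems.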

A new eye segmentation method based on improved U2Net in TCM eye diagnosis

Peng Hong

Category: Computer Vision

2022-12-06

In traditional Chinese medicine (TCM) diagnosis, tongue segmentation has reached a fairly mature stage, but segmentation has seen little application in TCM eye diagnosis. We propose Res-UNet, based on the architecture of the U2Net network, and use a data enhancement toolkit suited to small datasets; the feature blocks after noise reduction are fused with the high-level features. The number of network parameters and the inference time are used as evaluation indicators for the model, and different eye segmentation frameworks are compared using mIoU, Precision, Recall, F1-Score and FLOPS. On the public UBIVIS.V1 dataset, mIoU reaches 97.8%, S-measure reaches 97.7% and F1-Score reaches 99.09%; for 320*320 RGB input images, the total parameter volume is 167.83 MB. Because of this excessive number of parameters, we also experimented with a small-scale U2Net combined with a Res module, with a parameter volume of 4.63 MB, which is similar to U2Net on the related indicators and verifies the effectiveness of our structure. The method achieves the best segmentation effect among all the compared networks and lays a foundation for subsequent applications that recognise symptoms from the visual apparatus.
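
The abstract does not define how its metrics are computed. As a minimal sketch, assuming binary eye/background masks and mIoU averaged over those two classes, the pixel-level metrics could be derived from confusion counts as follows. The function name, the two-class averaging, and the toy masks are assumptions made for illustration.

```python
def segmentation_metrics(pred, truth):
    """Precision, recall, F1 and two-class mIoU for flat binary masks (lists of 0/1)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # IoU per class, then the mean over foreground and background.
    iou_fg = tp / (tp + fp + fn) if tp + fp + fn else 1.0
    iou_bg = tn / (tn + fp + fn) if tn + fp + fn else 1.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "miou": (iou_fg + iou_bg) / 2}


# Toy masks: one true positive, one false positive, two true negatives.
scores = segmentation_metrics([1, 1, 0, 0], [1, 0, 0, 0])
print(scores)
```

Reported numbers depend on whether the averages are taken per image or over all pixels in the dataset, so the convention should be stated next to any figure compared against the values quoted above.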

A Survey on Computer Vision based Human Analysis in the COVID-19 Era

Fevziye Irem Eyiokur , Alperen Kantarcı , Mustafa Ekrem Erakın , Naser Damer , Ferda Ofli , Muhammad Imran , Janez Križaj , Albert Ali Salah , Alexander Waibel , Vitomir Štruc

Category: Computer Vision

2022-11-07

The emergence of COVID-19 has had a global and profound impact, not only on society as a whole, but also on the lives of individuals. Various prevention measures were introduced around the world to limit the transmission of the disease, including face masks, mandates for social distancing and regular disinfection in public spaces, and the use of screening applications. These developments also triggered the need for novel and improved computer vision techniques capable of (i) providing support to the prevention measures through an automated analysis of visual data, on the one hand, and (ii) facilitating normal operation of existing vision-based services, such as biometric authentication schemes, on the other. Especially important here, are computer vision techniques that focus on the analysis of people and faces in visual data and have been affected the most by the partial occlusions introduced by the mandates for facial masks. Such computer vision based human analysis techniques include face and face-mask detection approaches, face recognition techniques, crowd counting solutions, age and expression estimation procedures, models for detecting face-hand interactions and many others, and have seen considerable attention over recent years. The goal of this survey is to provide an introduction to the problems induced by COVID-19 into such research and to present a comprehensive review of the work done in the computer vision based human analysis field. Particular attention is paid to the impact of facial masks on the performance of various methods and recent solutions to mitigate this problem. Additionally, a detailed review of existing datasets useful for the development and evaluation of methods for COVID-19 related applications is also provided. Finally, to help advance the field further, a discussion on the main open challenges and future research direction is given.

Automatic Gaze Analysis: A Survey of Deep Learning based Approaches

Shreya Ghosh , Abhinav Dhall , Munawar Hayat , Jarrod Knibbe , Qiang Ji

Category: Computer Vision

2021-08-12

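Eye gaze analysis is an important research problem in the fields of computer vision and human-computer interaction. Even with notable progress over the last decade, automatic gaze analysis remains challenging because of the uniqueness of eye appearance, eye-head interplay, occlusion, image quality, and illumination conditions. Several questions remain open, including which cues are important for interpreting gaze direction in an unconstrained environment without prior knowledge, and how to encode them in real time. We review progress across a range of gaze analysis tasks and applications to elucidate these fundamental questions, identify effective methods in gaze analysis, and provide possible future directions. We analyze recent gaze estimation and segmentation methods, especially in the unsupervised and weakly supervised domains, based on their advantages and reported evaluation metrics. Our analysis shows that developing a robust and generic gaze analysis method still requires addressing real-world challenges such as unconstrained setups and learning with less supervision. We conclude by discussing future research directions for designing a real-world gaze analysis system that can propagate to other domains, including computer vision, augmented reality (AR), virtual reality (VR), and human-computer interaction (HCI). Project page: https://github.com/i-am-shreya/eyegazesurvey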

State Of The Art In Open-Set Iris Presentation Attack Detection

Aidan Boyd , Jeremy Speth , Lucas Parzianello , Kevin Bowyer , Adam Czajka

Category: Computer Vision

2022-08-22

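Research in presentation attack detection (PAD) for iris recognition has largely moved beyond evaluation in "closed-set" scenarios to emphasize the ability to generalize to presentation attack types not present in the training data. This paper offers several contributions to understanding and extending the state of the art in open-set iris PAD. First, it describes the most authoritative evaluation of iris PAD to date. We have curated the largest publicly available image dataset for this problem, drawing from 26 benchmarks previously released by various groups and adding 150,000 images released with the journal version of this paper, to create a set of 450,000 images representing authentic irises and seven types of presentation attack instrument (PAI). We formulate a leave-one-PAI-out evaluation protocol and show that even the best algorithms in closed-set evaluations exhibit catastrophic failures on multiple attack types in the open-set scenario. This includes algorithms that performed well in the most recent LivDet-Iris 2020 competition, which may come from the fact that the LivDet-Iris protocol emphasizes sequestered images rather than unseen attack types. Second, we evaluate the accuracy of five open-source iris presentation attack detection algorithms available today, one of which is newly proposed in this paper, and build an ensemble method that beats the winner of LivDet-Iris 2020 by a substantial margin. This paper demonstrates that closed-set iris PAD, when all PAIs are known during training, is a solved problem, with multiple algorithms showing very high accuracy, while open-set iris PAD, when evaluated correctly, remains unsolved. The newly created dataset, the new open-source algorithms, and the evaluation protocol, made publicly available with the journal version of this paper, provide experimental artifacts that researchers can use to measure progress on this important problem.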
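
The open-set protocol described above can be sketched as a split generator: each attack type is withheld from training in turn and appears only in the test fold. The label strings, the function name, and the way bona fide samples are divided (every other sample held out) are assumptions made for this example; the paper's actual partitioning is not given in the abstract.

```python
def leave_one_pai_out(samples, bona_fide="bona_fide"):
    """Yield (held_out_type, train, test) folds.

    samples: list of (sample_id, label) pairs, where label is either the
    bona fide tag or the name of an attack type (hypothetical labels).
    """
    genuine = [s for s in samples if s[1] == bona_fide]
    attacks = [s for s in samples if s[1] != bona_fide]
    for held_out in sorted({label for _, label in attacks}):
        # Training never sees the held-out attack type.
        train = genuine[0::2] + [s for s in attacks if s[1] != held_out]
        # Testing sees only unseen attacks plus unseen bona fide samples.
        test = genuine[1::2] + [s for s in attacks if s[1] == held_out]
        yield held_out, train, test


samples = [("a", "bona_fide"), ("b", "bona_fide"),
           ("c", "print"), ("d", "lens"), ("e", "print")]
folds = list(leave_one_pai_out(samples))
for held_out, train, test in folds:
    print(held_out, [s[0] for s in train], [s[0] for s in test])
```

A closed-set split, by contrast, would place samples of every attack type in both folds, which is the setting the abstract describes as solved.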

Synthetic Data in Human Analysis: A Survey

Indu Joshi , Marcel Grimmer , Christian Rathgeb , Christoph Busch , Francois Bremond , Antitza Dantcheva

Category: Computer Vision

2022-08-19

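Deep neural networks have become prevalent in human analysis, boosting the performance of applications such as biometric recognition, action recognition, and person re-identification. However, the performance of such networks scales with the available training data. In human analysis, the demand for large-scale datasets poses a severe challenge, because data collection is tedious, time-consuming, costly, and must comply with data protection laws. Current research investigates the generation of synthetic data as an efficient and privacy-preserving alternative to collecting real data in the field. This survey introduces the basic definitions and methodologies that are essential when generating and employing synthetic data for human analysis. We summarize current state-of-the-art methods and the main benefits of using synthetic data. We also provide an overview of publicly available synthetic datasets and generation models. Finally, we discuss the limitations of the field as well as open research problems. This survey is intended for researchers and practitioners in the field of human analysis.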

Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark

Parnian Afshar , Arash Mohammadi , Konstantinos N. Plataniotis , Keyvan Farahani , Justin Kirby , Anastasia Oikonomou , Amir Asif , Leonard Wee , Andre Dekker , Xin Wu

Category: Computer Vision | Machine Learning

2022-01-03

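Lung cancer is one of the deadliest cancers, and its diagnosis and treatment depend in part on accurate delineation of the tumor. Human-centered segmentation, currently the most common approach, is subject to inter-observer variability and is also time-consuming, given that only experts can provide annotations. Automatic and semi-automatic tumor segmentation methods have recently shown promising results. However, because different researchers have validated their algorithms with various datasets and performance metrics, reliably evaluating these methods remains an open challenge. The goal of the Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark, created through the 2018 IEEE Video and Image Processing (VIP) Cup competition, is to provide a unique dataset and predefined metrics so that different researchers can develop and evaluate their methods in a unified fashion. The 2018 VIP Cup started with global engagement from 42 countries to access the competition data. At the registration stage, there were 129 members grouped into 28 teams from 10 countries, of which 9 teams made it to the final stage and 6 teams successfully completed all the required tasks. In short, all the algorithms proposed during the competition are based on deep learning models combined with a false positive reduction technique. The methods developed by the three finalists show promising tumor segmentation results; however, more effort should go into reducing the false positive rate. This competition manuscript presents an overview of the VIP Cup challenge, along with the proposed algorithms and results.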

RAZE: Region Guided Self-Supervised Gaze Representation Learning

Neeru Dubey , Shreya Ghosh , Abhinav Dhall

Category: Computer Vision

2022-08-04

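Automatic eye gaze estimation is an important problem in vision-based assistive technology, with use cases in emerging topics such as augmented reality, virtual reality, and human-computer interaction. Over the past few years, interest in unsupervised and self-supervised learning paradigms has grown, because they overcome the requirement for large-scale annotated data. In this paper, we propose RAZE, a region-guided self-supervised gaze representation learning framework that leverages non-annotated facial image data. RAZE learns gaze representations through auxiliary supervision, namely pseudo-gaze zone classification, where the objective is to classify the visual field into different gaze zones (left, right, and center) by leveraging the relative position of the pupil centers. To this end, we automatically annotate 154K web-crawled images with pseudo gaze zone labels and learn feature representations with the "Ize-Net" framework. "Ize-Net" is a capsule layer-based CNN architecture that can efficiently capture rich eye representations. The discriminative behaviour of the feature representation is evaluated on four benchmark datasets: CAVE, TabletGaze, MPII, and RT-GENE. In addition, we evaluate the generalizability of the proposed network on two other downstream tasks (driver gaze estimation and visual attention estimation), which demonstrates the effectiveness of the learned eye gaze representation.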
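
The pseudo-labelling idea can be illustrated with a small sketch that assigns a gaze zone from where the pupil centre sits between the two eye corners. The use of eye-corner x-coordinates, the margin value, and the function name are hypothetical choices for this example; the abstract only states that the relative position of the pupil centers is used.

```python
def pseudo_gaze_zone(pupil_x, corner_a_x, corner_b_x, margin=0.15):
    """Return 'left', 'center' or 'right' in image coordinates.

    The pupil position is normalised to [0, 1] between the two eye corners;
    values within `margin` of the midpoint count as 'center'.
    The margin of 0.15 is an assumed value, not one from the paper.
    """
    lo, hi = sorted((corner_a_x, corner_b_x))
    rel = (pupil_x - lo) / (hi - lo)
    if rel < 0.5 - margin:
        return "left"
    if rel > 0.5 + margin:
        return "right"
    return "center"


# Eye corners at x = 10 and x = 30; three pupil positions.
labels = [pseudo_gaze_zone(x, 10, 30) for x in (12, 20, 28)]
print(labels)
```

Labels produced this way are noisy, since head pose also shifts the pupil relative to the corners, which fits the abstract's framing of zone classification as auxiliary supervision rather than ground truth.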
