Classification problems are common in computer vision. Nevertheless, there is no dedicated work on the classification of beer bottles. As part of the deep learning challenge of a master's course, a dataset of 5,207 beer bottle images with brand labels was created. Each image contains exactly one beer bottle. In this paper, we propose a deep learning model that classifies images of beer bottles in a two-step approach. In the first step, a Faster R-CNN detects the image regions that are relevant for classification, independently of the brand. In the second step, the relevant image regions are classified by a ResNet-18. The image region with the highest classification confidence is returned as the class label. With the proposed model we outperform the classic one-step transfer-learning approach and achieved an accuracy of 99.86% on the final test dataset during the challenge. After the end of the challenge, we were able to reach an accuracy of 100%.
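The two-step scheme described above (detect brand-relevant regions, classify each, keep the most confident one) can be sketched as follows; the detector and classifier stubs are placeholders standing in for the trained Faster R-CNN and ResNet-18, not the paper's actual models:

```python
def classify_two_step(image, detect_regions, classify_region):
    """Two-step classification: `detect_regions` proposes crops of the
    image, `classify_region` returns (label, confidence) for each crop;
    the label of the most confident crop becomes the image label."""
    crops = detect_regions(image)
    if not crops:
        return None
    predictions = [classify_region(crop) for crop in crops]
    label, _confidence = max(predictions, key=lambda p: p[1])
    return label

# Stub detector/classifier standing in for Faster R-CNN and ResNet-18:
fake_detect = lambda img: ["crop_a", "crop_b"]
fake_classify = lambda crop: ("brand_x", 0.99) if crop == "crop_b" else ("brand_y", 0.40)
```

The max-over-confidences step is what lets the pipeline ignore low-quality region proposals at inference time.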
translated by Google Translate
Automatic detection of weapons is important for improving the security and well-being of individuals, yet it remains a difficult task because of the large variety of weapon sizes, shapes and appearances. Viewpoint variations and occlusion are also reasons that make this task more difficult. Furthermore, current object detection algorithms process rectangular regions, but a slender and long rifle may actually cover only a fraction of such a region, while the rest of the region may contain irrelevant details. To overcome these problems, we propose an orientation-aware weapon detection CNN architecture that provides oriented bounding boxes with improved weapon detection performance. The proposed model provides the orientation not only by dividing the angle into 8 classes as a classification problem, but also as a regression problem. For training our weapon detection model, a new dataset comprising a total of 6,400 weapon images was collected from the web and then manually annotated with oriented bounding boxes. Our dataset provides not only oriented bounding boxes as ground truth, but also horizontal bounding boxes. We also provide our dataset in the formats of multiple modern object detectors, for further research in this area. The proposed model is evaluated on this dataset, and a comparative analysis with off-the-shelf object detectors yields superior performance of the proposed model, measured with standard evaluation strategies. The dataset and the model implementation are publicly available at this link: https://bit.ly/2tyzicf.
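The two orientation representations mentioned above (an 8-way classification over angle bins versus a continuous regression target) can be sketched as follows; the bin count and the [0, 360) angle range are illustrative assumptions, not the paper's exact parameterisation:

```python
def angle_to_bin(theta_deg, num_bins=8):
    """Classification view: quantise the box angle into one of
    `num_bins` equal sectors over [0, 360)."""
    theta = theta_deg % 360.0
    return min(int(theta / (360.0 / num_bins)), num_bins - 1)

def angle_to_regression_target(theta_deg):
    """Regression view: normalise the angle to a continuous value in [0, 1),
    which a network head can predict directly."""
    return (theta_deg % 360.0) / 360.0
```

The regression view avoids the quantisation error inherent in binning, at the cost of a harder learning target.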
E-scooters have become ubiquitous vehicles in major cities around the world. The number of e-scooters keeps escalating, increasing their interactions with other vehicles on the road. An e-scooter rider's typical behaviour differs significantly from that of other vulnerable road users. This situation creates new challenges for vehicle active safety systems and automated driving functionalities, which require the detection of e-scooter riders as a first step. To the best of our knowledge, there is no existing computer vision model to detect these e-scooter riders. This paper presents a novel vision-based system to differentiate between e-scooter riders and regular pedestrians, together with a benchmark dataset of e-scooter riders in natural scenes. We propose an efficient pipeline built upon two existing state-of-the-art convolutional neural networks (CNNs), You Only Look Once (YOLOv3) and MobileNetV2. We fine-tune MobileNetV2 on our dataset and train the model to classify e-scooter riders and pedestrians. We obtain a recall of around 0.75 on the raw test samples for classifying e-scooter riders with the whole pipeline. Moreover, the classification accuracy of the MobileNetV2 trained on top of YOLOv3 is over 91%, with precision and recall over 0.9.
X-ray imaging technology has been used for decades in clinical tasks to reveal the internal condition of different organs, and in recent years, it has become more common in other areas such as industry, security, and geography. The recent development of computer vision and machine learning techniques has also made it easier to automatically process X-ray images and several machine learning-based object (anomaly) detection, classification, and segmentation methods have been recently employed in X-ray image analysis. Due to the high potential of deep learning in related image processing applications, it has been used in most of the studies. This survey reviews the recent research on using computer vision and machine learning for X-ray analysis in industrial production and security applications and covers the applications, techniques, evaluation metrics, datasets, and performance comparison of those techniques on publicly available datasets. We also highlight some drawbacks in the published research and give recommendations for future research in computer vision-based X-ray analysis.
Unmanned aerial vehicles such as drones are now in widespread use for various purposes, from aerial-imagery capture to target detection. Easy access to these small aerial vehicles by the public can pose serious security threats. For example, critical locations may be spied upon by drones blended into public gatherings. The study at hand proposes an improved and efficient deep learning autonomous system that can detect and track very small drones with great precision. The proposed system consists of a custom deep learning model, Tiny YOLOv3, one of the flavours of the very fast object detection model You Only Look Once (YOLO), built and used for detection. The object detection algorithm detects drones efficiently. Compared with previous YOLO versions, the proposed architecture shows significantly better performance; improvements are observed in resource usage and time complexity. Performance is measured with recall and precision values of 93% and 91%, respectively.
In this work, we describe in detail how deep learning and computer vision can help to detect fault events of the AirTender system, an aftermarket motorcycle damping-system component. One of the most effective ways to monitor whether AirTender is operating correctly is to look for oil stains on its surface. Starting from real-time images, AirTender is first detected in the motorcycle suspension system, and then a binary classifier determines whether AirTender is spilling oil or not. The detection is made with the help of the YOLOv5 architecture, whereas the classification is carried out with the help of a suitably designed convolutional neural network, OilNet40. To detect the oil leaks more clearly, we dilute the oil with a fluorescent dye with an excitation wavelength peak at about 390 nm. AirTender is then illuminated with suitable UV LEDs. The whole system is an attempt to design a low-cost detection setup. An on-board device, such as a mini-computer, is placed near the suspension system and connected to a full-HD camera that frames AirTender. Through our neural network algorithms, the on-board device is then able to localize AirTender and classify it as either functioning normally (non-leak images) or anomalous (leak images).
Industrial X-ray analysis is common in the aerospace, automotive and nuclear industries, where the structural integrity of certain parts needs to be guaranteed. However, the interpretation of radiographic images is sometimes difficult and may lead two experts to disagree on the classification of a defect. The automated defect recognition (ADR) system presented in this paper will reduce the analysis time and will also help to reduce the subjective interpretation of defects, while increasing the reliability of the human inspectors. Our convolutional neural network (CNN) model achieves 94.2% accuracy (mAP@IoU=50%) when applied to an automotive aluminium-casting dataset (GDXray), which is considered comparable to the expected human performance and surpasses the current state of the art on that dataset. In an industrial environment, its inference time per DICOM image is short enough that it can be installed in production facilities without impacting delivery times. Moreover, an ablation study of the main hyper-parameters was performed to optimise the model accuracy, from an initial baseline result of 75% mAP up to 94.2% mAP.
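The mAP@IoU=50% metric quoted above counts a detection as correct when its intersection over union (IoU) with a ground-truth box reaches 0.5; a minimal IoU routine for axis-aligned boxes:

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0]); iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2]); iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(pred, truth, threshold=0.5):
    """At mAP@IoU=50%, a predicted box matches a ground-truth box
    when their IoU reaches the 0.5 threshold."""
    return iou(pred, truth) >= threshold
```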
Object detection, one of the three main tasks of computer vision, has been used in various applications. The main process is to use deep neural networks to extract the features of an image and then use the features to identify the class and location of an object. Therefore, the main direction for improving the accuracy of object detection tasks is to improve the neural network so that it extracts features better. In this paper, I propose a convolutional module with a transformer [1], which aims to improve the recognition accuracy of the model by fusing the detailed features extracted by a CNN [2] with the global features extracted by a transformer, and to significantly reduce the computational effort of the transformer module by deflating the feature map. The main execution steps are convolutional downsampling to reduce the feature map size, then self-attention calculation and upsampling, and finally concatenation with the initial input. In the experimental part, after splicing the block onto the end of YOLOv5n [3] and training 300 epochs on the COCO dataset, the mAP improved by 1.7% compared with the previous YOLOv5n, and the mAP curve did not show any saturation phenomenon, so there is still potential for improvement. After 100 rounds of training on the Pascal VOC dataset, the accuracy of the results reached 81%, which is 4.6 points better than Faster R-CNN [4] using ResNet-101 [5] as the backbone, but with fewer than one-twentieth of its parameters.
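The execution order described above (downsample, self-attend, upsample, concatenate with the input) can be sketched on a 1-D token sequence in plain Python; mean-pooling stands in for the convolutional downsampling, and the attention uses no learned projections — both simplifying assumptions, not the paper's module:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head self-attention with Q = K = V = tokens (no learned
    projections), scores scaled by sqrt(d)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens))
                    for j in range(d)])
    return out

def transformer_block(tokens, stride=2):
    """Downsample (mean-pool adjacent pairs), attend on the shorter
    sequence, upsample (nearest-neighbour repeat), then concatenate
    channel-wise with the original input."""
    pooled = [[(a + b) / 2 for a, b in zip(tokens[i], tokens[i + 1])]
              for i in range(0, len(tokens) - 1, stride)]
    attended = self_attention(pooled)
    upsampled = [attended[i // stride] for i in range(len(tokens))]
    return [orig + up for orig, up in zip(tokens, upsampled)]
```

Attending on the pooled sequence is where the computational saving comes from: attention cost is quadratic in sequence length, so halving the length quarters it.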
Approximately 1 in 8 women living in the United States will develop invasive breast cancer. Mitotic cell counting is one of the most common tests to assess the aggressiveness, or grade, of breast cancer. For this prognosis, histopathology images must be examined by a pathologist using high-resolution microscopes to count the cells. Unfortunately, this can be an exhaustive task with poor reproducibility, especially for non-experts. Deep learning networks have recently been adapted to medical applications and are able to automatically localise these regions of interest. However, these region-based networks lack the ability to take advantage of the segmentation features produced by full-image CNNs, which are often used as the sole detection method. Therefore, the proposed method leverages Faster R-CNN for object detection while fusing segmentation features produced by a UNET with the RGB image features, achieving an F-score of 0.508 on the MITOS-ATYPIA 2014 mitosis counting dataset and outperforming state-of-the-art methods.
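The F-score reported above combines the precision and recall of the detected mitoses; computed from raw detection counts it looks like this:

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive and
    false-negative detection counts (zero-safe)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```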
We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance. Our unified architecture is extremely fast. Our base YOLO model processes images in real-time at 45 frames per second. A smaller version of the network, Fast YOLO, processes an astounding 155 frames per second while still achieving double the mAP of other real-time detectors. Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background. Finally, YOLO learns very general representations of objects. It outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
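The regression framing above assigns each ground-truth box to the grid cell containing its centre; a sketch of that target encoding (the grid size S is illustrative, and the real YOLO head additionally predicts per-cell confidences and class probabilities):

```python
def encode_box(cx, cy, w, h, S=7):
    """YOLO-style target encoding: the box centre (cx, cy) and size
    (w, h) are fractions of the image; returns the responsible grid
    cell and the centre offsets relative to that cell."""
    col = min(int(cx * S), S - 1)   # clamp so cx == 1.0 stays in-grid
    row = min(int(cy * S), S - 1)
    x_off = cx * S - col
    y_off = cy * S - row
    return row, col, x_off, y_off, w, h
```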
In the current times, the fear and danger of COVID-19 virus still stands large. Manual monitoring of social distancing norms is impractical with a large population moving about and with insufficient task force and resources to administer them. There is a need for a lightweight, robust and 24X7 video-monitoring system that automates this process. This paper proposes a comprehensive and effective solution to perform person detection, social distancing violation detection, face detection and face mask classification using object detection, clustering and Convolution Neural Network (CNN) based binary classifier. For this, YOLOv3, Density-based spatial clustering of applications with noise (DBSCAN), Dual Shot Face Detector (DSFD) and MobileNetV2 based binary classifier have been employed on surveillance video datasets. This paper also provides a comparative study of different face detection and face mask classification models. Finally, a video dataset labelling method is proposed along with the labelled video dataset to compensate for the lack of dataset in the community and is used for evaluation of the system. The system performance is evaluated in terms of accuracy, F1 score as well as the prediction time, which has to be low for practical applicability. The system performs with an accuracy of 91.2% and F1 score of 90.79% on the labelled video dataset and has an average prediction time of 7.12 seconds for 78 frames of a video.
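The DBSCAN step above groups detected people whose ground-plane positions lie within the distancing threshold; a minimal from-scratch DBSCAN on 2-D points (the eps and min_pts values in the test are illustrative, not the paper's settings):

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one cluster label per point (-1 = noise).
    A point is a core point if its eps-neighbourhood (including itself)
    holds at least `min_pts` points."""
    def neighbours(i):
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1              # noise (may later become a border point)
            continue
        labels[i] = cluster
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster     # border point: absorb, don't expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:      # core point: keep expanding
                seeds.extend(nb)
        cluster += 1
    return labels
```

A distancing violation can then be flagged for every point whose label is not -1, i.e. for everyone standing in a group.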
Most weed species can adversely impact agricultural productivity by competing for the nutrients required by high-value crops. Manual weeding is not practical for large cropping areas. Many studies have been undertaken to develop automatic weed-management systems for agricultural crops. In this process, one of the major tasks is to recognise the weeds from images. Weed recognition, however, is a challenging task, because weed and crop plants can be similar in colour, texture and shape, which can be further exacerbated by the imaging conditions when the images are recorded, and by geographic or weather conditions. Advanced machine learning techniques can be used to recognise weeds from imagery. In this paper, we investigate five state-of-the-art deep neural networks, namely VGG16, ResNet-50, Inception-V3, Inception-ResNet-V2 and MobileNetV2, and evaluate their performance for weed recognition. We use several experimental settings and multiple dataset combinations. In particular, we constructed a large dataset by combining several smaller datasets, mitigated class imbalance by data augmentation, and used this dataset for benchmarking the deep neural networks. We investigated the use of transfer learning techniques by preserving the pre-trained weights for extracting features and fine-tuning them using the images of the crop and weed datasets. We found that VGG16 performed better than the others on small-scale datasets, while ResNet-50 performed better than the other deep networks on the large combined dataset.
Asteroids are an indelible part of most astronomical surveys though only a few surveys are dedicated to their detection. Over the years, high cadence microlensing surveys have amassed several terabytes of data while scanning primarily the Galactic Bulge and Magellanic Clouds for microlensing events and thus provide a treasure trove of opportunities for scientific data mining. In particular, numerous asteroids have been observed by visual inspection of selected images. This paper presents novel deep learning-based solutions for the recovery and discovery of asteroids in the microlensing data gathered by the MOA project. Asteroid tracklets can be clearly seen by combining all the observations on a given night and these tracklets inform the structure of the dataset. Known asteroids were identified within these composite images and used for creating the labelled datasets required for supervised learning. Several custom CNN models were developed to identify images with asteroid tracklets. Model ensembling was then employed to reduce the variance in the predictions as well as to improve the generalisation error, achieving a recall of 97.67%. Furthermore, the YOLOv4 object detector was trained to localize asteroid tracklets, achieving a mean Average Precision (mAP) of 90.97%. These trained networks will be applied to 16 years of MOA archival data to find both known and unknown asteroids that have been observed by the survey over the years. The methodologies developed can be adapted for use by other surveys for asteroid recovery and discovery.
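The ensembling step above reduces prediction variance by averaging member outputs; a soft-voting sketch in which the member models are stand-in callables returning class probabilities, not the paper's trained CNNs:

```python
def ensemble_predict(models, x):
    """Soft-voting ensemble: average the per-class probabilities of all
    member models and return (predicted_class, averaged_probabilities)."""
    all_probs = [model(x) for model in models]
    n_classes = len(all_probs[0])
    avg = [sum(p[c] for p in all_probs) / len(models)
           for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c]), avg

# Three stand-in members: one disagrees, but the average still favours class 1.
members = [
    lambda x: [0.6, 0.4],
    lambda x: [0.2, 0.8],
    lambda x: [0.3, 0.7],
]
```

Averaging smooths out the idiosyncratic errors of individual members, which is what drives down the variance term of the generalisation error.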
The efficiency of using the YOLOv5 machine learning model for solving the problem of automatic detection and recognition of micro-objects in the marine environment is studied. Samples of microplankton and microplastics were prepared, from which a database of classified images was collected for training an image recognition neural network. The results of experiments using the trained network to find micro-objects in photo and video images in real time are presented. Experimental studies have shown high efficiency, comparable to manual recognition, of the proposed model in solving problems of detecting micro-objects in the marine environment.
Most city establishments of developing cities are digitally unlabelled because of the lack of automatic annotation systems. Hence, location and trajectory services (such as Google Maps, Uber, etc.) remain underutilised in such cities. Accurate signboard detection in natural scene images is the foremost task for error-free information retrieval from such city streets. However, developing an accurate signboard localisation system is still an unresolved challenge because of the diverse appearance of signboards, which includes textual imagery and perplexing backgrounds. We present a novel object detection approach that can automatically detect signboards and is suitable for such cities. We use Faster R-CNN based localisation by incorporating two specialised pre-processing methods and a run-time-efficient hyper-parameter value-selection algorithm. We took an incremental approach to reach the final proposed method through detailed evaluation and comparison with baselines, using our constructed SVSO (Street View Signboard Objects) signboard dataset, which contains natural-scene signboard images from six developing countries. We demonstrate state-of-the-art performance of our proposed method on both the SVSO dataset and the Open Images dataset. Our proposed method can detect signboards accurately (even if an image contains multiple signboards of diverse shapes and colours in a noisy background), achieving a 0.90 mAP (mean average precision) score on the SVSO independent test set. Our implementation is available at: https://github.com/sadrultoaha/signboard-detection
Handwritten digit recognition (HDR) is one of the most challenging tasks in the domain of optical character recognition (OCR). Irrespective of language, there are some inherent challenges of HDR, which mostly arise from the variations in writing style across individuals, the writing medium and environment, the inability to maintain the same strokes while writing any digit repeatedly, etc. In addition to that, the structural complexities of the digits of a particular language may lead to ambiguity in HDR. Over the years, researchers have developed numerous offline and online HDR pipelines in which different image-processing techniques are combined with traditional machine learning (ML)-based and/or deep learning (DL)-based architectures. Although there is evidence of extensive review studies on HDR in the literature for languages such as English, Arabic, Indian, Farsi, Chinese, etc., there are few surveys on Bangla HDR (BHDR) that cover its challenges, the underlying recognition processes, and possible future directions. In this paper, the characteristics and inherent ambiguities of Bangla handwritten digits are analysed, along with a comprehensive insight into two decades of state-of-the-art datasets and approaches to offline BHDR. Furthermore, several real-life application-specific studies that involve BHDR are also discussed in detail. This paper will also serve as a compilation for researchers interested in the science behind offline BHDR, instigating the exploration of newer avenues of relevant research that may further lead to better offline recognition of Bangla handwritten digits in different application areas.
Due to object detection's close relationship with video analysis and image understanding, it has attracted much research attention in recent years. Traditional object detection methods are built on handcrafted features and shallow trainable architectures. Their performance easily stagnates by constructing complex ensembles which combine multiple low-level image features with high-level context from object detectors and scene classifiers. With the rapid development in deep learning, more powerful tools, which are able to learn semantic, high-level, deeper features, are introduced to address the problems existing in traditional architectures. These models behave differently in network architecture, training strategy and optimization function, etc. In this paper, we provide a review on deep learning based object detection frameworks. Our review begins with a brief introduction on the history of deep learning and its representative tool, namely Convolutional Neural Network (CNN). Then we focus on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further. As distinct specific detection tasks exhibit different characteristics, we also briefly survey several specific tasks, including salient object detection, face detection and pedestrian detection. Experimental analyses are also provided to compare various methods and draw some meaningful conclusions. Finally, several promising directions and tasks are provided to serve as guidelines for future work in both object detection and relevant neural network based learning systems.
Compared with RGB images, hyperspectral images contain a larger number of channels and therefore carry more information about the entities in the image. Convolutional neural networks (CNNs) and multi-layer perceptrons (MLPs) have been proven to be effective image-classification methods. However, they suffer from long training times and the requirement of large amounts of labelled data to achieve the expected results. These problems become more complex when dealing with hyperspectral images. To cut down the training time and reduce the dependence on large labelled datasets, we propose using a transfer-learning approach. The hyperspectral dataset is preprocessed to a lower dimensionality using PCA, and deep learning models are then applied to it for classification. The transfer-learning models then use the features learned by this model to solve a new classification problem on an unseen dataset. A detailed comparison of CNN and multiple MLP architecture models was carried out to determine the architecture best suited to the objectives. The results show that scaling up the layers does not always lead to an increase in accuracy, but often leads to overfitting and longer training times. Training time was reduced to a far greater degree by applying the transfer-learning approach than by training a new model directly on the large dataset, without compromising accuracy.
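The PCA preprocessing step above projects each high-dimensional hyperspectral pixel onto a few principal components before classification; a dependency-free sketch using power iteration with deflation (a real pipeline would use an optimised eigensolver rather than this hand-rolled one):

```python
def pca(data, k, iters=100):
    """Project the rows of `data` onto the top-k principal components,
    computed by power iteration with deflation on the covariance matrix."""
    n, d = len(data), len(data[0])
    # centre the data
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    X = [[row[j] - mean[j] for j in range(d)] for row in data]
    # sample covariance matrix (d x d)
    C = [[sum(X[i][a] * X[i][b] for i in range(n)) / (n - 1)
          for b in range(d)] for a in range(d)]
    components = []
    for _ in range(k):
        v = [1.0] * d                       # deterministic start vector
        for _ in range(iters):
            w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = sum(x * x for x in w) ** 0.5
            v = [x / norm for x in w]
        lam = sum(v[a] * sum(C[a][b] * v[b] for b in range(d))
                  for a in range(d))
        components.append(v)
        # deflate: remove the found component before finding the next one
        C = [[C[a][b] - lam * v[a] * v[b] for b in range(d)]
             for a in range(d)]
    # project each centred row onto the components
    return [[sum(X[i][j] * c[j] for j in range(d)) for c in components]
            for i in range(n)]
```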
Fruit flies are one of the insect species most harmful to fruit yields. In AlertTrap, implementations of the SSD architecture with different state-of-the-art backbone feature extractors, such as MobileNetV1 and MobileNetV2, appear to be potential solutions to the real-time detection problem. SSD-MobileNetV1 and SSD-MobileNetV2 perform well, reaching AP@0.5 of 0.957 and 1.0, respectively. YOLOv4-tiny outperforms the SSD family with 1.0 in AP@0.5; however, its throughput is slightly slower.
Weakly supervised object detection (WSOD) using only image-level annotations has attracted growing attention over the past few years. Whereas such tasks are typically addressed with domain-specific solutions focused on natural images, we show that a simple multiple-instance approach applied to pre-trained deep features yields superior performance on non-photographic datasets, possibly including new classes. The method involves no fine-tuning or cross-domain learning, and is therefore efficient and possibly applicable to arbitrary datasets and classes. We investigate several flavours of the proposed method, some including multi-layer perceptron classifiers. Despite its simplicity, our method shows competitive results on a range of publicly available datasets, including paintings (People-Art, IconArt), watercolours, cliparts and comics, and allows quickly learning unseen visual categories.
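The multiple-instance assumption above treats an image as positive for a class when at least one of its region proposals is; scoring an image by its best region can be sketched as follows (the region scores here would come from classifiers applied to pre-trained deep features):

```python
def mil_image_scores(region_scores):
    """region_scores: one list of per-class scores per region proposal.
    Under the multiple-instance assumption, the image-level score of a
    class is the maximum score any region assigns to it."""
    n_classes = len(region_scores[0])
    return [max(region[c] for region in region_scores)
            for c in range(n_classes)]

def mil_predict(region_scores):
    """Image-level label: the class with the best best-region score."""
    scores = mil_image_scores(region_scores)
    return max(range(len(scores)), key=lambda c: scores[c])
```

As a detection by-product, the region that achieved the winning maximum also localises the object, which is how image-level labels alone can supervise detection.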