Smart Paper Notes

Detection of Active Emergency Vehicles using Per-Frame CNNs and Output Smoothing

Meng Fan , Craig Bidstrup , Zhaoen Su , Jason Owens , Gary Yang , Nemanja Djuric

Category: Computer Vision

2022-12-28

While inferring common actor states (such as position or velocity) is an important and well-explored task of the perception system aboard a self-driving vehicle (SDV), it may not always provide sufficient information to the SDV. This is especially true in the case of active emergency vehicles (EVs), where light-based signals also need to be captured to provide a full context. We consider this problem and propose a sequential methodology for the detection of active EVs, using an off-the-shelf CNN model operating at a frame level and a downstream smoother that accounts for the temporal aspect of flashing EV lights. We also explore model improvements through data augmentation and training with additional hard samples.
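
One way to picture the downstream smoother is a sliding-window vote over the per-frame CNN scores. The sketch below is an illustrative assumption, not the authors' smoother: the function name `smooth_flags`, the window length, the score threshold, and the minimum hit count are all hypothetical.

```python
from collections import deque


def smooth_flags(frame_scores, window=4, score_thresh=0.5, min_hits=2):
    """Return one boolean per frame.

    A frame is flagged as "active EV" when at least `min_hits` of the
    most recent `window` per-frame scores reach `score_thresh`.
    """
    recent = deque(maxlen=window)  # oldest entries drop out automatically
    flags = []
    for score in frame_scores:
        recent.append(score >= score_thresh)
        flags.append(sum(recent) >= min_hits)
    return flags


# Flashing lights give alternating high and low per-frame scores;
# the windowed vote keeps the flag on while the flashing lasts.
scores = [0.9, 0.1, 0.8, 0.1, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1]
flags = smooth_flags(scores)
```

Because a flashing light is dark in some frames, a single-frame decision would toggle on and off; voting over a short window trades a small delay for a stable output.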

Detection of E-scooter Riders in Naturalistic Scenes

Kumar Apurv , Renran Tian , Rini Sherony

Category: Computer Vision

2021-11-28

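E-scooters have become ubiquitous vehicles in major cities around the world. Their numbers keep escalating, increasing their interactions with other cars on the road. The normal behavior of an e-scooter rider differs from that of other vulnerable road users. This situation creates new challenges for vehicle active safety systems and automated driving functionalities, which require the detection of e-scooter riders as a first step. To the best of our knowledge, there is no existing computer vision model that detects these e-scooter riders. This paper presents a vision-based system that differentiates between e-scooter riders and regular pedestrians, along with a benchmark dataset of e-scooter riders in natural scenes. We propose an efficient pipeline built on two existing state-of-the-art convolutional neural networks (CNNs), You Only Look Once (YOLOv3) and MobileNetV2. We fine-tune MobileNetV2 on our dataset and train the model to classify e-scooter riders and pedestrians. We obtain a recall of around 0.75 on our raw test samples when classifying e-scooter riders with the whole pipeline. Moreover, the classification accuracy of the trained MobileNetV2 on top of YOLOv3 is over 91%, with precision and recall over 0.9.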
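
The detector-then-classifier structure described above can be sketched as two pluggable stages. This is a minimal illustration under stated assumptions, not the authors' code: `two_stage_classify`, `fake_detector`, and `fake_classifier` are hypothetical names, and the toy stand-ins replace YOLOv3 and MobileNetV2 so the sketch runs without model weights.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixel coordinates
Image = Sequence[Sequence[int]]  # toy grayscale image as rows of pixels


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the region covered by `box` out of `image`."""
    x1, y1, x2, y2 = box
    return [list(row[x1:x2]) for row in image[y1:y2]]


def two_stage_classify(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[List[List[int]]], str],
) -> List[Tuple[Box, str]]:
    """Stage 1 proposes person boxes; stage 2 labels each cropped box."""
    return [(box, classify(crop(image, box))) for box in detect(image)]


# Hypothetical stand-in for the detector: always returns two fixed boxes.
def fake_detector(image: Image) -> List[Box]:
    return [(0, 0, 2, 2), (2, 0, 4, 2)]


# Hypothetical stand-in for the classifier: thresholds mean brightness.
def fake_classifier(patch: List[List[int]]) -> str:
    mean = sum(sum(row) for row in patch) / (len(patch) * len(patch[0]))
    return "e-scooter rider" if mean > 100 else "pedestrian"


image = [[200, 200, 10, 10], [200, 200, 10, 10]]
results = two_stage_classify(image, fake_detector, fake_classifier)
```

Keeping the stages behind plain callables means either model can be swapped or fine-tuned independently, which matches the abstract's description of fine-tuning only the classifier.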

1st Workshop on Maritime Computer Vision (MaCVi) 2023: Challenge Results

Benjamin Kiefer , Matej Kristan , Janez Perš , Lojze Žust , Fabio Poiesi , Fabio Augusto de Alcantara Andrade , Alexandre Bernardino , Matthew Dawkins , Jenni Raitoharju , Yitong Quan

Category: Computer Vision | Artificial Intelligence | Machine Learning | Robotics

2022-11-24

The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicle (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings of the individual subchallenges and introduces a new benchmark, called SeaDronesSee Object Detection v2, which extends the previous benchmark by including more classes and footage. We provide statistical and qualitative analyses, and assess trends in the best-performing methodologies of over 130 submissions. The methods are summarized in the appendix. The datasets, evaluation code and the leaderboard are publicly available at https://seadronessee.cs.uni-tuebingen.de/macvi.

Video-based Smoky Vehicle Detection with A Coarse-to-Fine Framework

Xiaojiang Peng , Xiaomao Fan , Qingyang Wu , Jieyan Zhao , Pan Gao

Category: Computer Vision

2022-07-08

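Automatic smoky vehicle detection in videos offers environmental protection agencies an alternative to traditional, expensive remote sensing with ultraviolet light devices. However, distinguishing vehicle smoke from shadows and wet regions behind vehicles or on cluttered roads is challenging, and limited annotated data can make it worse. In this paper, we first introduce a real-world, large-scale smoky vehicle dataset with 75,000 annotated smoky vehicle images, facilitating the effective training of advanced deep learning models. To enable fair algorithm comparison, we also build a smoky vehicle video dataset containing 163 long videos with segment-level annotations. Moreover, we present a new coarse-to-fine smoky vehicle detection (CoDeS) framework for efficient smoky vehicle detection. CoDeS first leverages a lightweight YOLO detector for fast smoke detection with high recall, then applies a smoke-vehicle matching strategy to eliminate non-vehicle smoke, and finally uses an elaborately designed 3D model to further refine the results in spatio-temporal space. Extensive experiments on four metrics show that our framework is superior to hand-crafted feature-based methods and recent advanced methods. The code and dataset will be released at https://github.com/pengxj/smokyvehicle.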

Object Detection for Autonomous Dozers

Chun-Hao Liu , Burhaneddin Yaman

Category: Computer Vision

2022-08-17

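We introduce a new type of autonomous vehicle, an autonomous dozer, that is expected to complete construction site tasks in an efficient, robust, and safe manner. To better handle path planning for the dozer and ensure construction site safety, object detection is one of the most critical components among perception tasks. In this work, we first collect construction site data by driving around. We then analyze the data thoroughly to understand its distribution. Finally, two well-known object detection models are trained, and their performance is benchmarked across a wide range of training strategies and hyperparameters.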

Scalability in Perception for Autonomous Driving: Waymo Open Dataset

Pei Sun , Henrik Kretzschmar , Xerxes Dotiwalla , Aurelien Chouard , Vijaysai Patnaik , Paul Tsui , James Guo , Yin Zhou , Yuning Chai , Benjamin Caine

Category:

2019-12-10

The research community has increasing interest in autonomous driving research, despite the resource intensity of obtaining representative real world data. Existing self-driving datasets are limited in the scale and variation of the environments they capture, even though generalization within and between operating regions is crucial to the overall viability of the technology. In an effort to help align the research community's contributions with real-world self-driving problems, we introduce a new large-scale, high quality, diverse dataset. Our new dataset consists of 1150 scenes that each span 20 seconds, consisting of well synchronized and calibrated high quality LiDAR and camera data captured across a range of urban and suburban geographies. It is 15x more diverse than the largest camera+LiDAR dataset available based on our proposed geographical coverage metric. We exhaustively annotated this data with 2D (camera image) and 3D (LiDAR) bounding boxes, with consistent identifiers across frames. Finally, we provide strong baselines for 2D as well as 3D detection and tracking tasks. We further study the effects of dataset size and generalization across geographies on 3D detection methods. Find data, code and more up-to-date information at http://www.waymo.com/open.

Synthehicle: Multi-Vehicle Multi-Camera Tracking in Virtual Cities

Fabian Herzog , Junpeng Chen , Torben Teepe , Johannes Gilg , Stefan Hörmann , Gerhard Rigoll

Category: Computer Vision

2022-08-30

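Smart city applications such as intelligent traffic routing or accident prevention rely on computer vision methods for exact vehicle localization and tracking. Because accurately labeled data is scarce, detecting and tracking vehicles in 3D from multiple cameras has proven challenging to explore. We present a massive synthetic dataset for multiple vehicle tracking and segmentation in multiple overlapping and non-overlapping camera views. Unlike existing datasets, which only provide tracking ground truth for 2D bounding boxes, our dataset additionally contains perfect labels for 3D bounding boxes in camera and world coordinates, depth estimation, and instance, semantic, and panoptic segmentation. The dataset consists of 17 hours of labeled video material, recorded from 340 cameras in 64 diverse day, rain, dawn, and night scenes, making it the most extensive dataset for multi-target multi-camera tracking so far. We provide baselines for detection, vehicle re-identification, and single-camera tracking. Code and data are publicly available.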

AirTrack: Onboard Deep Learning Framework for Long-Range Aircraft Detection and Tracking

Sourish Ghosh , Jay Patrikar , Brady Moon , Milad Moghassem Hamidi , and Sebastian Scherer

Category: Computer Vision | Machine Learning | Robotics

2022-09-26

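Detect-and-avoid (DAA) capabilities are critical for the safe operation of unmanned aircraft systems (UAS). This paper introduces AirTrack, a real-time, vision-only detection and tracking framework that respects the size, weight, and power (SWaP) constraints of sUAS systems. Given the low signal-to-noise ratio (SNR) of far-away aircraft, we propose using full-resolution images in a deep learning framework that aligns successive images to remove ego-motion. The aligned images are then used downstream in cascaded primary and secondary classifiers to improve detection and tracking performance on multiple metrics. We show that AirTrack outperforms state-of-the-art baselines on the Amazon Airborne Object Tracking (AOT) dataset. Multiple real-world flight tests with a Cessna 172 interacting with general aviation traffic, and additional near-collision flight tests flying towards a UAS in a controlled setting, show that the proposed approach satisfies the newly introduced ASTM F3442/F3442M standard for DAA. Empirical evaluations show that our system achieves a probability of more than 95% up to a range of 900 m. Video available at https://youtu.be/h3ll_wjxjpw.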

Small Object Detection using Deep Learning

Aleena Ajaz , Ayesha Salar , Tauseef Jamal , Asif Ullah Khan

Category: Computer Vision | Machine Learning

2022-01-10

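Nowadays, UAVs such as drones are widely used for various purposes, such as capture and target detection from aerial imagery. Easy public access to these small aerial vehicles can cause serious security threats. For instance, critical locations could be monitored by spies who blend in with the public and use drones. The study at hand proposes an improved and efficient deep learning based autonomous system that can detect and track very small drones with great precision. The proposed system consists of a custom deep learning model, Tiny YOLOv3, one of the flavors of the very fast object detection model You Only Look Once (YOLO), which is built and used for detection. The object detection algorithm detects the drones efficiently. The proposed architecture shows significantly better performance than the previous YOLO version. Improvement is observed in terms of resource usage and time complexity. Performance is measured using recall and precision, which are 93% and 91%, respectively.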
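
For reference, recall and precision are conventionally defined from counts of true positives (TP), false positives (FP), and false negatives (FN). These are the standard definitions; the abstract does not state how detections were matched to ground truth.

```latex
\[
\text{Precision} = \frac{TP}{TP + FP},
\qquad
\text{Recall} = \frac{TP}{TP + FN}
\]
```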

Deep Learning based Computer Vision Methods for Complex Traffic Environments Perception: A Review

Talha Azfar , Jinlong Li , Hongkai Yu , Ruey Long Cheu , Yisheng Lv , Ruimin Ke

Category: Computer Vision

2022-11-09

Computer vision applications in intelligent transportation systems (ITS) and autonomous driving (AD) have gravitated towards deep neural network architectures in recent years. While performance seems to be improving on benchmark datasets, many real-world challenges are yet to be adequately considered in research. This paper conducts an extensive literature review on the applications of computer vision in ITS and AD, and discusses challenges related to data, models, and complex urban environments. The data challenges are associated with the collection and labeling of training data and its relevance to real world conditions, bias inherent in datasets, the high volume of data needed to be processed, and privacy concerns. Deep learning (DL) models are commonly too complex for real-time processing on embedded hardware, lack explainability and generalizability, and are hard to test in real-world settings. Complex urban traffic environments have irregular lighting and occlusions, and surveillance cameras can be mounted at a variety of angles, gather dirt, and shake in the wind, while the traffic conditions are highly heterogeneous, with violation of rules and complex interactions in crowded scenarios. Some representative applications that suffer from these problems are traffic flow estimation, congestion detection, autonomous driving perception, vehicle interaction, and edge computing for practical deployment. The possible ways of dealing with the challenges are also explored while prioritizing practical deployment.

RADACS: Towards Higher-Order Reasoning using Action Recognition in Autonomous Vehicles

Alex Zhuang , Eddy Zhou , Quanquan Li , Rowan Dempster , Alikasim Budhwani , Mohammad Al-Sharman , Derek Rayside , William Melek

Category: Computer Vision | Machine Learning | Robotics

2022-09-28

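When applied to autonomous vehicle settings, action recognition can help enrich an environment model's understanding of the world and improve plans for future action. Towards improving autonomous vehicle decision-making, we propose in this work a novel two-stage online action recognition system, termed RADACS. RADACS formulates the problem of active agent detection and adapts ideas about actor-context relations from human activity recognition in a straightforward two-stage pipeline for action detection and classification. We show that our proposed scheme can outperform the baseline on the ICCV2021 ROAD Challenge dataset, and by deploying it on a real vehicle platform, we demonstrate how a higher-order understanding of agent actions in an environment can improve decisions on a real autonomous vehicle.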

Real-time Bangla License Plate Recognition System for Low Resource Video-based Applications

Alif Ashrafee , Akib Mohammed Khan , Mohammad Sabik Irbaz , MD Abdullah Al Nasim

Category: Computer Vision | Artificial Intelligence

2021-08-18

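Automatic license plate recognition systems aim to provide a solution for detecting, localizing, and recognizing license plate characters from vehicles appearing in video frames. However, deploying such systems in the real world requires real-time performance in low-resource environments. In our paper, we propose a two-stage detection pipeline paired with Vision API that provides real-time inference speed along with consistently accurate detection and recognition performance. We use a Haar-cascade classifier as a filter on top of our backbone MobileNet SSDv2 detection model. This reduces inference time by focusing only on high-confidence detections and using them for recognition. We also impose a temporal frame separation strategy to distinguish between multiple vehicle license plates in the same clip. Furthermore, since there are no publicly available Bangla license plate datasets, we created an image dataset and a video dataset containing license plates in the wild. We trained our models on the image dataset and achieved an AP(0.5) score of 86%, and tested our pipeline on the video dataset and observed reasonable detection and recognition performance (82.7% detection rate and 60.8% OCR F1 score) with real-time processing speed (27.2 frames per second).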

Drone Detection and Tracking in Real-Time by Fusion of Different Sensing Modalities

Fredrik Svanström , Fernando Alonso-Fernandez , Cristofer Englund

Category: Computer Vision

2022-07-05

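Automatic detection of flying drones is a key issue, since their presence, especially if unauthorized, can create risky situations or compromise security. Here, we design and evaluate a multi-sensor drone detection system. In conjunction with common video cameras and microphone sensors, we explore the use of thermal infrared cameras, pointed out as a feasible and promising solution that is scarcely addressed in the related literature. Our solution also integrates a fish-eye camera to monitor a wider part of the sky and steer the other cameras towards objects of interest. The sensing solutions are complemented with an ADS-B receiver, a GPS receiver, and a radar module, although the latter was not included in our final deployment because of its limited detection range. The thermal camera is shown to be a feasible solution as good as the video camera, even though the camera employed here has a lower resolution. Two other novelties of our work are the creation of a new public dataset of multi-sensor annotated data that expands the number of classes compared to existing ones, and a study of detector performance as a function of the sensor-to-target distance. Sensor fusion is also explored, showing that the system can be made more robust in this way, mitigating false detections of the individual sensors.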
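
The abstract does not describe the fusion rule, so the following is only a hedged illustration of how combining sensors can suppress a single sensor's false detection: a simple k-of-n vote. The function name `fuse_detections`, the sensor keys, and the `min_agree` parameter are assumptions made for this sketch.

```python
from typing import Dict


def fuse_detections(sensor_flags: Dict[str, bool], min_agree: int = 2) -> bool:
    """Report a drone only when at least `min_agree` sensors agree."""
    return sum(sensor_flags.values()) >= min_agree


# A lone alarm from one sensor is suppressed ...
single = fuse_detections({"thermal": True, "video": False, "audio": False})
# ... while agreement between two sensors is reported.
agreed = fuse_detections({"thermal": True, "video": True, "audio": False})
```

Raising `min_agree` lowers the false-alarm rate at the cost of missing drones that only one sensor can see, for example at long range.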

TAD: A Large-Scale Benchmark for Traffic Accidents Detection from Video Surveillance

Yajun Xu , Chuwen Huang , Yibing Nan , Shiguo Lian

Category: Computer Vision

2022-09-26

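Automatic traffic accident detection has appealed to the machine vision community because of its implications for the development of autonomous intelligent transportation systems (ITS) and its importance to traffic safety. However, most studies on efficient analysis and prediction of traffic accidents have used small-scale datasets with limited coverage, which limits their effect and applicability. Existing traffic accident datasets are either small-scale, not from surveillance cameras, not open-sourced, or not built for freeway scenes. Since accidents on freeways tend to cause serious damage and happen too quickly to reach the scene in time, an open-source dataset targeting freeway traffic accidents collected from surveillance cameras is in great need and of practical importance. To help the vision community address these shortcomings, we endeavor to collect video data of real traffic accidents covering abundant scenes. After integration and annotation across various dimensions, a large-scale traffic accident dataset named TAD is proposed in this work. Various experiments on image classification, object detection, and video classification tasks, using public mainstream vision algorithms or frameworks, are conducted to demonstrate the performance of different methods. The proposed dataset, together with the experimental results, is presented as a new benchmark to improve computer vision research, especially in ITS.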

DualCam: A Novel Benchmark Dataset for Fine-grained Real-time Traffic Light Detection

Harindu Jayarathne , Tharindu Samarakoon , Hasara Koralege , Asitha Divisekara , Ranga Rodrigo , Peshala Jayasekara

Category: Computer Vision | Artificial Intelligence | Robotics

2022-09-03

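Traffic light detection is essential for self-driving cars to navigate safely in urban areas. Publicly available traffic light datasets are inadequate for developing algorithms that detect distant traffic lights, which provide important navigation information. We introduce a novel benchmark traffic light dataset captured using a pair of narrow-angle and wide-angle cameras covering urban and semi-urban roads. We provide 1032 images for training and 813 synchronized image pairs for testing. Additionally, we provide synchronized video pairs for qualitative analysis. The dataset includes images of resolution 1920$\times$1080 covering 10 different classes. Furthermore, we propose a post-processing algorithm for combining the outputs of the two cameras. Results show that our technique can strike a balance between speed and accuracy, compared to the conventional approach of using a single camera frame.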

Performance Analysis of YOLO-based Architectures for Vehicle Detection from Traffic Images in Bangladesh

Refaat Mohammad Alamgir , Ali Abir Shuvro , Mueeze Al Mushabbir , Mohammed Ashfaq Raiyan , Nusrat Jahan Rani , Md. Mushfiqur Rahman , Md. Hasanul Kabir , Sabbir Ahmed

Category: Computer Vision

2022-12-18

The task of locating and classifying different types of vehicles has become a vital element in numerous applications of automation and intelligent systems ranging from traffic surveillance to vehicle identification and many more. In recent times, Deep Learning models have been dominating the field of vehicle detection. Yet, Bangladeshi vehicle detection has remained a relatively unexplored area. One of the main goals of vehicle detection is its real-time application, where 'You Only Look Once' (YOLO) models have proven to be the most effective architecture. In this work, intending to find the best-suited YOLO architecture for fast and accurate vehicle detection from traffic images in Bangladesh, we have conducted a performance analysis of different variants of the YOLO-based architectures such as YOLOv3, YOLOv5s, and YOLOv5x. The models were trained on a dataset containing 7390 images belonging to 21 types of vehicles comprising samples from the DhakaAI dataset, the Poribohon-BD dataset, and our self-collected images. After thorough quantitative and qualitative analysis, we found the YOLOv5x variant to be the best-suited model, performing better than the YOLOv3 and YOLOv5s models by 7 and 4 percent in mAP, and by 12 and 8.5 percent in accuracy, respectively.

Providentia -- A Large-Scale Sensor System for the Assistance of Autonomous Vehicles and Its Evaluation

Annkathrin Krämmer , Christoph Schöller , Dhiraj Gulati , Venkatnarayanan Lakshminarasimhan , Franz Kurz , Dominik Rosenbaum , Claus Lenz , Alois Knoll

Category: Robotics | Computer Vision

2019-06-16

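The environmental perception of an autonomous vehicle is limited by its physical sensor ranges and algorithmic performance, as well as by occlusions that degrade its understanding of the ongoing traffic situation. This not only poses a significant threat to safety and limits driving speeds, but can also lead to inconvenient maneuvers. Intelligent infrastructure systems can help to alleviate these problems. An intelligent infrastructure system can fill in the gaps in a vehicle's perception and extend its field of view by providing additional detailed information about its surroundings, in the form of a digital model of the current traffic situation, i.e. a digital twin. However, detailed descriptions of such systems and working prototypes demonstrating their feasibility are scarce. In this paper, we propose a hardware and software architecture that enables such a reliable intelligent infrastructure system to be built. We implemented this system in the real world and demonstrate its ability to create an accurate digital twin of an extended highway stretch, thus enhancing an autonomous vehicle's perception beyond the limits of its on-board sensors. Furthermore, we evaluate the accuracy and reliability of the digital twin by using aerial images and earth observation methods to generate ground truth data.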

Data generation using simulation technology to improve perception mechanism of autonomous vehicles

Minh Cao , Ramin Ramezani

Category: Computer Vision

2022-07-01

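Recent advancements in computer graphics technology allow more realistic rendering of car driving environments. They have enabled self-driving car simulators such as DeepGTA-V and CARLA (Car Learning to Act) to generate large amounts of synthetic data that can complement existing real-world datasets in training autonomous car perception. Furthermore, since self-driving car simulators allow full control of the environment, they can generate dangerous driving scenarios, such as bad weather and accident scenarios, that real-world datasets lack. In this paper, we demonstrate the effectiveness of combining data gathered from the real world with data generated in the simulated world to train perception systems on object detection and localization tasks. We also propose a multi-level deep learning perception framework that aims to emulate a human learning experience in which a series of tasks, from simple to more difficult, are learned in a certain domain. The autonomous car perceptron can learn from easy-to-drive scenarios up to more challenging ones customized by simulation software.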

Indian Licence Plate Dataset in the wild

Sanchit Tanwar , Ayush Tiwari , Ritesh Chowdhry

Category: Computer Vision

2021-11-11

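Indian licence plate detection is a problem that has not been explored much at an open-source level. Proprietary solutions are available, but there is no big open-source dataset that can be used to perform experiments and test different approaches. Large datasets are available for countries such as China and Brazil, but models trained on these datasets do not perform well on Indian plates because font styles and plate designs vary significantly from country to country. This paper introduces an Indian licence plate dataset with 16,192 images and 21,683 plates, with each plate annotated with 4 points, along with each character in the corresponding plate. We present a benchmark model that uses semantic segmentation to solve number plate detection. We propose a two-stage approach in which the first stage localizes the plate and the second stage reads the text in the cropped plate image. We tested benchmark object detection and semantic segmentation models; for the second stage, we used an LPRNet-based OCR.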

Convolutions for Spatial Interaction Modeling

Zhaoen Su , Chao Wang , David Bradley , Carlos Vallespi-Gonzalez , Carl Wellington , Nemanja Djuric

Category: Computer Vision

2021-04-15

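In many different fields, interactions between objects play a critical role in determining their behavior. Graph neural networks (GNNs) have emerged as a powerful tool for modeling interactions, although often at the cost of adding considerable complexity and latency. In this paper, we consider the problem of spatial interaction modeling in the context of predicting the motion of actors around autonomous vehicles, and investigate alternatives to GNNs. We revisit 2D convolutions and show that they can demonstrate comparable performance to graph networks in modeling spatial interactions with lower latency, thus providing an effective and efficient alternative in time-critical systems. Moreover, we propose a novel interaction loss to further improve the interaction modeling of the considered methods.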
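
To make the idea of modeling spatial interactions with a 2D convolution concrete, the sketch below rasterizes actor positions onto a grid and applies one convolution so that each cell aggregates information from nearby actors. It is an illustration under assumptions, not the paper's model: the grid size, the fixed box kernel, and the names `rasterize` and `conv2d_same` are hypothetical, and a real network would learn its kernel weights.

```python
from typing import List, Tuple

Grid = List[List[float]]


def rasterize(actors: List[Tuple[int, int]], height: int, width: int) -> Grid:
    """Mark each actor's (row, col) cell with 1.0 on an empty grid."""
    grid = [[0.0] * width for _ in range(height)]
    for r, c in actors:
        grid[r][c] = 1.0
    return grid


def conv2d_same(grid: Grid, kernel: Grid) -> Grid:
    """Zero-padded 2D cross-correlation with an odd-sized kernel."""
    h, w = len(grid), len(grid[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - ph, c + j - pw
                    # Cells outside the grid count as zero (padding).
                    if 0 <= rr < h and 0 <= cc < w:
                        total += grid[rr][cc] * kernel[i][j]
            out[r][c] = total
    return out


# Two actors on a 4x4 grid; a 3x3 box kernel counts the actors
# in each cell's immediate neighborhood.
grid = rasterize([(1, 1), (1, 2)], height=4, width=4)
box_kernel = [[1.0] * 3 for _ in range(3)]
neighbors = conv2d_same(grid, box_kernel)
```

The cost of the convolution depends on the grid and kernel sizes rather than on the number of actor pairs, which is one intuition for the lower latency the abstract reports relative to graph networks.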
