Smart Paper Notes

Performance Analysis of YOLO-based Architectures for Vehicle Detection from Traffic Images in Bangladesh

Refaat Mohammad Alamgir , Ali Abir Shuvro , Mueeze Al Mushabbir , Mohammed Ashfaq Raiyan , Nusrat Jahan Rani , Md. Mushfiqur Rahman , Md. Hasanul Kabir , Sabbir Ahmed

Category: Computer Vision

2022-12-18

The task of locating and classifying different types of vehicles has become a vital element in numerous applications of automation and intelligent systems, ranging from traffic surveillance to vehicle identification. In recent times, Deep Learning models have been dominating the field of vehicle detection. Yet, Bangladeshi vehicle detection has remained a relatively unexplored area. One of the main goals of vehicle detection is its real-time application, where 'You Only Look Once' (YOLO) models have proven to be the most effective architecture. In this work, intending to find the best-suited YOLO architecture for fast and accurate vehicle detection from traffic images in Bangladesh, we have conducted a performance analysis of different variants of the YOLO-based architectures, namely YOLOv3, YOLOv5s, and YOLOv5x. The models were trained on a dataset containing 7390 images belonging to 21 types of vehicles, comprising samples from the DhakaAI dataset, the Poribohon-BD dataset, and our self-collected images. After thorough quantitative and qualitative analysis, we found the YOLOv5x variant to be the best-suited model, outperforming the YOLOv3 and YOLOv5s models by 7 and 4 percent in mAP and by 12 and 8.5 percent in accuracy, respectively.

Translated by Google Translate

Huruf: An Application for Arabic Handwritten Character Recognition Using Deep Learning

Minhaz Kamal , Fairuz Shaiara , Chowdhury Mohammad Abdullah , Sabbir Ahmed , Tasnim Ahmed , Md. Hasanul Kabir

Category: Computer Vision

2022-12-16

Handwriting Recognition has been a field of great interest in the Artificial Intelligence domain. Due to its broad use cases in real life, research has been conducted widely on it. Prominent work has been done in this field focusing mainly on Latin characters. However, the domain of Arabic handwritten character recognition is still relatively unexplored. The inherent cursive nature of the Arabic characters and variations in writing styles across individuals make the task even more challenging. We identified some probable reasons behind this and proposed a lightweight Convolutional Neural Network-based architecture for recognizing Arabic characters and digits. The proposed pipeline consists of a total of 18 layers: four each for convolution, pooling, batch normalization, and dropout, followed by one global average pooling layer and one dense layer. Furthermore, we thoroughly investigated the different choices of hyperparameters such as the optimizer, kernel initializer, and activation function. Evaluated on the publicly available 'Arabic Handwritten Character Dataset (AHCD)' and 'Modified Arabic handwritten digits Database (MadBase)' datasets, the proposed model achieved accuracies of 96.93% and 99.35%, respectively, which is comparable to the state-of-the-art and makes it a suitable solution for real-life end-level applications.
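
The layer count stated in the abstract can be checked with a small sketch. The ordering of layers inside each block is an assumption made here for illustration (the abstract gives only the counts), and `build_layer_plan` is a hypothetical helper, not code from the paper.

```python
# Hypothetical layer plan that matches the counts in the abstract.
# ASSUMPTION: the order conv -> batch norm -> pooling -> dropout inside a
# block is illustrative only; the abstract does not state the ordering.
BLOCK = ["conv", "batch_norm", "pooling", "dropout"]
NUM_BLOCKS = 4


def build_layer_plan():
    """Return the ordered list of layer kinds for the sketched pipeline."""
    plan = []
    for _ in range(NUM_BLOCKS):
        plan.extend(BLOCK)
    # One global average pooling layer and one dense layer close the network.
    plan.extend(["global_average_pooling", "dense"])
    return plan


if __name__ == "__main__":
    plan = build_layer_plan()
    print(len(plan), "layers:", plan)
```

Four blocks of four layers plus the two closing layers give the 18 layers mentioned in the abstract.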

Fruit Quality Assessment with Densely Connected Convolutional Neural Network

Md. Samin Morshed , Sabbir Ahmed , Tasnim Ahmed , Muhammad Usama Islam , A. B. M. Ashikur Rahman

Category: Computer Vision

2022-12-08

Accurate recognition of food items along with quality assessment is of paramount importance in the agricultural industry. Such automated systems can speed up the wheel of the food processing sector and save tons of manual labor. In this connection, the recent advancement of Deep learning-based architectures has introduced a wide variety of solutions offering remarkable performance in several classification tasks. In this work, we have exploited the concept of Densely Connected Convolutional Neural Networks (DenseNets) for fruit quality assessment. The feature propagation towards the deeper layers has enabled the network to tackle the vanishing gradient problems and ensured the reuse of features to learn meaningful insights. Evaluating on a dataset of 19,526 images containing six fruits having three quality grades for each, the proposed pipeline achieved a remarkable accuracy of 99.67%. The robustness of the model was further tested for fruit classification and quality assessment tasks where the model produced a similar performance, which makes it suitable for real-life applications.

A Comparative Study on COVID-19 Fake News Detection Using Different Transformer Based Models

Sajib Kumar Saha Joy , Dibyo Fabian Dofadar , Riyo Hayat Khan , Md. Sabbir Ahmed , Rafeed Rahman

Category: Natural Language Processing | Machine Learning

2022-08-02

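The rapid growth of social networks and the easy availability of the internet have intensified the spread of fake news and rumors on social media sites. During the COVID-19 pandemic, such misleading information has aggravated the situation by putting people's mental and physical lives at risk. To limit the spread of such inaccuracies, identifying fake news on online platforms could be a first step. In this study, the authors implemented five Transformer-based models, namely BERT, BERT without LSTM, ALBERT, RoBERTa, and a hybrid of BERT and ALBERT, to detect COVID-19 fake news from the internet. A COVID-19 fake news dataset was used to train and test the models. Among these models, RoBERTa performed best, obtaining an F1 score of 0.98 in both the real and fake classes.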

Two Decades of Bengali Handwritten Digit Recognition: A Survey

A. B. M. Ashikur Rahman , Md. Bakhtiar Hasan , Sabbir Ahmed , Tasnim Ahmed , Md. Hamjajul Ashmafee , Mohammad Ridwan Kabir , Md. Hasanul Kabir

Category: Computer Vision

2022-06-05

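Handwritten Digit Recognition (HDR) is one of the most challenging tasks in the field of Optical Character Recognition (OCR). Irrespective of language, HDR poses some inherent challenges, which mostly arise from variations in writing style across individuals, variations in the writing medium and environment, the inability to maintain the same strokes when writing a digit repeatedly, and so on. In addition, the structural complexity of the digits of a particular language may lead to ambiguity in HDR. Over the years, researchers have developed numerous offline and online HDR pipelines in which different image processing techniques are combined with traditional Machine Learning (ML)-based and/or Deep Learning (DL)-based architectures. Although the literature contains extensive review studies on HDR for languages such as English, Arabic, Indian, Farsi, and Chinese, there are hardly any surveys on Bengali HDR (BHDR), and those that exist lack a comprehensive analysis of the challenges, the underlying recognition process, and possible future directions. In this paper, the characteristics and inherent ambiguities of Bengali handwritten digits are analyzed, along with a comprehensive insight into two decades of state-of-the-art datasets and approaches for offline BHDR. Several real-life application-specific studies involving BHDR are also discussed in detail. This paper will also serve as a compendium for researchers interested in the science behind offline BHDR, encouraging the exploration of new avenues of relevant research that may lead to better offline recognition of Bengali handwritten digits in different application areas.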

Less is More: Lighter and Faster Deep Neural Architecture for Tomato Leaf Disease Classification

Sabbir Ahmed , Md. Bakhtiar Hasan , Tasnim Ahmed , Redwan Karim Sony , Md. Hasanul Kabir

Category: Computer Vision | Machine Learning

2021-09-06

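To ensure global food security and the overall profit of stakeholders, correctly detecting and classifying plant diseases is of great importance. In this regard, the emergence of deep learning-based image classification has introduced a substantial number of solutions. However, applying these solutions on low-end devices requires fast, accurate, and computationally inexpensive systems. This work proposes a lightweight transfer learning-based approach for detecting diseases from tomato leaves. It uses an effective preprocessing method to enhance leaf images with illumination correction for improved classification. Our system extracts features using a combined model consisting of a pretrained MobileNetV2 architecture and a classifier network for effective prediction. Traditional augmentation approaches are replaced by runtime augmentation to avoid data leakage and address the class imbalance issue. Evaluation on tomato leaf images from the PlantVillage dataset shows that the proposed architecture achieves 99.30% accuracy with a model size of 9.60 MB and 487 million floating-point operations, making it a suitable choice for real-life applications on low-end devices. Our code and models are available at https://github.com/redwankarimsony/project-tomato.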

Rethinking Cooking State Recognition with Vision Transformers

Akib Mohammed Khan , Alif Ashrafee , Reeshoon Sayera , Shahriar Ivan , Sabbir Ahmed

Category: Computer Vision

2022-12-16

To ensure proper knowledge representation of the kitchen environment, it is vital for kitchen robots to recognize the states of the food items that are being cooked. Although the domain of object detection and recognition has been extensively studied, the task of object state classification has remained relatively unexplored. The high intra-class similarity of ingredients during different states of cooking makes the task even more challenging. Researchers have proposed adopting Deep Learning-based strategies in recent times; however, these are yet to achieve high performance. In this study, we utilized the self-attention mechanism of the Vision Transformer (ViT) architecture for the Cooking State Recognition task. The proposed approach encapsulates the globally salient features from images, while also exploiting the weights learned from a larger dataset. This global attention allows the model to withstand the similarities between samples of different cooking objects, while the employment of transfer learning helps to overcome the lack of inductive bias by utilizing pretrained weights. To improve recognition accuracy, several augmentation techniques have been employed as well. Evaluated on the 'Cooking State Recognition Challenge Dataset', our proposed framework achieved an accuracy of 94.3%, which significantly outperforms the state-of-the-art.

Converting OpenStreetMap Data to Road Networks for Downstream Applications

Md Kaisar Ahmed

Category: Machine Learning

2022-11-22

We study how to convert OpenStreetMap (OSM) data to road networks for downstream applications. OSM data is available in several formats, one of which is Extensible Markup Language (XML). OSM data consist of nodes, ways, and relations. We process OSM XML data to extract the nodes and ways and obtain a street map of the Memphis area. This map can then be used for different downstream applications.
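
The extraction of nodes and ways described above can be sketched with Python's standard XML parser. This is a minimal illustration, not the author's implementation: the sample document, its coordinates, the function name `parse_road_network`, and the choice to keep only ways carrying a `highway` tag are all assumptions made here.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: a tiny made-up OSM XML extract; IDs and coordinates are
# illustrative and do not come from the paper's Memphis data.
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="35.1495" lon="-90.0490"/>
  <node id="2" lat="35.1500" lon="-90.0480"/>
  <node id="3" lat="35.1510" lon="-90.0470"/>
  <way id="100">
    <nd ref="1"/>
    <nd ref="2"/>
    <nd ref="3"/>
    <tag k="highway" v="residential"/>
  </way>
  <way id="200">
    <nd ref="1"/>
    <nd ref="3"/>
    <tag k="building" v="yes"/>
  </way>
</osm>"""


def parse_road_network(osm_xml):
    """Return (nodes, edges) for the road ways found in an OSM XML string."""
    root = ET.fromstring(osm_xml)

    # Every <node> carries an id plus latitude/longitude attributes.
    nodes = {}
    for node in root.iter("node"):
        nodes[node.get("id")] = (float(node.get("lat")), float(node.get("lon")))

    # A <way> is an ordered list of node references; consecutive references
    # become edges. ASSUMPTION: only ways tagged "highway" count as roads.
    edges = []
    for way in root.iter("way"):
        tags = {t.get("k"): t.get("v") for t in way.findall("tag")}
        if "highway" not in tags:
            continue
        refs = [nd.get("ref") for nd in way.findall("nd")]
        for u, v in zip(refs, refs[1:]):
            if u in nodes and v in nodes:
                edges.append((u, v))
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = parse_road_network(SAMPLE_OSM)
    print(len(nodes), "nodes;", "edges:", edges)
```

The resulting node dictionary and edge list can be loaded into any graph structure for routing or similar downstream tasks; relations are ignored in this sketch.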

Traffic Congestion Prediction using Deep Convolutional Neural Networks: A Color-coding Approach

Mirza Fuad Adnan , Nadim Ahmed , Imrez Ishraque , Md. Sifath Al Amin , Md. Sumit Hasan

Category: Computer Vision | Artificial Intelligence

2022-09-16

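Thanks to recent advances in computer vision, traffic video data has become a key factor in determining traffic congestion conditions. This work proposes a unique technique that applies a color-coding scheme to traffic data before training a deep convolutional neural network. First, the video data is converted into an image dataset. Vehicles are then detected using the You Only Look Once (YOLO) algorithm. A color-coding scheme is then used to transform the image dataset into a binary image dataset. These binary images are fed to a deep convolutional neural network. On the UCSD dataset, we obtained a classification accuracy of 98.2%.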
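
One plausible reading of the color-coding step is sketched below: pixels inside detected vehicle boxes are set to 1 and all other pixels to 0. The abstract does not specify the exact scheme, so this encoding, the `(x_min, y_min, x_max, y_max)` box format, and the function name `boxes_to_binary_mask` are assumptions for illustration only.

```python
def boxes_to_binary_mask(width, height, boxes):
    """Build a binary image (list of rows) from vehicle bounding boxes.

    ASSUMPTION: each box is (x_min, y_min, x_max, y_max) in pixel
    coordinates, with the max values exclusive; this format is hypothetical
    and not taken from the paper.
    """
    mask = [[0] * width for _ in range(height)]
    for x_min, y_min, x_max, y_max in boxes:
        # Clip each box to the image bounds before painting it.
        for y in range(max(0, y_min), min(height, y_max)):
            for x in range(max(0, x_min), min(width, x_max)):
                mask[y][x] = 1
    return mask


if __name__ == "__main__":
    # One made-up detection on a 4x3 frame.
    for row in boxes_to_binary_mask(4, 3, [(1, 0, 3, 2)]):
        print(row)
```

A binary image of this kind discards appearance details and keeps only where vehicles are, which is the sort of input the abstract says is fed to the classification network.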

Leveraging Smartphone Sensors for Detecting Abnormal Gait for Smart Wearable Mobile Technologies

Md Shahriar Tasjid , Ahmed Al Marouf

Category: Computer Vision | Machine Learning

2022-08-03

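Walking is one of the most common modes of terrestrial locomotion for humans and is essential for most daily activities. When a person walks, there is a pattern in it, known as gait. Gait analysis is used in sports and healthcare. Gait can be analyzed in different ways, for example from video captured by surveillance cameras or by depth image cameras in a lab environment. It can also be recognized by wearable sensors such as accelerometers, force sensors, gyroscopes, flexible goniometers, magnetoresistive sensors, electromagnetic tracking systems, and electromyography (EMG). Analysis through these sensors requires lab conditions, or users must wear the sensors. To detect abnormality in a person's gait, the sensors need to be incorporated separately. Once an abnormal gait is detected, it can tell us about the person's health condition. Distinguishing regular gait from abnormal gait using smart wearable technologies may give insight into a subject's health. Therefore, in this paper, we propose a method for analyzing abnormal gait through smartphone sensors. Nowadays most people use smart devices such as smartphones and smartwatches, so the sensors of these smart wearable devices can be used to track their gait.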
