In light of the COVID-19 pandemic, patients were required to manually input their daily oxygen saturation (SpO2) and pulse rate (PR) values into a health monitoring system-unfortunately, such a process trend to be an error in typing. Several studies attempted to detect the physiological value from the captured image using optical character recognition (OCR). However, the technology has limited availability with high cost. Thus, this study aimed to propose a novel framework called PACMAN (Pandemic Accelerated Human-Machine Collaboration) with a low-resource deep learning-based computer vision. We compared state-of-the-art object detection algorithms (scaled YOLOv4, YOLOv5, and YOLOR), including the commercial OCR tools for digit recognition on the captured images from pulse oximeter display. All images were derived from crowdsourced data collection with varying quality and alignment. YOLOv5 was the best-performing model against the given model comparison across all datasets, notably the correctly orientated image dataset. We further improved the model performance with the digits auto-orientation algorithm and applied a clustering algorithm to extract SpO2 and PR values. The accuracy performance of YOLOv5 with the implementations was approximately 81.0-89.5%, which was enhanced compared to without any additional implementation. Accordingly, this study highlighted the completion of PACMAN framework to detect and read digits in real-world datasets. The proposed framework has been currently integrated into the patient monitoring system utilized by hospitals nationwide.
translated by 谷歌翻译
The International Workshop on Reading Music Systems (WoRMS) is a workshop that tries to connect researchers who develop systems for reading music, such as in the field of Optical Music Recognition, with other researchers and practitioners that could benefit from such systems, like librarians or musicologists. The relevant topics of interest for the workshop include, but are not limited to: Music reading systems; Optical music recognition; Datasets and performance evaluation; Image processing on music scores; Writer identification; Authoring, editing, storing and presentation systems for music scores; Multi-modal systems; Novel input-methods for music to produce written music; Web-based Music Information Retrieval services; Applications and projects; Use-cases related to written music. These are the proceedings of the 3rd International Workshop on Reading Music Systems, held in Alicante on the 23rd of July 2021.
translated by 谷歌翻译
更换具有智能电表的模拟仪表昂贵,艰巨,远非完全在发展中国家。ParaNa(Copel)(巴西)的能源公司每月执行超过400万米的读数(几乎完全是非智能设备),我们估计其中850万人来自拨号米。因此,基于图像的自动读取系统可以减少人类错误,创建读取证明,并使客户能够通过移动应用程序执行读取本身。我们提出了用于自动拨号抄表(ADMR)的新方法,并在不约束场景中引入ADMR的新数据集,称为UFPR-ADMR-V2。我们的最佳方法将YOLOV4与新的回归方法(ANGREG)结合起来,探讨了几种后处理技术。与以前的作品相比,它降低了1,343至129的平均绝对误差(MAE),并实现了98.90%的仪表识别率(MRR) - 误差容差为1千瓦时(千瓦时)。
translated by 谷歌翻译
X-ray imaging technology has been used for decades in clinical tasks to reveal the internal condition of different organs, and in recent years, it has become more common in other areas such as industry, security, and geography. The recent development of computer vision and machine learning techniques has also made it easier to automatically process X-ray images and several machine learning-based object (anomaly) detection, classification, and segmentation methods have been recently employed in X-ray image analysis. Due to the high potential of deep learning in related image processing applications, it has been used in most of the studies. This survey reviews the recent research on using computer vision and machine learning for X-ray analysis in industrial production and security applications and covers the applications, techniques, evaluation metrics, datasets, and performance comparison of those techniques on publicly available datasets. We also highlight some drawbacks in the published research and give recommendations for future research in computer vision-based X-ray analysis.
translated by 谷歌翻译
水果苍蝇是果实产量最有害的昆虫物种之一。在AlertTrap中,使用不同的最先进的骨干功能提取器(如MobiLenetv1和MobileNetv2)的SSD架构的实现似乎是实时检测问题的潜在解决方案。SSD-MobileNetv1和SSD-MobileNetv2表现良好并导致AP至0.5分别为0.957和1.0。YOLOV4-TINY优于SSD家族,在AP@0.5中为1.0;但是,其吞吐量速度略微慢。
translated by 谷歌翻译
如今,使用微创手术(MIS)进行了更多的手术程序。这是由于其许多好处,例如最小的术后问题,较少的出血,较小的疤痕和快速的康复。但是,MIS的视野,小手术室和对操作场景的间接查看可能导致手术工具发生冲突并可能损害人体器官或组织。因此,通过使用内窥镜视频饲料实时检测和监视手术仪器,可以大大减少MIS问题,并且可以提高手术程序的准确性和成功率。在本文中,研究,分析和评估了对Yolov5对象检测器的一系列改进,以增强手术仪器的检测。在此过程中,我们进行了基于性能的消融研究,探索了改变Yolov5模型的骨干,颈部和锚固结构元素的影响,并注释了独特的内窥镜数据集。此外,我们将消融研究的有效性与其他四个SOTA对象探测器(Yolov7,Yolor,Scaled-Yolov4和Yolov3-SPP)进行了比较。除了Yolov3-SPP(在MAP中具有98.3%的模型性能和相似的推理速度)外,我们的所有基准模型(包括原始的Yolov5)在使用新的内窥镜数据集的实验中超过了我们的顶级精制模型。
translated by 谷歌翻译
The task of locating and classifying different types of vehicles has become a vital element in numerous applications of automation and intelligent systems ranging from traffic surveillance to vehicle identification and many more. In recent times, Deep Learning models have been dominating the field of vehicle detection. Yet, Bangladeshi vehicle detection has remained a relatively unexplored area. One of the main goals of vehicle detection is its real-time application, where `You Only Look Once' (YOLO) models have proven to be the most effective architecture. In this work, intending to find the best-suited YOLO architecture for fast and accurate vehicle detection from traffic images in Bangladesh, we have conducted a performance analysis of different variants of the YOLO-based architectures such as YOLOV3, YOLOV5s, and YOLOV5x. The models were trained on a dataset containing 7390 images belonging to 21 types of vehicles comprising samples from the DhakaAI dataset, the Poribohon-BD dataset, and our self-collected images. After thorough quantitative and qualitative analysis, we found the YOLOV5x variant to be the best-suited model, performing better than YOLOv3 and YOLOv5s models respectively by 7 & 4 percent in mAP, and 12 & 8.5 percent in terms of Accuracy.
translated by 谷歌翻译
2019年冠状病毒疾病(Covid-19)继续自爆发以来对世界产生巨大挑战。为了对抗这种疾病,开发了一系列人工智能(AI)技术,并应用于现实世界的情景,如安全监测,疾病诊断,感染风险评估,Covid-19 CT扫描的病变细分等。 Coronavirus流行病迫使人们佩戴面膜来抵消病毒的传播,这也带来了监控戴着面具的大群人群的困难。在本文中,我们主要关注蒙面面部检测和相关数据集的AI技术。从蒙面面部检测数据集的描述开始,我们调查了最近的进步。详细描述并详细讨论了十三可用数据集。然后,该方法大致分为两类:传统方法和基于神经网络的方法。常规方法通常通过用手工制作的特征升高算法来训练,该算法占少比例。基于神经网络的方法根据处理阶段的数量进一步归类为三个部分。详细描述了代表性算法,与一些简要描述的一些典型技术耦合。最后,我们总结了最近的基准测试结果,讨论了关于数据集和方法的局限性,并扩大了未来的研究方向。据我们所知,这是关于蒙面面部检测方法和数据集的第一次调查。希望我们的调查可以提供一些帮助对抗流行病的帮助。
translated by 谷歌翻译
远程光插图学(RPPG)是一种快速,有效,廉价和方便的方法,用于收集生物识别数据,因为它可以使用面部视频来估算生命体征。事实证明,远程非接触式医疗服务供应在COVID-19大流行期间是可怕的必要性。我们提出了一个端到端框架,以根据用户的视频中的RPPG方法来衡量人们的生命体征,包括心率(HR),心率变异性(HRV),氧饱和度(SPO2)和血压(BP)(BP)(BP)用智能手机相机捕获的脸。我们以实时的基于深度学习的神经网络模型来提取面部标志。通过使用预测的面部标志来提取多个称为利益区域(ROI)的面部斑块(ROI)。应用了几个过滤器,以减少称为血量脉冲(BVP)信号的提取的心脏信号中ROI的噪声。我们使用两个公共RPPG数据集培训和验证了机器学习模型,即Tokyotech RPPG和脉搏率检测(PURE)数据集,我们的模型在其上实现了以下平均绝对错误(MAE):a),HR,1.73和3.95 BEATS- beats-beats-beats-beats-beats-beats-beats-beats-beats-beats-beats-beats-beats-beats-beats-beats-s-s-s-s-s-y-peats-beats-beats-beats-ship-s-s-s-in-chin-p-in-in-in-in-in-c--in-in-c-le-in-in- -t一下制。每分钟(bpm),b)分别为HRV,分别为18.55和25.03 ms,c)对于SPO2,纯数据集上的MAE为1.64。我们在现实生活环境中验证了端到端的RPPG框架,修订,从而创建了视频HR数据集。我们的人力资源估计模型在此数据集上达到了2.49 bpm的MAE。由于没有面对视频的BP测量不存在公开可用的RPPG数据集,因此我们使用了带有指标传感器信号的数据集来训练我们的模型,还创建了我们自己的视频数据集Video-BP。在我们的视频BP数据集中,我们的BP估计模型的收缩压(SBP)达到6.7 mmHg,舒张压(DBP)的MAE为9.6 mmHg。
translated by 谷歌翻译
The International Workshop on Reading Music Systems (WoRMS) is a workshop that tries to connect researchers who develop systems for reading music, such as in the field of Optical Music Recognition, with other researchers and practitioners that could benefit from such systems, like librarians or musicologists. The relevant topics of interest for the workshop include, but are not limited to: Music reading systems; Optical music recognition; Datasets and performance evaluation; Image processing on music scores; Writer identification; Authoring, editing, storing and presentation systems for music scores; Multi-modal systems; Novel input-methods for music to produce written music; Web-based Music Information Retrieval services; Applications and projects; Use-cases related to written music. These are the proceedings of the 2nd International Workshop on Reading Music Systems, held in Delft on the 2nd of November 2019.
translated by 谷歌翻译
每年,AEDESAEGYPTI蚊子都感染了数百万人,如登录,ZIKA,Chikungunya和城市黄热病等疾病。战斗这些疾病的主要形式是通过寻找和消除潜在的蚊虫养殖场来避免蚊子繁殖。在这项工作中,我们介绍了一个全面的空中视频数据集,获得了无人驾驶飞行器,含有可能的蚊帐。使用识别所有感兴趣对象的边界框手动注释视频数据集的所有帧。该数据集被用于开发基于深度卷积网络的这些对象的自动检测系统。我们提出了通过在可以注册检测到的对象的时空检测管道的对象检测流水线中的融合来利用视频中包含的时间信息,这些时间是可以注册检测到的对象的,最大限度地减少最伪正和假阴性的出现。此外,我们通过实验表明使用视频比仅使用框架对马赛克组成马赛克更有利。使用Reset-50-FPN作为骨干,我们可以分别实现0.65和0.77的F $ _1 $ -70分别对“轮胎”和“水箱”的对象级别检测,说明了正确定位潜在蚊子的系统能力育种对象。
translated by 谷歌翻译
The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicle (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings of the individual subchallenges and introduces a new benchmark, called SeaDronesSee Object Detection v2, which extends the previous benchmark by including more classes and footage. We provide statistical and qualitative analyses, and assess trends in the best-performing methodologies of over 130 submissions. The methods are summarized in the appendix. The datasets, evaluation code and the leaderboard are publicly available at https://seadronessee.cs.uni-tuebingen.de/macvi.
translated by 谷歌翻译
Covid-19大流行为感染检测和监测解决方案产生了重大的兴趣和需求。在本文中,我们提出了一种机器学习方法,可以使用在消费者设备上进行的录音来快速分离Covid-19。该方法将信号处理方法与微调深层学习网络相结合,提供了信号去噪,咳嗽检测和分类的方法。我们还开发并部署了一个移动应用程序,使用症状检查器与语音,呼吸和咳嗽信号一起使用,以检测Covid-19感染。该应用程序对两个开放的数据集和最终用户在测试版测试期间收集的嘈杂数据显示了鲁棒性能。
translated by 谷歌翻译
颈腺细胞(GC)检测是计算机辅助诊断宫颈腺癌筛查的关键步骤。精确识别宫颈涂片中的GC是挑战的,其中鳞状细胞是主要的。在整个涂片线索中,广泛存在的分布(OOD)数据可降低机器学习系统用于GC检测的可靠性。尽管,最新的(SOTA)深度学习模型可以胜过感兴趣的预选区域中的病理学家,但是当面对这样的吉吉像素整个滑动图像时,质量假阳性(FP)预测仍无法解决。本文提出了一种基于GC的形态学知识,试图通过八邻居中的自我发项机制来解决FP问题的新极性知识。它估计了GC核的极性方向。作为插件模块,Polarnet可以指导一般对象检测模型的深度功能和预测的置信度。在实验中,我们发现基于四个不同框架的通用模型可以在小图像集中拒绝fp,并将平均精度(地图)的平均值增加$ \ text {0.007} \ sim \ sim \ text {0.015} $,其中平均最高超过了最近的宫颈细胞检测模型0.037。通过插入极地,部署的C ++程序在从外部WSI的前20个GC检测准确性上提高了8.8%,同时牺牲了14.4 s的计算时间。代码可在https://github.com/chrisa142857/polarnet-gcdet中找到
translated by 谷歌翻译
海洋生态系统及其鱼类栖息地越来越重要,因为它们在提供有价值的食物来源和保护效果方面的重要作用。由于它们的偏僻且难以接近自然,因此通常使用水下摄像头对海洋环境和鱼类栖息地进行监测。这些相机产生了大量数字数据,这些数据无法通过当前的手动处理方法有效地分析,这些方法涉及人类观察者。 DL是一种尖端的AI技术,在分析视觉数据时表现出了前所未有的性能。尽管它应用于无数领域,但仍在探索其在水下鱼类栖息地监测中的使用。在本文中,我们提供了一个涵盖DL的关键概念的教程,该教程可帮助读者了解对DL的工作原理的高级理解。该教程还解释了一个逐步的程序,讲述了如何为诸如水下鱼类监测等挑战性应用开发DL算法。此外,我们还提供了针对鱼类栖息地监测的关键深度学习技术的全面调查,包括分类,计数,定位和细分。此外,我们对水下鱼类数据集进行了公开调查,并比较水下鱼类监测域中的各种DL技术。我们还讨论了鱼类栖息地加工深度学习的新兴领域的一些挑战和机遇。本文是为了作为希望掌握对DL的高级了解,通过遵循我们的分步教程而为其应用开发的海洋科学家的教程,并了解如何发展其研究,以促进他们的研究。努力。同时,它适用于希望调查基于DL的最先进方法的计算机科学家,以进行鱼类栖息地监测。
translated by 谷歌翻译
人们的个人卫生习惯在每日生活方式中照顾身体和健康的状况。保持良好的卫生习惯不仅减少了患疾病的机会,而且还可以降低社区中传播疾病的风险。鉴于目前的大流行,每天的习惯,例如洗手或定期淋浴,在人们中至关重要,尤其是对于单独生活在家里或辅助生活设施中的老年人。本文提出了一个新颖的非侵入性框架,用于使用我们采用机器学习技术的振动传感器监测人卫生。该方法基于地球通传感器,数字化器和实用外壳中具有成本效益的计算机板的组合。监测日常卫生常规可能有助于医疗保健专业人员积极主动,而不是反应性,以识别和控制社区内潜在暴发的传播。实验结果表明,将支持向量机(SVM)用于二元分类,在不同卫生习惯的分类中表现出约95%的有希望的准确性。此外,基于树的分类器(随机福雷斯特和决策树)通过实现最高精度(100%)优于其他模型,这意味着可以使用振动和非侵入性传感器对卫生事件进行分类,以监测卫生活动。
translated by 谷歌翻译
2019年12月,一个名为Covid-19的新型病毒导致了迄今为止的巨大因果关系。与新的冠状病毒的战斗在西班牙语流感后令人振奋和恐怖。虽然前线医生和医学研究人员在控制高度典型病毒的传播方面取得了重大进展,但技术也证明了在战斗中的重要性。此外,许多医疗应用中已采用人工智能,以诊断许多疾病,甚至陷入困境的经验丰富的医生。因此,本调查纸探讨了提议的方法,可以提前援助医生和研究人员,廉价的疾病诊断方法。大多数发展中国家难以使用传统方式进行测试,但机器和深度学习可以采用显着的方式。另一方面,对不同类型的医学图像的访问已经激励了研究人员。结果,提出了一种庞大的技术数量。本文首先详细调了人工智能域中传统方法的背景知识。在此之后,我们会收集常用的数据集及其用例日期。此外,我们还显示了采用深入学习的机器学习的研究人员的百分比。因此,我们对这种情况进行了彻底的分析。最后,在研究挑战中,我们详细阐述了Covid-19研究中面临的问题,我们解决了我们的理解,以建立一个明亮健康的环境。
translated by 谷歌翻译
随着物联网,AI和ML/DL算法的出现,数据驱动的医疗应用已成为一种有前途的工具,用于从医学数据设计可靠且可扩展的诊断和预后模型。近年来,这引起了从学术界到工业的广泛关注。这无疑改善了医疗保健提供的质量。但是,由于这些基于AI的医疗应用程序在满足严格的安全性,隐私和服务标准(例如低延迟)方面的困难,因此仍然采用较差。此外,医疗数据通常是分散的和私人的,这使得在人群之间产生强大的结果具有挑战性。联邦学习(FL)的最新发展使得以分布式方式训练复杂的机器学习模型成为可能。因此,FL已成为一个积极的研究领域,尤其是以分散的方式处理网络边缘的医疗数据,以保护隐私和安全问题。为此,本次调查论文重点介绍了数据共享是重大负担的医疗应用中FL技术的当前和未来。它还审查并讨论了当前的研究趋势及其设计可靠和可扩展模型的结果。我们概述了FL将军的统计问题,设备挑战,安全性,隐私问题及其在医疗领域的潜力。此外,我们的研究还集中在医疗应用上,我们重点介绍了全球癌症的负担以及有效利用FL来开发计算机辅助诊断工具来解决这些诊断工具。我们希望这篇评论是一个检查站,以彻底的方式阐明现有的最新最新作品,并为该领域提供开放的问题和未来的研究指示。
translated by 谷歌翻译
肺超声(LUS)可能是唯一可用于连续和周期性监测肺的医学成像方式。这对于在肺部感染开始期间跟踪肺表现或跟踪疫苗接种对肺部的影响非常有用,如Covid-19中的肺部作用。有许多尝试将肺严重程度分为各个类别或自动分割各种LUS地标和表现形式的尝试。但是,所有这些方法均基于训练静态机器学习模型,该模型需要大量临床注释的大数据集,并且在计算上是沉重的,并且大部分时间非现实时间。在这项工作中,提出了一种实时重量的基于活跃的学习方法,以在资源约束设置中在COVID-19的受试者中更快地进行分类。该工具基于您看起来仅一次(YOLO)网络,具有基于各种LUS地标,人工制品和表现形式的标识,肺部感染严重程度的预测,基于主动学习的可能性,提供图像质量的能力。临床医生的反馈或图像质量以及对感染严重程度高的重要框架的汇总,以进一步分析。结果表明,对于LUS地标的预测,该提议的工具在联合(IOU)阈值的交叉点上的平均平均精度(MAP)为66%。在Quadro P4000 GPU运行时,14MB轻量级Yolov5S网络可实现123 fps。该工具可根据作者的要求进行使用和分析。
translated by 谷歌翻译
Video, as a key driver in the global explosion of digital information, can create tremendous benefits for human society. Governments and enterprises are deploying innumerable cameras for a variety of applications, e.g., law enforcement, emergency management, traffic control, and security surveillance, all facilitated by video analytics (VA). This trend is spurred by the rapid advancement of deep learning (DL), which enables more precise models for object classification, detection, and tracking. Meanwhile, with the proliferation of Internet-connected devices, massive amounts of data are generated daily, overwhelming the cloud. Edge computing, an emerging paradigm that moves workloads and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting new intersection, edge video analytics (EVA), begins to attract widespread attention. Nevertheless, only a few loosely-related surveys exist on this topic. A dedicated venue for collecting and summarizing the latest advances of EVA is highly desired by the community. Besides, the basic concepts of EVA (e.g., definition, architectures, etc.) are ambiguous and neglected by these surveys due to the rapid development of this domain. A thorough clarification is needed to facilitate a consensus on these concepts. To fill in these gaps, we conduct a comprehensive survey of the recent efforts on EVA. In this paper, we first review the fundamentals of edge computing, followed by an overview of VA. The EVA system and its enabling techniques are discussed next. In addition, we introduce prevalent frameworks and datasets to aid future researchers in the development of EVA systems. Finally, we discuss existing challenges and foresee future research directions. We believe this survey will help readers comprehend the relationship between VA and edge computing, and spark new ideas on EVA.
translated by 谷歌翻译