The most widely studied explainable AI (XAI) approaches are unsound. This is the case with well-known model-agnostic explanation approaches, and it is also the case with approaches based on saliency maps. One solution is to consider intrinsic interpretability, which does not exhibit the drawback of unsoundness. Unfortunately, intrinsic interpretability can display unwieldy explanation redundancy. Formal explainability represents the alternative to these non-rigorous approaches, with one example being PI-explanations. Unfortunately, PI-explanations also exhibit important drawbacks, the most visible of which is arguably their size. Recently, it has been observed that the (absolute) rigor of PI-explanations can be traded off for a smaller explanation size, by computing the so-called relevant sets. Given some positive {\delta}, a set S of features is {\delta}-relevant if, when the features in S are fixed, the probability of getting the target class exceeds {\delta}. However, even for very simple classifiers, the complexity of computing relevant sets of features is prohibitive, with the decision problem being NPPP-complete for circuit-based classifiers. In contrast with earlier negative results, this paper investigates practical approaches for computing relevant sets for a number of widely used classifiers that include Decision Trees (DTs), Naive Bayes Classifiers (NBCs), and several families of classifiers obtained from propositional languages. Moreover, the paper shows that, in practice, and for these families of classifiers, relevant sets are easy to compute. Furthermore, the experiments confirm that succinct sets of relevant features can be obtained for the families of classifiers considered.
translated by 谷歌翻译
Plastic shopping bags that get carried away from the side of roads and tangled on cotton plants can end up at cotton gins if not removed before the harvest. Such bags may not only cause problem in the ginning process but might also get embodied in cotton fibers reducing its quality and marketable value. Therefore, it is required to detect, locate, and remove the bags before cotton is harvested. Manually detecting and locating these bags in cotton fields is labor intensive, time-consuming and a costly process. To solve these challenges, we present application of four variants of YOLOv5 (YOLOv5s, YOLOv5m, YOLOv5l and YOLOv5x) for detecting plastic shopping bags using Unmanned Aircraft Systems (UAS)-acquired RGB (Red, Green, and Blue) images. We also show fixed effect model tests of color of plastic bags as well as YOLOv5-variant on average precision (AP), mean average precision (mAP@50) and accuracy. In addition, we also demonstrate the effect of height of plastic bags on the detection accuracy. It was found that color of bags had significant effect (p < 0.001) on accuracy across all the four variants while it did not show any significant effect on the AP with YOLOv5m (p = 0.10) and YOLOv5x (p = 0.35) at 95% confidence level. Similarly, YOLOv5-variant did not show any significant effect on the AP (p = 0.11) and accuracy (p = 0.73) of white bags, but it had significant effects on the AP (p = 0.03) and accuracy (p = 0.02) of brown bags including on the mAP@50 (p = 0.01) and inference speed (p < 0.0001). Additionally, height of plastic bags had significant effect (p < 0.0001) on overall detection accuracy. The findings reported in this paper can be useful in speeding up removal of plastic bags from cotton fields before harvest and thereby reducing the amount of contaminants that end up at cotton gins.
translated by 谷歌翻译
Data-driven modeling approaches such as jump tables are promising techniques to model populations of resistive random-access memory (ReRAM) or other emerging memory devices for hardware neural network simulations. As these tables rely on data interpolation, this work explores the open questions about their fidelity in relation to the stochastic device behavior they model. We study how various jump table device models impact the attained network performance estimates, a concept we define as modeling bias. Two methods of jump table device modeling, binning and Optuna-optimized binning, are explored using synthetic data with known distributions for benchmarking purposes, as well as experimental data obtained from TiOx ReRAM devices. Results on a multi-layer perceptron trained on MNIST show that device models based on binning can behave unpredictably particularly at low number of points in the device dataset, sometimes over-promising, sometimes under-promising target network accuracy. This paper also proposes device level metrics that indicate similar trends with the modeling bias metric at the network level. The proposed approach opens the possibility for future investigations into statistical device models with better performance, as well as experimentally verified modeling bias in different in-memory computing and neural network architectures.
translated by 谷歌翻译
Despite being responsible for state-of-the-art results in several computer vision and natural language processing tasks, neural networks have faced harsh criticism due to some of their current shortcomings. One of them is that neural networks are correlation machines prone to model biases within the data instead of focusing on actual useful causal relationships. This problem is particularly serious in application domains affected by aspects such as race, gender, and age. To prevent models from incurring on unfair decision-making, the AI community has concentrated efforts in correcting algorithmic biases, giving rise to the research area now widely known as fairness in AI. In this survey paper, we provide an in-depth overview of the main debiasing methods for fairness-aware neural networks in the context of vision and language research. We propose a novel taxonomy to better organize the literature on debiasing methods for fairness, and we discuss the current challenges, trends, and important future work directions for the interested researcher and practitioner.
translated by 谷歌翻译
Recent studies suggest that early stages of diabetic retinopathy (DR) can be diagnosed by monitoring vascular changes in the deep vascular complex. In this work, we investigate a novel method for automated DR grading based on optical coherence tomography angiography (OCTA) images. Our work combines OCTA scans with their vessel segmentations, which then serve as inputs to task specific networks for lesion segmentation, image quality assessment and DR grading. For this, we generate synthetic OCTA images to train a segmentation network that can be directly applied on real OCTA data. We test our approach on MICCAI 2022's DR analysis challenge (DRAC). In our experiments, the proposed method performs equally well as the baseline model.
translated by 谷歌翻译
病理学家对患病组织的视觉微观研究一直是一个多世纪以来癌症诊断和预后的基石。最近,深度学习方法在组织图像的分析和分类方面取得了重大进步。但是,关于此类模型在生成组织病理学图像的实用性方面的工作有限。这些合成图像在病理学中有多种应用,包括教育,熟练程度测试,隐私和数据共享的公用事业。最近,引入了扩散概率模型以生成高质量的图像。在这里,我们首次研究了此类模型的潜在用途以及优先的形态加权和颜色归一化,以合成脑癌的高质量组织病理学图像。我们的详细结果表明,与生成对抗网络相比,扩散概率模型能够合成各种组织病理学图像,并且具有较高的性能。
translated by 谷歌翻译
文本分类在许多真实世界的情况下可能很有用,为最终用户节省了很多时间。但是,构建自定义分类器通常需要编码技能和ML知识,这对许多潜在用户构成了重大障碍。为了提高此障碍,我们介绍了标签侦探,这是一种免费的开源系统,用于标记和创建文本分类器。该系统对于(a)是一个无代码系统是独一无二的分类器在几个小时内,(c)开发用于开发人员进行配置和扩展。通过开放采购标签侦探,我们希望建立一个用户和开发人员社区,以扩大NLP模型的利用率。
translated by 谷歌翻译
Boll Weevil(Anthonomus Grandis L.)是一种严重的害虫,主要以棉花为食。由于亚热带气候条件,在德克萨斯州的下里奥格兰德山谷等地方,棉花植物可以全年生长,因此,收获期间上一个季节的剩下的种子可以在玉米中的旋转中继续生长(Zea Mays L.)和高粱(高粱双色L.)。这些野性或志愿棉花(VC)植物到达Pinhead平方阶段(5-6叶阶段)可以充当Boll Weevil Pest的宿主。得克萨斯州的鲍尔象鼻虫根除计划(TBWEP)雇用人们在道路或田野侧面生长的风险投资和消除旋转作物的田间生长,但在田野中生长的植物仍未被发现。在本文中,我们证明了基于您的计算机视觉(CV)算法的应用,仅在三个不同的生长阶段(V3,V6)(V3,V6)中检测出在玉米场中生长的VC植物,以检测在玉米场中生长的VC植物的应用。使用无人飞机系统(UAS)遥感图像。使用Yolov5(S,M,L和X)的所有四个变体,并根据分类精度,平均平均精度(MAP)和F1得分进行比较。发现Yolov5s可以在玉米的V6阶段检测到最大分类精度为98%,地图为96.3%,而Yolov5s和Yolov5m的地图为96.3%,而Yolov5m的分类精度为85%,Yolov5m和Yolov5m的分类准确性最小,而Yolov5L的分类精度最少。在VT阶段,在尺寸416 x 416像素的图像上为86.5%。开发的CV算法有可能有效地检测和定位在玉米场中间生长的VC植物,并加快TBWEP的管理方面。
translated by 谷歌翻译
光学相干断层扫描血管造影(OCTA)可以非侵入地对眼睛的循环系统进行图像。为了可靠地表征视网膜脉管系统,有必要自动从这些图像中提取定量指标。这种生物标志物的计算需要对血管进行精确的语义分割。但是,基于深度学习的分割方法主要依赖于使用体素级注释的监督培训,这是昂贵的。在这项工作中,我们提出了一条管道,以合成具有本质上匹配的地面真实标签的大量逼真的八颗图像。从而消除了需要手动注释培训数据的需求。我们提出的方法基于两个新的组成部分:1)基于生理的模拟,该模拟对各种视网膜血管丛进行建模和2)基于物理学的图像增强套件,这些图像增强量模拟了八八章图像采集过程,包括典型文物。在广泛的基准测试实验中,我们通过成功训练视网膜血管分割算法来证明合成数据的实用性。在我们方法的竞争性定量和优越的定性性能的鼓励下,我们认为它构成了一种多功能工具,可以推进对八章图像的定量分析。
translated by 谷歌翻译
Majorana示威者是一项领先的实验,寻找具有高纯净锗探测器(HPGE)的中性s中性双β衰变。机器学习提供了一种最大化这些检测器提供的信息量的新方法,但是与传统分析相比,数据驱动的性质使其不可解释。一项可解释性研究揭示了机器的决策逻辑,使我们能够从机器中学习以反馈传统分析。在这项工作中,我们介绍了Majorana演示者数据的第一个机器学习分析。这也是对任何锗探测器实验的第一个可解释的机器学习分析。训练了两个梯度增强的决策树模型,以从数据中学习,并进行了基于游戏理论的模型可解释性研究,以了解分类功率的起源。通过从数据中学习,该分析识别重建参数之间的相关性,以进一步增强背景拒绝性能。通过从机器中学习,该分析揭示了新的背景类别对相互利用的标准Majorana分析的重要性。该模型与下一代锗探测器实验(如传说)高度兼容,因为它可以同时在大量探测器上进行训练。
translated by 谷歌翻译