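Histopathological cancer diagnostics has become more complex, and the increasing number of biopsies is a challenge for most pathology laboratories. The development of automatic methods for evaluating histopathological cancer sections would therefore be of value. In this study, we used 624 whole slide images (WSIs) of breast cancer from a Norwegian cohort. We propose a cascaded convolutional neural network design, called H2G-Net, for semantic segmentation of gigapixel histopathological images. The design involves a detection stage using a patch-wise method and a refinement stage using a convolutional autoencoder. To validate the design, we conducted an ablation study to assess the impact of selected components in the pipeline on tumour segmentation. Guiding segmentation, using hierarchical sampling and deep heatmap refinement, proved to be beneficial when segmenting the histopathological images. We found a significant improvement when using a refinement network to post-process the generated tumour segmentation heatmaps. The overall best design achieved a Dice score of 0.933 on an independent test set of 90 WSIs. The design outperformed single-resolution approaches, such as cluster-guided, patch-wise high-resolution classification using MobileNetV2 (0.872) and a low-resolution U-Net (0.874). In addition, segmentation of a representative x400 WSI took ~58 seconds using only the CPU. The findings demonstrate the potential of using a refinement network to improve patch-wise predictions. The solution is efficient and does not require overlapping patch inference or ensembling. Furthermore, we show that deep neural networks can be trained using a random sampling scheme that balances on multiple different labels simultaneously, without the need to store patches on disk. Future work should involve more efficient patch generation and sampling, as well as improved clustering.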
translated by 谷歌翻译
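The Dice score reported in the H2G-Net abstract is the standard overlap measure between a predicted and a reference binary mask, Dice = 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that formula using NumPy; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def dice_score(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return float(2.0 * intersection / denom)


if __name__ == "__main__":
    # Toy 2x2 masks: one overlapping pixel, three foreground pixels in total.
    pred = np.array([[1, 1], [0, 0]])
    target = np.array([[1, 0], [0, 0]])
    print(dice_score(pred, target))
```

For gigapixel WSIs the masks would typically be compared at a reduced resolution, but the abstract does not state how the score was computed, so that detail is left open here.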
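Machine learning (ML) has achieved great success in recent decades, both in research and in practice. In Cyber-Physical Systems (CPS), ML is used, for example, to optimize systems, to detect anomalies, or to identify the root causes of system failures. However, existing algorithms suffer from two major drawbacks: (i) they are hard for human experts to interpret, and (ii) transferring results from one system to another (similar) system is often a challenge. Concept learning, or Representation Learning (RepL), is a solution to both drawbacks. It mimics the human approach to explainability and transferability: by learning general concepts such as physical quantities or system states, the model becomes interpretable by humans. Furthermore, concepts at this level of abstraction can usually be applied to a wide range of different systems. Modern ML methods are already widely used in CPS, but concept learning and transfer learning have hardly been used so far. In this paper, we provide an overview of the current state of research on methods for learning physical concepts in time series data, which is the primary form of sensor data in CPS. We also analyze the most important methods from the state of the art using the example of a three-tank system. Based on these concrete implementations, we discuss the advantages and disadvantages of the methods and show for which purposes and under which conditions they can be used.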
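The application of deep learning to histopathological whole slide images (WSIs) holds promise for improving diagnostic efficiency and reproducibility, but it is largely dependent on the ability to write computer code or purchase commercial solutions. We present a code-free pipeline using free-to-use, open-source software (QuPath, DeepMIB, and FastPathology) for creating and deploying deep learning-based segmentation models for computational pathology. We demonstrate the pipeline on a use case of separating epithelium from stroma in colonic mucosa. A dataset of 251 annotated WSIs, comprising 140 hematoxylin-eosin (HE)-stained and 111 CD3-immunostained colon biopsy WSIs, was developed through active learning using the pipeline. On a hold-out test set of 36 HE-stained and 21 CD3-stained WSIs, mean intersection over union scores of 96.6% and 95.3%, respectively, were achieved on epithelium segmentation. We demonstrate pathologist-level segmentation accuracy and clinically acceptable runtime performance, and show that pathologists without programming experience can create near state-of-the-art segmentation solutions for histopathological WSIs using only free-to-use software. The study further demonstrates the strength of open-source solutions in their ability to create generalizable, open pipelines, from which trained models and predictions can be seamlessly exported in open formats and thereby used in external solutions. All scripts, trained models, a video tutorial, and the full dataset of 251 WSIs are made openly available at https://github.com/andreped/nocodeSeg to accelerate research in the field.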
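The mean intersection over union (IoU) figures quoted above can be illustrated with a small sketch. IoU for one pair of binary masks is |A ∩ B| / |A ∪ B|, and the mean is taken over a set of such pairs. The abstract does not say whether the average is taken per WSI, per class, or per patch, so averaging over an arbitrary list of mask pairs is an assumption here, as are the function names `iou` and `mean_iou`.

```python
import numpy as np


def iou(pred, target) -> float:
    """Intersection over union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


def mean_iou(pairs) -> float:
    """Average IoU over an iterable of (prediction, reference) mask pairs."""
    return float(np.mean([iou(p, t) for p, t in pairs]))


if __name__ == "__main__":
    pairs = [
        (np.array([[1, 1], [0, 0]]), np.array([[1, 0], [0, 0]])),  # IoU 1/2
        (np.array([[1, 1], [1, 1]]), np.array([[1, 1], [1, 1]])),  # IoU 1
    ]
    print(mean_iou(pairs))
```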
Distribution shifts, where the training distribution differs from the test distribution, can substantially degrade the accuracy of machine learning (ML) systems deployed in the wild. Despite their ubiquity in real-world deployments, these distribution shifts are under-represented in the datasets widely used in the ML community today. To address this gap, we present WILDS, a curated benchmark of 10 datasets reflecting a diverse range of distribution shifts that naturally arise in real-world applications, such as shifts across hospitals for tumor identification; across camera traps for wildlife monitoring; and across time and location in satellite imaging and poverty mapping. On each dataset, we show that standard training yields substantially lower out-of-distribution than in-distribution performance. This gap remains even with models trained by existing methods for tackling distribution shifts, underscoring the need for new methods for training models that are more robust to the types of distribution shifts that arise in practice. To facilitate method development, we provide an open-source package that automates dataset loading, contains default model architectures and hyperparameters, and standardizes evaluations. Code and leaderboards are available at https://wilds.stanford.edu.
Large, labeled datasets have driven deep learning methods to achieve expert-level performance on a variety of medical imaging tasks. We present CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240 patients. We design a labeler to automatically detect the presence of 14 observations in radiology reports, capturing uncertainties inherent in radiograph interpretation. We investigate different approaches to using the uncertainty labels for training convolutional neural networks that output the probability of these observations given the available frontal and lateral radiographs. On a validation set of 200 chest radiographic studies which were manually annotated by 3 board-certified radiologists, we find that different uncertainty approaches are useful for different pathologies. We then evaluate our best model on a test set composed of 500 chest radiographic studies annotated by a consensus of 5 board-certified radiologists, and compare the performance of our model to that of 3 additional radiologists in the detection of 5 selected pathologies. On Cardiomegaly, Edema, and Pleural Effusion, the model ROC and PR curves lie above all 3 radiologist operating points. We release the dataset to the public as a standard benchmark to evaluate performance of chest radiograph interpretation models.
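To make the idea of "different approaches to using the uncertainty labels" concrete, the sketch below shows three simple policies for a multi-label target vector: map uncertain entries to negative, map them to positive, or exclude them from the loss. The encoding of an uncertain label as -1, the policy names, and the helpers `resolve_uncertain` and `masked_bce` are assumptions made for illustration; the abstract itself does not specify which approaches were compared or how labels are encoded.

```python
import numpy as np

UNCERTAIN = -1.0  # assumed encoding of an "uncertain" label


def resolve_uncertain(labels, policy: str):
    """Return (targets, mask) for one label vector under a given policy.

    policy: "zeros"  -> uncertain treated as negative
            "ones"   -> uncertain treated as positive
            "ignore" -> uncertain excluded from the loss via the mask
    """
    labels = np.asarray(labels, dtype=float)
    targets = labels.copy()
    mask = np.ones_like(labels)  # 1 = entry contributes to the loss
    uncertain = labels == UNCERTAIN
    if policy == "zeros":
        targets[uncertain] = 0.0
    elif policy == "ones":
        targets[uncertain] = 1.0
    elif policy == "ignore":
        targets[uncertain] = 0.0  # placeholder value, masked out below
        mask[uncertain] = 0.0
    else:
        raise ValueError(f"unknown policy: {policy}")
    return targets, mask


def masked_bce(probs, targets, mask, eps: float = 1e-7) -> float:
    """Binary cross-entropy averaged over unmasked entries only."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    loss = -(targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs))
    return float((loss * mask).sum() / max(mask.sum(), 1.0))


if __name__ == "__main__":
    labels = [1, 0, -1]  # positive, negative, uncertain
    for policy in ("zeros", "ones", "ignore"):
        targets, mask = resolve_uncertain(labels, policy)
        print(policy, targets, mask, masked_bce([0.5, 0.5, 0.5], targets, mask))
```

Which policy works best would be expected to vary by observation, consistent with the abstract's finding that different uncertainty approaches suit different pathologies.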
Designing experiments often requires balancing between learning about the true treatment effects and earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can simultaneously attain optimality and computational efficiency goals, and it has recently been used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially distributed rewards. We report its performance in simulated 2-armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel GI-modified design shows operating characteristics that are comparable in learning (e.g. statistical power) but substantially better in earning (e.g. direct benefits). This illustrates the potential of designs that use a GI approach to allocate participants to improve participant benefits, increase efficiency, and reduce experimental costs in adaptive multi-armed experiments with exponential rewards.
Quadruped robots are currently used in industrial robotics as mechanical aids to automate several routine tasks. However, the use of such robots in a domestic setting is still largely a research topic. This paper discusses the understanding and virtual simulation of such a robot, one capable of detecting and understanding human emotions, generating its gait, and responding via sounds and expressions on a screen. To this end, we use a combination of reinforcement learning and software engineering concepts to simulate a quadruped robot that can understand emotions, navigate various terrains, detect sound sources, and respond to emotions using audio-visual feedback. This paper aims to establish a framework for simulating a quadruped robot that is emotionally intelligent and can respond to audio-visual stimuli primarily through motor or audio responses. Emotion detection from speech was not as performant as ERANNs or Zeta Policy learning, but still achieved an accuracy of 63.5%. The video emotion detection system produced results that are almost on par with the state of the art, with an accuracy of 99.66%. Owing to its "on-policy" learning process, the PPO algorithm learned extremely quickly, allowing the simulated dog to demonstrate a remarkably seamless gait across the different cadences and variations. This enabled the quadruped robot to respond to generated stimuli, allowing us to conclude that it functions as predicted and satisfies the aim of this work.
Real-world robotic grasping can be done robustly if a complete 3D Point Cloud Data (PCD) of an object is available. However, in practice, PCDs are often incomplete when objects are viewed from few and sparse viewpoints before the grasping action, leading to the generation of wrong or inaccurate grasp poses. We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry from the partial PCD to produce reliable grasp poses. Our proposed PCD completion network is a Transformer-based encoder-decoder network with an Offset-Attention layer. Our network is inherently invariant to the object pose and to point permutation, and it generates PCDs that are geometrically consistent and properly completed. Experiments on a wide range of partial PCDs show that 3DSGrasp outperforms the best state-of-the-art method on PCD completion tasks and largely improves the grasping success rate in real-world scenarios. The code and dataset will be made available upon acceptance.
When robots learn reward functions using high capacity models that take raw state directly as input, they need to both learn a representation for what matters in the task (the task "features") as well as how to combine these features into a single objective. If they try to do both at once from input designed to teach the full reward function, it is easy to end up with a representation that contains spurious correlations in the data, which fails to generalize to new settings. Instead, our ultimate goal is to enable robots to identify and isolate the causal features that people actually care about and use when they represent states and behavior. Our idea is that we can tune into this representation by asking users what behaviors they consider similar: behaviors will be similar if the features that matter are similar, even if low-level behavior is different; conversely, behaviors will be different if even one of the features that matter differs. This, in turn, is what enables the robot to disambiguate between what needs to go into the representation versus what is spurious, as well as what aspects of behavior can be compressed together versus not. The notion of learning representations based on similarity has a nice parallel in contrastive learning, a self-supervised representation learning technique that maps visually similar data points to similar embeddings, where similarity is defined by a designer through data augmentation heuristics. By contrast, in order to learn the representations that people use, so we can learn their preferences and objectives, we use their definition of similarity. In simulation as well as in a user study, we show that learning through such similarity queries leads to representations that, while far from perfect, are indeed more generalizable than self-supervised and task-input alternatives.
and widely used information measurement metric, particularly popularized for SSVEP-based Brain-Computer Interfaces (BCIs). By combining speed and accuracy into a single-valued parameter, this metric aids in the evaluation and comparison of various target identification algorithms across different BCI communities. To accurately depict performance and inspire an end-to-end design for futuristic BCI designs, a more thorough examination and definition of ITR is therefore required. We model the symbiotic communication medium, hosted by the retinogeniculate visual pathway, as a discrete memoryless (DM) channel and use the modified capacity expressions to redefine the ITR. We use graph theory to characterize the relationship between the asymmetry of the transition statistics and the ITR gain with the new definition, leading to potential bounds on data rate performance. On two well-known SSVEP datasets, we compared two cutting-edge target identification methods. Results indicate that the induced DM channel asymmetry has a greater impact on the actual perceived ITR than the change in input distribution. Moreover, it is demonstrated that the ITR gain under the new definition is inversely correlated with the asymmetry in the channel transition statistics. Individual input customizations are further shown to yield perceived ITR performance improvements. An algorithm is proposed to find the capacity of binary classification, and further discussion is given on extending such results to ensemble techniques. We anticipate that the results of our study will contribute to the characterization of the highly dynamic BCI channel capacities, performance thresholds, and improved BCI stimulus designs for a tighter symbiosis between the human brain and computer systems while enhancing the efficiency of the underlying communication resources.
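For context, the conventional ITR that BCI studies usually report (often attributed to Wolpaw et al.) assumes equiprobable targets and errors spread evenly over the wrong targets: B = log2(N) + P log2(P) + (1 - P) log2((1 - P) / (N - 1)) bits per selection. The sketch below computes that conventional value alongside the mutual information of a general discrete memoryless channel, to show how an asymmetric transition matrix changes the figure. It is not the redefinition proposed in the abstract, whose exact form is not given there; all function names are hypothetical.

```python
import math

import numpy as np


def wolpaw_itr_bits(n_targets: int, accuracy: float) -> float:
    """Conventional ITR in bits per selection (uniform input, symmetric errors)."""
    if n_targets < 2:
        raise ValueError("need at least two targets")
    p = accuracy
    bits = math.log2(n_targets)
    if p > 0.0:
        bits += p * math.log2(p)
    if p < 1.0:
        bits += (1.0 - p) * math.log2((1.0 - p) / (n_targets - 1))
    return bits


def itr_bits_per_min(n_targets: int, accuracy: float, seconds_per_selection: float) -> float:
    """Scale bits per selection to bits per minute."""
    return wolpaw_itr_bits(n_targets, accuracy) * 60.0 / seconds_per_selection


def mutual_information_bits(input_dist, transition) -> float:
    """I(X;Y) in bits; transition[x, y] = P(Y = y | X = x)."""
    px = np.asarray(input_dist, dtype=float)
    pyx = np.asarray(transition, dtype=float)
    joint = px[:, None] * pyx          # P(x, y)
    py = joint.sum(axis=0)             # P(y)
    indep = px[:, None] * py[None, :]  # P(x) P(y)
    nz = joint > 0                     # skip zero-probability terms
    return float((joint[nz] * np.log2(joint[nz] / indep[nz])).sum())


if __name__ == "__main__":
    n, p = 4, 0.9
    symmetric = np.full((n, n), (1.0 - p) / (n - 1))
    np.fill_diagonal(symmetric, p)
    uniform = np.full(n, 1.0 / n)
    # For a symmetric channel with uniform input the two quantities coincide.
    print(wolpaw_itr_bits(n, p), mutual_information_bits(uniform, symmetric))
```

Replacing `symmetric` with a confusion matrix estimated from data, or `uniform` with a non-uniform input distribution, gives a value that generally differs from the conventional formula, which is the kind of discrepancy the abstract attributes to channel asymmetry and input customization.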