Osteoporosis is a common chronic metabolic bone disease that is often under-diagnosed and under-treated due to limited access to bone mineral density (BMD) examinations, e.g., via dual-energy X-ray absorptiometry (DXA). In this paper, we propose a method to predict BMD from chest X-ray (CXR), one of the most common and low-cost medical imaging examinations. Our method first automatically detects regions of interest (ROIs) of local and global bone structures from the CXR. A multi-ROI deep model with a transformer encoder is then developed to exploit both local and global information in the chest X-ray image for accurate BMD estimation. Our method is evaluated on 13,719 CXR patient cases with ground-truth BMD scores measured by gold-standard DXA. The model-predicted BMD correlates strongly with the ground truth (Pearson correlation coefficient 0.889 on lumbar 1). When applied to osteoporosis screening, it achieves high classification performance (AUC 0.963 on lumbar 1). As the first effort to predict BMD from CXR scans, the proposed algorithm holds strong potential for early osteoporosis screening and public health promotion.
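The multi-ROI idea can be illustrated with a toy transformer-style self-attention layer over ROI feature vectors. This is a minimal sketch, not the paper's model: `self_attention` and `predict_bmd` are hypothetical names, identity projections stand in for the learned query/key/value weights, and the regression head is a bare linear layer.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(rois):
    """One scaled dot-product self-attention step over ROI feature vectors.

    `rois` is a list of equal-length feature vectors, one per detected
    bone-structure ROI; each ROI attends to all others, mixing local
    and global information.
    """
    d = len(rois[0])
    out = []
    for q in rois:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in rois]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, rois)) for j in range(d)])
    return out

def predict_bmd(rois, head_weights):
    """Mean-pool the attended ROI features and apply a linear regression head."""
    attended = self_attention(rois)
    d = len(attended[0])
    pooled = [sum(v[j] for v in attended) / len(attended) for j in range(d)]
    return sum(w * p for w, p in zip(head_weights, pooled))
```

In the real model the ROI features would come from a CNN backbone and the attention weights would be learned; the sketch only shows how per-ROI features are mixed before regression.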
Knee osteoarthritis (OA) is a common degenerative joint disorder affecting a large elderly population worldwide. Accurate radiographic assessment of knee OA severity plays a critical role in chronic patient management. Currently clinically adopted knee OA grading systems are observer-subjective and suffer from inter-rater disagreement. In this work, we propose a computer-aided diagnosis approach that simultaneously provides more accurate and consistent assessments of both composite and fine-grained OA grades. A novel semi-supervised learning method is proposed to exploit the underlying coherence of composite and fine-grained OA grades by learning from unlabeled data. By representing grade coherence using the log-probability of a pre-trained Gaussian mixture model, we formulate an incoherence loss to incorporate unlabeled data into training. The method also features a keypoint-based pooling network, in which deep image features are pooled from disease-targeted keypoints (extracted along the knee joint) to provide a more accurate and pathologically informed feature representation for OA grade assessment. The proposed method is comprehensively evaluated on the public Osteoarthritis Initiative (OAI) data, a multi-center ten-year observational study of 4,796 subjects. Experimental results demonstrate that our method significantly improves over previous strong whole-image deep classification network baselines (such as ResNet-50).
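The incoherence loss can be sketched with a 1-D Gaussian mixture: assuming the mixture has been pre-trained on labeled grade combinations, an unlabeled prediction is penalized by its negative log-probability under it. The summary statistic (difference between composite grade and mean fine-grained grade) and the function names are hypothetical simplifications, not the paper's exact formulation.

```python
import math

def gmm_log_prob(x, components):
    """Log-density of a 1-D Gaussian mixture; components = [(weight, mean, std)]."""
    dens = sum(w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
               for w, m, s in components)
    return math.log(dens)

def incoherence_loss(composite_grade, fine_grades, gmm):
    """Penalize composite/fine-grained grade combinations that the pre-trained
    mixture deems unlikely; coherent predictions get a low loss."""
    x = composite_grade - sum(fine_grades) / len(fine_grades)
    return -gmm_log_prob(x, gmm)
```

With a mixture centered at zero (coherent grades agree), a prediction whose composite grade matches its fine-grained grades incurs a smaller loss than a contradictory one, which is how the unlabeled data contributes a training signal.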
3D delineation of anatomical structures is a cardinal goal in medical imaging analysis. Before deep learning, statistical shape models (SSMs) that imposed anatomical constraints and produced high-quality surfaces were a core technology. Today's fully convolutional networks (FCNs), while dominant, do not offer these capabilities. We present deep implicit statistical shape models (DISSMs), a new approach to delineation that marries the representation power of convolutional neural networks (CNNs) with the robustness of SSMs. DISSMs use a deep implicit surface representation to produce a compact and descriptive shape latent space that permits statistical models of anatomical variance. To reliably fit shapes to an image, we introduce a novel rigid and non-rigid pose estimation pipeline that is modeled as a Markov decision process (MDP). We outline a training regime that includes inverted episodic training and a deep realization of marginal space learning (MSL). Intra-dataset experiments on the task of pathological liver segmentation demonstrate that DISSMs can perform more robustly than three leading FCN models, including nnU-Net: reducing the mean Hausdorff distance (HD) by 7.7-14.3 mm and improving the worst-case Dice-Sorensen coefficient (DSC) by 1.2-2.3%. More critically, cross-dataset experiments on a dataset directly reflecting clinical deployment scenarios demonstrate that DISSMs improve the mean DSC and HD, and the worst-case DSC by 5.4-7.3%. These improvements come over and above any benefits from delineating with high-quality surfaces.
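A minimal illustration of an implicit surface representation, using a sphere whose parameters play the role of the shape latent code; `decode_shape` is a hypothetical stand-in for the learned implicit decoder, and the surface is the zero level set of the signed distance function (SDF).

```python
def sphere_sdf(point, center, radius):
    """Signed distance to a sphere: negative inside, zero on the surface,
    positive outside."""
    d = sum((p - c) ** 2 for p, c in zip(point, center)) ** 0.5
    return d - radius

def decode_shape(latent, grid):
    """Toy implicit decoder: the latent code parameterizes a sphere
    (center x, y, z and radius); a real DISSM decoder would be a neural
    network mapping (latent, point) to a signed distance."""
    cx, cy, cz, r = latent
    return [sphere_sdf(p, (cx, cy, cz), r) for p in grid]

def occupancy(sdf_values):
    """A query point lies inside the shape when its SDF value is non-positive."""
    return [v <= 0.0 for v in sdf_values]
```

Because the shape lives in a low-dimensional latent code rather than a voxel mask, statistics over a population of shapes (means, modes of variation) can be computed directly in that latent space.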
Consensus clustering aggregates partitions in order to find a better fit by reconciling clustering results from different sources/executions. In practice, clustering tasks contain noise and outliers, which can significantly degrade performance. To address this issue, we propose a novel algorithm -- robust consensus clustering -- that can find the common ground truth among experts' opinions and tends to be minimally affected by outlier-induced bias. In particular, we formalize robust consensus clustering as a constrained optimization problem, and then derive an effective algorithm based on the alternating direction method of multipliers (ADMM) with a rigorous convergence guarantee. Our method outperforms the baselines on benchmarks. We apply the proposed method to real-world advertising campaign segmentation and forecasting tasks, using consensus clustering results based on similarities computed via the Kolmogorov-Smirnov statistic. The accurate clustering results help build advertiser profiles that support the forecasting.
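The two-sample Kolmogorov-Smirnov statistic behind the similarity computation can be sketched in a few lines: it is the maximum gap between the two empirical CDFs. `distance_matrix` is a hypothetical helper illustrating how the pairwise distances fed to clustering would be assembled.

```python
def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute gap
    between the empirical CDFs of samples `a` and `b` (0 = identical
    distributions, 1 = fully separated)."""
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))

    def ecdf(xs, t):
        # Fraction of samples <= t, via binary search on the sorted list.
        lo, hi = 0, len(xs)
        while lo < hi:
            mid = (lo + hi) // 2
            if xs[mid] <= t:
                lo = mid + 1
            else:
                hi = mid
        return lo / len(xs)

    return max(abs(ecdf(a, t) - ecdf(b, t)) for t in points)

def distance_matrix(samples):
    """Pairwise KS distances between per-campaign metric samples."""
    n = len(samples)
    return [[ks_statistic(samples[i], samples[j]) for j in range(n)] for i in range(n)]
```

The resulting matrix is symmetric with a zero diagonal, which is exactly what a consensus clustering routine expects as a dissimilarity input.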
Current mainstream object detection methods for large aerial images usually divide large images into patches and then exhaustively detect the objects of interest on all patches, no matter whether there exist objects or not. This paradigm, although effective, is inefficient because the detectors have to go through all patches, severely hindering the inference speed. This paper presents an Objectness Activation Network (OAN) to help detectors focus on fewer patches but achieve more efficient inference and more accurate results, enabling a simple and effective solution to object detection in large images. In brief, OAN is a light fully-convolutional network for judging whether each patch contains objects or not, which can be easily integrated into many object detectors and jointly trained with them end-to-end. We extensively evaluate our OAN with five advanced detectors. Using OAN, all five detectors acquire more than 30.0% speed-up on three large-scale aerial image datasets, meanwhile with consistent accuracy improvements. On extremely large Gaofen-2 images (29200$\times$27620 pixels), our OAN improves the detection speed by 70.5%. Moreover, we extend our OAN to driving-scene object detection and 4K video object detection, boosting the detection speed by 112.1% and 75.0%, respectively, without sacrificing the accuracy. Code is available at https://github.com/Ranchosky/OAN.
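The patch-gating idea can be sketched as follows: a cheap objectness score decides which patches ever reach the expensive detector. All names here are hypothetical, and both the objectness network and the detector are abstracted as callables.

```python
def split_into_patches(height, width, patch):
    """Tile a large image into (top, left) patch coordinates."""
    return [(r, c) for r in range(0, height, patch) for c in range(0, width, patch)]

def gated_detection(patches, objectness, detector, threshold=0.5):
    """Run the (expensive) detector only on patches whose objectness score
    clears the threshold; all other patches are skipped entirely, which is
    where the inference speed-up comes from."""
    results = {}
    for p in patches:
        if objectness(p) >= threshold:
            results[p] = detector(p)
    return results
```

On aerial imagery most patches are empty background, so the fraction of patches that pass the gate, and hence the total detector cost, drops sharply while detections on object-bearing patches are unaffected.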
Due to the issue that existing wireless sensor network (WSN)-based anomaly detection methods only consider and analyze temporal features, in this paper, a self-supervised learning-based anomaly node detection method based on an autoencoder is designed. This method integrates temporal WSN data flow feature extraction, spatial position feature extraction and intermodal WSN correlation feature extraction into the design of the autoencoder to make full use of the spatial and temporal information of the WSN for anomaly detection. First, a fully connected network is used to extract the temporal features of nodes by considering a single mode from a local spatial perspective. Second, a graph neural network (GNN) is used to introduce the WSN topology from a global spatial perspective for anomaly detection and extract the spatial and temporal features of the data flows of nodes and their neighbors by considering a single mode. Then, an adaptive fusion method involving weighted summation is used to extract the correlated features between different modalities. In addition, this paper introduces a gated recurrent unit (GRU) to solve the long-term dependence problem of the time dimension. Eventually, the reconstructed output of the decoder and the hidden layer representation of the autoencoder are fed into a fully connected network to calculate the anomaly probability of the current system. Since the spatial feature extraction operation is performed early in the pipeline, the designed method can be applied to the task of large-scale network anomaly detection by adding a clustering operation. Experiments show that the designed method outperforms the baselines, and the F1 score reaches 90.6%, which is 5.2% higher than those of the existing anomaly detection methods based on unsupervised reconstruction and prediction. Code and model are available at https://github.com/GuetYe/anomaly_detection/GLSL
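The adaptive weighted-summation fusion can be sketched as a softmax-normalized weighted sum of per-modality feature vectors. This is a minimal sketch: the logits standing in for the learned fusion parameters, and the function name, are assumptions.

```python
import math

def adaptive_fusion(modality_features, logits):
    """Fuse per-modality feature vectors with a weighted sum; the learnable
    logits are normalized with a softmax so the weights are positive and
    sum to one, letting training emphasize the more informative modality."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    weights = [e / z for e in exps]
    d = len(modality_features[0])
    return [sum(w * f[j] for w, f in zip(weights, modality_features)) for j in range(d)]
```

With equal logits the fusion is a plain average; as one logit grows, the fused vector converges to that modality's features, which is the "adaptive" part.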
We study algorithms for detecting and including glass objects in an optimization-based Simultaneous Localization and Mapping (SLAM) algorithm in this work. When LiDAR data is the primary exteroceptive sensory input, glass objects are not correctly registered. This occurs as the incident light primarily passes through the glass objects or reflects away from the source, resulting in inaccurate range measurements for glass surfaces. Consequently, the localization and mapping performance is impacted, thereby rendering navigation in such environments unreliable. Optimization-based SLAM solutions, which are also referred to as Graph SLAM, are widely regarded as state of the art. In this paper, we utilize a simple and computationally inexpensive glass detection scheme for detecting glass objects and present the methodology to incorporate the identified objects into the occupancy grid maintained by such an algorithm (Google Cartographer). We develop both local (submap level) and global algorithms for achieving the objective mentioned above and compare the maps produced by our method with those produced by an existing algorithm that utilizes particle filter based SLAM.
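Incorporating detected glass into an occupancy grid can be sketched with a standard log-odds update in which glass hits receive a deliberately weaker "occupied" update, reflecting the unreliability of range returns on glass. The specific probabilities (0.7, 0.55, 0.4) are illustrative assumptions, not values from Cartographer.

```python
import math

def logodds(p):
    """Convert an occupancy probability to log-odds form."""
    return math.log(p / (1.0 - p))

def update_cell(cell_logodds, hit, glass):
    """Log-odds occupancy update for one grid cell. A confident hit pushes
    the cell toward occupied; a glass hit pushes less strongly; a miss
    pushes toward free."""
    if hit:
        p = 0.55 if glass else 0.7
    else:
        p = 0.4
    return cell_logodds + logodds(p)

def probability(cell_logodds):
    """Recover the occupancy probability from the accumulated log-odds."""
    return 1.0 - 1.0 / (1.0 + math.exp(cell_logodds))
```

Repeated weak glass updates still accumulate, so a glass pane that is hit from several poses eventually registers as occupied, while a single spurious return does not.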
Persuasion modeling is a key building block for conversational agents. Existing works in this direction are limited to analyzing textual dialogue corpus. We argue that visual signals also play an important role in understanding human persuasive behaviors. In this paper, we introduce the first multimodal dataset for modeling persuasion behaviors. Our dataset includes 199 dialogue transcriptions and videos captured in a multi-player social deduction game setting, 26,647 utterance level annotations of persuasion strategy, and game level annotations of deduction game outcomes. We provide extensive experiments to show how dialogue context and visual signals benefit persuasion strategy prediction. We also explore the generalization ability of language models for persuasion modeling and the role of persuasion strategies in predicting social deduction game outcomes. Our dataset, code, and models can be found at https://persuasion-deductiongame.socialai-data.org.
Image-based head swapping task aims to stitch a source head to another source body flawlessly. This seldom-studied task faces two major challenges: 1) Preserving the head and body from various sources while generating a seamless transition region. 2) No paired head swapping dataset and benchmark so far. In this paper, we propose an image-based head swapping framework (HS-Diffusion) which consists of a semantic-guided latent diffusion model (SG-LDM) and a semantic layout generator. We blend the semantic layouts of source head and source body, and then inpaint the transition region by the semantic layout generator, achieving a coarse-grained head swapping. SG-LDM can further implement fine-grained head swapping with the blended layout as condition by a progressive fusion process, while preserving source head and source body with high-quality reconstruction. To this end, we design a head-cover augmentation strategy for training and a neck alignment trick for geometric realism. Importantly, we construct a new image-based head swapping benchmark and propose two tailor-designed metrics (Mask-FID and Focal-FID). Extensive experiments demonstrate the superiority of our framework. The code will be available: https://github.com/qinghew/HS-Diffusion.
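The coarse layout-blending step can be sketched on flat label arrays: keep source-head pixels, mark the leftover head area of the body image as a transition region for the layout generator to inpaint, and take everything else from the body. The label names, the `HEAD_LABELS` set, and the `blend_layouts` helper are hypothetical simplifications.

```python
# Hypothetical set of semantic labels belonging to the head region.
HEAD_LABELS = {"hair", "face"}

def blend_layouts(head_layout, body_layout, transition="transition"):
    """Blend two per-pixel semantic layouts for coarse head swapping.

    - pixels labeled as head in the source-head layout are kept;
    - pixels that were head in the body image but not covered by the new
      head become a transition region to be inpainted;
    - all remaining pixels keep the source-body label.
    """
    blended = []
    for h, b in zip(head_layout, body_layout):
        if h in HEAD_LABELS:
            blended.append(h)
        elif b in HEAD_LABELS:
            blended.append(transition)
        else:
            blended.append(b)
    return blended
```

In the actual framework this blended layout then conditions the diffusion model, which fills the transition region with a plausible neck and shoulders.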
Point cloud registration (PCR) is a popular research topic in computer vision. Recently, evolutionary registration methods have received continuous attention because of their robustness to the initial pose and flexibility in objective function design. However, most evolving registration methods cannot tackle local optima well, and they have rarely investigated the success ratio, i.e., the probability of not falling into a local optimum, which is closely related to the practicality of the algorithm. Evolutionary multi-task optimization (EMTO) is a widely used paradigm that can boost exploration capability through knowledge transfer among related tasks. Inspired by this concept, this study proposes a novel evolving registration algorithm via EMTO, where the multi-task configuration is based on the idea of solution space cutting. Concretely, one task searching in the cut space assists another task with a complex function landscape in escaping from local optima and enhancing the successful registration ratio. To reduce unnecessary computational cost, a sparse-to-dense strategy is proposed. In addition, a novel fitness function robust to various overlap rates and a problem-specific metric of computational cost are introduced. Compared with 7 evolving registration approaches and 4 traditional registration approaches on object-scale and scene-scale registration datasets, experimental results demonstrate that the proposed method achieves superior performance in terms of precision and in escaping local optima.
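The idea of a fitness function robust to varying overlap rates can be sketched as a trimmed nearest-neighbor distance: only the best-matching fraction of points contributes, so regions with no counterpart in the other cloud cannot dominate the score. This is an illustrative stand-in under that assumption, not the paper's actual fitness function.

```python
def trimmed_fitness(source, target, keep_ratio=0.7):
    """Overlap-robust registration fitness.

    For each source point, compute its nearest-neighbor distance in the
    target cloud; then average only the smallest `keep_ratio` fraction of
    those distances, discarding points that likely lie outside the overlap.
    Lower is better; 0.0 means the kept points align exactly.
    """
    def nn_dist(p, cloud):
        return min(sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5 for q in cloud)

    dists = sorted(nn_dist(p, target) for p in source)
    k = max(1, int(len(dists) * keep_ratio))
    return sum(dists[:k]) / k
```

A brute-force nearest-neighbor search is used for clarity; a practical implementation would use a k-d tree, and the evolutionary search would minimize this fitness over candidate rigid transforms.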