尺寸不正确的气球导管可能导致手术后并发症增加,但是即使术前成像,正确的选择仍然是一个挑战。在手术过程中反馈有限,很难验证正确的部署。我们建议使用集成的阻抗测量和电阻抗断层扫描(EIT)成像来评估气球的变形并确定周围腔的大小和形状。以前使用单个阻抗测量值或压力数据和分析模型的工作,同时证明了较高的尺寸精度,已经假设了圆形横截面。在这里,我们通过添加多种电极来检测椭圆形和遮挡的管腔并获得EIT图像以定位变形来扩展这些方法。以14 FR(5.3 mm)导管为例,进行数值模拟,以找到两个相距10 mm的8个电极的两个环的最佳电极构型。模拟预测,可检测到的最大纵横比在30mm时从14mm气球的0.9降低到0.5。实验验证了尺寸和椭圆度检测结果。构建了原型机器人气球导管,以自动膨胀一个兼容的气球,同时记录EIT和压力数据。在复制具有椭圆形和不对称曲线的狭窄血管的实验中收集了数据,并在血管成形术期间的管腔扩大。校准后,该系统能够正确定位闭合和检测为0.75的宽高比。 EIT图像进一步定位了阻塞,并在气球充气期间可视化管腔扩张。
translated by 谷歌翻译
In this work, we address the problem of unsupervised moving object segmentation (MOS) in 4D LiDAR data recorded from a stationary sensor, where no ground truth annotations are involved. Deep learning-based state-of-the-art methods for LiDAR MOS strongly depend on annotated ground truth data, which is expensive to obtain and scarce in existence. To close this gap in the stationary setting, we propose a novel 4D LiDAR representation based on multivariate time series that relaxes the problem of unsupervised MOS to a time series clustering problem. More specifically, we propose modeling the change in occupancy of a voxel by a multivariate occupancy time series (MOTS), which captures spatio-temporal occupancy changes on the voxel level and its surrounding neighborhood. To perform unsupervised MOS, we train a neural network in a self-supervised manner to encode MOTS into voxel-level feature representations, which can be partitioned by a clustering algorithm into moving or stationary. Experiments on stationary scenes from the Raw KITTI dataset show that our fully unsupervised approach achieves performance that is comparable to that of supervised state-of-the-art approaches.
translated by 谷歌翻译
Objective: Accurate visual classification of bladder tissue during Trans-Urethral Resection of Bladder Tumor (TURBT) procedures is essential to improve early cancer diagnosis and treatment. During TURBT interventions, White Light Imaging (WLI) and Narrow Band Imaging (NBI) techniques are used for lesion detection. Each imaging technique provides diverse visual information that allows clinicians to identify and classify cancerous lesions. Computer vision methods that use both imaging techniques could improve endoscopic diagnosis. We address the challenge of tissue classification when annotations are available only in one domain, in our case WLI, and the endoscopic images correspond to an unpaired dataset, i.e. there is no exact equivalent for every image in both NBI and WLI domains. Method: We propose a semi-surprised Generative Adversarial Network (GAN)-based method composed of three main components: a teacher network trained on the labeled WLI data; a cycle-consistency GAN to perform unpaired image-to-image translation, and a multi-input student network. To ensure the quality of the synthetic images generated by the proposed GAN we perform a detailed quantitative, and qualitative analysis with the help of specialists. Conclusion: The overall average classification accuracy, precision, and recall obtained with the proposed method for tissue classification are 0.90, 0.88, and 0.89 respectively, while the same metrics obtained in the unlabeled domain (NBI) are 0.92, 0.64, and 0.94 respectively. The quality of the generated images is reliable enough to deceive specialists. Significance: This study shows the potential of using semi-supervised GAN-based classification to improve bladder tissue classification when annotations are limited in multi-domain data.
translated by 谷歌翻译
The receptive field (RF), which determines the region of time series to be ``seen'' and used, is critical to improve the performance for time series classification (TSC). However, the variation of signal scales across and within time series data, makes it challenging to decide on proper RF sizes for TSC. In this paper, we propose a dynamic sparse network (DSN) with sparse connections for TSC, which can learn to cover various RF without cumbersome hyper-parameters tuning. The kernels in each sparse layer are sparse and can be explored under the constraint regions by dynamic sparse training, which makes it possible to reduce the resource cost. The experimental results show that the proposed DSN model can achieve state-of-art performance on both univariate and multivariate TSC datasets with less than 50\% computational cost compared with recent baseline methods, opening the path towards more accurate resource-aware methods for time series analyses. Our code is publicly available at: https://github.com/QiaoXiao7282/DSN.
translated by 谷歌翻译
While the problem of hallucinations in neural machine translation has long been recognized, so far the progress on its alleviation is very little. Indeed, recently it turned out that without artificially encouraging models to hallucinate, previously existing methods fall short and even the standard sequence log-probability is more informative. It means that characteristics internal to the model can give much more information than we expect, and before using external models and measures, we first need to ask: how far can we go if we use nothing but the translation model itself ? We propose to use a method that evaluates the percentage of the source contribution to a generated translation. Intuitively, hallucinations are translations "detached" from the source, hence they can be identified by low source contribution. This method improves detection accuracy for the most severe hallucinations by a factor of 2 and is able to alleviate hallucinations at test time on par with the previous best approach that relies on external models. Next, if we move away from internal model characteristics and allow external tools, we show that using sentence similarity from cross-lingual embeddings further improves these results.
translated by 谷歌翻译
We pose video object segmentation as spectral graph clustering in space and time, with one graph node for each pixel and edges forming local space-time neighborhoods. We claim that the strongest cluster in this video graph represents the salient object. We start by introducing a novel and efficient method based on 3D filtering for approximating the spectral solution, as the principal eigenvector of the graph's adjacency matrix, without explicitly building the matrix. This key property allows us to have a fast parallel implementation on GPU, orders of magnitude faster than classical approaches for computing the eigenvector. Our motivation for a spectral space-time clustering approach, unique in video semantic segmentation literature, is that such clustering is dedicated to preserving object consistency over time, which we evaluate using our novel segmentation consistency measure. Further on, we show how to efficiently learn the solution over multiple input feature channels. Finally, we extend the formulation of our approach beyond the segmentation task, into the realm of object tracking. In extensive experiments we show significant improvements over top methods, as well as over powerful ensembles that combine them, achieving state-of-the-art on multiple benchmarks, both for tracking and segmentation.
translated by 谷歌翻译
Many prior language modeling efforts have shown that pre-training on an in-domain corpus can significantly improve performance on downstream domain-specific NLP tasks. However, the difficulties associated with collecting enough in-domain data might discourage researchers from approaching this pre-training task. In this paper, we conducted a series of experiments by pre-training Bidirectional Encoder Representations from Transformers (BERT) with different sizes of biomedical corpora. The results demonstrate that pre-training on a relatively small amount of in-domain data (4GB) with limited training steps, can lead to better performance on downstream domain-specific NLP tasks compared with fine-tuning models pre-trained on general corpora.
translated by 谷歌翻译
In this paper, we explore the relationship between an individual's writing style and the risk that they will engage in online harmful behaviors (such as cyberbullying). In particular, we consider whether measurable differences in writing style relate to different personality types, as modeled by the Big-Five personality traits and the Dark Triad traits, and can differentiate between users who do or do not engage in harmful behaviors. We study messages from nearly 2,500 users from two online communities (Twitter and Reddit) and find that we can measure significant personality differences between regular and harmful users from the writing style of as few as 100 tweets or 40 Reddit posts, aggregate these values to distinguish between healthy and harmful communities, and also use style attributes to predict which users will engage in harmful behaviors.
translated by 谷歌翻译
3D autonomous driving semantic segmentation using deep learning has become, a well-studied subject, providing methods that can reach very high performance. Nonetheless, because of the limited size of the training datasets, these models cannot see every type of object and scenes found in real-world applications. The ability to be reliable in these various unknown environments is called domain generalization. Despite its importance, domain generalization is relatively unexplored in the case of 3D autonomous driving semantic segmentation. To fill this gap, this paper presents the first benchmark for this application by testing state-of-the-art methods and discussing the difficulty of tackling LiDAR domain shifts. We also propose the first method designed to address this domain generalization, which we call 3DLabelProp. This method relies on leveraging the geometry and sequentiality of the LiDAR data to enhance its generalization performances by working on partially accumulated point clouds. It reaches a mIoU of 52.6% on SemanticPOSS while being trained only on SemanticKITTI, making it state-of-the-art method for generalization (+7.4% better than the second best method). The code for this method will be available on Github.
translated by 谷歌翻译
Metric Elicitation (ME) is a framework for eliciting classification metrics that better align with implicit user preferences based on the task and context. The existing ME strategy so far is based on the assumption that users can most easily provide preference feedback over classifier statistics such as confusion matrices. This work examines ME, by providing a first ever implementation of the ME strategy. Specifically, we create a web-based ME interface and conduct a user study that elicits users' preferred metrics in a binary classification setting. We discuss the study findings and present guidelines for future research in this direction.
translated by 谷歌翻译