气道分割对于胸部CT图像分析至关重要。但是,由于固有的复杂树状结构和气道分支的不平衡大小,这仍然是一项具有挑战性的任务。当前的深度学习方法着眼于模型结构设计,而培训策略和损失功能的潜力尚未得到充分探索。因此,我们提出了一个简单而有效的气道分割管道,该管道表示为Naviairway,它发现具有支气管敏感的损失功能和人类视觉启发的迭代训练策略,发现了更细的细支气管。实验结果表明,Naverway的表现优于现有方法,尤其是在识别高产生的细支气管和对新CT扫描的鲁棒性方面。此外,纳维亚威是一般的。它可以与不同的骨干模型结合使用,并显着提高其性能。此外,我们建议对基于深度学习的气道细分方法进行更全面,更公平的评估,以更全面,更公平地评估。 Naveraway可以生成用于导航支气管镜检查的气道路线图,并且在生物医学图像中细分精细和长管结构时,也可以应用于其他情况。该代码可在https://github.com/antonotnawang/naviairway上公开获得。
translated by 谷歌翻译
大型语言模型已经证明了能够在自然语言和编程语言文本上进行条件和生成的能力。这样的模型打开了多语言代码生成的可能性:代码生成模型是否可以将知识从一种语言推广到另一种语言?尽管当代代码生成模型可以生成语义上正确的Python代码,但对它们使用其他语言的能力知之甚少。我们通过提出Multipl-E来促进该主题的探索,这是自然语言到代码生成的第一个多语言平行基准。 Multipl-E扩展了HumaneVal基准(Chen等,2021),以支持另外18种编程语言,涵盖了一系列编程范式和受欢迎程度。我们在Multipl-E:Codex和Incoder上评估了两个最先进的代码生成模型。我们发现,在几种语言上,法典匹配,甚至超过了其在Python上的性能。在多型E中表示的编程语言范围使我们能够探索语言频率和语言功能对模型性能的影响。最后,将代码生成基准分配给新编程语言的多重方法既可扩展又可扩展。我们描述了一种通用方法,可以轻松地增加对新基准和语言的支持。
translated by 谷歌翻译
读取文本读取序列的确定是对记录理解的基础。在文本组织成一系列行和垂直对准的页面中,可以轻松解决此问题,并运行页面的高度(生成可以从左到右读取的多列)。我们展示了一种情况 - 目录页面解析问题 - 以不规则,视觉组织的二维格式在页面上呈现信息。目录页面在金融招股说明书中相当常见,并携带有关组织,其地址和关系的信息,这是客户在车内客户端的关键。有趣的是,目录页有时有分层结构,激励需要将读取序列概括为读取树。我们向识别目录页面和构建读取树的问题提供解决方案,使用(学习)文本段和自下而上的(向左,左上,顶部顶部)遍历的段的横向。该解决方案是支持从客户端船上文件自动提取组织,地址和关系信息的生产服务的关键部分。
translated by 谷歌翻译
We aim to bridge the gap between our common-sense few-sample human learning and large-data machine learning. We derive a theory of human-like few-shot learning from von-Neuman-Landauer's principle. modelling human learning is difficult as how people learn varies from one to another. Under commonly accepted definitions, we prove that all human or animal few-shot learning, and major models including Free Energy Principle and Bayesian Program Learning that model such learning, approximate our theory, under Church-Turing thesis. We find that deep generative model like variational autoencoder (VAE) can be used to approximate our theory and perform significantly better than baseline models including deep neural networks, for image recognition, low resource language processing, and character recognition.
translated by 谷歌翻译
We propose a distributionally robust return-risk model for Markov decision processes (MDPs) under risk and reward ambiguity. The proposed model optimizes the weighted average of mean and percentile performances, and it covers the distributionally robust MDPs and the distributionally robust chance-constrained MDPs (both under reward ambiguity) as special cases. By considering that the unknown reward distribution lies in a Wasserstein ambiguity set, we derive the tractable reformulation for our model. In particular, we show that that the return-risk model can also account for risk from uncertain transition kernel when one only seeks deterministic policies, and that a distributionally robust MDP under the percentile criterion can be reformulated as its nominal counterpart at an adjusted risk level. A scalable first-order algorithm is designed to solve large-scale problems, and we demonstrate the advantages of our proposed model and algorithm through numerical experiments.
translated by 谷歌翻译
Modern deep neural networks have achieved superhuman performance in tasks from image classification to game play. Surprisingly, these various complex systems with massive amounts of parameters exhibit the same remarkable structural properties in their last-layer features and classifiers across canonical datasets. This phenomenon is known as "Neural Collapse," and it was discovered empirically by Papyan et al. \cite{Papyan20}. Recent papers have theoretically shown the global solutions to the training network problem under a simplified "unconstrained feature model" exhibiting this phenomenon. We take a step further and prove the Neural Collapse occurrence for deep linear network for the popular mean squared error (MSE) and cross entropy (CE) loss. Furthermore, we extend our research to imbalanced data for MSE loss and present the first geometric analysis for Neural Collapse under this setting.
translated by 谷歌翻译
Machine Reading Comprehension has become one of the most advanced and popular research topics in the fields of Natural Language Processing in recent years. The classification of answerability questions is a relatively significant sub-task in machine reading comprehension; however, there haven't been many studies. Retro-Reader is one of the studies that has solved this problem effectively. However, the encoders of most traditional machine reading comprehension models in general and Retro-Reader, in particular, have not been able to exploit the contextual semantic information of the context completely. Inspired by SemBERT, we use semantic role labels from the SRL task to add semantics to pre-trained language models such as mBERT, XLM-R, PhoBERT. This experiment was conducted to compare the influence of semantics on the classification of answerability for the Vietnamese machine reading comprehension. Additionally, we hope this experiment will enhance the encoder for the Retro-Reader model's Sketchy Reading Module. The improved Retro-Reader model's encoder with semantics was first applied to the Vietnamese Machine Reading Comprehension task and obtained positive results.
translated by 谷歌翻译
Existing measures and representations for trajectories have two longstanding fundamental shortcomings, i.e., they are computationally expensive and they can not guarantee the `uniqueness' property of a distance function: dist(X,Y) = 0 if and only if X=Y, where $X$ and $Y$ are two trajectories. This paper proposes a simple yet powerful way to represent trajectories and measure the similarity between two trajectories using a distributional kernel to address these shortcomings. It is a principled approach based on kernel mean embedding which has a strong theoretical underpinning. It has three distinctive features in comparison with existing approaches. (1) A distributional kernel is used for the very first time for trajectory representation and similarity measurement. (2) It does not rely on point-to-point distances which are used in most existing distances for trajectories. (3) It requires no learning, unlike existing learning and deep learning approaches. We show the generality of this new approach in three applications: (a) trajectory anomaly detection, (b) anomalous sub-trajectory detection, and (c) trajectory pattern mining. We identify that the distributional kernel has (i) a unique data-dependent property and the above uniqueness property which are the key factors that lead to its superior task-specific performance; and (ii) runtime orders of magnitude faster than existing distance measures.
translated by 谷歌翻译
Recent studies have shown that using an external Language Model (LM) benefits the end-to-end Automatic Speech Recognition (ASR). However, predicting tokens that appear less frequently in the training set is still quite challenging. The long-tail prediction problems have been widely studied in many applications, but only been addressed by a few studies for ASR and LMs. In this paper, we propose a new memory augmented lookup dictionary based Transformer architecture for LM. The newly introduced lookup dictionary incorporates rich contextual information in training set, which is vital to correctly predict long-tail tokens. With intensive experiments on Chinese and English data sets, our proposed method is proved to outperform the baseline Transformer LM by a great margin on both word/character error rate and tail tokens error rate. This is achieved without impact on the decoding efficiency. Overall, we demonstrate the effectiveness of our proposed method in boosting the ASR decoding performance, especially for long-tail tokens.
translated by 谷歌翻译
Detecting abrupt changes in data distribution is one of the most significant tasks in streaming data analysis. Although many unsupervised Change-Point Detection (CPD) methods have been proposed recently to identify those changes, they still suffer from missing subtle changes, poor scalability, or/and sensitive to noise points. To meet these challenges, we are the first to generalise the CPD problem as a special case of the Change-Interval Detection (CID) problem. Then we propose a CID method, named iCID, based on a recent Isolation Distributional Kernel (IDK). iCID identifies the change interval if there is a high dissimilarity score between two non-homogeneous temporal adjacent intervals. The data-dependent property and finite feature map of IDK enabled iCID to efficiently identify various types of change points in data streams with the tolerance of noise points. Moreover, the proposed online and offline versions of iCID have the ability to optimise key parameter settings. The effectiveness and efficiency of iCID have been systematically verified on both synthetic and real-world datasets.
translated by 谷歌翻译