Smart Paper Notes

Optimal Transport Features for Morphometric Population Analysis

Samuel Gerber, Marc Niethammer, Ebrahim Ebrahim, Joseph Piven, Stephen R. Dager, Martin Styner, Stephen Aylward, Andinet Enquobahrie

Category: Computer Vision

2022-08-11

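Brain pathologies often manifest as partial or complete loss of tissue. The goal of many neuroimaging studies is to capture the location and amount of tissue change with respect to a clinical variable of interest, such as disease progression. Morphometric analysis approaches capture local differences in the distribution of tissue, or of other quantities of interest, in relation to a clinical variable. We propose to augment morphometric analysis with an additional feature extraction step based on unbalanced optimal transport. The optimal transport feature extraction step increases statistical power for pathologies that cause spatially dispersed tissue loss, minimizes sensitivity to shifts caused by spatial misalignment or differences in brain topology, and separates out changes that are due to tissue location. We demonstrate the proposed optimal transport feature extraction step in the context of a volumetric morphometric analysis of the OASIS-1 study for Alzheimer's disease. The results show that the proposed approach can identify tissue changes and differences that are not otherwise measurable.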

Translated by Google Translate

Multi-view Attention for gestational age at birth prediction

Mathieu Leclercq, Martin Styner, Juan Carlos Prieto

Category: Computer Vision | Machine Learning

2022-07-08

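We present our method for the SLCN (Surface Learning for Clinical Neuroimaging) challenge on predicting gestational age at birth. Our method is based on a multi-view shape analysis technique that captures 2D renderings of a 3D object from different viewpoints. We render the brain features on the surface of a sphere, then analyze the resulting 2D images with 2D CNNs and an attention layer for the regression task. The regression task achieves an MAE of 1.637 ± 1.3 on the native space and an MAE of 1.38 ± 1.14 on the template space. The source code for this project is available in our GitHub repository: https://github.com/mathieuleclercq/slcn_challenge_unc_unc_unc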

Local Spatiotemporal Representation Learning for Longitudinally-consistent Neuroimage Analysis

Mengwei Ren, Neel Dey, Martin A. Styner, Kelly Botteron, Guido Gerig

Category: Computer Vision | Machine Learning

2022-06-09

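Recent self-supervised advances in medical computer vision exploit global and local anatomical self-similarity for pretraining prior to downstream tasks such as segmentation. However, current methods assume i.i.d. image acquisition, which is invalid in clinical study designs where follow-up longitudinal scans track subject-specific temporal changes. Further, existing self-supervised methods for medically relevant image-to-image architectures exploit only spatial or only temporal self-similarity, and do so only via a loss applied at a single image scale, while naive multi-scale spatiotemporal extensions collapse to degenerate solutions. To these ends, this paper makes two contributions: (1) It presents a local and multi-scale spatiotemporal representation learning method for image-to-image architectures trained on longitudinal images. It exploits the spatiotemporal self-similarity of learned multi-scale intra-subject features for pretraining and develops several feature-wise regularizations that avoid collapsed identity representations. (2) During fine-tuning, it proposes a surprisingly simple self-supervised segmentation consistency regularization to exploit intra-subject correlation. Benchmarked in the one-shot segmentation setting, the proposed framework outperforms both well-tuned randomly initialized baselines and current self-supervised techniques designed for both i.i.d. and longitudinal datasets. These improvements are demonstrated on both longitudinal neurodegenerative adult MRI and developing infant brain MRI, and yield both higher performance and better longitudinal consistency.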

In Quest of Ground Truth: Learning Confident Models and Estimating Uncertainty in the Presence of Annotator Noise

Asma Ahmed Hashmi, Artem Agafonov, Aigerim Zhumabayeva, Mohammad Yaqub, Martin Takáč

Category: Computer Vision | Machine Learning

2023-01-02

The performance of deep learning (DL) models depends on the quality of labels. In some areas, the involvement of human annotators may lead to noise in the data. When these corrupted labels are blindly regarded as the ground truth (GT), DL models suffer from performance deficiency. This paper presents a method that aims to learn a confident model in the presence of noisy labels. This is done in conjunction with estimating the uncertainty of multiple annotators. We robustly estimate the predictions given only the noisy labels by adding an entropy- or information-based regularizer to the classifier network. We conduct our experiments on noisy versions of the MNIST, CIFAR-10, and FMNIST datasets. Our empirical results demonstrate the robustness of our method, as it outperforms or performs comparably to other state-of-the-art (SOTA) methods. In addition, we evaluated the proposed method on a curated dataset, where the noise type and level of various annotators depend on the input image style. We show that our approach performs well and is adept at learning annotators' confusion. Moreover, we demonstrate how our model is more confident in predicting GT than other baselines. Finally, we assess our approach on a segmentation problem and showcase its effectiveness with experiments.

Landing a UAV in Harsh Winds and Turbulent Open Waters

Parakh M. Gupta, Eric Pairet, Tiago Nascimento, Martin Saska

Category: Robotics

2022-12-31

Landing an unmanned aerial vehicle (UAV) on top of an unmanned surface vehicle (USV) in harsh open waters is a challenging problem, owing to forces that can damage the UAV due to a severe roll and/or pitch angle of the USV during touchdown. To tackle this, we propose a novel model predictive control (MPC) approach enabling a UAV to land autonomously on a USV in these harsh conditions. The MPC employs a novel objective function and an online decomposition of the oscillatory motion of the vessel to predict, attempt, and accomplish the landing during near-zero tilt of the landing platform. The nonlinear prediction of the motion of the vessel is performed using visual data from an onboard camera. Therefore, the system does not require any communication with the USV or a control station. The proposed method was analyzed in numerous robotics simulations in harsh and extreme conditions and further validated in various real-world scenarios.

Exploring Singularities in point clouds with the graph Laplacian: An explicit approach

Martin Andersson, Benny Avelin

Category: Machine Learning (Statistics) | Machine Learning

2022-12-31

We develop theory and methods that use the graph Laplacian to analyze the geometry of the underlying manifold of point clouds. Our theory provides theoretical guarantees and explicit bounds on the functional form of the graph Laplacian, in the case when it acts on functions defined close to singularities of the underlying manifold. We also propose methods that can be used to estimate these geometric properties of the point cloud, which are based on the theoretical guarantees.
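
As a rough illustration of the object this abstract studies, the sketch below builds an unnormalized graph Laplacian (L = D - W) with Gaussian kernel weights on a small point cloud and applies it to a function sampled at the points. The kernel, the bandwidth value, the circle-shaped point cloud, and the function names `graph_laplacian` and `apply_laplacian` are all assumptions made for illustration; the abstract does not say which normalization or kernel the authors analyze.

```python
import math


def graph_laplacian(points, bandwidth):
    """Unnormalized graph Laplacian L = D - W with Gaussian kernel weights.

    The Gaussian kernel and the bandwidth are illustrative assumptions,
    not choices taken from the paper.
    """
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            squared_distance = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            w = math.exp(-squared_distance / (2.0 * bandwidth ** 2))
            weights[i][j] = w
            weights[j][i] = w
    laplacian = [[-weights[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        laplacian[i][i] = sum(weights[i])  # degree of node i on the diagonal
    return laplacian


def apply_laplacian(laplacian, values):
    """Matrix-vector product L f for a function f sampled at the points."""
    return [sum(l * v for l, v in zip(row, values)) for row in laplacian]


# Hypothetical point cloud: 40 points sampled uniformly on the unit circle.
n_points = 40
points = [
    (math.cos(2 * math.pi * k / n_points), math.sin(2 * math.pi * k / n_points))
    for k in range(n_points)
]
L = graph_laplacian(points, bandwidth=0.3)

constant = [1.0] * n_points
x_coordinate = [p[0] for p in points]
print("max |L 1|:", max(abs(v) for v in apply_laplacian(L, constant)))
print("first entries of L x:", apply_laplacian(L, x_coordinate)[:3])
```

Constant functions lie in the null space of L, which is a quick sanity check for any implementation. Near a singularity of the underlying manifold, the behaviour of L f is what the paper's bounds describe; this sketch uses a smooth circle and does not reproduce that analysis.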

GPT Takes the Bar Exam

Michael Bommarito II, Daniel Martin Katz

Category: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-29

Nearly all jurisdictions in the United States require a professional license exam, commonly referred to as "the Bar Exam," as a precondition for law practice. To even sit for the exam, most jurisdictions require that an applicant completes at least seven years of post-secondary education, including three years at an accredited law school. In addition, most test-takers also undergo weeks to months of further, exam-specific preparation. Despite this significant investment of time and capital, approximately one in five test-takers still score under the rate required to pass the exam on their first try. In the face of a complex task that requires such depth of knowledge, what, then, should we expect of the state of the art in "AI?" In this research, we document our experimental evaluation of the performance of OpenAI's `text-davinci-003` model, often referred to as GPT-3.5, on the multistate multiple choice (MBE) section of the exam. While we find no benefit in fine-tuning over GPT-3.5's zero-shot performance at the scale of our training data, we do find that hyperparameter optimization and prompt engineering positively impacted GPT-3.5's zero-shot performance. For best prompt and parameters, GPT-3.5 achieves a headline correct rate of 50.3% on a complete NCBE MBE practice exam, significantly in excess of the 25% baseline guessing rate, and performs at a passing rate for both Evidence and Torts. GPT-3.5's ranking of responses is also highly correlated with correctness; its top two and top three choices are correct 71% and 88% of the time, respectively, indicating very strong non-entailment performance. While our ability to interpret these results is limited by nascent scientific understanding of LLMs and the proprietary nature of GPT, we believe that these results strongly suggest that an LLM will pass the MBE component of the Bar Exam in the near future.
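
The "top two" and "top three" figures quoted in this abstract are top-k accuracies over ranked answer choices. A minimal sketch of that metric follows; the function name `top_k_accuracy` and the four toy questions are hypothetical and are not data from the paper.

```python
def top_k_accuracy(rankings, answers, k):
    """Fraction of questions whose correct answer is among the first k ranked choices."""
    if len(rankings) != len(answers):
        raise ValueError("rankings and answers must have the same length")
    hits = sum(1 for ranked, correct in zip(rankings, answers) if correct in ranked[:k])
    return hits / len(answers)


# Hypothetical toy data: four-option questions (A-D), model choices ranked best-first.
rankings = [
    ["A", "C", "B", "D"],
    ["B", "A", "D", "C"],
    ["D", "B", "A", "C"],
    ["C", "D", "A", "B"],
]
answers = ["A", "A", "A", "B"]

for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(rankings, answers, k):.2f}")
```

With four options per question, uniform random guessing gives an expected top-1 accuracy of 25%, which is the baseline the abstract compares against.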

Robust Cross-vendor Mammographic Texture Models Using Augmentation-based Domain Adaptation for Long-term Breast Cancer Risk

Andreas D. Lauritzen, My Catarina von Euler-Chelpin, Elsebeth Lynge, Ilse Vejborg, Mads Nielsen, Nico Karssemeijer, Martin Lillholm

Category: Computer Vision

2022-12-27

The future of population-based breast cancer screening is likely personalized strategies based on clinically relevant risk models. Mammography-based risk models should remain robust to domain shifts caused by different populations and mammographic devices. Modern risk models do not ensure adaptation across vendor-domains and are often conflated to unintentionally rely on both precursors of cancer and systemic/global mammographic information associated with short- and long-term risk, respectively, which might limit performance. We developed a robust, cross-vendor model for long-term risk assessment. An augmentation-based domain adaptation technique, based on flavorization of mammographic views, ensured generalization to an unseen vendor-domain. We trained on samples without diagnosed/potential malignant findings to learn systemic/global breast tissue features, called mammographic texture, indicative of future breast cancer. However, training in this way may cause erratic convergence. By excluding noise-inducing samples and designing a case-control dataset, a robust ensemble texture model was trained. This model was validated in two independent datasets. In 66,607 Danish women with flavorized Siemens views, the AUC was 0.71 and 0.65 for prediction of interval cancers within two years (ICs) and from two years after screening (LTCs), respectively. In combination with established risk factors, the model's AUC increased to 0.68 for LTCs. In 25,706 Dutch women with Hologic-processed views, the AUCs were not different from the AUCs in Danish women with flavorized views. The results suggested that the model robustly estimated long-term risk while adapting to an unseen processed vendor-domain. The model identified 8.1% of Danish women accounting for 20.9% of ICs and 14.2% of LTCs.

Large Language Models Encode Clinical Knowledge

Karan Singhal, Shekoofeh Azizi, Tao Tu, S. Sara Mahdavi, Jason Wei, Hyung Won Chung, Nathan Scales, Ajay Tanwani, Heather Cole-Lewis, Stephen Pfohl

Category: Natural Language Processing

2022-12-26

Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation, but the quality bar for medical and clinical applications is high. Today, attempts to assess models' clinical knowledge typically rely on automated evaluations on limited benchmarks. There is no standard to evaluate model predictions and reasoning across a breadth of tasks. To address this, we present MultiMedQA, a benchmark combining six existing open question answering datasets spanning professional medical exams, research, and consumer queries; and HealthSearchQA, a new free-response dataset of medical questions searched online. We propose a framework for human evaluation of model answers along multiple axes including factuality, precision, possible harm, and bias. In addition, we evaluate PaLM (a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM, on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA, MedMCQA, PubMedQA, MMLU clinical topics), including 67.6% accuracy on MedQA (US Medical License Exam questions), surpassing prior state-of-the-art by over 17%. However, human evaluation reveals key gaps in Flan-PaLM responses. To resolve this we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, recall of knowledge, and medical reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal important limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLM models for clinical applications.

Posterior-Variance-Based Error Quantification for Inverse Problems in Imaging

Dominik Narnhofer, Andreas Habring, Martin Holler, Thomas Pock

Category: Computer Vision

2022-12-23

In this work, a method for obtaining pixel-wise error bounds in Bayesian regularization of inverse imaging problems is introduced. The proposed method employs estimates of the posterior variance together with techniques from conformal prediction in order to obtain coverage guarantees for the error bounds, without making any assumption on the underlying data distribution. It is generally applicable to Bayesian regularization approaches, independent, e.g., of the concrete choice of the prior. Furthermore, the coverage guarantees can also be obtained in case only approximate sampling from the posterior is possible. With this in particular, the proposed framework is able to incorporate any learned prior in a black-box manner. Guaranteed coverage without assumptions on the underlying distributions is only achievable since the magnitude of the error bounds is, in general, unknown in advance. Nevertheless, experiments with multiple regularization approaches presented in the paper confirm that in practice, the obtained error bounds are rather tight. For realizing the numerical experiments, also a novel primal-dual Langevin algorithm for sampling from non-smooth distributions is introduced in this work.
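
To make the combination of posterior variance and conformal prediction concrete, here is a minimal sketch of generic split conformal calibration with variance-normalized scores. It is not the authors' algorithm: the score |error| / std, the simulated data, and the names `conformal_scale` and `simulate` are assumptions chosen for illustration, and each simulated value stands in for one pixel.

```python
import math
import random


def conformal_scale(errors, stds, alpha):
    """Return q such that |error| <= q * std holds with probability about 1 - alpha.

    Generic split conformal calibration on normalized scores |error| / std;
    the choice of score is an assumption, not taken from the paper.
    """
    scores = sorted(abs(e) / s for e, s in zip(errors, stds))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank, 1-indexed
    if rank > n:
        return math.inf  # too few calibration samples for this alpha
    return scores[rank - 1]


def simulate(n, rng):
    """Hypothetical data: the posterior std underestimates the true error spread by 1.5x."""
    stds = [rng.uniform(0.5, 2.0) for _ in range(n)]
    errors = [rng.gauss(0.0, 1.5 * s) for s in stds]
    return errors, stds


rng = random.Random(0)
cal_errors, cal_stds = simulate(2000, rng)
q = conformal_scale(cal_errors, cal_stds, alpha=0.1)

test_errors, test_stds = simulate(2000, rng)
covered = sum(abs(e) <= q * s for e, s in zip(test_errors, test_stds))
coverage = covered / len(test_errors)
print(f"calibrated scale q = {q:.2f}, empirical coverage = {coverage:.3f}")
```

The calibration step only rescales the variance-based bound, so it stays valid even when the posterior standard deviation is miscalibrated, as in this simulation. The coverage guarantee of split conformal prediction is marginal and relies on calibration and test samples being exchangeable.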
