Diffusion models have recently been studied as powerful generative inverse problem solvers, owing to their high-quality reconstructions and the ease with which they can be combined with existing iterative solvers. However, most works focus on solving simple linear inverse problems in noiseless settings, which significantly under-represents the complexity of real-world problems. In this work, we extend diffusion solvers to efficiently handle general noisy (non)linear inverse problems via a Laplace approximation of posterior sampling. Interestingly, the resulting posterior sampling scheme is a blended version of diffusion sampling with a manifold-constrained gradient, without a strict measurement-consistency projection step, yielding a more desirable generative path in noisy settings compared to previous studies. Our method demonstrates that diffusion models can incorporate various measurement noise statistics, such as Gaussian and Poisson, and can also efficiently handle noisy nonlinear inverse problems, such as Fourier phase retrieval and non-uniform deblurring.
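A minimal numerical sketch of the kind of update described above, for a noisy linear measurement `y = A x + n`. The trained score network is replaced by the analytic score of a standard Gaussian prior, and `zeta`, the noise schedule, and the simplified ancestral step are our illustrative choices, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 8, 4
A = rng.normal(size=(m, d))
x_true = rng.normal(size=d)
y = A @ x_true + 0.05 * rng.normal(size=m)   # noisy linear measurement

def score_fn(x, sigma):
    # Analytic score of a standard Gaussian prior at noise level sigma;
    # a trained score network would replace this in practice.
    return -x / (1.0 + sigma**2)

def dps_step(x, sigma, zeta=0.05):
    # Tweedie estimate of the clean signal from the current noisy iterate.
    x0_hat = x + sigma**2 * score_fn(x, sigma)
    # Simplified unconditional denoising step with a small noise injection.
    x_prev = x0_hat + 0.1 * sigma * rng.normal(size=x.shape)
    # Likelihood gradient evaluated at the posterior-mean estimate x0_hat;
    # note there is no hard projection onto the measurement-consistent set.
    grad = A.T @ (y - A @ x0_hat)
    return x_prev + zeta * grad

x = rng.normal(size=d)
for sigma in np.linspace(1.0, 0.05, 100):
    x = dps_step(x, sigma)
print(np.linalg.norm(y - A @ x))
```

The measurement residual shrinks over the trajectory even though no step ever enforces exact consistency, which is the qualitative behavior the abstract attributes to the blended sampling scheme.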
Diffusion models have recently attracted significant interest within the community owing to their strong performance as generative models. Furthermore, their application to inverse problems has demonstrated state-of-the-art performance. Unfortunately, diffusion models have a critical downside: they are inherently slow, requiring a few thousand iterations to generate an image from pure Gaussian noise. In this work, we show that starting from Gaussian noise is unnecessary. Instead, starting from a single forward diffusion of a better initialization significantly reduces the number of sampling steps in the reverse conditional diffusion. This phenomenon is formally explained by the contraction theory of the stochastic difference equations underlying our conditional diffusion strategy: the alternating application of reverse diffusion followed by a non-expansive data-consistency step. The new sampling strategy, dubbed Come-Closer-Diffuse-Faster (CCDF), also reveals new insight into how existing methods for inverse problems can be synergistically combined with diffusion models. Experimental results on super-resolution, image inpainting, and compressed-sensing MRI demonstrate that our method can achieve state-of-the-art reconstruction performance with a significantly reduced number of sampling steps.
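The two ingredients above, forward-diffusing an initial estimate to an intermediate time instead of starting from pure noise, then alternating reverse steps with a non-expansive data-consistency step, can be sketched as follows. The "oracle" denoiser, the schedule, and the coordinate-masking measurement are all toy stand-ins of our own, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1000
betas = np.linspace(1e-4, 0.02, N)
alpha_bar = np.cumprod(1.0 - betas)

x_clean = rng.normal(size=16)                     # unknown signal (oracle here)
x_init = x_clean + 0.1 * rng.normal(size=16)      # cheap initial estimate
mask = np.zeros(16, dtype=bool); mask[:8] = True  # measured coordinates
y = x_clean[mask]                                 # noiseless measurement

t0 = 100  # reverse from t0 = 100 rather than from t = N = 1000 (pure noise)
x = np.sqrt(alpha_bar[t0]) * x_init \
    + np.sqrt(1 - alpha_bar[t0]) * rng.normal(size=16)

def denoise(x_t, t):
    # Oracle denoiser standing in for a trained score network.
    return x_clean

for t in range(t0, 0, -1):
    x0_hat = denoise(x, t)
    eps_hat = (x - np.sqrt(alpha_bar[t]) * x0_hat) / np.sqrt(1 - alpha_bar[t])
    # Deterministic reverse diffusion step from t to t-1 ...
    x = np.sqrt(alpha_bar[t - 1]) * x0_hat + np.sqrt(1 - alpha_bar[t - 1]) * eps_hat
    # ... alternated with a non-expansive data-consistency step that
    # re-imposes the (forward-diffused) measurement on measured coordinates.
    x[mask] = np.sqrt(alpha_bar[t - 1]) * y + np.sqrt(1 - alpha_bar[t - 1]) * eps_hat[mask]
```

With the oracle denoiser the iterate recovers `x_clean` in 100 reverse steps rather than 1000, illustrating why a better-initialized forward diffusion shortens the reverse trajectory.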
Score-based diffusion models provide a powerful way to model images using the gradient of the data distribution. Leveraging the learned score function as a prior, here we introduce a way to sample from the conditional distribution given the measurements, so that the model can readily be used to solve inverse problems in imaging, especially accelerated MRI. In short, we train a continuous, time-dependent score function with denoising score matching. Then, at the inference stage, we iterate between a numerical SDE solver and a data-consistency projection step to achieve reconstruction. Our model requires only magnitude images for training, yet is able to reconstruct complex-valued data, and even extends to parallel imaging. The proposed method is agnostic to the sub-sampling pattern and can be used with any sampling scheme. Moreover, due to its generative nature, our method can quantify uncertainty, which is not possible in the standard regression setting. On top of all this, our method also delivers very strong performance, even beating models trained with full supervision. Through extensive experiments, we verify the superiority of our method in terms of both quality and practicality.
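The data-consistency projection interleaved with the SDE solver steps can be sketched concretely: re-impose the measured k-space samples on the current iterate. The `mask`, `y`, and image sizes below are hypothetical stand-ins for an actual undersampled acquisition:

```python
import numpy as np

def data_consistency(x, y, mask):
    """Replace the sampled k-space entries of image x with measurements y."""
    k = np.fft.fft2(x)
    k = np.where(mask, y, k)         # keep measured frequencies exactly
    return np.fft.ifft2(k)

rng = np.random.default_rng(2)
img = rng.normal(size=(16, 16))                  # stand-in ground-truth image
mask = rng.random((16, 16)) < 0.3                # ~30% sampling; any pattern works
y = np.where(mask, np.fft.fft2(img), 0)          # undersampled k-space measurement

x = rng.normal(size=(16, 16))                    # arbitrary solver iterate
x = data_consistency(x, y, mask)
# Measured frequencies of the projected iterate now match the data exactly.
print(np.abs(np.where(mask, np.fft.fft2(x) - y, 0)).max())
```

Because the projection only overwrites Fourier coefficients, it is indifferent to how `mask` was generated, which is the sense in which the method is agnostic to the sub-sampling pattern.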
Recently, deep learning approaches have become the main research frontier for biological image reconstruction and enhancement problems, thanks to their high performance along with their ultrafast inference times. However, due to the difficulty of obtaining matched reference data for supervised learning, there has been increasing interest in unsupervised learning approaches that do not require paired reference data. In particular, self-supervised learning and generative models have been successfully used in various biological imaging applications. In this paper, we overview these approaches from a coherent perspective in the context of classical inverse problems, and discuss their applications to biological imaging, including electron, fluorescence, and deconvolution microscopy, optical diffraction tomography, and functional neuroimaging.
Many recent works on understanding deep learning try to quantify how much individual data instances influence the optimization and generalization of a model, either by analyzing the behavior of the model during training or by measuring the performance gap of the model when the instance is removed from the dataset. Such approaches reveal the characteristics and importance of individual instances, which may provide useful information for diagnosing and improving deep learning. However, most existing works on data valuation require actual training of a model, which often demands a high computational cost. In this paper, we provide a training-free data valuation score, called the complexity-gap score: a data-centric score that quantifies the influence of individual instances on the generalization of two-layer overparameterized neural networks. The proposed score can quantify the irregularity of instances and measure how much each data instance contributes to the total movement of the network parameters during training. We theoretically analyze and empirically demonstrate the effectiveness of the complexity-gap score in finding 'irregular or mislabeled' data instances, and also provide applications of the score in analyzing datasets and diagnosing training dynamics.
In the robotics and computer vision communities, extensive studies have been conducted on surveillance tasks, including human detection, tracking, and motion recognition with a camera. Additionally, deep learning algorithms are widely utilized in these tasks, as in other computer vision tasks. However, existing public datasets are insufficient for developing learning-based methods that handle surveillance in outdoor and extreme situations, such as harsh weather and low-illuminance conditions. Therefore, we introduce a new large-scale outdoor surveillance dataset named the eXtremely large-scale Multi-modAl Sensor dataset (X-MAS), containing more than 500,000 image pairs and first-person-view data annotated by well-trained annotators. Moreover, a single pair contains multi-modal data (e.g., an IR image, an RGB image, a thermal image, a depth image, and a LiDAR scan). To the best of our knowledge, this is the first large-scale first-person-view outdoor multi-modal dataset focusing on surveillance tasks. We present an overview of the proposed dataset with statistics, and present methods for exploiting our dataset with deep learning-based algorithms. The latest information on the dataset and our study is available at https://github.com/lge-robot-navi, and the dataset will be available for download through a server.
Crowdsourcing has emerged as an effective platform to label a large volume of data in a cost- and time-efficient manner. Most previous works have focused on designing an efficient algorithm to recover only the ground-truth labels of the data. In this paper, we consider multi-choice crowdsourced labeling with the goal of recovering not only the ground truth but also the most confusing answer and the confusion probability. The most confusing answer provides useful information about the task by revealing the most plausible answer other than the ground truth and how plausible it is. To theoretically analyze such scenarios, we propose a model where there are top-two plausible answers for each task, distinguished from the rest of choices. Task difficulty is quantified by the confusion probability between the top two, and worker reliability is quantified by the probability of giving an answer among the top two. Under this model, we propose a two-stage inference algorithm to infer the top-two answers as well as the confusion probability. We show that our algorithm achieves the minimax optimal convergence rate. We conduct both synthetic and real-data experiments and demonstrate that our algorithm outperforms other recent algorithms. We also show the applicability of our algorithms in inferring the difficulty of tasks and training neural networks with the soft labels composed of the top-two most plausible classes.
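A toy illustration of the quantities the abstract targets (not the paper's two-stage algorithm): take the plurality answer as the ground-truth estimate, the runner-up as the "most confusing" answer, and the runner-up's share of the top-two votes as a crude confusion-probability estimate. All names below are our own:

```python
from collections import Counter

def top_two_estimate(answers):
    """Plurality answer, runner-up, and runner-up share among the top two."""
    (a1, n1), (a2, n2) = Counter(answers).most_common(2)
    return a1, a2, n2 / (n1 + n2)

# Six workers label one task with three choices.
answers = ["cat", "cat", "lynx", "cat", "lynx", "dog"]
truth, confusing, p_conf = top_two_estimate(answers)
print(truth, confusing, round(p_conf, 2))   # cat lynx 0.4
```

The paper's actual two-stage inference aggregates over many tasks and weights workers by estimated reliability; this snippet only makes concrete what "ground truth", "most confusing answer", and "confusion probability" refer to for a single task.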
Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation, but the quality bar for medical and clinical applications is high. Today, attempts to assess models' clinical knowledge typically rely on automated evaluations on limited benchmarks. There is no standard to evaluate model predictions and reasoning across a breadth of tasks. To address this, we present MultiMedQA, a benchmark combining six existing open question answering datasets spanning professional medical exams, research, and consumer queries; and HealthSearchQA, a new free-response dataset of medical questions searched online. We propose a framework for human evaluation of model answers along multiple axes including factuality, precision, possible harm, and bias. In addition, we evaluate PaLM (a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM, on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA, MedMCQA, PubMedQA, MMLU clinical topics), including 67.6% accuracy on MedQA (US Medical License Exam questions), surpassing prior state-of-the-art by over 17%. However, human evaluation reveals key gaps in Flan-PaLM responses. To resolve this we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, recall of knowledge, and medical reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal important limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLM models for clinical applications.
The nonconvex formulation of the matrix completion problem has received significant attention in recent years due to its affordable complexity compared to the convex formulation. Gradient descent (GD) is the simplest yet efficient baseline algorithm for solving nonconvex optimization problems. The success of GD has been witnessed in many different problems, in both theory and practice, when it is combined with random initialization. However, previous works on matrix completion require either careful initialization or regularizers to prove the convergence of GD. In this work, we study rank-1 symmetric matrix completion and prove that GD converges to the ground truth when a small random initialization is used. We show that within a logarithmic number of iterations, the trajectory enters the region where local convergence occurs. We provide an upper bound on the initialization size that is sufficient to guarantee convergence, and show that a larger initialization can be used as more samples become available. We observe that the implicit regularization effect of GD plays a critical role in the analysis: over the entire trajectory, it prevents each entry from becoming much larger than the others.
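The setting above is simple enough to reproduce numerically: complete a rank-1 symmetric matrix M = uuᵀ from a random subset of entries by running plain GD from a small random initialization, with no explicit regularizer. The step size, initialization scale, and sampling rate below are illustrative choices, not the constants from the paper's theorems:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 30
u = rng.normal(size=d); u /= np.linalg.norm(u)
M = np.outer(u, u)                          # rank-1 symmetric ground truth

mask = rng.random((d, d)) < 0.5             # observe ~50% of entries ...
mask = mask | mask.T                        # ... with a symmetric pattern
np.fill_diagonal(mask, True)

x = 1e-6 * rng.normal(size=d)               # small random initialization
eta = 0.2
for _ in range(5000):
    R = mask * (np.outer(x, x) - M)         # residual on observed entries
    x = x - eta * 2 * (R @ x)               # GD step (gradient up to a constant)

print(np.linalg.norm(np.outer(x, x) - M))   # full-matrix recovery error
```

The trajectory shows the behavior the abstract describes: the iterate stays near zero for a logarithmically short phase while its overlap with u grows geometrically, then converges locally to ±u, recovering even the unobserved entries.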
Hinging on the representation power of neural networks, neural radiance fields (NeRF) have recently emerged as one of the most promising and widely applicable methods for 3D object and scene representation. However, NeRF faces challenges in practical applications, such as large-scale scenes and edge devices with a limited amount of memory, where data need to be processed sequentially. Under such incremental learning scenarios, neural networks are known to suffer from catastrophic forgetting: easily forgetting previously seen data after training with new data. We observe that previous incremental learning algorithms are limited by either low performance or memory scalability issues. As such, we develop a Memory-Efficient Incremental Learning algorithm for NeRF (MEIL-NeRF). MEIL-NeRF takes inspiration from NeRF itself, in that a neural network can serve as a memory that provides pixel RGB values given rays as queries. Based on this motivation, our framework learns which rays to query NeRF for in order to extract previous pixel values. The extracted pixel values are then used to train NeRF in a self-distillation manner to prevent catastrophic forgetting. As a result, MEIL-NeRF demonstrates constant memory consumption and competitive performance.
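The self-distillation loop can be sketched generically: a frozen snapshot of the network answers queries about previously seen inputs, and those answers supervise the current network alongside the new data. Linear "networks" keep the sketch self-contained; the weighting `lam`, the step size, and all names are illustrative, not MEIL-NeRF's architecture or hyperparameters:

```python
import numpy as np

rng = np.random.default_rng(4)
W_old = rng.normal(size=(3, 5))              # frozen snapshot after task 1
W = W_old.copy()                             # network being trained on task 2

X_new = rng.normal(size=(5, 20))             # new data with new targets
Y_new = rng.normal(size=(3, 20))
X_prev = rng.normal(size=(5, 20))            # previously seen queries (rays)
Y_prev = W_old @ X_prev                      # "pixel values" from the snapshot

lam, eta = 1.0, 0.01

def loss(W):
    return 0.5 * np.sum((W @ X_new - Y_new) ** 2) \
         + 0.5 * lam * np.sum((W @ X_prev - Y_prev) ** 2)

loss_start = loss(W)
for _ in range(500):
    g_new = (W @ X_new - Y_new) @ X_new.T
    g_dist = lam * (W @ X_prev - Y_prev) @ X_prev.T  # self-distillation term
    W -= eta * (g_new + g_dist)
loss_end = loss(W)
```

The distillation term anchors the network to its own past outputs, so old knowledge is retained without storing the old training images themselves, which is what keeps memory consumption constant.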