When creating image datasets, web image retrieval via search engines is a tempting alternative to manual curation, but its main drawback remains the proportion of incorrect (noisy) samples retrieved. Previous works have demonstrated that these noisy samples are a mixture of in-distribution (ID) samples, which are assigned to the incorrect category but present visual semantics similar to other classes in the dataset, and out-of-distribution (OOD) images, which share no semantic correlation with any category of the dataset. The latter are, in practice, the dominant type of noisy image retrieved. To tackle this noise duality, we propose a two-stage algorithm starting with a detection step in which we use unsupervised contrastive feature learning to represent images in a feature space. We find that the alignment and uniformity principles of contrastive learning allow OOD samples to be linearly separated from ID samples on the unit hypersphere. We then embed the unsupervised representations using a fixed neighborhood size and apply outlier-sensitive clustering at the class level to detect the clean and OOD clusters as well as ID noisy outliers. We finally train a noise-robust neural network that corrects ID noise to the correct category and exploits OOD samples in a guided contrastive objective, clustering them to improve low-level features. Our algorithm improves the state-of-the-art results on synthetically noisy image datasets as well as real-world web-crawled data. Our work is fully reproducible [github].
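The linear-separability claim can be illustrated with a minimal sketch (not the paper's implementation; the dimensionality, cluster shapes, and k-NN scoring rule are illustrative assumptions): ID features concentrated on one region of the unit hypersphere score high neighborhood similarity, while isotropically scattered OOD features score low.

```python
import math
import random

def normalize(v):
    """Project a feature vector onto the unit hypersphere."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def knn_score(feats, i, k=5):
    """Mean cosine similarity of sample i to its k nearest neighbors."""
    sims = sorted(
        (sum(a * b for a, b in zip(feats[i], feats[j]))
         for j in range(len(feats)) if j != i),
        reverse=True,
    )
    return sum(sims[:k]) / k

random.seed(0)
DIM = 8
# ID features concentrate around one direction (alignment); OOD features
# scatter isotropically over the sphere (uniformity).
id_feats = [normalize([1.0 + random.gauss(0, 0.1)]
                      + [random.gauss(0, 0.1) for _ in range(DIM - 1)])
            for _ in range(20)]
ood_feats = [normalize([random.gauss(0, 1) for _ in range(DIM)])
             for _ in range(5)]
feats = id_feats + ood_feats
scores = [knn_score(feats, i) for i in range(len(feats))]
# ID samples score high neighborhood similarity and OOD samples low, so a
# simple threshold (or a linear boundary) separates the two populations.
```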
translated by Google Translate
Graph Neural Networks (GNNs) have shown outstanding applications in many fields where data is fundamentally represented as graphs (e.g., chemistry, biology, recommender systems). In this vein, communication networks comprise many fundamental components that are naturally represented in a graph-structured manner (e.g., topology, configurations, traffic flows). This position article presents GNNs as a fundamental tool for modeling, control, and management of communication networks. GNNs represent a new generation of data-driven models that can accurately learn and reproduce the complex behaviors behind real networks. As a result, such models can be applied to a wide variety of networking use cases, such as planning, online optimization, or troubleshooting. The main advantage of GNNs over traditional neural networks lies in their unprecedented generalization capabilities when applied to other networks and configurations unseen during training, which is a critical feature for achieving practical data-driven solutions for networking. This article comprises a brief tutorial on GNNs and their possible applications to communication networks. To showcase the potential of this technology, we present two use cases with state-of-the-art GNN models applied to wired and wireless networks, respectively. Lastly, we delve into the key open challenges and opportunities of this novel research area.
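The message-passing idea behind GNNs can be sketched by hand on a tiny topology (an illustrative toy, not one of the models from the article; the mean-aggregation update stands in for learned functions):

```python
# Adjacency and per-node features of a toy 4-node network topology.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
feats = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0], 3: [0.0, 0.0]}

def message_pass(adj, feats):
    """One round of message passing: each node averages its neighbors'
    features and combines the aggregate with its own state (a plain mean
    here, standing in for a learned update function)."""
    new = {}
    for v, nbrs in adj.items():
        agg = [sum(feats[u][d] for u in nbrs) / len(nbrs)
               for d in range(len(feats[v]))]
        new[v] = [(h + m) / 2 for h, m in zip(feats[v], agg)]
    return new

h1 = message_pass(adj, feats)
# Stacking several such rounds lets each node's state reflect its
# multi-hop neighborhood, which is what gives GNNs their generalization
# across topologies.
```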
Wide Area Networks (WANs) are a critical infrastructure of today's society. During the last years, network traffic and the number of network applications have increased dramatically, imposing new requirements on existing network technologies (e.g., low latency and high throughput). Consequently, Internet Service Providers (ISPs) are under pressure to ensure the customers' quality of service and to fulfill Service Level Agreements. Network operators leverage Traffic Engineering (TE) techniques to efficiently manage network resources. However, WAN traffic can change drastically over time, and connectivity can be affected by external factors (e.g., link failures). Therefore, TE solutions must be able to adapt to dynamic scenarios in real time. In this paper, we propose Enero, an efficient real-time TE solution based on a two-stage optimization process. In the first stage, Enero leverages Deep Reinforcement Learning (DRL) to optimize the routing configuration by generating a long-term TE strategy. To enable efficient operation over dynamic network scenarios (e.g., when link failures occur), we integrate a Graph Neural Network into the DRL agent. In the second stage, Enero uses a local search algorithm to improve the DRL solution without adding computational overhead to the optimization process. Experimental results show that Enero can operate in real-world dynamic network topologies with 100 edges in 4.5 seconds on average.
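The second-stage idea can be sketched as a toy local search over per-demand path choices (illustrative only: the topology, demands, candidate paths, and hill-climbing rule are assumptions, not Enero's implementation):

```python
# Toy topology: link capacities and two candidate paths per demand.
capacity = {("A", "B"): 10, ("A", "C"): 10, ("C", "B"): 10}
demands = {
    "d1": {"volume": 8, "paths": [[("A", "B")], [("A", "C"), ("C", "B")]]},
    "d2": {"volume": 6, "paths": [[("A", "B")], [("A", "C"), ("C", "B")]]},
}

def max_utilization(choice):
    """TE objective: maximum link utilization under a path assignment."""
    load = {link: 0.0 for link in capacity}
    for d, path_idx in choice.items():
        for link in demands[d]["paths"][path_idx]:
            load[link] += demands[d]["volume"]
    return max(load[l] / capacity[l] for l in capacity)

def local_search(choice):
    """Hill climbing: keep re-routing single demands while some move
    reduces the maximum link utilization."""
    improved = True
    while improved:
        improved = False
        best = max_utilization(choice)
        for d in demands:
            for p in range(len(demands[d]["paths"])):
                cand = dict(choice, **{d: p})
                if max_utilization(cand) < best:
                    choice, best, improved = cand, max_utilization(cand), True
    return choice

# Start from a (hypothetical) first-stage output that congests link A-B.
initial = {"d1": 0, "d2": 0}
final = local_search(initial)
```

Moving `d1` onto the A-C-B path drops the maximum utilization from 1.4 to 0.8, after which no single move helps.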
Despite being robust to small amounts of label noise, convolutional neural networks trained with stochastic gradient methods have been shown to easily fit random labels. When there is a mixture of correct and mislabelled targets, networks tend to fit the former before the latter. This suggests using a suitable two-component mixture model as an unsupervised generative model of sample loss values during training, to allow online estimation of the probability that a sample is mislabelled. Specifically, we propose a beta mixture to estimate this probability and correct the loss by relying on the network prediction (the so-called bootstrapping loss). We further adapt mixup augmentation to drive our approach a step further. Experiments on CIFAR-10/100 and TinyImageNet demonstrate a robustness to label noise that substantially outperforms recent state-of-the-art. Source code is available at https://git.io/fjsvE.
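The two-component beta mixture over per-sample losses can be sketched with a small EM fit (a simplified illustration of the idea; the initialization, the weighted method-of-moments M-step, and the synthetic loss values are assumptions, not the paper's exact implementation):

```python
import math
import random

def beta_pdf(x, a, b):
    """Beta density via log-gamma for numerical stability."""
    log_b = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1) * math.log(x) + (b - 1) * math.log(1 - x) - log_b)

def fit_beta_mixture(losses, steps=30):
    """EM for a two-component beta mixture over losses in (0, 1)."""
    params = [(2.0, 5.0), (5.0, 2.0)]  # init: low-loss vs high-loss mode
    weights = [0.5, 0.5]
    for _ in range(steps):
        # E-step: posterior responsibility of each component per sample.
        resp = []
        for x in losses:
            p = [weights[k] * beta_pdf(x, *params[k]) for k in range(2)]
            s = sum(p) or 1e-12
            resp.append([pk / s for pk in p])
        # M-step: weighted moments -> beta parameters per component.
        for k in range(2):
            rk = [r[k] for r in resp]
            n = sum(rk)
            m = sum(r * x for r, x in zip(rk, losses)) / n
            v = sum(r * (x - m) ** 2 for r, x in zip(rk, losses)) / n
            common = m * (1 - m) / max(v, 1e-6) - 1
            params[k] = (max(m * common, 1e-2), max((1 - m) * common, 1e-2))
            weights[k] = n / len(losses)
    return params, weights

def p_mislabelled(x, params, weights):
    """Posterior that a loss x comes from the high-loss (noisy) component."""
    p = [weights[k] * beta_pdf(x, *params[k]) for k in range(2)]
    return p[1] / (sum(p) or 1e-12)

random.seed(1)
# Synthetic training losses: clean samples near 0.1, mislabelled near 0.8.
losses = ([min(max(random.gauss(0.1, 0.03), 1e-3), 1 - 1e-3) for _ in range(80)]
          + [min(max(random.gauss(0.8, 0.05), 1e-3), 1 - 1e-3) for _ in range(20)])
params, weights = fit_beta_mixture(losses)
```

The resulting posterior is what can drive a per-sample correction such as the bootstrapping loss.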
A step-search sequential quadratic programming method is proposed for solving nonlinear equality constrained stochastic optimization problems. It is assumed that constraint function values and derivatives are available, but only stochastic approximations of the objective function and its associated derivatives can be computed via inexact probabilistic zeroth- and first-order oracles. Under reasonable assumptions, a high-probability bound on the iteration complexity of the algorithm to approximate first-order stationarity is derived. Numerical results on standard nonlinear optimization test problems illustrate the advantages and limitations of our proposed method.
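A heavily simplified sketch of an SQP iteration under a stochastic first-order oracle (not the proposed method: it uses an identity model Hessian, a fixed step size in place of the step search, and hand-decayed gradient noise, purely to illustrate the exact-constraint / stochastic-objective setup):

```python
import random

def sqp_step(x, grad_est, step=0.5):
    """One SQP step for: minimize 0.5*||x||^2 subject to c(x) = x1 + x2 - 1 = 0.
    With an identity model Hessian, the KKT system
    [I a^T; a 0][d; lam] = [-g; -c] has the closed form below."""
    a = (1.0, 1.0)                      # constraint gradient (exact)
    c = x[0] + x[1] - 1.0               # constraint value (exact)
    ag = a[0] * grad_est[0] + a[1] * grad_est[1]
    lam = (c - ag) / (a[0] ** 2 + a[1] ** 2)
    d = [-grad_est[i] - lam * a[i] for i in range(2)]
    return [x[i] + step * d[i] for i in range(2)]

random.seed(2)
x = [3.0, -2.0]
for t in range(200):
    noise_scale = 0.1 / (t + 1)         # decaying oracle noise
    grad_est = [x[0] + random.gauss(0, noise_scale),
                x[1] + random.gauss(0, noise_scale)]
    x = sqp_step(x, grad_est)
# x approaches the constrained minimizer (0.5, 0.5) despite the noisy
# objective gradients, because the constraint information is exact.
```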
The success of neural networks builds to a large extent on their ability to create internal knowledge representations from real-world high-dimensional data, such as images, sound, or text. Extracting and presenting these representations, in order to explain a neural network's decisions, is an active and multifaceted research field. To gain a deeper understanding of a central aspect of this field, we have performed a targeted review focusing on research that aims to associate internal representations with human-understandable concepts. In doing this, we added a perspective on the existing research by using primarily deductive nomological explanations as a proposed taxonomy. We find this taxonomy, together with theories of causality, useful for understanding what can and cannot be expected from neural network explanations. The analysis additionally uncovers an ambiguity in the reviewed literature related to the goal of model explainability: is it understanding the ML model, or is it actionable explanations useful in the deployment domain?
Many problems in machine learning involve bilevel optimization (BLO), including hyperparameter optimization, meta-learning, and dataset distillation. Bilevel problems consist of two nested sub-problems, called the outer and inner problems, respectively. In practice, often at least one of these sub-problems is overparameterized. In this case, there are many ways to choose among optima that achieve equivalent objective values. Inspired by recent studies of the implicit bias induced by optimization algorithms in single-level optimization, we investigate the implicit bias of gradient-based algorithms for bilevel optimization. We delineate two standard BLO methods -- cold-start and warm-start -- and show that the converged solution or long-run behavior depends to a large degree on these and other algorithmic choices, such as the hypergradient approximation. We also show that the inner solutions obtained by warm-start BLO can encode a surprising amount of information about the outer objective, even when the outer parameters are low-dimensional. We believe that implicit bias deserves as central a role in the study of bilevel optimization as it has attained in the study of single-level neural net optimization.
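The cold-start/warm-start distinction can be reproduced on a toy quadratic bilevel problem (all constants, the truncation length K, and the truncated-unroll hypergradient are illustrative assumptions, not an example from the paper): both variants drive the outer loss to zero, yet they converge to different outer parameters.

```python
ETA, K = 0.1, 5             # inner learning rate and truncated inner steps
SIGMA = (1 - 2 * ETA) ** K  # contraction factor of the K-step inner GD map
RHO = 1 - SIGMA             # sensitivity of the K-step inner solution to u

def inner_steps(w, u):
    """K gradient steps on the inner loss g(w, u) = (w - u)^2."""
    for _ in range(K):
        w -= ETA * 2 * (w - u)
    return w

def run(warm_start, outer_iters=200, beta=0.05):
    u, w = 0.0, 0.0
    for _ in range(outer_iters):
        if not warm_start:
            w = 0.0             # cold start: re-initialize the inner problem
        w = inner_steps(w, u)
        # Hypergradient of the outer loss f(w) = (w - 3)^2 through the last
        # unrolled inner run only (ignoring w's dependence on earlier u's).
        u -= beta * 2 * (w - 3.0) * RHO
    return u, w

u_cold, w_cold = run(warm_start=False)
u_warm, w_warm = run(warm_start=True)
# Both reach w ~ 3 (outer loss ~ 0), but the converged outer parameter
# differs: cold start is biased to u ~ 3 / RHO by the truncated unroll,
# while warm start accumulates inner progress and settles at u ~ 3.
```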
An expansion of aberrant brain cells is referred to as a brain tumor. The brain's architecture is extremely intricate, with several regions controlling various nervous system processes. A brain tumor can develop in any portion of the brain or skull, including the brain's protective coating, the base of the skull, the brainstem, the sinuses, the nasal cavity, and many other places. Over the past ten years, numerous developments have been made in the field of computer-aided brain tumor diagnosis. Recently, instance segmentation has attracted a lot of interest in numerous computer vision applications. It seeks to assign distinct IDs to distinct scene objects, even if they are members of the same class. Typically, instance segmentation is performed with a two-stage pipeline. This study demonstrates brain cancer segmentation using YOLOv5, which takes its dataset as images with corresponding annotation text files. You Only Look Once (YOLO) is a popular, widely used algorithm well known for its object recognition capabilities; YOLOv2, v3, v4, and v5 are among the latest versions that experts have published in recent years. Early brain tumor detection is one of the most important jobs of neurologists and radiologists, yet manually identifying and segmenting brain tumors from Magnetic Resonance Imaging (MRI) data can be difficult and error-prone. An automated brain tumor detection system is therefore necessary for making an early diagnosis of the condition. The model in this paper covers three classes: Meningioma, Pituitary, and Glioma. The results show that our model achieves competitive accuracy and runtime on an M2 10-core GPU.
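The annotation text files YOLO consumes follow a standard per-line layout; a small parser (the image size, helper name, and sample line are hypothetical, but the normalized `class x_center y_center width height` format is YOLO's convention) shows how a picture is paired with its text file:

```python
def parse_yolo_label(line, img_w, img_h, class_names):
    """Parse one line of a YOLO annotation text file:
    '<class_id> <x_center> <y_center> <width> <height>', all box values
    normalized to [0, 1], and convert it to a pixel-space bounding box."""
    cls, xc, yc, w, h = line.split()
    xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
    x_min = (xc - w / 2) * img_w
    y_min = (yc - h / 2) * img_h
    return {
        "class": class_names[int(cls)],
        "bbox": (x_min, y_min, x_min + w * img_w, y_min + h * img_h),
    }

classes = ["meningioma", "pituitary", "glioma"]
# Hypothetical annotation line for a 640x640 MRI slice.
box = parse_yolo_label("2 0.5 0.5 0.25 0.25", 640, 640, classes)
```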
Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation, but the quality bar for medical and clinical applications is high. Today, attempts to assess models' clinical knowledge typically rely on automated evaluations on limited benchmarks. There is no standard to evaluate model predictions and reasoning across a breadth of tasks. To address this, we present MultiMedQA, a benchmark combining six existing open question answering datasets spanning professional medical exams, research, and consumer queries; and HealthSearchQA, a new free-response dataset of medical questions searched online. We propose a framework for human evaluation of model answers along multiple axes including factuality, precision, possible harm, and bias. In addition, we evaluate PaLM (a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM, on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA, MedMCQA, PubMedQA, MMLU clinical topics), including 67.6% accuracy on MedQA (US Medical License Exam questions), surpassing prior state-of-the-art by over 17%. However, human evaluation reveals key gaps in Flan-PaLM responses. To resolve this we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, recall of knowledge, and medical reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal important limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLM models for clinical applications.
Optimal Power Flow (OPF) is a very traditional research area within the power systems field that seeks the optimal operating point of electric power plants and needs to be solved every few minutes in real-world scenarios. However, due to the non-convexities that arise in power generation systems, there is not yet a fast, robust solution technique for the full Alternating Current Optimal Power Flow (ACOPF). In the last decades, power grids have evolved into a typical dynamic, non-linear, and large-scale control system, known as the power system, so searching for better and faster ACOPF solutions is becoming crucial. The appearance of Graph Neural Networks (GNNs) has allowed the natural application of Machine Learning (ML) algorithms to graph data, such as power networks. On the other hand, Deep Reinforcement Learning (DRL) is known for its powerful capability to solve complex decision-making problems. Although solutions that use these two methods separately are beginning to appear in the literature, none has yet combined the advantages of both. We propose a novel architecture based on the Proximal Policy Optimization algorithm with Graph Neural Networks to solve the Optimal Power Flow. The objective is to design an architecture that learns how to solve the optimization problem and that is, at the same time, able to generalize to unseen scenarios. We compare our solution with the DCOPF in terms of cost, after having trained our DRL agent on the IEEE 30-bus system and then computing the OPF on that base network with topology changes.
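At the core of Proximal Policy Optimization, on which the proposed architecture builds, is the clipped surrogate objective; a minimal standalone sketch (just the per-action objective, not the paper's GNN-based agent):

```python
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate objective for one action:
    min(r * A, clip(r, 1 - eps, 1 + eps) * A).
    The clip removes any incentive to push the new-to-old policy
    probability ratio r outside [1 - eps, 1 + eps], keeping updates
    'proximal' to the current policy."""
    clipped = max(1 - eps, min(1 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)
```

For example, with a positive advantage the objective stops growing once the ratio exceeds 1 + eps, and with a negative advantage the pessimistic `min` keeps the clipped penalty.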