Levenberg-Marquardt(LM)优化算法已广泛用于解决机器学习问题。文学评论表明,当网络中的权重数不超过几百个时,LM对中等函数近似问题的LM非常强大而有效。相比之下,在处理模式识别或分类问题时,LM似乎并不表现,并且当网络变大时效率低(例如,超过500重量)。在本文中,我们利用一些现实世界飞机数据集利用LM算法的真正力量。在这些数据集上,大多数其他常用的优化器无法检测到飞机发动机的变化条件引起的异常。数据集的具有挑战性是时间序列数据的突然变化。我们发现LM优化器具有更好的近似突然变化的能力,并检测除其他优化器的异常。我们比较LM和几个其他优化器的这种异常/更改检测问题的性能。我们基于一系列措施评估了相对性能,包括网络复杂性(即权重的数量),拟合精度,拟合,培训时间,GPU和内存要求等的使用等措施。我们还讨论了Matlab中强大的LM实现问题Tensorflow用于推广LM算法的更多流行使用以及LM优化器的潜在使用进行大规模问题。
translated by 谷歌翻译
Machine Translation (MT) system generally aims at automatic representation of source language into target language retaining the originality of context using various Natural Language Processing (NLP) techniques. Among various NLP methods, Statistical Machine Translation(SMT). SMT uses probabilistic and statistical techniques to analyze information and conversion. This paper canvasses about the development of bilingual SMT models for translating English to fifteen low-resource Indian Languages (ILs) and vice versa. At the outset, all 15 languages are briefed with a short description related to our experimental need. Further, a detailed analysis of Samanantar and OPUS dataset for model building, along with standard benchmark dataset (Flores-200) for fine-tuning and testing, is done as a part of our experiment. Different preprocessing approaches are proposed in this paper to handle the noise of the dataset. To create the system, MOSES open-source SMT toolkit is explored. Distance reordering is utilized with the aim to understand the rules of grammar and context-dependent adjustments through a phrase reordering categorization framework. In our experiment, the quality of the translation is evaluated using standard metrics such as BLEU, METEOR, and RIBES
translated by 谷歌翻译
Because of their close relationship with humans, non-human apes (chimpanzees, bonobos, gorillas, orangutans, and gibbons, including siamangs) are of great scientific interest. The goal of understanding their complex behavior would be greatly advanced by the ability to perform video-based pose tracking. Tracking, however, requires high-quality annotated datasets of ape photographs. Here we present OpenApePose, a new public dataset of 71,868 photographs, annotated with 16 body landmarks, of six ape species in naturalistic contexts. We show that a standard deep net (HRNet-W48) trained on ape photos can reliably track out-of-sample ape photos better than networks trained on monkeys (specifically, the OpenMonkeyPose dataset) and on humans (COCO) can. This trained network can track apes almost as well as the other networks can track their respective taxa, and models trained without one of the six ape species can track the held out species better than the monkey and human models can. Ultimately, the results of our analyses highlight the importance of large specialized databases for animal tracking systems and confirm the utility of our new ape database.
translated by 谷歌翻译
Neural network-based approaches for solving partial differential equations (PDEs) have recently received special attention. However, the large majority of neural PDE solvers only apply to rectilinear domains, and do not systematically address the imposition of Dirichlet/Neumann boundary conditions over irregular domain boundaries. In this paper, we present a framework to neurally solve partial differential equations over domains with irregularly shaped (non-rectilinear) geometric boundaries. Our network takes in the shape of the domain as an input (represented using an unstructured point cloud, or any other parametric representation such as Non-Uniform Rational B-Splines) and is able to generalize to novel (unseen) irregular domains; the key technical ingredient to realizing this model is a novel approach for identifying the interior and exterior of the computational grid in a differentiable manner. We also perform a careful error analysis which reveals theoretical insights into several sources of error incurred in the model-building process. Finally, we showcase a wide variety of applications, along with favorable comparisons with ground truth solutions.
translated by 谷歌翻译
机器翻译系统(MTS)是通过将文本或语音从一种语言转换为另一种语言的有效工具。在像印度这样的大型多语言环境中,对有效的翻译系统的需求变得显而易见,英语和一套印度语言(ILS)正式使用。与英语相反,由于语料库的不可用,IL仍然被视为低资源语言。为了解决不对称性质,多语言神经机器翻译(MNMT)系统会发展为在这个方向上的理想方法。在本文中,我们提出了一个MNMT系统,以解决与低资源语言翻译有关的问题。我们的模型包括两个MNMT系统,即用于英语印度(一对多),另一个用于指示英语(多一对多),其中包含15个语言对(30个翻译说明)的共享编码器码头。由于大多数IL对具有很少的平行语料库,因此不足以训练任何机器翻译模型。我们探索各种增强策略,以通过建议的模型提高整体翻译质量。最先进的变压器体系结构用于实现所提出的模型。大量数据的试验揭示了其优越性比常规模型的优势。此外,本文解决了语言关系的使用(在方言,脚本等方面),尤其是关于同一家族的高资源语言在提高低资源语言表现方面的作用。此外,实验结果还表明了ILS的倒退和域适应性的优势,以提高源和目标语言的翻译质量。使用所有这些关键方法,我们提出的模型在评估指标方面比基线模型更有效,即一组ILS的BLEU(双语评估研究)得分。
translated by 谷歌翻译
知识图(kg)以其大规模和知识推断能力而闻名,但也因与之相关的不完整而臭名昭著。由于关系长尾分布在公斤中的长尾分布,因此很少有人提出完成kg的完成,以减轻不完整和扩大kg的覆盖范围。它旨在对涉及新关系的三胞胎进行预测,当时仅提供少量培训三胞胎作为参考。以前的方法主要集中在设计本地邻居聚合器以学习实体级信息和/或在三胞胎级别实现顺序依赖性假设以学习元关系信息。但是,对于学习几乎没有射击关系的元表示,很大程度上忽略了宝贵的成对三重级交互和上下文级别的关系信息。在本文中,我们提出了一种分层的关系学习方法(雇用),以完成几次kg完成。通过共同捕获三个级别的关系信息(实体级别,三胞胎级别和上下文级别),雇用可以有效地学习和完善几乎没有射击关系的元表示,因此可以很好地推广到新的看不见的关系。在两个基准数据集上进行的广泛实验验证了雇用与其他最先进方法的优势。
translated by 谷歌翻译
Large pretrained language models generate fluent text but are notoriously hard to controllably sample from. In this work, we study constrained sampling from such language models: generating text that satisfies user-defined constraints, while maintaining fluency and the model's performance in a downstream task. We propose MuCoLa -- a sampling procedure that combines the log-likelihood of the language model with arbitrary (differentiable) constraints in a single energy function, and then generates samples in a non-autoregressive manner. Specifically, it initializes the entire output sequence with noise and follows a Markov chain defined by Langevin Dynamics using the gradients of the energy function. We evaluate MuCoLa on text generation with soft and hard constraints as well as their combinations obtaining significant improvements over competitive baselines for toxicity avoidance, sentiment control, and keyword-guided generation.
translated by 谷歌翻译
在RL的许多实际应用中,观察来自环境的状态过渡是昂贵的。例如,在核聚变的等离子体控制问题中,计算给定的状态对对的下一个状态需要查询昂贵的过渡功能,这可以导致许多小时的计算机模拟或美元科学研究。这种昂贵的数据收集禁止应用标准RL算法,该算法通常需要大量观察来学习。在这项工作中,我们解决了有效地学习策略的问题,同时为转换函数进行最小数量的状态动作查询。特别是,我们利用贝叶斯最优实验设计的想法,以指导选择国家行动查询以获得高效学习。我们提出了一种采集功能,该函数量化了状态动作对将提供多少信息对Markov决策过程提供的最佳解决方案。在每次迭代时,我们的算法最大限度地提高了该采集功能,选择要查询的最具信息性的状态动作对,从而产生数据有效的RL方法。我们试验各种模拟的连续控制问题,并显示我们的方法学习最佳政策,最高$ 5 $ - $ 1,000 \倍的数据,而不是基于模型的RL基线,10 ^ 3美元 - $ 10 ^ 5 \ times比无模型RL基线更少的数据。我们还提供了几种消融比较,这指出了从获得数据的原理方法产生的大量改进。
translated by 谷歌翻译
医疗AI通过支持基于证据的医学实践,个性化患者治疗,降低成本以及改善提供者和患者体验,推进医疗保健的巨大潜力。我们认为解锁此潜力需要一种系统的方法来衡量在大规模异构数据上的医疗AI模型的性能。为了满足这种需求,我们正在建立Medperf,这是一个开放的框架,用于在医疗领域的基准测试机器学习。 Medperf将使联合评估能够将模型安全地分配给不同的评估设施,从而赋予医疗组织在高效和人类监督过程中评估和验证AI模型的性能,同时优先考虑隐私。我们描述了当前的挑战医疗保健和AI社区面临,需要开放平台,Medperf的设计理念,其目前的实施状态和我们的路线图。我们呼吁研究人员和组织加入我们创建Medperf开放基准平台。
translated by 谷歌翻译
Our method performs local semantic editing on GAN output images, transferring the appearance of a specific object part from a reference image to a target image.
translated by 谷歌翻译