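For real-world speech recognition applications, noise robustness remains a challenge. In this work, we adopt the teacher-student (T/S) learning technique, using a parallel clean and noisy corpus, to improve automatic speech recognition (ASR) performance under multimedia noise. On top of that, we apply a logits selection method that preserves only the k highest values, both to prevent wrong emphasis of knowledge from the teacher and to reduce the bandwidth needed for transferring data. We incorporate up to 8000 hours of untranscribed data for training and present results on sequence-trained models in addition to cross-entropy-trained ones. Compared to a sequence-trained teacher, the best sequence-trained student model yields relative word error rate (WER) reductions of approximately 10.1%, 28.7% and 19.6% on our clean, simulated noisy and real test sets, respectively.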
translated by 谷歌翻译
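The logits selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`top_k_soft_targets`, `ts_loss`) are hypothetical, and renormalising the kept logits with a softmax is one plausible reading of "preserves only the k highest values".

```python
import math

def top_k_soft_targets(teacher_logits, k):
    """Keep the k largest teacher logits and renormalise them with a softmax.

    Returns {class_index: probability}; every other class gets zero mass,
    so only k (index, value) pairs need to be stored or transferred.
    """
    order = sorted(range(len(teacher_logits)),
                   key=lambda i: teacher_logits[i], reverse=True)
    top = order[:k]
    m = max(teacher_logits[i] for i in top)          # for numerical stability
    exps = {i: math.exp(teacher_logits[i] - m) for i in top}
    z = sum(exps.values())
    return {i: e / z for i, e in exps.items()}

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def ts_loss(student_logits, teacher_logits, k):
    """Cross-entropy between the sparse teacher targets and the student posterior.

    In the parallel-data setting the teacher would see the clean frame and the
    student the corresponding noisy frame (assumption for this sketch).
    """
    targets = top_k_soft_targets(teacher_logits, k)
    logp = log_softmax(student_logits)
    return -sum(p * logp[i] for i, p in targets.items())
```

With `k` much smaller than the number of output classes, each frame's target shrinks from a dense vector to `k` index-value pairs, which is where the bandwidth saving would come from.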
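Multiple hypothesis testing, the setting in which we wish to consider many hypotheses at once, is a core problem in statistical inference that arises throughout the sciences. In this setting, controlling the false discovery rate (FDR), i.e. the expected proportion of type I errors, is an important challenge for making meaningful inferences. In this paper, we consider the problem of controlling FDR in an online manner. Concretely, we consider an ordered, possibly infinite, sequence of hypotheses arriving one per time step; for each hypothesis we observe a p-value along with a set of features specific to that hypothesis. The decision whether to reject the current hypothesis must be made immediately at each time step, before the next hypothesis is observed. The multi-dimensional feature set provides a general way of leveraging auxiliary information in the data, which helps maximize the number of discoveries. We propose a new class of powerful online testing procedures in which the rejection thresholds (significance levels) are learned sequentially by incorporating contextual information and previous results. We prove that any rule in this class controls online FDR under some standard assumptions. We then focus on a subclass of these procedures, based on weighting significance levels, to derive a practical algorithm that learns a parametric weight function online in order to gain more discoveries. We also prove theoretically that our proposed procedures increase the achieved statistical power over a popular online testing procedure proposed by Javanmard & Montanari (2018). Finally, we demonstrate the favorable performance of our procedure by comparing it to state-of-the-art online multiple testing procedures on both synthetic data and real data from different applications.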
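To make the online setting concrete, here is a minimal sketch of a context-free baseline in the spirit of the LOND rule from the online FDR literature. It is not the contextual procedure proposed in the abstract: it ignores features entirely, and the choice of the sequence `beta_i` (proportional to `1 / i**2`) is an assumption made for illustration.

```python
import math

def lond(p_values, alpha=0.05):
    """Online testing: decide on each hypothesis before seeing the next one.

    Hypothesis i is tested at level beta_i * (discoveries so far + 1), where
    the beta_i are fixed in advance and sum to alpha over i = 1, 2, ...
    """
    decisions = []
    discoveries = 0
    for i, p in enumerate(p_values, start=1):
        # sum over i >= 1 of 1 / i^2 equals pi^2 / 6, so the beta_i sum to alpha
        beta_i = alpha * 6.0 / (math.pi ** 2 * i ** 2)
        level = beta_i * (discoveries + 1)
        reject = p <= level
        decisions.append(reject)
        discoveries += int(reject)
    return decisions
```

Each earlier discovery raises the level used for later hypotheses: `lond([0.001, 0.012])` rejects both, whereas `lond([0.5, 0.012])` rejects neither. A contextual procedure would additionally let the level depend on the features observed with each hypothesis.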
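A Kannada OCR named Lipi Gnani has been designed and developed from scratch, with the motivation of being able to convert printed text or poetry in Kannada script without any restriction on vocabulary. The training and test sets were collected from 35 books published between 1970 and 2002, including books written in Halegannada and books containing Sanskrit slokas written in Kannada script. The coverage of the OCR is nearly complete in that it recognizes all punctuation marks, special symbols, Indo-Arabic and Kannada numerals, and interspersed English words. Several minor and major original contributions have been made at the different processing stages, such as binarization, line and character segmentation, recognition and Unicode mapping. As the results show, this has produced a Kannada OCR that performs as well as, and in some cases better than, Google's Tesseract OCR. To the knowledge of the authors, this is the first report of a complete Kannada OCR that handles all the issues involved. Currently there is no dictionary-based postprocessing, and the reported results are due solely to the recognition process. Four benchmark test databases have been created, containing scanned pages from books in the Kannada, Sanskrit, Konkani and Tulu languages, all printed in Kannada script. The word-level recognition accuracy of Lipi Gnani is 4% higher than that of Google's Tesseract OCR on the Kannada dataset, 8% higher on the Tulu and Sanskrit datasets, and 25% higher on the Konkani dataset.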
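This paper proposes a language-independent deep learning architecture adapted to the task of multiword expression (MWE) identification. We employ a neural architecture comprising convolutional and recurrent layers, with an optional CRF layer on top. Because it uses pre-trained Wikipedia word embeddings, the system participated in the open track of the Parseme shared task on automatic identification of verbal MWEs. It outperformed all participating systems in both the open and closed tracks, with an overall MWE-based score of 58.09 averaged across all languages. A particular strength of the system is its superior performance on unseen data entries.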
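We study the problem of subsampling in differential privacy (DP), a question at the core of many successful differentially private machine learning algorithms. Specifically, we provide a tight upper bound on the Rényi differential privacy (RDP) (Mironov, 2017) parameters for algorithms that (1) subsample the dataset and then (2) apply a randomized mechanism M to the subsample, in terms of the RDP parameters of M and the subsampling probability parameter. This result generalizes the classic subsampling-based "privacy amplification" property of $(\epsilon, \delta)$-differential privacy, which applies to only one fixed pair $(\epsilon, \delta)$, to a stronger version that exploits properties of each specific randomized algorithm and satisfies an entire family of $(\epsilon(\delta), \delta)$-differential privacy guarantees for all $\delta \in [0, 1]$. Our experiments confirm the advantage of using our techniques over tracking $(\epsilon, \delta)$ directly, especially in settings where we need to compose many rounds of data access.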
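For reference, the classic "privacy amplification" property that the abstract generalizes can be written as a short function. This is a sketch of the commonly stated $(\epsilon, \delta)$ form only, not the RDP bound derived in the paper; the function name `amplified_dp` is hypothetical, and the exact statement depends on the subsampling scheme and the neighbouring relation, which are left as assumptions here.

```python
import math

def amplified_dp(eps, delta, gamma):
    """Classic amplification by subsampling for a single (eps, delta) pair.

    If a mechanism M is (eps, delta)-DP and it is run on a subsample that
    includes each record with probability gamma, the combined algorithm is
    commonly stated to be (log(1 + gamma * (e^eps - 1)), gamma * delta)-DP.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must be a probability in [0, 1]")
    # log1p and expm1 keep precision when eps or gamma is small
    return math.log1p(gamma * math.expm1(eps)), gamma * delta
```

For small `eps` the amplified parameter is roughly `gamma * eps`. The limitation noted in the abstract is visible in the signature: the function maps one fixed `(eps, delta)` pair to another and uses nothing else about the mechanism.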
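Sketch-based image retrieval (SBIR) is the task of retrieving images from a natural image database that correspond to a given hand-drawn sketch. Ideally, an SBIR model should learn to associate components in the sketch (e.g. feet, tail) with the corresponding components in images that have similar characteristics. However, current evaluation methods focus only on coarse-grained evaluation, where the goal is to retrieve images that belong to the same class as the sketch but do not necessarily share the characteristics of the sketch. As a result, existing methods simply learn to associate sketches with classes seen during training and hence fail to generalize to unseen classes. In this paper, we propose a new benchmark for zero-shot SBIR in which the model is evaluated on novel classes that are not seen during training. We show through extensive experiments that existing SBIR models trained in a discriminative setting learn only class-specific mappings and fail to generalize to the proposed zero-shot setting. To address this, we propose a generative approach to the SBIR task, based on deep conditional generative models that take the sketch as input and fill in the missing information stochastically. Experiments on this new benchmark, created from the "Sketchy" dataset, a large-scale database of sketch-photo pairs, demonstrate that these generative models perform significantly better than several state-of-the-art approaches in the proposed zero-shot framework for the coarse-grained SBIR task.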
This paper describes a novel approach in human-robot interaction driven by ergonomics. With a clear focus on optimising ergonomics, the approach proposed here continuously observes a human user's posture and by invoking appropriate cooperative robot movements, the user's posture is, whenever required, brought back to an ergonomic optimum. Effectively, the new protocol optimises the human-robot relative position and orientation as a function of human ergonomics. An RGB-D camera is used to calculate and monitor human joint angles in real-time and to determine the current ergonomics state. A total of 6 main causes of low ergonomic states are identified, leading to 6 universal robot responses to allow the human to return to an optimal ergonomics state. The algorithmic framework identifies these 6 causes and controls the cooperating robot to always adapt the environment (e.g. change the pose of the workpiece) in a way that is ergonomically most comfortable for the interacting user. Hence, human-robot interaction is continuously re-evaluated optimizing ergonomics states. The approach is validated through an experimental study, based on established ergonomic methods and their adaptation for real-time application. The study confirms improved ergonomics using the new approach.
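The joint-angle monitoring step can be illustrated with a small geometric helper. This is a hedged sketch, not the paper's pipeline: it assumes the RGB-D camera's skeleton tracker already yields 3D joint positions, and the function name `joint_angle` and the example coordinates are hypothetical.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at joint b, formed by the 3D points a, b and c.

    For example, a = shoulder, b = elbow, c = wrist gives the elbow angle.
    """
    u = [a[i] - b[i] for i in range(3)]   # vector from the joint to a
    v = [c[i] - b[i] for i in range(3)]   # vector from the joint to c
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("joint coincides with a neighbouring point")
    # clamp to [-1, 1] to guard against rounding just outside acos's domain
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))
```

Angles computed this way per frame could then be scored against the thresholds of an established ergonomic assessment method to decide whether the robot should adjust the workpiece pose.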
Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations. It relies on querying a distribution over functions defined by a relatively cheap surrogate model. An accurate model for this distribution over functions is critical to the effectiveness of the approach, and is typically fit using Gaussian processes (GPs). However, since GPs scale cubically with the number of observations, it has been challenging to handle objectives whose optimization requires many evaluations, and as such, massively parallelizing the optimization. In this work, we explore the use of neural networks as an alternative to GPs to model distributions over functions. We show that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly with the number of data rather than cubically. This allows us to achieve a previously intractable degree of parallelism, which we apply to large scale hyperparameter optimization, rapidly finding competitive models on benchmark object recognition tasks using convolutional networks, and image caption generation using neural language models.
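The linear scaling claim can be illustrated with Bayesian linear regression on a fixed set of basis functions. This is a minimal sketch under assumptions, not the paper's method: the tanh features below are hypothetical stand-ins for the last hidden layer of a trained network (which the paper learns adaptively), and `alpha` (prior precision) and `beta` (noise precision) are fixed by hand.

```python
import math

def features(x, weights):
    """Hypothetical basis functions: a bias term plus fixed tanh units."""
    return [1.0] + [math.tanh(w * x + b) for w, b in weights]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fit(xs, ys, weights, alpha=1e-3, beta=100.0):
    """Posterior precision A = alpha*I + beta*Phi^T Phi and posterior mean."""
    d = len(weights) + 1
    A = [[alpha if i == j else 0.0 for j in range(d)] for i in range(d)]
    rhs = [0.0] * d
    for x, y in zip(xs, ys):      # one pass: O(N * d^2), linear in N
        phi = features(x, weights)
        for i in range(d):
            rhs[i] += beta * phi[i] * y
            for j in range(d):
                A[i][j] += beta * phi[i] * phi[j]
    mean = solve(A, rhs)          # O(d^3), independent of N
    return A, mean

def predict(x, A, mean, weights, beta=100.0):
    """Predictive mean and variance at x."""
    phi = features(x, weights)
    mu = sum(m * p for m, p in zip(mean, phi))
    var = sum(p * q for p, q in zip(phi, solve(A, phi))) + 1.0 / beta
    return mu, var
```

The only cubic cost is in the number of basis functions `d`, not in the number of observations `N`, in contrast to exact GP inference, whose cost grows cubically with `N`.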
Learning problems form an important category of computational tasks that generalizes many of the computations researchers apply to large real-life data sets. We ask: what concept classes can be learned privately, namely, by an algorithm whose output does not depend too heavily on any one input or specific training example? More precisely, we investigate learning algorithms that satisfy differential privacy, a notion that provides strong confidentiality guarantees in contexts where aggregate information is released about a database containing sensitive information about individuals. We demonstrate that, ignoring computational constraints, it is possible to privately agnostically learn any concept class using a sample size approximately logarithmic in the cardinality of the concept class. Therefore, almost anything learnable is learnable privately: specifically, if a concept class is learnable by a (non-private) algorithm with polynomial sample complexity and output size, then it can be learned privately using a polynomial number of samples. We also present a computationally efficient private PAC learner for the class of parity functions. Local (or randomized response) algorithms are a practical class of private algorithms that have received extensive investigation. We provide a precise characterization of local private learning algorithms. We show that a concept class is learnable by a local algorithm if and only if it is learnable in the statistical query (SQ) model. Finally, we present a separation between the power of interactive and noninteractive local learning algorithms.
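As an illustration of the local (randomized response) model mentioned above, the sketch below shows the textbook binary randomized response mechanism and its debiased mean estimate. It is not one of the paper's learning algorithms; the function names are hypothetical and `eps` is assumed to be strictly positive.

```python
import math
import random

def truth_probability(eps):
    """Probability of reporting the true bit under eps-DP randomized response."""
    return math.exp(eps) / (1.0 + math.exp(eps))

def randomize(bit, eps, rng):
    """Each individual perturbs their own bit locally before releasing it."""
    return bit if rng.random() < truth_probability(eps) else 1 - bit

def estimate_mean(reports, eps):
    """Debias the observed frequency of ones.

    E[report] = (1 - p) + mu * (2p - 1), where p is the truth probability
    and mu is the true fraction of ones, so we invert that affine map.
    """
    p = truth_probability(eps)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)
```

A usage example: with `rng = random.Random(0)` and `eps = math.log(3)`, each person reports truthfully with probability 0.75, and `estimate_mean([randomize(b, eps, rng) for b in bits], eps)` approximates the true fraction of ones in `bits` as the population grows. The estimator only needs an aggregate of noisy reports, which is the kind of access a statistical query provides.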