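Approximate K-Nearest Neighbor Search (AKNNS) has become ubiquitous in modern applications, for example as a fast search procedure for two-tower deep learning models. Graph-based AKNNS methods in particular have received great attention because of their superior performance. These methods rely on greedy graph search to traverse the vectors in a database. Under this greedy search scheme, we make a key observation: many distance computations do not influence search updates, so these computations can be approximated without hurting performance. As a result, we propose FINGER, a fast inference method for efficient graph search. FINGER approximates the distance function by estimating the angles between neighboring residual vectors with low-rank bases and distribution matching. The approximated distance can be used to bypass unnecessary computations, which leads to faster searches. Empirically, accelerating a popular graph-based method named HNSW with FINGER is shown to outperform existing graph-based methods by 20%-60% across different benchmark datasets.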
translated by Google Translate
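The greedy graph search described in the abstract above can be sketched as follows. This is a minimal illustration with exact distances, not the FINGER approximation itself; the function name `greedy_graph_search`, the adjacency-dict graph format, and the beam width `ef` are assumptions made for this example. The counter `n_wasted` tallies distance computations that did not change the result set, which is the kind of computation the abstract says can be approximated.

```python
import heapq
import math


def greedy_graph_search(graph, vectors, query, entry, ef=4):
    """Best-first search over a neighbor graph using exact distances.

    graph:   dict mapping node id -> list of neighbor ids (hypothetical format)
    vectors: sequence of points, indexed by node id
    Returns (ids of kept results, nearest first; number of distance
    computations; number of computations that did not update the results).
    """
    def dist(i):
        return math.dist(vectors[i], query)

    d0 = dist(entry)
    visited = {entry}
    candidates = [(d0, entry)]  # min-heap: closest unexpanded node first
    results = [(-d0, entry)]    # max-heap via negated distance, size <= ef
    n_dist, n_wasted = 1, 0

    while candidates:
        d, node = heapq.heappop(candidates)
        if len(results) >= ef and d > -results[0][0]:
            break  # closest candidate is farther than the worst kept result
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d_nb = dist(nb)
            n_dist += 1
            if len(results) < ef or d_nb < -results[0][0]:
                heapq.heappush(candidates, (d_nb, nb))
                heapq.heappush(results, (-d_nb, nb))
                if len(results) > ef:
                    heapq.heappop(results)  # drop the current worst result
            else:
                n_wasted += 1  # this distance changed nothing
    ordered = sorted((-neg_d, i) for neg_d, i in results)
    return [i for _, i in ordered], n_dist, n_wasted
```

On real neighbor graphs with high-degree nodes, `n_wasted` is typically a large share of `n_dist`; only those computations would be candidates for a cheaper approximate distance.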
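Data augmentation is popular in the training of large neural networks; currently, however, there is no clear theoretical comparison between the different algorithmic choices for how to use augmented data. In this paper, we take a step in this direction: we first present a simple and novel analysis of linear regression with label-invariant augmentations, which shows that data augmentation consistency (DAC) is intrinsically more efficient than empirical risk minimization on augmented data (DA-ERM). The analysis is then extended to misspecified augmentations (i.e., augmentations that change the labels), which again demonstrates the merit of DAC over DA-ERM. Further, we extend the analysis to non-linear models (e.g., neural networks) and present generalization bounds. Finally, we perform experiments that make a clean, apples-to-apples comparison between DAC and DA-ERM using CIFAR-100 and WideResNet; together these demonstrate the efficacy of DAC.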
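As a rough illustration of the two ways of using augmented data contrasted in the abstract above, the objectives might be written as follows. The notation is assumed for this sketch and is not taken from the paper: $x_{i,j}$ is the $j$-th of $\alpha$ augmentations of sample $x_i$ (with $x_{i,0} = x_i$), $\ell$ is a loss, $\phi_\theta$ is a feature map of the model $f_\theta$, $\varrho$ is a distance, and $\lambda > 0$ is a penalty weight.

$$\hat{\theta}_{\text{DA-ERM}} = \arg\min_{\theta} \frac{1}{N(1+\alpha)} \sum_{i=1}^{N} \sum_{j=0}^{\alpha} \ell\big(f_\theta(x_{i,j}),\, y_i\big)$$

$$\hat{\theta}_{\text{DAC}} = \arg\min_{\theta} \frac{1}{N} \sum_{i=1}^{N} \ell\big(f_\theta(x_i),\, y_i\big) + \lambda \sum_{i=1}^{N} \sum_{j=1}^{\alpha} \varrho\big(\phi_\theta(x_i),\, \phi_\theta(x_{i,j})\big)$$

In this form, DA-ERM treats every augmentation as an extra labeled sample, whereas DAC fits the labels on the original samples only and uses the augmentations to penalize inconsistent representations.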
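We propose two linear bandit algorithms with per-step complexity sublinear in the number of arms $K$. The algorithms are designed for applications where the arm set is extremely large and slowly changing. Our key realization is that choosing an arm reduces to a maximum inner product search (MIPS) problem, which can be solved approximately without breaking the regret guarantees. Existing approximate MIPS solvers run in sublinear time. We extend these solvers and provide theoretical guarantees for online learning problems, where adaptivity (i.e., a later step depends on the feedback from previous steps) becomes a unique challenge. We then explicitly characterize the tradeoff between per-step complexity and regret. For sufficiently large $K$, our algorithms have sublinear per-step complexity and $\tilde{O}(\sqrt{T})$ regret. Empirically, we evaluate the proposed algorithms in a synthetic environment and on a real-world movie recommendation problem. Compared to linear-time baselines, the proposed algorithms deliver a speedup of more than 72 times while retaining similar regret.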
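The reduction of arm selection to MIPS mentioned in the abstract above can be illustrated with the exact, linear-time version of the step. This is a sketch of the kind of baseline that scans all $K$ arms, not the paper's sublinear algorithm; the function name `select_arm` and the assumption that `theta` is the current (estimated or sampled) parameter vector are made up for this example.

```python
def select_arm(arms, theta):
    """Exact MIPS by linear scan: index of the arm maximizing <x_a, theta>.

    arms:  sequence of K feature vectors, each of dimension d
    theta: current parameter vector of dimension d (assumed given)
    Cost is O(K * d) per step; an approximate MIPS solver would replace this scan.
    """
    best, best_score = 0, float("-inf")
    for idx, x in enumerate(arms):
        score = sum(xi * ti for xi, ti in zip(x, theta))
        if score > best_score:
            best, best_score = idx, score
    return best
```

For example, with arms `[(1, 0), (0, 1), (0.6, 0.6)]` and `theta = (0.5, 1.0)` the inner products are 0.5, 1.0, and 0.9, so the second arm (index 1) is chosen.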
Verifying the robustness property of a general Rectified Linear Unit (ReLU) network is an NP-complete problem. Although finding the exact minimum adversarial distortion is hard, giving a certified lower bound of the minimum distortion is possible. Currently available methods of computing such a bound are either time-consuming or deliver low-quality bounds that are too loose to be useful. In this paper, we exploit the special structure of ReLU networks and provide two computationally efficient algorithms (Fast-Lin, Fast-Lip) that are able to certify non-trivial lower bounds of minimum adversarial distortions. Experiments show that (1) our methods deliver bounds close to (the gap is 2-3X) exact minimum distortions found by Reluplex in small networks while our algorithms are more than 10,000 times faster; (2) our methods deliver similar quality of bounds (the gap is within 35% and usually around 10%; sometimes our bounds are even better) for larger networks compared to the methods based on solving linear programming problems but our algorithms are 33-14,000 times faster; (3) our method is capable of solving large MNIST and CIFAR networks up to 7 layers with more than 10,000 neurons within tens of seconds on a single CPU core. In addition, we show that there is no polynomial time algorithm that can approximately find the minimum $\ell_1$ adversarial distortion of a ReLU network with a $0.99 \ln n$ approximation ratio unless NP=P, where $n$ is the number of neurons in the network.
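To make the notion of a certified lower bound concrete, the sketch below uses plain interval bound propagation, a simpler and looser technique than the Fast-Lin and Fast-Lip algorithms named in the abstract above. It checks whether every input within $\ell_\infty$ distance `eps` of `x` keeps the same predicted label; any `eps` that passes is a certified lower bound on the minimum $\ell_\infty$ distortion. The function names and the list-of-rows weight format are assumptions made for this example.

```python
def interval_bounds(layers, x, eps):
    """Propagate the box [x - eps, x + eps] through a ReLU network.

    layers: list of (W, b) pairs, W given as a list of rows (hypothetical format).
            ReLU is applied after every layer except the last.
    Returns elementwise (lower, upper) bounds on the network output.
    """
    lo = [xi - eps for xi in x]
    hi = [xi + eps for xi in x]
    for k, (W, b) in enumerate(layers):
        new_lo, new_hi = [], []
        for row, bias in zip(W, b):
            # A positive weight takes the matching end of the input interval,
            # a negative weight takes the opposite end.
            new_lo.append(bias + sum(w * (lo[j] if w >= 0 else hi[j])
                                     for j, w in enumerate(row)))
            new_hi.append(bias + sum(w * (hi[j] if w >= 0 else lo[j])
                                     for j, w in enumerate(row)))
        if k < len(layers) - 1:  # ReLU is monotone, so apply it to both ends
            new_lo = [max(0.0, v) for v in new_lo]
            new_hi = [max(0.0, v) for v in new_hi]
        lo, hi = new_lo, new_hi
    return lo, hi


def is_certified(layers, x, eps, label):
    """True if the logit of `label` provably exceeds all others on the whole box."""
    lo, hi = interval_bounds(layers, x, eps)
    return all(lo[label] > hi[j] for j in range(len(lo)) if j != label)
```

A `False` result does not mean an adversarial example exists within `eps`; it only means this loose bound could not rule one out, which is why tighter bounds are valuable.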
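Developing an effective automatic classifier to separate genuine sources from artifacts is essential for transient follow-up in wide-field optical surveys. Identifying transient detections among the subtraction artifacts left by the image differencing process is a key step in such classifiers, known as the real-bogus classification problem. We apply a self-supervised machine learning model, the deep-embedded self-organizing map (DESOM), to this "real-bogus" classification problem. DESOM combines an autoencoder and a self-organizing map to perform clustering in order to distinguish between real and bogus detections based on their dimensionality-reduced representations. We use 32x32 normalized detection thumbnails as the input to DESOM. We demonstrate different model training approaches and find that our best DESOM classifier shows a missed detection rate of 6.6% with a false positive rate of 1.5%. DESOM offers a more nuanced way to fine-tune the decision boundary for identifying likely real detections when used in combination with other types of classifiers, for example those built on neural networks or decision trees. We also discuss other potential uses of DESOM and its limitations.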
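In most applications of model-based Markov decision processes, the parameters of the unknown underlying model are estimated from empirical data. Due to noise, the policy learned from the estimated model is often far from the optimal policy of the underlying model. When applied to the environment of the underlying model, the learned policy results in suboptimal performance, which calls for solutions with better generalization performance. In this work, we take a Bayesian perspective and regularize the objective function of the Markov decision process with prior information in order to obtain more robust policies. Two approaches are proposed, one based on $L^1$ regularization and the other on relative entropy regularization. We evaluate the proposed algorithms on synthetic simulations and on real-world search logs of a large-scale online shopping store. Our results demonstrate the robustness of regularized MDP policies against the noise present in the models.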
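Communication systems to date have primarily aimed at reliably communicating bit sequences. Such an approach provides efficient engineering designs that are agnostic to the meaning of the messages or to the goal that the message exchange aims to achieve. Next-generation systems, however, can be enriched by folding message semantics and the goals of communication into their design. Further, these systems can be made cognizant of the context in which the communication exchange takes place, providing avenues for novel design insights. This tutorial summarizes the efforts to date, starting from early adaptations, on semantic-aware and task-oriented communications, covering the foundations, algorithms, and potential implementations. The focus is on approaches that use information theory to provide the foundations, as well as on the significant role of learning in semantic and task-aware communications.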
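The eXtreme Multi-label text Classification (XMC) problem concerns finding the most relevant labels for an input text instance from a large label set. However, the XMC setup faces two challenges: (1) it does not generalize to predicting unseen labels in dynamic environments, and (2) it requires a large number of supervised (instance, label) pairs, which can be difficult to obtain for emerging domains. Recently, the generalized zero-shot XMC (GZ-XMC) setup has been studied, and ZestXML was proposed accordingly to handle unseen labels, but it still requires a large number of annotated (instance, label) pairs. In this paper, we consider a more practical scenario called Extreme Zero-Shot XMC (EZ-XMC), in which no supervision is needed and only the raw text of instances and labels is accessible. Few-Shot XMC (FS-XMC), an extension of EZ-XMC with limited supervision, is also investigated. To learn semantic embeddings of instances and labels from raw text, we propose to pre-train Transformer-based encoders with self-supervised contrastive losses. Specifically, we develop a pre-training method, MACLR, which thoroughly leverages the raw text with techniques including multi-scale adaptive clustering, label regularization, and self-training with pseudo-positive pairs. Experimental results on four public EZ-XMC datasets demonstrate that MACLR achieves superior performance compared to all other leading baseline methods, in particular with an improvement of approximately 5-10% in precision and recall on average. Moreover, we show that our pre-trained encoder can be further improved on FS-XMC when a limited number of ground-truth positive pairs are available in training. By fine-tuning the encoder on such a few-shot subset, MACLR still significantly outperforms other extreme classifiers.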
Designing experiments often requires balancing between learning about the true treatment effects and earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can simultaneously attain optimality and computational efficiency goals, and it has recently been used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially distributed rewards. We report its performance in simulated 2-armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel GI-modified design shows operating characteristics that are comparable in learning (e.g. statistical power) but substantially better in earning (e.g. direct benefits). This illustrates the potential of designs that use a GI approach to allocate participants to improve participant benefits, increase efficiency, and reduce experimental costs in adaptive multi-armed experiments with exponential rewards.
Quadruped robots are currently used in industrial robotics as mechanical aids to automate several routine tasks. However, the use of such a robot in a domestic setting is still largely a research topic. This paper discusses the understanding and virtual simulation of such a robot capable of detecting and understanding human emotions, generating its gait, and responding via sounds and expressions on a screen. To this end, we use a combination of reinforcement learning and software engineering concepts to simulate a quadruped robot that can understand emotions, navigate through various terrains, detect sound sources, and respond to emotions using audio-visual feedback. This paper aims to establish a framework for simulating a quadruped robot that is emotionally intelligent and can primarily respond to audio-visual stimuli using motor or audio responses. Emotion detection from speech was not as performant as ERANNs or Zeta Policy learning, but still reached an accuracy of 63.5%. The video emotion detection system produced results that are almost on par with the state of the art, with an accuracy of 99.66%. Due to its "on-policy" learning process, the PPO algorithm learned extremely quickly, allowing the simulated dog to demonstrate a remarkably seamless gait across the different cadences and variations. This enabled the quadruped robot to respond to generated stimuli, allowing us to conclude that it functions as predicted and satisfies the aim of this work.