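It is well known that deep neural networks have strong fitting capability and can easily reach a low training error even with randomly assigned class labels. When the number of training samples is small, or the class labels are noisy, networks tend to memorize patterns specific to individual instances in order to minimize the training error. This leads to overfitting and poor generalization performance. This paper explores a remedy: suppressing the network's tendency to rely on instance-specific patterns when minimizing the empirical error. The proposed method is based on an adversarial training framework. It suppresses features that can be used to identify individual instances among the samples within each class. As a result, the classifier uses only features that are both discriminative across classes and common within each class. We call our method Adversarial Suppression of Identity Features (ASIF), and we demonstrate its usefulness in boosting generalization accuracy on small datasets and on datasets with noisy labels. Our source code is available.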
translated by Google Translate
Recent advances in upper limb prostheses have led to significant improvements in the number of movements provided by the robotic limb. However, the method for controlling multiple degrees of freedom via user-generated signals remains challenging. To address this issue, various machine learning controllers have been developed to better predict movement intent. As these controllers become more intelligent and take on more autonomy in the system, the traditional approach of representing the human-machine interface as a human controlling a tool becomes limiting. One possible approach to improve the understanding of these interfaces is to model them as collaborative, multi-agent systems through the lens of joint action. The field of joint action has been commonly applied to two human partners who are trying to work jointly together to achieve a task, such as singing or moving a table together, by effecting coordinated change in their shared environment. In this work, we compare different prosthesis controllers (proportional electromyography with sequential switching, pattern recognition, and adaptive switching) in terms of how they present the hallmarks of joint action. The results of the comparison lead to a new perspective for understanding how existing myoelectric systems relate to each other, along with recommendations for how to improve these systems by increasing the collaborative communication between each partner.
In atomistic simulations of solids, the ability to classify crystal phases and lattice defects in the presence of thermal fluctuations is essential for gaining deeper insights into the simulated dynamics. The need for accurate and efficient characterization methods is especially acute in presently emerging large-scale simulations of multi-phase systems far from equilibrium. Taking the perspective that delineating order and disorder features from ubiquitous thermal vibrations is akin to extracting signal from noise, we consider classification of ordered phases and identification of disordered crystal defects to be fundamentally the same problem and address them both with a unified approach: a denoising score function that removes thermal noise and recovers any underlying crystalline order-disorder. Built on a rotationally equivariant graph neural network (NequIP), the denoiser was trained entirely with synthetically noised structures and requires no simulation data during training. To demonstrate its denoising capabilities, the denoiser is shown to effectively remove thermal vibrations of BCC, FCC, and HCP crystal structures without impacting the underlying disordered defects, including point defects, dislocations, grain boundaries, and liquid disorder. In particular, the denoiser was applied to two relatively complex MD simulations that present practical challenges: a Cu solidification trajectory involving a polymorphic nucleus, and a trajectory of BCC Ta undergoing plastic deformation resulting in dislocation networks and point defect clusters. In both cases, the denoiser facilitates or trivializes the subsequent characterization of the order-disorder features. Lastly, we outline future work to extend our denoising model to more complex crystal structures and to multi-element systems.
Generalized Eigenvalue Problems (GEPs) encompass a range of interesting dimensionality reduction methods. Development of efficient stochastic approaches to these problems would allow them to scale to larger datasets. Canonical Correlation Analysis (CCA) is one example of a GEP for dimensionality reduction which has found extensive use in problems with two or more views of the data. In the stochastic setting, deep learning extensions of CCA require large mini-batch sizes, and therefore large memory consumption, to achieve good performance, and this has limited their application in practice. Inspired by the Generalized Hebbian Algorithm, we develop an approach to solving stochastic GEPs in which all constraints are softly enforced by Lagrange multipliers. Then, by considering the integral of this Lagrangian function, its pseudo-utility, and inspired by recent formulations of Principal Components Analysis and GEPs as games with differentiable utilities, we develop a game-theory inspired approach to solving GEPs. We show that our approaches share much of the theoretical grounding of the previous Hebbian and game-theoretic approaches for the linear case. Unlike those approaches, our method permits extension to general function approximators such as neural networks for certain GEPs for dimensionality reduction, including CCA, which means it can be used for deep multiview representation learning. We demonstrate the effectiveness of our method for solving GEPs in the stochastic setting using canonical multiview datasets and demonstrate state-of-the-art performance for optimizing Deep CCA.
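For orientation, the standard textbook way of writing linear CCA as a GEP is sketched below. The symbols are assumptions introduced here, not notation taken from the abstract: $\Sigma_{xx}$ and $\Sigma_{yy}$ denote the within-view covariance matrices, $\Sigma_{xy}$ the cross-covariance, $u$ and $v$ the projection directions for the two views, and $\rho$ the canonical correlation.

$$
\begin{pmatrix} 0 & \Sigma_{xy} \\ \Sigma_{yx} & 0 \end{pmatrix}
\begin{pmatrix} u \\ v \end{pmatrix}
= \rho
\begin{pmatrix} \Sigma_{xx} & 0 \\ 0 & \Sigma_{yy} \end{pmatrix}
\begin{pmatrix} u \\ v \end{pmatrix}
$$

This is an instance of the general form $A w = \lambda B w$ with $B$ symmetric positive definite. It follows from the stationarity conditions of maximizing $u^\top \Sigma_{xy} v$ subject to $u^\top \Sigma_{xx} u = v^\top \Sigma_{yy} v = 1$.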
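Predicting the economy's short-term dynamics, a vital input to economic agents' decision-making, often relies on lagged indicators in linear models. This is typically sufficient in normal times but can prove inadequate during crises. This paper aims to demonstrate that non-traditional and timely data, such as retail and wholesale payments, combined with nonlinear machine learning methods, can give policymakers sophisticated models for accurately estimating key macroeconomic indicators in near real time. In addition, we provide a set of econometric tools that mitigate the overfitting and interpretability challenges of machine learning models, improving their effectiveness for policy use. Our models, which combine payments data, nonlinear methods, and tailored cross-validation approaches, improve macroeconomic nowcasting accuracy by up to 40%, with larger gains during the COVID-19 period. We observe that the contribution of payments data to economic predictions is small and linear during periods of low and normal growth. During periods of strongly negative or positive growth, however, the contribution of payments data is large, asymmetric, and nonlinear.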
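Semi-supervised learning is the problem of training an accurate predictive model by combining a small labeled dataset with a presumably much larger unlabeled dataset. Many methods for semi-supervised deep learning have been developed, including pseudolabeling, consistency regularization, and contrastive learning techniques. Pseudolabeling methods, however, are highly susceptible to confounding, in which erroneous pseudolabels are assumed to be true labels in early iterations, causing the model to reinforce its prior biases and fail to generalize to strong predictive performance. We present a new approach to suppressing confounding errors, which we call Semi-supervised Contrastive Outlier removal for Pseudo Expectation Maximization (SCOPE). Like basic pseudolabeling, SCOPE is related to Expectation Maximization (EM), a latent variable framework that can be extended toward understanding cluster-assumption deep semi-supervised algorithms. Unlike basic pseudolabeling, which fails to adequately account for the probability of the unlabeled samples given the model, SCOPE introduces an outlier suppression term designed to improve the behavior of EM iteration, given a discriminative DNN backbone, in the presence of outliers. Our results show that SCOPE greatly improves semi-supervised classification accuracy over a baseline and, when combined with consistency regularization, achieves the highest reported accuracy on the semi-supervised CIFAR-10 classification task with 250 and 4000 labeled samples. Moreover, we show that SCOPE reduces the prevalence of confounding errors during pseudolabeling iterations by pruning erroneous high-confidence pseudolabeled samples that would otherwise contaminate the labeled set in subsequent retraining iterations.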
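The basic pseudolabeling step that this abstract contrasts itself with can be sketched as follows. This is a minimal illustration under stated assumptions, not the SCOPE method: the function name `pseudo_label`, the 0.95 confidence threshold, and the toy probabilities are all hypothetical, and the outlier suppression term described in the abstract is not implemented.

```python
def pseudo_label(probabilities, threshold=0.95):
    """Return (sample_index, predicted_class) pairs for confident predictions.

    `probabilities` holds one list of class probabilities per unlabeled sample.
    The threshold value is an illustrative assumption.
    """
    selected = []
    for idx, probs in enumerate(probabilities):
        # Predicted class is the arg-max of the class probabilities.
        label = max(range(len(probs)), key=probs.__getitem__)
        # Keep the sample only if the model is confident enough.
        if probs[label] >= threshold:
            selected.append((idx, label))
    return selected


# Toy model outputs for three unlabeled samples (made-up numbers).
probs = [
    [0.97, 0.02, 0.01],
    [0.40, 0.35, 0.25],
    [0.01, 0.01, 0.98],
]
picked = pseudo_label(probs)
```

A confidently wrong prediction passes this filter just as easily as a confidently correct one, which is the confounding failure mode the abstract describes.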
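Automatic diagnosis of malignant prostate cancer from mpMRI has been studied heavily in the past few years. Model interpretation and domain drift have been the main roadblocks to clinical utilization. In our previous work, we trained a customized convolutional neural network on a public cohort of 201 patients, using cropped 2D patches around the region of interest as input. As an extension of that work, we used cropped 2.5D slices of the prostate gland as input and searched the model space for the optimal model with AutoKeras. In a departure from the earlier work, the peripheral zone (PZ) and central gland (CG) were trained and tested separately. The PZ detector and the CG detector proved effective at highlighting the most suspicious slices in a sequence, which we hope will greatly ease the workload of physicians.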
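Stochastic approximation algorithms are iterative procedures used to approximate a target value in an environment where the target is unknown and direct observations are corrupted by noise. These algorithms are useful, for instance, for root finding and function minimization when the target function or model is not directly known. Originally introduced in a 1951 paper by Robbins and Monro, the field of stochastic approximation has grown enormously and has come to influence application domains ranging from adaptive signal processing to artificial intelligence. For example, the stochastic gradient descent algorithm, which is ubiquitous in various subdomains of machine learning, is based on stochastic approximation theory. In this paper, we give a formal proof (in the Coq proof assistant) of a general convergence theorem due to Aryeh Dvoretzky, which implies the convergence of important classical methods such as the Robbins-Monro and Kiefer-Wolfowitz algorithms. In the process, we build a comprehensive Coq library of measure-theoretic probability theory and stochastic processes.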
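As a concrete picture of the kind of procedure covered by such convergence results, a Robbins-Monro style root-finding iteration might look like the sketch below. It is an illustration only and unrelated to the Coq formalization: the target function $g(x) = x - 2$, the Gaussian noise model, the step sizes $a_n = 1/n$, and all names are assumptions chosen for the example.

```python
import random


def robbins_monro(noisy_g, x0, steps=20000, seed=0):
    """Approximate a root of g using only noisy evaluations of g."""
    rng = random.Random(seed)
    x = x0
    for n in range(1, steps + 1):
        # Step sizes a_n = 1/n: their sum diverges while the sum of
        # their squares converges, the classical step-size conditions.
        a_n = 1.0 / n
        x = x - a_n * noisy_g(x, rng)
    return x


def noisy_g(x, rng):
    # Hypothetical target g(x) = x - 2 (root at 2), observed with
    # additive standard Gaussian noise.
    return (x - 2.0) + rng.gauss(0.0, 1.0)


estimate = robbins_monro(noisy_g, x0=0.0)
```

With this particular linear target and step size, the iterate reduces to a running average of the noisy observations, so the estimate settles near the root at 2 as the number of steps grows.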
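In this paper, we study the size and width of autoencoders consisting of Boolean threshold functions, where an autoencoder is a layered neural network whose structure can be viewed as consisting of an encoder, which compresses an input vector to a lower-dimensional vector, and a decoder, which transforms the low-dimensional vector back to the original input vector exactly (or approximately). We focus on the decoder part and show that $\Omega(\sqrt{Dn/d})$ and $O(\sqrt{Dn})$ nodes are required to transform $n$ vectors in $d$-dimensional binary space to $D$-dimensional binary space. We also show that the width can be reduced if we allow small errors, where the error is defined as the average Hamming distance between each vector input to the encoder part and the corresponding vector output by the decoder.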
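The two ingredients named in this abstract, a Boolean threshold function and the average Hamming distance error, can be sketched as follows. The function names, the convention that a unit fires when the weighted sum is at least the threshold, and the toy vectors are assumptions made for illustration and are not taken from the paper.

```python
def threshold_unit(weights, theta, x):
    """Boolean threshold function: 1 if the weighted sum reaches theta, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) >= theta else 0


def hamming(u, v):
    """Number of positions at which two binary vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def average_error(inputs, outputs):
    """Average Hamming distance between each input and its reconstruction."""
    return sum(hamming(u, v) for u, v in zip(inputs, outputs)) / len(inputs)


# Toy encoder inputs and decoder outputs (made-up 4-dimensional vectors).
inputs = [(0, 0, 1, 1), (1, 0, 1, 0), (1, 1, 1, 1)]
outputs = [(0, 0, 1, 1), (1, 1, 1, 0), (1, 1, 0, 0)]
err = average_error(inputs, outputs)  # distances 0, 1, 2 average to 1.0
```

For instance, `threshold_unit((1, 1), 2, x)` computes the logical AND of a two-bit input `x` under this convention.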
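Learning and generalizing to novel concepts from few samples (few-shot learning) remains an essential challenge for real-world applications. A principled way to achieve few-shot learning is to build a model that can rapidly adapt to the context of a given task. Dynamic networks have been shown to learn content-adaptive parameters efficiently, which makes them suitable for few-shot learning. In this paper, we propose to learn the dynamic kernels of a convolutional network as a function of the task at hand, enabling faster generalization. To this end, we obtain our dynamic kernels based on the entire task and on each sample, and we develop a mechanism that further conditions on each individual channel and position. This results in dynamic kernels that also take into account the fine-grained information available. We show empirically that our model improves performance on few-shot classification and detection tasks, achieving a tangible improvement over several baseline models. This includes state-of-the-art results on four few-shot classification benchmarks (mini-ImageNet, tiered-ImageNet, CUB, and FC100) and competitive results on a few-shot detection dataset (COCO-PASCAL-VOC).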