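This paper proposes a Bayesian framework for constructing nonlinear, parsimonious, shallow models for multitask regression. The framework relies on the fact that random Fourier features (RFFs) allow an RBF kernel to be approximated by an extreme learning machine whose hidden layer is formed by RFFs. The main idea is to combine the two dual views of the same model under a single Bayesian formulation that extends sparse Bayesian extreme learning machines to multitask problems. From the kernel methods point of view, the proposed formulation facilitates the introduction of prior domain knowledge through the RBF kernel parameter. From the extreme learning machine perspective, the new formulation helps control overfitting and yields a parsimonious overall model (the models serving each task share the same set of RFFs, selected within the joint Bayesian optimization). Experimental results show that combining the advantages of kernel methods and extreme learning machines within the same framework can lead to significant improvements over the performance achieved by each of these two paradigms independently.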
translated by 谷歌翻译
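The link between random Fourier features and the RBF kernel mentioned in the abstract above can be checked numerically. The following is a minimal sketch, not the paper's model: it assumes the kernel parameterization k(x, y) = exp(-gamma * ||x - y||^2), and the function names, feature count, and data are illustrative choices.

```python
import numpy as np

def rff_features(X, n_features, gamma, rng):
    """Map X (n_samples, d) to random Fourier features for the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    # Frequencies are drawn from the kernel's spectral density, N(0, 2*gamma*I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    # Random phases, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def rbf_kernel(X, gamma):
    """Exact RBF kernel matrix, used here only as a reference."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))          # synthetic inputs (assumption)
gamma = 0.5                           # illustrative kernel parameter
Z = rff_features(X, 5000, gamma, rng)

# The inner product of the feature maps approximates the kernel matrix.
max_err = np.max(np.abs(Z @ Z.T - rbf_kernel(X, gamma)))
print(f"max abs error with 5000 features: {max_err:.3f}")
```

In this view, `Z` plays the role of a hidden layer with fixed random weights, and a linear model trained on `Z` behaves approximately like a kernel machine with the corresponding RBF kernel.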
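Real-world databases are complex; they usually present redundancy and shared correlations between heterogeneous and multiple representations of the same data. Exploiting and disentangling the information shared between views is therefore critical. For this purpose, recent studies often fuse all views into a shared, nonlinear, complex latent space, but they lose interpretability. To overcome this limitation, we propose a novel method that combines multiple variational autoencoder (VAE) architectures with a factor analysis latent space (FA-VAE). Specifically, we use a VAE to learn a private representation of each heterogeneous view in a continuous latent space. We then model the shared latent space by projecting every private variable onto a low-dimensional latent space using a linear projection matrix. This creates an interpretable hierarchical dependency between private and shared information. In this way, the novel model can simultaneously: (i) learn from multiple heterogeneous views, (ii) obtain an interpretable hierarchical shared space, and (iii) perform transfer learning between generative models.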
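Machine learning techniques typically applied to dementia forecasting lack the ability to jointly learn several tasks, to handle time-dependent heterogeneous data, and to cope with missing values. In this paper, we propose a framework based on the recently presented SSHIBA model for jointly learning different tasks on longitudinal data with missing values. The method uses Bayesian variational inference to impute missing values and to combine information from several views. In this way, we can combine different data views from different time points in a common latent space and learn the relations between time points while simultaneously modelling and predicting several output variables. We apply this model to predict diagnosis, ventricle volume, and clinical scores in dementia. The results show that SSHIBA is capable of learning a good imputation of the missing values while outperforming the baselines when simultaneously predicting three different tasks.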
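Audio or visual data analysis tasks usually have to deal with high-dimensional and nonnegative signals. However, most data analysis methods suffer from overfitting and numerical problems when the data have more than a few dimensions, so a dimensionality reduction preprocessing step is needed. Moreover, interpretability of how and why filters work in audio or visual applications is a desired property, especially when energy or spectral signals are involved. In these cases, due to the nature of the signals, nonnegativity of the filter weights is a desired property for better understanding how the filters work. Because of these two requirements, we propose different methods to reduce the dimensionality of the data while guaranteeing the nonnegativity and interpretability of the solution. In particular, we propose a generalized methodology to design filter banks in a supervised way for applications dealing with nonnegative data, and we explore different ways of solving the proposed objective function, which consists of a nonnegative version of the partial least squares method. We analyze the discriminative power of the features obtained with the proposed methods in two different and widely studied applications: texture classification and music genre classification. Furthermore, we compare the filter banks obtained by our methods with those of other state-of-the-art methods specifically designed for feature extraction.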
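Multivariate analysis (MVA) comprises a family of well-known methods for feature extraction that exploit correlations among the input variables representing the data. One important property enjoyed by most such methods is uncorrelation among the extracted features. Recently, regularized versions of MVA methods have appeared in the literature, mainly with the goal of gaining interpretability of the solution. In these cases, the solutions can no longer be obtained in closed form, and more complex optimization methods that rely on the iteration of two steps are frequently used. This paper revisits an alternative approach to solving this iterative problem. The main novelty of this approach lies in preserving several properties of the original methods, most notably the uncorrelation of the extracted features. Under this framework, we propose a novel method that takes advantage of the $\ell_{2,1}$ norm to perform variable selection during the feature extraction process. Experimental results on different problems corroborate the advantages of the proposed formulation over existing formulations.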
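To illustrate why an $\ell_{2,1}$ penalty leads to variable selection, the sketch below computes the norm for a small projection matrix and lists the variables that remain in use. The layout of `W` (one row per input variable, one column per extracted feature), the helper names, and the tolerance are assumptions made for this example, not details from the paper.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def selected_variables(W, tol=1e-8):
    """Indices of input variables whose entire row of W is non-zero."""
    return np.flatnonzero(np.linalg.norm(W, axis=1) > tol)

# Toy projection matrix: 3 input variables, 2 extracted features.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],   # this variable is unused by every feature
              [0.0, 2.0]])

print(l21_norm(W))            # 5 + 0 + 2 = 7.0
print(selected_variables(W))  # [0 2]
```

Because the penalty sums row norms, it tends to drive whole rows to zero, which removes the corresponding input variable from all extracted features at once.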
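Multitask Gaussian processes (MTGPs) are the Gaussian process (GP) framework's solution for multi-output regression problems in which the $T$ elements of the regressors cannot be considered conditionally independent given the observations. Standard MTGP models assume that there exist both a multitask covariance matrix, which is a function of an intertask matrix, and a noise covariance matrix. These matrices need to be approximated by a low-rank simplification of order $P$ in order to reduce the number of parameters to be learned from $T^2$ to $TP$. Here we introduce a novel approach that simplifies multitask learning by reducing it to a set of conditioned univariate GPs without the need for any low-rank approximation, thereby completely eliminating the requirement to select an adequate value for the hyperparameter $P$. At the same time, by extending this approach with both a hierarchical and an approximate model, the proposed extensions are capable of recovering the multitask covariance and noise matrices after learning only $2T$ parameters, avoiding the validation of any model hyperparameter and reducing the overall complexity of the model as well as the risk of overfitting. Experimental results on synthetic and real problems confirm the advantages of this inference approach in its ability to accurately recover the original noise and signal matrices, as well as the performance improvement achieved in comparison to other state-of-the-art MTGP approaches. We have also integrated the model with standard GP toolboxes, showing that it is computationally competitive with state-of-the-art options.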
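One way to read "a set of conditioned univariate GPs" is through the chain rule of probability, which factorizes the joint predictive distribution of the $T$ outputs exactly into univariate conditionals. The notation below ($y_t$ for the $t$-th output, $\mathbf{x}$ for the input) is assumed for illustration and may differ from the paper's own formulation:

```latex
p(y_1, \dots, y_T \mid \mathbf{x})
  = \prod_{t=1}^{T} p\left(y_t \mid y_1, \dots, y_{t-1}, \mathbf{x}\right)
```

Under this reading, each factor is a single-output model that conditions on the previously modelled outputs, so no $T \times T$ matrix has to be parameterized directly.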
Recurrent neural networks (RNNs) are the backbone of many text and speech applications. These architectures are typically made up of several computationally complex components, such as non-linear activation functions, normalization, bi-directional dependence and attention. In order to maintain good accuracy, these components are frequently run using full-precision floating-point computation, making them slow, inefficient and difficult to deploy on edge devices. In addition, the complex nature of these operations makes them challenging to quantize using standard quantization methods without a significant performance drop. We present a quantization-aware training method for obtaining a highly accurate integer-only recurrent neural network (iRNN). Our approach supports layer normalization, attention, and an adaptive piecewise linear (PWL) approximation of activation functions, to serve a wide range of state-of-the-art RNNs. The proposed method enables RNN-based language models to run on edge devices with a $2\times$ improvement in runtime and a $4\times$ reduction in model size, while maintaining accuracy similar to that of their full-precision counterparts.
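The idea of replacing an activation function with a piecewise linear approximation can be sketched as follows. This is a simplified illustration only: it uses uniformly spaced knots and floating-point arithmetic, whereas the abstract describes an adaptive, integer-only scheme. The function names, the range [-4, 4], and the number of pieces are assumptions.

```python
import numpy as np

def fit_pwl(fn, lo, hi, n_pieces):
    """Return uniformly spaced knots on [lo, hi] and fn evaluated at them."""
    knots = np.linspace(lo, hi, n_pieces + 1)
    return knots, fn(knots)

def eval_pwl(x, knots, values):
    """Linear interpolation between knots; inputs outside the knot range
    are clamped to the first or last knot value."""
    return np.interp(x, knots, values)

# Approximate tanh with 16 linear pieces.
knots, values = fit_pwl(np.tanh, -4.0, 4.0, 16)

# Measure the worst-case error on a grid that extends past the knot range.
x = np.linspace(-6.0, 6.0, 1001)
max_err = np.max(np.abs(eval_pwl(x, knots, values) - np.tanh(x)))
print(f"max abs error: {max_err:.4f}")
```

An adaptive variant would place more knots where the function curves most, which reduces the error for the same number of pieces.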
Sunquakes are seismic emissions visible on the solar surface, associated with some solar flares. Although discovered in 1998, they have only recently become a more commonly detected phenomenon. Despite the availability of several manual detection guidelines, to our knowledge, the astrophysical data produced for sunquakes is new to the field of Machine Learning. Detecting sunquakes is a daunting task for human operators, and this work aims to ease and, if possible, improve their detection. To this end, we introduce a dataset constructed from acoustic egression-power maps of solar active regions obtained for Solar Cycles 23 and 24 using the holography method. We then present a pedagogical approach to the application of machine learning representation methods for sunquake detection using AutoEncoders, Contrastive Learning, Object Detection and recurrent techniques, which we enhance by introducing several custom domain-specific data augmentation transformations. We address the main challenges of the automated sunquake detection task, namely the very high noise patterns in and outside the active region shadow and the extreme class imbalance caused by the limited number of frames that present sunquake signatures. With our trained models, we find temporal and spatial locations of peculiar acoustic emission and qualitatively associate them with eruptive and high energy emission. While noting that these models are still at a prototype stage and that there is much room for improvement in metrics and bias levels, we hypothesize that their agreement on example use cases has the potential to enable detection of weak solar acoustic manifestations.
With climate change predicted to increase the likelihood of landslide events, there is a growing need for rapid landslide detection technologies that help inform emergency responses. Synthetic Aperture Radar (SAR) is a remote sensing technique that can provide measurements of affected areas independent of weather or lighting conditions. The use of SAR, however, is hindered by the domain knowledge required for its pre-processing steps, and interpreting the data requires expert knowledge. We provide simplified, pre-processed, machine-learning ready SAR datacubes for four globally located landslide events, obtained from several Sentinel-1 satellite passes before and after a landslide triggering event, together with segmentation maps of the landslides. From this dataset, using the Hokkaido, Japan datacube, we study the feasibility of SAR-based landslide detection with supervised deep learning (DL). Our results demonstrate that DL models can be used to detect landslides from SAR data, achieving an Area under the Precision-Recall curve exceeding 0.7. We find that additional satellite visits enhance detection performance, but that early detection is possible when SAR data is combined with terrain information from a digital elevation model. This can be especially useful for time-critical emergency interventions. Code is made publicly available at https://github.com/iprapas/landslide-sar-unet.
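For readers unfamiliar with the metric, the area under the precision-recall curve can be estimated as average precision, a step-wise sum over the ranked predictions. The sketch below is a generic illustration and not the evaluation code of the repository above; the function name and the toy labels and scores are assumptions, and tied scores are not treated specially.

```python
import numpy as np

def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve for binary labels
    (1 = positive) and real-valued scores (higher = more likely positive)."""
    order = np.argsort(-scores, kind="stable")   # rank by decreasing score
    y = y_true[order]
    tp = np.cumsum(y)                            # true positives at each rank
    precision = tp / np.arange(1, len(y) + 1)
    # Recall rises by 1/n_pos at each true positive, so the area is the
    # mean of the precision values taken at the positive positions.
    return float(np.sum(precision * y) / np.sum(y))

# Toy example: for segmentation, y_true and scores would be flattened pixels.
y_true = np.array([1, 0, 1, 1, 0, 0])
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.3, 0.1])
ap = average_precision(y_true, scores)
print(f"average precision: {ap:.4f}")   # (1 + 2/3 + 3/4) / 3
```

This metric is often preferred over accuracy when positives are rare, as is typically the case for landslide pixels in a scene.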
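"Leichte Sprache", the German counterpart to Simple English, is a regulated language that aims to make complex written language accessible to groups of people who would otherwise be unable to access it. We present a new sentence-aligned monolingual corpus for Simple German and German. It contains multiple document-aligned sources, which we have aligned using automatic sentence-alignment methods. We evaluate our alignments on a manually labelled subset of aligned documents. The quality of our sentence alignments, as measured by F1 score, surpasses that of previous work. We publish the dataset under CC BY-SA and the accompanying code under the MIT license.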
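Evaluating sentence alignments with an F1 score can be sketched as a comparison between two sets of sentence pairs. The representation of an alignment as (simple_sentence_id, standard_sentence_id) tuples, the function name, and the toy data are assumptions for illustration; the paper's exact evaluation protocol may differ.

```python
def alignment_f1(predicted, gold):
    """Precision, recall and F1 over sets of (source_id, target_id) pairs."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)                       # correctly aligned pairs
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1

# Hypothetical manually labelled alignments and automatic predictions.
gold = {(0, 0), (1, 1), (2, 3), (4, 4)}
predicted = {(0, 0), (1, 1), (2, 2)}

p, r, f1 = alignment_f1(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Here two of the three predicted pairs are correct and two of the four gold pairs are found, so precision is 2/3, recall is 1/2, and F1 is 4/7.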