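In this paper, we elaborate an extension of rotation-based iterative Gaussianization (RBIG) that makes image Gaussianization possible. Although RBIG has been applied successfully to many tasks, it is limited to data of medium dimensionality (on the order of a thousand dimensions). For images, its application has been restricted to small image patches or isolated pixels, because the rotation in RBIG is based on principal or independent component analysis, and these transformations are difficult to learn and to scale. Here we present \emph{Convolutional RBIG}: an extension that alleviates this issue by imposing that the rotation in RBIG is a convolution. We propose to learn convolutional rotations (i.e., orthonormal convolutions) by optimizing the reconstruction between the input and an approximate inverse of the transformation, computed with the transposed convolution operation. In addition, we suggest different regularizers for learning these orthonormal convolutions. For example, imposing sparsity on the activations leads to a transformation that extends convolutional independent component analysis to multilayer architectures. We also highlight how statistical properties of the data, such as multivariate mutual information, can be obtained from \emph{Convolutional RBIG}. We illustrate the behavior of the transformation with a simple texture synthesis example, and we analyze its properties by visualizing the stimuli that maximize the response of certain features and layers.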
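The idea of learning an orthonormal convolution through reconstruction with its transpose can be illustrated on a toy case. The sketch below is not the authors' implementation: it assumes a single-channel 1D circular convolution written as a matrix `W`, uses `W.T` as the transposed convolution, and the helper names `circular_conv_matrix` and `reconstruction_loss` are hypothetical. The loss vanishes when `W` is orthonormal (here a pure shift) and is positive for a non-orthonormal blur.

```python
import numpy as np

def circular_conv_matrix(kernel, n):
    """Matrix form of a 1D circular convolution with the given kernel."""
    W = np.zeros((n, n))
    for i in range(n):
        for t, w in enumerate(kernel):
            W[i, (i + t) % n] += w
    return W

def reconstruction_loss(W, x):
    """Mean squared error between x and W^T (W x).

    W^T plays the role of the transposed convolution, which is the exact
    inverse of W only when W is orthonormal.
    """
    return float(np.mean((x - W.T @ (W @ x)) ** 2))

x = np.array([1.0, -2.0, 3.0, 0.0, -1.0, 2.0, 4.0, -3.0])

# A pure shift is a permutation, hence an orthonormal convolution.
W_shift = circular_conv_matrix([0.0, 1.0, 0.0], len(x))
# A blur attenuates high frequencies, so it is not orthonormal.
W_blur = circular_conv_matrix([0.25, 0.5, 0.25], len(x))

loss_shift = reconstruction_loss(W_shift, x)
loss_blur = reconstruction_loss(W_blur, x)
```

In a learning setting, this loss would be minimized over the kernel so that the convolution is pushed toward an orthonormal one.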
translated by 谷歌翻译
One of the key problems in computer vision is adaptation: models are too rigid to follow the variability of the inputs. The canonical computation that explains adaptation in sensory neuroscience is divisive normalization, and it has appealing effects on image manifolds. In this work we show that including divisive normalization in current deep networks makes them more invariant to non-informative changes in the images. In particular, we focus on U-Net architectures for image segmentation. Experiments show that including divisive normalization in the U-Net architecture leads to better segmentation results than the conventional U-Net. The gain increases steadily when dealing with images acquired in bad weather conditions. In addition to the results on the Cityscapes and Foggy Cityscapes datasets, we explain these advantages through visualization of the responses: the equalization induced by the divisive normalization leads to features that are more invariant to local changes in contrast and illumination.
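To make the invariance claim concrete, here is a minimal sketch of one common form of divisive normalization, in which each response is divided by the pooled energy of all channels at the same location. The exact layer used in the paper may differ; the function name `divisive_normalization` and the parameters `beta` and `gamma` are assumptions for illustration only.

```python
import numpy as np

def divisive_normalization(x, beta=1e-3, gamma=1.0):
    """Divide each response by the pooled energy across channels.

    x has shape (channels, height, width). beta is a small semisaturation
    constant and gamma weights the pooled energy (both illustrative).
    """
    energy = np.sum(x ** 2, axis=0, keepdims=True)
    return x / np.sqrt(beta + gamma * energy)

# Deterministic toy feature map: 3 channels of 4 x 4 responses.
x = np.linspace(0.5, 2.0, 3 * 4 * 4).reshape(3, 4, 4)

y = divisive_normalization(x)
y_high_contrast = divisive_normalization(3.0 * x)  # same input, 3x contrast
max_difference = float(np.max(np.abs(y - y_high_contrast)))
```

The raw responses change by a factor of three under the contrast change, while the normalized responses are almost identical, which is the kind of equalization the abstract refers to.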
Spacecraft pose estimation is a key task to enable space missions in which two spacecraft must navigate around each other. Current state-of-the-art algorithms for pose estimation employ data-driven techniques. However, there is an absence of real training data for spacecraft imaged in space conditions due to the costs and difficulties associated with the space environment. This has motivated the introduction of 3D data simulators, solving the issue of data availability but introducing a large gap between the training (source) and test (target) domains. We explore a method that incorporates 3D structure into the spacecraft pose estimation pipeline to provide robustness to intensity domain shift, and we present an algorithm for unsupervised domain adaptation with robust pseudo-labelling. Our solution ranked second in both categories of the 2021 Pose Estimation Challenge organised by the European Space Agency and Stanford University, achieving the lowest average error over the two categories.
Automated medical image segmentation using deep neural networks typically requires substantial supervised training. However, these models fail to generalize well across different imaging modalities. This shortcoming, amplified by the limited availability of annotated data, has been hampering the deployment of such methods at a larger scale across modalities. To address these issues, we propose M-GenSeg, a new semi-supervised training strategy for accurate cross-modality tumor segmentation on unpaired bi-modal datasets. Based on image-level labels, a first unsupervised objective encourages the model to perform diseased-to-healthy translation by disentangling tumors from the background, which encompasses the segmentation task. Then, teaching the model to translate between image modalities enables the synthesis of target images from a source modality, thus leveraging the pixel-level annotations from the source modality to enforce generalization to the target modality images. We evaluated the performance on a brain tumor segmentation dataset composed of four different contrast sequences from the public BraTS 2020 challenge dataset. We report consistent improvement in Dice scores on both source and unannotated target modalities. On all twelve distinct domain adaptation experiments, the proposed model shows a clear improvement over state-of-the-art domain-adaptive baselines, with absolute Dice gains on the target modality reaching 0.15.
Privacy-preserving machine learning in data-sharing processes is an ever-critical task that enables collaborative training of Machine Learning (ML) models without the need to share the original data sources. It is especially relevant when an organization must ensure that sensitive data remains private throughout the whole ML pipeline, i.e., in both the training and inference phases. This paper presents a framework that uses Representation Learning via autoencoders to generate privacy-preserving embedded data. Organizations can thus share the data representation to increase the performance of machine learning models in scenarios with more than one data source for a shared predictive downstream task.
In this work, we propose a framework relying solely on chat-based customer support (CS) interactions for predicting the recommendation decision of individual users. For our case study, we analyzed a total of 16.4k users and 48.7k customer support conversations within the financial vertical of a large e-commerce company in Latin America. Our main contribution is to use Natural Language Processing (NLP) to assess and predict the recommendation behavior where, in addition to using static sentiment analysis, we exploit the predictive power of each user's sentiment dynamics. Our results show that, with interpretable features, it is possible to predict the likelihood that a user will recommend a product or service, based solely on the message-wise sentiment evolution of their CS conversations, in a fully automated way.
We present the GPry algorithm for fast Bayesian inference of general (non-Gaussian) posteriors with a moderate number of parameters. GPry does not need any pre-training or special hardware such as GPUs, and is intended as a drop-in replacement for traditional Monte Carlo methods for Bayesian inference. Our algorithm is based on generating a Gaussian Process surrogate model of the log-posterior, aided by a Support Vector Machine classifier that excludes extreme or non-finite values. An active learning scheme allows us to reduce the number of required posterior evaluations by two orders of magnitude compared to traditional Monte Carlo inference. Our algorithm allows for parallel evaluations of the posterior at optimal locations, further reducing wall-clock times. We significantly improve performance using properties of the posterior in our active learning scheme and for the definition of the GP prior. In particular, we account for the expected dynamical range of the posterior in different dimensionalities. We test our model against a number of synthetic and cosmological examples. GPry outperforms traditional Monte Carlo methods when the evaluation time of the likelihood (or the calculation of theoretical observables) is of the order of seconds; for evaluation times of over a minute, it can perform in days inference that would take months using traditional methods. GPry is distributed as an open source Python package (pip install gpry) and can also be found at https://github.com/jonaselgammal/GPry.
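The general pattern of a Gaussian Process surrogate refined by active learning can be sketched in a few lines of NumPy. This is a generic illustration and does not use or reproduce the GPry API: the toy 1D `log_posterior`, the unit-variance RBF kernel, and the simple "mean plus two standard deviations" acquisition rule are all assumptions, and the SVM classifier mentioned in the abstract is omitted.

```python
import numpy as np

def rbf_kernel(a, b, length=1.0):
    """Unit-variance squared-exponential kernel between two 1D point sets."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_predict(x_train, y_train, x_test, jitter=1e-8):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train) + jitter * np.eye(len(x_train))
    Ks = rbf_kernel(x_test, x_train)
    mean = Ks @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def log_posterior(x):
    """Toy stand-in for an expensive log-posterior (hypothetical)."""
    return -0.5 * (x - 1.0) ** 2

grid = np.linspace(-4.0, 6.0, 201)   # candidate locations
evaluated = [20, 80, 180]            # grid indices of the initial design

for _ in range(7):
    x_train = grid[evaluated]
    y_train = log_posterior(x_train)
    mean, std = gp_predict(x_train, y_train, grid)
    # Favor points that look promising or are still uncertain.
    acquisition = mean + 2.0 * std
    acquisition[evaluated] = -np.inf  # never re-evaluate a point
    evaluated.append(int(np.argmax(acquisition)))

# Final surrogate built from all ten posterior evaluations.
x_train = grid[evaluated]
mean, std = gp_predict(x_train, log_posterior(x_train), grid)
```

Each iteration costs one posterior evaluation, so the surrogate is only useful when that evaluation is much more expensive than fitting the Gaussian Process, which is the regime the abstract describes.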
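Can agents be trained to answer difficult mathematical questions by playing a game? We consider the integer feasibility problem, the challenge of deciding whether a system of linear equations and inequalities has a solution with integer values. This is a famous NP-complete problem with applications in many areas of mathematics and computer science. Our paper describes a novel algebraic reinforcement learning framework that allows an agent to play a game equivalent to the integer feasibility problem. We explain how to transform the integer feasibility problem into a game over a set of arrays with fixed margin sums. The game starts from an initial state (an array), and by applying legal moves that leave the margins unchanged, we aim to eventually reach a winning state with zeros in specific positions. To win the game, the player must find a path between the initial state and a final terminal winning state. Finding such a winning state is equivalent to solving the integer feasibility problem. The key algebraic ingredient is a Gröbner basis of the toric ideal of the underlying axial transportation polyhedron. The Gröbner basis can be seen as a set of connecting moves (actions) of the game. We then propose a novel RL approach that trains an agent to predict moves in a continuous space, in order to cope with the large size of the action space. The continuous move is then projected onto the set of legal moves so that the path always leads to valid states. As a proof of concept, we demonstrate in experiments that our agent can play well the simplest version of our game, for 2-way tables. Our work highlights the potential of training agents to solve non-trivial mathematical queries through the contemporary machine learning methods used to train agents to play games.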
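The notion of a legal move that leaves the margins unchanged can be illustrated for 2-way tables. The sketch below uses the classical basic moves on a 2 x 2 minor (+1 on one diagonal, -1 on the other); whether this set coincides with the Gröbner basis moves used in the paper is an assumption, and the helper names `basic_move` and `is_legal` are hypothetical.

```python
import numpy as np

def basic_move(shape, i, k, j, l):
    """Move with +1 at (i, j) and (k, l), -1 at (i, l) and (k, j).

    Every row and every column of the move sums to zero, so adding it to a
    table preserves all row and column margins.
    """
    move = np.zeros(shape, dtype=int)
    move[i, j] += 1
    move[k, l] += 1
    move[i, l] -= 1
    move[k, j] -= 1
    return move

def is_legal(table, move):
    """A move is legal if the resulting table has no negative entries."""
    return bool(np.all(table + move >= 0))

table = np.array([[2, 1, 0],
                  [0, 1, 3]])

good_move = basic_move(table.shape, 0, 1, 1, 0)  # +1 at (0,1),(1,0); -1 at (0,0),(1,1)
bad_move = basic_move(table.shape, 0, 1, 0, 1)   # would make entry (1,0) negative

next_table = table + good_move
```

Here `next_table` has the same row sums (3, 4) and column sums (2, 2, 3) as `table`, while `bad_move` is rejected because it would produce a negative entry.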
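Considering the abundance of unlabeled speech data and the high cost of labeling, unsupervised learning methods can be essential for better system development. Among the most successful approaches are contrastive self-supervised methods, which require negative sampling: sampling alternative samples to contrast with the current sample (the anchor). However, without labels it is hard to ensure that all negative samples belong to classes different from that of the anchor. This paper applies a non-contrastive self-supervised learning method to an unlabeled speech corpus to learn utterance-level embeddings. We use DIstillation with NO labels (DINO), proposed in computer vision, and adapt it to the speech domain. Unlike contrastive methods, DINO does not require negative sampling. The embeddings are evaluated on speaker verification and emotion recognition. In speaker verification, the unsupervised DINO embedding with cosine scoring achieves 4.38% EER on the VoxCeleb1 test trials. This outperforms the best contrastive self-supervised method by 40% relative in EER. An iterative pseudo-labeling training pipeline, which does not require speaker labels, further improves the EER to 1.89%. In emotion recognition, the DINO embedding obtains micro-F1 scores of 60.87%, 79.21%, and 56.98% on IEMOCAP, Crema-D, and MSP-Podcast, respectively. The results suggest the generality of the DINO embedding across different speech applications.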
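Why DINO needs no negative samples can be seen from the shape of its objective: a student network is trained to match the output distribution of a teacher on another view of the same utterance, and the teacher is an exponential moving average of the student. The NumPy sketch below follows that general recipe as described for DINO in computer vision; the temperatures, the momentum value, and the function names are illustrative assumptions, not the configuration used in this paper.

```python
import numpy as np

def softmax(logits, temperature):
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def dino_style_loss(student_logits, teacher_logits, center,
                    t_student=0.1, t_teacher=0.04):
    """Cross-entropy between the centered, sharpened teacher distribution
    and the student distribution. No negative samples are involved."""
    p_teacher = softmax(teacher_logits - center, t_teacher)
    log_p_student = np.log(softmax(student_logits, t_student) + 1e-12)
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())

def ema_update(teacher_weights, student_weights, momentum=0.996):
    """The teacher follows the student through an exponential moving average."""
    return momentum * teacher_weights + (1.0 - momentum) * student_weights

teacher_logits = np.array([[2.0, 0.0, -1.0]])  # one view of an utterance
center = np.zeros(3)

loss_agree = dino_style_loss(teacher_logits, teacher_logits, center)
loss_disagree = dino_style_loss(-teacher_logits, teacher_logits, center)
```

The loss is small when the student agrees with the teacher and large when it does not; in the DINO recipe, the centering and sharpening of the teacher output are what prevent the two networks from collapsing to a trivial constant solution.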
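This paper analyzes both the nonlinear activation functions and the spatial max-pooling of Deep Convolutional Neural Networks (DCNNs) through the algebraic foundations of mathematical morphology. In addition, a general family of activation functions is proposed by considering both max-pooling and nonlinear operators in the context of morphological representations. The experimental section validates our approach on classical benchmarks for supervised learning with DCNNs.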
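The link between max-pooling and mathematical morphology can be checked numerically: non-overlapping max-pooling gives the same result as a flat morphological dilation followed by subsampling. The sketch below is a minimal NumPy illustration of that standard equivalence, not code from the paper; the helper names `flat_dilation` and `max_pool` are hypothetical, and the window is anchored at its top-left corner for simplicity.

```python
import numpy as np

def flat_dilation(x, k):
    """Flat dilation: maximum over a k x k window anchored at each pixel.

    The image is padded with -inf so border windows ignore missing values.
    """
    h, w = x.shape
    padded = np.full((h + k - 1, w + k - 1), -np.inf)
    padded[:h, :w] = x
    out = np.full((h, w), -np.inf)
    for di in range(k):
        for dj in range(k):
            out = np.maximum(out, padded[di:di + h, dj:dj + w])
    return out

def max_pool(x, k):
    """Non-overlapping k x k max-pooling (height and width divisible by k)."""
    h, w = x.shape
    return x.reshape(h // k, k, w // k, k).max(axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)
x[1, 2] = 20.0

pooled = max_pool(x, 2)
dilated_then_sampled = flat_dilation(x, 2)[::2, ::2]
```

Both results equal `[[5, 20], [13, 15]]`. ReLU fits the same algebra, since `max(x, 0)` is also a supremum of two terms.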