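User-generated social media data changes constantly as new trends influence online discussion, causing distribution shift in the test data of social media NLP applications. In addition, training data is often subject to change as user data is deleted. Most current NLP systems are static and rely on fixed training data. As a result, they cannot adapt to temporal change (both test distribution shift and deleted training data) without frequent, costly re-training. In this paper, we study temporal adaptation through the task of longitudinal hashtag prediction and propose a non-parametric technique as a simple but effective solution: non-parametric classifiers use datastores that can be updated, either to adapt to test distribution shift or to training data deletion, without re-training. We release a new benchmark dataset comprising 713M tweets from 2021, along with their hashtags, split into consecutive temporal buckets. We compare parametric neural hashtag classification and hashtag generation models, which need re-training for adaptation, with a non-parametric, training-free dense retrieval method that returns the nearest neighbor's hashtags based on text embedding distance. In experiments on our longitudinal Twitter dataset, we find that dense nearest neighbor retrieval achieves a relative performance gain of 64.12% over the best parametric baseline on test sets that exhibit distribution shift, without requiring gradient-based re-training. Furthermore, we show that our datastore approach is particularly well suited to dynamically deleted user data, with negligible computational cost and performance loss. Our novel benchmark dataset and empirical analysis can support future research into the important challenges that temporality poses when deploying AI systems on real-world user data.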
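The datastore idea described above can be illustrated with a minimal sketch. It assumes tweet embeddings are already available as fixed-length vectors and uses Euclidean distance; the class name `HashtagDatastore` and its methods are hypothetical and are not taken from the paper's code.

```python
import numpy as np


class HashtagDatastore:
    """Toy datastore mapping tweet embeddings to hashtags (illustrative only)."""

    def __init__(self, dim):
        self.dim = dim
        self.keys = np.empty((0, dim), dtype=np.float64)  # one row per stored tweet
        self.values = []  # hashtags of each stored tweet
        self.ids = []     # tweet identifiers, used for deletion

    def add(self, tweet_id, embedding, hashtags):
        # Adapting to new data is an append; no gradient step is involved.
        vec = np.asarray(embedding, dtype=np.float64).reshape(1, self.dim)
        self.keys = np.vstack([self.keys, vec])
        self.values.append(list(hashtags))
        self.ids.append(tweet_id)

    def delete(self, tweet_id):
        # Honoring a deletion request is a row removal; no re-training.
        idx = self.ids.index(tweet_id)
        self.keys = np.delete(self.keys, idx, axis=0)
        del self.values[idx]
        del self.ids[idx]

    def predict(self, query):
        # Return the hashtags of the nearest stored tweet (assumed metric: L2).
        if not self.ids:
            return []
        q = np.asarray(query, dtype=np.float64)
        dists = np.linalg.norm(self.keys - q, axis=1)
        return self.values[int(np.argmin(dists))]


store = HashtagDatastore(dim=2)
store.add("t1", [1.0, 0.0], ["#olympics"])
store.add("t2", [0.0, 1.0], ["#election"])
before = store.predict([0.9, 0.1])
store.delete("t1")
after = store.predict([0.9, 0.1])
```

In this toy example the query first retrieves the hashtags of `t1`; once `t1` is deleted, the same query falls back to `t2`. A practical system would likely use an approximate nearest-neighbor index instead of a brute-force scan.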
We study representation learning for efficient imitation learning over linear systems. In particular, we consider a setting where learning is split into two phases: (a) a pre-training step where a shared $k$-dimensional representation is learned from $H$ source policies, and (b) a target policy fine-tuning step where the learned representation is used to parameterize the policy class. We find that the imitation gap over trajectories generated by the learned target policy is bounded by $\tilde{O}\left( \frac{k n_x}{HN_{\mathrm{shared}}} + \frac{k n_u}{N_{\mathrm{target}}}\right)$, where $n_x > k$ is the state dimension, $n_u$ is the input dimension, $N_{\mathrm{shared}}$ denotes the total amount of data collected for each policy during representation learning, and $N_{\mathrm{target}}$ is the amount of target task data. This result formalizes the intuition that aggregating data across related tasks to learn a representation can significantly improve the sample efficiency of learning a target task. The trends suggested by this bound are corroborated in simulation.
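To get a feel for the two terms in this bound, the following sketch evaluates the rate with constants and logarithmic factors dropped. The function name and the example values are illustrative assumptions, not quantities from the paper.

```python
def imitation_gap_rate(k, n_x, n_u, H, n_shared, n_target):
    """Evaluate k*n_x/(H*N_shared) + k*n_u/N_target, ignoring constants and logs."""
    shared_term = k * n_x / (H * n_shared)  # shrinks as more source policies are pooled
    target_term = k * n_u / n_target        # depends only on target-task data
    return shared_term + target_term


# Hypothetical values: k=2, n_x=10, n_u=4, 100 samples per source policy, 50 target samples.
few_sources = imitation_gap_rate(k=2, n_x=10, n_u=4, H=5, n_shared=100, n_target=50)
many_sources = imitation_gap_rate(k=2, n_x=10, n_u=4, H=50, n_shared=100, n_target=50)
```

Increasing $H$ reduces only the first term, so with many source policies the rate is dominated by $k n_u / N_{\mathrm{target}}$, which does not involve the state dimension $n_x$.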
The well-documented presence of texture bias in modern convolutional neural networks has led to a plethora of algorithms that promote an emphasis on shape cues, often to support generalization to new domains. Yet common datasets, benchmarks, and general model selection strategies are missing, and there is no agreed-upon, rigorous evaluation protocol. In this paper, we investigate difficulties and limitations when training networks with reduced texture bias. In particular, we show that proper evaluation and meaningful comparisons between methods are not trivial. We introduce BiasBed, a testbed for texture- and style-biased training, including multiple datasets and a range of existing algorithms. It comes with an extensive evaluation protocol that includes rigorous hypothesis testing to gauge the significance of the results, despite the considerable training instability of some style bias methods. Our extensive experiments shed new light on the need for careful, statistically founded evaluation protocols for style bias (and beyond). For example, we find that some algorithms proposed in the literature do not significantly mitigate the impact of style bias at all. With the release of BiasBed, we hope to foster a common understanding of consistent and meaningful comparisons, and consequently faster progress towards learning methods free of texture bias. Code is available at https://github.com/D1noFuzi/BiasBed
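As an illustration of what significance testing over repeated training runs can look like, here is a minimal sketch of a two-sided permutation test on per-seed accuracies. The abstract does not say which test BiasBed uses, so the choice of test, the function name, and the example scores are all assumptions.

```python
import numpy as np


def permutation_test(scores_a, scores_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference of mean scores."""
    rng = np.random.default_rng(seed)
    a = np.asarray(scores_a, dtype=np.float64)
    b = np.asarray(scores_b, dtype=np.float64)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_permutations):
        perm = rng.permutation(pooled)  # randomly reassign runs to the two methods
        diff = abs(perm[: len(a)].mean() - perm[len(a):].mean())
        if diff >= observed - 1e-12:    # small tolerance for floating-point ties
            count += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (count + 1) / (n_permutations + 1)


# Hypothetical accuracies from five seeds per method.
p_separated = permutation_test([0.90, 0.91, 0.92, 0.93, 0.94],
                               [0.50, 0.51, 0.52, 0.53, 0.54])
p_identical = permutation_test([0.5] * 5, [0.5] * 5)
```

With unstable training, a difference in mean accuracy between two methods is only informative relative to the spread across seeds, which is what a test of this kind quantifies.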
Pre-trained protein language models have demonstrated significant applicability in different protein engineering tasks. A common way to use the latent representations of these pre-trained transformer models is to mean-pool across residue positions, reducing the feature dimensions for downstream tasks such as predicting biophysical properties or other functional behaviours. In this paper, we provide a two-fold contribution to machine learning (ML) driven drug design. Firstly, we demonstrate the power of sparsity: applying sparsity-promoting penalization to pre-trained transformer models yields more robust and accurate melting temperature (Tm) prediction for single-chain variable fragments, with a mean absolute error of 0.23 °C. Secondly, we demonstrate the power of framing our prediction problem in a probabilistic framework. Specifically, we advocate adopting probabilistic frameworks, especially in the context of ML driven drug design.
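The mean-pooling step mentioned above can be sketched as follows. This assumes per-residue embeddings arrive as an array of shape (sequence length, feature dimension) with a padding mask; the function name `mean_pool` is hypothetical and the example values are made up.

```python
import numpy as np


def mean_pool(residue_embeddings, mask):
    """Average per-residue embeddings over real (non-padded) positions."""
    emb = np.asarray(residue_embeddings, dtype=np.float64)  # shape (L, D)
    m = np.asarray(mask, dtype=np.float64)[:, None]         # shape (L, 1), 1 = real residue
    n_real = m.sum()
    if n_real == 0:
        raise ValueError("mask selects no residues")
    return (emb * m).sum(axis=0) / n_real                   # shape (D,)


# Three positions, the last one is padding and must not affect the result.
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]], mask=[1, 1, 0])
```

The pooled vector has one entry per feature dimension regardless of sequence length, which is what makes it usable as a fixed-size input to a downstream regressor.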
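In this work, we present Affinity-VAE: a framework for automatic clustering and classification of objects in multidimensional image data based on their similarity. The method extends the concept of $\beta$-VAEs with an informed similarity-based loss component driven by an affinity matrix. Compared with a standard $\beta$-VAE, the Affinity-VAE is able to create rotationally invariant, morphologically homogeneous clusters in the latent representation, with improved cluster separation. We explore the extent of latent disentanglement and continuity of the latent spaces on both 2D and 3D image data, including simulated biological electron cryo-tomography (cryo-ET) volumes as an example of a scientific application.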
