Graph Neural Networks (GNNs) are powerful deep learning methods for non-Euclidean data. Popular GNNs are message-passing algorithms (MPNNs) that aggregate and combine signals in a local graph neighborhood. However, shallow MPNNs tend to miss long-range signals and perform poorly on some heterophilous graphs, while deep MPNNs can suffer from issues such as over-smoothing or over-squashing. To mitigate such issues, existing works typically borrow normalization techniques from training neural networks on Euclidean data or modify the graph structures. Yet these approaches are not well understood theoretically and could increase the overall computational complexity. In this work, we draw inspiration from spectral graph embedding and propose $\texttt{PowerEmbed}$, a simple layer normalization technique to boost MPNNs. We show that $\texttt{PowerEmbed}$ can provably express the top-$k$ leading eigenvectors of the graph operator, which prevents over-smoothing and is agnostic to the graph topology; meanwhile, it produces a list of representations ranging from local features to global signals, which avoids over-squashing. We apply $\texttt{PowerEmbed}$ to a wide range of simulated and real graphs and demonstrate its competitive performance, particularly for heterophilous graphs.
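To make the mechanism concrete, here is a minimal NumPy sketch of the idea as described: interleave message passing with a column-orthonormalization step so that the layer iterates behave like orthogonal (power) iteration and converge to the top-$k$ leading eigenvectors of the graph operator. The function names and the QR-based normalization are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def power_embed(adj_norm, feats, num_layers):
    """Sketch of a PowerEmbed-style layer normalization: propagate features
    with the normalized graph operator, then orthonormalize the columns via
    QR, as in orthogonal iteration. The iterates converge to the operator's
    top-k leading eigenvectors, so stacking layers cannot over-smooth."""
    reps = [feats]                      # list of per-layer representations
    h = feats
    for _ in range(num_layers):
        h = adj_norm @ h                # message passing / power step
        h, _ = np.linalg.qr(h)          # layer normalization: orthonormalize columns
        reps.append(h)
    return reps                         # local-to-global list of embeddings

# Toy usage: symmetric normalized adjacency of a small graph, k = 2 channels.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d = A.sum(1)
A_norm = A / np.sqrt(np.outer(d, d))
embeddings = power_embed(A_norm, np.random.randn(4, 2), num_layers=10)
```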
Ultra-fine entity typing (UFET) predicts extremely free-form types (e.g., president, politician) of a given entity mention (e.g., Joe Biden) in context. State-of-the-art (SOTA) methods use a cross-encoder (CE) based architecture. CE concatenates the mention (and its context) with each type and feeds the pairs into a pretrained language model (PLM) to score their relevance. This enables deeper interaction between mentions and types and reaches better performance, but it has to perform N (the type-set size) forward passes to infer the types of a single mention. CE is therefore very slow in inference when the type set is large (e.g., N = 10k for UFET). To this end, we propose to perform entity typing in a recall-expand-filter manner. The recall and expand stages prune the large type set and generate the K (typically less than 256) most relevant type candidates for each mention. At the filter stage, we use a novel model called MCCE to concurrently encode and score these K candidates in only one forward pass to obtain the final type prediction. We investigate different variants of MCCE, and extensive experiments show that MCCE under our paradigm reaches SOTA performance on ultra-fine entity typing and is thousands of times faster than the cross-encoder. We also find MCCE very effective for fine-grained (130 types) and coarse-grained (9 types) entity typing. Our code is available at \url{https://github.com/modelscope/AdaSeq/tree/master/examples/MCCE}.
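A hedged sketch of the single-pass idea behind the filter stage: instead of K separate cross-encoder passes, the mention tokens and all K candidate types are packed into one sequence, encoded once, and each candidate slot is scored from its contextualized representation. The toy encoder and scoring head below are stand-ins (plain PyTorch Transformer layers rather than a PLM), assumed for illustration only.

```python
import torch
import torch.nn as nn

class MCCESketch(nn.Module):
    """Score K type candidates for one mention in a single forward pass,
    instead of K cross-encoder passes. Not the paper's exact architecture."""

    def __init__(self, vocab_size, d_model=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.scorer = nn.Linear(d_model, 1)

    def forward(self, mention_ids, candidate_ids):
        # mention_ids: (B, Lm) tokens of mention + context
        # candidate_ids: (B, K), one token id per candidate type (simplified)
        x = torch.cat([self.embed(mention_ids), self.embed(candidate_ids)], dim=1)
        h = self.encoder(x)                      # ONE pass over mention + K candidates
        cand_h = h[:, mention_ids.size(1):, :]   # representations of the K candidate slots
        return self.scorer(cand_h).squeeze(-1)   # (B, K) relevance scores

model = MCCESketch(vocab_size=1000)
scores = model(torch.randint(0, 1000, (2, 32)), torch.randint(0, 1000, (2, 64)))
```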
Diffusion models have emerged as a powerful tool for point cloud generation. A key component that drives their impressive performance in generating high-quality samples from noise is iterative denoising over thousands of steps. While beneficial, this many-step complexity limits their use in many real-world 3D applications. To address this limitation, we propose Point Straight Flow (PSF), a model that exhibits impressive performance using one step. Our idea is based on reformulating the standard diffusion model to optimize the curvy learning trajectory into a straight path. Further, we develop a distillation strategy to shorten the straight path into one step without a performance loss, enabling applications to real-world 3D tasks with latency constraints. We perform evaluations on multiple 3D tasks and find that our PSF performs comparably to the standard diffusion model, outperforming other efficient 3D point cloud generation methods. On real-world applications such as point cloud completion and training-free text-guided generation in a low-latency setup, PSF performs favorably.
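To illustrate the distillation step described above, here is a minimal PyTorch sketch: a teacher velocity field simulated with many Euler steps, and a student trained so that a single Euler step from noise lands on the teacher's endpoint. The toy network and loss are assumptions for illustration, not the paper's exact objective.

```python
import copy
import torch
import torch.nn as nn

# Toy velocity field v(x, t) on 3D points; a stand-in for the PSF network.
class Velocity(nn.Module):
    def __init__(self, dim=3):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.SiLU(), nn.Linear(64, dim))
    def forward(self, x, t):
        return self.net(torch.cat([x, t.expand(x.size(0), 1)], dim=-1))

@torch.no_grad()
def simulate(v, z, steps=100):
    """Euler-simulate the learned (nearly straight) ODE from noise z."""
    x, dt = z, 1.0 / steps
    for i in range(steps):
        x = x + v(x, torch.full((1,), i * dt)) * dt
    return x

teacher = Velocity()                      # assumed: already trained on straight paths
student = copy.deepcopy(teacher)
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

z = torch.randn(256, 3)                   # noise sample (toy point cloud)
target = simulate(teacher, z)             # teacher's multi-step endpoint
x_one = z + student(z, torch.zeros(1))    # student's SINGLE Euler step over [0, 1]
loss = ((x_one - target) ** 2).mean()     # shorten the straight path into one step
opt.zero_grad(); loss.backward(); opt.step()
```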
Ultra-fine entity typing (UFET) aims to predict a wide range of type phrases that correctly describe the categories of a given entity mention in a sentence. Most recent works infer each entity type independently, ignoring the correlations between types; e.g., when an entity is inferred to be a president, it should also be a politician and a leader. To this end, we use an undirected graphical model called a pairwise conditional random field (PCRF) to formulate the UFET problem, in which the type variables are not only unarily influenced by the input but also pairwise related to all the other type variables. We use various modern backbones for entity typing to compute unary potentials, and derive pairwise potentials from type phrase representations that both capture prior semantic information and facilitate accelerated inference. We use mean-field variational inference for efficient type inference on very large type sets and unfold it as a neural network module to enable end-to-end training. Experiments on UFET show that the Neural-PCRF consistently outperforms its backbones with little additional cost and achieves competitive performance against the cross-encoder based SOTA while being thousands of times faster. We also find Neural-PCRF effective on a widely used fine-grained entity typing dataset with a smaller type set. We package Neural-PCRF as a network module that can be plugged onto multi-label type classifiers with ease and release it at https://github.com/modelscope/adaseq/tree/master/examples/NPCRF.
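A minimal sketch of the unrolled mean-field module, under the simplifying assumption of binary type variables and a dense pairwise compatibility matrix (derived from type-phrase representations in the paper, random here): each iteration updates every type's marginal from its unary logit plus messages from all other types.

```python
import torch

def mean_field_inference(unary, pairwise, n_iters=4):
    """Unrolled mean-field updates for a pairwise CRF over binary type
    variables, usable as a differentiable module. unary: (B, N) logits from
    any typing backbone; pairwise: (N, N) type-type compatibility scores.
    A simplified sketch of the update rule, not the paper's exact one."""
    q = torch.sigmoid(unary)            # initialize marginals from unaries
    for _ in range(n_iters):
        # each type's logit is its unary plus messages from all other types,
        # weighted by the current belief that those types are on
        msg = q @ pairwise              # (B, N)
        q = torch.sigmoid(unary + msg)
    return q                            # approximate posterior marginals

unary = torch.randn(2, 1000)            # toy backbone logits over 1000 types
pairwise = torch.randn(1000, 1000) * 0.01
marginals = mean_field_inference(unary, pairwise)
```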
Swarm learning (SL) is an emerging and promising decentralized machine learning paradigm that has achieved high performance in clinical applications. SL removes the central structure of federated learning by combining edge computing with a blockchain-based peer-to-peer network. While it shows promising results under the assumption of independent and identically distributed (IID) data across participants, SL suffers from performance degradation as the degree of non-IID data increases. To address this problem, we propose a generative augmentation framework for swarm learning called SL-GAN, which augments non-IID data by generating synthetic data from participants. SL-GAN trains generators and discriminators locally and periodically aggregates them via a randomly elected coordinator in the SL network. Under standard assumptions, we theoretically prove the convergence of SL-GAN using stochastic approximation. Experimental results demonstrate that SL-GAN outperforms state-of-the-art methods on three real-world clinical datasets: Tuberculosis, Leukemia, and COVID-19.
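The aggregation round can be sketched as plain parameter averaging orchestrated by a randomly elected coordinator; this toy version omits local GAN training and the blockchain-based peer-to-peer exchange, and is an assumption about the mechanics rather than the system's actual protocol code.

```python
import copy
import random
import torch
import torch.nn as nn

def aggregate(models):
    """Average the parameters of locally trained models, as a randomly
    elected coordinator would in a swarm learning sync round."""
    avg = copy.deepcopy(models[0])
    with torch.no_grad():
        for name, p in avg.named_parameters():
            p.copy_(torch.stack(
                [dict(m.named_parameters())[name] for m in models]).mean(0))
    return avg

participants = [nn.Linear(8, 1) for _ in range(5)]  # stand-ins for local generators
# ... each participant trains locally on its own (possibly non-IID) data ...
coordinator = random.randrange(len(participants))   # random election each round
merged = aggregate(participants)                    # performed by participants[coordinator]
for m in participants:                              # broadcast merged weights back
    m.load_state_dict(merged.state_dict())
```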
We present rectified flow, a surprisingly simple approach to learning (neural) ordinary differential equation (ODE) models for transporting between two empirically observed distributions $\pi_0$ and $\pi_1$, hence providing a unified solution to generative modeling and domain transfer, among other tasks involving distribution transport. The idea of rectified flow is to learn the ODE to follow, as much as possible, the straight paths connecting points drawn from $\pi_0$ and $\pi_1$. This is achieved by solving a straightforward nonlinear least squares optimization problem, which can easily be scaled to large models without introducing extra parameters beyond standard supervised learning. Straight paths are special and preferred because they are the shortest paths between two points and can be simulated exactly without time discretization, hence yielding computationally efficient models. We show that the procedure of learning a rectified flow from data, called rectification, turns an arbitrary coupling of $\pi_0$ and $\pi_1$ into a new deterministic coupling with provably non-increasing convex transport costs. In addition, recursively applying rectification allows us to obtain a sequence of flows with increasingly straight paths, which can be simulated accurately with coarse time discretization in the inference phase. In empirical studies, we show that rectified flow performs superbly on image generation, image-to-image translation, and domain adaptation. In particular, on image generation and translation, our method yields nearly straight flows that give high-quality results even with a single Euler discretization step.
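The nonlinear least squares objective is simple enough to sketch in a few lines of PyTorch: draw a pair $(x_0, x_1)$ from the current coupling, take a random point on the straight interpolation between them, and regress the drift network onto the straight-line direction $x_1 - x_0$. The toy network and data below are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Stand-in drift network v(x, t) on 2D data; any architecture works.
v = nn.Sequential(nn.Linear(2 + 1, 64), nn.SiLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(v.parameters(), lr=1e-3)

def rectified_flow_step(x0, x1):
    """One step of the least squares objective: fit v(x_t, t) to the
    straight-line direction x1 - x0 along x_t = t*x1 + (1-t)*x0."""
    t = torch.rand(x0.size(0), 1)           # random time in [0, 1]
    xt = t * x1 + (1 - t) * x0              # point on the straight path
    pred = v(torch.cat([xt, t], dim=-1))
    loss = ((x1 - x0 - pred) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

x0 = torch.randn(512, 2)                    # samples from pi_0 (e.g., noise)
x1 = torch.randn(512, 2) + 4.0              # samples from pi_1 (e.g., data)
for _ in range(100):
    rectified_flow_step(x0, x1)
```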
AI-based molecule generation offers a promising approach to a large number of biomedical science and engineering problems, such as antibody design, hydrolase engineering, or vaccine development. Because molecules are governed by physical laws, a key challenge is to incorporate prior information into the training procedure so as to generate high-quality and realistic molecules. We propose a simple and novel approach to steer the training of diffusion-based generative models with physical and statistical prior information. This is achieved by constructing physically informed diffusion bridges, stochastic processes that are guaranteed to yield a given observation at a fixed terminal time. We develop a Lyapunov-function-based method to construct and determine bridges, and propose a number of informative prior bridges for both high-quality molecule generation and uniformity-promoting 3D point cloud generation. With comprehensive experiments, we show that our method provides a powerful approach for 3D generation tasks, yielding molecular structures with better quality and stability scores and more uniformly distributed point clouds of high quality.
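The endpoint-steering idea behind a diffusion bridge can be illustrated with the textbook special case, a Brownian bridge, whose drift $(y - x)/(T - t)$ guarantees the process hits the observation $y$ at time $T$. The paper's Lyapunov-based construction generalizes this to physically informed priors; the NumPy sketch below covers only the simple case.

```python
import numpy as np

def brownian_bridge(x0, y, T=1.0, steps=1000, sigma=1.0, seed=0):
    """Euler-Maruyama simulation of the simplest diffusion bridge: Brownian
    motion steered by the drift (y - x)/(T - t), which forces the process
    to end at the observation y at terminal time T."""
    rng = np.random.default_rng(seed)
    dt = T / steps
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for i in range(steps - 1):
        t = i * dt
        drift = (np.asarray(y) - x) / (T - t)   # pulls the process onto y
        x = x + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal(x.shape)
        path.append(x.copy())
    path.append(np.asarray(y, dtype=float))      # hits the observation exactly
    return np.stack(path)

path = brownian_bridge(x0=[0.0, 0.0, 0.0], y=[1.0, -1.0, 2.0])
```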
Active learning, which effectively collects informative unlabeled data for annotation, reduces the demand for labeled data. In this work, we propose to retrieve unlabeled samples with a local-sensitivity and hardness-aware acquisition function. The proposed method generates data copies through local perturbations and selects the data points whose predictive likelihoods diverge the most from their copies. We further strengthen our acquisition function by injecting perturbations into the selected cases. Our method achieves consistent gains over commonly used active learning strategies across various classification tasks. We also observe consistent improvements over baselines in the study of prompt selection for prompt-based few-shot learning. These experiments show that our acquisition, guided by local sensitivity and hardness, is effective and beneficial for many NLP tasks.
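A hedged sketch of the acquisition rule: score each unlabeled point by the divergence between the model's prediction on the original input and on locally perturbed copies, then query the highest-scoring points. Gaussian noise stands in for the perturbation here (paraphrasing or token dropout in practice), and the classifier is a toy stand-in.

```python
import torch
import torch.nn.functional as F

def acquisition_scores(model, x, perturb, n_copies=4):
    """Local-sensitivity/hardness acquisition (sketch): average KL divergence
    between predictions on the original input and its perturbed copies.
    Points whose predictions change most under perturbation are queried."""
    with torch.no_grad():
        p = F.log_softmax(model(x), dim=-1)
        score = torch.zeros(x.size(0))
        for _ in range(n_copies):
            q = F.log_softmax(model(perturb(x)), dim=-1)
            # KL(p || q) between original and copy predictions
            score += F.kl_div(q, p, log_target=True, reduction="none").sum(-1)
    return score / n_copies

model = torch.nn.Linear(16, 4)                  # stand-in classifier
pool = torch.randn(1000, 16)                    # unlabeled pool
scores = acquisition_scores(model, pool, lambda z: z + 0.1 * torch.randn_like(z))
query = scores.topk(32).indices                 # most sensitive points to annotate
```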
Generating images from natural language instructions is an intriguing yet highly challenging task. We approach text-to-image generation by combining the power of the pre-trained CLIP representation with an off-the-shelf image generator (GAN), optimizing in the GAN's latent space to find images that achieve the maximum CLIP score with the given input text. Compared with traditional methods that train generative models from text to image from scratch, the CLIP+GAN approach is training-free, zero-shot, and can be easily customized with different generators. However, optimizing the CLIP score in the GAN space casts a highly challenging optimization problem, and off-the-shelf optimizers such as Adam fail to yield satisfying results. In this work, we propose the FuseDream pipeline, which improves the CLIP+GAN approach with three key techniques: 1) an AugCLIP score that robustifies the CLIP objective by introducing random augmentations on the image; 2) a novel initialization and over-parameterization strategy for optimization that allows us to efficiently navigate the non-convex landscape in the GAN space; 3) a composed generation technique that, by leveraging a novel bi-level optimization formulation, can compose multiple images to extend the GAN space and overcome data bias. When prompted with different input texts, FuseDream can generate high-quality images with varied objects, backgrounds, and artistic styles, even novel counterfactual concepts that do not appear in the training data of the GANs we use. Quantitatively, the images generated by FuseDream yield top-level Inception score and FID score on the MS COCO dataset, without additional architecture design or training. Our code is publicly available at \url{https://github.com/gnobitab/FuseDream}.
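The AugCLIP idea can be sketched independently of any specific CLIP library: average the score over random augmentations of the candidate image while optimizing a latent code. `clip_score`, the generator, and the augmentation below are toy stand-ins assumed for illustration, not a real library API.

```python
import torch

def aug_clip_score(clip_score, image, text, augment, n_aug=8):
    """AugCLIP (sketch): average the CLIP score over random augmentations
    of the image, discouraging adversarial pixels that fool CLIP at only
    one exact crop or scale. clip_score and augment are assumed callables."""
    return torch.stack([clip_score(augment(image), text) for _ in range(n_aug)]).mean()

# Toy stand-ins so the sketch runs end-to-end; substitute a real CLIP model,
# a real GAN generator, and random crops/color jitter in practice.
fake_clip = lambda img, txt: img.std()
jitter = lambda img: img + 0.05 * torch.randn_like(img)
gen = torch.nn.Linear(128, 3 * 32 * 32)         # stand-in generator
z = torch.randn(1, 128, requires_grad=True)     # latent code to optimize
opt = torch.optim.Adam([z], lr=0.05)
for _ in range(10):                             # maximize AugCLIP in latent space
    img = gen(z).view(3, 32, 32)
    loss = -aug_clip_score(fake_clip, img, "a photo of a dog", jitter)
    opt.zero_grad(); loss.backward(); opt.step()
```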
This paper focuses on designing efficient models with low parameter counts and FLOPs for dense predictions. Even though CNN-based lightweight methods have achieved stunning results after years of research, the trade-off between model accuracy and constrained resources still needs further improvement. This work rethinks the essential unity of the efficient Inverted Residual Block in MobileNetv2 and the effective Transformer in ViT, inductively abstracting a general concept of the Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance even though instantiations share the same framework. Motivated by this phenomenon, we deduce a simple yet efficient modern \textbf{I}nverted \textbf{R}esidual \textbf{M}obile \textbf{B}lock (iRMB) for mobile applications, which absorbs CNN-like efficiency for modeling short-distance dependency and Transformer-like dynamic modeling capability for learning long-distance interactions. Furthermore, we design a ResNet-like 4-phase \textbf{E}fficient \textbf{MO}del (EMO) based only on a series of iRMBs for dense applications. Extensive experiments on the ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of our EMO over state-of-the-art methods; \eg, our EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1 accuracy, surpassing \textbf{SoTA} CNN-/Transformer-based models, while trading off model accuracy and efficiency well.
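A hedged PyTorch sketch of the block concept: a MobileNetV2-style inverted residual (expand, depthwise convolution, project) whose expanded features also pass through self-attention, so a single block models both short- and long-distance dependencies. The exact layer order and attention variant are assumptions, not the paper's iRMB.

```python
import torch
import torch.nn as nn

class iRMBSketch(nn.Module):
    """Illustrative Inverted Residual Mobile Block: expand -> depthwise conv
    (CNN-like, short-distance) -> self-attention (Transformer-like,
    long-distance) -> project, with a residual connection."""

    def __init__(self, dim, expand=4, heads=4):
        super().__init__()
        hidden = dim * expand
        self.expand = nn.Conv2d(dim, hidden, 1)
        self.dw = nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.project = nn.Conv2d(hidden, dim, 1)
        self.act = nn.SiLU()

    def forward(self, x):
        b, c, h, w = x.shape
        y = self.act(self.expand(x))
        y = self.act(self.dw(y))                            # local modeling
        tokens = y.flatten(2).transpose(1, 2)               # (B, HW, hidden)
        attn_out = self.attn(tokens, tokens, tokens)[0]     # global interaction
        y = y + attn_out.transpose(1, 2).reshape(b, -1, h, w)
        return x + self.project(y)                          # inverted residual

block = iRMBSketch(dim=32)
out = block(torch.randn(2, 32, 14, 14))                     # shape preserved
```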