Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Time series data appear in a variety of applications, such as smart transportation and environmental monitoring. One of the fundamental problems in time series analysis is time series forecasting. Despite the success of recent deep time series forecasting methods, these methods require sufficient observations of historical values to make accurate predictions. In other words, the ratio of the output length (or forecasting horizon) to the sum of the input and output lengths should be low enough (e.g., 0.3). As the ratio increases (e.g., to 0.8), the uncertainty of the forecast increases significantly. In this paper, we show both theoretically and empirically that the uncertainty can be effectively reduced by retrieving relevant time series as references. In the theoretical analysis, we first quantify the uncertainty and show its connection to the mean squared error (MSE). Then we prove that models with references are easier to learn than models without references, since the retrieved references can reduce the uncertainty. To empirically demonstrate the effectiveness of retrieval-based time series forecasting models, we introduce a simple yet effective two-stage method, called ReTime, consisting of relational retrieval and content synthesis. We also show that ReTime can be easily adapted to the spatial-temporal time series and time series imputation settings. Finally, we evaluate ReTime on real-world datasets to demonstrate its effectiveness.
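The two-stage recipe above can be illustrated with a toy sketch, using plain correlation-based ranking and continuation averaging as stand-ins for the paper's learned relational retrieval and content synthesis modules (the function names and sine-wave data below are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def retrieve_references(query, pool, k=2):
    """Relational retrieval (toy version): rank candidate series by the
    absolute Pearson correlation between their prefix and the query."""
    scores = [abs(np.corrcoef(query, cand[:len(query)])[0, 1]) for cand in pool]
    top = np.argsort(scores)[::-1][:k]
    return [pool[i] for i in top]

def synthesize_forecast(query, references, horizon):
    """Content synthesis (toy version): average the continuations of the
    retrieved references over the forecasting horizon."""
    conts = np.stack([r[len(query):len(query) + horizon] for r in references])
    return conts.mean(axis=0)
```

In the high-ratio regime the paper targets (e.g., horizon-to-total ratio 0.8), most of the predictive signal must come from the retrieved references rather than the short observed prefix, which is what makes the retrieval stage decisive.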
Contrastive learning is an effective unsupervised method in graph representation learning, and the key component of contrastive learning lies in the construction of positive and negative samples. Previous methods usually utilize the proximity of nodes in the graph as the principle. Recently, data-augmentation-based contrastive learning has advanced to show great power in the visual domain, and some works have extended this method from images to graphs. However, unlike data augmentation on images, data augmentation on graphs is far less intuitive and much harder to provide high-quality contrastive samples, which leaves much room for improvement. In this work, by introducing an adversarial graph view for data augmentation, we propose a simple but effective method, Adversarial Graph Contrastive Learning (ARIEL), to extract informative contrastive samples within reasonable constraints. We develop a new technique called information regularization for stabilized training and use subgraph sampling for scalability. We generalize our method from node-level contrastive learning to the graph level by treating each graph instance as a super-node. ARIEL consistently outperforms current graph contrastive learning methods on both node-level and graph-level classification tasks on real-world datasets. We further demonstrate that ARIEL is more robust in the face of adversarial attacks.
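The adversarial-view idea can be sketched numerically. The snippet below assumes an InfoNCE-style objective and uses an FGSM-like sign step with a finite-difference gradient; the encoder, step sizes, and loss form are illustrative stand-ins, not ARIEL's actual graph encoder or its information regularization:

```python
import numpy as np

def contrastive_loss(z1, z2, tau=0.5):
    """InfoNCE-style loss between two views: node i in view 1 should match
    node i in view 2 against all other nodes."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau
    sim = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

def adversarial_view(features, encode, z_anchor, eps=0.01, delta=1e-5):
    """Build an adversarial view by nudging node features in the direction
    that increases the contrastive loss (an FGSM-like sign step, using a
    finite-difference gradient for illustration only)."""
    base = contrastive_loss(encode(features), z_anchor)
    grad = np.zeros_like(features)
    for idx in np.ndindex(*features.shape):
        bumped = features.copy()
        bumped[idx] += delta
        grad[idx] = (contrastive_loss(encode(bumped), z_anchor) - base) / delta
    return features + eps * np.sign(grad)
```

Because the perturbed view is harder to match than a randomly augmented one, it supplies more informative contrastive samples, which is the intuition behind the adversarial view.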
Contrastive learning is an effective unsupervised method in graph representation learning. Recently, data-augmentation-based contrastive learning methods have been extended from images to graphs. However, most prior works are directly adapted from models designed for images. Unlike data augmentation on images, data augmentation on graphs is far less intuitive and much harder to provide high-quality contrastive samples, which are the key to the performance of contrastive learning models. This leaves much room for improvement over existing graph contrastive learning frameworks. In this work, by introducing an adversarial graph view and an information regularizer, we propose a simple but effective method, Adversarial Graph Contrastive Learning (ARIEL), to extract informative contrastive samples within a reasonable constraint. It consistently outperforms current graph contrastive learning methods in the node classification task over various real-world datasets and further improves the robustness of graph contrastive learning.
As periodic conference calls of publicly traded companies, earnings calls (ECs) have been widely studied as an essential market indicator because of their high analytical value for corporate fundamentals. The recent emergence of deep learning techniques has shown great promise in creating automated pipelines to benefit EC-supported financial applications. However, these methods presume all included contents to be informative, without refining valuable semantics from the long-text transcripts, and they suffer from the EC scarcity problem. Meanwhile, as black-box methods, they have inherent difficulty in providing human-understandable explanations. To this end, this paper proposes a Multi-domain Transformer-based Counterfactual Augmentation, named MTCA, to address the above problems. Specifically, we first propose a transformer-based EC encoder to attentively quantify the task-inspired significance of critical EC content for market inference. Then, a multi-domain counterfactual learning framework is developed to evaluate the gradient-based variations after perturbing the limited EC informative texts with plentiful cross-domain documents, enabling MTCA to perform unsupervised data augmentation. As a bonus, we discover a way to use non-training data as instance-based explanations, and we show the results with case studies. Extensive experiments on real-world financial datasets demonstrate the effectiveness of the interpretable MTCA, which improves the volatility evaluation ability of the state-of-the-art by 14.2% in accuracy.
With the advent of big data across multiple high-impact applications, we are often facing the challenge of complex heterogeneity. The newly collected data usually consist of multiple modalities and are characterized by multiple labels, thus exhibiting the co-existence of multiple types of heterogeneity. Although state-of-the-art techniques are good at modeling complex heterogeneity with sufficient label information, such label information can be very expensive to obtain in real applications. Recently, researchers have paid great attention to contrastive learning due to its prominent performance in utilizing rich unlabeled data. However, existing work on contrastive learning cannot address the problem of false negative pairs, i.e., some "negative" pairs may have similar representations if they have the same label. To overcome these issues, in this paper we propose a unified heterogeneous learning framework, which combines a weighted unsupervised contrastive loss and a weighted supervised contrastive loss to model multiple types of heterogeneity. We first provide a theoretical analysis showing that the vanilla contrastive learning loss easily leads to a sub-optimal solution in the presence of false negative pairs, whereas the proposed weighted losses can automatically adjust the weights based on the similarity of the learned representations to mitigate this problem. Experimental results on real-world datasets demonstrate the effectiveness and efficiency of the proposed framework in modeling multiple types of heterogeneity.
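The weighting idea can be sketched as follows: each negative's contribution to the contrastive denominator is scaled down by its cosine similarity to the anchor, so near-duplicate "negatives" (likely false negatives) are softly discounted. The specific weight rule below is an illustrative assumption, not the paper's exact formulation:

```python
import numpy as np

def weighted_contrastive_loss(anchor, positive, negatives, tau=0.5):
    """Contrastive loss in which each negative is down-weighted by its
    cosine similarity to the anchor, so likely false negatives
    (near-duplicate representations) contribute little to the denominator.
    The weight 1 - max(sim, 0) is a simple illustrative choice."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    pos = np.exp(cos(anchor, positive) / tau)
    neg = 0.0
    for n in negatives:
        s = cos(anchor, n)
        neg += (1.0 - max(s, 0.0)) * np.exp(s / tau)  # discount similar negatives
    return -np.log(pos / (pos + neg))
```

Compared with the vanilla loss, a negative that is almost identical to the anchor barely inflates the denominator, which is the mechanism by which the weighted loss avoids the sub-optimal solutions discussed in the theoretical analysis.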
We present an open-access natural language processing toolkit for Japanese medical information extraction. We first propose a novel relation annotation schema for investigating the medical and temporal relations between medical entities in Japanese medical reports. We experiment with the practical annotation scenarios by separately annotating two different types of reports. We design a pipeline system with three components for recognizing medical entities, classifying entity modalities, and extracting relations. The empirical results show accurate analyzing performance and suggest satisfactory annotation quality, an effective annotation strategy for the targeted report types, and the superiority of the latest contextual embedding models.
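The three-component pipeline design can be sketched as a skeleton. The lexicon, negation cues, and rules below are toy placeholders (hypothetical, not the toolkit's trained models), shown only to make the stage boundaries concrete:

```python
class MedicalIEPipeline:
    """A minimal three-stage skeleton mirroring the pipeline design:
    entity recognition -> modality classification -> relation extraction.
    All rules here are toy stand-ins for the toolkit's actual models."""

    ENTITY_LEXICON = {"fever": "symptom", "aspirin": "medicine"}  # hypothetical
    NEGATION_CUES = ("no ", "denies ")

    def recognize_entities(self, text):
        # Stage 1: lexicon lookup in place of a trained NER model
        return [(w, t) for w, t in self.ENTITY_LEXICON.items() if w in text]

    def classify_modality(self, text, entity):
        # Stage 2: "negative" if a negation cue precedes the entity
        prefix = text[: text.find(entity)]
        return "negative" if any(c in prefix for c in self.NEGATION_CUES) else "positive"

    def extract_relations(self, entities):
        # Stage 3: naively relate every medicine to every symptom
        meds = [e for e, t in entities if t == "medicine"]
        symps = [e for e, t in entities if t == "symptom"]
        return [(m, "treats", s) for m in meds for s in symps]

    def __call__(self, text):
        ents = self.recognize_entities(text)
        modalities = {e: self.classify_modality(text, e) for e, _ in ents}
        return {"entities": ents, "modalities": modalities,
                "relations": self.extract_relations(ents)}
```

Keeping the stages separate lets each component (and its annotations) be evaluated and swapped independently, which matches the toolkit's modular design.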
In the last year, new models and methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks. The GLUE benchmark, introduced a little over one year ago, offers a single-number metric that summarizes progress on a diverse set of such tasks, but performance on the benchmark has recently surpassed the level of non-expert humans, suggesting limited headroom for further research. In this paper we present SuperGLUE, a new benchmark styled after GLUE with a new set of more difficult language understanding tasks, a software toolkit, and a public leaderboard. SuperGLUE is available at super.gluebenchmark.com.
Benefiting from the intrinsic supervision information exploitation capability, contrastive learning has achieved promising performance in the field of deep graph clustering recently. However, we observe that two drawbacks of the positive and negative sample construction mechanisms limit the performance of existing algorithms. 1) The quality of positive samples heavily depends on carefully designed data augmentations, while inappropriate data augmentations easily lead to semantic drift and indiscriminative positive samples. 2) The constructed negative samples are unreliable, since they ignore important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC), which mines the intrinsic supervision information in high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct the positive samples from the same high-confidence cluster in the two views. Moreover, to construct semantically meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function that pulls together samples from the same cluster and pushes away those from other clusters, by maximizing the cross-view cosine similarity between positive samples and minimizing it between negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with existing state-of-the-art algorithms.
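The cluster-guided objective can be sketched numerically. In the snippet below, cross-view embeddings of the same node form positive pairs and the centers of the *other* high-confidence clusters serve as negatives; the encoder outputs, cluster assignments, and temperature are illustrative stand-ins, not CCGC's exact loss:

```python
import numpy as np

def cluster_guided_loss(z1, z2, labels, tau=0.5):
    """Sketch of a cluster-guided contrastive objective: pull together the
    cross-view embeddings of each sample, push each sample away from the
    centers of the other clusters, via cross-view cosine similarity."""
    def normalize(z):
        return z / np.linalg.norm(z, axis=1, keepdims=True)
    z1, z2 = normalize(z1), normalize(z2)
    # cluster centers in the second view, one per cluster id
    ids = np.unique(labels)
    centers = normalize(np.stack([z2[labels == c].mean(axis=0) for c in ids]))
    loss = 0.0
    for i in range(len(z1)):
        pos = np.exp(z1[i] @ z2[i] / tau)                  # same node, other view
        neg = sum(np.exp(z1[i] @ centers[k] / tau)          # other clusters' centers
                  for k, c in enumerate(ids) if c != labels[i])
        loss += -np.log(pos / (pos + neg))
    return loss / len(z1)
```

Using a handful of cluster centers as negatives, instead of all other nodes, avoids treating same-cluster neighbors as negatives, which is the false-negative failure mode the paper attributes to standard constructions.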
To generate high-quality rendered images for real-time applications, it is common to trace only a few samples per pixel (spp) at a lower resolution and then supersample to the high resolution. Based on the observation that the rendered pixels at a low resolution are typically highly aliased, we present a novel method for neural supersampling based on ray tracing 1/4-spp samples at the high resolution. Our key insight is that the ray-traced samples at the target resolution are accurate and reliable, which turns supersampling into an interpolation problem. We present a mask-reinforced neural network to reconstruct and interpolate high-quality image sequences. First, a novel temporal accumulation network is introduced to compute the correlation between current and previous features, significantly improving their temporal stability. Then a reconstruction network based on a multi-scale U-Net with skip connections is adopted to reconstruct and generate the desired high-resolution image. Experimental results and comparisons show that, without increasing the total number of ray-tracing samples, our method generates higher-quality supersampling results than current state-of-the-art methods.
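The core insight, that accurate sparse samples turn supersampling into interpolation, can be illustrated with a toy fill routine. The neighbor-averaging rule below is a crude stand-in for the paper's mask-reinforced reconstruction network; it only shows how a sparse-sample mask drives the interpolation:

```python
import numpy as np

def fill_from_sparse(samples, mask):
    """Interpolate a high-resolution image from sparse ray-traced pixels.
    mask[i, j] is True where an accurate sample exists; unknown pixels are
    filled with the average of their known 3x3 neighbors, sweeping until
    every pixel is covered (a crude stand-in for learned reconstruction)."""
    out = samples.astype(float).copy()
    known = mask.copy()
    h, w = out.shape
    while not known.all():
        new, new_known = out.copy(), known.copy()
        for i in range(h):
            for j in range(w):
                if known[i, j]:
                    continue
                neigh = [out[a, b]
                         for a in range(max(0, i - 1), min(h, i + 2))
                         for b in range(max(0, j - 1), min(w, j + 2))
                         if known[a, b]]
                if neigh:
                    new[i, j] = sum(neigh) / len(neigh)
                    new_known[i, j] = True
        out, known = new, new_known
    return out
```

With a regular 1/4-spp pattern (one accurate pixel per 2x2 block), a single sweep covers the whole frame, because every unknown pixel has at least one accurate neighbor.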