Image restoration tasks have witnessed great performance improvements in recent years through the development of large, deep models. Despite the outstanding performance, the heavy computation demanded by deep models limits the application of image restoration. To lift this limitation, the size of the network needs to be reduced while maintaining accuracy. Recently, N:M structured pruning has emerged as one of the effective and practical pruning approaches for making a model efficient under an accuracy constraint. However, it fails to account for the different computational complexities and performance requirements of different layers in an image restoration network. To further optimize the trade-off between efficiency and restoration accuracy, we propose a novel pruning method that determines the pruning ratio of the N:M structured sparsity at each layer. Extensive experimental results on super-resolution and deblurring tasks demonstrate the efficacy of our method, which outperforms previous pruning methods. A PyTorch implementation of the proposed method will be publicly available at https://github.com/junghunoh/sls_cvpr2022.
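As a point of reference, below is a minimal sketch of how an N:M structured-sparsity mask can be computed for a weight tensor, assuming a simple magnitude criterion (keep the N largest-magnitude weights in every consecutive group of M along the input dimension). The per-layer ratio selection described above is the paper's contribution and is not reproduced here.

```python
import torch

def nm_prune_mask(weight: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    """Binary mask keeping the n largest-magnitude weights in every
    consecutive group of m along the last (input) dimension."""
    out_dim, in_dim = weight.shape
    assert in_dim % m == 0, "input dim must be divisible by m"
    groups = weight.abs().reshape(out_dim, in_dim // m, m)
    # Indices of the top-n entries within each group of m
    topk = groups.topk(n, dim=-1).indices
    mask = torch.zeros_like(groups)
    mask.scatter_(-1, topk, 1.0)
    return mask.reshape(out_dim, in_dim)

# Usage: apply a 2:4 mask to a linear layer's weight
w = torch.randn(64, 128)
w_sparse = w * nm_prune_mask(w, n=2, m=4)
```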
Communicating with humans is challenging for AIs because it requires a shared understanding of the world, complex semantics (e.g., metaphors or analogies), and, at times, multi-modal gestures (e.g., pointing with a finger or an arrow in a diagram). We investigate these challenges in the context of Iconary, a collaborative game of drawing and guessing based on Pictionary that poses a novel challenge for the research community. In Iconary, a Guesser tries to identify a phrase that a Drawer is drawing by composing icons, and the Drawer iteratively revises the drawing in response to the guesses to help the Guesser. This back-and-forth frequently uses canonical scenes, visual metaphors, or icon compositions to express challenging words, making it an ideal test of mixing language and visual/symbolic communication in AI. We present models to play Iconary and train them on over 55,000 games between human players. Our models are skillful players and are able to employ world knowledge in language models to play with words unseen during training. Elite human players outperform our models, particularly at the drawing task, leaving an important gap for future research. We release our dataset, code, and evaluation setup as a challenge to the community at http://www.github.com/allenai/iconary.
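To make the drawing/guessing protocol concrete, here is a hypothetical sketch of how one Iconary game might be represented; the field names are illustrative assumptions, not the released dataset's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class Icon:
    name: str            # e.g. "person", "arrow" (illustrative)
    x: float             # canvas position
    y: float
    scale: float = 1.0
    rotation: float = 0.0

@dataclass
class GameTurn:
    drawing: list[Icon]  # the Drawer's current icon composition
    guesses: list[str]   # the Guesser's phrase guesses this round
    revealed: list[bool] # which words of the phrase are solved so far

@dataclass
class Game:
    phrase: str                   # the target phrase being drawn
    turns: list[GameTurn] = field(default_factory=list)
```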
Self-supervised learning is a promising unsupervised learning framework that has achieved success with large floating-point networks. However, such networks are not readily deployable to edge devices. To accelerate model deployment, bringing the benefit of such learning for various downstream tasks to resource-limited devices, we propose a self-supervised learning method for binary networks that uses a moving target network. In particular, we propose to jointly train a randomly initialized classifier, attached to a pretrained floating-point feature extractor, with a binary network. Additionally, we propose a feature similarity loss, dynamic loss balancing, and modified multi-stage training to further improve accuracy, and we call our method BURN. Our empirical validation on five downstream tasks with seven datasets shows that BURN outperforms self-supervised baselines for binary networks and sometimes outperforms supervised pretraining.
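A rough sketch of two of the named ingredients, under my own simplified assumptions: a cosine-based feature similarity loss pulling the binary network's features toward the target network's, and a BYOL-style exponential-moving-average update for the moving target network. The paper's exact formulations may differ.

```python
import torch
import torch.nn.functional as F

def feature_similarity_loss(student_feat: torch.Tensor,
                            target_feat: torch.Tensor) -> torch.Tensor:
    """Assumed form: cosine feature similarity loss between the binary
    network's features and the (detached) target network's features."""
    s = F.normalize(student_feat, dim=-1)
    t = F.normalize(target_feat.detach(), dim=-1)
    return (1 - (s * t).sum(dim=-1)).mean()

@torch.no_grad()
def update_moving_target(target_net: torch.nn.Module,
                         online_net: torch.nn.Module, tau: float = 0.99):
    """EMA update of the moving target network's parameters."""
    for p_t, p_o in zip(target_net.parameters(), online_net.parameters()):
        p_t.mul_(tau).add_(p_o, alpha=1 - tau)
```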
Continual Learning (CL) is an emerging machine learning paradigm that aims to learn from a continuous stream of tasks without forgetting knowledge learned from the previous tasks. To avoid the performance decrease caused by forgetting, prior studies exploit episodic memory (EM), which stores a subset of the past observed samples while learning from new non-i.i.d. data. Despite the promising results, since CL is often assumed to execute on mobile or IoT devices, the EM size is bounded by the small hardware memory capacity, which makes it infeasible to meet the accuracy requirements of real-world applications. Specifically, all prior CL methods discard samples that overflow the EM and can never retrieve them for subsequent training steps, incurring a loss of information that exacerbates catastrophic forgetting. We explore a novel hierarchical EM management strategy to address the forgetting issue. In particular, in mobile and IoT devices, real-time data can be stored not just in high-speed RAM but in internal storage devices as well, which offer significantly larger capacity than the RAM. Based on this insight, we propose to exploit the abundant storage to preserve past experiences and alleviate forgetting by allowing CL to efficiently migrate samples between memory and storage without being hindered by the slow access speed of the storage. We call this approach Carousel Memory (CarM). As CarM is complementary to existing CL methods, we conduct extensive evaluations of our method with seven popular CL methods and show that CarM significantly improves the accuracy of the methods across different settings, by large margins in final average accuracy (up to 28.4%), while retaining the same training efficiency.
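A toy sketch of the hierarchical memory idea: a small fast buffer (standing in for the RAM-resident EM) backed by a larger slow store, with evicted samples migrated to storage instead of discarded, and periodically swapped back. CarM's actual migration and sampling policies are more involved; this only illustrates the two-tier structure.

```python
import random

class CarouselMemory:
    """Two-tier episodic memory: evicted samples go to storage
    instead of being discarded, and can be swapped back later."""

    def __init__(self, mem_size: int, storage_size: int):
        self.memory = []            # fast, small (RAM-resident EM)
        self.storage = []           # slow, large (device storage)
        self.mem_size = mem_size
        self.storage_size = storage_size

    def insert(self, sample):
        if len(self.memory) < self.mem_size:
            self.memory.append(sample)
            return
        # Evict a random memory slot to storage rather than dropping it.
        victim = random.randrange(self.mem_size)
        evicted = self.memory[victim]
        self.memory[victim] = sample
        if len(self.storage) < self.storage_size:
            self.storage.append(evicted)
        else:
            self.storage[random.randrange(self.storage_size)] = evicted

    def swap_in(self, k: int):
        """Refresh the fast buffer (toy synchronous version): exchange
        k random samples between storage and memory."""
        for _ in range(min(k, len(self.storage), len(self.memory))):
            i = random.randrange(len(self.memory))
            j = random.randrange(len(self.storage))
            self.memory[i], self.storage[j] = self.storage[j], self.memory[i]

    def sample_batch(self, batch_size: int):
        return random.sample(self.memory, min(batch_size, len(self.memory)))
```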
Many recent works on understanding deep learning try to quantify how much individual data instances influence the optimization and generalization of a model, either by analyzing the behavior of the model during training or by measuring the performance gap of the model when the instance is removed from the dataset. Such approaches reveal the characteristics and importance of individual instances, which may provide useful information for diagnosing and improving deep learning. However, most existing works on data valuation require actual training of a model, which often demands high computational cost. In this paper, we provide a training-free data valuation score, called the complexity-gap score, which is a data-centric score that quantifies the influence of individual instances on the generalization of two-layer overparameterized neural networks. The proposed score can quantify the irregularity of instances and measure how much each data instance contributes to the total movement of the network parameters during training. We theoretically analyze and empirically demonstrate the effectiveness of the complexity-gap score in finding 'irregular or mislabeled' data instances, and also provide applications of the score in analyzing datasets and diagnosing training dynamics.
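To make the idea concrete, here is a speculative numpy sketch under one reading of the two-layer overparameterized setting: the data-dependent complexity is taken as sqrt(y^T (H^inf)^(-1) y) over the NTK Gram matrix H^inf of a two-layer ReLU network, and an instance's score is the leave-one-out gap in that complexity. The exact formula in the paper may differ; this is an assumption, included only for illustration.

```python
import numpy as np

def ntk_gram(X: np.ndarray) -> np.ndarray:
    """NTK Gram matrix of a two-layer ReLU network on unit-normalized
    inputs: H_ij = c_ij * (pi - arccos(c_ij)) / (2*pi), c_ij = x_i.x_j."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    C = np.clip(Xn @ Xn.T, -1.0, 1.0)
    return C * (np.pi - np.arccos(C)) / (2 * np.pi)

def complexity(X: np.ndarray, y: np.ndarray, reg: float = 1e-6) -> float:
    """Data-dependent complexity sqrt(y^T H^{-1} y) (assumed form)."""
    H = ntk_gram(X) + reg * np.eye(len(X))
    return float(np.sqrt(y @ np.linalg.solve(H, y)))

def complexity_gap_scores(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Leave-one-out gap: how much instance i changes the total
    complexity. Large gaps flag irregular or mislabeled instances."""
    base = complexity(X, y)
    idx = np.arange(len(X))
    return np.array([base - complexity(X[idx != i], y[idx != i])
                     for i in range(len(X))])
```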
Data-centric AI has shed light on the significance of data within the machine learning (ML) pipeline. Acknowledging its importance, various research efforts and policies have been suggested by academia, industry, and government departments. Although the capability to utilize existing data is essential, the capability to build a dataset has become more important than ever. In consideration of this trend, we propose "Data Management Operation and Recipes" (DMOps) to guide the industry regardless of task or domain. In other words, this paper presents the concept of DMOps derived from real-world experience. By offering a baseline for building data, we aim to help the industry streamline its data operations optimally.
Generative AI has matured to a point where large-scale models can generate text that seems indistinguishable from human-written text and remarkably photorealistic images. Automatically measuring how close the distribution of generated data is to the target real data distribution is a key step in diagnosing existing models and developing better models. We present MAUVE, a family of comparison measures between pairs of distributions such as those encountered in the generative modeling of text or images. These scores are statistical summaries of divergence frontiers capturing two types of errors in generative modeling. We explore four approaches to statistically estimate these scores: vector quantization, non-parametric estimation, classifier-based estimation, and parametric Gaussian approximations. We provide statistical bounds for the vector quantization approach. Empirically, we find that the proposed scores paired with a range of $f$-divergences and statistical estimation methods can quantify the gaps between the distributions of human-written text and those of modern neural language models by correlating with human judgments and identifying known properties of the generated texts. We conclude the paper by demonstrating its applications to other AI domains and discussing practical recommendations.
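A condensed sketch of the vector-quantization estimator mentioned above: jointly k-means quantize the two samples' feature vectors into histograms over a shared codebook, trace the divergence frontier over mixtures R = lam*P + (1-lam)*Q, and summarize it with an area-under-curve score. The embedding model choice, smoothing, and scaling constant are simplified assumptions here, not the reference implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def kl(p: np.ndarray, q: np.ndarray) -> float:
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def mauve_vq(feats_p: np.ndarray, feats_q: np.ndarray,
             k: int = 16, n_lambdas: int = 25, scale: float = 5.0) -> float:
    """Vector-quantization estimate of a MAUVE-style score between two
    sets of feature vectors (e.g., embeddings of text samples)."""
    km = KMeans(n_clusters=k, n_init=10).fit(np.vstack([feats_p, feats_q]))
    # Quantized histograms of each distribution over the shared codebook
    p = np.bincount(km.predict(feats_p), minlength=k) + 1.0  # add-one smoothing
    q = np.bincount(km.predict(feats_q), minlength=k) + 1.0
    p, q = p / p.sum(), q / q.sum()
    # Divergence frontier: KLs against mixtures R = lam*P + (1-lam)*Q
    lambdas = np.linspace(0.01, 0.99, n_lambdas)
    xs = [np.exp(-scale * kl(q, lam * p + (1 - lam) * q)) for lam in lambdas]
    ys = [np.exp(-scale * kl(p, lam * p + (1 - lam) * q)) for lam in lambdas]
    # Area under the frontier curve summarizes both error types
    order = np.argsort(xs)
    return float(np.trapz(np.array(ys)[order], np.array(xs)[order]))
```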
In the robotics and computer vision communities, extensive studies have been conducted on surveillance tasks, including human detection, tracking, and motion recognition with a camera. Additionally, deep learning algorithms are widely utilized in these tasks, as in other computer vision tasks. Existing public datasets are insufficient for developing learning-based methods that handle various surveillance tasks in outdoor and extreme situations, such as harsh weather and low-illuminance conditions. Therefore, we introduce a new large-scale outdoor surveillance dataset named eXtremely large-scale Multi-modAl Sensor dataset (X-MAS), containing more than 500,000 image pairs and first-person-view data annotated by well-trained annotators. Moreover, a single pair contains multi-modal data (e.g., an IR image, an RGB image, a thermal image, a depth image, and a LiDAR scan). To the best of our knowledge, this is the first large-scale first-person-view outdoor multi-modal dataset focusing on surveillance tasks. We present an overview of the proposed dataset with statistics and present methods for exploiting our dataset with deep learning-based algorithms. The latest information on the dataset and our study is available at https://github.com/lge-robot-navi, and the dataset will be available for download through a server.
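An illustrative container for one multi-modal pair as described above; the field names and array shapes are hypothetical, not the dataset's published format.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class XMASPair:
    """One multi-modal sample (field names and shapes are assumptions)."""
    ir: np.ndarray        # infrared image, HxW
    rgb: np.ndarray       # color image, HxWx3
    thermal: np.ndarray   # thermal image, HxW
    depth: np.ndarray     # depth map, HxW
    lidar: np.ndarray     # LiDAR scan, Nx4 (x, y, z, intensity)
    boxes: list[tuple]    # annotations, e.g. (label, x1, y1, x2, y2)
```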
An efficient exploration strategy is one of the essential issues in cooperative multi-agent reinforcement learning (MARL) algorithms requiring complex coordination. In this study, we introduce a new exploration method based on strangeness that can be easily incorporated into any centralized training and decentralized execution (CTDE)-based MARL algorithm. The strangeness refers to the degree of unfamiliarity of the observations that an agent visits. To give the observation strangeness a global perspective, it is also augmented with the degree of unfamiliarity of the visited entire state. The exploration bonus is obtained from the strangeness, and the proposed exploration method is not much affected by the stochastic transitions commonly observed in MARL tasks. To prevent a high exploration bonus from making the MARL training insensitive to extrinsic rewards, we also propose a separate action-value function trained on both the extrinsic reward and the exploration bonus, on which the behavioral policy that generates transitions is based. This makes CTDE-based MARL algorithms more stable when used with an exploration method. Through a comparative evaluation on didactic examples and the StarCraft Multi-Agent Challenge, we show that the proposed exploration method achieves significant performance improvements in CTDE-based MARL algorithms.
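A simplified sketch of the exploration signal: here strangeness is approximated by an autoencoder's reconstruction error on visited inputs, with a state-level term supplying the global perspective, and the resulting bonus feeds a separate behavioral action-value function. The specific strangeness model in the paper may differ; this only illustrates the structure.

```python
import torch
import torch.nn as nn

class Strangeness(nn.Module):
    """Unfamiliarity of an input, measured as autoencoder
    reconstruction error (an assumed instantiation)."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.dec = nn.Linear(hidden, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return ((self.dec(self.enc(x)) - x) ** 2).mean(dim=-1)

def exploration_bonus(obs_model: Strangeness, state_model: Strangeness,
                      obs: torch.Tensor, state: torch.Tensor,
                      beta: float = 0.5) -> torch.Tensor:
    """Per-agent observation strangeness augmented with the
    strangeness of the entire visited state (global perspective)."""
    with torch.no_grad():
        return obs_model(obs) + beta * state_model(state)

# The bonus trains a *separate* action-value function: the behavioral Q
# is regressed on (extrinsic reward + bonus) and drives data collection,
# while the main Q is trained on extrinsic reward only, keeping the
# CTDE objective insensitive to the intrinsic signal.
```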
Graph neural networks (GNNs) have achieved remarkable success in link prediction (GNNLP) tasks. Existing efforts first predefine a subgraph for the whole dataset and then apply GNNs to encode edge representations by leveraging the neighborhood structure induced by the fixed subgraph. The performance of GNNLP methods thus relies significantly on this ad-hoc subgraph. Since node connectivity in real-world graphs is complex, a single shared subgraph is limiting for all edges; the choice of subgraph should instead be personalized to each edge. However, performing personalized subgraph selection is nontrivial since the potential selection space grows exponentially with the number of edges. Besides, the inference edges are not available during training in link prediction scenarios, so the selection process needs to be inductive. To bridge the gap, we introduce a Personalized Subgraph Selector (PS2) as a plug-and-play framework to automatically, personally, and inductively identify optimal subgraphs for different edges when performing GNNLP. PS2 is instantiated as a bi-level optimization problem that can be efficiently solved differentiably. Coupling GNNLP models with PS2, we suggest a brand-new angle on GNNLP training: first identifying the optimal subgraphs for edges, and then focusing on training the inference model using the sampled subgraphs. Comprehensive experiments endorse the effectiveness of our proposed method across various GNNLP backbones (GCN, GraphSAGE, NGCF, LightGCN, and SEAL) and diverse benchmarks (Planetoid, OGB, and recommendation datasets). Our code is publicly available at https://github.com/qiaoyu-tan/PS2
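As a schematic interpretation (not the authors' implementation), the outer level of such a bi-level problem might score a small set of candidate subgraph scopes (e.g., 1-hop vs. 2-hop neighborhoods) per edge with a learned selector that reads only the endpoint embeddings, so it remains inductive for unseen inference edges.

```python
import torch
import torch.nn as nn

class SubgraphSelector(nn.Module):
    """Inductive selector: scores candidate subgraph scopes for an edge
    from its endpoint embeddings, so it generalizes to unseen edges."""
    def __init__(self, emb_dim: int, n_candidates: int):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(2 * emb_dim, 64), nn.ReLU(),
            nn.Linear(64, n_candidates),
        )

    def forward(self, h_u: torch.Tensor, h_v: torch.Tensor,
                tau: float = 1.0) -> torch.Tensor:
        logits = self.scorer(torch.cat([h_u, h_v], dim=-1))
        # Differentiable (soft) selection over candidate subgraphs,
        # enabling gradient-based treatment of the outer problem.
        return torch.softmax(logits / tau, dim=-1)

# Bi-level training loop (schematic):
#   inner: train the GNN link predictor on subgraphs sampled per edge
#          according to the selector's current weights;
#   outer: update the selector to minimize validation link-prediction loss.
```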