Smart Paper Notes

Improved Latent Tree Induction with Distant Supervision via Span Constraints

Zhiyang Xu, Andrew Drozdov, Jay Yoon Lee, Tim O'Gorman, Subendhu Rongali, Dylan Finkbeiner, Shilpa Suresh, Mohit Iyyer, Andrew McCallum

Category: Natural Language Processing

2021-09-10

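For over thirty years, researchers have developed and analyzed methods for latent tree induction as an approach to unsupervised syntactic parsing. Nonetheless, compared with their supervised counterparts, modern systems still do not perform well enough to be of practical use as structural annotation of text. In this work, we present a technique that uses distant supervision in the form of span constraints (i.e., phrase bracketing) to improve performance in unsupervised constituency parsing. Using a relatively small number of span constraints, we can substantially improve the output of DIORA, an already competitive unsupervised parsing system. Compared with full parse tree annotation, span constraints can be acquired with minimal effort, for example by using a lexicon derived from Wikipedia to find exact text matches. Our experiments show that entity-based span constraints improve constituency parsing on the English WSJ Penn Treebank by more than 5 F1. Furthermore, our method extends to any domain where span constraints are easily attainable, and as a case study we demonstrate its effectiveness by parsing biomedical text from the CRAFT dataset.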

translated by Google Translate
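
The lexicon lookup described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name find_span_constraints, the toy lexicon, and the max_len cutoff are all assumptions made for illustration. It scans a tokenized sentence for exact matches against multi-word lexicon entries and returns the matching token spans, which could then serve as bracketing constraints for a parser.

```python
from typing import List, Set, Tuple


def find_span_constraints(
    tokens: List[str], lexicon: Set[Tuple[str, ...]], max_len: int = 5
) -> List[Tuple[int, int]]:
    """Return (start, end) token spans, end exclusive, that exactly match a lexicon entry."""
    spans = []
    for start in range(len(tokens)):
        # Only multi-word spans are checked; single tokens add no bracketing information.
        for length in range(2, max_len + 1):
            end = start + length
            if end > len(tokens):
                break
            if tuple(tokens[start:end]) in lexicon:
                spans.append((start, end))
    return spans


# Hypothetical lexicon of multi-word entity names (for example, collected from article titles).
lexicon = {("New", "York", "City"), ("Penn", "Treebank")}
tokens = "She moved to New York City last year".split()
constraints = find_span_constraints(tokens, lexicon)
print(constraints)  # [(3, 6)]
```

Each returned span says only that the tokens inside it should form a constituent; it says nothing about the rest of the tree, which is why such constraints are far cheaper to obtain than full parse annotations.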

Diffusion Probabilistic Models for Scene-Scale 3D Categorical Data

Jumin Lee, Woobin Im, Sebin Lee, Sung-Eui Yoon

Category: Computer Vision

2023-01-02

In this paper, we learn a diffusion model to generate 3D data on a scene-scale. Specifically, our model crafts a 3D scene consisting of multiple objects, while recent diffusion research has focused on a single object. To realize our goal, we represent a scene with discrete class labels, i.e., categorical distribution, to assign multiple objects into semantic categories. Thus, we extend discrete diffusion models to learn scene-scale categorical distributions. In addition, we validate that a latent diffusion model can reduce computation costs for training and deploying. To the best of our knowledge, our work is the first to apply discrete and latent diffusion for 3D categorical data on a scene-scale. We further propose to perform semantic scene completion (SSC) by learning a conditional distribution using our diffusion model, where the condition is a partial observation in a sparse point cloud. In experiments, we empirically show that our diffusion models not only generate reasonable scenes, but also perform the scene completion task better than a discriminative model. Our code and models are available at https://github.com/zoomin-lee/scene-scale-diffusion

Novel Deep Learning Framework For Bovine Iris Segmentation

Heemoon Yoon, Mira Park, Sang-Hee Lee

Category: Computer Vision | Machine Learning

2022-12-22

Iris segmentation is the first step in identifying animals biometrically, which underpins livestock traceability systems. In this study, we propose a novel deep learning framework for pixel-wise segmentation that requires minimal annotation labels, using the public BovineAAEyes80 dataset. In the experiments, U-Net with a VGG16 backbone was selected as the best encoder-decoder combination, achieving 99.50% accuracy and a 98.35% Dice coefficient. Remarkably, the selected model accurately segmented corrupted images even without proper annotation data. This study contributes to the advancement of iris segmentation and to the development of a reliable DNN training framework.
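
For readers unfamiliar with the two metrics reported here, the following sketch shows how pixel accuracy and the Dice coefficient are typically computed for a binary mask. The function name and the tiny example masks are illustrative assumptions, not taken from the paper.

```python
from typing import List, Tuple

Mask = List[List[int]]  # binary mask: 1 = iris pixel, 0 = background


def pixel_accuracy_and_dice(pred: Mask, target: Mask) -> Tuple[float, float]:
    """Compute pixel accuracy and the Dice coefficient for two same-sized binary masks."""
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    denom = 2 * tp + fp + fn
    # Convention assumed here: two empty masks count as a perfect match.
    dice = 2 * tp / denom if denom else 1.0
    return accuracy, dice


pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
accuracy, dice = pixel_accuracy_and_dice(pred, target)
print(round(accuracy, 3), round(dice, 3))  # 0.667 0.667
```

Accuracy counts background pixels as correct, so it tends to be high whenever the iris covers a small part of the image; Dice ignores true negatives and is therefore the stricter of the two figures.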

Optimal Planning of Hybrid Energy Storage Systems using Curtailed Renewable Energy through Deep Reinforcement Learning

Dongju Kang, Doeun Kang, Sumin Hwangbo, Haider Niaz, Won Bo Lee, J. Jay Liu, Jonggeol Na

Category: Machine Learning

2022-12-12

Energy management systems (EMS) are becoming increasingly important for utilizing the continuously growing volume of curtailed renewable energy. Promising energy storage systems (ESS), such as batteries and green hydrogen, should be employed to maximize the efficiency of energy stakeholders. However, optimal decision-making, i.e., planning how to balance different strategies, is confronted with the complexity and uncertainties of large-scale problems. Here, we propose a deep reinforcement learning (DRL) methodology with a policy-based algorithm to realize real-time optimal ESS planning under curtailed renewable energy uncertainty. A quantitative performance comparison showed that the DRL agent outperforms the scenario-based stochastic optimization (SO) algorithm, even with a wide action and observation space. Owing to the uncertainty rejection capability of DRL, we confirmed robust performance under large uncertainty in the curtailed renewable energy, with maximized net profit and a stable system. Action-mapping was performed to visually assess the actions taken by the DRL agent according to the state. The results confirmed that the DRL agent learns to act as a human expert would, suggesting that the proposed methodology can be applied reliably.

Efficient and Accurate Quantized Image Super-Resolution on Mobile NPUs, Mobile AI & AIM 2022 challenge: Report

Andrey Ignatov, Radu Timofte, Maurizio Denna, Abdel Younes, Ganzorig Gankhuyag, Jingang Huh, Myeong Kyun Kim, Kihwan Yoon, Hyeon-Cheol Moon, Seungho Lee

Category: Computer Vision

2022-11-07

Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs, which have many computational and memory constraints. In this Mobile AI challenge, we address this problem and ask participants to design an efficient quantized image super-resolution solution that can run in real time on mobile NPUs. Participants were provided with the DIV2K dataset and trained INT8 models to perform high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating up to 60 FPS when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.

Design, Field Evaluation, and Traffic Analysis of a Competitive Autonomous Driving Model in a Congested Environment

Daegyu Lee, Hyunki Seong, Seungil Han, Gyuree Kang, D. Hyunchul Shim, Yoonjin Yoon

Category: Robotics

2022-10-31

Recently, numerous studies have investigated cooperative traffic systems that rely on vehicle-to-everything (V2X) communication. Unfortunately, when multiple autonomous vehicles are deployed and exposed to communication failure, the ideal conditions of different vehicles may conflict, leading to adversarial situations on the road. In South Korea, virtual and real-world urban autonomous multi-vehicle races were held in March and November of 2021, respectively. During the competition, multiple vehicles were involved simultaneously, which required maneuvers such as overtaking low-speed vehicles, negotiating intersections, and obeying traffic laws. In this study, we introduce a fully autonomous driving software stack to deploy a competitive driving model, which enabled us to win the urban autonomous multi-vehicle races. We evaluate module-based systems such as navigation, perception, and planning in real and virtual environments. Additionally, we analyze traffic using position data collected from multiple vehicles over communication, to gain further insight into a multi-agent autonomous driving scenario. Finally, we propose a method for analyzing traffic in order to compare the spatial distribution of multiple autonomous vehicles. We study the similarity distribution between each team's driving log data to determine the impact of competitive autonomous driving on the traffic environment.

Neural Fields for Robotic Object Manipulation from a Single Image

Valts Blukis, Taeyeop Lee, Jonathan Tremblay, Bowen Wen, In So Kweon, Kuk-Jin Yoon, Dieter Fox, Stan Birchfield

Category: Robotics | Artificial Intelligence | Computer Vision | Machine Learning

2022-10-21

We present a unified and compact representation for object rendering, 3D reconstruction, and grasp pose prediction that can be inferred from a single image within a few seconds. We achieve this by leveraging recent advances in the Neural Radiance Field (NeRF) literature that learn category-level priors and fine-tune on novel objects with minimal data and time. Our insight is that we can learn a compact shape representation and extract meaningful additional information from it, such as grasping poses. We believe this to be the first work to retrieve grasping poses directly from a NeRF-based representation using a single viewpoint (RGB-only), rather than going through a secondary network and/or representation. When compared to prior art, our method is two to three orders of magnitude smaller while achieving comparable performance at view reconstruction and grasping. Accompanying our method, we also propose a new dataset of rendered shoes for training a sim-2-real NeRF method with grasping poses for different widths of grippers.

Knowledge Unlearning for Mitigating Privacy Risks in Language Models

Joel Jang, Dongkeun Yoon, Sohee Yang, Sungmin Cha, Moontae Lee, Lajanugen Logeswaran, Minjoon Seo

Category: Natural Language Processing

2022-10-04

Pretrained Language Models (LMs) memorize a vast amount of knowledge during initial pretraining, including information that may violate the privacy of personal lives and identities. Previous work addressing privacy issues for language models has mostly focused on data preprocessing and differential privacy methods, both requiring re-training the underlying LM. We propose knowledge unlearning as an alternative method to reduce privacy risks for LMs post hoc. We show that simply performing gradient ascent on target token sequences is effective at forgetting them with little to no degradation of general language modeling performances for larger LMs; it sometimes even substantially improves the underlying LM with just a few iterations. We also find that sequential unlearning is better than trying to unlearn all the data at once and that unlearning is highly dependent on which kind of data (domain) is forgotten. By showing comparisons with a previous data preprocessing method and a decoding method known to mitigate privacy risks for LMs, we show that unlearning can give a stronger empirical privacy guarantee in scenarios where the data vulnerable to extraction attacks are known a priori while being much more efficient and robust. We release the code and dataset needed to replicate our results at https://github.com/joeljang/knowledge-unlearning.
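
The core idea of gradient ascent on target token sequences can be shown on a toy model. The sketch below is only an illustration under strong assumptions: instead of a Transformer LM, the "model" is a single vector of logits over a four-token vocabulary, and the names softmax, mean_nll, and unlearn_step are hypothetical. It takes a few ascent steps on the negative log-likelihood of the tokens to be forgotten and checks that their likelihood drops.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def mean_nll(logits: List[float], target_ids: List[int]) -> float:
    """Mean negative log-likelihood of the target tokens under the toy model."""
    probs = softmax(logits)
    return -sum(math.log(probs[i]) for i in target_ids) / len(target_ids)


def unlearn_step(logits: List[float], target_ids: List[int], lr: float = 0.5) -> List[float]:
    """One gradient-ASCENT step on the mean NLL of the target tokens."""
    probs = softmax(logits)
    freq = [0.0] * len(logits)
    for i in target_ids:
        freq[i] += 1 / len(target_ids)
    # d(mean NLL)/d(logit_k) = probs[k] - freq[k]; ascent ADDS the gradient,
    # whereas ordinary training would subtract it.
    return [x + lr * (p - f) for x, p, f in zip(logits, probs, freq)]


logits = [2.0, 0.0, 0.0, 0.0]  # toy model that strongly prefers token 0
target_ids = [0, 0, 0]         # token sequence to forget
before = mean_nll(logits, target_ids)
for _ in range(5):
    logits = unlearn_step(logits, target_ids)
after = mean_nll(logits, target_ids)
print(before < after)  # True: the forgotten tokens became less likely
```

In a real language model the same sign flip would be applied to the gradient of the sequence loss with respect to all parameters, and the number of steps would need to be limited so that general language modeling ability is preserved.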

Towards Better Generalization with Flexible Representation of Multi-Module Graph Neural Networks

Hyungeun Lee, Hyunmok Park, Kijung Yoon

Category: Machine Learning | Artificial Intelligence

2022-09-14

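Graph neural networks (GNNs) have become compelling models for learning and inference on graph-structured data, but little work has been done to understand the fundamental limitations of GNNs in scaling to larger graphs and generalizing to out-of-distribution inputs. In this paper, we use a random graph generator that allows us to systematically investigate how graph size and structural properties affect the predictive performance of GNNs. We present specific evidence that, among the many graph properties, the mean and modality of the node degree distribution are the key features that determine whether GNNs can generalize to unseen graphs. Accordingly, we propose flexible GNNs (Flex-GNNs), which use multiple node update functions and inner-loop optimization as a generalization of the single type of canonical nonlinear transformation over aggregated inputs, allowing the network to adapt flexibly to new graphs. The Flex-GNN framework improves generalization beyond the training set on several inference tasks.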
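
To make the notion of a node degree distribution concrete, here is a small sketch that samples a random graph and summarizes its degrees. The Erdos-Renyi G(n, p) generator is an assumption chosen for simplicity; the abstract does not say which generator the authors use, and the function names are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List, Tuple


def erdos_renyi_edges(n: int, p: float, seed: int = 0) -> List[Tuple[int, int]]:
    """Sample an undirected G(n, p) graph: each node pair is linked with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u in range(n) for v in range(u + 1, n) if rng.random() < p]


def degree_stats(n: int, edges: List[Tuple[int, int]]) -> Tuple[float, Dict[int, int]]:
    """Return the mean degree and a histogram mapping degree -> number of nodes."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    mean_degree = sum(degree) / n
    histogram = dict(sorted(Counter(degree).items()))
    return mean_degree, histogram


edges = erdos_renyi_edges(n=50, p=0.1)
mean_degree, histogram = degree_stats(50, edges)
print(mean_degree, histogram)
```

The histogram is the empirical degree distribution: its mean is the first property named in the abstract, and counting its peaks would give a rough view of the second, modality.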

IMG2IMU: Applying Knowledge from Large-Scale Images to IMU Applications via Contrastive Learning

Hyungjun Yoon, Hyeongheon Cha, Canh Hoang Nguyen, Taesik Gong, Sung-Ju Lee

Category: Machine Learning

2022-09-02

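Recent advances in machine learning have shown that pre-trained representations acquired via self-supervised learning can achieve high accuracy on tasks with little training data. Unlike in the vision and natural language processing domains, such pre-training is challenging for IMU-based applications, because only a few publicly available datasets have sufficient size and diversity to learn generalizable representations. To overcome this problem, we propose IMG2IMU, a novel approach that adapts pre-trained representations from large-scale images to diverse few-shot IMU sensing tasks. We convert the sensor data into visually interpretable spectrograms so that the model can use the knowledge gained from vision. Furthermore, we apply contrastive learning designed to learn representations suited to interpreting sensor data. Our extensive evaluations on five IMU sensing tasks show that IMG2IMU consistently outperforms the baselines, illustrating that vision knowledge can be incorporated into few-shot learning settings for IMU sensing tasks.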
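
The step of turning sensor data into a spectrogram can be sketched with a plain short-time Fourier transform. This is an illustrative sketch only: the window length, hop size, Hann window, and synthetic sine trace are assumptions, and the paper's actual preprocessing may differ.

```python
import cmath
import math
from typing import List


def spectrogram(signal: List[float], window: int = 32, hop: int = 16) -> List[List[float]]:
    """Magnitude spectrogram: one row per frame, one column per frequency bin 0..window//2."""
    # A Hann window reduces spectral leakage at the frame edges.
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / window) for n in range(window)]
    frames = []
    for start in range(0, len(signal) - window + 1, hop):
        frame = [signal[start + n] * hann[n] for n in range(window)]
        row = []
        for k in range(window // 2 + 1):
            # Naive DFT of the windowed frame at frequency bin k.
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / window) for n, x in enumerate(frame)
            )
            row.append(abs(coeff))
        frames.append(row)
    return frames


# Synthetic single-axis trace standing in for accelerometer data: 4 cycles per 32 samples.
signal = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
spec = spectrogram(signal)
peak_bins = [row.index(max(row)) for row in spec]
print(len(spec), peak_bins)  # 7 [4, 4, 4, 4, 4, 4, 4]
```

The resulting 2D array of magnitudes can be treated like a grayscale image, which is what lets an image-pretrained encoder consume it; a real IMU has several axes, each of which could be mapped to an image channel.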
