Smart Paper Notes

STEM image analysis based on deep learning: identification of vacancy defects and polymorphs of ${MoS_2}$

Kihyun Lee , Jinsub Park , Soyeon Choi , Yangjin Lee , Sol Lee , Joowon Jung , Jong-Young Lee , Farman Ullah , Zeeshan Tahir , Yong Soo Kim

Category: Computer Vision

2022-06-09

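Scanning transmission electron microscopy (STEM) is an indispensable tool for atomic-resolution structural analysis of a wide range of materials. Conventional analysis of STEM images is an extensive hands-on process, which limits the efficient handling of high-throughput data. Here, we apply a fully convolutional network (FCN) to identify important structural features of two-dimensional crystals. ResUNet, a type of FCN, is used to identify sulfur vacancies and polymorph types of ${MoS_2}$ from atomic-resolution STEM images. Efficient models are obtained by training on simulated images in the presence of different levels of noise, aberration, and carbon contamination. The accuracy of the FCN models on extensive experimental STEM images is comparable to that of careful hands-on analysis. Our work provides a guideline on best practices for training deep learning models for STEM image analysis and demonstrates the application of FCNs to the efficient processing of large volumes of STEM data.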

translated by Google Translate

Conditional Approximate Normalizing Flows for Joint Multi-Step Probabilistic Electricity Demand Forecasting

Arec Jamgochian , Di Wu , Kunal Menda , Soyeon Jung , Mykel J. Kochenderfer

Category: Machine Learning

2022-01-08

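Some real-world decision-making problems require probabilistic forecasts over multiple steps at once. However, probabilistic forecasting methods may fail to capture correlations in the underlying time series that exist over long time horizons, because errors accumulate. One such application is resource scheduling under uncertainty in a grid environment, which requires forecasting electricity demand that is inherently noisy but often cyclic. In this paper, we introduce the conditional approximate normalizing flow (CANF) to make probabilistic multi-step time-series forecasts when correlations are present over long time horizons. We first demonstrate the efficacy of our method on estimating the density of a toy distribution, finding that CANF improves the KL divergence by one third compared with a Gaussian mixture model while remaining amenable to explicit conditioning. We then use a publicly available household electricity consumption dataset to showcase the effectiveness of CANF on joint probabilistic multi-step forecasting. Empirical results show that conditional approximate normalizing flows outperform other methods in multi-step forecast accuracy and lead to up to 10 times better scheduling decisions. Our implementation is available at https://github.com/sisl/jointdemandforecast.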
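The KL divergence used as the comparison metric above can be illustrated with a minimal sketch. It assumes discrete distributions given as probability lists, which is a simplification: the paper estimates a continuous density, and the function name and numbers here are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same support."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # a zero-probability outcome under p contributes nothing
        if q_i == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


# Made-up distributions for illustration only.
target = [0.5, 0.5]
estimate = [0.25, 0.75]
kl_same = kl_divergence(target, target)
kl_diff = kl_divergence(target, estimate)
```

A lower value means the estimated distribution is closer to the target, so an improvement of one third corresponds to a KL value that is one third smaller.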


Classification of Goods Using Text Descriptions With Sentences Retrieval

Eunji Lee , Sundong Kim , Sihyun Kim , Sungwon Park , Meeyoung Cha , Soyeon Jung , Suyoung Yang , Yeonsoo Choi , Sungdae Ji , Minsoo Song

Category: Artificial Intelligence

2021-11-02

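Assigning and validating internationally accepted commodity codes (HS codes) for traded goods is one of the critical functions of a customs office. This decision is crucial to importers and exporters because it determines the tariff rate. However, much like court decisions made by judges, the task can be non-trivial even for experienced customs officers. This paper proposes a deep learning model to assist with this seemingly challenging HS code classification. Together with the Korea Customs Service, we built a KoELECTRA-based decision model that suggests the most likely headings and subheadings (i.e., the first four and six digits) of the HS code. An evaluation on 129,084 past cases shows that the top-3 suggestions made by our model have an accuracy of 95.5% in classifying 265 subheadings. This promising result implies that algorithms can reduce the time and effort spent by customs officers by assisting with the HS code classification task.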
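The top-3 accuracy reported above counts a case as correct when the true subheading appears among the model's three highest-ranked suggestions. A minimal sketch of that metric follows; the function name and the six-digit codes are hypothetical and are not taken from the paper's data.

```python
def top_k_accuracy(ranked_predictions, true_labels, k=3):
    """Fraction of cases whose true label is among the first k ranked suggestions."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("need one ranked list per true label")
    if not true_labels:
        raise ValueError("need at least one case")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Made-up subheading codes, ranked from most to least likely per case.
ranked = [
    ["851712", "851762", "852580"],
    ["610910", "610990", "611020"],
    ["870323", "870324", "870322"],
    ["090121", "090111", "210111"],
]
truth = ["851762", "610910", "870321", "090111"]

top3 = top_k_accuracy(ranked, truth, k=3)  # 3 of 4 cases hit
top1 = top_k_accuracy(ranked, truth, k=1)  # 1 of 4 cases hit
```

Top-k accuracy suits an assistive setting like this one, where an officer picks from a short list of suggestions instead of accepting a single prediction.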


Situation-Aware Deep Reinforcement Learning for Autonomous Nonlinear Mobility Control in Cyber-Physical Loitering Munition Systems

Hyunsoo Lee , Soohyun Park , Won Joon Yun , Soyi Jung , Joongheon Kim

Category: Robotics

2022-12-31

With the rapid development of drone technologies, drones are widely used in many applications, including military domains. This paper proposes a novel situation-aware deep reinforcement learning (DRL)-based autonomous nonlinear drone mobility control algorithm for cyber-physical loitering munition applications. On the battlefield, designing a DRL-based autonomous control algorithm is not straightforward because real-world data gathering is generally not possible. Therefore, the approach in this paper is to construct a cyber-physical virtual environment with Unity. Based on the virtual cyber-physical battlefield scenarios, a DRL-based automated nonlinear drone mobility control algorithm can be designed, evaluated, and visualized. Moreover, real-world battlefield scenarios contain many obstacles, which are harmful to linear trajectory control. Thus, the proposed autonomous nonlinear drone mobility control algorithm uses situation-aware components that are implemented with a Raycast function in Unity virtual scenarios. Based on the gathered situation-aware information, the drone can autonomously and nonlinearly adjust its trajectory during flight, which helps it avoid obstacles in obstacle-deployed battlefields. Our visualization-based performance evaluation shows that the proposed algorithm is superior to the other linear mobility control algorithms.


HIER: Metric Learning Beyond Class Labels via Hierarchical Regularization

Sungyeon Kim , Boseung Jung , Suha Kwak

Category: Computer Vision | Artificial Intelligence

2022-12-29

Supervision for metric learning has long been given in the form of equivalence between human-labeled classes. Although this type of supervision has been a basis of metric learning for decades, we argue that it hinders further advances of the field. In this regard, we propose a new regularization method, dubbed HIER, to discover the latent semantic hierarchy of training data, and to deploy the hierarchy to provide richer and more fine-grained supervision than inter-class separability induced by common metric learning losses. HIER achieved this goal with no annotation for the semantic hierarchy but by learning hierarchical proxies in hyperbolic spaces. The hierarchical proxies are learnable parameters, and each of them is trained to serve as an ancestor of a group of data or other proxies to approximate the semantic hierarchy among them. HIER deals with the proxies along with data in hyperbolic space since geometric properties of the space are well-suited to represent their hierarchical structure. The efficacy of HIER was evaluated on four standard benchmarks, where it consistently improved performance of conventional methods when integrated with them, and consequently achieved the best records, surpassing even the existing hyperbolic metric learning technique, in almost all settings.
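To illustrate why hyperbolic geometry suits hierarchies, here is a minimal sketch of the geodesic distance in the Poincaré ball with curvature -1. The choice of the Poincaré ball model, the function name, and the sample points are assumptions for illustration; the abstract only states that HIER operates in hyperbolic space.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincare ball."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


origin = (0.0, 0.0)            # plays the role of a root-like ancestor
leaf_a = (0.9, 0.0)            # two points near the boundary
leaf_b = (0.0, 0.9)

d_root_a = poincare_distance(origin, leaf_a)
d_a_b = poincare_distance(leaf_a, leaf_b)
d_check = poincare_distance(origin, (0.5, 0.0))  # equals ln(3) analytically
```

In this sketch the two boundary points are farther from each other than either is from the origin, so the shortest route between them passes near the center. That tree-like behavior is what makes a point near the center a natural ancestor of points near the boundary.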


Critic-Guided Decoding for Controlled Text Generation

Minbeom Kim , Hwanhee Lee , Kang Min Yoo , Joonsuk Park , Hwaran Lee , Kyomin Jung

Category: Natural Language Processing

2022-12-21

Steering language generation towards objectives or away from undesired content has been a long-standing goal in utilizing language models (LMs). Recent work has demonstrated reinforcement learning and weighted decoding as effective approaches to achieving a higher level of language control and quality, each with its own pros and cons. In this work, we propose a novel critic decoding method for controlled language generation (CriticControl) that combines the strengths of reinforcement learning and weighted decoding. Specifically, we adopt the actor-critic framework to train an LM-steering critic from non-differentiable reward models. Similar to weighted decoding, our method freezes the language model and manipulates the output token distribution using the critic, improving training efficiency and stability. Evaluation of our method on three controlled generation tasks, namely topic control, sentiment control, and detoxification, shows that our approach generates more coherent and well-controlled texts than previous methods. In addition, CriticControl demonstrates superior generalization ability in zero-shot settings. Human evaluation studies also corroborate our findings.
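The idea of keeping the language model frozen and reshaping its next-token distribution can be sketched generically as follows. This is an assumed, simplified form of weighted decoding (multiply each token probability by a non-negative score and renormalize), not the exact CriticControl update; the function name, tokens, and scores are hypothetical.

```python
def reweight(lm_probs, critic_scores):
    """Scale each token probability by a critic score and renormalize."""
    weighted = {tok: p * critic_scores[tok] for tok, p in lm_probs.items()}
    total = sum(weighted.values())
    if total <= 0.0:
        raise ValueError("critic scores removed all probability mass")
    return {tok: w / total for tok, w in weighted.items()}


# Made-up next-token distribution from a frozen LM, and made-up critic scores
# that favor positive sentiment.
lm_probs = {"great": 0.5, "awful": 0.3, "fine": 0.2}
critic_scores = {"great": 0.9, "awful": 0.1, "fine": 0.5}

steered = reweight(lm_probs, critic_scores)
```

Because only the output distribution is modified, the language model's weights stay untouched, which is the property the abstract credits for better training efficiency and stability.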


Spoken Language Understanding for Conversational AI: Recent Advances and Future Direction

Soyeon Caren Han , Siqu Long , Henry Weld , Josiah Poon

Category: Natural Language Processing | Artificial Intelligence

2022-12-21

When a human communicates with a machine online using natural language, how can the machine understand the human's intention and the semantic context of what they say? This is an important AI task, as it enables the machine to construct a sensible answer or perform a useful action for the human. Meaning is represented at the sentence level, identification of which is known as intent detection, and at the word level, a labelling task called slot filling. This dual-level joint task requires innovative thinking about natural language and deep learning network design, and as a result, many approaches and models have been proposed and applied. This tutorial will discuss how the joint task is set up and introduce Spoken Language Understanding/Natural Language Understanding (SLU/NLU) with deep learning techniques. We will cover the datasets, experiments, and metrics used in the field. We will describe how the machine uses the latest NLP and deep learning techniques to address the joint task, including recurrent and attention-based Transformer networks and pre-trained models (e.g. BERT). We will then look in detail at a network that allows the two levels of the task, intent classification and slot filling, to interact explicitly to boost performance. We will give a code demonstration of a Python notebook for this model, and attendees will have an opportunity to watch coding demo tasks on this joint NLU to further their understanding.


Can Current Task-oriented Dialogue Models Automate Real-world Scenarios in the Wild?

Sang-Woo Lee , Sungdong Kim , Donghyeon Ko , Donghoon Ham , Youngki Hong , Shin Ah Oh , Hyunhoon Jung , Wangkyo Jung , Kyunghyun Cho , Donghyun Kwak

Category: Natural Language Processing

2022-12-20

Task-oriented dialogue (TOD) systems are mainly based on the slot-filling-based TOD (SF-TOD) framework, in which dialogues are broken down into smaller, controllable units (i.e., slots) to fulfill a specific task. A series of approaches based on this framework achieved remarkable success on various TOD benchmarks. However, we argue that the current TOD benchmarks are limited to surrogate real-world scenarios and that the current TOD models are still a long way from unraveling the scenarios. In this position paper, we first identify current status and limitations of SF-TOD systems. After that, we explore the WebTOD framework, the alternative direction for building a scalable TOD system when a web/mobile interface is available. In WebTOD, the dialogue system learns how to understand the web/mobile interface that the human agent interacts with, powered by a large-scale language model.


HouseCat6D -- A Large-Scale Multi-Modal Category Level 6D Object Pose Dataset with Household Objects in Realistic Scenarios

HyunJun Jung , Shun-Cheng Wu , Patrick Ruhkamp , Hannah Schieber , Pengyuan Wang , Giulia Rizzoli , Hongcheng Zhao , Sven Damian Meier , Daniel Roth , Nassir Navab

Category: Computer Vision

2022-12-20

Estimating the 6D pose of objects is one of the major fields in 3D computer vision. Following the promising outcomes of instance-level pose estimation, research trends are heading towards category-level pose estimation for more practical application scenarios. However, unlike well-established instance-level pose datasets, available category-level datasets fall short in annotation quality and in the number of poses provided. We propose a new category-level 6D pose dataset, HouseCat6D, featuring 1) multi-modality of polarimetric RGB+P and depth, 2) 194 highly diverse objects in 10 household object categories, including 2 photometrically challenging categories, 3) high-quality pose annotation with an error range of only 1.35 mm to 1.74 mm, 4) 41 large-scale scenes with extensive viewpoint coverage, and 5) a checkerboard-free environment throughout the entire scene. We also provide benchmark results for state-of-the-art category-level pose estimation networks.


Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning

Kyuyong Shin , Hanock Kwak , Wonjae Kim , Jisu Jeong , Seungjae Jung , Kyung-Min Kim , Jung-Woo Ha , Sang-Woo Lee

Category: Natural Language Processing

2022-12-07

Recent studies have proposed a unified user modeling framework that leverages user behavior data from various applications. Most benefit from utilizing users' behavior sequences as plain texts, representing rich information in any domain or system without losing generality. Hence, a question arises: Can language modeling for user history corpus help improve recommender systems? While its versatile usability has been widely investigated in many domains, its applications to recommender systems still remain underexplored. We show that language modeling applied directly to task-specific user histories achieves excellent results on diverse recommendation tasks. Also, leveraging additional task-agnostic user histories delivers significant performance benefits. We further demonstrate that our approach can provide promising transfer learning capabilities for a broad spectrum of real-world recommender systems, even on unseen domains and services.
