While the success of diffusion models has been witnessed across various domains, few works have investigated variations of the generative process itself. In this paper, we introduce a new generative process that is closer to the reverse process than the original generative process is, given the identical score checkpoint. Specifically, we adjust the generative process with an auxiliary discriminator trained between the real data and the generated data. Consequently, the adjusted generative process with the discriminator generates more realistic samples than the original process. In experiments, we achieve new SOTA FIDs of 1.74 on CIFAR-10, 1.33 on CelebA, and 1.88 on FFHQ for unconditional generation.
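The abstract's adjustment can be pictured as adding a discriminator-derived correction to a pretrained score. The sketch below is a minimal illustration under assumed forms: `score_fn` and `disc_fn` are hypothetical callables (a pretrained score network and a real-vs-generated discriminator), the correction is the gradient of log(d/(1-d)), and the gradient is taken by finite differences purely for demonstration (in practice it would come from autodiff).

```python
import numpy as np

def discriminator_adjusted_score(score_fn, disc_fn, x, t, h=1e-4, eps=1e-6):
    """Adjust a pretrained score with a discriminator correction term.

    The correction is grad_x log(d / (1 - d)), where d = disc_fn(x, t) is the
    discriminator's probability that x is real. The gradient is approximated
    here with forward finite differences for illustration only.
    """
    d0 = np.clip(disc_fn(x, t), eps, 1 - eps)
    logit0 = np.log(d0 / (1 - d0))
    grad = np.zeros_like(x)
    for i in range(x.size):
        xp = x.copy()
        xp.flat[i] += h
        d1 = np.clip(disc_fn(xp, t), eps, 1 - eps)
        grad.flat[i] = (np.log(d1 / (1 - d1)) - logit0) / h
    # Adjusted score = original score + discriminator correction.
    return score_fn(x, t) + grad
```

A constant discriminator (one that cannot tell real from fake) yields a zero correction, so the adjusted process reduces to the original one.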
Whereas diverse variations of diffusion models exist, extending the linear diffusion into a nonlinear diffusion process has been investigated by only a few works. The effect of nonlinearity is hardly understood, but intuitively, there should be promising diffusion patterns that optimally train the generative distribution towards the data distribution. This paper introduces a data-adaptive nonlinear diffusion process for score-based diffusion models. The proposed Implicit Nonlinear Diffusion Model (INDM) learns a nonlinear diffusion process by combining a normalizing flow and a diffusion process. Specifically, INDM implicitly constructs a nonlinear diffusion on the \textit{data space} by leveraging a linear diffusion on the \textit{latent space} through a flow network. Since the nonlinearity fully depends on the flow network, this flow network is key to forming the nonlinear diffusion. This flexible nonlinearity improves the learning curve of INDM to nearly Maximum Likelihood Estimation (MLE) training, against the non-MLE training of DDPM++, which turns out to be a special case of INDM with the identity flow. Likewise, training the nonlinear diffusion yields sampling robustness to discretized step sizes. In experiments, INDM achieves the state-of-the-art FID on CelebA.
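The core construction can be sketched in a few lines: run a linear (VP-style) diffusion in the latent space of an invertible flow, which induces a nonlinear, data-adaptive diffusion in data space. This is a toy illustration, not the authors' implementation: `flow`/`flow_inv` stand in for a trained normalizing flow, and with the identity flow the induced diffusion is itself linear, echoing the DDPM++ special case.

```python
import numpy as np

def nonlinear_diffuse(x, t, flow, flow_inv, rng):
    """Diffuse x nonlinearly in data space by running a linear (VP-style)
    perturbation in the latent space of an invertible map.
    """
    z = flow(x)                          # encode: data space -> latent space
    mean = np.exp(-0.5 * t) * z          # linear mean contraction in latent
    std = np.sqrt(1.0 - np.exp(-t))      # matching VP noise scale
    z_t = mean + std * rng.standard_normal(np.shape(z))
    return flow_inv(z_t)                 # decode back to data space
```

Any invertible, differentiable pair (`flow`, `flow_inv`) defines a different nonlinear diffusion, which is why the flow network alone controls the nonlinearity.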
Thanks to its robustness and scalability, there has been growing interest in learning velocity command tracking controllers for quadruped robots using reinforcement learning. However, a single trained policy usually exhibits a single gait regardless of the commanded velocity. This can be a suboptimal solution, considering that an optimal gait exists depending on the quadruped's velocity. In this work, we propose a hierarchical controller for a quadruped robot that can generate multiple gaits (i.e., walk, trot, bound) while tracking velocity commands. Our controller is composed of two policies, acting respectively as a central pattern generator and a local feedback controller, and is trained with hierarchical reinforcement learning. Experimental results show 1) the existence of optimal gaits for specific velocity ranges and 2) the efficiency of our hierarchical controller compared to a controller composed of a single policy, which usually exhibits a single gait. Code is publicly available.
Recent advances in diffusion models bring state-of-the-art performance on image generation tasks. However, empirical results from previous research on diffusion models imply an inverse correlation between density estimation and sample generation performance. This paper investigates, with sufficient empirical evidence, that this inverse correlation arises because density estimation is dominated by contributions from small diffusion times, whereas sample generation mainly depends on large diffusion times. However, training a score network well across the entire range of diffusion times is demanding because the loss scale is significantly imbalanced at each diffusion time. For successful training, we therefore introduce Soft Truncation, a universally applicable training technique for diffusion models that softens the fixed, static truncation hyperparameter into a random variable. In experiments, Soft Truncation achieves state-of-the-art performance on the CIFAR-10, CelebA, CelebA-HQ 256x256, and STL-10 datasets.
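The mechanism described above can be sketched concretely: rather than always sampling training times on a fixed interval [eps, T], each optimization step first draws the truncation level eps itself at random, then draws times above it. This is a minimal sketch under assumed choices (the log-uniform prior on eps and the interval bounds are illustrative, not the paper's exact specification).

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_diffusion_times(batch_size, T=1.0, eps_min=1e-5, eps_max=1e-2):
    """Soft truncation, sketched: the truncation hyperparameter is softened
    into a random variable. Each step draws a random truncation level eps,
    then draws training times uniformly on [eps, T], so the smallest
    diffusion times are not included at every step and the loss scale
    imbalance across times is mitigated.
    """
    # Assumed choice: a log-uniform prior over the truncation level.
    eps = np.exp(rng.uniform(np.log(eps_min), np.log(eps_max)))
    t = rng.uniform(eps, T, size=batch_size)
    return eps, t
```

A static truncation corresponds to the degenerate case eps_min == eps_max, recovering the usual fixed-hyperparameter training.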
We introduce the Korean Language Understanding Evaluation (KLUE) benchmark. KLUE is a collection of 8 Korean natural language understanding (NLU) tasks, including topic classification, semantic textual similarity, natural language inference, named entity recognition, relation extraction, dependency parsing, machine reading comprehension, and dialogue state tracking. We build all of the tasks from scratch from diverse source corpora while respecting copyrights, to ensure accessibility for anyone without restrictions. With ethical considerations in mind, we carefully design the annotation protocols. Along with the benchmark tasks and data, we provide suitable evaluation metrics and fine-tuning recipes for pretrained language models for each task. We furthermore release pretrained language models (PLMs), KLUE-BERT and KLUE-RoBERTa, to help reproduce baseline models on KLUE and thereby facilitate future research. We make a few interesting observations from preliminary experiments using the proposed KLUE benchmark suite, already demonstrating the usefulness of this new benchmark suite. First, we find KLUE-RoBERTa-large outperforms other baselines, including multilingual PLMs and existing open-source Korean PLMs. Second, even when we replace personally identifiable information in the pretraining corpus, we see minimal degradation in performance, which suggests that privacy and NLU capability are not at odds with each other. Lastly, we find that using BPE tokenization in combination with morpheme-level pre-tokenization is effective for tasks involving morpheme-level tagging, detection, and generation. In addition to accelerating Korean NLP research, our comprehensive documentation on creating KLUE will facilitate creating similar resources for other languages in the future. KLUE is available at https://klue-benchmark.com.
A simulation is useful when the phenomenon of interest is either expensive to reproduce or irreproducible under the same context. Recently, Bayesian inference on the distribution of the simulation input parameters has been implemented sequentially to minimize the simulation budget required to validate the simulation against the real world. However, Bayesian inference remains challenging when the ground-truth posterior is multi-modal with a high-dimensional simulation output. This paper introduces a regularization technique, namely Neural Posterior Regularization (NPR), which encourages the model to explore the input parameter space effectively. We then provide the closed-form solution of the regularized optimization, which enables analyzing the effect of the regularization. We empirically validate that NPR attains statistically significant gains in benchmark performance across diverse simulation tasks.
Bayesian inference without likelihood evaluation, or likelihood-free inference, has been a key research topic in simulation studies for obtaining quantitatively validated simulation models on real-world datasets. As the likelihood evaluation is inaccessible, previous papers train an amortized neural network to estimate the ground-truth posterior for the simulation of interest. Alternating between training the network and accumulating the dataset in a sequential manner can save the total simulation budget by orders of magnitude. In the data accumulation phase, new simulation inputs are chosen within a portion of the total simulation budget and accumulated onto the collected dataset. This newly accumulated data degenerates because the set of simulation inputs is poorly mixed, and this degenerate data collection process ruins the posterior inference. This paper introduces a new sampling approach for the simulation inputs, called Neural Proposal (NP), that resolves the biased data collection by guaranteeing i.i.d. sampling. The experiments show the improved performance of our sampler, especially for simulations with multi-modal posteriors.
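One round of the sequential scheme above can be sketched as follows. This is a schematic illustration, not the paper's algorithm: `proposal_sample` stands in for the learned sampler network (any callable returning i.i.d. draws), and `simulator` is an arbitrary black-box simulation.

```python
import numpy as np

def accumulate_round(simulator, proposal_sample, n_new, dataset):
    """One sequential data-accumulation round: draw n_new simulation inputs
    i.i.d. from the current proposal, run the simulator on each, and append
    the (input, output) pairs to the dataset. i.i.d. draws avoid the poorly
    mixed input sets that degrade the posterior inference.
    """
    theta = proposal_sample(n_new)                 # i.i.d. simulation inputs
    x = np.array([simulator(th) for th in theta])  # black-box simulator runs
    dataset.append((theta, x))
    return dataset
```

After each round, the amortized posterior network would be retrained on the enlarged dataset before the proposal is updated for the next round.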
Recent advances in deep learning have greatly changed the way machine learning, especially in the field of natural language processing, can be applied to the legal domain. However, this shift to data-driven approaches calls for larger and more diverse datasets, which nevertheless remain scarce, especially in non-English languages. Here, we present the first large-scale benchmark of Korean legal AI datasets, LBox Open, which consists of one legal corpus, two classification tasks, two legal judgement prediction (LJP) tasks, and one summarization task. The legal corpus consists of 150K Korean precedents (264M tokens), of which 63K were sentenced in the last 4 years and 96K come from first- and second-level courts in which factual issues are reviewed. The two classification tasks are the prediction of case names (10K) and statutes (3K) from the factual descriptions of individual cases. The LJP tasks consist of (1) 11K criminal examples, where the model is asked to predict the ranges of fine, imprisonment with labor, and imprisonment without labor, and (2) 5K civil examples, where the inputs are facts and the claim for relief and the outputs are the degrees of claim acceptance. The summarization task consists of Supreme Court precedents and their corresponding summaries. We also release LCube, the first Korean legal language model trained on the legal corpus from this study. Given the uniqueness of the law of South Korea and the diversity of the legal tasks covered in this work, we believe that LBox Open contributes to the multilinguality of global legal research. LBox Open and LCube will be publicly available.
We propose a novel multibody dynamics simulation framework that can efficiently deal with large-dimensionality and complementarity multi-contact conditions. Typical contact simulation approaches perform contact impulse-level fixed-point iteration (IL-FPI), which suffers from heavy matrix inversions and multiplications as well as susceptibility to ill-conditioned contact situations. To avoid this, we propose a novel framework based on velocity-level fixed-point iteration (VL-FPI), which, by utilizing a certain surrogate dynamics and contact nodalization (with virtual nodes), can achieve not only inter-contact decoupling but also inter-axis decoupling (i.e., contact diagonalization). This then enables us to one-shot/parallel-solve the contact problem during each VL-FPI iteration loop, while the surrogate dynamics structure allows us to circumvent large/dense matrix inversions/multiplications, thereby significantly speeding up the simulation time with improved convergence properties. We theoretically show that the solution of our framework is consistent with that of the original problem and further elucidate mathematical conditions for the convergence of our proposed solver. The performance and properties of our proposed simulation framework are also demonstrated and experimentally validated for various large-dimensional/multi-contact scenarios, including deformable objects.
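To make the notion of a velocity-level fixed-point iteration concrete, the toy below solves a frictionless contact complementarity problem (find v >= 0 with A v - b >= 0 and v^T (A v - b) = 0) by a generic projected iteration on the contact velocities. This is not the authors' solver: the surrogate dynamics and contact nodalization are replaced here by a plain fixed step size, purely to show the fixed-point structure.

```python
import numpy as np

def velocity_fixed_point(A, b, iters=200, alpha=0.1):
    """Toy velocity-level fixed-point iteration for a frictionless contact
    LCP. Each pass updates all contact velocities at once via a projected
    step; the projection onto v >= 0 enforces the non-penetration side of
    the complementarity condition.
    """
    v = np.zeros_like(b)
    for _ in range(iters):
        # Fixed-point map: take a step along the residual, then project.
        v = np.maximum(0.0, v - alpha * (A @ v - b))
    return v
```

In the paper's framework, the decoupling (contact diagonalization) is what makes each such pass solvable per-contact in parallel; here the whole vector update plays that role schematically.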
3D-aware image synthesis focuses on preserving spatial consistency in addition to generating high-resolution images with fine details. Recently, Neural Radiance Fields (NeRF) have been introduced for synthesizing novel views at low computational cost and with superior performance. While several works investigate generative NeRFs and show remarkable achievements, they cannot handle conditional and continuous feature manipulation in the generation procedure. In this work, we introduce a novel model, called Class-Continuous Conditional Generative NeRF ($\text{C}^{3}$G-NeRF), which can synthesize conditionally manipulated, photorealistic, 3D-consistent images by projecting conditional features into the generator and the discriminator. The proposed $\text{C}^{3}$G-NeRF is evaluated on three image datasets: AFHQ, CelebA, and Cars. As a result, our model shows strong 3D consistency with fine details and smooth interpolation under conditional feature manipulation. For instance, $\text{C}^{3}$G-NeRF achieves a Fr\'echet Inception Distance (FID) of 7.64 in 3D-aware face image synthesis at $\text{128}^{2}$ resolution. Additionally, we report FIDs of the generated 3D-aware images for each class of the datasets, since it is possible to synthesize class-conditional images with $\text{C}^{3}$G-NeRF.