Smart Paper Notes

Generating Multiple-Length Summaries via Reinforcement Learning for Unsupervised Sentence Summarization

Dongmin Hyun , Xiting Wang , Chanyoung Park , Xing Xie , Hwanjo Yu

Categories: Natural Language Processing | Artificial Intelligence

2022-12-21

Sentence summarization shortens given texts while maintaining their core contents. Unsupervised approaches have been studied to summarize texts without human-written summaries. However, recent unsupervised models are extractive: they remove words from texts and are thus less flexible than abstractive summarization. In this work, we devise an abstractive model based on reinforcement learning without ground-truth summaries. We formulate unsupervised summarization as a Markov decision process with rewards representing the summary quality. To further enhance the summary quality, we develop a multi-summary learning mechanism that generates multiple summaries of varying lengths for a given text, while making the summaries mutually enhance each other. Experimental results show that the proposed model substantially outperforms both abstractive and extractive models, while frequently generating new words not contained in the input texts.

Translated by Google Translate

Heterogeneous Graph Learning for Multi-modal Medical Data Analysis

Sein Kim , Namkyeong Lee , Junseok Lee , Dongmin Hyun , Chanyoung Park

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-11-28

Routine clinical visits of a patient produce not only image data, but also non-image data containing clinical information regarding the patient, i.e., medical data is multi-modal in nature. Such heterogeneous modalities offer different and complementary perspectives on the same patient, resulting in more accurate clinical decisions when they are properly combined. However, despite its significance, how to effectively fuse the multi-modal medical data into a unified framework has received relatively little attention. In this paper, we propose an effective graph-based framework called HetMed (Heterogeneous Graph Learning for Multi-modal Medical Data Analysis) for fusing the multi-modal medical data. Specifically, we construct a multiplex network that incorporates multiple types of non-image features of patients to capture the complex relationship between patients in a systematic way, which leads to more accurate clinical decisions. Extensive experiments on various real-world datasets demonstrate the superiority and practicality of HetMed. The source code for HetMed is available at https://github.com/Sein-Kim/Multimodal-Medical.


Beyond Learning from Next Item: Sequential Recommendation via Personalized Interest Sustainability

Dongmin Hyun , Chanyoung Park , Junsu Cho , Hwanjo Yu

Categories: Artificial Intelligence | Machine Learning

2022-09-14

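Sequential recommender systems make effective suggestions by capturing users' interest drift. Existing sequential models fall into two groups: user-centric and item-centric models. User-centric models capture personalized interest drift based on each user's sequential consumption history, but do not explicitly consider whether a user's interest in items is sustained beyond the training time, i.e., interest sustainability. Item-centric models, on the other hand, consider whether users' general interest is sustained after the training time, but they are not personalized. In this work, we propose a recommender system that combines the advantages of both categories. Our proposed model captures personalized interest sustainability, indicating whether each user's interest in items will be sustained beyond the training time. We first formulate a task that requires predicting which items each user will consume during the training time based on users' consumption history. We then propose simple yet effective schemes to augment users' sparse consumption history. Extensive experiments show that the proposed model outperforms 10 baseline models on 11 real-world datasets. The code is available at https://github.com/dmhyun/peris.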


Residual Correction in Real-Time Traffic Forecasting

Daejin Kim , Youngin Cho , Dongmin Kim , Cheonbok Park , Jaegul Choo

Categories: Machine Learning

2022-09-12

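Predicting traffic conditions is highly challenging because every road depends strongly on others, both spatially and temporally. Recently, specially designed architectures such as graph convolutional networks and temporal convolutional networks have been introduced to capture this spatial and temporal dependency. Despite the remarkable progress in traffic forecasting, we find that deep-learning-based traffic forecasting models still fail on certain patterns, mainly in event situations (e.g., rapid speed drops). Although these failures are commonly attributed to unpredictable noise, we find that they can be corrected by considering previous failures. Specifically, we observe autocorrelated errors in these failures, which indicates that some predictable information remains. In this study, to capture the correlation of errors, we introduce Rescal, a residual estimation module for traffic forecasting that serves as a widely applicable add-on module for existing traffic forecasting models. Rescal calibrates the predictions of existing models in real time by estimating future errors from previous errors and graph signals. Extensive experiments on METR-LA and PEMS-BAY demonstrate that Rescal correctly captures the correlation of errors and corrects the failures of various traffic forecasting models in event situations.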
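The idea of exploiting autocorrelated errors can be illustrated with a deliberately simplified stand-in. The sketch below is not the module from the paper: it ignores graph signals and replaces the learned estimator with a least-squares AR(1) fit on past errors. The function names `fit_ar1` and `corrected_forecast` are hypothetical.

```python
def fit_ar1(errors):
    """Least-squares AR(1) coefficient phi such that e[t] ~ phi * e[t-1]."""
    num = sum(prev * cur for prev, cur in zip(errors[:-1], errors[1:]))
    den = sum(prev * prev for prev in errors[:-1])
    return num / den if den else 0.0


def corrected_forecast(base_prediction, past_predictions, past_observations):
    """Add an estimate of the next error to the base model's prediction.

    Assumption: errors are defined as observation minus prediction, and the
    next error is estimated as phi times the most recent error.
    """
    errors = [obs - pred for pred, obs in zip(past_predictions, past_observations)]
    if not errors:
        return base_prediction
    phi = fit_ar1(errors)
    return base_prediction + phi * errors[-1]


# A base model keeps predicting 60 while the observed speed stays below it.
past_predictions = [60.0, 60.0, 60.0, 60.0]
past_observations = [52.0, 56.0, 58.0, 59.0]
adjusted = corrected_forecast(60.0, past_predictions, past_observations)
```

If the past errors were pure noise, the fitted coefficient would tend toward zero and the base prediction would be left nearly unchanged, which is the intuition behind treating autocorrelation as recoverable signal.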


Relational Self-Supervised Learning on Graphs

Namkyeong Lee , Dongmin Hyun , Junseok Lee , Chanyoung Park

Categories: Machine Learning | Artificial Intelligence

2022-08-21

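Over the past few years, graph representation learning (GRL) has been a powerful strategy for analyzing graph-structured data. Recently, GRL methods have shown promising results by adopting self-supervised learning methods developed for learning representations of images. Despite their success, existing GRL methods tend to overlook an inherent distinction between images and graphs: images are assumed to be independently and identically distributed, whereas graphs exhibit relational information among data instances, i.e., nodes. To fully benefit from the relational information inherent in graph-structured data, we propose a novel GRL method called RGRL that learns from the relational information generated from the graph itself. RGRL learns node representations such that the relationship among nodes is invariant to augmentations, i.e., an augmentation-invariant relationship, which allows the node representations to vary as long as the relationship among the nodes is preserved. By considering the relationship among nodes from both global and local perspectives, RGRL overcomes the limitations of contrastive and non-contrastive methods and achieves the best of both. Extensive experiments on fourteen benchmark datasets over various downstream tasks demonstrate the superiority of RGRL over state-of-the-art baselines. The source code for RGRL is available at https://github.com/namkyeong/rgrl.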


Learning from Noisy Labels with Deep Neural Networks: A Survey

Hwanjun Song , Minseok Kim , Dongmin Park , Yooju Shin , Jae-Gil Lee

Categories: Machine Learning | Computer Vision | Machine Learning (Statistics)

2020-07-16

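Deep learning has achieved remarkable success in numerous domains with the help of large amounts of data. However, the quality of data labels is a concern because high-quality labels are lacking in many real-world scenarios. As noisy labels severely degrade the generalization performance of deep neural networks, learning from noisy labels (robust training) is becoming an important task in modern deep learning applications. In this survey, we first describe the problem of learning with label noise from a supervised learning perspective. Next, we provide a comprehensive review of 62 state-of-the-art robust training methods, all of which are categorized into five groups according to their methodological differences, followed by a systematic comparison of six properties used to evaluate their superiority. Subsequently, we perform an in-depth analysis of noise rate estimation and summarize the commonly used evaluation methodology, including public noisy datasets and evaluation metrics. Finally, we present several promising research directions that can serve as a guideline for future studies. All the contents will be available at https://github.com/songhwanjun/awesome-noisy-labels.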


Segmentation based tracking of cells in 2D+time microscopy images of macrophages

Seol Ah Park , Tamara Sipka , Zuzana Kriva , George Lutfalla , Mai Nguyen-Chi , Karol Mikula

Categories: Computer Vision

2023-01-02

The automated segmentation and tracking of macrophages during their migration are challenging tasks due to their dynamically changing shapes and motions. This paper proposes a new algorithm for automatic cell tracking in time-lapse microscopy data of macrophages. First, we design a segmentation method employing space-time filtering, local Otsu's thresholding, and the SUBSURF (subjective surface segmentation) method. Next, partial trajectories for cells that overlap in the temporal direction are extracted from the segmented images. Finally, the extracted trajectories are linked by considering their direction of movement. The segmented images and the trajectories obtained with the proposed method are compared with those of semi-automatic segmentation and manual tracking. The proposed tracking achieved 97.4% accuracy on macrophage data under challenging conditions, including feeble fluorescent intensity, irregular shapes, and motion of macrophages. We expect that the automatically extracted trajectories can provide evidence of how macrophages migrate depending on their polarization modes in situations such as wound healing.
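For readers unfamiliar with the thresholding step, the following is a minimal sketch of the classic Otsu criterion, which picks the threshold that maximizes the between-class variance of the intensity histogram. It is not the authors' implementation: the abstract does not say how the local variant is set up, so this sketch assumes it means running the same computation on the pixels of each local window. The function name `otsu_threshold` and the 8-bit intensity range are illustrative choices.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance.

    Pixels with intensity <= t form one class, the rest form the other.
    `pixels` is a flat list of integers in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    weight_low = 0      # number of pixels with intensity <= t
    sum_low = 0         # sum of intensities of those pixels
    best_t, best_var = 0, -1.0

    for t in range(levels):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue            # no pixels in the low class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break               # every pixel is already in the low class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        between_var = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


# One window containing a dark background and a bright cell region.
window = [10] * 50 + [200] * 50
threshold = otsu_threshold(window)
foreground = [p for p in window if p > threshold]
```

Computing the threshold per window rather than once per image lets dim cells in a dark region still be separated from their immediate background, which matters when fluorescence intensity is weak.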


DMOps: Data Management Operation and Recipes

Eujeong Choi , Chanjun Park

Categories: Machine Learning

2023-01-02

Data-centric AI has shed light on the significance of data within the machine learning (ML) pipeline. Acknowledging its importance, academia, industry, and government departments have proposed various research efforts and policies. Although the capability to utilize existing data is essential, the capability to build a dataset has become more important than ever. In consideration of this trend, we propose "Data Management Operation and Recipes" (DMOps), which is intended to guide the industry regardless of the task or domain. In other words, this paper presents the concept of DMOps derived from real-world experience. By offering a baseline for building data, we want to help the industry streamline its data operations.


Situation-Aware Deep Reinforcement Learning for Autonomous Nonlinear Mobility Control in Cyber-Physical Loitering Munition Systems

Hyunsoo Lee , Soohyun Park , Won Joon Yun , Soyi Jung , Joongheon Kim

Categories: Robotics

2022-12-31

With the rapid development of drone technologies, drones are widely used in many applications, including military domains. This paper proposes a novel situation-aware deep reinforcement learning (DRL)-based autonomous nonlinear drone mobility control algorithm for cyber-physical loitering munition applications. On the battlefield, designing a DRL-based autonomous control algorithm is not straightforward because real-world data gathering is generally not possible. Therefore, the approach in this paper is to construct a cyber-physical virtual environment with Unity. Based on the virtual cyber-physical battlefield scenarios, a DRL-based automated nonlinear drone mobility control algorithm can be designed, evaluated, and visualized. Moreover, real-world battlefield scenarios contain many obstacles, which are harmful to linear trajectory control. Thus, our proposed autonomous nonlinear drone mobility control algorithm utilizes situation-aware components that are implemented with a Raycast function in Unity virtual scenarios. Based on the gathered situation-aware information, the drone can autonomously and nonlinearly adjust its trajectory during flight. This approach is therefore beneficial for avoiding obstacles in obstacle-deployed battlefields. Our visualization-based performance evaluation shows that the proposed algorithm is superior to the other linear mobility control algorithms.


Macro-block dropout for improved regularization in training end-to-end speech recognition models

Chanwoo Kim , Sathish Indurti , Jinhwan Park , Wonyong Sung

Categories: Machine Learning | Natural Language Processing

2022-12-29

This paper proposes a new regularization algorithm referred to as macro-block dropout. Overfitting has been a difficult problem in training large neural network models. The dropout technique has proven to be simple yet very effective for regularization by preventing complex co-adaptations during training. In our work, we define a macro-block that contains a large number of units from the input to a Recurrent Neural Network (RNN). Rather than applying dropout to each unit, we apply random dropout to each macro-block. This algorithm has the effect of applying different dropout rates to each layer even if we keep a constant average dropout rate, which yields better regularization. In our experiments using a Recurrent Neural Network-Transducer (RNN-T), this algorithm shows relative Word Error Rate (WER) improvements of 4.30% and 6.13% over conventional dropout on LibriSpeech test-clean and test-other, respectively. With an Attention-based Encoder-Decoder (AED) model, this algorithm shows relative WER improvements of 4.36% and 5.85% over conventional dropout on the same test sets.
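The contrast between per-unit and per-block dropout can be made concrete with a small sketch. This is an illustration under stated assumptions, not the paper's implementation: it assumes macro-blocks are equal-sized contiguous slices of the input vector and that surviving blocks are rescaled by 1 / (1 - drop_rate), as in standard inverted dropout. The function name `macro_block_dropout` and its parameters are hypothetical.

```python
import random


def macro_block_dropout(units, num_blocks, drop_rate, rng=None):
    """Zero out whole contiguous blocks of `units`, each with probability `drop_rate`.

    Assumptions (not taken from the paper): blocks are contiguous and roughly
    equal in size, and kept blocks are scaled by 1 / (1 - drop_rate).
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")

    rng = rng or random.Random()
    n = len(units)
    scale = 1.0 / (1.0 - drop_rate)
    out = []
    for b in range(num_blocks):
        start = b * n // num_blocks
        end = (b + 1) * n // num_blocks
        keep = rng.random() >= drop_rate   # one decision per macro-block
        for x in units[start:end]:
            out.append(x * scale if keep else 0.0)
    return out


# Eight input units split into four macro-blocks of two units each.
masked = macro_block_dropout([1.0] * 8, num_blocks=4, drop_rate=0.5,
                             rng=random.Random(0))
```

Because a single random draw governs every unit in a block, the fraction of units actually dropped in one forward pass varies much more than with per-unit dropout, even though the expected rate is the same.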
