Smart Paper Notes

Automated Precision Localization of Peripherally Inserted Central Catheter Tip through Model-Agnostic Multi-Stage Networks

Subin Park, Yoon Ki Cha, Soyoung Park, Kyung-Su Kim, Myung Jin Chung

Category: Computer Vision

2022-06-14

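Peripherally inserted central catheters (PICCs) have been widely used as one of the representative central venous lines (CVCs) because they provide long-term intravascular access with a low infection rate. However, PICC tips are frequently mispositioned, which increases the risk of complications such as puncture, embolism, and cardiac arrhythmia. Various attempts have been made to detect the tip automatically and precisely using the latest deep learning (DL) techniques. Even with these methods, it remains difficult in practice to determine the tip location, because the multiple fragments phenomenon (MFP) occurs while predicting and extracting the PICC line, a step required before the tip can be predicted. This study aims to develop a system that can be applied generally to existing models and that restores the PICC line more accurately by removing the multiple fragments (MFs) from the model output, thereby precisely localizing the actual tip position so that its placement can be assessed. To this end, we propose a multi-stage DL-based framework that post-processes the PICC line extraction results of existing techniques. Performance was compared in terms of root mean squared error (RMSE) and MFP incidence rate, with and without MFCN applied to five conventional models. In internal validation, applying MFCN to an existing single model improved the MFP by 45% on average, and the RMSE improved by more than 63%, from an average of 26.85 mm (17.16 to 35.80 mm) to 9.72 mm (9.37 to 10.98 mm). In external validation, applying MFCN reduced the MFP incidence rate by 32% on average and the RMSE by 65% on average. Thus, by applying the proposed MFCN, we observed a significant and consistent improvement in PICC tip localization performance compared with existing models.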
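
The relative RMSE improvement quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the `rmse` helper assumes tip positions are 2D points in millimetres compared by Euclidean distance, which the abstract does not specify, and both function names are hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and true tip positions (mm).

    Assumption: each position is an (x, y) pair and the per-sample error is
    the Euclidean distance between the two points.
    """
    squared = [
        (px - ax) ** 2 + (py - ay) ** 2
        for (px, py), (ax, ay) in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(before, after):
    """Fractional reduction of an error metric (0.5 means the error halved)."""
    return (before - after) / before


# Average RMSE values reported for internal validation, in mm.
reduction = relative_reduction(26.85, 9.72)
print(f"{reduction:.1%}")  # 63.8%
```

The result agrees with the "more than 63%" figure stated for internal validation.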

Translated by Google Translate

What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers

Boseop Kim, HyoungSeok Kim, Sang-Woo Lee, Gichang Lee, Donghyun Kwak, Dong Hyeon Jeon, Sunghyun Park, Sungju Kim, Seonhoon Kim, Dongpil Seo

Category: Natural Language Processing

2021-09-10

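GPT-3 shows the remarkable in-context learning ability of large-scale language models (LMs) trained on billions-scale data. Here we address some issues that the GPT-3 paper leaves open, such as non-English LMs, the performance of models of different sizes, and the effect of recently introduced prompt optimization on in-context learning. To this end, we introduce HyperCLOVA, a Korean variant of GPT-3 trained on a Korean-centric corpus of 560B tokens. Enhanced by our Korean-specific tokenization, HyperCLOVA with our training configuration shows state-of-the-art in-context zero-shot and few-shot learning performance on various downstream tasks in Korean. We also show the performance benefits of prompt-based learning and demonstrate how it can be integrated into the prompt engineering pipeline. We then discuss the possibility of realizing the No Code AI paradigm by introducing HyperCLOVA Studio, an interactive prompt engineering interface that provides AI prototyping capabilities to non-experts in ML. Finally, we demonstrate the potential of our methods with three successful in-house applications.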

Segmentation based tracking of cells in 2D+time microscopy images of macrophages

Seol Ah Park, Tamara Sipka, Zuzana Kriva, George Lutfalla, Mai Nguyen-Chi, Karol Mikula

Category: Computer Vision

2023-01-02

The automated segmentation and tracking of macrophages during their migration are challenging tasks because of their dynamically changing shapes and motions. This paper proposes a new algorithm for automatic cell tracking in time-lapse microscopy data of macrophages. First, we design a segmentation method that employs space-time filtering, local Otsu's thresholding, and the SUBSURF (subjective surface segmentation) method. Next, partial trajectories of cells that overlap in the temporal direction are extracted from the segmented images. Finally, the extracted trajectories are linked by considering their direction of movement. The segmented images and the trajectories obtained with the proposed method are compared with those from semi-automatic segmentation and manual tracking. The proposed tracking achieved 97.4% accuracy on macrophage data in challenging situations such as feeble fluorescent intensity and irregular shapes and motion of macrophages. We expect that the automatically extracted trajectories can provide evidence of how macrophages migrate depending on their polarization modes in situations such as wound healing.
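
To make the "local Otsu's thresholding" step more concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the tile-based notion of "local", the function names, and the 8-bit intensity range are all assumptions, and the space-time filtering and SUBSURF stages are omitted.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels <= t form the low class. If the pixels cannot be split into two
    non-empty classes (a uniform region), levels - 1 is returned so that
    nothing in the region counts as foreground.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = levels - 1, -1.0
    w0 = 0    # number of pixels in the low class
    sum0 = 0  # intensity sum of the low class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def local_otsu_mask(image, tile):
    """Binarize a 2D list of 8-bit intensities, one tile x tile window at a time."""
    rows, cols = len(image), len(image[0])
    mask = [[0] * cols for _ in range(rows)]
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            cells = [
                (r, c)
                for r in range(r0, min(r0 + tile, rows))
                for c in range(c0, min(c0 + tile, cols))
            ]
            t = otsu_threshold([image[r][c] for r, c in cells])
            for r, c in cells:
                mask[r][c] = 1 if image[r][c] > t else 0
    return mask


# A dim region (left tile) and a bright region (right tile) get separate thresholds.
image = [
    [10, 12, 100, 180],
    [11, 90, 110, 200],
]
print(local_otsu_mask(image, tile=2))  # [[0, 0, 0, 1], [0, 1, 0, 1]]
```

Thresholding each tile separately lets a dim cell in a dark region be picked up even when a single global threshold would be dominated by brighter regions, which is the usual motivation for a local variant.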

DMOps: Data Management Operation and Recipes

Eujeong Choi, Chanjun Park

Category: Machine Learning

2023-01-02

Data-centric AI has shed light on the significance of data within the machine learning (ML) pipeline. Acknowledging its importance, academia, industry, and government departments have suggested various research directions and policies. Although the capability to utilize existing data is essential, the capability to build a dataset has become more important than ever. In consideration of this trend, we propose "Data Management Operation and Recipes" (DMOps), which will guide the industry regardless of the task or domain. In other words, this paper presents the concept of DMOps derived from real-world experience. By offering a baseline for building data, we want to help the industry streamline its data operations.

Situation-Aware Deep Reinforcement Learning for Autonomous Nonlinear Mobility Control in Cyber-Physical Loitering Munition Systems

Hyunsoo Lee, Soohyun Park, Won Joon Yun, Soyi Jung, Joongheon Kim

Category: Robotics

2022-12-31

With the rapid development of drone technologies, drones are widely used in many applications, including military domains. This paper proposes a novel situation-aware deep reinforcement learning (DRL)-based autonomous nonlinear drone mobility control algorithm for cyber-physical loitering munition applications. On the battlefield, designing a DRL-based autonomous control algorithm is not straightforward because real-world data gathering is generally not possible. Therefore, this paper constructs a cyber-physical virtual environment with Unity. Based on the virtual cyber-physical battlefield scenarios, a DRL-based automated nonlinear drone mobility control algorithm can be designed, evaluated, and visualized. Moreover, real-world battlefield scenarios contain many obstacles, which hinder linear trajectory control. Thus, the proposed autonomous nonlinear drone mobility control algorithm uses situation-aware components implemented with a Raycast function in Unity virtual scenarios. Based on the gathered situation-aware information, the drone can autonomously and nonlinearly adjust its trajectory during flight, which helps it avoid obstacles in obstacle-deployed battlefields. Our visualization-based performance evaluation shows that the proposed algorithm is superior to the other linear mobility control algorithms.

Macro-block dropout for improved regularization in training end-to-end speech recognition models

Chanwoo Kim, Sathish Indurti, Jinhwan Park, Wonyong Sung

Category: Machine Learning | Natural Language Processing

2022-12-29

This paper proposes a new regularization algorithm referred to as macro-block dropout. Overfitting has been a difficult problem in training large neural network models. The dropout technique has proven to be simple yet very effective for regularization by preventing complex co-adaptations during training. In our work, we define a macro-block that contains a large number of units from the input to a Recurrent Neural Network (RNN). Rather than applying dropout to each unit, we apply random dropout to each macro-block. This algorithm has the effect of applying different dropout rates to each layer even if we keep a constant average dropout rate, which gives better regularization. In our experiments using a Recurrent Neural Network-Transducer (RNN-T), this algorithm shows relative Word Error Rate (WER) improvements of 4.30% and 6.13% over conventional dropout on LibriSpeech test-clean and test-other, respectively. With an Attention-based Encoder-Decoder (AED) model, it shows relative WER improvements of 4.36% and 5.85% over conventional dropout on the same test sets.
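
The contrast between per-unit dropout and dropout on whole macro-blocks can be illustrated with a short pure-Python sketch. The abstract does not give the block layout or scaling, so the contiguous equal-size blocks, the inverted-dropout rescaling, and the function name are all assumptions rather than the paper's method.

```python
import random


def macro_block_dropout(inputs, block_size, drop_rate, rng):
    """Zero out whole contiguous blocks of `inputs`, each with probability `drop_rate`.

    Assumptions: blocks are contiguous slices of equal size (the last one may
    be shorter), drop_rate < 1, and kept blocks are rescaled by
    1 / (1 - drop_rate) as in standard inverted dropout.
    """
    keep_scale = 1.0 / (1.0 - drop_rate)
    output = []
    for start in range(0, len(inputs), block_size):
        block = inputs[start:start + block_size]
        if rng.random() < drop_rate:
            # One random draw decides the fate of every unit in the block.
            output.extend(0.0 for _ in block)
        else:
            output.extend(x * keep_scale for x in block)
    return output


rng = random.Random(0)
features = [1.0] * 8
print(macro_block_dropout(features, block_size=4, drop_rate=0.5, rng=rng))
```

Because a single draw covers many units, the fraction of units actually dropped in one call can differ a lot from `drop_rate`, even though it matches `drop_rate` on average.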

Joint Engagement Classification using Video Augmentation Techniques for Multi-person Human-robot Interaction

Yubin Kim, Huili Chen, Sharifa Alghowinem, Cynthia Breazeal, Hae Won Park

Category: Computer Vision

2022-12-28

Affect understanding capability is essential for social robots to autonomously interact with a group of users in an intuitive and reciprocal way. However, the challenge of multi-person affect understanding comes not only from accurately perceiving each user's affective state (e.g., engagement) but also from recognizing the affect interplay between the members (e.g., joint engagement), which presents as complex but subtle nonverbal exchanges between them. Here we present a novel hybrid framework for identifying a parent-child dyad's joint engagement by combining a deep learning framework with various video augmentation techniques. Using a dataset of parent-child dyads reading storybooks together with a social robot at home, we first train RGB frame-based and skeleton-based joint engagement recognition models on datasets augmented with four video augmentation techniques (General Aug, DeepFake, CutOut, and Mixed) to improve joint engagement classification performance. Second, we demonstrate experimental results on the use of the trained models in the robot-parent-child interaction context. Third, we introduce a behavior-based metric for evaluating the learned representations of the models to investigate model interpretability when recognizing joint engagement. This work serves as the first step toward fully unlocking the potential of end-to-end video understanding models pre-trained on large public datasets and augmented with data augmentation and visualization techniques for affect recognition in multi-person human-robot interaction in the wild.

Off-Policy Reinforcement Learning with Loss Function Weighted by Temporal Difference Error

Bumgeun Park, Taeyoung Kim, Woohyeon Moon, Luiz Felipe Vecchietti, Dongsoo Har

Category: Machine Learning | Artificial Intelligence

2022-12-26

Training agents via off-policy deep reinforcement learning (RL) requires a large memory, named replay memory, that stores past experiences used for learning. These experiences are sampled, uniformly or non-uniformly, to create the batches used for training. When calculating the loss function, off-policy algorithms assume that all samples are of the same importance. In this paper, we hypothesize that training can be enhanced by assigning a different importance to each experience, based on its temporal-difference (TD) error, directly in the training objective. We propose a novel method that introduces a weighting factor for each experience when calculating the loss function at the learning stage. In addition to improving convergence speed when used with uniform sampling, the method can be combined with prioritization methods for non-uniform sampling. Combining the proposed method with prioritization methods improves sampling efficiency while increasing the performance of TD-based off-policy RL algorithms. The effectiveness of the proposed method is demonstrated by experiments in six environments of the OpenAI Gym suite. The experimental results show that the proposed method achieves a 33%~76% reduction in convergence time in three environments, and an 11% increase in returns and a 3%~10% increase in success rate in the other three environments.
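
The idea of weighting each sample's contribution to the loss by its TD error can be sketched as follows. The abstract does not state the actual weighting factor, so the normalised absolute TD error used here is purely an assumption for illustration, as are the function names.

```python
def td_errors(rewards, q_values, next_q_values, gamma=0.99):
    """One-step TD error per sample: r + gamma * Q_target(s', a') - Q(s, a)."""
    return [
        r + gamma * nq - q
        for r, q, nq in zip(rewards, q_values, next_q_values)
    ]


def uniform_td_loss(errors):
    """Baseline: mean squared TD error with every sample weighted equally."""
    return sum(e * e for e in errors) / len(errors)


def weighted_td_loss(errors, eps=1e-8):
    """Mean squared TD error with per-sample weights.

    Assumption: the weight of a sample is its |TD error| divided by the batch
    mean of |TD error|, so the weights average to about 1. In a real training
    loop the weights would be treated as constants (no gradient through them).
    """
    magnitudes = [abs(e) for e in errors]
    mean_magnitude = sum(magnitudes) / len(magnitudes)
    weights = [m / (mean_magnitude + eps) for m in magnitudes]
    return sum(w * e * e for w, e in zip(weights, errors)) / len(errors)


# Hypothetical batch of two transitions.
errors = td_errors(
    rewards=[1.0, 0.0],
    q_values=[0.5, 2.0],
    next_q_values=[0.5, 5.0],
    gamma=0.9,
)
print(uniform_td_loss(errors), weighted_td_loss(errors))
```

With this choice, samples whose TD error is above the batch average contribute more to the loss than under uniform weighting, and samples with small TD error contribute less.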

FFNeRV: Flow-Guided Frame-Wise Neural Representations for Videos

Joo Chan Lee, Daniel Rho, Jong Hwan Ko, Eunbyung Park

Category: Computer Vision | Machine Learning

2022-12-23

Neural fields, also known as coordinate-based or implicit neural representations, have shown a remarkable capability of representing, generating, and manipulating various forms of signals. For video representations, however, mapping pixel-wise coordinates to RGB colors has shown relatively low compression performance and slow convergence and inference speed. Frame-wise video representation, which maps a temporal coordinate to its entire frame, has recently emerged as an alternative method to represent videos, improving compression rates and encoding speed. While promising, it has still failed to reach the performance of state-of-the-art video compression algorithms. In this work, we propose FFNeRV, a novel method that, inspired by standard video codecs, incorporates flow information into frame-wise representations to exploit the temporal redundancy across frames in videos. Furthermore, we introduce a fully convolutional architecture, enabled by one-dimensional temporal grids, that improves the continuity of spatial features. Experimental results show that FFNeRV yields the best performance for video compression and frame interpolation among methods using frame-wise representations or neural fields. To reduce the model size even further, we devise a more compact convolutional architecture using group and pointwise convolutions. With model compression techniques, including quantization-aware training and entropy coding, FFNeRV outperforms widely used standard video codecs (H.264 and HEVC) and performs on par with state-of-the-art video compression algorithms.

DaDe: Delay-adaptive Detector for Streaming Perception

Wonwoo Jo, Kyungshin Lee, Jaewon Baik, Sangsun Lee, Dongho Choi, Hyunkyoo Park

Category: Computer Vision | Artificial Intelligence

2022-12-22

Recognizing the surrounding environment at low latency is critical in autonomous driving. In a real-time setting, the surrounding environment has already changed by the time processing is over. Current detection models are incapable of dealing with changes in the environment that occur after processing. Streaming perception has been proposed to assess the latency and accuracy of real-time video perception. However, additional problems arise in real-world applications due to limited hardware resources, high temperatures, and other factors. In this study, we develop a model that can reflect processing delays in real time and produce the most reasonable results. By incorporating the proposed feature queue and feature select module, the system gains the ability to forecast specific time steps without any additional computational cost. Our method is tested on the Argoverse-HD dataset. It achieves higher performance than the current state-of-the-art methods (as of 2022.10) in various environments when delayed. The code is available at https://github.com/danjos95/DADE
