Smart Paper Notes

DCT Approximations Based on Chen's Factorization

C. J. Tablada , T. L. T. da Silveira , R. J. Cintra , F. M. Bayer

Category: Computer Vision

2022-07-24

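This paper proposes two 8-point multiplication-free DCT approximations based on Chen's factorization and derives their fast algorithms. Both transforms are assessed in terms of computational cost, error energy, and coding gain. Experiments with a JPEG-like image compression scheme are performed, and the results are compared with competing methods. The proposed low-complexity transforms are scaled according to the Jridi-Alfalou-Meher algorithm to obtain 16- and 32-point approximations. The new sets of transforms are embedded into the HEVC reference software to provide a fully HEVC-compliant video coding scheme. We show that approximate transforms can outperform traditional transforms and state-of-the-art methods at a very low complexity cost.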
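To illustrate what a multiplication-free DCT approximation looks like, the sketch below builds the exact 8-point DCT-II matrix and replaces each entry with its sign, so every entry is +1 or -1. This is the classic signed-DCT construction, used here only as a stand-in: it is not one of the two transforms proposed in the paper, and the helper names `dct_matrix` and `signed_dct_matrix` are hypothetical.

```python
import numpy as np

def dct_matrix(n=8):
    """Exact orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.cos((2 * i + 1) * k * np.pi / (2 * n))
    scale = np.full((n, 1), np.sqrt(2.0 / n))
    scale[0, 0] = np.sqrt(1.0 / n)
    return scale * c

def signed_dct_matrix(n=8):
    """Illustrative low-complexity matrix: the sign of each exact DCT entry."""
    return np.sign(dct_matrix(n)).astype(int)

C = dct_matrix()          # exact transform, needs multiplications
T = signed_dct_matrix()   # entries are +1 or -1 only

# Applying T needs only additions and subtractions.
x = np.arange(8)
y = T @ x
```

In practice such a matrix is paired with a diagonal scaling that can be merged into the quantization step, which is why the additions alone are usually counted as the transform's cost.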


Low-Complexity Loeffler DCT Approximations for Image and Video Coding

D. F. G. Coelho , R. J. Cintra , F. M. Bayer , S. Kulasekera , A. Madanayake , P. A. C. Martinez , T. L. T. Silveira , R. S. Oliveira , V. S. Dimitrov

Category: Computer Vision

2022-07-29

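This paper introduces a matrix parametrization method based on the Loeffler discrete cosine transform (DCT) algorithm. As a result, a new class of eight-point DCT approximations is proposed, capable of unifying the mathematical formalism of several eight-point DCT approximations in the literature. Pareto-efficient DCT approximations are obtained through multicriteria optimization, where computational complexity, proximity, and coding performance are considered. The efficient approximations and their scaled 16- and 32-point versions are embedded into image and video encoders, including a JPEG-like codec and the H.264/AVC and H.265/HEVC standards. Results are compared with the unmodified standard codecs. The efficient approximations are mapped and implemented on a Xilinx VLX240T FPGA and evaluated for area, speed, and power consumption.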
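The notion of Pareto efficiency used in the multicriteria optimization can be sketched as follows. A candidate is kept only if no other candidate is at least as good on every metric and strictly better on at least one. The function name `pareto_front` and the numbers are made up for illustration and are not results from the paper; all metrics are assumed to be minimized, so a metric such as coding gain would have to be negated first.

```python
def pareto_front(candidates):
    """Return names of candidates not dominated by any other candidate.

    Each candidate is (name, metrics); every metric is to be minimized.
    """
    front = []
    for name, m in candidates:
        dominated = any(
            all(o <= v for o, v in zip(other, m))
            and any(o < v for o, v in zip(other, m))
            for other_name, other in candidates
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front

# Made-up numbers: (additions, approximation error); lower is better for both.
candidates = [
    ("A", (14, 8.0)),
    ("B", (18, 5.5)),
    ("C", (18, 7.0)),   # dominated by B: same cost, larger error
    ("D", (24, 1.8)),
    ("E", (26, 2.0)),   # dominated by D: higher cost and larger error
]
front = pareto_front(candidates)
```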


Jubileo: An Open-Source Robot and Framework for Research in Human-Robot Social Interaction

Jair A. Bottega , Victor A. Kich , Alisson H. Kolling , Jardel D. S. Dyonisio , Pedro L. Corçaque , Rodrigo da S. Guerra , Daniel F. T. Gamarra

Category: Robotics

2022-09-27

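Human-robot interaction (HRI) is essential to the widespread use of robots in daily life. Robots will eventually be able to carry out a variety of duties in human society through effective social interaction. Creating straightforward and understandable interfaces for engaging with robots as they start to proliferate in personal workspaces is essential. Typically, interactions with simulated robots are displayed on screens. A more appealing alternative is virtual reality (VR), which provides visual cues closer to those seen in the real world. In this study, we introduce Jubileo, a robotic animatronic face with various tools for research and application development in the field of human-robot social interaction. The Jubileo project offers more than a fully functional open-source physical robot: it also provides a comprehensive framework that can be operated through a VR interface, bringing an immersive environment for HRI application tests and noticeably better deployment speed.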


Segmentation-guided Domain Adaptation and Data Harmonization of Multi-device Retinal Optical Coherence Tomography using Cycle-Consistent Generative Adversarial Networks

Shuo Chen , Da Ma , Sieun Lee , Timothy T. L. Yu , Gavin Xu , Donghuan Lu , Karteek Popuri , Myeong Jin Ju , Marinko V. Sarunic , Mirza Faisal Beg

Category: Computer Vision | Machine Learning

2022-08-31

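Optical coherence tomography (OCT) is a non-invasive technique that captures cross-sectional areas of the retina at micrometer resolution. It has been widely used as an auxiliary imaging reference to detect eye-related pathology and to predict the longitudinal progression of disease characteristics. Retinal layer segmentation is one of the crucial feature extraction techniques, since variations in retinal layer thickness and retinal layer deformation due to the presence of fluid are highly correlated with multiple prevalent eye diseases such as diabetic retinopathy (DR) and age-related macular degeneration (AMD). However, these images are acquired from different devices with different intensity distributions; in other words, they belong to different imaging domains. This paper proposes a segmentation-guided domain adaptation method that adapts images from multiple devices into a single image domain for which a state-of-the-art pre-trained segmentation model is available. It avoids both the time-consuming manual labelling of upcoming new datasets and the re-training of the existing network. The semantic consistency and global feature consistency of the network minimize the hallucination effect that many researchers have reported for the cycle-consistent generative adversarial network (CycleGAN) architecture.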



Fast Unsupervised Brain Anomaly Detection and Segmentation with Diffusion Models

Walter H. L. Pinaya , Mark S. Graham , Robert Gray , Pedro F Da Costa , Petru-Daniel Tudosiu , Paul Wright , Yee H. Mah , Andrew D. MacKinnon , James T. Teo , Rolf Jager

Category: Computer Vision

2022-06-07

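Deep generative models have emerged as promising tools for detecting arbitrary anomalies in data, dispensing with the need for manual labelling. Recently, autoregressive transformers have achieved state-of-the-art performance in medical imaging. However, these models still have some intrinsic weaknesses, such as requiring images to be modelled as 1D sequences, the accumulation of errors during the sampling process, and the significant inference times associated with transformers. Denoising diffusion probabilistic models are a class of non-autoregressive generative models recently shown to produce excellent samples in computer vision (surpassing generative adversarial networks) and to achieve log-likelihoods that are competitive with transformers while having fast inference times. Diffusion models can be applied to the latent representations learned by autoencoders, making them easily scalable and excellent candidates for high-dimensional data such as medical images. Here, we propose a method based on diffusion models to detect and segment anomalies in brain imaging. By training the models on healthy data and then exploring their diffusion and reverse steps across the Markov chain, we can identify anomalous regions in the latent space and hence identify anomalies in the pixel space. Our diffusion models achieve competitive performance across a series of experiments with 2D CT and MRI data involving synthetic and real pathological lesions, with much reduced inference times, making their use clinically viable.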


Audio Latent Space Cartography

Nicolas Jonason , Bob L. T. Sturm

Category: Machine Learning

2022-12-05

We explore the generation of visualisations of audio latent spaces using an audio-to-image generation pipeline. We believe this can help with the interpretability of audio latent spaces. We demonstrate a variety of results on the NSynth dataset. A web demo is available.


Detection and depth estimation for domestic waste in outdoor environments by sensors fusion

Ignacio de L. Páez-Ubieta , Edison Velasco-Sánchez , Santiago T. Puente , Francisco A. Candelas

Category: Robotics

2022-11-08

In this work, we estimate the depth at which domestic waste is located in space from a mobile robot in outdoor scenarios. Because this calculation covers a broad range (0.3 - 6.0 m), we fuse an RGB-D camera with LiDAR. With this aim and range, we compare several methods, namely average, nearest, median and center point, applied to the points inside a reduced or non-reduced Bounding Box (BB). These BBs are obtained from representative segmentation and detection methods: Yolact, SOLO, You Only Look Once (YOLO)v5, YOLOv6 and YOLOv7. Results show that applying a detection method with the average technique and a BB reduction of 40% returns the same output as segmenting the object and applying the average method. Moreover, the detection method is faster and lighter than the segmentation one. The median error in the conducted experiments was 0.0298 ± 0.0544 m.
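The "average inside a reduced bounding box" idea can be sketched as below. This is a minimal illustration on a synthetic depth image, not the authors' pipeline: camera-LiDAR fusion is not modelled, the helper names are hypothetical, and a 40% reduction is assumed to mean that each side of the box shrinks to 60% of its length about the box center.

```python
import numpy as np

def shrink_box(box, reduction):
    """Shrink (x1, y1, x2, y2) about its center; reduction=0.4 keeps 60% of each side."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * (1.0 - reduction) / 2.0
    half_h = (y2 - y1) * (1.0 - reduction) / 2.0
    return (int(round(cx - half_w)), int(round(cy - half_h)),
            int(round(cx + half_w)), int(round(cy + half_h)))

def average_depth(depth, box, reduction=0.4, d_min=0.3, d_max=6.0):
    """Mean of valid depth values (in metres) inside the reduced box."""
    x1, y1, x2, y2 = shrink_box(box, reduction)
    patch = depth[y1:y2, x1:x2]
    valid = patch[(patch >= d_min) & (patch <= d_max)]
    return float(valid.mean()) if valid.size else float("nan")

# Synthetic scene: background at 5 m, object at 2 m.
depth = np.full((100, 100), 5.0)
depth[30:70, 30:70] = 2.0
box = (20, 20, 80, 80)   # detection box is looser than the object

full = average_depth(depth, box, reduction=0.0)     # mixes in background pixels
reduced = average_depth(depth, box, reduction=0.4)  # stays on the object
```

In this toy scene the unreduced box averages object and background depths together, while the reduced box falls entirely on the object, which is the intuition behind a reduced detection box approaching the result of a segmentation mask.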


An Incremental Phase Mapping Approach for X-ray Diffraction Patterns using Binary Peak Representations

Dipendra Jha , K. V. L. V. Narayanachari , Ruifeng Zhang , Justin Liao , Denis T. Keane , Wei-keng Liao , Alok Choudhary , Yip-Wah Chung , Michael Bedzyk , Ankit Agrawal

Category: Machine Learning | Computer Vision

2022-11-08

Despite the huge advancement in knowledge discovery and data mining techniques, the X-ray diffraction (XRD) analysis process has mostly remained untouched and still involves manual investigation, comparison, and verification. Due to the large volume of XRD samples from high-throughput XRD experiments, it has become impossible for domain scientists to process them manually. Recently, they have started leveraging standard clustering techniques to reduce the number of XRD pattern representations that require manual labeling and verification. Nevertheless, these standard clustering techniques do not handle problem-specific aspects such as peak shifting, adjacent peaks, background noise, and mixed phases, and therefore produce incorrect composition-phase diagrams that complicate further steps. Here, we leverage data mining techniques along with domain expertise to handle these issues. In this paper, we introduce an incremental phase mapping approach based on binary peak representations using a new threshold-based fuzzy dissimilarity measure. The proposed approach first applies an incremental phase computation algorithm to discrete binary peak representations of XRD samples, followed by hierarchical clustering or manual merging of similar pure phases to obtain the final composition-phase diagram. We evaluate our method on the composition space of two ternary alloy systems, Co-Ni-Ta and Co-Ti-Ta. Our results are verified by domain scientists and closely resemble the manually computed ground-truth composition-phase diagrams. The proposed approach takes us closer to the goal of complete end-to-end automated XRD analysis.


A Deep Learning Approach to Generating Photospheric Vector Magnetograms of Solar Active Regions for SOHO/MDI Using SDO/HMI and BBSO Data

Haodi Jiang , Qin Li , Zhihang Hu , Nian Liu , Yasser Abduallah , Ju Jing , Genwei Zhang , Yan Xu , Wynne Hsu , Jason T. L. Wang

Category: Machine Learning

2022-11-04

Solar activity is usually caused by the evolution of solar magnetic fields. Magnetic field parameters derived from photospheric vector magnetograms of solar active regions have been used to analyze and forecast eruptive events such as solar flares and coronal mass ejections. Unfortunately, the most recent solar cycle 24 was relatively weak with few large flares, though it is the only solar cycle in which consistent time-sequence vector magnetograms have been available through the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory (SDO) since its launch in 2010. In this paper, we look into another major instrument, namely the Michelson Doppler Imager (MDI) on board the Solar and Heliospheric Observatory (SOHO) from 1996 to 2010. The data archive of SOHO/MDI covers the more active solar cycle 23 with many large flares. However, SOHO/MDI data only has line-of-sight (LOS) magnetograms. We propose a new deep learning method, named MagNet, to learn from combined LOS magnetograms, Bx and By taken by SDO/HMI along with H-alpha observations collected by the Big Bear Solar Observatory (BBSO), and to generate vector components Bx' and By', which would form vector magnetograms with observed LOS data. In this way, we can expand the availability of vector magnetograms to the period from 1996 to the present. Experimental results demonstrate the good performance of the proposed method. To our knowledge, this is the first time that deep learning has been used to generate photospheric vector magnetograms of solar active regions for SOHO/MDI using SDO/HMI and H-alpha data.


Automated segmentation of microvessels in intravascular OCT images using deep learning

Juhwan Lee , Justin N. Kim , Lia Gomez-Perez , Yazan Gharaibeh , Issam Motairek , Gabriel T. R. Pereira , Vladislav N. Zimin , Luis A. P. Dallan , Ammar Hoori , Sadeer Al-Kindi

Category: Computer Vision | Machine Learning

2022-10-01

To analyze this characteristic of vulnerability, we developed an automated deep learning method for detecting microvessels in intravascular optical coherence tomography (IVOCT) images. A total of 8,403 IVOCT image frames from 85 lesions and 37 normal segments were analyzed. Manual annotation was done using dedicated software (OCTOPUS) previously developed by our group. Data augmentation in the polar (r, θ) domain was applied to raw IVOCT images to ensure that microvessels appear at all possible angles. Pre-processing methods included guidewire/shadow detection, lumen segmentation, pixel shifting, and noise reduction. DeepLab v3+ was used to segment microvessel candidates. A bounding box on each candidate was classified as either microvessel or non-microvessel using a shallow convolutional neural network. For better classification, we used data augmentation (i.e., angle rotation) on bounding boxes with a microvessel during network training. Data augmentation and pre-processing steps improved microvessel segmentation performance significantly, yielding a method with Dice of 0.71 ± 0.10 and pixel-wise sensitivity/specificity of 87.7 ± 6.6% / 99.8 ± 0.1%. The network for classifying microvessels from candidates performed exceptionally well, with sensitivity of 99.5 ± 0.3%, specificity of 98.8 ± 1.0%, and accuracy of 99.1 ± 0.5%. The classification step eliminated the majority of residual false positives, and the Dice coefficient increased from 0.71 to 0.73. In addition, our method produced 698 image frames with microvessels present, compared to 730 from manual analysis, representing a 4.4% difference. When compared to the manual method, the automated method improved microvessel continuity, implying improved segmentation performance. The method will be useful for research purposes as well as potential future treatment planning.
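For readers unfamiliar with the reported metrics, the sketch below computes the Dice coefficient and pixel-wise sensitivity and specificity from two binary masks, and checks the frame-count difference quoted in the abstract. The masks are synthetic and the function name `segmentation_metrics` is hypothetical; the code assumes both classes are present so that no denominator is zero.

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """Return (dice, sensitivity, specificity) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)     # predicted positive, truly positive
    fp = np.sum(pred & ~truth)    # predicted positive, truly negative
    fn = np.sum(~pred & truth)    # predicted negative, truly positive
    tn = np.sum(~pred & ~truth)   # predicted negative, truly negative
    dice = 2 * tp / (2 * tp + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return float(dice), float(sensitivity), float(specificity)

# Synthetic 8x8 example: the prediction is the ground truth shifted down one row.
truth = np.zeros((8, 8), dtype=bool)
truth[2:6, 2:6] = True
pred = np.zeros((8, 8), dtype=bool)
pred[3:7, 2:6] = True
dice, sens, spec = segmentation_metrics(pred, truth)

# Frame-count difference from the abstract: 698 automated vs 730 manual frames.
frame_diff = (730 - 698) / 730
```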
