Jamdani is the strikingly patterned textile heritage of Bangladesh. The exclusive geometric motifs woven into the fabric are the most attractive part of this craftsmanship, and they have had a remarkable influence on textiles and fine art. In this paper, we develop a technique based on Generative Adversarial Networks that learns to generate entirely new Jamdani patterns from a collection of Jamdani motifs that we assembled; the newly formed motifs mimic the appearance of the original designs. Users input the skeleton of a desired pattern as rough strokes, and our system completes the input by generating a full motif that follows the geometric structure of real Jamdani ones. To this end, we collected and preprocessed a dataset containing a large number of Jamdani motif images from authentic sources via fieldwork, and applied a state-of-the-art method called pix2pix to it. To the best of our knowledge, this is currently the only available dataset of Jamdani motifs in digital format for computer vision research. Our experimental results with the pix2pix model on this dataset show satisfactory computer-generated images of Jamdani motifs, and we believe that our work will open a new avenue for further research.
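A minimal sketch of the pix2pix-style generator objective used for stroke-to-motif completion: an adversarial term plus an L1 reconstruction term on paired (stroke, motif) images. The Generator and Discriminator classes and the paired tensors are hypothetical placeholders, not the exact implementation.

    import torch
    import torch.nn.functional as F

    def generator_loss(G, D, stroke, motif, l1_weight=100.0):
        """Adversarial + L1 reconstruction loss, as in pix2pix."""
        fake_motif = G(stroke)                                  # complete the rough strokes
        pred_fake = D(torch.cat([stroke, fake_motif], dim=1))   # discriminator sees (input, output) pair
        adv = F.binary_cross_entropy_with_logits(
            pred_fake, torch.ones_like(pred_fake))              # try to fool the discriminator
        recon = F.l1_loss(fake_motif, motif)                    # stay close to the real motif
        return adv + l1_weight * recon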
Cartoons are an important part of our entertainment culture. Though drawing a cartoon is not for everyone, creating one from an arrangement of basic geometric primitives that approximates the character is a fairly common technique in art. The key motivation behind this technique is that human bodies, as well as cartoon figures, can be broken down into various basic geometric primitives, and numerous tutorials demonstrate how to draw figures from an appropriate arrangement of fundamental shapes, thus assisting us in creating cartoon characters. This technique is also very beneficial for teaching children how to draw cartoons. In this paper, we develop a tool, shape2toon, that aims to automate this approach by utilizing a generative adversarial network that takes an arrangement of geometric primitives (e.g., circles) and generates a cartoon figure (e.g., Mickey Mouse) matching the given approximation. For this purpose, we created a dataset of geometrically represented cartoon characters. We apply an image-to-image translation technique to our dataset and report the results in this paper. The experimental results show that our system can generate cartoon characters from an input layout of geometric shapes. In addition, we demonstrate a web-based tool as a practical implication of our work.
Rendering programs have completely changed the design process, since they make it possible to see what a product will look like before it is manufactured. However, rendering is complex and time-consuming, not only in the rendering itself but also in setting up the scene: materials, lights, and cameras must all be configured to obtain the best-quality results, and the optimal output is often not achieved on the first render. All of this makes rendering a tedious process. Since Goodfellow et al. introduced Generative Adversarial Networks (GANs) in 2014 [1], they have been used to generate computer-assigned synthetic data, from faces of non-existent people to medical data analysis and image style transfer. GANs have been used to transfer image textures from one domain to another; however, paired data from both domains was required. Zhu et al. introduced the CycleGAN model, eliminating this costly constraint and allowing an image to be translated from one domain to another without paired data. This work validates the applicability of CycleGANs to style transfer from an initial sketch to a final 2D render that represents a 3D design, a crucial step in every product-design process. We investigate the possibility of integrating CycleGANs into the design pipeline, more precisely, applied to the rendering of ring designs. Our contribution addresses a key part of the process, since it allows the customer to see the final product before purchase. This work sets a basis for future research, demonstrating the possibilities of GANs in design and establishing a starting point for novel applications that bring rendering closer to craft design.
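A minimal sketch of the cycle-consistency term that removes the paired-data requirement, assuming hypothetical generators G (sketch to render) and F_inv (render to sketch); CycleGAN combines this with adversarial losses in each domain.

    import torch.nn.functional as F

    def cycle_loss(G, F_inv, sketch, render, weight=10.0):
        forward = F.l1_loss(F_inv(G(sketch)), sketch)    # sketch -> render -> sketch
        backward = F.l1_loss(G(F_inv(render)), render)   # render -> sketch -> render
        return weight * (forward + backward)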
A sketch is a medium for conveying a visual scene from an individual's creative perspective, and adding color substantially enhances its overall expressiveness. This paper proposes two methods for imitating human-drawn colorized sketches by leveraging a contour-drawing dataset. Our first method renders colorized contour sketches by applying image-processing techniques aided by k-means color clustering. The second method uses a generative adversarial network to develop a model that can generate colorized sketches from previously unobserved images. We evaluate the results obtained through both quantitative and qualitative assessments.
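A minimal sketch of the palette step in the first method, assuming illustrative file names: k-means finds the dominant colors of a reference photo, which are then laid under the contour strokes.

    import cv2
    import numpy as np
    from sklearn.cluster import KMeans

    photo = cv2.imread("reference.jpg")
    pixels = photo.reshape(-1, 3).astype(np.float32)
    kmeans = KMeans(n_clusters=8, n_init=10).fit(pixels)     # 8 dominant colors
    palette = kmeans.cluster_centers_.astype(np.uint8)
    quantized = palette[kmeans.labels_].reshape(photo.shape) # color-quantized photo

    sketch = cv2.imread("contour.png", cv2.IMREAD_GRAYSCALE)
    quantized = cv2.resize(quantized, (sketch.shape[1], sketch.shape[0]))
    colored = quantized.copy()
    colored[sketch < 128] = 0                                # keep contour strokes in black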
Objectives. We generate artificial leaf images in an automated way using advanced deep learning (DL) techniques. Our goal is to provide a source of training samples for AI applications in modern crop management. Such applications require large amounts of data, and while leaf images are not truly scarce, image collection and annotation remain a very time-consuming process. Data scarcity can be addressed by augmentation techniques based on simple transformations of samples from small datasets, but the richness of augmented data is limited, which motivates the search for alternative approaches. Methods. Pursuing an approach based on DL generative models, we propose a leaf-to-leaf translation (L2L) procedure structured in two steps. First, a residual variational autoencoder architecture generates synthetic leaf skeletons (leaf contour and veins), starting from companion binarized skeletons of real images. In the second step, we perform translation via a pix2pix framework, which uses a conditional generative adversarial network to reproduce the colorization of the leaf blade while preserving its shape and venation pattern. Results. The L2L procedure produces synthetic images of leaves with a realistic appearance. We assess performance both qualitatively and quantitatively; for the latter evaluation, we employ a DL anomaly-detection strategy that quantifies the degree of anomaly of synthetic leaves with respect to real samples. Conclusions. Generative DL approaches have the potential to become a new paradigm for providing low-cost, meaningful synthetic samples for computer-assisted applications. The proposed L2L approach represents a step towards this goal, being able to generate synthetic samples that are qualitatively and quantitatively similar to real leaves.
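A minimal sketch of the two-step L2L pipeline, with hypothetical stand-ins for the variational autoencoder decoder (step 1) and the pix2pix generator (step 2); the actual models and latent dimensionality will differ.

    import torch

    def generate_leaf(vae_decoder, pix2pix_G, z_dim=128):
        z = torch.randn(1, z_dim)       # sample a latent code
        skeleton = vae_decoder(z)       # step 1: synthetic binary skeleton (contour + veins)
        leaf = pix2pix_G(skeleton)      # step 2: add realistic blade colorization
        return leaf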
Figure 1 (panels: Labels to Facade; BW to Color; Aerial to Map; Labels to Street Scene; Edges to Photo; Day to Night): Many problems in image processing, graphics, and vision involve translating an input image into a corresponding output image. These problems are often treated with application-specific algorithms, even though the setting is always the same: map pixels to pixels. Conditional adversarial nets are a general-purpose solution that appears to work well on a wide variety of these problems. Here we show results of the method on several. In each case we use the same architecture and objective, and simply train on different data.
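For reference, the pix2pix objective behind these results combines a conditional adversarial loss with an L1 reconstruction term:

    \mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}[\log D(x, y)] + \mathbb{E}_{x,z}[\log(1 - D(x, G(x, z)))]

    G^* = \arg\min_G \max_D \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_1\big]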
Automatic colorization of images without human intervention has been a topic of interest in the machine learning community for some time. Assigning colors to an image is a highly ill-posed problem because of its innately high degrees of freedom; given an image, there is often no single correct color combination. Besides colorization, another problem in image reconstruction is single-image super-resolution, which aims to transform low-resolution images into higher-resolution ones. This research aims to provide an automated approach for both tasks by focusing on a very specific class of images, namely astronomical images, and using generative adversarial networks (GANs). We explore the use of various models in two different color spaces, RGB and L*a*b*. Owing to the small dataset, we use transfer learning with a pre-trained ResNet-18 as the backbone, i.e., the encoder of the U-Net, which is further fine-tuned. The model produces visually appealing images that render high-resolution, colorized data in regions not present in the original image. We report our results by evaluating the GANs with distance metrics such as the L1 and L2 distances in each color space across all channels, to provide a comparative analysis, and we use the Fréchet Inception Distance (FID) to compare the distribution of the generated images with the distribution of the real images and thereby assess model performance.
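A minimal sketch of the FID evaluation used here: the distance between two Gaussians fitted to Inception features of real and generated images. The feature arrays are assumed to be precomputed.

    import numpy as np
    from scipy.linalg import sqrtm

    def fid(real_feats, fake_feats):
        mu1, mu2 = real_feats.mean(0), fake_feats.mean(0)
        s1 = np.cov(real_feats, rowvar=False)
        s2 = np.cov(fake_feats, rowvar=False)
        covmean = sqrtm(s1 @ s2)
        if np.iscomplexobj(covmean):        # numerical noise can yield tiny imaginary parts
            covmean = covmean.real
        return float(np.sum((mu1 - mu2) ** 2) + np.trace(s1 + s2 - 2 * covmean))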
For satellite images, the presence of clouds is a problem, as clouds can obscure more than half to two-thirds of the ground information. This causes many issues for applications that require reliable, noise-free data communication and seamless monitoring. Removing the clouds from the images while keeping the background pixels intact can help address these issues. Recently, deep learning methods have become popular for research on cloud removal, demonstrating promising results, among which Generative Adversarial Networks (GANs) have shown considerably better performance. In this project, we address cloud removal from satellite images using AttentionGAN and then compare our results against reproductions of the results obtained using traditional GANs and auto-encoders. We use the RICE dataset. The outcome of this project can be used to develop applications that require cloud-free satellite images, and our results could be helpful for further research improvements.
Our goal with this survey is to provide an overview of the state-of-the-art deep learning technologies for face generation and editing. We cover the latest popular architectures and discuss key ideas that make them work, such as inversion, latent representation, loss functions, training procedures, editing methods, and cross-domain style transfer. We particularly focus on GAN-based architectures that have culminated in the StyleGAN approaches, which allow generation of high-quality face images and offer rich interfaces for controllable semantics editing while preserving photo quality. We aim to provide an entry point into the field for readers who have basic knowledge of deep learning and are looking for an accessible introduction and overview.
Generative adversarial networks (GANs) provide a way to learn deep representations without extensively annotated training data. They achieve this through deriving backpropagation signals through a competitive process involving a pair of networks. The representations that can be learned by GANs may be used in a variety of applications, including image synthesis, semantic image editing, style transfer, image super-resolution and classification. The aim of this review paper is to provide an overview of GANs for the signal processing community, drawing on familiar analogies and concepts where possible. In addition to identifying different methods for training and constructing GANs, we also point to remaining challenges in their theory and application.
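The competitive process mentioned above is the minimax game of the original GAN formulation, in which a discriminator D and a generator G optimize a shared value function:

    \min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]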
Many image-to-image translation problems are ambiguous, as a single input image may correspond to multiple possible outputs. In this work, we aim to model a distribution of possible outputs in a conditional generative modeling setting. The ambiguity of the mapping is distilled in a low-dimensional latent vector, which can be randomly sampled at test time. A generator learns to map the given input, combined with this latent code, to the output. We explicitly encourage the connection between output and the latent code to be invertible. This helps prevent a many-to-one mapping from the latent code to the output during training, also known as the problem of mode collapse, and produces more diverse results. We explore several variants of this approach by employing different training objectives, network architectures, and methods of injecting the latent code. Our proposed method encourages bijective consistency between the latent encoding and output modes. We present a systematic comparison of our method and other variants on both perceptual realism and diversity.
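The invertibility constraint described here can be written as a latent-regression term, in which an encoder E must recover the injected code from the generated output:

    \mathcal{L}_{\text{latent}} = \mathbb{E}_{x,\, z \sim p(z)} \big[ \lVert z - E(G(x, z)) \rVert_1 \big]

Penalizing this recovery error prevents the generator from ignoring the latent code, which is what collapses many latent samples onto one output mode.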
The goal of this paper is a comprehensive study of the facial sketch synthesis (FSS) problem. However, owing to the high cost of obtaining hand-drawn sketch datasets, there has been a lack of a complete benchmark for evaluating the development of FSS algorithms over the last decade. We therefore first introduce a high-quality dataset for FSS, named FS2K, which consists of 2,104 image-sketch pairs spanning three types of sketch styles, image backgrounds, lighting conditions, skin colors, and facial attributes. FS2K differs from previous FSS datasets in difficulty, diversity, and scalability, and should thus facilitate progress in FSS research. Second, we present a review of FSS by investigating 139 classical methods, including 34 handcrafted-feature facial sketch synthesis approaches, 37 general neural style transfer methods, 43 deep image-to-image translation methods, and 35 image-to-sketch approaches. In addition, we elaborate comprehensive experiments on 19 existing cutting-edge models. Third, we present a simple baseline for FSS, named FSGAN. With only two straightforward components, namely facial-aware masking and style-vector expansion, FSGAN surpasses the performance of all previous state-of-the-art models on the proposed FS2K dataset by a large margin. Finally, we summarize the lessons learned over the past years and point out several unsolved challenges. Our open-source code is available at https://github.com/dengpingfan/fsgan.
Generative Adversarial Networks (GANs) have facilitated a new direction for tackling image-to-image translation problems. Different GANs use different combinations of generator and discriminator networks with different losses in the objective function. There is still a gap to fill in terms of the quality of the generated images and their closeness to the ground-truth images. In this work, we introduce a new image-to-image translation network, named the Cycled Discriminative Generative Adversarial Network (CDGAN), to fill this gap. In addition to the original CycleGAN architecture, the proposed CDGAN generates higher-quality and more realistic images by incorporating additional discriminator networks for the cycled images. The proposed CDGAN is tested on three image-to-image translation datasets. The quantitative and qualitative results are analyzed and compared with state-of-the-art methods, which CDGAN outperforms on all three baseline datasets. The code is available at https://github.com/kishankancharagunta/cdgan.
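A plausible form of the extra term is an adversarial loss applied to the cycle-reconstructed images, with a dedicated discriminator D_cyc (the paper's exact formulation and weighting may differ):

    \mathcal{L}_{\text{cyc-adv}} = \mathbb{E}_{y}[\log D_{\text{cyc}}(y)] + \mathbb{E}_{x}\big[\log\big(1 - D_{\text{cyc}}(F(G(x)))\big)\big]

Here G and F are the two CycleGAN generators, so F(G(x)) is the cycled reconstruction of x; judging the reconstruction adversarially, rather than only by an L1 cycle loss, pushes it towards the real-image distribution.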
Generative Adversarial Networks (GANs) were introduced by Goodfellow in 2014, and since then have become popular for constructing generative artificial intelligence models. However, such networks have numerous drawbacks, such as long training times, sensitivity to hyperparameter tuning and to the choice of loss and optimization functions, and other difficulties such as mode collapse. Current applications of GANs include generating photo-realistic human faces, animals, and objects. However, I wanted to explore the artistic ability of GANs in more detail by using existing models and learning from them. This dissertation covers the basics of neural networks and works its way up to the particular aspects of GANs, together with experimentation on and modification of existing available models, from least complex to most. The intention is to see whether state-of-the-art GANs (specifically StyleGAN2) can generate album art covers and whether it is possible to tailor them by genre. This was attempted by first familiarizing myself with three existing GAN architectures, including the state-of-the-art StyleGAN2. The StyleGAN2 code was used to train a model on a dataset containing 80K album cover images, which was then used to stylize images by picking curated images and mixing their styles.
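A minimal sketch of the style-mixing step used to blend covers, assuming hypothetical mapping and synthesis stand-ins for the StyleGAN2 networks (the official API differs):

    import torch

    def style_mix(mapping, synthesis, num_layers=14, crossover=6):
        w1 = mapping(torch.randn(1, 512))   # intermediate styles from cover A
        w2 = mapping(torch.randn(1, 512))   # intermediate styles from cover B
        # coarse layers (layout, composition) follow A; fine layers (texture, color) follow B
        ws = [w1 if i < crossover else w2 for i in range(num_layers)]
        return synthesis(ws)                # mixed album cover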
With the rapid developments in the offshore sector, and in the scientific community's underwater operations, underwater vehicles have become more sophisticated. Notably, many underwater tasks, including the assessment of subsea infrastructure, are carried out with the aid of autonomous underwater vehicles (AUVs). There have been recent breakthroughs in artificial intelligence (AI), especially in deep learning (DL) models and applications, which are widely used in a variety of fields, including aerial drones, autonomous car navigation, and other applications. However, they are not prevalent in underwater applications because of the difficulty of obtaining underwater datasets for specific applications. In this sense, the current study leverages recent advances in the field of DL to construct a bespoke dataset generated from photographs of items captured in a laboratory environment. Generative adversarial networks (GANs) were used to translate the laboratory object dataset into the underwater domain by combining the collected images with photographs containing underwater environments. The findings demonstrate the feasibility of creating such a dataset, as the resulting images closely resemble real underwater environments when compared with real-world underwater ship-hull images. Therefore, artificial datasets of underwater environments can overcome the difficulties arising from the limited access to real underwater images, and can be used to enhance underwater operations through underwater object image classification and detection.
Recent methods such as MaterialGAN have used unconditional GANs to generate per-pixel material maps, or to serve as a prior for reconstructing materials from input photographs. These models can generate varied random material appearances, but lack any mechanism to constrain the generated material to a specific category or to control the coarse structure of the generated material, such as the exact brick layout on a brick wall. Furthermore, materials reconstructed from a single input photo commonly have artifacts and are generally not tileable, which limits their use in practical content-creation pipelines. We present TileGen, a generative model for SVBRDFs that is specific to a material category, always tileable, and optionally conditioned on a provided input structure pattern. TileGen is a variant of StyleGAN whose architecture is modified to always produce tileable (periodic) material maps. In addition to the standard "style" latent code, TileGen can optionally take a condition image, giving the user direct control over the dominant spatial (and optionally color) features of the material. For example, in brick materials, the user can specify the brick layout and brick color, or in leather materials, the locations of wrinkles and folds. Our inverse-rendering approach can find a material that perceptually matches a single target photograph via optimization. This reconstruction can also be conditioned on a user-provided pattern. The resulting materials are tileable, can be larger than the target image, and can be edited by varying the condition.
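A minimal sketch of why wrap-around (circular) convolutions yield tileable outputs: every spatial operation treats the image as a torus, so the left edge continues seamlessly into the right. TileGen modifies StyleGAN's layers along these lines; this snippet is only an illustration, not the paper's code.

    import torch
    import torch.nn as nn

    conv = nn.Conv2d(64, 64, kernel_size=3, padding=1, padding_mode="circular")
    x = torch.randn(1, 64, 256, 256)
    y = conv(x)
    tiled = torch.cat([y, y], dim=3)   # repeating y horizontally stays seam-free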
Realistic image manipulation is challenging because it requires modifying the image appearance in a user-controlled way, while preserving the realism of the result. Unless the user has considerable artistic skill, it is easy to "fall off" the manifold of natural images while editing. In this paper, we propose to learn the natural image manifold directly from data using a generative adversarial neural network. We then define a class of image editing operations, and constrain their output to lie on that learned manifold at all times. The model automatically adjusts the output keeping all edits as realistic as possible. All our manipulations are expressed in terms of constrained optimization and are applied in near-real time. We evaluate our algorithm on the task of realistic photo manipulation of shape and color. The presented method can further be used for changing one image to look like the other, as well as generating novel imagery from scratch based on the user's scribbles.
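A minimal sketch of the manifold-projection idea, assuming a hypothetical pretrained generator G: optimize the latent code z so that G(z) matches the edited image. The actual system adds feature-space and smoothness terms to the objective.

    import torch

    def project(G, target, steps=200, lr=0.05):
        z = torch.randn(1, 100, requires_grad=True)      # latent code to optimize
        opt = torch.optim.Adam([z], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss = torch.nn.functional.l1_loss(G(z), target)
            loss.backward()
            opt.step()
        return G(z).detach()                             # realistic image near the edits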
Space exploration has always been a source of inspiration for humankind, and thanks to modern telescopes it is now possible to observe celestial bodies far away from us. With the growing number of real and fictional images of space available on the web, and by exploiting modern deep learning architectures such as generative adversarial networks, it is now possible to generate new representations of space. In this study, using a lightweight GAN, a dataset of images obtained from the web, and the Galaxy Zoo dataset, we have generated thousands of new images of celestial bodies, galaxies, and, finally, views of the universe. The code to reproduce our results is publicly available at https://github.com/davide-ccomini/ganiverse, and the generated images can be explored at https://davide-ccomini.github.io/goccomiverse/.
This paper introduces the concept of image "culturization", defined as the process of altering the "brushstroke of cultural features" so that an object is perceived as belonging to a given culture while preserving its functionality. First, we propose a pipeline based on generative adversarial networks (GANs) to translate objects from a source to a target cultural domain. Then, through an online questionnaire, we collect data to test four hypotheses concerning the preferences of Italian participants towards objects and environments belonging to different cultures. As expected, the results depend on individual tastes and preferences; however, they are in line with our conjecture that some people, during interaction with robots or other intelligent systems, might prefer to be shown images whose cultural domain has been modified to match their cultural background.