引导深度超分辨率(GDSR)是多模态图像处理中的必要主题,其在同一场景的HR RGB图像的帮助下重建与次优条件的低分辨率的高分辨率(HR)深度映射。为了解决解释工作机制的挑战,提取过度转移的跨模型特征和RGB纹理,我们提出了一种新颖的离散余弦变换网络(DCTNet)来缓解三个方面的问题。首先,离散余弦变换(DCT)模块通过使用DCT来解决来自GDSR的图像域的频道明智的优化问题来重建多通道HR深度特征。其次,我们介绍了一个半耦合特征提取模块,使用共享卷积核,以提取公共功能和私有内核,以提取特定的模态特征。第三,我们采用了边缘注意机制,以突出导致导游的轮廓。广泛的定量和定性评估表明了我们的DCTNET的有效性,这优于以前的最先进方法,具有相对较少的参数。代码将公开。
translated by 谷歌翻译
深度映射记录场景中的视点和对象之间的距离,这在许多真实应用程序中起着关键作用。然而,消费者级RGB-D相机捕获的深度图遭受了低空间分辨率。引导深度地图超分辨率(DSR)是解决此问题的流行方法,该方法试图从输入的低分辨率(LR)深度及其耦合的HR RGB图像中恢复高分辨率(HR)深度映射和作为指引。引导DSR最具挑战性的问题是如何正确选择一致的结构并传播它们,并正确处理不一致的结构。在本文中,我们提出了一种用于引导DSR的新型关注的分层多模态融合(AHMF)网络。具体地,为了有效地提取和组合来自LR深度和HR引导的相关信息,我们提出了一种基于多模态注意力的融合(MMAF)策略,包括分层卷积层,包括特征增强块,以选择有价值的功能和特征重新校准块来统一不同外观特征的方式的相似性度量。此外,我们提出了一个双向分层特征协作(BHFC)模块,以完全利用多尺度特征之间的低级空间信息和高级结构信息。实验结果表明,在重建精度,运行速度和记忆效率方面,我们的方法优于最先进的方法。
translated by 谷歌翻译
Depth map super-resolution (DSR) has been a fundamental task for 3D computer vision. While arbitrary scale DSR is a more realistic setting in this scenario, previous approaches predominantly suffer from the issue of inefficient real-numbered scale upsampling. To explicitly address this issue, we propose a novel continuous depth representation for DSR. The heart of this representation is our proposed Geometric Spatial Aggregator (GSA), which exploits a distance field modulated by arbitrarily upsampled target gridding, through which the geometric information is explicitly introduced into feature aggregation and target generation. Furthermore, bricking with GSA, we present a transformer-style backbone named GeoDSR, which possesses a principled way to construct the functional mapping between local coordinates and the high-resolution output results, empowering our model with the advantage of arbitrary shape transformation ready to help diverse zooming demand. Extensive experimental results on standard depth map benchmarks, e.g., NYU v2, have demonstrated that the proposed framework achieves significant restoration gain in arbitrary scale depth map super-resolution compared with the prior art. Our codes are available at https://github.com/nana01219/GeoDSR.
translated by 谷歌翻译
引导过滤器是计算机视觉和计算机图形中的基本工具,旨在将结构信息从引导图像传输到目标图像。大多数现有方法构造来自指导本身的滤波器内核,而不考虑指导和目标之间的相互依赖性。然而,由于两种图像中通常存在显着不同的边沿,只需将引导的所有结构信息传送到目标即将导致各种伪像。要应对这个问题,我们提出了一个名为Deep Enterponal引导图像过滤的有效框架,其过滤过程可以完全集成两个图像中包含的互补信息。具体地,我们提出了一种注意力内核学习模块,分别从引导和目标生成双组滤波器内核,然后通过在两个图像之间建模像素方向依赖性来自适应地组合它们。同时,我们提出了一种多尺度引导图像滤波模块,以粗略的方式通过所构造的内核逐渐产生滤波结果。相应地,引入了多尺度融合策略以重用中间导点在粗略的过程中。广泛的实验表明,所提出的框架在广泛的引导图像滤波应用中,诸如引导超分辨率,横向模态恢复,纹理拆除和语义分割的最先进的方法。
translated by 谷歌翻译
这项工作提出了一种新的循环架构,可以从图像中提取高频模式并将其重新插入几何特征。此过程允许我们增强一方面捕获精细细节的低成本深度传感器的分辨率,并忠于另一方面的扫描地面真相。我们为深度超分辨率任务以及视觉上有吸引力,增强的3D模型提供了最先进的结果。
translated by 谷歌翻译
This paper explores the problem of reconstructing high-resolution light field (LF) images from hybrid lenses, including a high-resolution camera surrounded by multiple low-resolution cameras. The performance of existing methods is still limited, as they produce either blurry results on plain textured areas or distortions around depth discontinuous boundaries. To tackle this challenge, we propose a novel end-to-end learning-based approach, which can comprehensively utilize the specific characteristics of the input from two complementary and parallel perspectives. Specifically, one module regresses a spatially consistent intermediate estimation by learning a deep multidimensional and cross-domain feature representation, while the other module warps another intermediate estimation, which maintains the high-frequency textures, by propagating the information of the high-resolution view. We finally leverage the advantages of the two intermediate estimations adaptively via the learned attention maps, leading to the final high-resolution LF image with satisfactory results on both plain textured areas and depth discontinuous boundaries. Besides, to promote the effectiveness of our method trained with simulated hybrid data on real hybrid data captured by a hybrid LF imaging system, we carefully design the network architecture and the training strategy. Extensive experiments on both real and simulated hybrid data demonstrate the significant superiority of our approach over state-of-the-art ones. To the best of our knowledge, this is the first end-to-end deep learning method for LF reconstruction from a real hybrid input. We believe our framework could potentially decrease the cost of high-resolution LF data acquisition and benefit LF data storage and transmission.
translated by 谷歌翻译
深度完成旨在预测从深度传感器(例如Lidars)中捕获的极稀疏图的密集像素深度。它在各种应用中起着至关重要的作用,例如自动驾驶,3D重建,增强现实和机器人导航。基于深度学习的解决方案已经证明了这项任务的最新成功。在本文中,我们首次提供了全面的文献综述,可帮助读者更好地掌握研究趋势并清楚地了解当前的进步。我们通过通过对现有方法进行分类的新型分类法提出建议,研究网络体系结构,损失功能,基准数据集和学习策略的设计方面的相关研究。此外,我们在包括室内和室外数据集(包括室内和室外数据集)上进行了三个广泛使用基准测试的模型性能进行定量比较。最后,我们讨论了先前作品的挑战,并为读者提供一些有关未来研究方向的见解。
translated by 谷歌翻译
Convolutional Neural Network (CNN)-based image super-resolution (SR) has exhibited impressive success on known degraded low-resolution (LR) images. However, this type of approach is hard to hold its performance in practical scenarios when the degradation process is unknown. Despite existing blind SR methods proposed to solve this problem using blur kernel estimation, the perceptual quality and reconstruction accuracy are still unsatisfactory. In this paper, we analyze the degradation of a high-resolution (HR) image from image intrinsic components according to a degradation-based formulation model. We propose a components decomposition and co-optimization network (CDCN) for blind SR. Firstly, CDCN decomposes the input LR image into structure and detail components in feature space. Then, the mutual collaboration block (MCB) is presented to exploit the relationship between both two components. In this way, the detail component can provide informative features to enrich the structural context and the structure component can carry structural context for better detail revealing via a mutual complementary manner. After that, we present a degradation-driven learning strategy to jointly supervise the HR image detail and structure restoration process. Finally, a multi-scale fusion module followed by an upsampling layer is designed to fuse the structure and detail features and perform SR reconstruction. Empowered by such degradation-based components decomposition, collaboration, and mutual optimization, we can bridge the correlation between component learning and degradation modelling for blind SR, thereby producing SR results with more accurate textures. Extensive experiments on both synthetic SR datasets and real-world images show that the proposed method achieves the state-of-the-art performance compared to existing methods.
translated by 谷歌翻译
具有高分辨率(HR)的磁共振成像(MRI)提供了更详细的信息,以进行准确的诊断和定量图像分析。尽管取得了重大进展,但大多数现有的医学图像重建网络都有两个缺陷:1)所有这些缺陷都是在黑盒原理中设计的,因此缺乏足够的解释性并进一步限制其实际应用。可解释的神经网络模型引起了重大兴趣,因为它们在处理医学图像时增强了临床实践所需的可信赖性。 2)大多数现有的SR重建方法仅使用单个对比度或使用简单的多对比度融合机制,从而忽略了对SR改进至关重要的不同对比度之间的复杂关系。为了解决这些问题,在本文中,提出了一种新颖的模型引导的可解释的深层展开网络(MGDUN),用于医学图像SR重建。模型引导的图像SR重建方法求解手动设计的目标函数以重建HR MRI。我们通过将MRI观察矩阵和显式多对比度关系矩阵考虑到末端到端优化期间,将迭代的MGDUN算法展示为新型模型引导的深层展开网络。多对比度IXI数据集和Brats 2019数据集进行了广泛的实验,证明了我们提出的模型的优势。
translated by 谷歌翻译
图像超分辨率(SR)是重要的图像处理方法之一,可改善计算机视野领域的图像分辨率。在过去的二十年中,在超级分辨率领域取得了重大进展,尤其是通过使用深度学习方法。这项调查是为了在深度学习的角度进行详细的调查,对单像超分辨率的最新进展进行详细的调查,同时还将告知图像超分辨率的初始经典方法。该调查将图像SR方法分类为四个类别,即经典方法,基于学习的方法,无监督学习的方法和特定领域的SR方法。我们还介绍了SR的问题,以提供有关图像质量指标,可用参考数据集和SR挑战的直觉。使用参考数据集评估基于深度学习的方法。一些审查的最先进的图像SR方法包括增强的深SR网络(EDSR),周期循环gan(Cincgan),多尺度残留网络(MSRN),Meta残留密度网络(META-RDN) ,反复反射网络(RBPN),二阶注意网络(SAN),SR反馈网络(SRFBN)和基于小波的残留注意网络(WRAN)。最后,这项调查以研究人员将解决SR的未来方向和趋势和开放问题的未来方向和趋势。
translated by 谷歌翻译
高光谱成像由于其在捕获丰富的空间和光谱信息的能力上提供了多功能应用,这对于识别物质至关重要。但是,获取高光谱图像的设备昂贵且复杂。因此,已经通过直接从低成本,更多可用的RGB图像重建高光谱信息来提出了许多替代光谱成像方法。我们详细研究了来自广泛的RGB图像的这些最先进的光谱重建方法。对25种方法的系统研究和比较表明,尽管速度较低,但大多数数据驱动的深度学习方法在重建精度和质量方面都优于先前的方法。这项全面的审查可以成为同伴研究人员的富有成果的参考来源,从而进一步启发了相关领域的未来发展方向。
translated by 谷歌翻译
Pansharpening是一种广泛使用的图像增强技术,用于遥感。其原理是熔断输入的高分辨率单通道平面(PAN)图像和低分辨率多光谱图像,并获得高分辨率多光谱(HRMS)图像。现有的深度学习泛散歌方法有两个缺点。首先,需要沿信道维度连接两个输入图像的特征以重建HRMS图像,这使得PAN图像的重要性不突出,并且还导致高计算成本。其次,通过手动设计的损耗功能难以提取特征的隐式信息。为此,我们通过用于粉彩的快速引导滤波器(FGF)提出一种生成的对抗性网络。在发电机中,传统的信道级联被FGF替换,以更好地保留空间信息,同时减少参数的数量。同时,融合对象可以通过空间注意模块突出显示。此外,通过对抗性训练可以有效地保存特征的潜在信息。许多实验说明我们的网络生成了可以超越现有方法的高质量HRMS图像,以及更少的参数。
translated by 谷歌翻译
With the development of convolutional neural networks, hundreds of deep learning based dehazing methods have been proposed. In this paper, we provide a comprehensive survey on supervised, semi-supervised, and unsupervised single image dehazing. We first discuss the physical model, datasets, network modules, loss functions, and evaluation metrics that are commonly used. Then, the main contributions of various dehazing algorithms are categorized and summarized. Further, quantitative and qualitative experiments of various baseline methods are carried out. Finally, the unsolved issues and challenges that can inspire the future research are pointed out. A collection of useful dehazing materials is available at \url{https://github.com/Xiaofeng-life/AwesomeDehazing}.
translated by 谷歌翻译
Image Super-Resolution (SR) is essential for a wide range of computer vision and image processing tasks. Investigating infrared (IR) image (or thermal images) super-resolution is a continuing concern within the development of deep learning. This survey aims to provide a comprehensive perspective of IR image super-resolution, including its applications, hardware imaging system dilemmas, and taxonomy of image processing methodologies. In addition, the datasets and evaluation metrics in IR image super-resolution tasks are also discussed. Furthermore, the deficiencies in current technologies and possible promising directions for the community to explore are highlighted. To cope with the rapid development in this field, we intend to regularly update the relevant excellent work at \url{https://github.com/yongsongH/Infrared_Image_SR_Survey
translated by 谷歌翻译
Benefiting from color independence, illumination invariance and location discrimination attributed by the depth map, it can provide important supplemental information for extracting salient objects in complex environments. However, high-quality depth sensors are expensive and can not be widely applied. While general depth sensors produce the noisy and sparse depth information, which brings the depth-based networks with irreversible interference. In this paper, we propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD). Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism promotes the model to learn the task-aware features from the auxiliary tasks. In this way, the depth information can be completed and purified. Moreover, we introduce a multi-modal filtered transformer (MFT) module, which equips with three modality-specific filters to generate the transformer-enhanced feature for each modality. The proposed model works in a depth-free style during the testing phase. Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time. And, the resulted depth map can help existing RGB-D SOD methods obtain significant performance gain. The source code will be publicly available at https://github.com/Xiaoqi-Zhao-DLUT/MMFT.
translated by 谷歌翻译
在本文中,我们提出了D2C-SR,这是一个新颖的框架,用于实现现实世界图像超级分辨率的任务。作为一个不适的问题,超分辨率相关任务的关键挑战是给定的低分辨率输入可能会有多个预测。大多数基于经典的深度学习方法都忽略了基本事实,缺乏对基础高频分布的明确建模,从而导致结果模糊。最近,一些基于GAN或学习的超分辨率空间的方法可以生成模拟纹理,但不能保证具有低定量性能的纹理的准确性。重新思考这两者,我们以离散形式了解了基本高频细节的分布,并提出了两阶段的管道:分歧阶段到收敛阶段。在发散阶段,我们提出了一个基于树的结构深网作为差异骨干。提出了发散损失,以鼓励基于树的网络产生的结果,以分解可能的高频表示,这是我们对基本高频分布进行离散建模的方式。在收敛阶段,我们分配空间权重以融合这些不同的预测,以获得更准确的细节,以获取最终输出。我们的方法为推理提供了方便的端到端方式。我们对几个现实世界基准进行评估,包括具有X8缩放系数的新提出的D2CrealSR数据集。我们的实验表明,D2C-SR针对最先进的方法实现了更好的准确性和视觉改进,参数编号明显较少,并且我们的D2C结构也可以作为广义结构应用于其他一些方法以获得改进。我们的代码和数据集可在https://github.com/megvii-research/d2c-sr上找到
translated by 谷歌翻译
面部超分辨率(FSR),也称为面部幻觉,其旨在增强低分辨率(LR)面部图像以产生高分辨率(HR)面部图像的分辨率,是特定于域的图像超分辨率问题。最近,FSR获得了相当大的关注,并目睹了深度学习技术的发展炫目。迄今为止,有很少有基于深入学习的FSR的研究摘要。在本次调查中,我们以系统的方式对基于深度学习的FSR方法进行了全面审查。首先,我们总结了FSR的问题制定,并引入了流行的评估度量和损失功能。其次,我们详细说明了FSR中使用的面部特征和流行数据集。第三,我们根据面部特征的利用大致分类了现有方法。在每个类别中,我们从设计原则的一般描述开始,然后概述代表方法,然后讨论其中的利弊。第四,我们评估了一些最先进的方法的表现。第五,联合FSR和其他任务以及与FSR相关的申请大致介绍。最后,我们设想了这一领域进一步的技术进步的前景。在\ URL {https://github.com/junjun-jiang/face-hallucination-benchmark}上有一个策划的文件和资源的策划文件和资源清单
translated by 谷歌翻译
高动态范围(HDR)成像是一种允许广泛的动态曝光范围的技术,这在图像处理,计算机图形和计算机视觉中很重要。近年来,使用深度学习(DL),HDR成像有重大进展。本研究对深层HDR成像方法的最新发展进行了综合和富有洞察力的调查和分析。在分层和结构上,将现有的深层HDR成像方法基于(1)输入曝光的数量/域,(2)学习任务数,(3)新传感器数据,(4)新的学习策略,(5)应用程序。重要的是,我们对关于其潜在和挑战的每个类别提供建设性的讨论。此外,我们审查了深度HDR成像的一些关键方面,例如数据集和评估指标。最后,我们突出了一些打开的问题,并指出了未来的研究方向。
translated by 谷歌翻译
Informative features play a crucial role in the single image super-resolution task. Channel attention has been demonstrated to be effective for preserving information-rich features in each layer. However, channel attention treats each convolution layer as a separate process that misses the correlation among different layers. To address this problem, we propose a new holistic attention network (HAN), which consists of a layer attention module (LAM) and a channel-spatial attention module (CSAM), to model the holistic interdependencies among layers, channels, and positions. Specifically, the proposed LAM adaptively emphasizes hierarchical features by considering correlations among layers. Meanwhile, CSAM learns the confidence at all the positions of each channel to selectively capture more informative features. Extensive experiments demonstrate that the proposed HAN performs favorably against the state-ofthe-art single image super-resolution approaches.
translated by 谷歌翻译
联合超分辨率和反音调映射(SR-ITM)旨在提高具有分辨率和动态范围具有质量缺陷的视频的视觉质量。当使用4K高动态范围(HDR)电视来观看低分辨率标准动态范围(LR SDR)视频时,就会出现此问题。以前依赖于学习本地信息的方法通常在保留颜色合规性和远程结构相似性方面做得很好,从而导致了不自然的色彩过渡和纹理伪像。为了应对这些挑战,我们建议联合SR-ITM的全球先验指导的调制网络(GPGMNET)。特别是,我们设计了一个全球先验提取模块(GPEM),以提取颜色合规性和结构相似性,分别对ITM和SR任务有益。为了进一步利用全球先验并保留空间信息,我们使用一些用于中间特征调制的参数,设计多个全球先验的指导空间调制块(GSMB),其中调制参数由共享的全局先验和空间特征生成来自空间金字塔卷积块(SPCB)的地图。通过这些精心设计的设计,GPGMNET可以通过较低的计算复杂性实现更高的视觉质量。广泛的实验表明,我们提出的GPGMNET优于最新方法。具体而言,我们提出的模型在PSNR中超过了0.64 dB的最新模型,其中69 $ \%$ $ $较少,3.1 $ \ times $ speedup。该代码将很快发布。
translated by 谷歌翻译