Since higher-order tensors are naturally suited to representing multi-dimensional data in the real world, e.g., color images and videos, low-rank tensor representation has become one of the emerging areas in machine learning and computer vision. However, classical low-rank tensor representations can only represent data on a finite meshgrid due to their intrinsically discrete nature, which hinders their potential applicability in many scenarios beyond the meshgrid. To break this barrier, we propose a low-rank tensor function representation (LRTFR) that can continuously represent data beyond the meshgrid with infinite resolution. Specifically, the suggested tensor function, which maps an arbitrary coordinate to the corresponding value, can continuously represent data in an infinite real space. In parallel with discrete tensors, we develop two fundamental concepts for tensor functions, i.e., the tensor function rank and low-rank tensor function factorization. We theoretically justify that both low-rank and smooth regularizations are harmoniously unified in LRTFR, which leads to high effectiveness and efficiency for continuous data representation. Extensive multi-dimensional data recovery applications arising from image processing (image inpainting and denoising), machine learning (hyperparameter optimization), and computer graphics (point cloud upsampling) substantiate the superiority and versatility of our method compared with state-of-the-art methods. In particular, the experiments beyond the original meshgrid resolution (hyperparameter optimization) or even beyond the meshgrid (point cloud upsampling) validate the favorable performance of our method for continuous representation.
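As a rough illustration of the idea, the sketch below (in PyTorch, with illustrative names and sizes, not the authors' implementation) builds a CP-style low-rank tensor function from three per-mode factor MLPs and queries it at arbitrary, off-grid coordinates:

```python
# A minimal sketch of low-rank tensor function factorization:
# f(x, y, z) = sum_r fx_r(x) * fy_r(y) * fz_r(z), with each per-mode
# factor function parameterized by a small MLP. Names (FactorMLP,
# rank R, layer sizes) are assumptions, not the paper's code.
import torch
import torch.nn as nn

class FactorMLP(nn.Module):
    """Maps a scalar coordinate to an R-dimensional factor vector."""
    def __init__(self, rank, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, rank),
        )

    def forward(self, t):          # t: (N, 1)
        return self.net(t)         # (N, R)

class LowRankTensorFunction(nn.Module):
    def __init__(self, rank=8):
        super().__init__()
        self.fx, self.fy, self.fz = (FactorMLP(rank) for _ in range(3))

    def forward(self, coords):     # coords: (N, 3), arbitrary real values
        x, y, z = coords.split(1, dim=1)
        # CP-style contraction across the three factor functions.
        return (self.fx(x) * self.fy(y) * self.fz(z)).sum(dim=1)

f = LowRankTensorFunction(rank=8)
values = f(torch.rand(1024, 3))    # queried beyond any meshgrid
```

The low-rank structure lives in the small rank R of the contraction, while smoothness is inherited from the continuous MLP factors, which is the unification the abstract refers to.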
It is known that the decomposition into low-rank and sparse matrices (\textbf{L+S} for short) can be achieved by several robust PCA techniques. Besides low-rankness, local smoothness (\textbf{LSS}) is a vitally important prior for many real-world matrix data such as hyperspectral images and surveillance videos, which makes such matrices possess low-rankness and local smoothness properties at the same time. This poses an interesting question: can we achieve a matrix decomposition in the \textbf{L\&LSS+S} form exactly? To address this issue, we propose in this paper a new RPCA model based on three-dimensional correlated total variation regularization (3DCTV-RPCA for short) by fully exploiting and encoding the prior expression underlying such joint low-rank and local-smoothness matrices. Specifically, using a modification of the golfing scheme, we prove that under some mild assumptions, the proposed 3DCTV-RPCA model can decompose both components exactly, which should be the first theoretical guarantee among all related methods combining low-rankness and local smoothness. In addition, by utilizing the fast Fourier transform (FFT), we propose an efficient ADMM algorithm with a solid convergence guarantee for solving the resulting optimization problem. Finally, a series of experiments on both simulations and real applications are carried out to demonstrate the general validity of the proposed 3DCTV-RPCA model.
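To make the ingredients concrete, here is a hedged sketch of the two proximal operators such an ADMM solver alternates between, together with an FFT-based circular difference of the kind that makes TV subproblems cheap; this is an illustrative reading of the model, not the authors' exact algorithm:

```python
# Singular-value thresholding (SVT) promotes low-rankness of the
# gradient maps; soft-thresholding promotes sparsity of the outlier
# component. Parameter values are placeholders.
import numpy as np

def svt(M, tau):
    """Proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0)) @ Vt

def soft(M, tau):
    """Proximal operator of the l1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0)

def diff_fft(X, axis):
    """Circular first-order difference along `axis`, via the FFT
    diagonalization that keeps the TV subproblem cheap."""
    n = X.shape[axis]
    kernel = np.eye(1, n, 0).ravel() - np.eye(1, n, 1).ravel()
    k = np.fft.fft(kernel)
    shape = [1] * X.ndim; shape[axis] = n
    Xf = np.fft.fft(X, axis=axis)
    return np.real(np.fft.ifft(Xf * k.reshape(shape), axis=axis))

# One conceptual step: low-rank on each directional gradient, sparse S.
X = np.random.rand(32, 32, 16)                   # e.g. a video or HSI cube
L_grad = [svt(diff_fft(X, ax).reshape(32, -1), tau=0.1) for ax in (0, 1, 2)]
S = soft(X, tau=0.05)
```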
Recovering color images and videos from highly undersampled data is a fundamental and challenging task in face recognition and computer vision. Owing to the multi-dimensional nature of color images and videos, in this paper we propose a novel tensor completion approach that can efficiently explore the sparsity of tensor data under the discrete cosine transform (DCT). Specifically, we introduce two ``sparse plus low-rank'' tensor completion models, together with two implementable algorithms for finding their solutions. The first is a DCT-based sparse plus weighted nuclear norm induced low-rank minimization model. The second is a DCT-based sparse plus $p$-shrinking mapping induced low-rank optimization model. Moreover, we accordingly propose two implementable augmented Lagrangian algorithms to solve the underlying optimization models. A series of numerical experiments, including color image inpainting and video data recovery, demonstrate that our proposed approach performs better than many existing state-of-the-art tensor completion methods, especially for the case where the ratio of missing data is high.
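A hedged sketch of the ``sparse plus low-rank'' split follows: soft-thresholding in the 3-D DCT domain recovers the sparse part, and weighted singular-value thresholding on an unfolding recovers the low-rank part. The threshold values and the weighting rule are assumptions for illustration:

```python
import numpy as np
from scipy.fft import dctn, idctn

def soft(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def weighted_svt(M, tau, eps=1e-6):
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    w = 1.0 / (s + eps)                  # larger singular values shrunk less
    return U @ np.diag(np.maximum(s - tau * w, 0.0)) @ Vt

T = np.random.rand(32, 32, 3)            # e.g. a color image patch
# Sparse part: threshold in the 3-D DCT domain.
S = idctn(soft(dctn(T, norm='ortho'), tau=0.02), norm='ortho')
# Low-rank part: weighted SVT on the mode-1 unfolding of the residual.
L = weighted_svt((T - S).reshape(32, -1), tau=0.1).reshape(T.shape)
```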
Low-rank tensor completion has been widely used in computer vision and machine learning. This paper develops a novel multi-modal core tensor factorization (MCTF) method combined with a tensor low-rankness measure and a better nonconvex relaxation form of this measure (NC-MCTF). The proposed models encode the low-rank insights for general tensors provided by Tucker and T-SVD, and are thus expected to simultaneously model spectral low-rankness in multiple orientations and to accurately restore data with intrinsic low-rank structure from few observed entries. Furthermore, we study the MCTF and NC-MCTF regularization minimization problems and design an effective block successive upper-bound minimization (BSUM) algorithm to solve them. This efficient solver allows MCTF to be extended to various tasks, such as tensor completion. A series of experiments, including hyperspectral image (HSI), video and MRI completion, confirm the superior performance of the proposed method.
Seismic data often suffers from severe noise due to environmental factors, which seriously affects subsequent applications. Traditional hand-crafted denoisers such as filters and regularizations utilize interpretable domain knowledge to design generalizable denoising techniques, but their representation capacities may be inferior to those of deep learning denoisers, which can learn complex and representative denoising mappings from abundant training pairs. However, due to the scarcity of high-quality training pairs, deep learning denoisers may suffer from generalization issues across various scenarios. In this work, we propose a self-supervised method that combines the capacity of a deep denoiser and the generalization ability of a hand-crafted regularization for seismic data random noise attenuation. Specifically, we leverage the Self2Self (S2S) learning framework with a trace-wise masking strategy for seismic data denoising, using solely the observed noisy data. In parallel, we employ the weighted total variation (WTV) to further capture the horizontal local smooth structure of seismic data. Our method, dubbed S2S-WTV, enjoys both the high representation ability brought by the self-supervised deep network and the good generalization ability of the hand-crafted WTV regularizer. Therefore, our method can more effectively and stably remove random noise while preserving the details and edges of the clean signal. To tackle the S2S-WTV optimization model, we introduce an alternating direction method of multipliers (ADMM)-based algorithm. Extensive experiments on synthetic and field noisy seismic data demonstrate the effectiveness of our method compared with state-of-the-art traditional and deep learning-based seismic data denoising methods.
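The sketch below illustrates the S2S-style trace-wise masking objective with a weighted-TV term, assuming a (time x traces) gather layout; the denoiser architecture and the WTV weighting scheme are placeholders, not the paper's exact choices:

```python
import torch
import torch.nn as nn

denoiser = nn.Sequential(          # stand-in for the deep denoiser
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)

def s2s_wtv_loss(noisy, mask, lam=0.1):
    """noisy: (1, 1, T, N) gather; mask: 1 on kept traces, 0 on hidden."""
    pred = denoiser(noisy * mask)
    # Self-supervised term: predict the traces that were masked out.
    data_term = (((pred - noisy) * (1 - mask)) ** 2).mean()
    # Weighted TV along the trace (horizontal) direction.
    dh = pred[..., :, 1:] - pred[..., :, :-1]
    w = 1.0 / (dh.detach().abs() + 1e-3)   # weaker penalty on strong edges
    return data_term + lam * (w * dh.abs()).mean()

noisy = torch.randn(1, 1, 128, 64)
mask = (torch.rand(1, 1, 1, 64) > 0.1).float().expand_as(noisy)
loss = s2s_wtv_loss(noisy, mask)
loss.backward()
```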
Recently, the low-rank properties of different components extracted from images have been considered in many hyperspectral image denoising methods. However, these methods usually unfold the 3D tensor into matrices or 1D vectors to exploit prior information, such as nonlocal spatial self-similarity (NSS) and global spectral correlation (GSC), which breaks the intrinsic structural correlation of the hyperspectral image (HSI) and thus leads to poor restoration quality. In addition, most of them suffer from a heavy computational burden due to the involvement of singular value decompositions of matrices and tensors in the original high-dimensional space of the HSI. We incorporate subspace representation and weighted low-rank tensor regularization (SWLRTR) into a model to remove mixed noise from hyperspectral images. Specifically, to exploit the GSC among spectral bands, the noisy HSI is projected into a low-dimensional subspace, which simplifies the computation. Then, a weighted low-rank tensor regularization term is introduced to characterize the priors in the reduced image subspace. Moreover, we design an algorithm based on alternating minimization to solve the nonconvex problem. Experiments on simulated and real datasets demonstrate that the SWLRTR method performs better than other hyperspectral denoising methods both quantitatively and visually.
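A brief sketch of the subspace step: project the noisy HSI onto a low-dimensional spectral subspace estimated by SVD, so that the subsequent weighted low-rank tensor regularization operates on a much smaller reduced image. The subspace dimension k below is an assumption for illustration:

```python
import numpy as np

Y = np.random.rand(64, 64, 31)                 # noisy HSI (H, W, bands)
Y2 = Y.reshape(-1, 31)                         # pixels x bands
_, _, Vt = np.linalg.svd(Y2, full_matrices=False)
E = Vt[:8].T                                   # (31, k) spectral basis, k = 8
Z = (Y2 @ E).reshape(64, 64, 8)                # reduced image to regularize
# ... denoise Z with the weighted low-rank tensor term, then lift back:
Y_hat = (Z.reshape(-1, 8) @ E.T).reshape(Y.shape)
```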
Deep learning-based hyperspectral image (HSI) restoration methods have gained great popularity for their remarkable performance, but they usually require expensive network retraining whenever the details of the task change. In this paper, we propose to restore HSIs in a unified approach with an effective plug-and-play method, which can jointly retain the flexibility of optimization-based methods and exploit the powerful representation capability of deep neural networks. Specifically, we first develop a new deep HSI denoiser leveraging gated recurrent units, short- and long-term skip connections, and an augmented noise-level map to better exploit the abundant spatio-spectral information within HSIs. This leads to state-of-the-art performance on HSI denoising under both Gaussian and complex noise settings. The proposed denoiser is then plugged into a plug-and-play framework as a powerful prior for handling various HSI restoration tasks. Through extensive experiments on HSI super-resolution, compressed sensing and inpainting, we demonstrate that our approach often achieves superior performance, which is competitive with or even better than the state-of-the-art on each task, while requiring no task-specific training.
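Here is a minimal half-quadratic-splitting sketch of the plug-and-play scheme: alternate a data-fidelity proximal step (closed form for inpainting with a known mask) with a call to the learned denoiser. The `denoiser` below is a placeholder for any pretrained HSI denoiser; iteration counts and parameters are assumptions:

```python
import numpy as np

def denoiser(x, sigma):
    # Placeholder prior step; in the paper this is the deep HSI denoiser
    # conditioned on a noise-level map.
    return x  # identity stand-in

def pnp_hqs_inpaint(y, mask, rho=1.0, sigma=0.05, iters=30):
    x = y.copy()
    z = y.copy()
    for _ in range(iters):
        # Data step (closed form for masked observations):
        x = (mask * y + rho * z) / (mask + rho)
        # Prior step (plug in the denoiser):
        z = denoiser(x, sigma)
    return z

y = np.random.rand(32, 32, 31)             # degraded HSI
mask = (np.random.rand(*y.shape) > 0.8).astype(float)
restored = pnp_hqs_inpaint(y * mask, mask)
```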
Nonconvex relaxation methods have been widely used in tensor recovery problems and can achieve better recovery results than convex relaxation methods. In this paper, a new nonconvex function, the minimax logarithmic concave penalty (MLCP) function, is proposed, and some of its intrinsic properties are analyzed, among which it is interesting to find that the logarithmic function is an upper bound of the MLCP function. The proposed function is generalized to the tensor case, yielding the tensor MLCP and the weighted tensor $L\gamma$-norm. Note that their explicit solutions cannot be obtained when they are applied directly to tensor recovery problems. Therefore, the corresponding equivalence theorems for solving such problems are given, namely, the tensor equivalent MLCP theorem and the equivalent weighted tensor $L\gamma$-norm theorem. In addition, we propose two EMLCP-based models for the classic tensor recovery problems, namely low-rank tensor completion (LRTC) and tensor robust principal component analysis (TRPCA), and design proximal alternating linearized minimization (PALM) algorithms to solve them individually. Furthermore, based on the Kurdyka-{\L}ojasiewicz property, it is proved that the solution sequences of the proposed algorithms have finite length and converge globally to critical points. Finally, extensive experiments show that the proposed algorithms achieve good results, and confirm that the MLCP function is indeed better than the logarithmic function in the minimization problem, which is consistent with the analysis of the theoretical properties.
Tensor sparse modeling is a promising approach that has achieved great success throughout science and engineering. As is well known, various data in practical applications are often generated by multiple factors, so tensors are used to represent data containing the internal structure of multiple factors. However, unlike the matrix case, constructing a reasonable sparsity measure for tensors is a relatively difficult and very important task. Therefore, in this paper, we propose a new tensor sparsity measure called the tensor full feature measure (FFM). It can simultaneously describe the feature information of each dimension of the tensor and the correlated features between any two dimensions, and it connects the Tucker rank with the tensor tube rank. This measure describes the sparse features of a tensor more comprehensively. On this basis, we establish its nonconvex relaxation and apply FFM to low-rank tensor completion (LRTC) and tensor robust principal component analysis (TRPCA). FFM-based LRTC and TRPCA models are proposed, and two efficient alternating direction method of multipliers (ADMM) algorithms are developed to solve the proposed models. A variety of real-world numerical experiments confirm their advantages beyond state-of-the-art methods.
Tensor robust principal component analysis (TRPCA) is a promising way for low-rank tensor recovery, which minimizes the convex surrogate of tensor rank by shrinking all tensor singular values equally. However, for real-world visual data, large singular values represent more significant information than small singular values. In this paper, we propose a nonconvex TRPCA (N-TRPCA) model based on the tensor adjustable logarithmic norm. Unlike TRPCA, our N-TRPCA can adaptively shrink small singular values more and large singular values less. In addition, TRPCA assumes that the whole data tensor is of low rank. This assumption is hardly satisfied in practice for natural visual data, restricting the capability of TRPCA to recover the edges and texture details from noisy images and videos. To this end, we integrate nonlocal self-similarity into N-TRPCA, and further develop a nonconvex and nonlocal TRPCA (NN-TRPCA) model. Specifically, similar nonlocal patches are grouped as a tensor and each group tensor is then recovered by our N-TRPCA. Since the patches in one group are highly correlated, all group tensors have a strong low-rank property, leading to an improvement of recovery performance. Experimental results demonstrate that the proposed NN-TRPCA outperforms existing TRPCA methods in visual data recovery. The demo code is available at https://github.com/qguo2010/NN-TRPCA.
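The sketch below shows the adaptive shrinkage behavior behind a logarithmic penalty, applied per frontal slice in the FFT domain as in t-SVD-based methods: the shrinkage amount decreases as the singular value grows, so large (informative) values are preserved. The exact shrinkage rule of N-TRPCA may differ; this is the generic log-penalty step:

```python
import numpy as np

def log_shrink(s, tau, eps=1e-2):
    # Gradient of tau * log(s + eps) is tau / (s + eps), so the
    # effective threshold shrinks as s grows.
    return np.maximum(s - tau / (s + eps), 0.0)

def tsvd_log_threshold(T, tau):
    Tf = np.fft.fft(T, axis=2)               # transform along mode 3
    out = np.empty_like(Tf)
    for k in range(T.shape[2]):
        U, s, Vt = np.linalg.svd(Tf[:, :, k], full_matrices=False)
        out[:, :, k] = U @ np.diag(log_shrink(s, tau)) @ Vt
    return np.real(np.fft.ifft(out, axis=2))

L = tsvd_log_threshold(np.random.rand(32, 32, 8), tau=0.1)
```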
In this paper, we study the problem of aligning a batch of linearly correlated images, where the observed images are deformed by some unknown domain transformations and corrupted by additive Gaussian noise and sparse noise simultaneously. By stacking these images as the frontal slices of a third-order tensor, we propose to utilize the tensor factorization method via the transformed tensor-tensor product to explore the low-rankness of the underlying tensor, which is factorized into the product of two smaller tensors under any unitary transformation. The main advantage of the transformed tensor-tensor product is that its computational complexity is lower than that of existing methods based on the transformed tensor nuclear norm. Moreover, the tensor $\ell_p$ $(0<p<1)$ norm is employed to characterize the sparsity of the sparse noise, and the tensor Frobenius norm is adopted to model the additive Gaussian noise. A generalized Gauss-Newton algorithm is designed to solve the resulting model by linearizing the domain transformations, and a proximal Gauss-Seidel algorithm is developed to solve the corresponding subproblem. Furthermore, the convergence of the proximal Gauss-Seidel algorithm is established, and its convergence rate is analyzed based on the Kurdyka-$\L$ojasiewicz property. Extensive numerical experiments on real-world image datasets demonstrate the superior performance of the proposed method compared with several state-of-the-art methods in both accuracy and computational time.
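For reference, a sketch of the transformed tensor-tensor product: apply a unitary transform along the third mode, multiply matching frontal slices, and transform back. With factors of sizes m x r x n3 and r x k x n3, the cost is governed by the small rank r, which is the complexity advantage noted above; shapes below are illustrative:

```python
import numpy as np

def transformed_tprod(A, B, Phi):
    """Phi: (n3, n3) unitary matrix defining the transform."""
    Ah = np.einsum('ij,mrj->mri', Phi, A)      # transform along mode 3
    Bh = np.einsum('ij,rkj->rki', Phi, B)
    Ch = np.einsum('mrk,rlk->mlk', Ah, Bh)     # slice-wise matrix products
    return np.einsum('ij,mlj->mli', Phi.conj().T, Ch)  # inverse transform

n3 = 8
Q, _ = np.linalg.qr(np.random.randn(n3, n3))   # any unitary transform
A = np.random.randn(32, 5, n3)                 # tall factor, rank 5
B = np.random.randn(5, 32, n3)
L = transformed_tprod(A, B, Q)                 # low-rank 32 x 32 x 8 tensor
```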
核标准和沙滕 - $ p $ quasi-Norm是低级矩阵恢复中受欢迎的排名代理。不幸的是,计算张量的核标准或schatten-$ p $ quasi-Norm是NP-HARD,这是对低级数张量完成(LRTC)(LRTC)和张量稳定性主组件分析(TRPCA)的怜悯。在本文中,我们根据张量的CP组件向量的欧几里得规范提出了一类新的张量级正规化器,并表明这些正则化是张量schatten-$ p $ quasi-norm的单调转换。该连接使我们能够将LRTC和TRPCA中的Schatten-$ p $ quasi-norm降至最低。这些方法不使用奇异的值分解,因此可以对大张量进行比例。此外,这些方法对初始等级的选择不敏感,并且与核定标准相比,该方法为低量张量回收率提供了任意尖锐的等级代理。另一方面,我们使用Schatten-$ $ p $ quasi-norm正规化和LRTC研究了LRTC的概括能力。该定理表明,相对更清晰的正规化程序会导致更严格的误差绑定,这与我们的数值结果一致。合成数据和实际数据的数值结果证明了与基线方法相比,我们方法的有效性和优势。
We present TensoRF, a novel approach to model and reconstruct radiance fields. Unlike NeRF that purely uses MLPs, we model the radiance field of a scene as a 4D tensor, which represents a 3D voxel grid with per-voxel multi-channel features. Our central idea is to factorize the 4D scene tensor into multiple compact low-rank tensor components. We demonstrate that applying traditional CP decomposition -- that factorizes tensors into rank-one components with compact vectors -- in our framework leads to improvements over vanilla NeRF. To further boost performance, we introduce a novel vector-matrix (VM) decomposition that relaxes the low-rank constraints for two modes of a tensor and factorizes tensors into compact vector and matrix factors. Beyond superior rendering quality, our models with CP and VM decompositions lead to a significantly lower memory footprint in comparison to previous and concurrent works that directly optimize per-voxel features. Experimentally, we demonstrate that TensoRF with CP decomposition achieves fast reconstruction (<30 min) with better rendering quality and even a smaller model size (<4 MB) compared to NeRF. Moreover, TensoRF with VM decomposition further boosts rendering quality and outperforms previous state-of-the-art methods, while reducing the reconstruction time (<10 min) and retaining a compact model size (<75 MB).
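A compact sketch of the vector-matrix (VM) decomposition idea: each component pairs a vector along one mode with a matrix over the other two modes, relaxing the rank-one (CP) constraint. Component counts and grid sizes below are illustrative, not TensoRF's configuration:

```python
import torch

def vm_reconstruct(vecs_z, mats_xy, vecs_x, mats_yz):
    """Sum of vector-times-matrix components over two mode groupings."""
    # vecs_z: (Rz, Z), mats_xy: (Rz, X, Y); vecs_x: (Rx, X), mats_yz: (Rx, Y, Z)
    t1 = torch.einsum('rz,rxy->xyz', vecs_z, mats_xy)
    t2 = torch.einsum('rx,ryz->xyz', vecs_x, mats_yz)
    return t1 + t2

X = vm_reconstruct(torch.randn(4, 16), torch.randn(4, 32, 32),
                   torch.randn(4, 32), torch.randn(4, 32, 16))
# X is a 32 x 32 x 16 feature grid assembled from compact factors.
```

The memory saving comes from storing only the small vector and matrix factors rather than the full per-voxel grid.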
Low-rankness is important in hyperspectral image (HSI) denoising tasks. The tensor nuclear norm (TNN), defined based on the tensor singular value decomposition, is a state-of-the-art method for describing the low-rankness of an HSI. However, TNN ignores some physical meanings of the HSI in tackling the denoising task, leading to suboptimal denoising performance. In this paper, we propose the multi-modal and frequency-weighted tensor nuclear norm (MFWTNN) and the nonconvex MFWTNN for HSI denoising tasks. First, we investigate the physical meaning of frequency slices and reconsider their weights to improve the low-rank representation ability of TNN. Second, we consider the correlation between the two spatial dimensions and the spectral dimension of the HSI and combine the above improvements with TNN to propose MFWTNN. Third, we use nonconvex functions to approximate the rank function of the frequency tensor and propose the nonconvex MFWTNN to better relax MFWTNN. In addition, we adaptively choose larger weights for slices mainly containing noise information and smaller weights for slices containing profile information. Finally, we develop an efficient alternating direction method of multipliers (ADMM)-based algorithm to solve the proposed models, and the effectiveness of our models is verified on simulated and real HSI datasets.
Blind image restoration (IR) is a common yet challenging problem in computer vision. Classical model-based methods and recent deep learning (DL)-based methods represent two different methodologies for this problem, each with its own merits and drawbacks. In this paper, we propose a novel blind image restoration method, aiming to integrate the merits of both. Specifically, we construct a general Bayesian generative model for blind IR, which explicitly depicts the degradation process. In this proposed model, a pixel-wise non-i.i.d. Gaussian distribution is employed to fit the image noise. It is more flexible than the simple i.i.d. Gaussian or Laplacian distributions adopted in most conventional methods, for handling the more complicated noise types contained in image degradation. To solve the model, we design a variational inference algorithm in which all the expected posterior distributions are parameterized as deep neural networks to increase their model capability. Notably, such an inference algorithm induces a unified framework that jointly deals with the tasks of degradation estimation and image restoration. Furthermore, the degradation information estimated in the former task is utilized to guide the latter IR process. Experiments on two typical blind IR tasks, namely image denoising and super-resolution, demonstrate that the proposed method achieves superior performance over current state-of-the-art methods.
Hyperspectral image (HSI) super-resolution without additional auxiliary images remains a constant challenge due to its high-dimensional spectral patterns, where learning an effective spatial and spectral representation is a fundamental issue. Recently, implicit neural representations (INRs) are making strides as a novel and effective representation, especially for reconstruction tasks. Therefore, in this work, we propose a novel HSI reconstruction model based on INR, which represents an HSI by a continuous function mapping a spatial coordinate to its corresponding spectral radiance value. In particular, as a specific implementation of INR, the parameters of the parametric model are predicted by a hypernetwork that operates on features extracted by a convolutional network. It enables the continuous function to map spatial coordinates to pixel values in a content-aware manner. Moreover, periodic spatial encoding is deeply integrated with the reconstruction procedure, which makes our model capable of recovering finer high-frequency details. To verify the efficacy of our model, we conduct experiments on three HSI datasets (CAVE, NUS and NTIRE2018). Experimental results show that the proposed model can achieve competitive reconstruction performance in comparison with the state-of-the-art methods. In addition, we provide an ablation study on the effects of individual components of our model. We hope this paper can serve as an efficient reference for future research.
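A minimal sketch of the INR component follows: periodic (Fourier) encoding of spatial coordinates, then an MLP that outputs a full spectrum per pixel. The hypernetwork that predicts these MLP weights from convolutional features is omitted, and all sizes are illustrative:

```python
import torch
import torch.nn as nn

def periodic_encode(xy, n_freqs=6):
    """xy: (N, 2) in [0, 1]; returns (N, 4 * n_freqs) sin/cos features."""
    freqs = 2.0 ** torch.arange(n_freqs) * torch.pi
    ang = xy.unsqueeze(-1) * freqs            # (N, 2, n_freqs)
    return torch.cat([ang.sin(), ang.cos()], dim=-1).flatten(1)

n_bands = 31
inr = nn.Sequential(
    nn.Linear(4 * 6, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, n_bands),                  # spectral radiance per coord
)

coords = torch.rand(4096, 2)                  # arbitrary, possibly off-grid
spectra = inr(periodic_encode(coords))        # (4096, 31)
```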
Surface reconstruction from noisy, non-uniform, and unoriented point clouds is a fascinating yet challenging problem in computer vision and graphics. As 3D scanning technologies evolve, it is strongly desired to directly transform raw scanned data, typically contaminated with severe noise, into manifold triangle meshes. Existing learning-based methods aim to learn zero-level-set implicit functions for the underlying shapes. However, most of them fail to obtain desirable results for noisy and sparse point clouds, which limits their use in practice. In this paper, we introduce Neural-IMLS, a novel approach that directly learns a noise-resistant signed distance function (SDF) from unoriented raw point clouds. By minimizing the loss between a pair of SDFs, one obtained by the implicit moving least-squares (IMLS) function and the other by our neural network, our method learns the underlying SDF from the raw point cloud in a self-supervised manner without explicitly learned priors; in turn, the gradients of our predictor define a tangent bundle that facilitates the computation of IMLS. We show that when the two SDFs coincide, our neural network predicts a signed implicit function whose zero level set serves as a good approximation of the underlying surface. We conduct extensive experiments on various benchmarks, including synthetic scans and real-world scans, demonstrating the ability to reconstruct faithful shapes from a variety of inputs, especially point clouds with noise or gaps.
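For intuition, here is a sketch of the IMLS target used for self-supervision: the signed distance at a query point is a weighted average of point-to-plane distances to nearby samples. In the method, the normals come from the gradient of the SDF network (the tangent-bundle observation above); bandwidth and neighborhood handling are simplified here:

```python
import torch

def imls_sdf(query, points, normals, h=0.05):
    """query: (Q, 3); points, normals: (P, 3)."""
    diff = query[:, None, :] - points[None, :, :]      # (Q, P, 3)
    w = torch.exp(-(diff ** 2).sum(-1) / h ** 2)       # Gaussian weights
    plane_dist = (diff * normals[None]).sum(-1)        # point-to-plane
    return (w * plane_dist).sum(-1) / (w.sum(-1) + 1e-8)

pts = torch.randn(500, 3)
nrm = torch.nn.functional.normalize(torch.randn(500, 3), dim=1)
sdf_vals = imls_sdf(torch.randn(64, 3), pts, nrm)      # supervision target
```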
Tensor completion is the problem of estimating the missing values of high-order data from partially observed entries. Data corruption due to prevailing outliers poses major challenges to traditional tensor completion algorithms, which has motivated the development of robust algorithms that alleviate the effect of outliers. However, existing robust methods largely presume that the corruption is sparse, which may not hold in practice. In this paper, we develop a two-stage robust tensor completion approach to handle tensor completion of visual data with a large amount of gross corruption. A novel coarse-to-fine framework is proposed, which uses the global coarse completion result to guide a local patch refinement process. To efficiently mitigate the effect of a large number of outliers on tensor recovery, we develop a new M-estimator-based robust tensor ring recovery method, which can adaptively identify outliers and alleviate their negative effects during optimization. Experimental results demonstrate that the proposed approach outperforms state-of-the-art robust tensor completion algorithms.
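A sketch of the M-estimator mechanism: residuals are reweighted by a robust influence function (Huber here; the paper's choice may differ) so gross outliers contribute little to the next fitting step. A simple matrix factorization stands in for the tensor-ring model purely for illustration:

```python
import numpy as np

def huber_weight(r, delta):
    a = np.abs(r)
    return np.where(a <= delta, 1.0, delta / (a + 1e-12))

def robust_factorize(Y, rank=5, delta=0.5, iters=20):
    m, n = Y.shape
    U, V = np.random.randn(m, rank), np.random.randn(n, rank)
    for _ in range(iters):
        W = huber_weight(Y - U @ V.T, delta)       # adaptive outlier weights
        for i in range(m):                         # weighted row updates
            Wi = np.diag(W[i])
            U[i] = np.linalg.solve(V.T @ Wi @ V + 1e-6 * np.eye(rank),
                                   V.T @ Wi @ Y[i])
        for j in range(n):
            Wj = np.diag(W[:, j])
            V[j] = np.linalg.solve(U.T @ Wj @ U + 1e-6 * np.eye(rank),
                                   U.T @ Wj @ Y[:, j])
    return U, V

Y = np.random.randn(40, 30)
Y[np.random.rand(40, 30) > 0.9] += 10.0            # gross corruptions
U, V = robust_factorize(Y)
```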
Compared with discrete grid-based representations, representing visual signals with coordinate-based deep fully-connected networks has advantages in fitting complex details and solving inverse problems. However, acquiring such a continuous implicit neural representation (INR) requires tedious per-signal training on the signal measurements, which limits its practicality. In this paper, we propose a generic INR framework that achieves both data and training efficiency by learning a neural implicit dictionary (NID) from a data collection and representing an INR as a functional combination of bases sampled from the dictionary. Our NID assembles a group of coordinate-based subnetworks which are tuned to span the desired function space. After training, an unseen scene representation can be acquired instantly and robustly by solving for the coding coefficients. To optimize a large group of networks in parallel, we borrow the idea from Mixture-of-Experts (MoE) to design and train our network with a sparse gating mechanism. Our experiments show that NID can improve the reconstruction of 2D images or 3D scenes by two orders of magnitude faster with 98% less input data. We further demonstrate various applications of NID in image inpainting and occlusion removal, which are considered challenging for vanilla INR. Our code is available at https://github.com/vita-group/neural-implitic-dict.
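Here is a schematic sketch of the dictionary idea: a bank of small coordinate-based basis networks combined by sparse, per-scene coefficients via a MoE-style top-k gate. The basis count, top-k value, gating rule, and layer sizes are all assumptions, not the repository's architecture:

```python
import torch
import torch.nn as nn

class NID(nn.Module):
    def __init__(self, n_basis=64, hidden=32, out_dim=3, topk=8):
        super().__init__()
        self.topk = topk
        self.basis = nn.ModuleList(
            nn.Sequential(nn.Linear(2, hidden), nn.ReLU(),
                          nn.Linear(hidden, out_dim))
            for _ in range(n_basis))

    def forward(self, coords, coeffs):
        """coeffs: (n_basis,) per-scene codes; only the top-k are used."""
        idx = coeffs.abs().topk(self.topk).indices       # sparse gating
        out = 0.0
        for i in idx:
            out = out + coeffs[i] * self.basis[i](coords)
        return out

nid = NID()
coeffs = torch.randn(64)    # solved per scene after dictionary training
rgb = nid(torch.rand(1024, 2), coeffs)
```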
Probabilistic modeling of multidimensional spatiotemporal data is critical to many real-world applications. However, real-world spatiotemporal data often exhibit complex dependencies that are nonstationary, i.e., the correlation structure varies with location/time, and nonseparable, i.e., dependencies exist between space and time. Developing effective and computationally efficient statistical models to accommodate nonstationary/nonseparable processes containing both long-range and short-scale variations becomes a challenging task, especially for large-scale datasets with various corruption/missing structures. In this paper, we propose a new statistical framework -- Bayesian complementary kernelized learning (BCKL) -- to achieve scalable probabilistic modeling for multidimensional spatiotemporal data. To effectively characterize complex dependencies, BCKL integrates kernelized low-rank factorization with short-range spatiotemporal Gaussian processes (GPs), in which the two components complement each other. Specifically, we use a multilinear low-rank factorization component to capture the global/long-range correlations in the data and introduce an additive short-scale GP based on compactly supported kernel functions to characterize the remaining local variability. We develop an efficient Markov chain Monte Carlo (MCMC) algorithm for model inference and evaluate the proposed BCKL framework on both synthetic and real-world spatiotemporal datasets. Our results confirm the superior performance of BCKL in providing accurate posterior mean estimates and high-quality uncertainty quantification.
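A sketch of the complementary covariance structure: a multilinear low-rank term captures global correlations while an additive short-scale GP with a compactly supported kernel explains local variability. Kernel choices and sizes are illustrative, and the draw below is a generative illustration; inference in the paper is via MCMC:

```python
import numpy as np

def compact_kernel(x, y, rho=0.2):
    """Wendland-type compactly supported kernel: zero beyond range rho."""
    d = np.abs(x[:, None] - y[None, :]) / rho
    return np.where(d < 1.0, (1.0 - d) ** 4 * (4.0 * d + 1.0), 0.0)

n_s, n_t, R = 30, 40, 4
U = np.random.randn(n_s, R)                   # spatial factors
V = np.random.randn(n_t, R)                   # temporal factors
global_part = U @ V.T                         # long-range, low-rank

s, t = np.linspace(0, 1, n_s), np.linspace(0, 1, n_t)
Ks = compact_kernel(s, s) + 1e-6 * np.eye(n_s)
Kt = compact_kernel(t, t) + 1e-6 * np.eye(n_t)
Ls, Lt = np.linalg.cholesky(Ks), np.linalg.cholesky(Kt)
local_part = Ls @ np.random.randn(n_s, n_t) @ Lt.T   # short-scale GP draw

signal = global_part + local_part             # complementary composition
```

The compact support of the short-scale kernel keeps its covariance matrices sparse, which is what makes the local component computationally affordable at scale.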