Smart Paper Notes

Least-squares methods for nonnegative matrix factorization over rational functions

Cécile Hautecoeur , Lieven De Lathauwer , Nicolas Gillis , François Glineur

Categories: Computer Vision | Machine Learning

2022-09-26

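Nonnegative matrix factorization (NMF) models are widely used to recover linearly mixed nonnegative data. When the data consists of samples of continuous signals, the factors in NMF can be constrained to be samples of nonnegative rational functions, which allow fairly general models; this is referred to as NMF using rational functions (R-NMF). We first show that, under mild assumptions, R-NMF has an essentially unique factorization, unlike NMF, which is crucial in applications where the ground-truth factors need to be recovered, such as blind source separation problems. Then we present different approaches to solve R-NMF: the R-HANLS, R-ANLS and R-NLS methods. In our tests, no method significantly outperforms the others, and a trade-off must be made between time and accuracy. Indeed, R-HANLS is fast and accurate for large problems, while R-ANLS is more accurate but requires more resources, both in time and memory. R-NLS is very accurate, but only for small problems. Moreover, we show that R-NMF outperforms NMF in various tasks, including the recovery of semi-synthetic continuous signals and a classification problem on real hyperspectral signals.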

translated by Google Translate

Nonlinear matrix recovery using optimization on the Grassmann manifold

Florentin Goyens , Coralia Cartis , Armin Eftekhari

Categories: Machine Learning (Statistics) | Machine Learning

2021-09-13

We investigate the problem of recovering a partially observed high-rank matrix whose columns obey a nonlinear structure such as a union of subspaces, an algebraic variety or grouped in clusters. The recovery problem is formulated as the rank minimization of a nonlinear feature map applied to the original matrix, which is then further approximated by a constrained non-convex optimization problem involving the Grassmann manifold. We propose two sets of algorithms, one arising from Riemannian optimization and the other as an alternating minimization scheme, both of which include first- and second-order variants. Both sets of algorithms have theoretical guarantees. In particular, for the alternating minimization, we establish global convergence and worst-case complexity bounds. Additionally, using the Kurdyka-Lojasiewicz property, we show that the alternating minimization converges to a unique limit point. We provide extensive numerical results for the recovery of union of subspaces and clustering under entry sampling and dense Gaussian sampling. Our methods are competitive with existing approaches and, in particular, high accuracy is achieved in the recovery using Riemannian second-order methods.

Dictionary-based Low-Rank Approximations and the Mixed Sparse Coding problem

Jeremy E. Cohen

Categories: Machine Learning | Machine Learning (Statistics)

2021-11-24

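Constrained tensor and matrix factorization models allow interpretable patterns to be extracted from multiway data. Identifiability properties and efficient algorithms for constrained low-rank approximations are therefore important research topics. This work deals with the case where the columns of the factor matrices of a low-rank approximation are sparse in a known and possibly overcomplete basis, a model called dictionary-based low-rank approximation (DLRA). While earlier contributions focused on finding factor columns inside a dictionary of candidate columns, that is, one-sparse approximations, this work is the first to tackle DLRA with sparsity larger than one. I propose to focus on the sparse coding subproblem, called mixed sparse coding (MSC), that emerges when solving DLRA with an alternating optimization strategy. Several algorithms based on sparse coding heuristics (greedy methods, convex relaxations) are provided to solve MSC. The performance of these heuristics is evaluated on simulated data. Then, I show how to adapt an efficient MSC solver based on the LASSO to compute dictionary-based matrix factorization and canonical polyadic decomposition in the context of hyperspectral image processing and chemometrics. These experiments suggest that DLRA extends the modeling capabilities of low-rank approximations, helps reduce estimation variance, and improves the identifiability and interpretability of the estimated factors.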

Learning Transition Operators From Sparse Space-Time Samples

Christian Kümmerle , Mauro Maggioni , Sui Tang

Categories: Machine Learning | Machine Learning (Statistics)

2022-12-01

We consider the nonlinear inverse problem of learning a transition operator $\mathbf{A}$ from partial observations at different times, in particular from sparse observations of entries of its powers $\mathbf{A},\mathbf{A}^2,\cdots,\mathbf{A}^{T}$. This Spatio-Temporal Transition Operator Recovery problem is motivated by the recent interest in learning time-varying graph signals that are driven by graph operators depending on the underlying graph topology. We address the nonlinearity of the problem by embedding it into a higher-dimensional space of suitable block-Hankel matrices, where it becomes a low-rank matrix completion problem, even if $\mathbf{A}$ is of full rank. For both a uniform and an adaptive random space-time sampling model, we quantify the recoverability of the transition operator via suitable measures of incoherence of these block-Hankel embedding matrices. For graph transition operators these measures of incoherence depend on the interplay between the dynamics and the graph topology. We develop a suitable non-convex iterative reweighted least squares (IRLS) algorithm, establish its quadratic local convergence, and show that, in optimal scenarios, no more than $\mathcal{O}(rn \log(nT))$ space-time samples are sufficient to ensure accurate recovery of a rank-$r$ operator $\mathbf{A}$ of size $n \times n$. This establishes that spatial samples can be substituted by a comparable number of space-time samples. We provide an efficient implementation of the proposed IRLS algorithm with space complexity of order $O(r n T)$ and per-iteration time complexity linear in $n$. Numerical experiments for transition operators based on several graph models confirm that the theoretical findings accurately track empirical phase transitions, and illustrate the applicability and scalability of the proposed algorithm.

Matrix-wise $\ell_0$-constrained Sparse Nonnegative Least Squares

Nicolas Nadisic , Jeremy E Cohen , Arnaud Vandaele , Nicolas Gillis

Categories: Machine Learning | Machine Learning (Statistics)

2020-11-22

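Nonnegative least squares problems with multiple right-hand sides (MNNLS) arise in models that rely on additive linear combinations. In particular, they are at the core of most nonnegative matrix factorization algorithms and have many applications. The nonnegativity constraint is known to naturally favor sparsity, that is, solutions with few nonzero entries. However, it is often useful to further enhance this sparsity, as it improves the interpretability of the results and helps reduce noise, which leads to the sparse MNNLS problem. In this paper, as opposed to most previous works that enforce sparsity column-wise or row-wise, we first introduce a novel formulation of sparse MNNLS with a matrix-wise sparsity constraint. Then, we present a two-step algorithm to tackle this problem. The first step divides sparse MNNLS into subproblems, one per column of the original problem. It then uses different algorithms to produce, either exactly or approximately, a Pareto front for each subproblem, that is, a set of solutions representing different trade-offs between reconstruction error and sparsity. The second step selects solutions among these Pareto fronts in order to build a sparsity-constrained matrix that minimizes the reconstruction error. We perform experiments on facial and hyperspectral images, and we show that our proposed two-step approach provides more accurate results than state-of-the-art sparse coding heuristics.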

Grassmannian Optimization for Online Tensor Completion and Tracking with the t-SVD

Kyle Gilman , Davoud Ataee Tarzanagh , Laura Balzano

Categories: Machine Learning

2020-01-30

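We propose a new fast streaming algorithm for the tensor completion problem of imputing missing entries of a low-tubal-rank tensor, using the tensor singular value decomposition (t-SVD) algebraic framework. We show that the t-SVD is a specialization of the well-studied block-term decomposition for third-order tensors, and we present an algorithm under this model that can track changing free submodules from incomplete streaming 2-D data. The proposed algorithm uses principles from incremental gradient descent on the Grassmann manifold of subspaces to solve the tensor completion problem with linear complexity and constant memory in the number of time samples. We provide a local expected linear convergence result for our algorithm. Our empirical results are competitive in accuracy but much faster in compute time than state-of-the-art tensor completion algorithms on real applications, namely recovering temporal chemo-sensing and MRI data under limited sampling.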

Bounded Simplex-Structured Matrix Factorization

Olivier Vu Thanh , Nicolas Gillis , Fabian Lecron

Categories: Machine Learning | Machine Learning (Statistics)

2022-09-26

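In this paper, we propose a new low-rank matrix factorization model called bounded simplex-structured matrix factorization (BSSMF). Given an input matrix $X$ and a factorization rank $r$, BSSMF looks for a matrix $W$ with $r$ columns and a matrix $H$ with $r$ rows such that $X \approx WH$, where the entries in each column of $W$ are bounded, that is, they belong to given intervals, and the columns of $H$ belong to the probability simplex, that is, $H$ is column stochastic. BSSMF generalizes nonnegative matrix factorization (NMF) and simplex-structured matrix factorization (SSMF). BSSMF is particularly well suited when the entries of the input matrix $X$ belong to a given interval, for example when the rows of $X$ represent images, or when $X$ is a rating matrix, as in the Netflix and MovieLens data sets, where the entries of $X$ belong to the interval $[1,5]$. The simplex-structured matrix $H$ not only leads to an easily understandable decomposition that provides a soft clustering of the columns of $X$, but also implies that the entries of each column of $WH$ belong to the same intervals as the columns of $W$. In this paper, we first propose a fast algorithm for BSSMF, even in the presence of missing data in $X$. Then we provide identifiability conditions for BSSMF, that is, conditions under which BSSMF admits a unique decomposition up to trivial ambiguities. Finally, we illustrate the effectiveness of BSSMF on two applications: extraction of features in a set of images, and the matrix completion problem for recommender systems.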
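The constraints described in this abstract can be written compactly. The following is only a sketch: the Frobenius-norm fitting term, the bound vectors $a$ and $b$, and the all-ones vector $e$ are notation assumed here for illustration, since the abstract itself states only that $X \approx WH$, that the entries of each column of $W$ lie in given intervals, and that $H$ is column stochastic.

```latex
\min_{W \in \mathbb{R}^{m \times r},\; H \in \mathbb{R}^{r \times n}}
  \; \| X - W H \|_F^2
\quad \text{subject to} \quad
a \le W(:,k) \le b \;\; (k = 1, \dots, r), \qquad
H \ge 0, \qquad
e^\top H = e^\top .
```

Under these constraints every column of $WH$ is a convex combination of the columns of $W$, which is why its entries stay within the same intervals $[a_i, b_i]$.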

Column $\ell_{2,0}$-norm regularized factorization model of low-rank matrix recovery and its computation

Ting Tao , Yitian Qian , Shaohua Pan

Categories: Machine Learning (Statistics)

2020-08-24

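This paper is concerned with the column $\ell_{2,0}$-regularized factorization model of low-rank matrix recovery problems and its computation. The column $\ell_{2,0}$-norm of the factor matrices is introduced to promote column sparsity of the factors and low-rank solutions. For this nonconvex and discontinuous optimization problem, we develop an alternating majorization-minimization (AMM) method with extrapolation, and a hybrid AMM in which a majorized alternating proximal method is first used to seek an initial factor pair with fewer nonzero columns, and the AMM with extrapolation is then employed to minimize a smooth nonconvex loss. We provide a global convergence analysis for the proposed AMM methods and apply them to the matrix completion problem with non-uniform sampling schemes. Numerical experiments are conducted on synthetic and real data examples, and comparisons with the nuclear-norm regularized factorization model and the max-norm regularized convex model show that the column $\ell_{2,0}$-regularized factorization model has an advantage in providing solutions of lower error and rank in less time.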

Is Monte Carlo a bad sampling strategy for learning smooth functions in high dimensions?

Ben Adcock , Simone Brugiapaglia

Categories: Machine Learning

2022-08-18

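This paper concerns the approximation of smooth, high-dimensional functions from limited samples using polynomials. This task lies at the heart of many applications in computational science and engineering, notably those arising from parametric modelling and uncertainty quantification. It is common to use Monte Carlo (MC) sampling in such applications, so as not to succumb to the curse of dimensionality. However, it is well known that this strategy is theoretically suboptimal: there are many polynomial spaces of dimension $n$ for which the sample complexity scales log-quadratically in $n$. This well-documented phenomenon has led to a concerted effort to design improved, in fact near-optimal, strategies whose sample complexities scale log-linearly, or even linearly, in $n$. Paradoxically, in this work we show that MC is actually a perfectly good strategy in high dimensions. We first document this phenomenon via several numerical examples. Next, we present a theoretical analysis that resolves this paradox for holomorphic functions of infinitely many variables. We show that there exists a least-squares scheme based on $m$ MC samples whose error decays algebraically fast in $m/\log(m)$, with a rate that is the same as that of the best $n$-term polynomial approximation. This result is non-constructive, since it assumes knowledge of a suitable polynomial space in which to perform the approximation. We then present a compressed sensing-based scheme that achieves the same rate, except for a larger polylogarithmic factor. This scheme is practical, and numerically it performs as well as or better than well-known adaptive least-squares schemes. Overall, our findings demonstrate that MC sampling is eminently suitable for smooth function approximation when the dimension is sufficiently high. Hence, the benefits of improved sampling strategies are generally limited to lower-dimensional settings.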

Joint Continuous and Discrete Model Selection via Submodularity

Jonathan Bunton , Paulo Tabuada

Categories: Machine Learning

2021-02-17

In model selection problems for machine learning, the desire for a well-performing model with meaningful structure is typically expressed through a regularized optimization problem. In many scenarios, however, the meaningful structure is specified in some discrete space, leading to difficult nonconvex optimization problems. In this paper, we connect the model selection problem with structure-promoting regularizers to submodular function minimization with continuous and discrete arguments. In particular, we leverage the theory of submodular functions to identify a class of these problems that can be solved exactly and efficiently with an agnostic combination of discrete and continuous optimization routines. We show how simple continuous or discrete constraints can also be handled for certain problem classes and extend these ideas to a robust optimization framework. We also show how some problems outside of this class can be embedded within the class, further extending the class of problems our framework can accommodate. Finally, we numerically validate our theoretical results with several proof-of-concept examples with synthetic and real-world data, comparing against state-of-the-art algorithms.

Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions

Nathan Halko , Per-Gunnar Martinsson , Joel A. Tropp

Categories:

2009-09-22

Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed, either explicitly or implicitly, to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, speed, and robustness. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an m × n matrix. (i) For a dense input matrix, randomized algorithms require O(mn log(k)) floating-point operations (flops) in contrast with O(mnk) for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multi-processor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to O(k) passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
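The two-stage idea in this abstract (random sampling to find a subspace, then a deterministic factorization of the compressed matrix) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' reference implementation: the function name `randomized_svd`, the Gaussian test matrix, and the `oversample` parameter are choices made here for the example.

```python
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    """Approximate the k dominant singular triplets of A (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    ell = min(n, k + oversample)          # number of random samples (assumed heuristic)

    # Stage A: random sampling to capture most of the action of A.
    Omega = rng.standard_normal((n, ell))  # Gaussian test matrix
    Y = A @ Omega                          # sample the range of A
    Q, _ = np.linalg.qr(Y)                 # orthonormal basis for the sampled range

    # Stage B: compress A to the subspace and factor the small matrix deterministically.
    B = Q.T @ A                            # ell x n reduced matrix
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small                        # lift left singular vectors back to R^m

    return U[:, :k], s[:k], Vt[:k, :]

# Usage: factor a 60 x 40 matrix of exact rank 5.
rng = np.random.default_rng(1)
A = rng.standard_normal((60, 5)) @ rng.standard_normal((5, 40))
U, s, Vt = randomized_svd(A, k=5)
approx = (U * s) @ Vt                      # rank-5 reconstruction of A
```

This dense version forms `A @ Omega` explicitly, so it does not reproduce the O(mn log(k)) flop count mentioned in the abstract, which relies on a structured random matrix.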

Majorization-minimization for Sparse Nonnegative Matrix Factorization with the $β$-divergence

Arthur Marmin , José Henrique de Morais Goulart , Cédric Févotte

Categories: Machine Learning

2022-07-13

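This article introduces new multiplicative updates for nonnegative matrix factorization with the $\beta$-divergence and sparse regularization of one of the two factors (say, the activation matrix). It is well known that the norm of the other factor (the dictionary matrix) needs to be controlled in order to avoid an ill-posed formulation. Standard practice consists in constraining the columns of the dictionary to have unit norm, which leads to a nontrivial optimization problem. Our approach leverages a reparametrization of the original problem into the optimization of an equivalent scale-invariant objective function. From there, we derive block-descent majorization-minimization algorithms that result in simple multiplicative updates for either $\ell_{1}$-regularization or the more "aggressive" log-regularization. In contrast with other state-of-the-art methods, our algorithms are universal in the sense that they can be applied to any $\beta$-divergence (that is, any value of $\beta$) and that they come with convergence guarantees. We report numerical comparisons with existing heuristic and Lagrangian methods using various datasets: face images, an audio spectrogram, hyperspectral data, and song play counts. We show that our methods obtain solutions of similar quality at convergence (similar objective values) but with significantly reduced CPU times.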

A fast iterative shrinkage-thresholding algorithm for linear inverse problems

Categories:

We consider the class of iterative shrinkage-thresholding algorithms (ISTA) for solving linear inverse problems arising in signal/image processing. This class of methods, which can be viewed as an extension of the classical gradient algorithm, is attractive due to its simplicity and thus is adequate for solving large-scale problems even with dense matrix data. However, such methods are also known to converge quite slowly. In this paper we present a new fast iterative shrinkage-thresholding algorithm (FISTA) which preserves the computational simplicity of ISTA but with a global rate of convergence which is proven to be significantly better, both theoretically and practically. Initial promising numerical results for wavelet-based image deblurring demonstrate the capabilities of FISTA which is shown to be faster than ISTA by several orders of magnitude.
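To make the relation between ISTA and FISTA concrete, here is a minimal NumPy sketch. It assumes the $\ell_1$-regularized least-squares model $\min_x \tfrac{1}{2}\|Ax - b\|_2^2 + \lambda \|x\|_1$, which the abstract does not spell out; the function names, the constant step size $1/L$, and the fixed iteration count are illustrative choices.

```python
import numpy as np

def soft_threshold(v, tau):
    """Shrinkage operator: proximal map of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista_lasso(A, b, lam, n_iter=200):
    """FISTA sketch for min_x 0.5*||Ax - b||^2 + lam*||x||_1 (assumed model)."""
    # Lipschitz constant of the gradient of the smooth term: largest eigenvalue of A^T A.
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    y = x.copy()          # extrapolated point
    t = 1.0               # momentum parameter
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)
        # An ISTA step (gradient step followed by shrinkage), taken at y instead of x.
        x_new = soft_threshold(y - grad / L, lam / L)
        # Momentum update that distinguishes FISTA from plain ISTA.
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x

# Usage: with A = I the minimizer is the soft-thresholded right-hand side.
x_hat = fista_lasso(np.eye(4), np.array([3.0, -0.5, 1.5, -2.0]), lam=1.0)
```

Setting `y = x_new` in every iteration (no extrapolation) recovers plain ISTA, so the per-iteration cost of the two methods is essentially the same.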

Estimation Contracts for Outlier-Robust Geometric Perception

Luca Carlone

Categories: Machine Learning (Statistics) | Computer Vision | Machine Learning | Robotics

2022-08-22

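Outlier-robust estimation is a fundamental problem and has been extensively investigated by statisticians and practitioners. The last few years have seen a convergence across research fields towards "algorithmic robust statistics", which focuses on developing tractable outlier-robust techniques for high-dimensional estimation problems. Despite this convergence, research efforts across fields have been mostly disconnected from one another. This paper bridges recent work on certifiable outlier-robust estimation for geometric perception in robotics and computer vision with parallel work in robust statistics. In particular, we adapt and extend recent results on robust linear regression (applicable to the low-outlier case with << 50% outliers) and list-decodable regression (applicable to the high-outlier case with >> 50% outliers) to the setup commonly found in robotics and vision, where (i) variables (e.g., rotations, poses) belong to a non-convex domain, (ii) measurements are vector-valued, and (iii) the number of outliers is not known a priori. The emphasis here is on performance guarantees: rather than proposing new algorithms, we provide conditions on the input measurements under which modern estimation algorithms are guaranteed to recover an estimate close to the ground truth in the presence of outliers. These conditions are what we call an "estimation contract". Besides the proposed extensions of existing results, we believe the main contributions of this paper are (i) to unify parallel research lines by pointing out commonalities and differences, (ii) to introduce advanced material (e.g., sum-of-squares proofs) in an accessible and self-contained presentation for the practitioner, and (iii) to point out a few immediate opportunities and open questions in outlier-robust geometric perception.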

Separable Quaternion Matrix Factorization for Polarization Images

Junjun Pan , Michael K. Ng

Categories: Computer Vision

2022-07-28

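Polarization is a unique characteristic of transverse waves and is represented by the Stokes parameters. Analysis of polarization states can reveal valuable information about the sources. In this paper, we propose a separable low-rank quaternion linear mixing model for polarized signals: we assume that each column of the source factor matrix equals a column of the polarized data matrix, and we refer to the corresponding problem as separable quaternion matrix factorization (SQMF). We discuss some properties of the matrices that can be decomposed by SQMF. To determine the source factor matrix in quaternion space, we propose a heuristic algorithm called the quaternion successive projection algorithm (QSPA), inspired by the successive projection algorithm. To guarantee the effectiveness of QSPA, a new normalization operator is proposed for quaternion matrices. We use a block coordinate descent algorithm to compute the nonnegative factor activation matrix in real number space. We test our method on polarization image representation and spectro-polarimetric imaging unmixing to verify its effectiveness.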

Project and Forget: Solving Large-Scale Metric Constrained Problems

Rishi Sonthalia , Anna C. Gilbert

Categories: Machine Learning | Machine Learning (Statistics)

2020-05-08

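Given a set of dissimilarity measurements among data points, determining which metric representation is most "consistent" with the input measurements, or which metric best captures the relevant geometric features of the data, is a key step in many machine learning algorithms. Existing methods are restricted to specific kinds of metrics or to small problem sizes because of the large number of metric constraints in such problems. In this paper, we provide an active set algorithm, Project and Forget, that uses Bregman projections to solve metric constrained problems with many (possibly exponentially many) inequality constraints. We provide a theoretical analysis of Project and Forget and prove that our algorithm converges to the global optimal solution and that the $L_2$ distance of the current iterate to the optimal solution decays asymptotically at an exponential rate. We demonstrate that, using our method, we can solve large problem instances of three types of metric constrained problems: general weight correlation clustering, metric nearness, and metric learning; in each case, it outperforms state-of-the-art methods with respect to CPU times and problem sizes.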

Hankel low-rank approximation and completion in time series analysis and forecasting: a brief review

Jonathan Gillard , Konstantin Usevich

Categories: Machine Learning (Statistics)

2022-06-10

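In this paper, we offer a review and bibliography of work on Hankel low-rank approximation and completion, with particular emphasis on how this methodology can be used for time series analysis and forecasting. We begin by describing possible formulations of the problem and offer commentary on related topics and on the challenges in obtaining globally optimal solutions. Key theorems are provided, and the paper closes with some expository examples.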

Robust Principal Component Analysis?

Emmanuel J. Candes , Xiaodong Li , Yi Ma , John Wright

Categories:

2009-12-18

This paper is about a curious phenomenon. Suppose we have a data matrix, which is the superposition of a low-rank component and a sparse component. Can we recover each component individually? We prove that under some suitable assumptions, it is possible to recover both the low-rank and the sparse components exactly by solving a very convenient convex program called Principal Component Pursuit; among all feasible decompositions, simply minimize a weighted combination of the nuclear norm and of the $\ell_1$ norm. This suggests the possibility of a principled approach to robust principal component analysis since our methodology and results assert that one can recover the principal components of a data matrix even though a positive fraction of its entries are arbitrarily corrupted. This extends to the situation where a fraction of the entries are missing as well. We discuss an algorithm for solving this optimization problem, and present applications in the area of video surveillance, where our methodology allows for the detection of objects in a cluttered background, and in the area of face recognition, where it offers a principled way of removing shadows and specularities in images of faces.
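The convex program described in words above can be written out as follows. The symbols are assumptions introduced here for illustration: $M$ is the data matrix, $L$ the low-rank component, $S$ the sparse component, $\|L\|_*$ the nuclear norm (sum of singular values), $\|S\|_1$ the entrywise $\ell_1$ norm, and $\lambda > 0$ the weight of the combination.

```latex
\min_{L,\,S} \;\; \|L\|_* + \lambda \, \|S\|_1
\quad \text{subject to} \quad L + S = M .
```

The nuclear norm acts as a convex surrogate for the rank of $L$ and the $\ell_1$ norm as a convex surrogate for the number of nonzero entries of $S$, which is what makes the program convex.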

Recent Theoretical Advances in Non-Convex Optimization

Marina Danilova , Pavel Dvurechensky , Alexander Gasnikov , Eduard Gorbunov , Sergey Guminov , Dmitry Kamzolov , Innokentiy Shibaev

Categories: Machine Learning

2020-12-11

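Motivated by the recent increased interest in optimization algorithms for non-convex optimization, as applied to training deep neural networks and other optimization problems in data analysis, we give an overview of recent theoretical results on global performance guarantees of optimization algorithms for non-convex optimization. We start with classical arguments showing that general non-convex problems cannot be solved efficiently in a reasonable time. Then we give a list of problems for which a global minimizer can be found efficiently by exploiting the structure of the problem as much as possible. Another way to deal with non-convexity is to relax the goal from finding the global minimum to finding a stationary point or a local minimum. For this setting, we first present known results on the convergence rates of deterministic first-order methods, followed by a general theoretical analysis of optimal stochastic and randomized gradient schemes and an overview of stochastic first-order methods. After that, we discuss quite general classes of non-convex problems, such as the minimization of $\alpha$-weakly-quasi-convex functions and of functions that satisfy the Polyak-Lojasiewicz condition, which still allow theoretical convergence guarantees to be obtained for first-order methods. Then we consider higher-order and zeroth-order (derivative-free) methods and their convergence rates for non-convex optimization problems.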

Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization

Benjamin Recht , Maryam Fazel , Pablo A. Parrilo

Categories:

2007-06-28

The affine rank minimization problem consists of finding a matrix of minimum rank that satisfies a given system of linear equality constraints. Such problems have appeared in the literature of a diverse set of fields including system identification and control, Euclidean embedding, and collaborative filtering. Although specific instances can often be solved with specialized algorithms, the general affine rank minimization problem is NP-hard, because it contains vector cardinality minimization as a special case. In this paper, we show that if a certain restricted isometry property holds for the linear transformation defining the constraints, the minimum rank solution can be recovered by solving a convex optimization problem, namely the minimization of the nuclear norm over the given affine space. We present several random ensembles of equations where the restricted isometry property holds with overwhelming probability, provided the codimension of the subspace is Ω(r(m + n) log mn), where m, n are the dimensions of the matrix, and r is its rank. The techniques used in our analysis have strong parallels in the compressed sensing framework. We discuss how affine rank minimization generalizes this pre-existing concept and outline a dictionary relating concepts from cardinality minimization to those of rank minimization. We also discuss several algorithmic approaches to solving the norm minimization relaxations, and illustrate our results with numerical examples.
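The relaxation described above can be stated side by side with the original problem. The notation is assumed here for illustration: $\mathcal{A}$ is the linear map defining the equality constraints, $b$ the right-hand side, and $\|X\|_*$ the nuclear norm (sum of singular values) of $X$.

```latex
\text{(rank minimization)} \qquad
\min_{X \in \mathbb{R}^{m \times n}} \; \operatorname{rank}(X)
\quad \text{subject to} \quad \mathcal{A}(X) = b ,

\text{(nuclear norm relaxation)} \qquad
\min_{X \in \mathbb{R}^{m \times n}} \; \|X\|_*
\quad \text{subject to} \quad \mathcal{A}(X) = b .
```

When $X$ is restricted to be diagonal, its rank is the number of nonzero diagonal entries and its nuclear norm is the $\ell_1$ norm of the diagonal, which is the sense in which vector cardinality minimization and its $\ell_1$ relaxation appear as a special case.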
