Nonnegative matrix factorization arises widely in machine learning and data analysis. In this paper, for a given factorization of rank r, we consider the sparse stochastic matrix factorization (SSMF) of decomposing a prescribed m-by-n stochastic matrix V into a product of an m-by-r stochastic matrix W and a sparse r-by-n stochastic matrix H. With the prescribed sparsity level, we reformulate the SSMF as an unconstrained nonvonvex-nonsmooth minimization problem and introduce a row-wise update algorithm for solving the minimization problem. We show that the proposed algorithm converges globally and the generated sequence converges to a special critical point of the cost function, which is a global minimizer over the W-factor as a whole and is nearly a global minimizer over each row vector of the H-factor. Numerical experiments on both synthetic and real data sets are given to demonstrate the effectiveness of our proposed algorithm.

0
下载
关闭预览

相关内容

We provide a polynomial lower bound on the minimum singular value of an $m\times m$ random matrix $M$ with jointly Gaussian entries, under a polynomial bound on the matrix norm and a global small-ball probability bound $$\inf_{x,y\in S^{m-1}}\mathbb{P}\left(\left|x^* M y\right|>m^{-O(1)}\right)\ge \frac{1}{2}.$$ With the additional assumption that $M$ is self-adjoint, the global small-ball probability bound can be replaced by a weaker version. We establish two matrix anti-concentration inequalities, which lower bound the minimum singular values of the sum of independent positive semidefinite self-adjoint matrices and the linear combination of independent random matrices with independent Gaussian coefficients. Both are under a global small-ball probability assumption. As a major application, we prove a better singular value bound for the Krylov space matrix, which leads to a faster and simpler algorithm for solving sparse linear systems. Our algorithm runs in $\tilde{O}\left(n^{\frac{3\omega-4}{\omega-1}}\right)=O(n^{2.2716})$ time where $\omega<2.37286$ is the matrix multiplication exponent, improving on the previous fastest one in $\tilde{O}\left(n^{\frac{5\omega-4}{\omega+1}}\right)=O(n^{2.33165})$ time by Peng and Vempala.

0
0
下载
预览

Matrix Factorization plays an important role in machine learning such as Non-negative Matrix Factorization, Principal Component Analysis, Dictionary Learning, etc. However, most of the studies aim to minimize the loss by measuring the Euclidean distance, though in some fields, angle distance is known to be more important and critical for analysis. In this paper, we propose a method by adding constraints on factors to unify the Euclidean and angle distance. However, due to non-convexity of the objective and constraints, the optimized solution is not easy to obtain. In this paper we propose a general framework to systematically solve it with provable convergence guarantee with various constraints.

0
0
下载
预览

Solving linear discrete ill-posed problems for third order tensor equations based on a tensor t-product has attracted much attention. But when the data tensor is produced continuously, current algorithms are not time-saving. Here, we propose an incremental tensor regularized least squares (t-IRLS) algorithm with the t-product that incrementally computes the solution to the tensor regularized least squares (t-RLS) problem with multiple lateral slices on the right-hand side. More specifically, we update its solution by solving a t-RLS problem with a single lateral slice on the right-hand side whenever a new horizontal sample arrives, instead of solving the t-RLS problem from scratch. The t-IRLS algorithm is well suited for large data sets and real time operation. Numerical examples are presented to demonstrate the efficiency of our algorithm.

0
0
下载
预览

The maximum matching problem in dynamic graphs subject to edge updates (insertions and deletions) has received much attention over the last few years; a multitude of approximation/time tradeoffs were obtained, improving upon the folklore algorithm, which maintains a maximal (and hence $2$-approximate) matching in $O(n)$ worst-case update time in $n$-node graphs. We present the first deterministic algorithm which outperforms the folklore algorithm in terms of {\em both} approximation ratio and worst-case update time. Specifically, we give a $(2-\Omega(1))$-approximate algorithm with $O(m^{3/8})=O(n^{3/4})$ worst-case update time in $n$-node, $m$-edge graphs. For sufficiently small constant $\epsilon>0$, no deterministic $(2+\epsilon)$-approximate algorithm with worst-case update time $O(n^{0.99})$ was known. Our second result is the first deterministic $(2+\epsilon)$-approximate weighted matching algorithm with $O_\epsilon(1)\cdot O(\sqrt[4]{m}) = O_\epsilon(1)\cdot O(\sqrt{n})$ worst-case update time. Our main technical contributions are threefold: first, we characterize the tight cases for \emph{kernels}, which are the well-studied matching sparsifiers underlying much of the $(2+\epsilon)$-approximate dynamic matching literature. This characterization, together with multiple ideas -- old and new -- underlies our result for breaking the approximation barrier of $2$. Our second technical contribution is the first example of a dynamic matching algorithm whose running time is improved due to improving the \emph{recourse} of other dynamic matching algorithms. Finally, we show how to use dynamic bipartite matching algorithms as black-box subroutines for dynamic matching in general graphs without incurring the natural $\frac{3}{2}$ factor in the approximation ratio which such approaches naturally incur.

0
0
下载
预览

Centrality measures identify and rank the most influential entities of complex networks. In this paper, we generalize matrix function-based centrality measures, which have been studied extensively for single-layer and temporal networks in recent years to layer-coupled multiplex networks. The layers of these networks can reflect different relationships and interactions between entities or changing interactions over time. We use the supra-adjacency matrix as network representation, which has already been used to generalize eigenvector centrality to temporal and multiplex networks. With a suitable choice of edge weights, the definition of single-layer matrix function-based centrality measures in terms of walks on networks carries over naturally to the multilayer case. In contrast to other walk-based centralities, matrix function-based centralities are parameterized measures, which have been shown to interpolate between (local) degree and (global) eigenvector centrality in the single-layer case. As the explicit evaluation of the involved matrix function expressions becomes infeasible for medium to large-scale networks, we present highly efficient approximation techniques from numerical linear algebra, which rely on Krylov subspace methods, Gauss quadrature, and stochastic trace estimation. We present extensive numerical studies on synthetic and real-world multiplex transportation, communication, and collaboration networks. The comparison with established multilayer centrality measures shows that our framework produces meaningful rankings of nodes, layers, and node-layer pairs. Furthermore, our experiments corroborate the theoretically indicated linear computational complexity of the employed numerical methods for sparse supra-adjacency matrices, which allows the efficient treatment of large-scale networks with the number of node-layer pairs of order $10^7$ or higher.

0
0
下载
预览

Several recent applications of optimal transport (OT) theory to machine learning have relied on regularization, notably entropy and the Sinkhorn algorithm. Because matrix-vector products are pervasive in the Sinkhorn algorithm, several works have proposed to \textit{approximate} kernel matrices appearing in its iterations using low-rank factors. Another route lies instead in imposing low-rank constraints on the feasible set of couplings considered in OT problems, with no approximations on cost nor kernel matrices. This route was first explored by Forrow et al., 2018, who proposed an algorithm tailored for the squared Euclidean ground cost, using a proxy objective that can be solved through the machinery of regularized 2-Wasserstein barycenters. Building on this, we introduce in this work a generic approach that aims at solving, in full generality, the OT problem under low-rank constraints with arbitrary costs. Our algorithm relies on an explicit factorization of low rank couplings as a product of \textit{sub-coupling} factors linked by a common marginal; similar to an NMF approach, we alternatively updates these factors. We prove the non-asymptotic stationary convergence of this algorithm and illustrate its efficiency on benchmark experiments.

0
9
下载
预览

We show that for the problem of testing if a matrix $A \in F^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works for any field $F$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix. Our algorithm is the first such algorithm which does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model for which query access comes in the form of $\langle X_i, A\rangle:=tr(X_i^\top A)$; perhaps surprisingly these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, which includes the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above.

0
3
下载
预览

Review-based recommender systems have gained noticeable ground in recent years. In addition to the rating scores, those systems are enriched with textual evaluations of items by the users. Neural language processing models, on the other hand, have already found application in recommender systems, mainly as a means of encoding user preference data, with the actual textual description of items serving only as side information. In this paper, a novel approach to incorporating the aforementioned models into the recommendation process is presented. Initially, a neural language processing model and more specifically the paragraph vector model is used to encode textual user reviews of variable length into feature vectors of fixed length. Subsequently this information is fused along with the rating scores in a probabilistic matrix factorization algorithm, based on maximum a-posteriori estimation. The resulting system, ParVecMF, is compared to a ratings' matrix factorization approach on a reference dataset. The obtained preliminary results on a set of two metrics are encouraging and may stimulate further research in this area.

0
8
下载
预览

We introduce negative binomial matrix factorization (NBMF), a matrix factorization technique specially designed for analyzing over-dispersed count data. It can be viewed as an extension of Poisson matrix factorization (PF) perturbed by a multiplicative term which models exposure. This term brings a degree of freedom for controlling the dispersion, making NBMF more robust to outliers. We show that NBMF allows to skip traditional pre-processing stages, such as binarization, which lead to loss of information. Two estimation approaches are presented: maximum likelihood and variational Bayes inference. We test our model with a recommendation task and show its ability to predict user tastes with better precision than PF.

0
8
下载
预览

Since the invention of word2vec, the skip-gram model has significantly advanced the research of network embedding, such as the recent emergence of the DeepWalk, LINE, PTE, and node2vec approaches. In this work, we show that all of the aforementioned models with negative sampling can be unified into the matrix factorization framework with closed forms. Our analysis and proofs reveal that: (1) DeepWalk empirically produces a low-rank transformation of a network's normalized Laplacian matrix; (2) LINE, in theory, is a special case of DeepWalk when the size of vertices' context is set to one; (3) As an extension of LINE, PTE can be viewed as the joint factorization of multiple networks' Laplacians; (4) node2vec is factorizing a matrix related to the stationary distribution and transition probability tensor of a 2nd-order random walk. We further provide the theoretical connections between skip-gram based network embedding algorithms and the theory of graph Laplacian. Finally, we present the NetMF method as well as its approximation algorithm for computing network embedding. Our method offers significant improvements over DeepWalk and LINE for conventional network mining tasks. This work lays the theoretical foundation for skip-gram based network embedding methods, leading to a better understanding of latent network representation learning.

0
15
下载
预览
小贴士
相关主题
相关论文
Kai Liu
0+阅读 · 11月29日
Zhengbang Cao,Pengpeng Xie
0+阅读 · 11月29日
Mohammad Roghani,Amin Saberi,David Wajc
0+阅读 · 11月28日
Meyer Scetbon,Marco Cuturi,Gabriel Peyré
9+阅读 · 3月8日
Maria-Florina Balcan,Yi Li,David P. Woodruff,Hongyang Zhang
3+阅读 · 2018年10月18日
Georgios Alexandridis,Georgios Siolas,Andreas Stafylopatis
8+阅读 · 2018年1月10日
Olivier Gouvert,Thomas Oberlin,Cédric Févotte
8+阅读 · 2018年1月5日
Jiezhong Qiu,Yuxiao Dong,Hao Ma,Jian Li,Kuansan Wang,Jie Tang
15+阅读 · 2017年12月12日
相关资讯
ICLR2019最佳论文出炉
专知
11+阅读 · 2019年5月6日
强化学习的Unsupervised Meta-Learning
CreateAMind
7+阅读 · 2019年1月7日
无监督元学习表示学习
CreateAMind
21+阅读 · 2019年1月4日
meta learning 17年:MAML SNAIL
CreateAMind
9+阅读 · 2019年1月2日
直观的强化学习算法(A2C)
深度强化学习实验室
3+阅读 · 2018年6月19日
【推荐】免费书(草稿):数据科学的数学基础
机器学习研究会
12+阅读 · 2017年10月1日
【学习】Hierarchical Softmax
机器学习研究会
3+阅读 · 2017年8月6日
Top