Abstract:
The present invention provides a method and system for distributed probabilistic matrix factorization. In accordance with a disclosed embodiment, the method may include partitioning a sparse matrix into a first set of blocks on a distributed computer cluster, whereby a dimension of each block is MB rows and NB columns. Further, the method shall include initializing a plurality of matrices including first mean matrix Ū, a first variance matrix Ũ, a first prior variance matrix ŨP, a second mean matrix V, a second variance matrix {tilde over (V)}, and a second prior variance matrix {tilde over (V)}P, by a set of values from a probability distribution function. The plurality of matrices can be partitioned into a set of blocks on the distributed computer cluster, whereby each block can be of a shorter dimension K, and the plurality of matrices can be updated iteratively until a cost function of the sparse matrix converges.