VOOZH about

URL: https://www.geeksforgeeks.org/machine-learning/probabilistic-matrix-factorization/

⇱ Probabilistic Matrix Factorization - GeeksforGeeks


  • Courses
  • Tutorials
  • Interview Prep

Probabilistic Matrix Factorization

Last Updated : 30 Sep, 2025

Probabilistic Matrix Factorization (PMF) is a collaborative filtering technique used in recommendation systems. Unlike traditional matrix factorization, PMF incorporates probability theory to model uncertainty in user-item interactions. This makes it highly effective for sparse datasets, where only a fraction of the ratings are observed. PMF extends MF by introducing a probabilistic model:

  • Latent factors are assumed to follow Gaussian distributions.
  • Observed ratings are modeled as Gaussian distributions around the predicted dot product.
  • Noise & uncertainty in data are naturally captured.

Matrix Factorization

Matrix Factorization (MF) is a collaborative filtering approach that predicts missing entries in a user-item interaction matrix (R) by decomposing it into two smaller matrices.

  • : user latent factor matrix.
  • : item latent factor matrix.
  • Each user and item is represented as a vector in a latent space.
  • The dot product predicts the preference of user for item .

Key Concepts

1. Latent Factors

Latent factors represent hidden features of users and items.

  • Each user is represented by a vector .
  • Each item is represented by a vector ​.
  • Their dot product gives the predicted rating.

Where,

  • : number of latent dimensions (e.g., genre preferences for movies).
  • ​: hidden representation of user i.
  • ​: hidden representation of item j.
  • : predicted rating.

2. Gaussian Priors

PMF assumes that latent factors are drawn from Gaussian (normal) distributions.

Where,

  • : user latent factor matrix.
  • : item latent factor matrix.
  • ​: variance terms controlling spread.
  • This prior ensures regularization, keeping factor values from becoming too large.

3. Observed Ratings

Each rating is modeled as a Gaussian centered on the dot product of user and item vectors.

Where,

  • : observed rating given by user i to item j.
  • : expected rating (mean of distribution).
  • : variance (captures noise in user preferences).
  • This means ratings are not exact but have uncertainty.

4. Objective Function

The goal is to find latent factors that maximize likelihood (fit the observed ratings) while avoiding overfitting.

Where,

  • First term: squared error between actual and predicted ratings.
  • Second term: : regularization for users.
  • Third term: : regularization for items.
  • ​: control the strength of regularization.
  • Minimizing LLL balances accuracy (fit to data) and simplicity (avoid large values).

Implementation

We will implement PMF using Stochastic Gradient Descent (SGD) in NumPy.

Step 1: Import Library

We will import the necessary library such as NumPy.

Step 2: Define PMF Class

We will define the PMF model with parameters for ratings, latent dimensions, learning rate, regularization and training epochs.

  • Initializes latent factors for users and items.
  • Uses Stochastic Gradient Descent to update latent factors based on observed ratings.
  • Computes Mean Squared Error (MSE) at the end of each epoch to track training progress.
  • predict: estimates a single rating using the dot product of user and item vectors.
  • compute_mse: calculates error only on observed entries.
  • full_matrix: reconstructs the entire rating matrix with predicted values.

Step 3: Generate Synthetic Data and Train Model

We will

  • Creates synthetic latent factors for users and items to simulate a rating dataset.
  • Adds Gaussian noise to make the data realistic.
  • Masks some entries to simulate missing values.
  • Trains the PMF model with Stochastic Gradient Descent.
  • Reconstructs the full rating matrix, filling in missing values with predictions.

Output:


Applications

  • Movie Recommendations: Predict user ratings for unseen movies (e.g., Netflix, Movielens).
  • E-commerce: Suggest products based on previous purchase history.
  • Music Streaming: Recommend songs/playlists (Spotify, Last.fm).
  • Social Media: Suggest friends, groups or content.
  • Online Learning: Recommend courses/resources tailored to learner profiles.

Limitations

  • Assumes linear interactions between latent factors.
  • Computationally expensive for large datasets.
  • Struggles with cold-start problem (new users/items with no history).
  • Hyperparameter tuning (learning rate, regularization, factors) can be tricky.
  • May converge to local minima if poorly initialized.
Comment