VOOZH about

URL: https://towardsdatascience.com/lle-locally-linear-embedding-a-nifty-way-to-reduce-dimensionality-in-python-ab5c38336107/

⇱ LLE: Locally Linear Embedding - A Nifty Way to Reduce Dimensionality in Python | Towards Data Science


LLE: Locally Linear Embedding – A Nifty Way to Reduce Dimensionality in Python

A detailed look into how LLE works and how it compares to similar algorithms such as Isomap

10 min read

Tips and Tricks, Machine Learning

👁 Locally Linear Embedding (LLE). Image by author.
Locally Linear Embedding (LLE). Image by author.

Intro

It may be challenging to choose the right dimensionality reduction technique for your data. However, if you are specifically looking for a non-linear approach, then Locally Linear Embedding (LLE) and Isometric Mapping (Isomap) would be good ones to explore.

In this article, I will take you through the details of how LLE works and how it differs from Isomap. Also, I will provide a Python example to comparing these algorithms.

Contents

  • LLE within the universe of Machine Learning algorithms
  • Intuitive explanation of how LLE works
  • Python example of LLE and its comparison to Isomap

Locally Linear Embedding (LLE) within the universe of Machine Learning algorithms

Even an experienced Data Scientist can easily get lost amongst hundreds of different Machine Learning algorithms used in the industry. Hence, I believe there is value in creating a structure by putting some of the most frequently used algorithms into categories.

It will never be perfect because some algorithms are flexible enough to perform multiple tasks. Nevertheless, below is my attempt to create such categorization. Make sure to explore this interactive graph by clicking👇 on different sections to reveal more.

As you can see, Locally Linear Embedding (LLE) sits under the Unsupervised branch of Machine Learning within the group of dimensionality reduction algorithms.

It means that you do not need to have a target variable to perform dimensionality reduction with LLE, unlike supervised techniques such as Linear Discriminant Analysis (LDA).

If you enjoy Data Science and Machine Learning, please subscribe to get an email whenever I publish a new story.

How does Locally Linear Embedding work?

The high-level steps

Similar to Isomap, LLE combines several steps to produce the lower-dimensional embedding. These are:

  1. Use a KNN approach to find the k nearest neighbors of every data point. Here, "k" is an arbitrary number of neighbors that you can specify within model hyperparameters.
  2. Construct a weight matrix where every point has its weights determined by minimizing the error of the cost function shown below. Note that every point is a linear combination of its neighbors, which means that weights for non-neighbors are 0.
👁 Finding weights in the original high-dimensional space. Image by author.
Finding weights in the original high-dimensional space. Image by author.
  1. Find the positions of all the points in the new lower-dimensional embedding by minimizing the cost function shown below. Note, here we use weights (W) from step two and solve for Y. The actual solving is performed using Partial Eigenvalue Decomposition.
👁 Finding the position of points in the new lower-dimensional embedding. Image by author.
Finding the position of points in the new lower-dimensional embedding. Image by author.

With the above steps completed, we get a lower-dimensional representation of the data, which we can typically visualize using standard scatterplots, provided we reduce the dimensionality to 3D or less.

LLE variants

You should be aware of a few LLE variants, which improve upon the original setup. However, note that these improvements come at the cost of efficiency, making the algorithm slower. Here is how scikit-learn describes these variants:

  • Modified LLE (MLLE) – One well-known issue with LLE is the regularization problem. A way to address it is to use multiple weight vectors in each neighborhood. This is the essence of MLLE.
  • Hessian LLE (HLLE)— Hessian Eigenmapping is another method of solving the regularization problem of LLE. It revolves around a hessian-based quadratic form at each neighborhood used to recover the locally linear structure.

While I will not go into details, I recommend you experiment with them to see which variant yields the best results for your data. Personally, I find MLLE to perform well in most scenarios (see an example of this in the next section).

Difference between LLE and Isomap

The two algorithms are similar in the way they approach dimensionality reduction, but they do have their differences.

Similar to LLE, Isomap also uses KNN to find the nearest neighbors in the first step. However, the second step constructs neighborhood graphs instead of describing each point as a linear combination of its neighbors. Then it uses these graphs to compute the shortest path between every pair of points.

Finally, Isomap uses those pairwise distances between all points to construct a lower-dimensional embedding.

Should I choose LLE over Isomap?

In general, LLE is a more efficient algorithm as it eliminates the need to estimate pairwise distances between widely separated data points. Furthermore, it assumes that the manifold is linear when viewed locally. Thus it recovers the non-linear structure from locally linear fits.

However, because LLE focuses on preserving only the local structures, it may introduce some unexpected distortions on the global scale.

In the Python section below, I use LLE, MLLE, and Isomap to create 2D embeddings of the 3D shape known as a "swiss roll." These examples should give you a good indication of the differences in results between these algorithms.

👁 Image

Python example of LLE and its comparison to Isomap

Setup

We will use the following data and libraries:

Let’s start by importing libraries.

# Data manipulation
import pandas as pd # for data manipulation
import numpy as np # for data manipulation

# Visualization
import plotly.express as px # for data visualization

# Skleran
from sklearn.datasets import make_swiss_roll # for creating swiss roll
from sklearn.manifold import LocallyLinearEmbedding as LLE # for LLE dimensionality reduction
from sklearn.manifold import Isomap # for Isomap dimensionality reduction

Next, we create data arrays containing swiss rolls using Sklearn’s make_swiss_roll.

# Create a swiss roll
X, y = make_swiss_roll(n_samples=2000, noise=0.05)

# Make it thinner
X[:, 1] *= .5

# Create a flat addon to the top of the swiss roll
X_x=np.zeros((300,1))
X_y=np.random.uniform(low=0, high=10, size=(300,1))
X_z=np.random.uniform(low=14, high=25, size=(300,1))
X2=np.concatenate((X_x, X_y, X_z), axis=1)
y2=X_z.reshape(300)

# Concatenate swiss roll and flat rectangle arrays
X_two=np.concatenate((X, X2))
y_two=np.concatenate((y, y2))

You will note that we have created two swiss rolls instead of one. The first one is standard, while the second one contains an additional rectangular addon at the top.

To see what I mean by this, let’s first define two functions that we will use throughout the project to visualize our 3D swiss rolls and the 2D embedding results.

# Create a 3D scatter plot
def Plot3D(X, y, plot_name):
 fig = px.scatter_3d(None, 
 x=X[:,0], y=X[:,1], z=X[:,2],
 color=y,
 height=800, width=800
 )
 # Update chart looks
 fig.update_layout(title_text=plot_name,
 showlegend=False,
 legend=dict(orientation="h", yanchor="top", y=0, xanchor="center", x=0.5),
 scene_camera=dict(up=dict(x=0, y=0, z=1), 
 center=dict(x=0, y=0, z=-0.1),
 eye=dict(x=1.5, y=1.75, z=1)),
 margin=dict(l=0, r=0, b=0, t=0),
 scene = dict(xaxis=dict(backgroundcolor='white',
 color='black',
 gridcolor='#f0f0f0',
 title_font=dict(size=10),
 tickfont=dict(size=10),
 ),
 yaxis=dict(backgroundcolor='white',
 color='black',
 gridcolor='#f0f0f0',
 title_font=dict(size=10),
 tickfont=dict(size=10),
 ),
 zaxis=dict(backgroundcolor='lightgrey',
 color='black', 
 gridcolor='#f0f0f0',
 title_font=dict(size=10),
 tickfont=dict(size=10),
 )))
 # Update marker size
 fig.update_traces(marker=dict(size=3, 
 line=dict(color='black', width=0.1)))
 fig.update(layout_coloraxis_showscale=False)
 return fig

#---------------------------------------------- 
# Create a 2D scatter plot
def Plot2D(X, y, plot_name):
 # Create a scatter plot
 fig = px.scatter(None, x=X[:,0], y=X[:,1], 
 labels={
 "x": "Dimension 1",
 "y": "Dimension 2",
 },
 opacity=1, color=y)

 # Change chart background color
 fig.update_layout(dict(plot_bgcolor = 'white'))

 # Update axes lines
 fig.update_xaxes(showgrid=True, gridwidth=1, gridcolor='lightgrey', 
 zeroline=True, zerolinewidth=1, zerolinecolor='lightgrey', 
 showline=True, linewidth=1, linecolor='black')

 fig.update_yaxes(showgrid=True, gridwidth=1, gridcolor='lightgrey', 
 zeroline=True, zerolinewidth=1, zerolinecolor='lightgrey', 
 showline=True, linewidth=1, linecolor='black')

 # Set figure title
 fig.update_layout(title_text=plot_name)

 # Update marker size
 fig.update_traces(marker=dict(size=5,
 line=dict(color='black', width=0.3)))
 return fig

Using the above Plot3D function, we can visualize our two swiss rolls:

Regular one:Plot3D(X, y, "Regular Swiss Roll")

Modified one:Plot3D(X_two, y_two, "Modified Swiss Roll")

Performing Locally Linear Embedding (LLE, MLLE) and Isomap

With setup complete, we will now perform dimensionality reduction using LLE, Modified LLE, and Isomap embeddings.

First, let’s define a couple of reusable functions.

# Function for performing LLE and MLLE
def run_lle(num_neighbors, dims, mthd, data):
 # Specify LLE parameters
 embed_lle = LLE(n_neighbors=num_neighbors, # default=5, number of neighbors to consider for each point.
 n_components=dims, # default=2, number of dimensions of the new space 
 reg=0.001, # default=1e-3, regularization constant, multiplies the trace of the local covariance matrix of the distances.
 eigen_solver='auto', # {'auto', 'arpack', 'dense'}, default='auto', auto : algorithm will attempt to choose the best method for input data
 #tol=1e-06, # default=1e-6, Tolerance for 'arpack' method. Not used if eigen_solver=='dense'.
 #max_iter=100, # default=100, maximum number of iterations for the arpack solver. Not used if eigen_solver=='dense'.
 method=mthd, # {'standard', 'hessian', 'modified', 'ltsa'}, default='standard'
 #hessian_tol=0.0001, # default=1e-4, Tolerance for Hessian eigenmapping method. Only used if method == 'hessian'
 modified_tol=1e-12, # default=1e-12, Tolerance for modified LLE method. Only used if method == 'modified'
 neighbors_algorithm='auto', # {'auto', 'brute', 'kd_tree', 'ball_tree'}, default='auto', algorithm to use for nearest neighbors search, passed to neighbors.NearestNeighbors instance
 random_state=42, # default=None, Determines the random number generator when eigen_solver == 'arpack'. Pass an int for reproducible results across multiple function calls.
 n_jobs=-1 # default=None, The number of parallel jobs to run. -1 means using all processors.
 )
 # Fit and transofrm the data
 result = embed_lle.fit_transform(data)

 # Return results
 return result

#---------------------------------------------- 
# Function for performing Isomap
def run_isomap(num_neighbors, dims, data):
 # Specify Isomap parameters
 embed_isomap = Isomap(n_neighbors=num_neighbors, n_components=dims, n_jobs=-1)

 # Fit and transofrm the data
 result = embed_isomap.fit_transform(data)

 # Return results
 return result

We can now call the above functions to perform dimensionality reduction.

Let’s create five 2D embeddings using different data and algorithms. Note that in all cases, we set the number of neighbors to 30 while using default values for other hyperparameters. Our five embeddings will be:

  • Standard LLE on a regular swiss roll
  • Modified LLE on a regular swiss roll
  • Isomap on a regular swiss roll
  • Modified LLE on a modified swiss roll
  • Isomap on a modified swiss roll
######### Regular swiss roll #########

# Standard LLE on a regular swiss roll
std_lle_res=run_lle(num_neighbors=30, dims=2, mthd='standard', data=X)

# Modified LLE on a regular swiss roll
mlle_res=run_lle(num_neighbors=30, dims=2, mthd='modified', data=X)

# Isomap on a regular swiss roll
isomap_res=run_isomap(num_neighbors=30, dims=2, data=X)

######### Modified swiss roll #########

# Modified LLE on a modified swiss roll
mod_mlle_res=run_lle(num_neighbors=30, dims=2, mthd='modified', data=X_two)

# Isomap on a modified swiss roll
mod_isomap_res=run_isomap(num_neighbors=30, dims=2, data=X_two)

Finally, we can plot the results on 2D scatter plots using the Plot2D function we have created earlier.

Let’s do it first on a regular swiss roll.

Plot2D(std_lle_res, y, 'Regular Swiss Roll - LLE')
Plot2D(mlle_res, y, 'Regular Swiss Roll - MLLE')
Plot2D(isomap_res, y, 'Regular Swiss Roll - Isomap')
👁 Image
👁 Image
👁 2D embeddings of the swiss roll using three different algorithms: 1. LLE, 2. Modified LLE, 3. Isomap. Images by author.
2D embeddings of the swiss roll using three different algorithms: 1. LLE, 2. Modified LLE, 3. Isomap. Images by author.

We can see that Standard LLE was not able to unroll the swiss roll successfully. By contrast, Modified LLE and Isomap did well, yielding similar results between them. In general, out of the three algorithms, MLLE seems to have the least distorted 2D embedding.

Next, let’s visualize the results of the modified swiss roll embedding with MLLE and Isomap.

Plot2D(mod_mlle_res, y_two, 'Modified Swiss Roll - MLLE')
Plot2D(mod_isomap_res, y_two, 'Modified Swiss Roll - Isomap')
👁 Image
👁 2D embeddings of the modified swiss roll using two different algorithms: 1. Modified LLE 2. Isomap. Images by author.
2D embeddings of the modified swiss roll using two different algorithms: 1. Modified LLE 2. Isomap. Images by author.

This time we can see more of a difference between the results of the two algorithms.

MLLE "struggled" to deal with different angles introduced to the 3D shape and ended up bunching all points from the rectangle part of the swiss roll closely together. Meanwhile, Isomap was able to flatten the whole structure pretty evenly.

Conclusions

Locally Linear Embedding (LLE) and especially the modified variant (MLLE) is an excellent tool to have within your arsenal.

With LLE being designed to focus on local structures, it can perform computations faster than other similar algorithms such as Isomap. However, choosing which algorithm you should use will depend on the data and the task you need to perform.

As with many cases in Data Science, you may want to experiment with a few different algorithms to find out which one does the best job for your data.

I sincerely hope you found my explanation and comparisons helpful. If you have any questions or suggestions, please do not hesitate to reach out.

Cheers 👏 Saul Dobilas


A couple of related articles you may find interesting:

Isomap Embedding – An Awesome Approach to Non-linear Dimensionality Reduction

t-SNE Machine Learning Algorithm – A Great Tool for Dimensionality Reduction in Python


Written By

Saul Dobilas

Towards Data Science is a community publication. Submit your insights to reach our global audience and earn through the TDS Author Payment Program.

Write for TDS

Related Articles