In this post we will implement and play with a clustering algorithm of a mysterious name **Large Scale Spectral Clustering with Landmark-Based Representation** (or shortly LSC – corresponding paper here). We will first explain the algorithm step by step and then map it to Julia code (github link).

### Spectral Clustering

Spectral clustering (wikipedia entry) is a term that refers to many different clustering techniques. The core of the algorithm does not differ though. In essence, it is a method that relies on spectrum (eigendecomposition) of input data similarity matrix (or its transformations). Given input dataset encoded in a matrix \(X\) (such that each single data entry is a column of that matrix) – spectral clustering requires a similarity (adjacency in case of graphical interpretation) matrix \(A\) where

\[

A_{ij} = f(X_{\cdot i}, X_{\cdot j})

\]

and function \(f\) is some measure of similarity between data points.

One specific spectral clustering algorithm (Ng, Jordan, and Weiss 2001) relies on matrix \(W\) given by

\[

W = D^{-1/2} A D^{-1/2}

\]