LexRank algorithm explained: a step-by-step tutorial with examples

Maciej Zalwert
6 min readMay 5, 2021
Image by Nothing Ahead from Pexels

LexRank is a stochastic graph-based method for computing relative importance of textual units for Natural Language Processing. I used LexRank to summarize and extract the most relevant information from hundreds of articles, e.g. analysis on how to lose weight and how to succeed on a Medium as a writer. LexRank proved to work effectively — in my humble opinion.

At the first glance, the theory behind the algorithm seems complicated, but in reality, it’s very simple.

Let’s start with the official general description of the concept.

We consider a new approach, LexRank, for computing sentence importance based on the concept of eigenvector centrality in a graph representation of sentences. In this model, a connectivity matrix based on intra-sentence cosine similarity is used as the adjacency matrix of the graph representation of sentences.

Let’s divide LexRank algorithm into pieces so that step-by-step, with explained mathematical concepts, we get to the model output:

--

--

Maciej Zalwert
Maciej Zalwert

Written by Maciej Zalwert

Experienced in building data-intensive solutions for diverse industries

No responses yet