5.4.6. Clustering for single-cell genomics
About clustering
- Identify groups of similar cells (measurement objects)
- What clustering methods did we learn about?
- Could you explain the main ideas please?
Clustering in Seurat is a bit more complicated
Seurat clusters cells:
- Based on principal components rather than individual genes:
- each PC represents a ‘metagene’,
- which combines information from multiple correlated genes.
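To make the 'metagene' idea concrete, here is a minimal sketch in Python with NumPy (not Seurat's own code). The expression matrix is made-up toy data, and the names `expr`, `loadings` and `pc1_scores` are chosen only for illustration. It shows that the first PC is a weighted sum of genes, with the correlated genes carrying most of the weight.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy expression matrix: 100 cells x 6 genes (made-up data).
# Genes 0-2 follow one shared signal, genes 3-5 are independent noise.
signal = rng.normal(size=(100, 1))
expr = np.hstack([
    signal + 0.1 * rng.normal(size=(100, 3)),  # three correlated genes
    rng.normal(size=(100, 3)),                 # three unrelated genes
])

# PCA via SVD of the centred matrix.
centred = expr - expr.mean(axis=0)
u, s, vt = np.linalg.svd(centred, full_matrices=False)

loadings = vt[0]                 # weight of each gene in PC1
pc1_scores = centred @ loadings  # one 'metagene' value per cell

# Absolute gene weights in PC1: the correlated genes should dominate.
print(np.round(np.abs(loadings), 2))
```

Clustering on a handful of such PC scores per cell, rather than on thousands of genes, is what "based on principal components" means here.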
Seurat clustering is based on a community detection approach similar to SNN-Cliq
- Algorithm: graph-based clustering
Graph-based clustering builds on the idea behind K-nearest neighbours (KNN) classification
KNN classification
- Supervised approach (we have labels: red, blue)
- Is X red or blue?
Image source: Wikipedia
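The red-or-blue question can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the training points are made up, and `knn_classify` is a hypothetical helper name, not a library function. The label of X is decided by a majority vote among its k nearest labelled points.

```python
import math
from collections import Counter

def knn_classify(query, points, labels, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Sort training indices by Euclidean distance to the query point.
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Made-up 2-D training data with known labels.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["red", "red", "red", "blue", "blue", "blue"]

x = (2.0, 2.0)  # the unlabelled point "X"
print(knn_classify(x, points, labels, k=3))  # prints "red"
```

The answer can change with k, which is why k is a parameter to choose rather than a fixed constant.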
Graph-based clustering
Think of finding groups of friends among the people in UG!
Steps of graph-based clustering
- Embed cells in a graph structure
- This graph is derived from a KNN graph
- calculated on the Euclidean distance between cells in PCA space
- Refine the edge weights between any two cells based on the overlap in their local neighbourhoods
- Find “cliques” or “communities” in the graph
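The steps above can be sketched in plain Python. This is a toy illustration, not Seurat's implementation. The points stand in for cells in PCA space and are made up, the helper names are hypothetical, and the edge weight is assumed to be the Jaccard overlap of the two neighbour sets. The last step uses connected components as a deliberately simple stand-in for a real community detection algorithm.

```python
import math

def knn_sets(points, k):
    """For each point, the indices of its k nearest neighbours plus itself."""
    neighbours = []
    for p in points:
        order = sorted(range(len(points)), key=lambda j: math.dist(p, points[j]))
        # With no duplicate points, order[0] is the point itself (distance 0).
        neighbours.append(set(order[:k + 1]))
    return neighbours

def snn_graph(points, k, min_overlap):
    """KNN edges re-weighted by neighbourhood overlap; weak edges are dropped."""
    nn = knn_sets(points, k)
    edges = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if j in nn[i] or i in nn[j]:  # pair is linked in the KNN graph
                weight = len(nn[i] & nn[j]) / len(nn[i] | nn[j])  # Jaccard overlap
                if weight >= min_overlap:
                    edges[(i, j)] = weight
    return edges

def connected_components(n, edges):
    """Stand-in for community detection: one label per connected component."""
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    for i, j in edges:
        parent[find(j)] = find(i)
    return [find(a) for a in range(n)]

# Made-up cells in a 2-D "PCA space": two well-separated groups of four.
cells = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8), (8, 9), (9, 8), (9, 9)]
edges = snn_graph(cells, k=3, min_overlap=0.5)
clusters = connected_components(len(cells), edges)
print(clusters)
```

On real data the groups are rarely this cleanly separated, so a modularity-based method is the usual choice for the final step; the overlap threshold `min_overlap` here is an arbitrary illustrative value.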