scPopCorn Gets to the Kernel of Single-Cell Experiments: NLM–Made Algorithm Helps NIH Researchers

Arrows lead from a tissue sample from a healthy mouse to multiple cell types (astrocyte, neuron, microglia, OPC, and oligodendrocyte).
Arrows also lead from a tissue sample from a sick mouse to cell types (cancer, OPC, and oligodendrocyte).
In the center of the figure, arrows connect both the astrocyte and the neuron to the sick mouse's cancer cell, suggesting that the cancer cell is the same cell type as the astrocyte and neuron.
scPopCorn identifies individual cells by determining which genes are being expressed, and can compare the same cell types in other animals.

Researchers at the National Library of Medicine’s National Center for Biotechnology Information have created a new algorithm called scPopCorn (single-cell subpopulations comparison) to capture the differences among populations of cells from single-cell experiments. The algorithm, developed by my team, is available on GitHub and is described in an article in Cell Systems (Y. Wang, J. Hoinka, and T.M. Przytycka, Cell Syst 8:506–513, 2019).

The most frequently performed analyses of single-cell RNA sequencing (scRNA-seq) datasets include the identification of subpopulations of cells in scRNA-seq experiments and the comparison of such subpopulations across experiments. In multicellular organisms, different cell types execute different transcriptional programs expressing different sets of genes. Current experimental techniques can measure gene expression at single-cell resolution, making it possible to address questions that could not be answered with standard bulk experiments in which the total gene expression from a heterogeneous cell population was measured.

Single-cell transcriptomics opens a window to a better understanding of how the functioning of cell populations changes across different states and conditions, including diseases. However, these measurements are unfortunately still quite noisy, and new computational methods are required to gain important insights from them.

To address this need, my team leveraged several new algorithmic ideas and introduced the computational method scPopCorn. Unlike previous methods that treated the identification of cell types and their comparison across experiments as two separate tasks, scPopCorn identifies subpopulations of cells in individual experiments and maps them across experiments simultaneously, by incorporating these two tasks into one complex optimization problem.

The optimization involves a measure of the homogeneity of a cell population (population consistency), which, when combined with a technique much like Google’s personalized PageRank approach, guides subpopulation detection. (PageRank is Google’s algorithm for ranking web pages in search engine results.)

In addition, a cell-to-cell similarity measure is used to guide the mapping of subpopulations across experiments. In the scPopCorn method, my team substituted a cell-to-cell expression similarity graph for the network of web pages and, for each cell, estimated its preference (a “vote”) for which other cells should be included in the same subpopulation.
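
To make the "vote" idea concrete, here is a minimal sketch of generic personalized PageRank run on a toy cell-to-cell similarity graph. It is an illustration under stated assumptions, not the scPopCorn implementation: the function name personalized_pagerank, the damping value of 0.85, the six-cell similarity matrix, and the helper toy_similarity are all hypothetical, and the real method embeds this kind of score in a larger optimization described in the Cell Systems article.

```python
# Illustrative sketch only: generic personalized PageRank on a toy
# cell-to-cell similarity graph. Not the scPopCorn implementation.


def personalized_pagerank(similarity, seed, damping=0.85, tol=1e-10, max_iter=1000):
    """Score every cell by how strongly the seed cell "votes" for it.

    similarity: square matrix (list of lists) of non-negative edge weights.
    seed: index of the cell whose preferences are being estimated.
    damping: probability of following an edge instead of restarting at
             the seed (0.85 is an assumed, conventional value).
    """
    n = len(similarity)

    # Row-normalize the weights so each cell's outgoing probabilities sum to 1.
    transition = []
    for row in similarity:
        total = sum(row)
        if total == 0:
            # An isolated cell sends all of its mass back to the seed.
            transition.append([1.0 if j == seed else 0.0 for j in range(n)])
        else:
            transition.append([w / total for w in row])

    # The random walk restarts only at the seed cell; this is what makes
    # the ranking "personalized".
    restart = [1.0 if i == seed else 0.0 for i in range(n)]
    scores = restart[:]

    for _ in range(max_iter):
        updated = [
            (1 - damping) * restart[j]
            + damping * sum(scores[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        converged = sum(abs(a - b) for a, b in zip(updated, scores)) < tol
        scores = updated
        if converged:
            break
    return scores


def toy_similarity(groups, within=1.0, across=0.05):
    """Build a hypothetical similarity matrix: strong within a group, weak across."""
    n = len(groups)
    return [
        [0.0 if i == j else (within if groups[i] == groups[j] else across)
         for j in range(n)]
        for i in range(n)
    ]


# Six made-up cells: cells 0-2 resemble each other, as do cells 3-5.
similarity = toy_similarity(["A", "A", "A", "B", "B", "B"])
votes = personalized_pagerank(similarity, seed=0)

for cell, score in enumerate(votes):
    print(f"cell {cell}: {score:.3f}")
```

In this toy graph, the walk that starts at cell 0 spends most of its time on cells 0 to 2, so those cells receive higher scores than cells 3 to 5. That is the sense in which a cell "votes" for the cells that belong in its own subpopulation.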

This integrative approach helps researchers confidently define both the common and unique cell types across many experiments. Scientists can use this method to understand and map the differences among populations of cells that differ in disease status, developmental stage, sex, or species. In particular, they can use the algorithm to identify which cell types in such single-cell experiments are similar and which are distinct.

This new computational method, scPopCorn, not only provides a highly accurate approach to identifying subpopulations and mapping them across experiments, but also introduces mathematical concepts that can serve as stepping stones for other tools to interrogate the relationships among single cells.


(NIH authors: Y. Wang, J. Hoinka, and T.M. Przytycka, Cell Syst 8:506–513, 2019; DOI:10.1016/j.cels.2019.05.007)

By Teresa Przytycka, PhD. Dr. Przytycka leads the Algorithmic Methods in Computational and Systems Biology section of the Computational Biology Branch at NLM’s National Center for Biotechnology Information (NCBI). This article originally appeared in the November/December issue of the NIH Catalyst. It is reposted with permission.
