Recent advances in single-cell RNA sequencing (scRNA-seq) have enabled large-scale transcriptional characterization of thousands of cells across multiple complex tissues, making accurate cell type identification a prerequisite for scRNA-seq studies. Currently, the common practice in cell type annotation is to manually match the highly expressed marker genes of each identified cluster against known cell markers, which requires prior knowledge and is subjective in the choice of marker genes. Such manual annotation is also time-consuming.
To address these problems, we introduce the single cell Cluster-based Annotation Toolkit for Cellular Heterogeneity (scCATCH), which covers the workflow from cluster marker gene identification to cluster annotation. Annotation relies on an evidence-based score obtained by matching the identified potential marker genes with known cell markers in CellMatch, a tissue-specific cell taxonomy reference database.
CellMatch includes a panel of 353 cell types and 686 related subtypes associated with 184 tissue types, supported by 2,096 references for human and mouse.
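To make the matching idea concrete, here is a minimal base R sketch that counts, for one cluster, how many of its candidate marker genes appear among the known markers of each cell type in a small reference table. It is not the scoring scheme scCATCH uses. The reference table, its column names and the gene lists are hypothetical and only mimic the kind of information held in CellMatch.

```r
# Hypothetical mini reference: one row per (cell type, marker gene) pair.
# This is NOT the real CellMatch table; names and contents are made up.
reference <- data.frame(
  celltype = c("T cell", "T cell", "B cell", "B cell", "Monocyte"),
  gene     = c("CD3E",   "CD3D",   "CD19",   "MS4A1",  "LYZ"),
  stringsAsFactors = FALSE
)

# Candidate marker genes found for one cluster (also made up).
cluster_markers <- c("CD3E", "CD3D", "LYZ", "GAPDH")

# Keep the reference rows whose gene is among the cluster markers,
# then count the matches per cell type.
matched <- reference[reference$gene %in% cluster_markers, ]
overlap <- sort(table(matched$celltype), decreasing = TRUE)

# Simplistic illustrative call: the cell type with the most matched markers.
best_match <- names(overlap)[1]
print(overlap)
print(best_match)
```

A plain overlap count like this ignores how well each marker is supported in the literature, which is the part an evidence-based score is meant to capture.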
# install from CRAN
install.packages("scCATCH")
# or install devtools, then install from GitHub
install.packages(pkgs = 'devtools')
devtools::install_github('ZJUFanLab/scCATCH')
scCATCH provides two main functions, findmarkergene() and findcelltype(), which together annotate each identified cluster automatically, as detailed below:
# sc_data is the scRNA-seq data matrix
# sc_cluster is a character vector containing the cluster information
obj <- createscCATCH(data = sc_data, cluster = sc_cluster)
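# find marker genes for each cluster
# species, marker, tissue and cancer are placeholders; set them first and pass them by name
obj <- findmarkergene(obj, species = species, marker = marker, tissue = tissue, cancer = cancer)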
# find cell type for each cluster
obj <- findcelltype(obj)
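The two inputs described above can be mocked up in base R to check their shapes before calling createscCATCH(). The sketch below is an illustration only: the gene symbols, cell names and cluster labels are made up, and it assumes that rows are genes and columns are cells, which is worth confirming against the package documentation.

```r
# Toy inputs for illustration only; gene, cell and cluster names are made up.
set.seed(1)
genes <- c("CD3E", "CD19", "NKG7", "LYZ")
cells <- paste0("cell", 1:6)

# Assumed layout: one row per gene, one column per cell.
sc_data <- matrix(
  rpois(length(genes) * length(cells), lambda = 2),
  nrow = length(genes),
  dimnames = list(genes, cells)
)

# One cluster label per cell, stored as character (not factor or numeric).
sc_cluster <- as.character(c(1, 1, 2, 2, 3, 3))

# Sanity checks on the inputs before building the object.
stopifnot(
  is.character(sc_cluster),
  length(sc_cluster) == ncol(sc_data),
  !is.null(rownames(sc_data))
)

# With scCATCH installed, the object would then be created as shown earlier:
# obj <- createscCATCH(data = sc_data, cluster = sc_cluster)
```

Checking that the cluster vector has one entry per column catches the most common mismatch, a cluster vector taken from a differently filtered set of cells.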
For more details, please refer to the documentation and the tutorial vignette. Available tissues and cancers are listed on the wiki page.
- Now available on CRAN
- Allow users to use custom cellmatch
- Allow users to select different combinations of tissues or cancers for annotation
- Allow users to add more marker genes to cellmatch for annotation
- Allow users to use markers from species other than human and mouse
- Allow users to use more methods to identify highly expressed genes
- Create scCATCH object from Seurat object with the following code:
obj <- createscCATCH(data = Seurat_obj[['RNA']]@data, cluster = as.character(Idents(Seurat_obj)))
Please cite us as Shao et al., scCATCH: Automatic Annotation on Cell Types of Clusters from Single-Cell RNA Sequencing Data, iScience, Volume 23, Issue 3, 27 March 2020. doi: 10.1016/j.isci.2020.100882. PMID: 32062421