Automated Cell Recognition Using Single-cell RNA sequencing with Machine Learning

This project investigates and summarizes the strengths and limitations of different dimensionality reduction schemes and classification methods on specific single-cell RNA sequencing (scRNA-seq) datasets.

Introduction

Background

Although scRNA-seq captures differential information at the cellular level better than earlier transcriptome analysis methods such as bulk RNA-seq, technical errors across cells introduced during data acquisition, along with other limitations, make it difficult for researchers to balance data pre-processing against information retention. To address this, this report explores and tests several relatively mature dimensionality reduction schemes, including t-SNE, PCA, and combinations of multiple algorithms. It uses the accuracy that machine-learning-based classifiers achieve on cell classification tasks as the base metric for a comprehensive comparison and evaluation.
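The comparison described above can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy, uses a synthetic matrix in place of real expression data, and picks a k-nearest-neighbors classifier purely for illustration; the component counts, perplexity, and the helper name `holdout_accuracy` are hypothetical and are not the settings used in the report.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for an expression matrix (assumption, not real data):
# 3 "cell types", 60 cells each, 200 "genes".
n_per_type, n_genes = 60, 200
centers = rng.normal(0.0, 3.0, size=(3, n_genes))
X = np.vstack([c + rng.normal(0.0, 1.0, size=(n_per_type, n_genes)) for c in centers])
y = np.repeat(np.arange(3), n_per_type)

# Step 1: PCA compresses the gene space to a few linear components.
X_pca = PCA(n_components=20, random_state=0).fit_transform(X)

# Step 2: t-SNE embeds the PCA output into 2 dimensions.
# scikit-learn's TSNE has no transform() for unseen cells, so it is fit on
# all cells here; this sketch makes the same simplification for PCA.
X_tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0).fit_transform(X_pca)


def holdout_accuracy(features, labels):
    """Train a k-NN classifier on 70% of the cells and score it on the rest."""
    f_train, f_test, l_train, l_test = train_test_split(
        features, labels, test_size=0.3, stratify=labels, random_state=0
    )
    clf = KNeighborsClassifier(n_neighbors=5).fit(f_train, l_train)
    return accuracy_score(l_test, clf.predict(f_test))


acc_pca = holdout_accuracy(X_pca, y)
acc_tsne = holdout_accuracy(X_tsne, y)
print(f"PCA (20 dims): {acc_pca:.3f}  PCA + t-SNE (2 dims): {acc_tsne:.3f}")
```

Scoring every reduction scheme with the same classifier and the same split keeps the accuracy numbers comparable, which is the role the base metric plays in the report.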

Pipeline

This is the pipeline for the large-scale cell identification task, from raw data to final classification.
a. Labels and Reads Per Kilobase per Million mapped reads (RPKM) values.
b. Multiple dimensionality reduction methods, each applied with multiple target dimensions.
c. The implementation principle of the PCA + t-SNE combination algorithm.
d. Visualization in both 2 and 3 dimensions, with and without labels.
e. Multiple classifiers, each applied with multiple parameters.
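For readers unfamiliar with the input unit in step a, the sketch below shows the standard RPKM definition for a single cell. The dataset linked in this README is already processed, so this is only an illustration; the function name `rpkm` and the gene names are hypothetical.

```python
def rpkm(read_counts, gene_lengths_bp):
    """Reads Per Kilobase per Million mapped reads for one cell.

    RPKM = reads_for_gene * 1e9 / (total_mapped_reads * gene_length_in_bp)
    """
    total_reads = sum(read_counts.values())
    if total_reads == 0:
        raise ValueError("cell has no mapped reads")
    return {
        gene: count * 1e9 / (total_reads * gene_lengths_bp[gene])
        for gene, count in read_counts.items()
    }


# Hypothetical toy cell: geneB has three times the reads of geneA,
# but is also three times as long.
counts = {"geneA": 500, "geneB": 1500}
lengths = {"geneA": 1000, "geneB": 3000}
values = rpkm(counts, lengths)
print(values)
```

Because the normalization divides by gene length, the two genes in the toy cell end up with the same RPKM even though their raw read counts differ.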

Dataset

The reprocessed dataset that supports the conclusions of this paper is publicly available online at https://scquery.cs.cmu.edu/processed_data/.

Graphics

Contributors

Yuetian Chen - Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY, United States, 12180 (email: cheny63@rpi.edu)
Chenqi Xu - Southern University of Science and Technology, Shenzhen, China, 518055
Yiyang Cao - The University of British Columbia, Vancouver, BC, Canada, V6T 1Z4

Special Thanks

This research was undertaken as part of the CIS - Introduction to Machine Learning "Our Body" Project. Thanks to Prof. Ziv Bar-Joseph for his guidance and instruction in dataset pre-processing and paper refinement.

License

MIT License
