miRNA-based Prostate Cancer Classification

In this project, I train microRNA (miRNA)-based classification models to accurately identify prostate cancer. The dataset used in the project is available here.

Prostate cancer is one of the most common cancer types. While some types of prostate cancer grow slowly and may need only minimal or no treatment, other types are aggressive and can grow quickly.

Prostate cancer that is detected early has the best chance of successful treatment. However, the high false-positive rate of prostate-specific antigen (PSA) testing often leads to negative prostate biopsies, which do not definitively exclude the presence of cancer and often require further investigation.

Project Target

Comparing prostate cancer miRNA with healthy control miRNA, which may help identify the differences between the groups.
Comparing negative biopsy miRNA with healthy control miRNA, to understand how negative biopsy miRNA deviates from normal miRNA.
Training classifiers to detect prostate cancer and negative prostate biopsies.

Applied Methods:

Applying dimensionality reduction techniques, e.g., PCA and t-SNE, to understand differences among the groups in lower dimensions.
Applying clustering methods, e.g., k-means, to see whether the data form clusters.
Training classification models using the k-NN and Random Forest algorithms.
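
As an illustration of the last item, the following is a minimal k-NN sketch in plain Python. It is not the code from this repository, whose scripts are written in R. The function name `knn_predict`, the two features, and the toy values are all hypothetical and serve only to show the majority-vote idea.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training samples."""
    # Euclidean distance from the query to every training sample
    distances = [
        (math.dist(features, query), label)
        for features, label in zip(train_X, train_y)
    ]
    # Keep the k closest samples and count their labels
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up expression values for two hypothetical miRNA features
train_X = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [3.0, 3.2], [3.1, 2.9], [2.8, 3.0]]
train_y = ["control", "control", "control", "cancer", "cancer", "cancer"]

prediction = knn_predict(train_X, train_y, [2.9, 3.1], k=3)
print(prediction)
```

The value of `k` is the kind of hyperparameter that would typically be tuned with cross-validation rather than fixed in advance.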

Considerations for Classification

Evaluation metric: accuracy
Repeated cross-validation: 10 folds, repeated 10 times
Train/test split: 75%/25%
Hyperparameter tuning for the classifiers
Class imbalance resolution techniques, e.g., up-sampling and down-sampling
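
The evaluation setup above can be sketched roughly as follows in plain Python. This is an illustrative sketch under stated assumptions, not the repository's R code: the helper names (`stratified_split`, `repeated_kfold`, `upsample`) and the class counts are made up, the split is assumed to be stratified by class, and only up-sampling is shown.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx), holding out test_fraction of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def repeated_kfold(indices, n_folds=10, n_repeats=10, seed=0):
    """Yield (fit_idx, validation_idx) pairs: n_folds folds, reshuffled n_repeats times."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        shuffled = list(indices)
        rng.shuffle(shuffled)
        # Deal the shuffled indices into n_folds nearly equal folds
        folds = [shuffled[i::n_folds] for i in range(n_folds)]
        for held_out in range(n_folds):
            validation = folds[held_out]
            fit = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
            yield fit, validation


def upsample(indices, labels, seed=0):
    """Resample smaller classes with replacement until each matches the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Made-up class counts: 60 cancer samples and 20 controls
labels = ["cancer"] * 60 + ["control"] * 20
train_idx, test_idx = stratified_split(labels)   # 75%/25% split
splits = list(repeated_kfold(train_idx))         # 10 folds x 10 repeats
balanced_fit = upsample(splits[0][0], labels)    # balance the fitting part only
print(len(train_idx), len(test_idx), len(splits))
```

In this sketch, up-sampling is applied only to the fitting portion of each fold, so that duplicated samples never appear in the validation fold or in the held-out test set.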

Model Performance

The best-fitted model performed well for classification, with over 95% accuracy.
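
For reference, accuracy is the fraction of samples whose predicted label matches the true label. A minimal sketch in plain Python follows; the function name `accuracy` and the labels are made up for illustration and are not results from this project.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy of an empty list")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Made-up labels, not results from this project
y_true = ["cancer", "cancer", "control", "control", "cancer"]
y_pred = ["cancer", "control", "control", "control", "cancer"]
print(accuracy(y_true, y_pred))  # 4 of 5 correct -> 0.8
```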

Project Takeaways:

miRNA-based classification works fairly well for separating prostate cancer patients from patients with negative biopsies. This can improve early detection of prostate cancer, which is crucial for successful treatment.
Accuracy in distinguishing negative-biopsy patients from healthy individuals was also very good, which may enable early detection and close observation for the further development of prostate cancer.


ehsan-ashik/prostate-cancer-classification
