We developed OncoNPC (Oncology NGS-based Primary cancer type Classifier), a molecular cancer type classifier trained on multicenter targeted panel sequencing data. OncoNPC uses somatic alterations, including mutations (single nucleotide variants and indels), mutational signatures, and copy number alterations, together with patient age at the time of sequencing and sex, to jointly predict cancer type.
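
As a rough illustration of how such heterogeneous inputs can be combined, the sketch below assembles one patient's mutations, copy number alterations, mutational signature weights, age, and sex into a single numeric feature vector. The gene lists, signature names, encodings, and the `build_feature_vector` helper are hypothetical and chosen only for illustration; they are not the actual OncoNPC feature set or preprocessing code.

```python
from typing import Dict, List, Set

# Hypothetical feature vocabularies, for illustration only.
# The real OncoNPC features are described in the manuscript.
MUTATION_GENES = ["TP53", "KRAS", "EGFR"]
CNA_GENES = ["MYC", "CDKN2A"]
SIGNATURES = ["SBS4", "SBS7"]


def feature_names() -> List[str]:
    """Return column names in the same order as build_feature_vector."""
    names = [f"mut_{g}" for g in MUTATION_GENES]
    names += [f"cna_{g}" for g in CNA_GENES]
    names += [f"sig_{s}" for s in SIGNATURES]
    names += ["age", "sex_male"]
    return names


def build_feature_vector(
    mutated_genes: Set[str],
    cna_calls: Dict[str, int],
    signature_weights: Dict[str, float],
    age: float,
    sex: str,
) -> List[float]:
    """Encode one tumor sample as a flat list of floats."""
    features: List[float] = []
    # Binary indicator per gene: 1.0 if the gene carries an SNV or indel.
    features += [1.0 if g in mutated_genes else 0.0 for g in MUTATION_GENES]
    # Copy number call per gene (assumed here: -2 deep loss ... +2 amplification).
    features += [float(cna_calls.get(g, 0)) for g in CNA_GENES]
    # Mutational signature weights; missing signatures default to 0.
    features += [float(signature_weights.get(s, 0.0)) for s in SIGNATURES]
    # Clinical covariates: age at sequencing and a binary sex indicator.
    features.append(float(age))
    features.append(1.0 if sex == "male" else 0.0)
    return features


example = build_feature_vector(
    mutated_genes={"KRAS"},
    cna_calls={"CDKN2A": -2},
    signature_weights={"SBS4": 0.6},
    age=63,
    sex="female",
)
```

A matrix of such rows, one per tumor sample, is the kind of input a gradient-boosted tree classifier such as xgboost can take together with the cancer type labels.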
We used the following software:
- R (v4.0.2) and Python (v3.9.13) programming languages
- OncoNPC somatic mutation processing (R deconstructSigs v1.8.0)
- OncoNPC model development and interpretation (Python xgboost v1.2.0, shap v0.41.0)
- Survival analysis (R survival v3.2.7, stats v4.0.2, Python lifelines v0.27.4, scipy v1.7.1)
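
To compare a local Python environment against the versions listed above, something like the following sketch may help. It relies only on the standard library; the `check_versions` helper and the `PINNED` dictionary are illustrative names and are not part of the OncoNPC code base.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional, Tuple

# Python package versions copied from the list above.
PINNED = {
    "xgboost": "1.2.0",
    "shap": "0.41.0",
    "lifelines": "0.27.4",
    "scipy": "1.7.1",
}


def check_versions(
    pinned: Dict[str, str],
) -> Dict[str, Tuple[str, Optional[str], bool]]:
    """Map each package to (expected, installed or None, versions match)."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed: Optional[str] = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (expected, installed, installed == expected)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in check_versions(PINNED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {expected}, found {installed} [{status}]")
```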
See this notebook example and our manuscript.
Citation:

@article{moon2023machine,
  title={Machine learning for genetics-based classification and treatment response prediction in cancer of unknown primary},
  author={Moon, Intae and LoPiccolo, Jaclyn and Baca, Sylvan C and Sholl, Lynette M and Kehl, Kenneth L and Hassett, Michael J and Liu, David and Schrag, Deborah and Gusev, Alexander},
  journal={Nature Medicine},
  pages={1--11},
  year={2023},
  publisher={Nature Publishing Group US New York}
}