Paper: Aida Yazdanparast, Lang Li*, Chi Zhang and Lijun Cheng*. Bi-EB: Empirical Bayesian Biclustering for Multi-Omics Data Integration Pattern Identification among Species. Genes. 2022; 13(11):1982. https://doi.org/10.3390/genes13111982
Biclusters for (a) the breast cancer luminal subtype and (b) the breast cancer basal-like subtype. Red indicates a higher probability, and green a lower probability, of belonging to the bicluster.
Bi-EB, a novel biclustering method based on an empirical Bayesian algorithm, is designed to search for coherent and flexible co-regulation patterns across mRNA and protein in both patient tumors and cancer cell lines. It proposes, for the first time, a transparent probabilistic interpretation and a ratio strategy for omics data to detect the co-regulation patterns of drug targets and identify their associated molecular functions.
Genomic and molecular features shared between cell lines and tumors give insight into potential drug targets for cancer patients. Our previous studies demonstrate that important drug targets in breast cancer (ESR1, PGR, HER2, EGFR, and AR) show highly similar mRNA and protein variation in both tumors and cell lines [1-2]. Based on these studies, we hypothesize that there exist translational gene sets characterized by highly correlated molecular profiles between RNA and protein, that these gene sets are shared between tumor tissues and cancer cell lines, and that they show a similar pattern in a subgroup of cell line and tissue samples. In this study, we aim to integrate cell line and tissue RNA and protein profiles to characterize druggable target expression alterations across both RNA and protein data using biclustering. We developed a biclustering method based on empirical Bayes (Bi-EB) to detect local patterns in integrated omics data in both cancer cells and tumors. We adopt a data-driven statistical strategy, using the Expectation-Maximization (EM) algorithm to separate the foreground bicluster pattern from background noise in an iterative search. The Bi-EB statistical model has a better chance of detecting co-occurring patterns of gene and protein expression variation than existing biclustering algorithms, and of finding the co-regulated modules of drug targets.
[1] Jiang GL, Zhang SJ, Yazdanparast A, Li M, Vikram Pawar A, Liu YL, Inavolu SM, Cheng LJ. Comprehensive comparison of molecular portraits between cell lines and tumors in breast cancer. BMC Genomics, 2016, 17(7), 281-301.
[2] Yazdanparast A, Li L, Radovich M, Cheng LJ. Signal translational efficiency between mRNA expression and antibody-based protein expression for breast cancer and its subtypes from cell lines to tissue. International Journal of Computational Biology and Drug Design, 2018, 11(1-2), 67-89.
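The idea of separating a foreground pattern from background noise with EM can be illustrated with a generic toy example. The sketch below is not the Bi-EB model itself: it fits a plain two-component, one-dimensional Gaussian mixture and returns, for each value, the estimated probability of belonging to the foreground component. The function names, the initialisation, and the synthetic data are all assumptions made for illustration only.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_two_component(values, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture with EM (illustrative only).

    Returns (posteriors, params). posteriors[i] is the estimated probability
    that values[i] belongs to the "foreground" component, which is the one
    initialised near the largest values.
    """
    lo, hi = min(values), max(values)
    # Crude initialisation (an assumption): background starts at the minimum,
    # foreground at the maximum, both with the same spread.
    mean_bg, mean_fg = lo, hi
    sd_bg = sd_fg = (hi - lo) / 4.0 or 1.0
    weight_fg = 0.5
    post = [0.5] * len(values)

    for _ in range(n_iter):
        # E-step: posterior probability of the foreground for each value.
        post = []
        for x in values:
            p_fg = weight_fg * normal_pdf(x, mean_fg, sd_fg)
            p_bg = (1.0 - weight_fg) * normal_pdf(x, mean_bg, sd_bg)
            total = p_fg + p_bg
            post.append(p_fg / total if total > 0 else 0.5)

        # M-step: re-estimate weights, means and standard deviations.
        n_fg = sum(post)
        n_bg = len(values) - n_fg
        if n_fg < 1e-9 or n_bg < 1e-9:
            break  # one component has collapsed; stop updating
        mean_fg = sum(p * x for p, x in zip(post, values)) / n_fg
        mean_bg = sum((1.0 - p) * x for p, x in zip(post, values)) / n_bg
        var_fg = sum(p * (x - mean_fg) ** 2 for p, x in zip(post, values)) / n_fg
        var_bg = sum((1.0 - p) * (x - mean_bg) ** 2 for p, x in zip(post, values)) / n_bg
        sd_fg = max(math.sqrt(var_fg), 1e-3)  # floor avoids a zero spread
        sd_bg = max(math.sqrt(var_bg), 1e-3)
        weight_fg = n_fg / len(values)

    params = {"mean_fg": mean_fg, "sd_fg": sd_fg,
              "mean_bg": mean_bg, "sd_bg": sd_bg, "weight_fg": weight_fg}
    return post, params


# Synthetic demo data: 200 "background" values and 50 shifted "foreground" values.
random.seed(0)
demo = ([random.gauss(0.0, 1.0) for _ in range(200)]
        + [random.gauss(5.0, 1.0) for _ in range(50)])
demo_post, demo_params = em_two_component(demo)
print(demo_params)
```

The posterior probabilities play the same role as the red/green membership probabilities in the figure: values close to 1 indicate likely foreground membership, values close to 0 indicate background. For the actual model and its two-dimensional (gene by sample) search, see the paper and the example code provided with Bi-EB.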
Supplementary data cover the systematic patterns of the (gene expression/protein amount) ratio, based on The Cancer Genome Atlas (TCGA) and Cancer Cell Line Encyclopedia (CCLE) breast cancer data.
-Multiple omics data integration
-Biclustering across multiple omics data types and species.
-Easy to use. We provide an example of how to use Bi-EB, including example data and code.
-Systematic patterns of the (gene expression/protein amount) ratio are found
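
As a rough illustration of the (gene expression/protein amount) ratio mentioned above, the sketch below computes a per-gene, per-sample ratio from two small tables. The nested-dictionary layout, the log2 transform, the pseudocount, and the function name are assumptions for this example, not the preprocessing used by Bi-EB, and the input values are made up.

```python
import math


def expression_protein_log_ratio(mrna, protein, pseudocount=1.0):
    """Return log2((mRNA + c) / (protein + c)) for shared genes and samples.

    mrna and protein map gene -> {sample -> non-negative abundance}.
    Genes or samples present in only one table are skipped.
    """
    ratios = {}
    for gene in mrna.keys() & protein.keys():
        shared_samples = mrna[gene].keys() & protein[gene].keys()
        ratios[gene] = {
            sample: math.log2((mrna[gene][sample] + pseudocount)
                              / (protein[gene][sample] + pseudocount))
            for sample in shared_samples
        }
    return ratios


# Hypothetical toy values, for illustration only.
toy_mrna = {"ESR1": {"s1": 7.0, "s2": 3.0}, "PGR": {"s1": 1.0}}
toy_protein = {"ESR1": {"s1": 3.0, "s2": 3.0}, "EGFR": {"s1": 2.0}}
toy_ratios = expression_protein_log_ratio(toy_mrna, toy_protein)
print(toy_ratios)
```

A gene-by-sample matrix of such ratios is the kind of input in which a biclustering method can look for subgroups of genes and samples that share a pattern.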