Date: Aug 10-14, 2020
Five-day workshop, primarily in R, occasionally in Python
Morning (9am-12pm): tutorial (required; Thursday starts at 8:00am)
Afternoon (1-3pm): lab & Q&A (optional)
- Bin Chen
- Ke Liu
- Jing Xing
- Eugene Chekalin
- Mengying Sun
- Jiayu Zhou (invited)
- Yuehua Cui (invited)
- Paul Egeler (invited)
- Eric Kort (invited)
(two sessions, Bin) Day 1: 9-11am
- Data structure
- Matrix (data.frame, matrix)
- Network (igraph, Cytoscape, STRINGdb, KEGG)
- Unstructured data (text, clinical notes, regular expression patterns)
- Data modality
- Omics data (genomics, transcriptomics, proteomics, metabolomics)
- Screening data (CRISPR, pharmacogenomics)
- Image data (morphology images)
- Knowledge graph (PPI)
- EMR
- Models
- cells/organoids/animal models/patients
- Public data resources
- Basic R (read/write, data types)
- Workshop Structure
(two sessions, Eugene) Day 1: 11am-12pm, Day 2: 9-10am
- Apply (lapply, sapply, by)
- Collapse/Summarize data (dplyr)
- PCA/t-SNE/UMAP (variation)
- Clustering (hclust)
- ggplot2 (boxplot, violin plot, scatter plot, error bar)
- Heatmap (annotation, clustering, ComplexHeatmap)
- Publication-ready figures (font, background, legend)
(two sessions, Yuehua) Day 2: 10am-12pm
- Correlation analysis
- Continuous data (Spearman, Pearson)
- Categorical data (Fisher's exact test, chi-square test)
- Linear regression and logistic regression (odds ratio)
- Confounding factors
- P value and FDR (p-value correction)
- Survival analysis (KM plot, Hazard Ratio)
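As a small illustration of the p-value correction topic above, here is a minimal sketch of the Benjamini-Hochberg FDR adjustment in plain Python. The function name `benjamini_hochberg` and the example p-values are made up for illustration and are not workshop material; in R, `p.adjust(p, method = "BH")` performs the same adjustment.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four tests (illustration only).
p_values = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(p_values)
print(adjusted)
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps a smaller raw p-value from ending up with a larger adjusted value than the one ranked after it.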
(two sessions, Jiayu) Day 3: 9-11am
- Intro to ML
- Perceptron and Deep Learning
- Tree Methods
- Random Forest
- AdaBoost
- Gradient Boosting
(two sessions, Ke) Day 3: 11am-12pm, Day 4: 8-9am
- Sequence alignment
- DE analysis (edgeR, DESeq, limma-voom)
- Enrichment analysis (GSEA, ssGSEA, Enrichr, hypergeometric test, pathway databases)
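To make the hypergeometric test listed above concrete, the following is a minimal sketch of an over-representation p-value using only the Python standard library. The function name `hypergeom_enrichment_p`, its parameter names, and the gene counts are assumptions chosen for illustration; they do not come from the workshop materials.

```python
from math import comb


def hypergeom_enrichment_p(overlap, gene_set_size, hits, universe):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    universe      : number of genes in the background
    gene_set_size : number of background genes annotated to the pathway
    hits          : number of genes in the query list (e.g. DE genes)
    overlap       : number of query genes that fall in the pathway
    """
    upper = min(gene_set_size, hits)
    total = comb(universe, hits)
    # Sum the probability of every overlap at least as large as observed.
    tail = sum(
        comb(gene_set_size, k) * comb(universe - gene_set_size, hits - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 in the pathway,
# 4 query genes, 2 of which are in the pathway.
p = hypergeom_enrichment_p(overlap=2, gene_set_size=5, hits=4, universe=20)
print(round(p, 4))
```

The equivalent call in R would be `phyper(overlap - 1, gene_set_size, universe - gene_set_size, hits, lower.tail = FALSE)`.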
(two sessions, Eric) Day 4: 9-11am
- Basic (biological theory, common platforms)
- Sequence analysis (alignment, counting, QC, normalization)
- Down-stream analysis (dimension reduction, visualization, cell type assignment, and RNA velocity)
(one session, Bin) Day 4: 11am-12pm
- Structure representation (SMILES, SD)
- Structure embedding (fingerprint, pharmacophore)
- Drug-induced gene expression profiles
- Systems-based drug discovery (OCTAD)
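As a rough illustration of how fingerprint embeddings are typically compared, here is a minimal sketch of Tanimoto similarity between two compounds, each represented by the set of "on" bit positions in its fingerprint. The bit positions are invented for illustration; real fingerprints would be computed from a structure (for example a SMILES string) by a cheminformatics toolkit.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        # Two empty fingerprints: define the similarity as 0 to avoid 0/0.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical on-bit positions for two compounds (illustration only).
fp_1 = {1, 5, 9, 12}
fp_2 = {1, 5, 7, 12, 20}
similarity = tanimoto(fp_1, fp_2)
print(similarity)
```

The score is the number of shared on-bits divided by the number of bits set in either fingerprint, so it ranges from 0 (nothing shared) to 1 (identical fingerprints).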
(one session, Jing) Day 5: 9-10am
- Protein structure/3D structure
- Docking
- Target fishing
(one session, Paul) Day 5: 10-11am
(Bin) Day 5: 11am-12pm