Modern systems biology methods make heavy use of algorithms that automatically determine useful data features and tune algorithm parameters to maximize accuracy. This lecture will introduce basic formulations of machine learning problems, including supervised and unsupervised learning problems, and study their applications in genomics and systems biology.
Classification and regression, support vector machines, randomized decision forests, feature selection, neural nets, regularization, K-means clustering, PCA, t-SNE
Students will learn standard formalisms for specifying machine learning tasks and basic mathematical and algorithmic tools for learning predictors from data.
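As a rough illustration of what "learning a predictor from data" means, the sketch below fits a one-dimensional linear predictor f(x) = w*x + b by least squares using only the Python standard library. The function names (`fit_line`, `predict`) and the toy data are made up for this example and are not taken from the course materials.

```python
# Supervised learning in miniature: given pairs (x_i, y_i), choose the
# parameters (w, b) of f(x) = w*x + b that minimize the squared error.
from statistics import mean


def fit_line(xs, ys):
    """Closed-form least-squares fit of y ~ w*x + b for 1-D inputs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    w = sxy / sxx
    b = y_bar - w * x_bar
    return w, b


def predict(w, b, x):
    """Apply the learned predictor to a new input."""
    return w * x + b


# Toy, made-up training data generated from y = 2x + 1 without noise.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_line(xs, ys)
print(w, b, predict(w, b, 4.0))
```

The same pattern (a family of predictors, a loss, and a procedure that picks parameters to minimize the loss on training data) carries over to the richer models listed in the keywords.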
Python version 3.6 or higher installed through Anaconda is recommended: https://docs.anaconda.com/anaconda/install/
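One way to confirm that the installed interpreter meets the recommended minimum is a short check like the following. It is only a sketch; the helper name `meets_minimum` is hypothetical and not part of Anaconda or the course materials.

```python
import sys


def meets_minimum(version_info=sys.version_info, minimum=(3, 6)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


# Report the running interpreter's version and whether it is recent enough.
status = "OK" if meets_minimum() else "older than 3.6"
print(sys.version.split()[0], status)
```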
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction, T. Hastie, R. Tibshirani, J. Friedman http://statweb.stanford.edu/~tibs/ElemStatLearn/
- Machine Learning: A Probabilistic Perspective, Kevin Murphy https://www.cs.ubc.ca/~murphyk/MLbook/
- Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville https://www.deeplearningbook.org/