This repository is for a special session of the regular Intro to Python materials held for staff in HDC. Please see the main repository for updated information.
Description: This four-week course introduces attendees to Python programming and its broad applications. Each two-hour session includes brief tutorials interspersed with challenge exercises and assumes attendees have no prior coding experience. By the end of this course, you will be able to use Python to import, manipulate, and visualize data, and you will understand basic principles of machine learning.
The core fredhutch.io Intro to Python materials were adapted from content originally appearing in Python for Ecologists, Copyright (c) Data Carpentry. These materials were modified to suit the needs of Hutch Data Commonwealth: the regular week 4 material on ggplot-style plotting with plotnine has been replaced with machine learning. There are also minor modifications to the content in weeks 1 through 3.
Software requirements for this course can be found on fredhutch.io's Software page.
- Week 1: Intro to Python, Jupyter notebooks, and data types
- Week 2: Using pandas to explore data frames
- Week 3: Extracting data from data frames
- Week 4: Machine learning
- Materials for weeks 1-3 are described in the Python script prefixed with the number of the week. The remaining notebooks represent material from week 4, with relevant explanatory content in `slides/`.
- Data used for this lesson are identical to those used in Introduction to R; details on obtaining these data from the National Cancer Institute's Genomic Data Commons can be found in that lesson repository.
- `exercises/` includes a file for each week containing both the aggregated in-class exercises and additional supplemental exercises for practice
- `solutions/` includes the solutions for all files in `exercises/`
- `resources.md` includes useful links mentioned during lessons; additional information about continued learning in Python as well as Hutch-specific resources can be found on the Data Science Wiki
- `hackmdio.txt` is an archive of the interactive webpage used during lessons