This is a repo for me to work through this course for fun/learning. I'm more or less following the curriculum (skipping topics I already know), but sourcing my own data so it's more hands-on and more relevant to other stuff I'm working on.
1_api.ipynb <> Week 1: API and request library
3_datawrangling.ipynb <> Week 3 & 4: data wrangling
Spring 2021: Tues/Thurs 9.30-10.45
Instructor: Adam Millard-Ball, he/him
Office hours: Mondays 2.30-4.30. Sign up here
About this course: New data sources are a potential goldmine for urban planners and policy makers. But sometimes they are large, sometimes they are messy, sometimes they are awkward to access, and often they are all of these things. In this hands-on course, we’ll develop skills in scraping, processing, and managing urban data, and using tools such as natural language processing, geospatial analysis, and machine learning. We’ll use examples from transit, housing, and equity planning, and build competence in open-source tools and languages such as Python and SQL. We’ll also consider the limits to data science, and the biases and pitfalls that "big data" can entail.
Prerequisites: Basic Python programming experience, for example through the 2020-21 version of 206A (Introduction to Geographic Information Systems and Spatial Data Science), or an introductory Python course. One good, free option is offered by Data Carpentry. Another is the University of Michigan Introduction to Data Science in Python course; if you have no prior knowledge, you should take Programming for Everyone first. (You can take these for free if you choose the "audit" option.) Whichever option you choose, before starting this course you should be familiar with Python syntax, Jupyter notebooks, plotting via matplotlib, and pandas and geopandas dataframes.
Week 5: Natural language processing (1): parsing
Week 6: Natural language processing (2): topic modeling and sentiment analysis
Week 7: Machine learning (1): supervised algorithms
Week 8: Machine learning (2): unsupervised algorithms
Week 10: Big data, privacy, and ethics
- Makeup homework, due 9am, June 15