gohsu/UrbanDataScience


This is a repo for me to work through this course for fun and learning. I'm more or less following the curriculum (skipping topics I already know), but sourcing my own data to make it more hands-on and more relevant to other things I'm working on.

Contents

  • 1_api.ipynb <> Week 1: APIs and the requests library
  • 3_datawrangling.ipynb <> Weeks 3 & 4: data wrangling

UP229 Urban Data Science

Spring 2021: Tues/Thurs 9.30-10.45

Instructor: Adam Millard-Ball, he/him

Office hours: Mondays 2.30-4.30. Sign up here

About this course: New data sources are a potential goldmine for urban planners and policy makers. But sometimes they are large, sometimes they are messy, sometimes they are awkward to access, and often they are all of these things. In this hands-on course, we’ll develop skills in scraping, processing, and managing urban data, and using tools such as natural language processing, geospatial analysis, and machine learning. We’ll use examples from transit, housing, and equity planning, and build competence in open-source tools and languages such as Python and SQL. We’ll also consider the limits to data science, and the biases and pitfalls that "big data" can entail.

Prerequisites: Basic Python programming experience, for example through the 2020-21 version of 206A (Introduction to Geographic Information Systems and Spatial Data Science), or an introductory Python course. One good, free option is offered by Data Carpentry. Another is the University of Michigan Introduction to Data Science in Python course; if you have no prior knowledge, you should take Programming for Everyone first. (You can take these for free if you choose the "audit" option.) Whichever option you choose, before starting this course you should be familiar with Python syntax, Jupyter notebooks, plotting via matplotlib, and pandas and geopandas dataframes.

Course Schedule

Week 1: Introduction. APIs

Week 2: Web scraping

Weeks 3 and 4: Data wrangling

Week 5: Natural language processing (1): parsing

Week 6: Natural language processing (2): topic modeling and sentiment analysis

Week 7: Machine learning (1): supervised algorithms

Week 8: Machine learning (2): unsupervised algorithms

Week 9: Databases

Week 10: Big data, privacy, and ethics
