CIS 9440 NYC Rodent Issue

This repository stores the Python code for the Fall 2023 class project of CIS 9440 Data Warehousing and Analytics at Baruch College.
Professor: Isaac Vaghefi
Team members:

  • Komsit Rattana
  • Angela Lee
  • Derek Strang
  • Mariya Mithaiwala

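The project analyzes NYC 311 Service Requests data to draw insights from rodent complaints, combined with secondary data sources such as Open Restaurant applications, Restaurant Inspection results, 2020 Census Blocks, and NTA (Neighborhood Tabulation Area) boundaries.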

Technology used

  • Google Dataflow
  • Google BigQuery
  • Apache Beam
  • Python

Dimensional model

[Figure: CIS 9440 DW Project - Dimensional Model]

Project structure

  • location-transformation-pipeline
    • Prepares staging data for locations by extracting geolocations from NYC 311 Service Requests, Open Restaurant data, and Restaurant Inspection data
    • Resolves each geolocation to its NTA, census tract, and census block
  • nyc-2020-census-block-dataload
    • Extract Census Block to BigQuery
  • nyc-2020-neighborhood-dataload
    • Extract NTA boundaries to BigQuery
  • nyc-311-request-extract-pipeline
    • Extract NYC 311 Service Requests from API to BigQuery
    • Supports both snapshot and incremental loads
  • nyc-open-restaurant-application-dataload
    • Extract Open Restaurant data to BigQuery
  • nyc-restaurant-inspection-extract-pipeline
    • Extract Restaurant Inspection data from API to BigQuery
    • Supports both snapshot and incremental loads
  • sql
    • SQL files for BigQuery scheduled queries that refresh or incrementally load data into the tables
  • schemas
    • Schemas for all BigQuery tables
  • data
    • CSV files for static and historical data
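
The snapshot and incremental load modes listed for the extract pipelines can be illustrated with a minimal sketch. It is not taken from the repository code. It assumes the API is a Socrata-style endpoint that accepts `$where`, `$order`, `$limit`, and `$offset` query parameters, and that rows carry a `created_date` timestamp. The helper names `build_extract_params` and `next_watermark` are hypothetical.

```python
from datetime import datetime
from typing import Optional


def build_extract_params(
    watermark: Optional[datetime] = None,
    timestamp_field: str = "created_date",  # assumed column name
    page_size: int = 50000,
    offset: int = 0,
) -> dict:
    """Build query parameters for one page of an extract.

    No watermark: request the whole dataset (snapshot load).
    With a watermark: request only rows newer than it (incremental load).
    """
    params = {
        "$order": f"{timestamp_field} ASC",  # stable order for paging
        "$limit": str(page_size),
        "$offset": str(offset),
    }
    if watermark is not None:
        # Timestamp literal written as ISO 8601 without a timezone suffix.
        stamp = watermark.strftime("%Y-%m-%dT%H:%M:%S")
        params["$where"] = f"{timestamp_field} > '{stamp}'"
    return params


def next_watermark(rows, timestamp_field="created_date", previous=None):
    """Return the newest timestamp seen, to store for the next incremental run."""
    seen = [
        datetime.fromisoformat(row[timestamp_field])
        for row in rows
        if row.get(timestamp_field)
    ]
    if previous is not None:
        seen.append(previous)
    return max(seen) if seen else None
```

Calling `build_extract_params()` describes a snapshot request, while `build_extract_params(watermark=last_run)` restricts the request to rows created after the previous run. Storing the value returned by `next_watermark` after each run is one simple way to keep incremental loads from fetching the same rows again.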