
mdezube/property-assessments


Author: Mike Dezube on behalf of Charles River Data
Contact: inquiries@charlesriverdata.com

Read Me

Supports the ODSC Talk/Workshop Unlocking Insights in Home Values: A Multimillion-Row Journey with Polars and the associated article on Medium. If you want to dive straight into the Python notebook, you can open polars-gis-demo.ipynb.

Overview

Following the sections below, you'll be able to install all dependencies and download all the files needed to follow along with the workshop described in the Medium article Harnessing the Power of GIS and Python for Property Value Analysis at Scale.

Average Home Value by Town in Massachusetts


Setting up Your Environment

  1. Follow the Installation steps below to get your environment set up.
  2. Download the data following the instructions in Data Sources.
  3. Update the variables in the notebook to point to your new files (you'll see instructions in the second cell of the notebook).
  4. The notebook polars-gis-demo.ipynb should now run in full. Give it a shot!

Environment and dependencies

Using the commands below, you can set up a new conda environment named geo_explorer to hold your Python dependencies, and register it for use in jupyter-lab. Don't have conda yet? Download it here. Once it's installed, open a terminal (or the equivalent on Windows, where there are a number of options) and follow along below.

Conda can sometimes be slow to resolve dependencies, so at Charles River Data we like to use the libmamba solver for a ~10x speed boost.

You can install this in your base environment so it works for all environments (link above explains how), but for today, we'll just install it in our new one:

# Create a new conda environment and activate it.
conda create -n geo_explorer python=3.11
conda activate geo_explorer
conda install conda-libmamba-solver

# Install the dependencies you'll need to work with this data.
conda install geopandas jupyter pyogrio itables pyarrow seaborn plotly polars --channel conda-forge --solver=libmamba
pip install jupyter-black

# Register the new conda environment with jupyter.
python -m ipykernel install --user --name=geo_explorer

Data sources

All data comes from the Massachusetts Government Tax Assessments

  • Home Values as a CSV: The gov't site above only provides the assessment data embedded within GIS files (.shp and .gdb). As such, I'm making available via Google Drive a .csv extract of parcel data (as of 2023), created with ogr2ogr. This file format is easy to work with, and most of the notebook leverages it.
  • Diving deeper with GIS: Later parts of the notebook get into GIS analysis and thus need property and town boundary definitions along with the assessments.
    • The full file of parcel boundaries can be downloaded directly from mass.gov's GIS site. It's best to download it there so the gov't knows it gets heavy usage and keeps supporting it, but if you run into any issues I'm providing direct links as well:
      • Parcels as .gdb, used in this notebook for town boundaries. Property boundaries are corrupted in this distribution. GDB is an odd format, as it's a directory with an extension. Point to the .gdb folder itself, which is the root; don't be dismayed.
      • Eastern parcels as .shp Property boundaries of Eastern MA, used in the notebook.
      • Western parcels as .shp Property boundaries of Western MA, not used in the notebook, but good to have.

Technical note: The notebook as written requires both the .shp file and the .gdb files. In theory only one is needed, but due to a corruption in the L3_TAXPAR_POLY layer of the .gdb as distributed by the gov't, we can't use that layer.
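
Since a .gdb is a directory and a .shp only works alongside its sidecar files, it can help to check your downloads before pointing the notebook at them. Here is a minimal sketch using only the standard library. The function name check_gis_paths and the example paths are hypothetical, so substitute wherever you saved the files.

```python
from pathlib import Path

# Companion files that must sit next to a .shp for it to be readable.
SHAPEFILE_SIDECARS = (".shx", ".dbf")


def check_gis_paths(gdb_path, shp_path):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    gdb = Path(gdb_path)
    shp = Path(shp_path)

    # A file geodatabase is a folder whose name ends in .gdb.
    if gdb.suffix.lower() != ".gdb" or not gdb.is_dir():
        problems.append(f"{gdb} should be a directory ending in .gdb")

    if not shp.is_file():
        problems.append(f"{shp} was not found")

    # Assumes the sidecars use the same lowercase extension style as listed above.
    for ext in SHAPEFILE_SIDECARS:
        sidecar = shp.with_suffix(ext)
        if not sidecar.is_file():
            problems.append(f"{sidecar} was not found (needed alongside the .shp)")

    return problems


# Hypothetical locations, replace with your own download paths.
for problem in check_gis_paths("data/parcels.gdb", "data/eastern_parcels.shp"):
    print(problem)
```

An empty result only says the files are where you expect them; it does not validate their contents.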

(Optional) Related Files

To fully grasp the nuances of this data and enrich your analysis, consider exploring these supplementary files:

  1. Property Type Classification Codes also included in this repo. Includes a breakdown of codes defining property types (residential, multifamily, agriculture, apartments, etc.). Understanding these codes is essential for interpreting our data's "use codes."
  2. MassGIS standard for digital parcels and related data sets also included in this repo. This documentation offers a detailed explanation of the dataset columns and fields, thus providing deeper insights into the geographical data.
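
As a small illustration of why the use codes matter, the sketch below flags residential records by looking at the first digit of the code. It rests on assumptions you should verify against the two documents above: that the column is named USE_CODE, and that codes starting with 1 denote the residential class. The sample rows are made up for illustration, and municipalities may use their own variations.

```python
def is_residential(use_code):
    """True when the code's first digit is 1 (assumed to mean residential)."""
    # Treat codes as strings: reading them as integers would drop leading
    # zeros and make a code like "013" look as if it started with 1.
    code = str(use_code).strip()
    return code.startswith("1")


# Made-up sample rows, not real assessment records.
rows = [
    {"USE_CODE": "101"},
    {"USE_CODE": "340"},
    {"USE_CODE": "1010"},
    {"USE_CODE": "013"},
]

residential = [row for row in rows if is_residential(row["USE_CODE"])]
print(f"{len(residential)} of {len(rows)} sample rows look residential")
```

The same first-digit test can be expressed as a column filter in the notebook once you've confirmed the column name and code scheme for your extract.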

About

Property Value Analytics in MA: A Data Driven Demo
