This library is a toolset for starting projects quickly and lightly, with minimal overhead or boilerplate and no special hardware or software. Yet it also scales up to process large datasets.
- Shorthands: read/write a GeoDataFrame from/to any format, or write a single geometry into a geo-format file (no need to create a GeoDataFrame by hand).
- Chunk-wise IO for multiple formats.
- OSRM routing for large datasets: route lines, table routing, isochrones.
- Create command-line scripts with one decorator. Your scripts may process entire files, or chunks, or even generate chunks.
- Frequently used tools for GIS: area, length, etc.
- Lookups, aggregations with spatial join.
- Export or filter OSM files (.osm[.pbf|.gz|.bz2])
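To make the chunk-wise IO idea from the list above concrete, here is a minimal sketch using only the standard library. It is not erde's reader: the function name `read_chunks` and the sample data are made up for illustration, and real geo formats need a proper driver. It only shows why chunking keeps memory use flat, since rows are pulled from the file a fixed number at a time.

```python
import csv
import io
from itertools import islice
from typing import Dict, Iterator, List, TextIO


def read_chunks(handle: TextIO, chunk_size: int) -> Iterator[List[Dict[str, str]]]:
    """Yield lists of at most chunk_size rows; the whole file is never loaded at once."""
    reader = csv.DictReader(handle)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


# Hypothetical in-memory CSV standing in for a large file on disk.
sample = io.StringIO("id,name\n1,a\n2,b\n3,c\n4,d\n5,e\n")
chunk_sizes = [len(chunk) for chunk in read_chunks(sample, 2)]
print(chunk_sizes)
```

Because `read_chunks` is a generator, a caller can process and write each chunk before the next one is read.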
Instead of gdf.to_file('path', driver='GeoJSON')
the library offers shorthand functions that recognize formats and layers:
write_df(geodataframe1, 'file.geojson')
write_df(geodataframe2, 'another-file.gpkg:special-layer')
write_df(geodataframe3, 'third-file.csv')
GeoDataFrames are saved to/read from CSV files automatically, under the hood.
Save a single geometry object to a file:
write_geom(polygon, 'single-polygon.csv')
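The 'file.gpkg:layer' paths above suggest that the format and layer are derived from the path string. How erde does this internally is not shown here, so the following is only a guess at the general idea: `parse_target`, `Target` and the `DRIVERS` table are hypothetical names, and the driver strings are the usual GDAL ones.

```python
from pathlib import PurePath
from typing import NamedTuple, Optional

# Hypothetical extension-to-driver table (GDAL driver names).
DRIVERS = {
    '.geojson': 'GeoJSON',
    '.gpkg': 'GPKG',
    '.csv': 'CSV',
    '.shp': 'ESRI Shapefile',
}


class Target(NamedTuple):
    path: str
    layer: Optional[str]
    driver: str


def parse_target(spec: str) -> Target:
    """Split 'file.ext:layer' into path, optional layer, and a driver guessed from the extension."""
    # Split on the first colon; Windows drive letters are not handled in this sketch.
    path, _, layer = spec.partition(':')
    suffix = PurePath(path).suffix.lower()
    if suffix not in DRIVERS:
        raise ValueError(f'unknown format: {suffix!r}')
    return Target(path, layer or None, DRIVERS[suffix])


print(parse_target('another-file.gpkg:special-layer'))
print(parse_target('file.geojson'))
```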
This code creates an app that opens and saves files for you, converts parameter types, and generates a help message. No more argparse hell!
from erde import autocli
from geopandas import GeoDataFrame as GDF
@autocli
def main(input_data: GDF, sample_size: float) -> GDF:
    # sample_size is a fraction between 0 and 1, so it is passed as frac
    return input_data.sample(frac=sample_size)
Run python myscript.py to see the command-line arguments.
See the example for more code and instructions.
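For readers curious how a decorator can turn type annotations into command-line arguments, here is a rough standard-library sketch. It is not erde's autocli: `simple_cli` is a hypothetical name, and it only converts argument types, without opening or saving any files.

```python
import argparse
import inspect


def simple_cli(func):
    """Attach a .cli(argv) helper that parses argv according to func's annotations."""
    parser = argparse.ArgumentParser(description=func.__doc__)
    for name, param in inspect.signature(func).parameters.items():
        # Use the annotation as the converter; fall back to str when there is none.
        has_annotation = param.annotation is not inspect.Parameter.empty
        parser.add_argument(name, type=param.annotation if has_annotation else str)

    def cli(argv=None):
        args = parser.parse_args(argv)
        return func(**vars(args))

    func.cli = cli
    return func


@simple_cli
def scale(value: float, times: int) -> float:
    """Multiply value by times."""
    return value * times


# Strings from the command line are converted to float and int before the call.
result = scale.cli(['2.5', '4'])
print(result)
```

The decorated function stays an ordinary function, so it can still be imported and called directly from other code.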
Instead of doing this:
df.rename(columns={'oldname1': 'newname1'}, inplace=True)
df.drop(['oldcol2'], inplace=True, axis=1, errors='ignore')
you can simply do this:
from erde import subset
df = subset(df, 'oldname1: newname1, -oldcol2, *')
Or even run this from the command line:
erde subset old_file.gpkg oldname1:newname1,-oldcol2,* new_file.gpkg
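The spec string packs renames, drops and a "keep the rest" wildcard into one line. The sketch below shows one way such a string could be interpreted, working on a plain list of column names. It does not call erde: `apply_subset` is a hypothetical helper, and the rule that columns not mentioned are dropped when '*' is absent is an assumption, since only the '*' form appears above.

```python
def apply_subset(columns, spec):
    """Return the resulting column names after applying a subset spec (assumed semantics)."""
    renames, drops, keep_rest = {}, set(), False
    for token in (part.strip() for part in spec.split(',')):
        if token == '*':
            keep_rest = True                      # keep every column not mentioned
        elif token.startswith('-'):
            drops.add(token[1:].strip())          # '-name' drops a column
        elif ':' in token:
            old, new = (p.strip() for p in token.split(':', 1))
            renames[old] = new                    # 'old: new' renames a column
        elif token:
            renames[token] = token                # plain name: keep unchanged

    result = []
    for col in columns:
        if col in drops:
            continue
        if col in renames:
            result.append(renames[col])
        elif keep_rest:
            result.append(col)
    return result


new_columns = apply_subset(
    ['oldname1', 'oldcol2', 'geometry'],
    'oldname1: newname1, -oldcol2, *',
)
print(new_columns)
```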
- erde route takes a file with lines, treats them as waypoints, and outputs a file with the original attributes, route geometries, and metadata (distance, duration, nodes):
  erde route input.gpkg car route_geoms.gpkg
Example datasets: input and output:
- erde table takes 2 datasets of N and M points and calculates all N*M durations/distances between them:
  erde table houses.csv shops.csv car distance-matrix.gpkg
- erde isochrone takes N points and m travel durations, and produces N*m isochrones in one line.
Examples. From the command line:
$ erde isochrone my_houses.gpkg foot 5,10,15 my_isochrones.gpkg
from code/Jupyter:
from erde import isochrone
areas_df = isochrone(houses_df, 'foot', [5, 10, 15])
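To picture the N*M shape of the output that erde table produces, here is a small stand-in that fills the same kind of matrix with straight-line (haversine) distances instead of routed ones. This is a deliberate simplification: real table routing goes through OSRM and follows the road network. The names `haversine_m` and `distance_table` and the coordinates are illustrative only.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(a, b):
    """Great-circle distance in metres between two (lon, lat) points given in degrees."""
    lon1, lat1, lon2, lat2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))


def distance_table(sources, destinations):
    """Return an N x M nested list: one row per source, one column per destination."""
    return [[haversine_m(s, d) for d in destinations] for s in sources]


# Hypothetical (lon, lat) coordinates: 2 houses and 3 shops give a 2 x 3 table.
houses = [(13.40, 52.52), (13.45, 52.50)]
shops = [(13.40, 52.52), (13.38, 52.51), (13.50, 52.53)]
table = distance_table(houses, shops)

for row in table:
    print([round(d) for d in row])
```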
erde osm filters OSM files, crops them by polygon, converts them, and can merge several OSM files into one. It is a wrapper around osmium-tool (up-to-date Ubuntu packages are available for 18.04 LTS and newer) and the GDAL ogr2ogr tool (Ubuntu users need to install gdal-bin).
Examples:
- Filter by tags:
  erde osm my-country.osm.pbf wr/highway my-country-highways.osm.pbf
- Filter by tags and crop by polygon:
  erde osm my-country.osm.pbf wr/highway my-city-highways.osm.pbf --crop my-city.geojson
- Convert to GeoPackage and extract only linestrings:
  erde osm my-country.osm.pbf wr/highway city-hw.gpkg --crop my-city.geojson -l lines
- Merge several files:
  erde osm country1.osm.pbf country2.osm.pbf country1-country2-hw.osm.pbf
- Filter by tag, merge and convert only linestrings:
  erde osm country1.osm.pbf country2.osm.pbf wr/highway country1-country2-hw.gpkg -l lines
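Since erde osm wraps osmium-tool and ogr2ogr, the examples above roughly correspond to chains of those tools' own commands. The sketch below only builds the argument lists and prints them; it does not run anything. How erde actually sequences the calls, and the intermediate file names used here (filtered.osm.pbf, cropped.osm.pbf), are assumptions for illustration.

```python
import shlex


def tags_filter_cmd(src, expression, dst):
    """osmium tags-filter: keep objects matching a tag expression such as 'wr/highway'."""
    return ['osmium', 'tags-filter', src, expression, '-o', dst]


def extract_cmd(src, polygon, dst):
    """osmium extract: crop an OSM file by a polygon file."""
    return ['osmium', 'extract', '-p', polygon, src, '-o', dst]


def merge_cmd(sources, dst):
    """osmium merge: combine several OSM files into one."""
    return ['osmium', 'merge', *sources, '-o', dst]


def to_gpkg_cmd(src, dst, layer):
    """ogr2ogr: convert one layer of an OSM file (e.g. 'lines') to GeoPackage."""
    return ['ogr2ogr', '-f', 'GPKG', dst, src, layer]


# Filter by tags, crop by polygon, then convert only linestrings.
steps = [
    tags_filter_cmd('my-country.osm.pbf', 'wr/highway', 'filtered.osm.pbf'),
    extract_cmd('filtered.osm.pbf', 'my-city.geojson', 'cropped.osm.pbf'),
    to_gpkg_cmd('cropped.osm.pbf', 'city-hw.gpkg', 'lines'),
]
for step in steps:
    print(shlex.join(step))
```

Each list could be passed to subprocess.run(step, check=True) on a machine where both tools are installed.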
Most of the time, you need gpd.sjoin for 3 things:
- intersect geometries and aggregate some field from those objects in the other dataframe
- lookup another table (city/region) and get a field from there (size, name, domestic product, incomes, etc.)
- filter a dataframe by objects in the other one
Erde has 3 functions for those cases in the sjoin module: slookup, sagg and sfilter.
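The three patterns (lookup, aggregate, filter) can be illustrated without any GIS library. The sketch below uses axis-aligned boxes in place of real polygons and plain dictionaries in place of dataframes. It does not call slookup, sagg or sfilter, whose signatures are not shown here; all names and data are hypothetical.

```python
from collections import Counter


def contains(box, point):
    """True if the (x, y) point lies inside the (min_x, min_y, max_x, max_y) box."""
    min_x, min_y, max_x, max_y = box
    x, y = point
    return min_x <= x <= max_x and min_y <= y <= max_y


# Hypothetical data: two regions and four points, one of them outside both regions.
regions = {'north': (0, 5, 10, 10), 'south': (0, 0, 10, 5)}
points = {'a': (1, 7), 'b': (2, 2), 'c': (3, 8), 'd': (20, 20)}

# Lookup: attach the name of the containing region to each point (None if there is none).
lookup = {
    pid: next((name for name, box in regions.items() if contains(box, pt)), None)
    for pid, pt in points.items()
}

# Aggregate: count the points that fall into each region.
counts = Counter(name for name in lookup.values() if name is not None)

# Filter: keep only the points that fall inside some region.
inside = [pid for pid, name in lookup.items() if name is not None]

print(lookup)
print(dict(counts))
print(inside)
```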
- shortcuts for common use cases of sjoin: lookup, aggregate by geometry, and filter by geometry
- area/length/buffer in metres, all cleanup done under the hood
- CRS conversion