NYC Bike: RedisGraph through redismod, a Go backend (behind an nginx reverse proxy), a React frontend - visual geospatial index of over 58 million bikeshare trips across NYC
#235
A visual geospatial index of over 58 million bikeshare trips across NYC. This could be helpful to capacity plan across the network, allowing you to investigate aggregated rush hour and weekend travel patterns in milliseconds!
This infrastructure can be started from docker-compose.yml.
This repo also includes a Go importer program to load the public dataset into RedisGraph.
redismod
This project uses the redismod Docker image. It was used (as per Hackathon requirements) instead of Redis Enterprise Cloud, which did not yet support RedisGraph v2.4 at the time of development.
backend
The Go backend uses the redisgraph-go library to proxy graph queries from the frontend. The Go library didn't support the new point() type, so I sent PR redisgraph-go#45 adding this feature.
To mark every station on the map (/stations API call), a simple Cypher query is used to fetch all the locations:
MATCH (s:Station) RETURN s.loc
To count all the edges in the graph (part of /vitals API call), another simple Cypher query is used:
MATCH (:Station)-[t:Trip]->(:Station) RETURN count(t)
The main Cypher query to retrieve journeys (/journey_query API call) is of the form:
MATCH (src:Station)<-[t:Trip]->(dst:Station)
WHERE distance(src.loc, point($src)) < $src_radius
AND distance(dst.loc, point($dst)) < $dst_radius
RETURN
(startNode(t) = src) as egress,
sum(t.counts[0]) as h0_trip_count,
...
This matches all the :Stations within the $src and $dst circles, and all the trip edges between these stations (in both directions). This is a fast query due to the geospatial index on :Station.loc (see offline_importer below). The returned egress is true if the trip started at $src, or false if it started at $dst. The aggregated trip graph presented on the UI is built by aggregating properties on these :Trip edges, for both egress and ingress traffic.
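To make the radius predicate concrete, here is a minimal in-memory sketch of what `distance(s.loc, point($src)) < $src_radius` selects. It is an illustration only, not RedisGraph's implementation: the `Point` type, `haversine`, and `withinRadius` are hypothetical names, and it assumes the distance is a great-circle distance in metres.

```go
package main

import "math"

// Point is a hypothetical latitude/longitude pair in degrees.
type Point struct {
	Lat, Lon float64
}

// earthRadiusM is the mean Earth radius in metres.
const earthRadiusM = 6371000.0

// haversine returns the great-circle distance between a and b in metres.
func haversine(a, b Point) float64 {
	toRad := func(deg float64) float64 { return deg * math.Pi / 180 }
	dLat := toRad(b.Lat - a.Lat)
	dLon := toRad(b.Lon - a.Lon)
	h := math.Sin(dLat/2)*math.Sin(dLat/2) +
		math.Cos(toRad(a.Lat))*math.Cos(toRad(b.Lat))*math.Sin(dLon/2)*math.Sin(dLon/2)
	return 2 * earthRadiusM * math.Asin(math.Sqrt(h))
}

// withinRadius returns the stations closer than radiusM metres to center,
// mirroring the predicate distance(s.loc, point($center)) < $radius.
// Unlike the indexed graph query, this scans every station.
func withinRadius(stations []Point, center Point, radiusM float64) []Point {
	var out []Point
	for _, s := range stations {
		if haversine(s, center) < radiusM {
			out = append(out, s)
		}
	}
	return out
}
```

The linear scan is the part the geospatial index avoids: with the index, only stations near each circle need to be examined.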
frontend
The frontend is written in React, built around react-mapbox-gl and custom drawing modes I implemented. The aggregated trip graph is rendered using devexpress/dx-react-chart.
This is my first ever React project, be nice! ;)
offline_importer
The offline importer iteratively downloads the public Citi Bike trip data, unzips each archive, and indexes all the trips into the journeys graph.
The graph contains every :Station as a node, an index on the station ID, and a geospatial index of the stations' locations:
CREATE INDEX ON :Station(loc)
Each of the 58 million journeys is represented as an increment on the edge between the src and dst stations (there are ~818k unique [src]->[dst] edges). The graph is set up to aggregate trips by the time of the week at which they occur (into 7*24 hour buckets). It could easily be extended to aggregate trips on other dimensions too.
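As a minimal sketch of the 7*24 bucketing, the index into the 168-slot array can be derived from a trip's start time. The function names here are hypothetical, and the choice of Sunday 00:00 as bucket 0 is an assumption (it simply follows Go's `time.Weekday` numbering); the importer may number its buckets differently.

```go
package main

import "time"

// bucket maps a weekday and an hour of day to one of 7*24 = 168 slots.
// Assumption: slot 0 is Sunday 00:00-00:59, because time.Sunday == 0 in Go.
func bucket(day time.Weekday, hour int) int {
	return int(day)*24 + hour
}

// hourOfWeek returns the slot for a trip start time, using the weekday and
// hour in the timestamp's own location.
func hourOfWeek(t time.Time) int {
	return bucket(t.Weekday(), t.Hour())
}
```

Under this numbering, the value passed as `$hour` always lies in the range 0..167.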
To index a single trip, the following Cypher query is used:
MATCH (src:Station{id: $src})
MATCH (dst:Station{id: $dst})
MERGE (src)-[t:Trip]->(dst)
ON CREATE
SET t.counts = [n in range(0, 167) | CASE WHEN n = $hour THEN 1 ELSE 0 END]
ON MATCH
SET t.counts = t.counts[0..$hour] + [t.counts[$hour]+1] + t.counts[($hour+1)..168]
This either creates a new edge with one trip, or increments the appropriate counter on the edge to index the trip.
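The create-or-increment behaviour of the MERGE can be mirrored in memory, which may help when reasoning about what ends up on each edge. This is a sketch only: `edge`, `tripIndex`, and `add` are hypothetical names, and integer station IDs are an assumption.

```go
package main

// edge identifies a directed src->dst station pair (hypothetical integer IDs).
type edge struct{ src, dst int }

// tripIndex keeps one 168-slot counter array per directed edge.
type tripIndex map[edge]*[168]int

// add records one trip. A missing edge is created with all-zero counters
// (the ON CREATE branch); in both cases the slot for hour is then incremented,
// so a new edge ends up with exactly one trip. hour must be in 0..167.
func (idx tripIndex) add(src, dst, hour int) {
	e := edge{src, dst}
	counts, ok := idx[e]
	if !ok {
		counts = new([168]int)
		idx[e] = counts
	}
	counts[hour]++
}
```

Because edges are directed, a trip from station 1 to station 2 and a trip back from 2 to 1 land on two different counters, which is what lets the journey query separate egress from ingress traffic.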
To efficiently write all 56 million trips, I use pipelining and turn CLIENT REPLY OFF for each batch. The bulk import takes a couple of hours.
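At the wire level, the batching idea looks roughly like the sketch below. It uses only the Go standard library and writes raw RESP commands, so it is not the importer's actual code (which would more plausibly go through a Redis client library); `writeCommand`, `writeBatch`, and `encodeBatch` are hypothetical helpers. Each batch switches replies off, sends its queries back to back, and switches replies on again.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// writeCommand encodes one Redis command as a RESP array of bulk strings.
func writeCommand(w *bufio.Writer, args ...string) error {
	if _, err := fmt.Fprintf(w, "*%d\r\n", len(args)); err != nil {
		return err
	}
	for _, a := range args {
		// len(a) is the byte length, which is what RESP expects.
		if _, err := fmt.Fprintf(w, "$%d\r\n%s\r\n", len(a), a); err != nil {
			return err
		}
	}
	return nil
}

// writeBatch pipelines a batch of graph queries with replies suppressed.
// All commands are buffered and flushed together, so no round trip is paid
// per query. The server answers only the final CLIENT REPLY ON.
func writeBatch(conn io.Writer, graph string, queries []string) error {
	w := bufio.NewWriter(conn)
	if err := writeCommand(w, "CLIENT", "REPLY", "OFF"); err != nil {
		return err
	}
	for _, q := range queries {
		if err := writeCommand(w, "GRAPH.QUERY", graph, q); err != nil {
			return err
		}
	}
	if err := writeCommand(w, "CLIENT", "REPLY", "ON"); err != nil {
		return err
	}
	return w.Flush()
}

// encodeBatch renders a batch to a string, which is handy for inspecting
// the bytes that would be sent without opening a connection.
func encodeBatch(graph string, queries []string) (string, error) {
	var sb strings.Builder
	err := writeBatch(&sb, graph, queries)
	return sb.String(), err
}
```

With a real connection, the caller would still need to read the single reply to `CLIENT REPLY ON` after each batch. The trade-off of suppressing replies is that per-query errors are not reported back to the writer.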
Each reload of the UI at http://localhost:80/ should show these trips accumulate. On the live demo, I use a prebuilt dump.rdb which is 674MB on disk.
coding-to-music changed the title from "RedisGraph through redismod, a Go backend (behind an nginx reverse proxy), a React frontend - visual geospatial index of over 58 million bikeshare trips across NYC" to "NYC Bike: RedisGraph through redismod, a Go backend (behind an nginx reverse proxy), a React frontend - visual geospatial index of over 58 million bikeshare trips across NYC" on Aug 26, 2021.
NYC Bike
https://github.com/mitchsw/nycbike
Build on Redis Hackathon entry, mitchsw, 2021-05-12.
Live Demo: https://nycbike.mitchsw.com/
Full visual UI.
Zoomed-in view of trips between a few stations.
System Overview
The visual UI is built using RedisGraph (through redismod), a Go backend (behind an nginx reverse proxy), and a React frontend.
How to run
Create a Mapbox Access Token and write it to frontend/.env.
Build the visual UI components, and run them using Docker Compose.
The frontend should now be accessible at http://localhost:80/, but the map will be blank as Redis is empty. Now, start indexing the public dataset with the offline_importer.