Developer Guide

Quan Zhang edited this page Sep 18, 2013 · 16 revisions

What Git workflow are we going to adopt?

How does all this code fit together? What goes where?

Marker Clustering on Server Side.

  • Open Source Application: uReport (https://github.com/City-of-Bloomington/uReport)
  • Author: Quan Zhang (quanzhang@acm.org)
  • Mentor: Cliff Ingham (inghamn@bloomington.in.gov)
  • Sponsor: Google Inc. (Google Summer of Code)
  • Focus Area: GIS, Clustering, large data
  • When: June - September, 2013
  • Where: Bloomington City Hall, IT services Dept.
  • Address: 401 N Morton St, Suite 150, Bloomington IN 47402
  • Tools: Solr, Apache, Tomcat, MySQL, Google Maps API V3, YUI
  • Languages: PHP, JavaScript, SQL
  • Achievement: Displayed clusters of up to 1M tickets on the map within one second.

Project background: The City of Bloomington runs a web application that receives issues reported by citizens. Each reported issue is stored in the database as a ticket. A ticket has many properties, such as category, status, date, township, latitude, and longitude, which are set when the citizen reports the issue. The city's database now holds more than 70,000 tickets. On the main page of the web application, users need to see the tickets on a map, and they can filter tickets by choosing properties as search parameters. This raises a problem: if we show all the tickets as markers at once, the map is completely covered with markers, and rendering them takes a long time. Worse, the client receives the tickets from the Solr server, so the time needed to transmit the ticket data from server to client is also unacceptably long. We therefore need to cluster markers on the server side and send the clusters, rather than the individual tickets, to the map on the client.

Project Progress: Basic attempt on showing top N tickets on the map: Solr supports faceted queries and can return any number of results. The query results are transmitted as JSON or XML, so the more tickets a query returns, the larger the response. By setting the "start" and "rows" parameters in the Solr query string, we can fetch a small subset of the query results and display those tickets on the map immediately. The Solr query strings are sent by JavaScript whenever the map finishes panning or zooming. Because this design cannot show all the query results at once, I also added an HTML option to choose how many tickets are displayed, and used OverlappingMarkerSpiderfier to display overlapping markers separately.

Later I tried to cluster tickets on the client side. Google provides some samples of client side clustering. Gary Little implemented client side clustering and released it as an API. The officially posted article "Too Many Markers!" also summarizes solutions for marker clustering on the client side, and I borrowed some ideas from it for the later server side clustering design. However, no matter how efficient the clustering algorithm is, transmitting a large amount of data from server to client is not acceptable, so we still need to implement clustering on the server side.
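The top N paging approach described above can be sketched as follows. This is a minimal illustration, not the code used in uReport: the function name `buildTicketQuery`, the base URL, and the field name `coordinates` are assumptions made for the example. Only the standard Solr parameters `q`, `fq`, `start`, `rows`, and `wt` are taken as given.

```javascript
// Build a Solr select URL that returns one "page" of tickets inside the
// current map viewport. buildTicketQuery, baseUrl, and the "coordinates"
// field are hypothetical names chosen for this sketch.
function buildTicketQuery(baseUrl, bounds, start, rows) {
  const params = new URLSearchParams({
    q: '*:*',             // match every ticket; real filters would go here
    wt: 'json',           // ask Solr for a JSON response
    start: String(start), // offset of the first result to return
    rows: String(rows),   // maximum number of results to return
    // Range filter on an assumed "lat,lng" point field.
    fq: `coordinates:[${bounds.south},${bounds.west} TO ${bounds.north},${bounds.east}]`,
  });
  return `${baseUrl}/select?${params.toString()}`;
}

// Example: first 50 tickets inside a viewport around Bloomington.
const exampleUrl = buildTicketQuery(
  'http://localhost:8983/solr',
  { south: 39.1, west: -86.6, north: 39.2, east: -86.4 },
  0,
  50
);
console.log(exampleUrl);
```

In the browser, a function like this would typically be called from a listener on the Google Maps `idle` event, which fires once the map has finished panning or zooming, with the bounds taken from the current viewport.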

Geohash based clustering on server side: As far as my mentor and I could find, there is no released API or built-in Solr component that can do clustering on the server side. However, I got the idea from a thesis posted by Josef Dabernig.
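To make the general idea of geohash based clustering concrete, here is a minimal sketch. It is not the uReport implementation or the method from the thesis; it only shows the standard base 32 geohash encoding and one simple way to group points that share a geohash prefix. The function names `encodeGeohash` and `clusterByGeohash` and the shape of the cluster objects are assumptions made for the example.

```javascript
const BASE32 = '0123456789bcdefghjkmnpqrstuvwxyz';

// Standard geohash encoding: alternately halve the longitude and latitude
// ranges, emitting one bit per step and one base 32 character per 5 bits.
function encodeGeohash(lat, lng, precision) {
  const latRange = [-90, 90];
  const lngRange = [-180, 180];
  let hash = '';
  let bits = 0;
  let value = 0;
  let useLng = true; // geohash starts with a longitude bit
  while (hash.length < precision) {
    const range = useLng ? lngRange : latRange;
    const coord = useLng ? lng : lat;
    const mid = (range[0] + range[1]) / 2;
    if (coord >= mid) {
      value = value * 2 + 1;
      range[0] = mid;
    } else {
      value = value * 2;
      range[1] = mid;
    }
    useLng = !useLng;
    bits += 1;
    if (bits === 5) {
      hash += BASE32[value];
      bits = 0;
      value = 0;
    }
  }
  return hash;
}

// Group points by geohash cell. A shorter precision gives larger cells and
// therefore fewer, bigger clusters (useful at low zoom levels).
function clusterByGeohash(points, precision) {
  const cells = new Map();
  for (const p of points) {
    const key = encodeGeohash(p.lat, p.lng, precision);
    const cell = cells.get(key) || { geohash: key, count: 0, latSum: 0, lngSum: 0 };
    cell.count += 1;
    cell.latSum += p.lat;
    cell.lngSum += p.lng;
    cells.set(key, cell);
  }
  // Report each cluster at the mean position of its members.
  return [...cells.values()].map((c) => ({
    geohash: c.geohash,
    count: c.count,
    lat: c.latSum / c.count,
    lng: c.lngSum / c.count,
  }));
}
```

Because the server only has to return one record per occupied cell, the response size depends on the number of cells in view rather than on the number of tickets. One known limitation of this kind of grouping is that two nearby points on opposite sides of a cell boundary end up in different clusters.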

Distance based clustering on server side:
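The references below point to the Haversine formula for distance based clustering. As a hedged illustration only, the following sketch computes the great-circle distance between two points and uses it in a simple greedy grouping. The function names `haversineKm` and `clusterByDistance`, the Earth radius constant, and the greedy first-fit strategy are assumptions made for this example, not a description of the uReport implementation.

```javascript
const EARTH_RADIUS_KM = 6371; // mean Earth radius, a common approximation

// Great-circle distance between two lat/lng points using the Haversine formula.
function haversineKm(lat1, lng1, lat2, lng2) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLng = toRad(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Greedy first-fit clustering: each point joins the first existing cluster
// whose seed point lies within radiusKm, otherwise it starts a new cluster.
// The cluster position stays at its first (seed) point.
function clusterByDistance(points, radiusKm) {
  const clusters = [];
  for (const p of points) {
    const home = clusters.find(
      (c) => haversineKm(c.lat, c.lng, p.lat, p.lng) <= radiusKm
    );
    if (home) {
      home.count += 1;
    } else {
      clusters.push({ lat: p.lat, lng: p.lng, count: 1 });
    }
  }
  return clusters;
}
```

Unlike grouping by geohash cell, this approach has no fixed grid boundaries, but each point is compared against the existing clusters, so its cost grows with the number of clusters as well as the number of points.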

References:

Solr 4.0 Spatial Search: http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4

Solutions to the Too Many Markers problem with the Google Maps JavaScript API V3: https://developers.google.com/maps/articles/toomanymarkers http://gmaps-samples-v3.googlecode.com/svn/trunk/toomanymarkers/toomanymarkers.html

Client side marker clusterer: https://code.google.com/p/google-maps-utility-library-v3/source/browse/trunk/markerclustererplus/src/markerclusterer.js?r=360

OverlappingMarkerSpiderfier: https://github.com/jawj/OverlappingMarkerSpiderfier

Base 32 Geohash: https://github.com/asonge/php-geohash/blob/master/geohash.php

Solr Stats Component: http://wiki.apache.org/solr/StatsComponent

Geocluster: Server-side clustering for mapping in Drupal based on Geohash: https://groups.drupal.org/node/296238

Haversine formula: http://en.wikipedia.org/wiki/Haversine_formula

Distance based marker clustering using Haversine algorithm: http://www.appelsiini.net/2008/11/introduction-to-marker-clustering-with-google-maps

clustermap -- Visualizing Clusters of Markers on Google Maps (v3): https://code.google.com/p/clustermap/source/browse/trunk/clustermap.js
