The GPE (Georeferenced Program Evaluation) R package includes functions that will allow the user to study various aspects of consumer or beneficiary behavior, including:
-
Georeference postal addresses (converting street name and number to latitude and longitude points)
-
Obtain base maps of any city’s urban grid and visualize overlaid projections of the distance traveled (between participants’ homes and points of service accessed, for example)
-
Estimate a matrix of distances between sites accessed and consumer/beneficiary origin
-
Calculate metrics of frequency, distribution by group, and distance travelled by consumers/beneficiaries to points of sale/access
-
Create visualizations that explore the difference between frequency and type of consumption by consumer/beneficiary attribute (depending on the information available: socioeconomic status, length of program participation, type of program participation, etc)
For complete documentation of this R package, please see the documentation website. Examples of the functions can be found below and in the doc folder of this repository.
You’ll need tabular data with addresses, or “lon” and “lat” columns containing WG84 (Mercator) location coordinates. In other words, a table with the addresses of schools that program beneficiaries attend, or a street or centroid identifying their neighborhoods, or their home addresses if using personal data is allowed.
You can geocode your data (obtain latitude and longitude from addresses) using GPE_geocode()
, a function that queries the Google Maps Platform to translate addresses into latitude/longitude coordinates.
Using the Google Maps Platform requires a registered API key. To obtain it, follow instructions from https://developers.google.com/maps/gmp-get-started. Make sure you enable the Geocoding API.
A valid API key is a string of leters and numbers that looks like AIjaSyBR76W62lloYPh_c01LYGhCOZuKU6RVW9 -this is not a real ID by the way, just an example-. Once you have your key, you can take a data frame with an address column like this one: Having a "participants" dataframe such as
library(GPE)
schools
name address level
1 EMus 02 Camarones 4351, C1407FMU, CABA other
2 Col La Anuncia… Arenales 2065, C1124AAE, CABA preschool, primary,…
3 EPjs 06 Avda. Eva Perón 7431, C1439BTM, CA… primary `
4 EPjs 02 Avda. Pte. Manuel Quintana 31/07, … primary `
5 Col Horacio Wa… Díaz Colodrero 2431, C1431FMA, CABA preschool, primary,…
… and add latitude and longitude columns using GPE_geocode, specifying the name of the address column and your Google Maps Platform API key:
key <- "AIjaSyBR76W62lloYPh_c01LYGhCOZuKU6RVW9" # This is not a real API key, you must provide yours
GPE_geocode(schools, address, key)
name address level lon lat
1 EMus 02 Camarones 4351, C1407FMU, C… other -58.5 -34.6
2 Col La Anunc… Arenales 2065, C1124AAE, CA… preschool, prima… -58.4 -34.6`
3 EPjs 06 Avda. Eva Perón 7431, C1439… primary -58.5 -34.7`
4 EPjs 02 Avda. Pte. Manuel Quintana … primary -58.4 -34.6`
5 Col Horacio … Díaz Colodrero 2431, C1431F… preschool, prima… -58.5 -34.6`
Columns with latitude and longitude coordinates are added to the original data frame.
To map the geocoded data, you’ll need a data frame containing participant data. GPE includes “participants”, an example data frame with fictional public program beneficiaries. The geographic position of participants and program locations can be plotted, over a basemap, using:
GPE_plot_map(participants)
In order for GPE_plot_map()
to work, the input data frame must include columns named “lat” and “lon” representing WGS84 coordinates. As previously shown, latitude and longitude columns can be obtained from addresses using GPE_geocode()
.
In addition, a data frame containing locations -places that participants visit in order to interact with the program, such as training centers, day care providers, etc.- can also be mapped. GPE includes “locations”, an example data frame with fictional public program sites. This dummy data can be used as follows to generate a more detailed map:
GPE_plot_map(participants, locations)
You can also visualize participant attributes by indicating the name of the column that should be used:
GPE_plot_map(participants, locations, participant_attribute = "group")
The same can be done for location attributes…
GPE_plot_map(participants, locations, location_attribute = "type")
… or both participant and location attributes:
GPE_plot_map(participants, locations, participant_attribute = "group", location_attribute = "type")
Using records of interaction between people and places -representing trips by consumers/beneficiaries to points of sale/access- GPE can estimate time and distance travelled with the Google Maps Platform. As is the case with geocoding, a valid API key is needed to access this service.
GPE provides an example “visits” data frame:
visits
date participant_id location_id
1 2018-07-22 5786 71
2 2018-06-29 804 5
3 2018-11-08 6880 76
4 2018-03-29 834 6
5 2018-05-31 2643 67
Trip distance and duration can be obtained using GPE_travel_time_dist(). As inputs, the function uses visits data, and the locations and participants data frames to provide the origin and destination coordinates. The mode of transport can be chosen from “transit” (default), “driving”, “walking”, or “bicycling”. Keep in mind that transit routing information is not available for all cities; “driving” and “walking” routing is usually available everywhere.
GPE_travel_time_dist(visits, participants, locations, key)
date participant_id location_id time_minutes distance_km mode
1 2018-03-29 834 6 17.38333 2.317 transit
2 2018-05-31 2643 67 39.35000 5.511 transit
3 2018-06-12 3395 61 39.80000 8.816 transit
4 2018-06-29 804 5 36.20000 10.664 transit
5 2018-07-22 5786 71 21.75000 2.303 transit
GPE includes a simple summary function that takes a “visits” data frame, as described above, and returns basic descriptive statistics for: o frequency of visits, by participant (person) o frequency of visits, by location (site)
If the input data frame includes time_minutes and distance_km columns (i.e. as a result of using GPE_travel_time_dist())
the summary will also include basic descriptive statistics for
o travel time, in minutes o travel distance, in km
For example, given a data frame like:
date participant_id location_id time_minutes distance_km mode
1 2018-03-29 834 6 17.38333 2.317 transit
2 2018-05-31 2643 67 39.35000 5.511 transit
3 2018-06-12 3395 61 39.80000 8.816 transit
4 2018-06-29 804 5 36.20000 10.664 transit
5 2018-07-22 5786 71 21.75000 2.303 transit
... the result is:
GPE_summary(visits_timedist)
$`visits by participant`
Min. 1st Qu. Median Mean 3rd Qu. Max.
4.000 5.750 6.500 7.312 8.250 13.000
$`visits by location`
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.000 3.000 4.000 4.255 6.000 10.000
$`travel time (minutes)`
Min. 1st Qu. Median Mean 3rd Qu. Max.
6.30 29.00 40.16 40.53 50.27 102.17
$`travel distance (km)`
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.576 5.122 8.897 9.165 12.921 28.855
install.packages("devtools")
devtools::install_github("EL-BID/GPE")
GPE depends on the following packages:
ggmap
Some funcitons in GPE will also require a Google API key. A key can be created here. For more information, use the R command ?ggmap::register_google
.
For all contributions to this project, this repository may be forked directly.
GPE was developed by H. Antonio Vazquez Brust
The Documentation of Support and Use of the software is licensed under Creative Commons IGO 3.0 Attribution-NonCommercial-NoDerivative (CC-IGO 3.0 BY-NC-ND)
The codebase of this repo uses AM-331-A3 Software License.
The IDB is not responsible, under any circumstance, for damage or compensation, moral or patrimonial; direct or indirect; accessory or special; or by way of consequence, foreseen or unforeseen, that could arise:
I. Under any concept of intellectual property, negligence or detriment of another part theory; I
ii. Following the use of the Digital Tool, including, but not limited to defects in the Digital Tool, or the loss or inaccuracy of data of any kind. The foregoing includes expenses or damages associated with communication failures and / or malfunctions of computers, linked to the use of the Digital Tool.