Home
Here the PARks Quality Assessment team will keep a running blog of progress, plots, data descriptions, ideas, questions for parks, etc.
How do the socio-economic characteristics of the surrounding neighborhoods impact park quality? Scope: for a given park's quality, what are the contributing factors?
On the borough: How many features are failing? What features are failing? Spring - ICE? What is the right way to normalize quality - safety, aesthetics, etc.? How can you assign a quality score based on features to determine how the neighborhood is doing? What happens if there is only one feature in the park? How is Overall formulated? Rating as a function of inspection time - normalized by area and number of features. How do we improve the inspection program? What is the coincidence of feature failures, e.g. litter and glass: if there is litter, how often is there glass?
Link to PIP-xml = https://data.cityofnewyork.us/Housing-Development/Parks-Inspections-data/t9jy-gfev
NYU Data Services provided this link to solve the File-Geodatabase/QGis problem= http://gis.ucla.edu/working-with-file-geodatabases-gdb-using-qgis-and-gdal/
Socio-economic factors that we think might influence park quality:
-Demographics (Median Income, Ethnicity, Population, Education Levels)
-Complaints from 311 might reveal information about types of complaints and which parks receive the most.
Meeting Notes
Internal
Quality score that takes into account error bars
How much does the park quality score depend on inspection time per square foot?
Time per square foot of usable space
Ask about park space and functional park space.
External
Number of people living within radius R0 around park
Number of people working within R0
Median Income
Average Salary
311 calls (noisy b/c of propensity)
Crime data
Percent of features that fail given that one fails.
Functional park land: what is rated by PIP - includes sidewalks
Current as of March 2015
Historic data - check if retired parks exist
Rating ice in the springtime? Difference between inspection seasons and actual seasons: December 1 to April 1 - rate ice; other times - rate weeds
PIP
Calculate new quality score - If there is only 1 feature for a park, how is it treated? Are parks with fewer features treated differently? If one feature is unacceptable enough, it gets a US on that basis alone. A US can also be given for a safety hazard. Parks with many features are not treated differently than those with few features. Larger parks (by size) have more latitude.
Closed for construction. Covered in snow
Failure rates of parks as a function of income - was area taken into account? No - just rating. Richness of features - combination of area and richness of features (amenities)
No on how much time spent per square foot relates to park quality.
If the park is being cleaned Richness of features - wooded areas
Another data set - land cover layer as of 2010 derived from LiDAR
If a park is being cleaned during the inspection --> it's given an "N", so the grading is biased
If a park is covered by snow during the inspection --> it's given an "N", so the grading is biased
Inventory table - USE THIS - gives a richer picture of how much is in that park
Inventory - provides richness
Rated column in the all sites table should have a value of 0 or -1 (True/False) - very important - where Rated is true, those are the only things Parks is currently inspecting.
Reason not rated column - why this park is not being rated.
Current snapshot of rated
Natural? How much grass, how much tree, are in each park. Natural areas do not have a lot of features or no features to rate. Boundaries of natural area - outdated - on open data. Purposely not built areas.
Jackie will send files
Synthetic turf is not captured in all athletic facilities
Calculate total areas of synthetic turf - will send file
Geodatabase that has land cover data
Forever wild data
Made changes to coincident feature failure script.
If fences were rated 50 times, litter was rated 25 times, and litter failed 10 times, the number of times litter failed when fence failed is 10/25.
Calculating parks quality rating: Time, Area, Num Features Failed, Num Features Evaluated
Quality = Number of Features Failed / Number of Features Evaluated
Done per park
Quality of zip = (Sum (Quality of park * Area of Park))/(Sum Area of all parks in Zip Code)
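The two formulas above can be sketched in Python as follows. This is only a minimal sketch, not the project's actual script: the record layout (`park_id`, `zip`, `area`, `failed`, `evaluated`) and the sample numbers are made up for illustration.

```python
from collections import defaultdict

# Hypothetical per-park records; field names and values are illustrative only.
parks = [
    {"park_id": "A", "zip": "10001", "area": 2.0, "failed": 1, "evaluated": 4},
    {"park_id": "B", "zip": "10001", "area": 6.0, "failed": 3, "evaluated": 6},
    {"park_id": "C", "zip": "10002", "area": 5.0, "failed": 0, "evaluated": 5},
]


def park_quality(failed, evaluated):
    """Quality = features failed / features evaluated (None if nothing was evaluated)."""
    if evaluated == 0:
        return None
    return failed / evaluated


def zip_quality(parks):
    """Area-weighted mean of park quality for each zip code."""
    weighted = defaultdict(float)    # sum of quality * area per zip
    total_area = defaultdict(float)  # sum of area per zip
    for p in parks:
        q = park_quality(p["failed"], p["evaluated"])
        if q is None:
            continue  # assumption: parks with no evaluated features are skipped
        weighted[p["zip"]] += q * p["area"]
        total_area[p["zip"]] += p["area"]
    return {z: weighted[z] / total_area[z] for z in total_area if total_area[z] > 0}


zip_scores = zip_quality(parks)
```

With this definition the score is a failure fraction, so a higher value means more failed features.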
I did a pull on the code and made some moderate overhauls to the data structures to get them in line with what we were both talking about. I didn't test any of it, but it's pretty clean and commented, so it should be easy to sift through if any debugging is needed. I didn't move beyond the data structures, but they should be ready for you to run with.
Also, I put in some code, commented out, for you to test and see if you can run it. I think it should be good, and it will allow us not to iterate through the entire Sites file and instead just read entries directly from the dataframe as necessary when building the parkID dictionary.
If the above works, the code goes down to 2 iterations (for loops) total: one for the Ratings file and one for the Inspections file. That should speed things up.
I suggest once we get the structure parsed correctly, we can play with writing it to a file using 'Pickle'. Super easy. Then we just make other scripts read this file before doing analysis and it will save us from reading files into dataframes and all the work of structuring over and over again.
##04/30/2015
The Rating file spits out a pickle file.
The pickleFile contains a dictionary of each park and its relevant attributes. This can be used as the "master database"
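A minimal sketch of that write-once, read-many pattern with Python's standard `pickle` module is shown here. The file name `parks_master.pkl` and the attribute names inside the dictionary are assumptions for illustration, not the actual structure produced by the Rating script.

```python
import os
import pickle
import tempfile

# Hypothetical master dictionary keyed by park ID; attributes are illustrative only.
park_db = {
    "B-018": {"area": 1.5, "n_inspections": 4},
    "Q012": {"area": 3.2, "n_inspections": 2},
}

# Assumed file name; written to the temp directory so the sketch runs anywhere.
path = os.path.join(tempfile.gettempdir(), "parks_master.pkl")

# Write the parsed structure once.
with open(path, "wb") as f:
    pickle.dump(park_db, f, protocol=pickle.HIGHEST_PROTOCOL)

# Analysis scripts load it back instead of re-reading and re-structuring the raw files.
with open(path, "rb") as f:
    loaded = pickle.load(f)
```

Pickle files should only be loaded from sources we trust, since unpickling can execute arbitrary code.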
Questions for Parks Department:
How did they deal with breakups that have different numbers of inspections per year?
B-018 Zone? Playgrounds? How did they choose which feature to evaluate? How did the number of features change across seasons?
Q012 - zone4/zone3? Answer: Starting in 2013 (fiscal year 2014), each zone should be inspected twice a year. Before that, parks were only guaranteed to be inspected once.
##05/06/2015
Created new script/plots for calculating the Coincidence of Failure of Features minus the 'noise' of the typical failure rate of the feature regardless of the coincidence.
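One way to read "coincidence minus noise" is the conditional failure rate of a feature minus its baseline failure rate. The sketch below follows that reading. It is an assumption about the calculation, not the actual script, and the inspection records and the function name `excess_coincidence` are made up for illustration.

```python
# Hypothetical inspections: each maps a feature name to True if that feature failed.
inspections = [
    {"litter": True, "glass": True},
    {"litter": True, "glass": False},
    {"litter": False, "glass": False},
    {"litter": False, "glass": True},
    {"litter": True, "glass": True},
    {"litter": False, "glass": False},
    {"litter": False, "glass": False},
    {"litter": False, "glass": False},
]


def excess_coincidence(inspections, given, target):
    """P(target fails | given fails) minus P(target fails).

    Assumption: both rates are computed only over inspections where
    both features were rated. Returns None when a rate is undefined.
    """
    both_rated = [i for i in inspections if given in i and target in i]
    if not both_rated:
        return None
    given_failed = [i for i in both_rated if i[given]]
    if not given_failed:
        return None
    baseline = sum(i[target] for i in both_rated) / len(both_rated)
    conditional = sum(i[target] for i in given_failed) / len(given_failed)
    return conditional - baseline


excess = excess_coincidence(inspections, given="litter", target="glass")
```

A value near zero would suggest the two features fail together no more often than the target's typical failure rate predicts, while a positive value suggests they tend to fail together.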