Team Members: Yash Kansal, Meera G K, Yeyan Wang, Crystal Wang, Shweta Joshi
We analyze the Energy and Water Consumption open data that the City of Los Angeles publishes through Socrata. We focus on buildings across the city and try to understand which factors contribute most to their energy and water consumption and, in turn, which levers the city could use to reduce that consumption.
Question 1: Which five property types have the highest correlation between Water Use and Carbon Emissions, and which five have the highest correlation between Source Energy Use Intensity (Source EUI) and Carbon Emissions?
Question 2: Which property type has the highest Source EUI, and which has the highest water use per square foot, among buildings built after 2000?
Question 3: Does any property type show a correlation between Water Use and Gross Building Floor Area (GFA)? If so, what is the relationship for the building type with the highest correlation value?
Question 4: For the Multifamily Housing building type, is there a relationship between year built and Source EUI, and between year built and water use per square foot?
Data Cleaning
- Pull the data from the Socrata API into a dataframe, then create a new dataframe that includes only the columns needed and drop rows with NaN or "Not Available" values
- Change the data type to float for columns with numerical values
- Drop any properties with "0" occupancy
- Count the buildings for each property type and identify the types with at least 300 data points, so that rarely occurring property types do not skew the results
- Create a new dataframe containing only these frequent property types
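The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`property_type`, `water_use_kgal`, `source_eui`, `co2_emissions`, `gross_floor_area`, `year_built`, `occupancy`) and the helper `clean_buildings` are hypothetical, the sample rows are made up, and loading from the Socrata API is left out.

```python
import numpy as np
import pandas as pd

# Hypothetical column names; the real dataset's field names may differ.
NUMERIC_COLS = ["water_use_kgal", "source_eui", "co2_emissions",
                "gross_floor_area", "year_built", "occupancy"]
KEEP_COLS = ["property_type"] + NUMERIC_COLS


def clean_buildings(raw, min_count=300):
    """Apply the cleaning steps to a raw building table (illustrative helper)."""
    df = raw[KEEP_COLS].copy()
    # Treat the "Not Available" placeholder as missing, then drop incomplete rows.
    df = df.replace("Not Available", np.nan).dropna()
    # Values arrive as strings in this sketch; convert them to float.
    df[NUMERIC_COLS] = df[NUMERIC_COLS].astype(float)
    # Drop properties reporting zero occupancy.
    df = df[df["occupancy"] != 0]
    # Keep only property types with at least `min_count` buildings.
    counts = df["property_type"].value_counts()
    frequent = counts[counts >= min_count].index
    return df[df["property_type"].isin(frequent)].reset_index(drop=True)


# Made-up sample rows (not real data), with a low threshold for illustration.
sample = pd.DataFrame({
    "property_type": ["Office", "Office", "Office", "Hotel", "Hotel", "Retail Store"],
    "water_use_kgal": ["120.5", "Not Available", "300.0", "950.0", "870.0", "40.0"],
    "source_eui": ["80.1", "95.0", "110.3", "150.2", "140.8", "60.0"],
    "co2_emissions": ["12.0", "15.5", "20.1", "44.0", "39.5", "5.0"],
    "gross_floor_area": ["50000", "62000", "81000", "120000", "98000", "20000"],
    "year_built": ["1985", "1999", "2005", "2010", "2003", "1970"],
    "occupancy": ["90", "100", "0", "85", "95", "100"],
})
cleaned = clean_buildings(sample, min_count=2)
print(cleaned)
```

With the real data, the threshold would stay at its default of 300 as described in the steps.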
Steps For Question 1:
- Calculate the correlation between Water Use vs Carbon Emissions, and Source EUI vs Carbon Emissions for each building type
- Determine the five property types with the highest absolute correlation values (positive or negative) for each pair, and display them in two bar graphs: one for Water Use vs. Carbon Emissions and one for Source EUI vs. Carbon Emissions
- Use scatter plots, linear regression lines, and r-value to examine each relationship
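One possible way to carry out the Question 1 steps is sketched here. The helper names (`correlation_by_type`, `top_by_magnitude`, `fit_line`) and the column names are assumptions made for illustration, and the numbers are invented rather than taken from the dataset. The regression line uses `numpy.polyfit`, which may differ from the routine the project actually used.

```python
import numpy as np
import pandas as pd


def correlation_by_type(df, x, y, group="property_type"):
    """Pearson correlation between columns x and y, one value per group."""
    return pd.Series({name: g[x].corr(g[y]) for name, g in df.groupby(group)})


def top_by_magnitude(r, n=5):
    """The n entries of r with the largest absolute value, sign preserved."""
    return r.loc[r.abs().nlargest(n).index]


def fit_line(df, x, y):
    """Least-squares slope, intercept, and r-value for y against x."""
    slope, intercept = np.polyfit(df[x], df[y], 1)
    r_value = np.corrcoef(df[x], df[y])[0, 1]
    return slope, intercept, r_value


# Made-up numbers for illustration only, not values from the dataset.
sample = pd.DataFrame({
    "property_type": ["Parking"] * 4 + ["Office"] * 4 + ["Hotel"] * 4,
    "water_use_kgal": [1, 2, 3, 4] * 3,
    "co2_emissions": [2, 4, 6, 8, 8, 6, 4, 2, 1, 3, 2, 4],
})

r_water = correlation_by_type(sample, "water_use_kgal", "co2_emissions")
top2 = top_by_magnitude(r_water, n=2)
slope, intercept, r_value = fit_line(
    sample[sample["property_type"] == "Parking"], "water_use_kgal", "co2_emissions"
)
print(top2)
```

Ranking by absolute value keeps strongly negative correlations in the top list, which matches the "(+ or -)" wording of the step. The resulting Series can be passed to `Series.plot.bar()` for the bar graphs.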
Steps For Question 2:
- Create another dataframe with only the columns needed to compute water and energy use intensity for each building type, restricted to buildings built after 2000
- Calculate the water use intensity (kgal/sq-ft) by dividing total water use by gross building floor area, and store the result in a new column of the dataframe
- Find out which building type has the highest median water use intensity and which has the highest median Source EUI, using the groupby function
- Use bar plots to compare the median values for different building types
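The Question 2 steps could look something like the sketch below. The function `median_intensity_after`, the column names, and the sample rows are all hypothetical and serve only to show the filter, divide, and groupby sequence.

```python
import pandas as pd


def median_intensity_after(df, year=2000):
    """Median water use intensity and Source EUI per property type,
    restricted to buildings built after `year` (illustrative helper)."""
    recent = df[df["year_built"] > year].copy()
    # Water use intensity: total water use divided by gross floor area.
    recent["water_use_intensity"] = (
        recent["water_use_kgal"] / recent["gross_floor_area"]
    )
    cols = ["water_use_intensity", "source_eui"]
    return recent.groupby("property_type")[cols].median()


# Made-up rows for illustration only, not values from the dataset.
sample = pd.DataFrame({
    "property_type": ["Hotel", "Hotel", "Medical Office", "Medical Office", "Office"],
    "year_built": [2005, 2010, 2003, 2012, 1990],
    "water_use_kgal": [900.0, 700.0, 200.0, 300.0, 5000.0],
    "gross_floor_area": [100000.0, 100000.0, 50000.0, 50000.0, 10000.0],
    "source_eui": [150.0, 130.0, 170.0, 160.0, 300.0],
})

medians = median_intensity_after(sample)
top_water = medians["water_use_intensity"].idxmax()
top_energy = medians["source_eui"].idxmax()
print(medians)
print(top_water, top_energy)
```

Using the median rather than the mean keeps a few very large buildings from dominating each property type's value. Each column of `medians` can be drawn with `Series.plot.bar()`.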
Steps For Question 3:
- Find the correlation between water use and gross building floor area for different building types
- Print the building type with the highest correlation between water use and gross building floor area
- Plot the correlation values for each building type
- Plot Gross Building Floor Area vs. Water Use for Manufacturing/Industrial Plants rather than Mixed Use properties, because the use of Mixed Use properties varies widely by nature, which makes it hard to draw conclusions or guide policy
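A rough sketch of the Question 3 steps follows, assuming hypothetical column names and invented sample values. It uses the grouped `corr()` method to get one correlation per property type, then fits a straight line for a single type with `numpy.polyfit`.

```python
import numpy as np
import pandas as pd

# Made-up numbers for illustration only, not values from the dataset.
sample = pd.DataFrame({
    "property_type": ["Mixed Use Property"] * 4
    + ["Manufacturing/Industrial Plant"] * 4
    + ["Office"] * 4,
    "gross_floor_area": [10, 20, 30, 40] * 3,
    "water_use_kgal": [1, 2, 3, 4, 1, 3, 2, 4, 2, 1, 2, 1],
})

# Grouped corr() returns a 2x2 correlation matrix per property type,
# stacked under a (property_type, column) MultiIndex.
pairwise = sample.groupby("property_type")[
    ["water_use_kgal", "gross_floor_area"]
].corr()
# Pull out the water-vs-area entry of each matrix.
r_by_type = pairwise.xs("water_use_kgal", level=1)["gross_floor_area"]

best_type = r_by_type.idxmax()
print(best_type, r_by_type[best_type])

# Fit water use against floor area for one chosen property type.
plant = sample[sample["property_type"] == "Manufacturing/Industrial Plant"]
slope, intercept = np.polyfit(plant["gross_floor_area"], plant["water_use_kgal"], 1)
print(slope, intercept)
```

The slope gives the additional water use per unit of floor area for the chosen type, which is the relationship the scatter plot is meant to show.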
Steps For Question 4:
- Bin the year built column into decades
- Divide water use by gross building floor area and filter the dataframe to Multifamily Housing
- Calculate the median energy and water use intensity for each bin (each decade) using the groupby function
- Use line plots to see the trends in water use intensity and energy use intensity
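The decade binning for Question 4 might be done along these lines. The helper `decade_medians`, the column names, and the sample rows are assumptions for illustration. `pandas.cut` with `right=False` is one way to build the bins, so that a building from 1990 falls in the 1990s bin.

```python
import pandas as pd


def decade_medians(df, property_type="Multifamily Housing"):
    """Median water use intensity and Source EUI per decade built,
    for a single property type (illustrative helper)."""
    mf = df[df["property_type"] == property_type].copy()
    mf["water_use_intensity"] = mf["water_use_kgal"] / mf["gross_floor_area"]
    # Decade edges spanning the observed years, e.g. 1960, 1970, ..., 2020.
    start = int(mf["year_built"].min() // 10 * 10)
    stop = int(mf["year_built"].max() // 10 * 10) + 10
    edges = list(range(start, stop + 1, 10))
    labels = [f"{edge}s" for edge in edges[:-1]]
    # right=False makes each bin [start, start + 10).
    mf["decade"] = pd.cut(mf["year_built"], bins=edges, labels=labels, right=False)
    cols = ["water_use_intensity", "source_eui"]
    # observed=True omits decades with no buildings.
    return mf.groupby("decade", observed=True)[cols].median()


# Made-up rows for illustration only, not values from the dataset.
sample = pd.DataFrame({
    "property_type": ["Multifamily Housing"] * 4 + ["Office"],
    "year_built": [1965, 1968, 1995, 2015, 1970],
    "water_use_kgal": [100.0, 200.0, 60.0, 20.0, 500.0],
    "gross_floor_area": [1000.0, 1000.0, 1000.0, 1000.0, 1000.0],
    "source_eui": [90.0, 110.0, 100.0, 95.0, 200.0],
})

by_decade = decade_medians(sample)
print(by_decade)
```

Calling `by_decade["water_use_intensity"].plot()` and `by_decade["source_eui"].plot()` would give the two line plots, one point per decade.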
Findings
- For water use and carbon emissions, Parking buildings show a relatively strong positive correlation, with a correlation coefficient of about 0.717. For Source EUI and carbon emissions, Retail Stores show a very strong positive correlation, with a correlation coefficient of about 0.983. Carbon emissions (CO2e) appear to be more strongly related to Source EUI than to water use, so efforts to reduce carbon emissions should focus on reducing Source EUI.
- Among buildings built after 2000, Hotels have the highest water use per sq-ft, with a value of 8115.6 kgal/sq-ft, and Medical Offices have the highest Source EUI, with a value of 168.6 kBtu/sq-ft. Overall, new Hotels have high water and energy use, so more measures should be taken to regulate energy and water consumption at new Hotels.
- For water use and gross building floor area, Manufacturing/Industrial Plants show a moderately strong positive correlation, with a correlation coefficient of about 0.522. Based on the correlation values, the city should focus on adopting more water conservation and reduction strategies at larger Medical Offices, Manufacturing Plants, and Mixed Use Properties.
- For Multifamily Housing, water use intensity is lower for more recently built properties, but energy use intensity shows no such trend. The city should therefore focus on improving the energy use trend, or on better understanding why it has not changed.