A collaboration between multiple nonprofits, governments, and tech companies is currently building an AI to continuously monitor pollution from every power plant in the world from space, using satellite data.
That system depends on a sufficiently large and accurate training dataset, which will combine readily available satellite imagery with accurate ground-truth generation data. The purpose of this project is to help build that ground-truthing system. Students will implement real-time scrapers for web-based data from power grids around the world and analyze the data for accuracy.
A forecasting model will be implemented to predict future generation values, and a front-end tool will be created to visualize the collected data.
Students will work within a framework provided by WattTime, which already includes a robust ETL pipeline, a database, and significant support on where to find data and how to interpret it.
Mentor - Connor Guest
Professor - Dr. Sara Farag
- Python code that will sit within an existing WattTime tool and automatically scrape, test, and correct data
- Forecasting model
- Front-end visualization tool of the data
- Write scrapers, likely in Python, to scrape data from power grid websites around the world.
- Analyze the scraped data for accuracy. (For example, many coal-fired power plants report emissions that are exactly 1,000 times smaller than is possible, which is a clear sign that they are reporting in the wrong units.)
- Create a forecasting model to predict future generation values
- Build a tool to visualize the data
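The accuracy check described above (values that are off by exactly a factor of 1,000) could be sketched roughly as follows. This is only one possible heuristic, not the project's actual method: the function name `diagnose_reading` and the plausible range used in the example are hypothetical, and a real check would need ranges derived from each plant's characteristics.

```python
def diagnose_reading(value, plausible_min, plausible_max, factor=1000.0):
    """Classify one reported value against a plausible range.

    Returns a (status, value) pair:
      "ok"         - the raw value is already inside the plausible range
      "unit_error" - the value only fits after scaling by `factor` or
                     1/`factor`; the rescaled value is returned
      "suspect"    - the value fits neither way and needs manual review
    The plausible range is an assumption supplied by the caller.
    """
    if plausible_min <= value <= plausible_max:
        return "ok", value
    # Try both directions: reported too small or too large by `factor`.
    for scale in (factor, 1.0 / factor):
        scaled = value * scale
        if plausible_min <= scaled <= plausible_max:
            return "unit_error", scaled
    return "suspect", value


if __name__ == "__main__":
    # Hypothetical range, for illustration only.
    low, high = 700.0, 1100.0
    for reported in (900.0, 0.9, 900000.0, 5.0):
        print(reported, diagnose_reading(reported, low, high))
```

Readings flagged as "unit_error" can be corrected automatically, while "suspect" readings are better left untouched and reported, since no simple rescaling explains them.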
- arrow
- bs4
- datetime
- requests
- selenium
- pandas
- pymongo
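As a rough illustration of how some of these libraries might fit together in a scraper, the sketch below parses a generation table with bs4 and datetime. The HTML snippet, the table id `generation`, the column layout, and the function name `parse_generation_table` are all made up for this example; real grid websites will differ, and this does not show the WattTime framework's own interfaces.

```python
from datetime import datetime, timezone

from bs4 import BeautifulSoup

# Invented sample page; a real scraper would download this instead.
SAMPLE_HTML = """
<table id="generation">
  <tr><th>Time (UTC)</th><th>Fuel</th><th>MW</th></tr>
  <tr><td>2020-01-01 00:00</td><td>coal</td><td>1,250.5</td></tr>
  <tr><td>2020-01-01 00:00</td><td>wind</td><td>310</td></tr>
</table>
"""


def parse_generation_table(html):
    """Turn the (hypothetical) generation table into a list of dicts."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="generation")
    if table is None:
        return []
    records = []
    # Skip the header row, then read one record per data row.
    for row in table.find_all("tr")[1:]:
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if len(cells) != 3:
            continue  # ignore malformed rows rather than guessing
        timestamp = datetime.strptime(cells[0], "%Y-%m-%d %H:%M")
        records.append({
            "timestamp": timestamp.replace(tzinfo=timezone.utc),
            "fuel": cells[1],
            # Strip thousands separators before converting to float.
            "mw": float(cells[2].replace(",", "")),
        })
    return records


if __name__ == "__main__":
    for record in parse_generation_table(SAMPLE_HTML):
        print(record)
```

In a live scraper the HTML would typically come from something like `requests.get(url, timeout=30).text`, and the resulting records could be loaded into a pandas DataFrame for the accuracy checks.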
You can install most of the required Python modules with pip install -r requirments.txt
(You'll need to install pandas and fbprophet manually because of some bugs or OS limitations. Some instructions are in other_requirements.txt.)
- Watttime.org
- HTTP requests/responses
- Details like auth/account TBD
- Example code