From ade47e4d2adaa2413942e72eeefa02a697a46e15 Mon Sep 17 00:00:00 2001 From: Rob Hammond <13874373+RHammond2@users.noreply.github.com> Date: Fri, 14 Oct 2022 15:24:39 -0600 Subject: [PATCH] Rc1 cleanup (#215) * remove old examples and subpackages * fix the broken method attachment * update docs structure * remove types from v3 completely * bump required python version for new features * change all biner links to be develop_v3 so that the develop_v3 docs link there --- CHANGELOG.md | 26 +- examples/00_toolkit_examples_entr.ipynb | 536 ------- examples/00_v3_demonstration.ipynb | 897 ----------- openoa/__init__.py | 6 +- openoa/types/asset.py | 205 --- openoa/types/plant.py | 427 ------ openoa/types/plant_schema.json | 199 --- openoa/types/plant_schema_25.json | 168 --- openoa/types/plant_v2.py | 1826 ----------------------- openoa/types/timeseries_table.py | 413 ----- readme.md | 4 +- setup.py | 3 +- sphinx/examples/index.rst | 2 +- sphinx/getting_started/index.rst | 12 +- sphinx/index.rst | 4 +- 15 files changed, 38 insertions(+), 4690 deletions(-) delete mode 100644 examples/00_toolkit_examples_entr.ipynb delete mode 100644 examples/00_v3_demonstration.ipynb delete mode 100644 openoa/types/asset.py delete mode 100644 openoa/types/plant.py delete mode 100644 openoa/types/plant_schema.json delete mode 100644 openoa/types/plant_schema_25.json delete mode 100644 openoa/types/plant_v2.py delete mode 100644 openoa/types/timeseries_table.py diff --git a/CHANGELOG.md b/CHANGELOG.md index 217ebd9f..65184fd1 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,18 +1,28 @@ # Changelog All notable changes to this project will be documented in this file. If you make a notable change to the project, please add a line describing the change to the "unreleased" section. The maintainers will make an effort to keep the [Github Releases](https://github.com/NREL/OpenOA/releases) page up to date with this changelog. 
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/). -## [UNRELEASED] +## 3.0rc1 - The package name is changed from `operational_analysis` to `openoa` to be more consistent with how we expect to import OpenOA! -- Renamed plant_analysis to long_term_monte_carlo_aep and updated it for the V3 API. +- `PlantData` is now fully based on attrs dataclasses and utilizing the pandas `DataFrame` for all internal data structures + - `PlantData` can now be imported via `from openoa import PlantData` + - By using attrs users no longer have to subclass `PlantData` and create their own `PlantData.prepare` method. + - Users can now bring their own column naming, and provide a metadata definition so columns are mapped under the hood through the `PlantMetaData` class (see Intro Example for more information!) + - `PlantData.scada` (or similar) is now used in place of accessing the SCADA (or similar) dataframe + - v2 `ReanalysisData` and `AssetData` methods have been absorbed by `PlantData` in favor of a unified data structure and means to operate on data. + - v2 `TimeSeriesTable` is removed in favor of a pandas-based API and data usage +- openoa has a new import structure + - `PlantData` is available at the top level: `from openoa import PlantData` + - toolkits -> utils via `from openoa.utils import xx` + - pandas_plotting -> plot + - quality_check_automation -> qa (formerly located in methods) + - methods -> analysis via `from openoa.analysis import xx` +- Convenience methods such as `PlantData.turbine_ids` or `PlantData.tower_df(tower_id="x")` have been added to address commonly used code patterns +- Analysis methods are now available through `from openoa.analysis import ` - Renamed `compute_shear_v3` to `compute_shear` and deleted old version of `compute_shear`. -- v2 `ReanalysisData` and `AssetData` methods have been absorbed by `PlantData` in favor of a unified data structure and means to operate on data. 
-- v2 `TimeSeriesTable` is removed in favor of a pandas-based API and data usage -- The `filters` and `imputing` module has been cleaned up to take both pandas `DataFrame` and `Series` objects where appropriate, refactors pandas code to be much cleaner for performance and readability, has more user-friendly error messages, and has more consist outputs +- The `utils` subpackage has been cleaned up to take both pandas `DataFrame` and `Series` objects where appropriate, refactors pandas code to be much cleaner for both performance and readability, has more user-friendly error messages, and has more consistent outputs - `openoa.utils.imputing.correlation_matrix_by_id_column` has been renamed to `openoa.utils.imputing.asset_correlation_matrix` - A new 00_x example notebook is replace the 1a/b QA examples to highlight how the `project_ENGIE.py` methods are created. This creates an example for users to work with and significantly more details on how to use the new `PlantData` and `PlantMetaData` methods. - -## [UNRELEASED - 2.x] -- Added `compute_shear_v3` to `met_data_processing` toolkit, improving efficiency, removed requirement to provide reference height, and added option to return reference height and wind speed corresponding to best-fit shear exponent. Decided to bifurcate function into `compute_shear` and `compute_shear_v3` in order to maintain backwards compatibility with the OpenOA 2.x line. +- Documentation reorganization and cleanup ## [2.3 - 2022-01-18] - Replaced hard-coded reanalysis dates in plant analysis with automatic valid date selection and added optional user-defined end date argument. Fixed bug in normalization to 30-day months. 
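The import renames listed in the 3.0rc1 changelog entry above can be sketched as a small lookup table. This is a minimal illustration only: `RENAMES` and `migrate_import` are hypothetical names that are not part of OpenOA, and the v2 module paths for `pandas_plotting` and `quality_check_automation` are assumed from the changelog wording ("toolkits" and "formerly located in methods").

```python
# Hypothetical helper, NOT part of OpenOA: maps a v2 dotted import path to
# its v3 location, using only the renames listed in the 3.0rc1 changelog.
RENAMES = [
    # Most specific prefixes come first so they win over the generic ones.
    ("operational_analysis.methods.quality_check_automation", "openoa.utils.qa"),
    ("operational_analysis.toolkits.pandas_plotting", "openoa.utils.plot"),
    ("operational_analysis.types.PlantData", "openoa.PlantData"),
    ("operational_analysis.toolkits", "openoa.utils"),
    ("operational_analysis.methods", "openoa.analysis"),
    ("operational_analysis", "openoa"),
]


def migrate_import(path: str) -> str:
    """Return the v3 dotted path for a v2 dotted path (unchanged if unknown)."""
    for old, new in RENAMES:
        # Match whole path components only, never partial module names.
        if path == old or path.startswith(old + "."):
            return new + path[len(old):]
    return path


if __name__ == "__main__":
    for old_path in (
        "operational_analysis.toolkits.filters",
        "operational_analysis.types.PlantData",
    ):
        print(old_path, "->", migrate_import(old_path))
```

Ordering the table from most to least specific keeps the special cases (`qa`, `plot`, top-level `PlantData`) from being swallowed by the generic `toolkits` and `methods` rules.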
diff --git a/examples/00_toolkit_examples_entr.ipynb b/examples/00_toolkit_examples_entr.ipynb deleted file mode 100644 index 8f3bff9c..00000000 --- a/examples/00_toolkit_examples_entr.ipynb +++ /dev/null @@ -1,536 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Use ENGIE’s open data set\n", - "\n", - "ENGIE provides access to the data of its 'La Haute Borne' wind farm through https://opendata-renewables.engie.com and through an API. The data can be used to create additional turbine objects and gives users the opportunity to work with further real-world data. \n", - "\n", - "The series of notebooks in the 'examples' folder uses SCADA data downloaded from https://opendata-renewables.engie.com, saved in the 'examples/data' folder. Additional plant level meter, availability, and curtailment data were synthesized based on the SCADA data.\n", - "\n", - "In the following example, data is loaded into a turbine object and plotted as a power curve. The selected turbine can be changed if desired." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Connect to the ENTR Warehouse" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "%load_ext autoreload\n", - "%autoreload 2" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": { - "collapsed": false, - "jupyter": { - "outputs_hidden": false - }, - "pycharm": { - "name": "#%%\n" - } - }, - "outputs": [], - "source": [ - "from operational_analysis.types import PlantData\n", - "from operational_analysis.toolkits import unit_conversion as un" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "INFO:pyhive.hive:USE `default`\n", - "INFO:pyhive.hive:SELECT Wind_turbine_name as Wind_turbine_name, \n", - " Date_time as Date_time, \n", - " cast(P_avg as float) as P_avg,\n", - " cast(Power_W as float) as Power_W,\n", - " cast(Ws_avg as float) as Ws_avg,\n", - " Wa_avg as Wa_avg,\n", - " Va_avg as Va_avg, \n", - " Ya_avg as Ya_avg, \n", - " Ot_avg as Ot_avg, \n", - " Ba_avg as Ba_avg \n", - " \n", - " FROM entr_warehouse.la_haute_borne_scada_for_openoa\n", - " \n" - ] - } - ], - "source": [ - "\n", - "### DATAFRAMES\n", - "\n", - "plant = PlantData.from_entr(wind_plant=\"la_haute_borne_scada_for_openoa\")\n", - "\n", - "### METADATA\n", - "\n", - "# Set time frequencies of data in minutes\n", - "plant._meter_freq = '10T' # Daily meter data\n", - "plant._curtail_freq = '10T' # Daily curtailment data\n", - "plant._scada_freq = '10T' # 10-min\n", - "\n", - "# Load meta data\n", - "plant._lat_lon = (48.452, 5.588)\n", - "plant._plant_capacity = 8.2 # MW\n", - "plant.n_turbines = 4\n", - "plant._turbine_capacity = 2.05 # MW\n", - "\n", - "\n", - "### PRE-PROCESSING\n", - "\n", - 
"plant._scada.df['time'] = pd.to_datetime(plant._scada.df['Date_time'],utc=True).dt.tz_localize(None)\n", - "\n", - "# Remove duplicated timestamps and turbine id\n", - "plant._scada.df = plant._scada.df.drop_duplicates(subset=['time','Wind_turbine_name'],keep='first')\n", - "\n", - "# Set time as index\n", - "plant._scada.df.set_index('time',inplace=True,drop=False)\n", - "\n", - "plant._scada.df = plant._scada.df[(plant._scada.df[\"Ot_avg\"]>=-15.0) & (plant._scada.df[\"Ot_avg\"]<=45.0)]\n", - "\n", - "plant._scada.df[\"Power_W\"] = plant._scada.df[\"P_avg\"] * 1000\n", - "\n", - "# Convert pitch to range -180 to 180.\n", - "plant._scada.df[\"Ba_avg\"] = plant._scada.df[\"Ba_avg\"] % 360\n", - "plant._scada.df.loc[plant._scada.df[\"Ba_avg\"] > 180.0,\"Ba_avg\"] \\\n", - " = plant._scada.df.loc[plant._scada.df[\"Ba_avg\"] > 180.0,\"Ba_avg\"] - 360.0\n", - "\n", - "# Calculate energy\n", - "plant._scada.df['energy_kwh'] = un.convert_power_to_energy(plant._scada.df[\"Power_W\"], plant._scada_freq) / 1000\n", - "\n", - "# Note: there is no vane direction variable defined in -25, so\n", - "# making one up\n", - "scada_map = {\n", - " \"time\" : \"time\",\n", - " \"Wind_turbine_name\" : \"id\",\n", - " \"Power_W\" : \"wtur_W_avg\",\n", - "\n", - " \"Ws_avg\" : \"wmet_wdspd_avg\",\n", - " \"Wa_avg\" : \"wmet_HorWdDir_avg\",\n", - " \"Va_avg\" : \"wmet_VaneDir_avg\",\n", - " \"Ya_avg\" : \"wyaw_YwAng_avg\",\n", - " \"Ot_avg\" : \"wmet_EnvTmp_avg\",\n", - " \"Ba_avg\" : \"wrot_BlPthAngVal1_avg\",\n", - " }\n", - "\n", - "plant._scada.df.rename(scada_map, axis=\"columns\", inplace=True)\n", - "\n", - "# Remove the fields we are not yet interested in\n", - "plant._scada.df.drop(['Date_time', 'time', 'P_avg'], axis=1, inplace=True)\n", - "\n" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "import matplotlib.pyplot as plt\n", - "import numpy as np\n", - "# import databricks.koalas as pd # replaces: 'import pandas 
as pd'\n", - "\n", - "from operational_analysis.toolkits import filters\n", - "from operational_analysis.toolkits import power_curve\n", - "\n", - "import random\n", - "import pandas as pd\n", - "\n", - "import time\n", - "\n", - "project = plant" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Import the data" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "array(['R80711', 'R80721', 'R80736', 'R80790'], dtype=object)" - ] - }, - "execution_count": 6, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# List of turbines\n", - "turb_list = project.scada.df.id.unique()\n", - "turb_list" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's examine the first turbine from the list above." - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [], - "source": [ - "df = project.scada.df.loc[project.scada.df['id'] == turb_list[0]]\n", - "windspeed = df[\"wmet_wdspd_avg\"]\n", - "power_kw = df[\"wtur_W_avg\"]/1000 # Put into kW\n" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [], - "source": [ - "def plot_flagged_pc(ws, p, flag_bool, alpha):\n", - " plt.scatter(ws, p, s = 1, alpha = alpha)\n", - " plt.scatter(ws[flag_bool], p[flag_bool], s = 1, c = 'red')\n", - " plt.xlabel('Wind speed (m/s)')\n", - " plt.ylabel('Power (W)')\n", - " plt.show()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "First, we'll make a scatter plot the raw power curve data." - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [ - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "plot_flagged_pc(windspeed, power_kw, np.repeat(True, df.shape[0]), 1)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Range filter" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "Series([], Name: wmet_wdspd_avg, dtype: float64)" - ] - }, - "execution_count": 10, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "out_of_range = filters.range_flag(windspeed, below=0, above=70)\n", - "windspeed[out_of_range].head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "No wind speeds out of range\n", - "\n", - "### Window range filter\n", - "\n", - "Now, we'll apply a window range filter to remove data with power values outside of the window from 20 to 2100 kW for wind speeds between 5 and 40 m/s." - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [ - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "out_of_window = filters.window_range_flag(windspeed, 5., 40, power_kw, 20., 2100.)\n", - "plot_flagged_pc(windspeed, power_kw, out_of_window, 0.2)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's remove these flagged data from consideration" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [], - "source": [ - "windspeed_filt1 = windspeed[~out_of_window]\n", - "power_kw_filt1 = power_kw[~out_of_window]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Bin filter\n", - "\n", - "We may be interested in fitting a power curve to data representing 'normal' turbine operation. In other words, we want to flag all anomalous data or data represenatative of underperformance. To do this, the 'bin_filter' function is useful. It works by binning the data by a specified variable, bin width, and start and end points. The criteria for flagging is based on some measure (scalar or standard deviation) from the mean or median of the bin center. \n", - "\n", - "As an example, let's bin on power in 100 kW increments, starting from 25.0 kW but stopping at 90% of peak power (i.e. we don't want to flag all the data at peak power and high wind speed. Let's use a scalar threshold of 1.5 m/s from the median for each bin. Let's also consider data on both sides of the curve by setting the 'direction' parameter to 'all'" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [ - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "max_bin = 0.90*power_kw_filt1.max()\n", - "bin_outliers = filters.bin_filter(power_kw_filt1, windspeed_filt1, 100, 1.5, 'median', 20., max_bin, 'scalar', 'all')\n", - "plot_flagged_pc(windspeed_filt1, power_kw_filt1, bin_outliers, 0.5)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "As seen above, one call for the bin filter has done a decent job of cleaning up the power curve to represent 'normal' operation, without excessive removal of data points. There are a few points at peak power but low wind speed that weren't flagged, however. Let catch those, and then remove those as well as the flagged data above, and plot our 'clean' power curve " - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [], - "source": [ - "windspeed_filt2 = windspeed_filt1[~bin_outliers]\n", - "power_kw_filt2 = power_kw_filt1[~bin_outliers]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Unresponsive Filter\n", - "\n", - "As a final filtering demonstration, we can look for an unrespsonsive sensor (i.e. repeating measurements). 
In this case, let's look for 3 or more repeating wind speed measurements:" - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "Series([], Name: wmet_wdspd_avg, dtype: float64)" - ] - }, - "execution_count": 15, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "frozen = filters.unresponsive_flag(windspeed_filt2, 3)\n", - "windspeed_filt2[frozen]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We actually found a lot, so let's remove these data as well before moving on to power curve fitting.\n", - "\n", - "Note that many of the unresponsive sensor values identified above are likely caused by the discretization of the data to only two decimal places. However, the goal is to illustrate the filtering process." - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [], - "source": [ - "windspeed_final = windspeed_filt2[~frozen]\n", - "power_kw_final = power_kw_filt2[~frozen]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "##### Power curve fitting\n", - "\n", - "We will now consider three different models for fitting a power curve to the SCADA data." 
- ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": {}, - "outputs": [], - "source": [ - "# Fit the power curves\n", - "iec_curve = power_curve.IEC(windspeed_final, power_kw_final)\n", - "l5p_curve = power_curve.logistic_5_parametric(windspeed_final, power_kw_final)\n", - "spline_curve = power_curve.gam(windspeed_final, power_kw_final, n_splines = 20)" - ] - }, - { - "cell_type": "code", - "execution_count": 18, - "metadata": {}, - "outputs": [ - { - "data": { - "image/png": "iVBORw0KGgoAAAANSUhEUgAAAnMAAAFzCAYAAABVWI+TAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/MnkTPAAAACXBIWXMAAAsTAAALEwEAmpwYAABb7ElEQVR4nO3dd3xUVfrH8c8zqYROKIYmHRFR1CyiWEBRpKi7a113fxbsde2iay+r2LCtrlhWXfu6qyKgqChWQEGpIh0hEnonPTm/P+5NZhISCJDkziTft6/7mnPOLfNch0yenHvvOeacQ0RERERiUyjoAERERERkzymZExEREYlhSuZEREREYpiSOREREZEYpmROREREJIYpmRMRERGJYfFBBxCU5s2buw4dOgQdhoiIiMguTZ8+fZ1zrkV56+psMtehQwemTZsWdBgiIiIiu2Rmv1a0TpdZRURERGKYkjkRERGRGKZkTkRERCSG1dl75sqTn59PRkYGOTk5QYcSqOTkZNq2bUtCQkLQoYiIiMguKJmLkJGRQcOGDenQoQNmFnQ4gXDOsX79ejIyMujYsWPQ4YiIiMgu6DJrhJycHFJTU+tsIgdgZqSmptb53kkREZFYEVgyZ2btzOwLM5tnZnPN7K9+ezMz+9TMFvqvTSP2ucXMFpnZfDMbFNF+qJnN9tc9aXuRjdXlRK6Y/h+IiIjEjiB75gqA651zPYC+wBVmtj8wApjonOsKTPTr+OvOAnoCJwLPmFmcf6xngYuBrv5yYk2eSFVq0KABAMuWLaNevXr07t27ZHn11VcB2LZtG5dccgmdO3emZ8+eHH300UydOjXIsEVERCQggd0z55zLBDL98lYzmwe0AU4B+vubvQJMAm72299yzuUCS81sEdDHzJYBjZxzkwHM7FXg98BHNXUu1aVz587MmDFjh/YLL7yQjh07snDhQkKhEEuWLGHevHk1H6CIiIgELioegDCzDsDBwFSglZ/o4ZzLNLOW/mZtgCkRu2X4bfl+uWx7rbR48WKmTp3K66+/Tijkdax26tSJTp06BRyZiIiIBCHwZM7MGgD/Ba5xzm3Zyf1a5a1wO2kv770uxrscS/v27XcV2M7X7w1Xbng7WLx4Mb179y6pP/XUU2zcuJHevXsTFxdX8Y4iIiJSZwSazJlZAl4i97pz7n9+82ozS/N75dKANX57BtAuYve2wEq/vW057Ttwzo0GRgOkp6dXLqMKUHmXWceMGRNMMCIiZRXmw8oZsHyyv0wBVwjdToQDz4SOx0Bc4H0GIrVeYD9l/hOnLwLznHOPRawaA5wLPOi/fhDR/oaZPQa0xnvQ4XvnXKGZbTWzvniXac8Bnqqh06hxPXv2ZObMmRQVFZVcZhURqVGr5sDn98KSL6Ege8f1s972lgat4IDT4KAzIe2gmo9TpI4IMhvoB/wfcKyZzfCXIXhJ3PFmthA43q/jnJsLvAP8DHwMXOGcK/SPdRnwArAIWExVPPzg
XPUte6Fz586kp6dz55134vxjLVy4kA8++GAXe4qI7KWcLfDxrfDc0bDg4/ITuUjbVsOUf3jbf3bXXn//iUj5gnya9RvKv98N4LgK9rkfuL+c9mnAAVUXXXQoe8/c8OHDufrqq3nhhRe4/vrr6dKlCykpKaSmpvLwww8HF6iI1G7Owdz/eYnctlWl1zVpD+2PgH0P917zs2DWOzD7P7B9TXi7b0ZBQgocc1P57zF/PowdC7m51XceItXpvPOgdetA3lo3M0SZbdu2AdChQweys8v/q7dRo0Y8//zzNRmWiNRVG5fBh3+FJZNKt3c4CgY/BK3233Gf1r3h+Htg6Zfw7eOw9Cuv/Yv7Iakh9L0M8ObDXj1zJmmjRxP30ktQWLjjsURixcCBSuZERCSKOAc/vQYfj4C8beH2Bq3ghPuh12k7f+o/Lh66HAcdjoQ3zoQlX3jtH4+AxAbQ7Q9sv+MOWj3zDHHqjRPZK0rmRESktG1rvd64+ePCbRaCPpfAgFsgufGO+2zaBP/4B8yateM6i4P2TSBlk1f/4Er47DqafLeu9HbHHguHHVZFJyFSw9LSAntrJXMiIhL2y3j48GrYvjbcltoF/jAa2h664/Z5efDMM3DvvbBhQ8XHTQLOrQ9pcd7d0sfmwuo4WFwIvXrBQw/BoEHVO8anSC2lZE5ERGD7OpjwN5j1Vun2PhfDwLshMaV0u3Pwzjtw662wZMmuj58LvJYF56VAiziIMzitPnS5D86/HDQQusgeUzInIlKXOQcz3/QSuexwz5pr0Iq1R9xN04YHkHDCEMjMLL1fVhZkZJRu69gRbrgBUlMrfr/8jbDsQSjYCMlA6lwlciJ7ScmciEhdtX4xjL0m/LRpsQNO5df9r+S7CV9y+qM3wm+/7fw4TZvC7bfD5ZdDUtKu33dBJ3jjdK8880048AzofOwenYKIKJmLSvfffz9vvPEGcXFxhEIhnnvuOQ6r4Kbgu+66iwYNGnDDDTdwxx13cPTRRzNw4MAajlhEYkruNvjuKW/YkIKccHvj9jD0Ueh2Au7nnxn49NMk7CyRS0qCq6+GW27xErrK6nYC9PyjN3YdwNhr4bLJO17KlTrNOYeLmGq9eKD84raSdY7S2+1kn8i2Uu9VzpTukdvt8J7lSIlPIT4UTFqlZC7KTJ48mbFjx/Ljjz+SlJTEunXryMvLq9S+99xzTzVHJyIxrajQG27ki/u92RmKWQj6Xg79b4GkBuAc+959N6Hie+FCIXjtNTjkkNLHa9MGGjTYs1gGj4TFEyFnszeW3Zcj4fi79+xYVcw5R4ErIK8wj7zCPPKL8ku9FhQVkF+UX7IUFBWQX5hPvvPKxUthUSEFLqLuCkvaCosKKXJFJeVC59WLtykuR76WWiiiqMh7dc5R6ApxzpWsd5QuO+dKti3VFrFt8bkXb1e8Tdn1Jf8VJ1t+MlWqLeL/Zdn2yH2K62X3i0VvDHmDXi16BfLeSuaiTGZmJs2bNyfJv1TRvHlzwBtE+Mwzz+SLL7yxmt544w26dOlSat/zzjuPYcOGcdppp9GhQwfOPfdcPvzwQ/Lz8/nPf/7Dfvvtx/bt27nqqquYPXs2BQUF3HXXXZxyyik1e5IiUrOcg0Wfwad3wJqfS69LOwhOegJaHxxuu/tuQu+8E66PGgV/+lPVxtSgJZxwH4y5yqt/9xQccCqkHVip3QuLCskqyGJb3ja25W9je/72kiWrIIus/KyS1+yCbHIKc8guyCY73yvnFOSQW5hLbmEu2QXZ5BXmkVuY6yVwRXklyYtILFAyV5G7yhlHqcqOvbnCVSeccAL33HMP3bp1Y+DAgZx55pkcc8wxgDfzw/fff8+rr77KNddcw9ixY3f6Ns2bN+fHH3/kmWee4ZFHHuGFF17g/vvv59hjj+Wll15i06ZN9OnTh4EDB1K/fv0qPUURiRKrZsMnt4cH7S3WMA2OvR0OOgtCEQ8gvPEG3B3RQ3b55XDVVdUT28H/h5v1Ntt/
[remainder of base64-encoded PNG image data omitted]\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "# Plot the results\n", - "x = np.linspace(0,20,100)\n", - "plt.figure(figsize = (10,6))\n", - "plt.scatter(windspeed_final, power_kw_final, alpha=0.5, s = 1, c = 'gray')\n", - "plt.plot(x, iec_curve(x), color=\"red\", label = 'IEC', linewidth = 3)\n", - "plt.plot(x, spline_curve(x), color=\"C1\", label = 'Spline', linewidth = 3)\n", - "plt.plot(x, l5p_curve(x), color=\"C2\", label = 'L5P', linewidth = 3)\n", - "plt.xlabel('Wind speed (m/s)')\n", - "plt.ylabel('Power (kW)')\n", - "plt.legend()\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The above plot shows that the IEC method accurately captures the power curve, although it results in a 'choppy' fit, while the L5P model (constrained by its parametric form) deviates from the knee of the power curve through peak production. The spline fit tends to fit the best." - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.7" - }, - "toc": { - "base_numbering": 1, - "nav_menu": {}, - "number_sections": true, - "sideBar": true, - "skip_h1_title": false, - "title_cell": "Table of Contents", - "title_sidebar": "Contents", - "toc_cell": false, - "toc_position": {}, - "toc_section_display": true, - "toc_window_display": false - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/examples/00_v3_demonstration.ipynb b/examples/00_v3_demonstration.ipynb deleted file mode 100644 index aa918323..00000000 --- a/examples/00_v3_demonstration.ipynb +++ /dev/null @@ -1,897 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - 
"metadata": {}, - "source": [ - "# Demonstration of the v3 `PlantData` class" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "from copy import deepcopy\n", - "from pprint import pprint\n", - "from pathlib import Path\n", - "\n", - "import yaml\n", - "import numpy as np\n", - "import pandas as pd\n", - "from openoa import PlantData\n", - "\n", - "import project_ENGIE" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "INFO:root:Loading SCADA data\n", - "INFO:root:SCADA data loaded\n", - "INFO:root:Timestamp conversion to datetime and UTC\n", - "INFO:root:Removing out of range of temperature readings\n", - "INFO:root:Flagging unresponsive sensors\n", - "INFO:root:Converting pitch to the range [-180, 180]\n", - "INFO:root:Calculating energy production\n", - "INFO:root:Reading in the meter data\n", - "INFO:root:Reading in the curtailment data\n", - "INFO:root:Reading in the reanalysis data and calculating the extra fields\n", - "INFO:root:Reading in the asset data\n" - ] - } - ], - "source": [ - "scada_df, meter_df, curtail_df, asset_df, reanalysis_dict = project_ENGIE.prepare(return_value=\"dataframes\")" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": { - "collapsed": true, - "jupyter": { - "outputs_hidden": true - }, - "tags": [] - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "{'asset': {'elevation': 'elevation_m',\n", - " 'hub_height': 'Hub_height_m',\n", - " 'id': 'Wind_turbine_name',\n", - " 'latitude': 'Latitude',\n", - " 'longitude': 'Longitude',\n", - " 'rated_power': 'Rated_power',\n", - " 'rotor_diameter': 'Rotor_diameter_m'},\n", - " 'curtail': {'availability': 'availability_kwh',\n", - " 'curtailment': 'curtailment_kwh',\n", - " 'frequency': '10T',\n", - " 'net_energy': 'net_energy_kwh',\n", - " 'time': 'time'},\n", - " 
'latitude': 48.4497,\n", - " 'longitude': 5.5896,\n", - " 'meter': {'energy': 'net_energy_kwh', 'time': 'time'},\n", - " 'reanalysis': {'era5': {'frequency': 'H',\n", - " 'surface_pressure': 'surf_pres',\n", - " 'temperature': 't_2m',\n", - " 'time': 'datetime',\n", - " 'windspeed_u': 'u_100',\n", - " 'windspeed_v': 'v_100'},\n", - " 'merra2': {'frequency': 'H',\n", - " 'surface_pressure': 'surface_pressure',\n", - " 'temperature': 'temp_10m',\n", - " 'time': 'datetime',\n", - " 'windspeed_u': 'u_50',\n", - " 'windspeed_v': 'v_50'}},\n", - " 'scada': {'frequency': '10T',\n", - " 'id': 'Wind_turbine_name',\n", - " 'pitch': 'Ba_avg',\n", - " 'power': 'P_avg',\n", - " 'temperature': 'Ot_avg',\n", - " 'time': 'time',\n", - " 'wind_direction': 'Wa_avg',\n", - " 'windspeed': 'Ws_avg'}}\n" - ] - } - ], - "source": [ - "with open(\"data/plant_meta.yml\", \"r\") as f:\n", - " meta = yaml.safe_load(f)\n", - "pprint(meta)" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "engie = PlantData(\n", - " analysis_type=None,\n", - " metadata=\"data/plant_meta.yml\",\n", - " scada=scada_df,\n", - " meter=meter_df,\n", - " curtail=curtail_df,\n", - " asset=asset_df,\n", - " reanalysis=reanalysis_dict\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "ename": "ValueError", - "evalue": "`scada` data is missing the following columns: ['status']\n`meter` data is missing the following columns: ['power']\n`tower` data is missing the following columns: ['time', 'id']\n`status` data is missing the following columns: ['time', 'id', 'status_id', 'status_code', 'status_text']\n`scada` data columns were of the wrong type: ['status']\n`meter` data columns were of the wrong type: ['power']\n`tower` data columns were of the wrong type: ['time', 'id']\n`status` data columns were of the wrong type: ['time', 'id', 'status_id', 'status_code', 'status_text']\n`scada` data is of the 
wrong frequency: None", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mengie\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0manalysis_type\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"all\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mengie\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalidate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[0;32m~/Documents/GitHub/OpenOA/openoa/plant.py\u001b[0m in \u001b[0;36mvalidate\u001b[0;34m(self, metadata)\u001b[0m\n\u001b[1;32m 1311\u001b[0m \u001b[0merror_message\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcompose_error_message\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_errors\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0manalysis_type\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1312\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0merror_message\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1313\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mValueError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0merror_message\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1314\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1315\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mupdate_column_names\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mValueError\u001b[0m: `scada` data is missing the following columns: ['status']\n`meter` data is missing 
the following columns: ['power']\n`tower` data is missing the following columns: ['time', 'id']\n`status` data is missing the following columns: ['time', 'id', 'status_id', 'status_code', 'status_text']\n`scada` data columns were of the wrong type: ['status']\n`meter` data columns were of the wrong type: ['power']\n`tower` data columns were of the wrong type: ['time', 'id']\n`status` data columns were of the wrong type: ['time', 'id', 'status_id', 'status_code', 'status_text']\n`scada` data is of the wrong frequency: None" - ] - } - ], - "source": [ - "engie.analysis_type = \"all\"\n", - "engie.validate()" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
Wind_turbine_nameDate_timeBa_avgP_avgWs_avgVa_avgOt_avgYa_avgWa_avgtimeenergy_kwh
timeid
2014-01-01 00:00:00R80736R807362014-01-01T01:00:00+01:00-1.00642.780037.120.664.69181.34000182.009992014-01-01 00:00:00107.130005
R80721R807212014-01-01T01:00:00+01:00-1.01441.060006.39-2.484.94179.82001177.360002014-01-01 00:00:0073.510000
R80790R807902014-01-01T01:00:00+01:00-0.96658.530037.111.074.55172.39000173.509992014-01-01 00:00:00109.755005
R80711R807112014-01-01T01:00:00+01:00-0.93514.239996.876.954.30172.77000179.720002014-01-01 00:00:0085.706665
2014-01-01 00:10:00R80790R807902014-01-01T01:10:00+01:00-0.96640.239997.01-1.904.68172.39000170.460012014-01-01 00:10:00106.706665
\n", - "
" - ], - "text/plain": [ - " Wind_turbine_name Date_time \\\n", - "time id \n", - "2014-01-01 00:00:00 R80736 R80736 2014-01-01T01:00:00+01:00 \n", - " R80721 R80721 2014-01-01T01:00:00+01:00 \n", - " R80790 R80790 2014-01-01T01:00:00+01:00 \n", - " R80711 R80711 2014-01-01T01:00:00+01:00 \n", - "2014-01-01 00:10:00 R80790 R80790 2014-01-01T01:10:00+01:00 \n", - "\n", - " Ba_avg P_avg Ws_avg Va_avg Ot_avg \\\n", - "time id \n", - "2014-01-01 00:00:00 R80736 -1.00 642.78003 7.12 0.66 4.69 \n", - " R80721 -1.01 441.06000 6.39 -2.48 4.94 \n", - " R80790 -0.96 658.53003 7.11 1.07 4.55 \n", - " R80711 -0.93 514.23999 6.87 6.95 4.30 \n", - "2014-01-01 00:10:00 R80790 -0.96 640.23999 7.01 -1.90 4.68 \n", - "\n", - " Ya_avg Wa_avg time \\\n", - "time id \n", - "2014-01-01 00:00:00 R80736 181.34000 182.00999 2014-01-01 00:00:00 \n", - " R80721 179.82001 177.36000 2014-01-01 00:00:00 \n", - " R80790 172.39000 173.50999 2014-01-01 00:00:00 \n", - " R80711 172.77000 179.72000 2014-01-01 00:00:00 \n", - "2014-01-01 00:10:00 R80790 172.39000 170.46001 2014-01-01 00:10:00 \n", - "\n", - " energy_kwh \n", - "time id \n", - "2014-01-01 00:00:00 R80736 107.130005 \n", - " R80721 73.510000 \n", - " R80790 109.755005 \n", - " R80711 85.706665 \n", - "2014-01-01 00:10:00 R80790 106.706665 " - ] - }, - "execution_count": 12, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "engie.scada.head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Load the data and create file mappings for later use" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "# project = Project_Engie('./data/la_haute_borne')\n", - "# project.prepare()\n", - "\n", - "fpath = Path(\"data/la_haute_borne\")\n", - "fn_scada = fpath / \"la-haute-borne-data-2014-2015.csv\"\n", - 
"fn_meter = fpath / \"plant_data.csv\"\n", - "fn_curtail = fpath / \"plant_data.csv\"\n", - "fn_reanalysis_merra2 = fpath / \"merra2_la_haute_borne.csv\"\n", - "fn_reanalysis_era5 = fpath / \"era5_wind_la_haute_borne.csv\"\n", - "fn_asset = fpath / \"la-haute-borne_asset_table.csv\"\n", - "\n", - "yaml_meta = \"data/plant_meta.yml\"\n", - "project = PlantData(\n", - " analysis_type=None, # Choosing a random type that doesn't fail validation\n", - " metadata=yaml_meta,\n", - " scada=fn_scada,\n", - " meter=fn_meter,\n", - " curtail=fn_curtail,\n", - " asset=fn_asset,\n", - " reanalysis=dict(era5=fn_reanalysis_era5, merra2=fn_reanalysis_merra2),\n", - ")\n", - "\n", - "# Create missing variables from the data set\n", - "project.asset[\"type\"] = \"turbine\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fpath = Path(\"data/la_haute_borne\")\n", - "fn_scada = fpath / \"la-haute-borne-data-2014-2015.csv\"\n", - "fn_meter = fpath / \"plant_data.csv\"\n", - "fn_curtail = fpath / \"plant_data.csv\"\n", - "fn_reanalysis_merra2 = fpath / \"merra2_la_haute_borne.csv\"\n", - "fn_reanalysis_era5 = fpath / \"era5_wind_la_haute_borne.csv\"\n", - "fn_asset = fpath / \"la-haute-borne_asset_table.csv\"\n", - "\n", - "scada = pd.read_csv(fn_scada)\n", - "meter = pd.read_csv(fn_meter)\n", - "curtail = pd.read_csv(fn_curtail)\n", - "reanalysis_era5 = pd.read_csv(fn_reanalysis_era5)\n", - "reanalysis_merra2 = pd.read_csv(fn_reanalysis_merra2)\n", - "asset = pd.read_csv(fn_asset)\n", - "\n", - "latitude = 48.4497\n", - "longitude = 5.5896\n", - "\n", - "yaml_meta = \"data/plant_meta.yml\"\n", - "json_meta = \"data/plant_meta.json\"" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## TODO\n", - " - [x] read data from spark, csv, pandas\n", - " - [x] read metadata from json, yaml, dict, and pre-loaded object\n", - " - [x] automatically calculate wind direction from u/v windspeed\n", - " - [x] 
call planetos api if API key is provided\n", - " - [x] validate this works\n", - " - [x] support flags for if csv/planetos/data object/etc\n", - " - datetime column frequency checks\n", - " - [ ] check against the provided metadata\n", - " - [ ] validate against the analysis requirements\n", - " - **note**: bring Lewis into this conversation on datetime & frequency validation, but is ok to use pandas for now\n", - " - [x] expand metadata to contain plant-level identifiers (latitude, longitude)\n", - " - check against the -25 namings and (likely) adopt that naming convention for the plant data\n", - " - [ ] update internal column naming convention to the -25 schema (Eric/Lewis)\n", - " - [x] map the input column names, and provide a method to provide them back as the original inputs\n", - " - [x] get the 0 notebook working, or at least as a means to understand what will be required for refactoring\n", - " - [x] no failures for tower data as it's not used\n", - " - [x] none flag for raising warning, not error, for missing/bad data\n", - " - `None` will run no validation\n", - " - [ ] flag to not raise an error for known missing data\n", - " - [x] metadata keyword argument for validate() to recreate `PlantMetaData`\n", - " - allows for more flexibility in use cases, especially in the exploratory phase, or for changing analysis types\n", - " - [ ] review the v3 todo workbook to stay on track with the rest of v3 development\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Create a dictionary of plant meta data \n", - "\n", - "**NOTE**: the datetime frequency checking is not in place, but the placeholder exists to implement it later" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "plant_meta = dict(\n", - " latitude=latitude,\n", - " longitude=longitude,\n", - " scada=dict(\n", - " time=\"Date_time\",\n", - " id=\"Wind_turbine_name\",\n", - " power=\"P_avg\",\n", - " 
windspeed=\"Ws_avg\",\n", - "# wtur_wspd=\"Ws_avg\", # TODO: adopt the -25 naming\n", - " wind_direction=\"Wa_avg\",\n", - "# status=\"?\",\n", - " pitch=\"Ba_avg\",\n", - " temperature=\"Ot_avg\",\n", - " frequency=\"10T\",\n", - " ),\n", - " meter=dict(\n", - " time=\"time_utc\",\n", - " energy=\"net_energy_kwh\",\n", - " ),\n", - " curtail=dict(\n", - " time=\"time_utc\",\n", - " curtailment=\"curtailment_kwh\",\n", - " availability=\"availability_kwh\",\n", - " net_energy=\"net_energy_kwh\",\n", - " frequency=\"10T\",\n", - " ),\n", - " reanalysis=dict( # keys are informational/product-type, not pre-defined\n", - " era5=dict(\n", - " time=\"datetime\",\n", - " # windspeed=\"ws_100m\", # Commented out to demonstrate variable creation from base windspeed data\n", - " windspeed_u=\"u_100\",\n", - " windspeed_v=\"v_100\",\n", - " temperature=\"t_2m\",\n", - " # density=\"dens_100m\", # Commented out to demonstrate variable creation from base windspeed data\n", - " surface_pressure=\"surf_pres\",\n", - " frequency=\"H\",\n", - " ),\n", - " merra2=dict(\n", - " time=\"datetime\",\n", - " # windspeed=\"ws_50\", # Commented out to demonstrate variable creation from base windspeed data\n", - " windspeed_u=\"u_50\",\n", - " windspeed_v=\"v_50\",\n", - " temperature=\"temp_10m\",\n", - " # density=\"dens_50\", # Commented out to demonstrate variable creation from base windspeed data\n", - " surface_pressure=\"surface_pressure\",\n", - " frequency=\"H\",\n", - " )\n", - " ),\n", - " asset=dict(\n", - " id=\"id\",\n", - " latitude=\"Latitude\",\n", - " longitude=\"Longitude\",\n", - " rated_power=\"Rated_power\",\n", - " hub_height=\"Hub_height_m\",\n", - " rotor_diameter=\"Rotor_diameter_m\",\n", - " elevation=\"elevation_m\",\n", - "# type=\"?\",\n", - " ),\n", - ")\n", - "\n", - "# Recreate the YAML and JSON meta data objects as the dictionary above gets updated\n", - "import yaml\n", - "import json\n", - "\n", - "with open(yaml_meta, \"w\") as f:\n", - " 
yaml.safe_dump(plant_meta, f, default_flow_style=False)\n", - " \n", - "with open(json_meta, \"w\") as f:\n", - " json.dump(plant_meta, f, indent=4)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Demonstrate the loading from YAML, JSON, and dictionary produce the exact same meta data" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "meta_from_dict = PlantMetaData.from_dict(plant_meta)\n", - "meta_from_json = PlantMetaData.from_json(json_meta)\n", - "meta_from_yaml = PlantMetaData.from_yaml(yaml_meta)\n", - "meta_from_dict == meta_from_json == meta_from_yaml, type(meta_from_dict)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Show the PlantData capabilities\n", - "\n", - "### Load from `DataFrame`s and a metadata dictionary" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "plant_from_data = PlantDataV3(\n", - " metadata=meta_from_dict,\n", - " scada=scada,\n", - " meter=meter,\n", - " curtail=curtail,\n", - " reanalysis={\"merra2\": reanalysis_merra2, \"era5\": reanalysis_era5}, # preferred, and enable API pulling\n", - " asset=asset,\n", - " analysis_type=\"MonteCarloAEP\",\n", - ")\n", - "type(plant_from_data)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Show that \"windspeed\", \"wind_direction\", and \"density\" columns are all created from the core variables" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "plant_from_data.reanalysis[\"era5\"].head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "plant_from_data.reanalysis[\"merra2\"].head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Show loading the data from file for both the meta data (JSON and YAML) and data 
(CSV)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "plant_from_file1 = PlantDataV3(\n", - " metadata=yaml_meta,\n", - " scada=fn_scada,\n", - " meter=fn_meter,\n", - " curtail=fn_curtail,\n", - " reanalysis={\"merra2\": fn_reanalysis_merra2, \"era5\": fn_reanalysis_era5}, # preferred, and enable API pulling\n", - " asset=fn_asset,\n", - " analysis_type=\"MonteCarloAEP\"\n", - ")\n", - "type(plant_from_file1)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "plant_from_file2 = PlantDataV3(\n", - " metadata=json_meta,\n", - " scada=fn_scada,\n", - " meter=fn_meter,\n", - " curtail=fn_curtail,\n", - " reanalysis={\"merra2\": fn_reanalysis_merra2, \"era5\": fn_reanalysis_era5}, # preferred, and enable API pulling\n", - " asset=fn_asset,\n", - " analysis_type=\"MonteCarloAEP\"\n", - ")\n", - "type(plant_from_file2)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### When updating the `analysis_type` to \"all\", note all the column data errors that are saved until the end of the validation" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "plant_from_data = PlantDataV3(\n", - " metadata=meta_from_dict,\n", - " scada=scada,\n", - " meter=meter,\n", - " curtail=curtail,\n", - " reanalysis={\"merra2\": reanalysis_merra2, \"era5\": reanalysis_era5}, # preferred, and enable API pulling\n", - " asset=asset,\n", - " analysis_type=\"all\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Demonstrate changing a parameter (`analysis_type`) and revalidating with `PlantDataV3.validate()`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "plant = deepcopy(plant_from_data)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - 
"outputs": [], - "source": [ - "plant.analysis_type = None\n", - "plant.validate()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "plant.analysis_type = \"all\"\n", - "plant.validate()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "plant.analysis_type = \"TurbineLongTermGrossEnergy\"\n", - "plant.validate()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "plant.analysis_type = \"ElectricalLosses\"\n", - "plant.validate()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Direct copy of the analysis requirements for easy referece" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "ANALYSIS_REQUIREMENTS = {\n", - " \"MonteCarloAEP\": {\n", - " \"meter\": {\n", - " \"columns\": [\"energy\"],\n", - " \"freq\": (\"MS\", \"D\", \"H\", \"T\"),\n", - " },\n", - " \"curtail\": {\n", - " \"columns\": [\"availability\", \"curtailment\"],\n", - " \"freq\": (\"MS\", \"D\", \"H\", \"T\"),\n", - " },\n", - " \"reanalysis\": {\n", - " \"columns\": [\"windspeed\", \"rho\"],\n", - " \"conditional_columns\": {\n", - " \"reg_temperature\": [\"temperature\"],\n", - " \"reg_wind_direction\": [\"windspeed_u\", \"windspeed_v\"],\n", - " },\n", - " },\n", - " },\n", - " \"TurbineLongTermGrossEnergy\": {\n", - " \"scada\": {\n", - " \"columns\": [\"id\", \"windspeed\", \"power\"], # TODO: wtur_W_avg vs energy_kwh ?\n", - " \"freq\": (\"D\", \"H\", \"T\"),\n", - " },\n", - " \"reanalysis\": {\n", - " \"columns\": [\"windspeed\", \"wind_direction\", \"rho\"],\n", - " },\n", - " },\n", - " \"ElectricalLosses\": {\n", - " \"scada\": {\n", - " \"columns\": [\"energy\"],\n", - " \"freq\": (\"D\", \"H\", \"T\"),\n", - " },\n", - " \"meter\": {\n", - " \"columns\": [\"energy\"],\n", - " \"freq\": (\"MS\", 
\"D\", \"H\", \"T\"),\n", - " },\n", - " },\n", - "}" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Show the updated column names and how to map them back to the original data" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "scada.columns.tolist()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "plant.scada.columns.tolist()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "plant.update_column_names(to_original=True)\n", - "plant.scada.columns.tolist()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Demonstrate the PlanetOS integration" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "apikey_file = Path(\"./APIKEY\").resolve()\n", - "plant_meta_planetos = deepcopy(plant_meta)\n", - "plant_meta_planetos[\"reanalysis\"][\"era5\"] = dict(\n", - " time=\"datetime\",\n", - " windspeed=\"windspeed_ms\",\n", - " wind_direction=\"winddirection_deg\",\n", - " windspeed_u=\"u_ms\",\n", - " windspeed_v=\"v_ms\",\n", - " temperature=\"temperature_K\",\n", - " density=\"rho_kgm-3\",\n", - " surface_pressure=\"surf_pres_Pa\",\n", - " frequency=\"H\",\n", - ")\n", - "plant_meta_planetos[\"reanalysis\"][\"merra2\"] = dict(\n", - " time=\"datetime\",\n", - " windspeed=\"windspeed_ms\",\n", - " wind_direction=\"winddirection_deg\",\n", - " windspeed_u=\"u_ms\",\n", - " windspeed_v=\"v_ms\",\n", - " temperature=\"temperature_K\",\n", - " density=\"rho_kgm-3\",\n", - " surface_pressure=\"surf_pres_Pa\",\n", - " frequency=\"H\",\n", - ")\n", - "\n", - "plant_from_data = PlantDataV3(\n", - " metadata=meta_from_dict,\n", - " scada=scada,\n", - " meter=meter,\n", - " curtail=curtail,\n", - " reanalysis={\n", - " \"merra2\": {\"apikey_file\": apikey_file, 
\"save_pathname\": \".\", \"save_filename\": \"merra2\"},\n", - " \"era5\": {\"apikey_file\": apikey_file, \"save_pathname\": \".\", \"save_filename\": \"era5\"},\n", - " },\n", - " asset=asset,\n", - " analysis_type=\"all\"\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.5" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/openoa/__init__.py b/openoa/__init__.py index 86eaf762..931d6074 100644 --- a/openoa/__init__.py +++ b/openoa/__init__.py @@ -4,9 +4,9 @@ """ -## API Shortcuts +# API Shortcuts from openoa.plant import PlantData -import openoa.analysis -PlantData.MonteCarloAEP = openoa.analysis.aep.MonteCarloAEP + +# TODO: Attach analysis classes to PlantData diff --git a/openoa/types/asset.py b/openoa/types/asset.py deleted file mode 100644 index a3247392..00000000 --- a/openoa/types/asset.py +++ /dev/null @@ -1,205 +0,0 @@ -import importlib -import itertools - -import numpy as np -import pandas as pd -from pyproj import Transformer -from shapely.geometry import Point - - -class AssetData(object): - """ - This class wraps around a Pandas dataframe that contains - metadata about the plant assets. It provides some useful functions - to work with this data (e.g., calculating nearest neighbors, etc.). 
- """ - - def __init__(self, engine="pandas"): - self._asset = None - self._nearest_neighbors = None - self._nearest_towers = None - self._engine = engine - if engine == "spark": - self._sql = importlib.import_module("pyspark.sql") - self._pyspark = importlib.import_module("pyspark") - self._sc = self._pyspark.SparkContext.getOrCreate() - self._sqlContext = self._sql.SQLContext.getOrCreate(self._sc) - - def load(self, path, name, format="csv"): - if self._engine == "pandas": - self._asset = pd.read_csv("%s/%s.%s" % (path, name, format)) - elif self._engine == "spark": - self._asset = ( - self._sqlContext.read.format("com.databricks.spark.csv") - .options(header="true", inferschema="true") - .load("%s/%s.csv" % (path, name)) - .toPandas() - ) - - def save(self, path, name, format="csv"): - if self._engine == "pandas": - self._asset.to_csv("%s/%s.%s" % (path, name, format)) - elif self._engine == "spark": - self._sqlContext.createDataFrame(self._asset).write.mode("overwrite").format( - "com.databricks.spark.csv" - ).options(header="true", inferschema="true").save("%s/%s.csv" % (path, name)) - - def prepare(self, active_turbine_ids, active_tower_ids, srs="epsg:4326"): - """Prepare the asset data frame for further analysis work. Currently, this function calls parse_geometry(srs) - and calculate_nearest(active_turbine, active_tower), passing through the arguments to this function. - - Args: - active_turbine_ids (:obj:`list`): List of IDs of turbines to consider. - active_tower_ids (:obj:`list`): List of IDs of met towers to consider. - srs (:obj:`str`, optional): Used to define the coordinate - reference system (CRS). Defaults to the European - Petroleum Survey Group (EPSG) code 4326 to be used with - the World Geodetic System reference system, WGS 84. - - Returns: None - Sets asset 'geometry', 'nearest_turbine_id' and 'nearest_tower_id' column. 
- - """ - self.parse_geometry(srs) - self.calculate_nearest(active_turbine_ids, active_tower_ids) - - def parse_geometry(self, srs="epsg:4326", zone=None, longitude=None): - """Calculate UTM coordinates from latitude/longitude. - - The UTM system divides the Earth into 60 zones, each 6deg of - longitude in width. Zone 1 covers longitude 180deg to 174deg W; - zone numbering increases eastward to zone 60, which covers - longitude 174deg E to 180deg. The polar regions south of 80deg S - and north of 84deg N are excluded. - - Ref: http://geopandas.org/projections.html - - Args: - srs (:obj:`str`, optional): Used to define the coordinate - reference system (CRS). Defaults to the European - Petroleum Survey Group (EPSG) code 4326 to be used with - the World Geodetic System reference system, WGS 84. - zone (:obj:`int`, optional): UTM zone. If set to None - (default), then calculated from the longitude. - longitude (:obj:`float`, optional): Reference longitude for - calculating the UTM zone. If None (default), then taken - as the average longitude of all assets. - - Returns: None - Sets asset 'geometry' column. - """ - if zone is None: - # calculate zone - if longitude is None: - longitude = self.df["longitude"].mean() - zone = int(np.floor((180 + longitude) / 6.0)) + 1 - - to_crs = f"+proj=utm +zone={zone} +ellps=WGS84 +datum=WGS84 +units=m +no_defs" - transformer = Transformer.from_crs(srs.upper(), to_crs) - lats, lons = transformer.transform( - self._asset["latitude"].values, self._asset["longitude"].values - ) - self._asset["geometry"] = [Point(lat, lon) for lat, lon in zip(lats, lons)] - - def calculate_nearest(self, active_turbine_ids, active_tower_ids): - """Create or overwrite a column called 'nearest_turbine_id' or 'nearest_tower_id' which contains the asset id - of the closest active turbine or tower to the closest turbine or tower. 
The columns are only valid for turbines - or towers listed in the parameters of this function, and it will only calculate the value of the correct column - for each asset. Turbines, for example, will have null 'nearest_tower_id' and vice versa. - - Args: - active_turbine_ids (:obj:`list`): List of IDs of turbines to consider. - active_tower_ids (:obj:`list`): List of IDs of met towers to consider. - - Returns: None - Sets asset 'nearest_turbine_id' and 'nearest_tower_id' column. - """ - self._asset["nearest_turbine_id"] = None - if active_turbine_ids is not None and len(active_turbine_ids) > 0: - nn = self.nearest_neighbors() - for k, v in nn.items(): - v = [val for val in v if val in active_turbine_ids] - self._asset.loc[self._asset["id"] == k, "nearest_turbine_id"] = v[0] - if active_tower_ids is not None and len(active_tower_ids) > 0: - nt = self.nearest_towers() - self._asset["nearest_tower_id"] = None - for k, v in nt.items(): - v = [val for val in v if val in active_tower_ids] - self._asset.loc[self._asset["id"] == k, "nearest_tower_id"] = v[0] - - def distance_matrix(self): - ret = np.ones((self._asset.shape[0], self._asset.shape[0])) * -1 - for i, j in itertools.combinations(self._asset.index, 2): - point1 = self._asset.loc[i, "geometry"] - point2 = self._asset.loc[j, "geometry"] - distance = point1.distance(point2) - ret[i, j] = ret[j, i] = distance - return ret - - def asset_ids(self): - return self._asset.loc[:, "id"].values - - def tower_ids(self): - return self._asset.loc[self._asset["type"] == "tower", "id"].values - - def turbine_ids(self): - return self._asset.loc[self._asset["type"] == "turbine", "id"].values - - def remove_assets(self, to_delete): - self._asset = self._asset.loc[~self._asset["id"].isin(to_delete), :].reset_index(drop=True) - - def nearest_neighbors(self): - if self._nearest_neighbors is not None: - return self._nearest_neighbors - - ret = {} - towers = self._asset.loc[self._asset["type"] == "tower", :].index - turbines = 
self._asset.loc[self._asset["type"] == "turbine", :].index - m = self.distance_matrix() - for i in turbines: - row = m[i] - row[row == -1] = float("inf") - row[towers.tolist()] = float("inf") - ret[self._asset.loc[i, "id"]] = [self._asset.loc[x, "id"] for x in row.argsort()] - - self._nearest_neighbors = ret - return ret - - def nearest_tower_to(self, id): - return self._asset.loc[self._asset["id"] == id, "nearest_tower_id"].values[0] - - def nearest_turbine_to(self, id): - return self._asset.loc[self._asset["id"] == id, "nearest_turbine_id"].values[0] - - def nearest_towers(self): - if self._nearest_towers is not None: - return self._nearest_towers - - ret = {} - turbines = self._asset.loc[self._asset["type"] == "turbine", :].index - m = self.distance_matrix() - for i in turbines: - row = m[i] - row[row == -1] = float("inf") - row[turbines.tolist()] = float("inf") - ret[self._asset.loc[i, "id"]] = [self._asset.loc[x, "id"] for x in row.argsort()] - - self._nearest_towers = ret - return ret - - def rename_columns(self, mapping): - for k in list(mapping.keys()): - if k != mapping[k]: - self._asset[k] = self._asset[mapping[k]] - self._asset[mapping[k]] = None - - def head(self): - return self._asset.head() - - @property - def df(self): - return self._asset - - @df.setter - def df(self, value): - self._asset = value diff --git a/openoa/types/plant.py b/openoa/types/plant.py deleted file mode 100644 index 6f8f6ff8..00000000 --- a/openoa/types/plant.py +++ /dev/null @@ -1,427 +0,0 @@ -import io -import os -import json -import itertools -from dataclasses import dataclass - -import pandas as pd -from dateutil.parser import parse - -from openoa.types import timeseries_table -from openoa.types.asset import AssetData -from openoa.utils.reanalysis import ReanalysisData - - -# @dataclass -# class PlantDataV2: -# scada: pd.DataFrame -# meter: pd.DataFrame -# tower: pd.DataFrame -# status: pd.DataFrame -# curtail: pd.DataFrame -# asset: pd.DataFrame -# reanalysis: pd.DataFrame - 
-# name: str -# version: float = 2 - -# def validate(plant, schema): -# pass - -# def from_entr(thrift_server_host:str="localhost", -# thrift_server_port:int=10000, -# database:str="entr_warehouse", -# wind_plant:str="", -# aggregation:str="", -# date_range:list=None): -# """ -# from_entr - -# Load a PlantData object from data in an entr_warehouse. - -# Args: -# thrift_server_url(str): URL of the Apache Thrift server -# database(str): Name of the Hive database -# wind_plant(str): Name of the wind plant you'd like to load -# aggregation: Not yet implemented -# date_range: Not yet implemented - -# Returns: -# plant(PlantData): An OpenOA PlantData object. -# """ -# from pyhive import hive - -# conn = hive.Connection(host=thrift_server_host, port=thrift_server_port) - -# scada_query = """SELECT Wind_turbine_name as Wind_turbine_name, -# Date_time as Date_time, -# cast(P_avg as float) as P_avg, -# cast(Power_W as float) as Power_W, -# cast(Ws_avg as float) as Ws_avg, -# Wa_avg as Wa_avg, -# Va_avg as Va_avg, -# Ya_avg as Ya_avg, -# Ot_avg as Ot_avg, -# Ba_avg as Ba_avg - -# FROM entr_warehouse.la_haute_borne_scada_for_openoa -# """ - -# plant = PlantDataV2() - -# plant.scada.df = pd.read_sql(scada_query, conn) - -# conn.close() - -# validate(plant) - -# return plant - -# def from_plantdata_v1(plant_v1:PlantData): -# plant_v2 = PlantDataV2() -# plant_v2.scada = plant_v1.scada._df -# plant_v2.asset = plant_v1.asset._df -# plant_v2.meter = plant_v1.meter._df -# plant_v2.tower = plant_v1.tower._df -# plant_v2.status = plant_v1.status._df -# plant_v2.curtail = plant_v1.curtail._df -# plant_v2.reanalysis = plant_v1.reanalysis._df - -# # copy any other data members to their new location - -# # validate(plant_v2) - -# return plant_v2 - - -class PlantData(object): - """Data object for operational wind plant data. - - This class holds references to all tables associated with a wind plant. 
The tables are grouped by type: - - PlantData.scada - - PlantData.meter - - PlantData.tower - - PlantData.status - - PlantData.curtail - - PlantData.asset - - PlantData.reanalysis - - Each table must have columns following the following convention: - - - - The PlantData object can serialize all of these structures and reload them - them from the cache as needed. - - The underlying datastructure is a TimeseriesTable, which is agnostic to the underlying - engine and can be implemented with Pandas, Spark, or Dask (for instance). - - Individual plants will extend this object with their own - prepare() and other methods. - """ - - def __init__(self, path, name, engine="pandas", toolkit=["pruf_analysis"], schema=None): - """ - Create a plant data object without loading any data. - - Args: - path(string): path where data should be read/written - name(string): uniqiue name for this plant in case there's multiple plant's data in the directory - engine(string): backend engine - pandas, spark or dask - toolkit(list): the _tool_classes attribute defines a list of toolkit modules that can be loaded - - Returns: - New object - """ - if not schema: - dir = os.path.dirname(os.path.abspath(__file__)) - schema = dir + "/plant_schema.json" - with open(schema) as schema_file: - self._schema = json.load(schema_file) - - self._scada = timeseries_table.TimeseriesTable.factory(engine) - self._meter = timeseries_table.TimeseriesTable.factory(engine) - self._tower = timeseries_table.TimeseriesTable.factory(engine) - self._status = timeseries_table.TimeseriesTable.factory(engine) - self._curtail = timeseries_table.TimeseriesTable.factory(engine) - self._asset = AssetData(engine) - self._reanalysis = ReanalysisData(engine) - self._name = name - self._path = path - self._engine = engine - - self._version = 1 - - self._status_labels = ["full", "unavailable"] - - self._tables = [ - "_scada", - "_meter", - "_status", - "_tower", - "_asset", - "_curtail", - "_reanalysis", - ] - - def 
amend_std(self, dfname, new_fields): - """ - Amend a dataframe standard with new or changed fields. Consider running ensure_columns afterward to - automatically create the new required columns if they don't exist. - - Args: - dfname (string): one of scada, status, curtail, etc. - new_fields (dict): set of new fields and types in the same format as _scada_std to be added/changed in - the std - - Returns: - New data field standard - """ - - k = "_%s_std" % (dfname,) - setattr( - self, k, dict(itertools.chain(iter(getattr(self, k).items()), iter(new_fields.items()))) - ) - - def get_time_range(self): - """Get time range as tuple - - Returns: - (tuple): - start_time(datetime): start time - stop_time(datetime): stop time - """ - return (self._start_time, self._stop_time) - - def set_time_range(self, start_time, stop_time): - """Set time range given two unparsed timestamp strings - - Args: - start_time(string): start time - stop_time(string): stop time - - Returns: - (None) - """ - self._start_time = parse(start_time) - self._stop_time = parse(stop_time) - - def save(self, path=None): - """Save out the project and all JSON serializeable attributes to a file path. - - Args: - path(string): Location of new directory into which plant will be saved. The directory should not - already exist. Defaults to self._path - - Returns: - (None) - """ - if path is None: - raise RuntimeError("Path not specified.") - - os.mkdir(path) - - meta_dict = {} - for ca, ci in self.__dict__.items(): - if ca in self._tables: - ci.save(path, ca) - elif ca in ["_start_time", "_stop_time"]: - meta_dict[ca] = str(ci) - else: - meta_dict[ca] = ci - - with io.open(os.path.join(path, "metadata.json"), "w", encoding="utf-8") as outfile: - outfile.write(str(json.dumps(meta_dict, ensure_ascii=False))) - - def load(self, path=None): - """Load this project and all associated data from a file path - - Args: - path(string): Location of plant data directory. 
Defaults to self._path - - Returns: - (None) - """ - if not path: - path = self._path - - for df in self._tables: - getattr(self, df).load(path, df) - - meta_path = os.path.join(path, "metadata.json") - if os.path.exists(meta_path): - with io.open(os.path.join(path, "metadata.json"), "r") as infile: - meta_dict = json.load(infile) - for ca, ci in meta_dict.items(): - if ca in ["_start_time", "_stop_time"]: - ci = parse(ci) - setattr(self, ca, ci) - - def ensure_columns(self): - """@deprecated Ensure all dataframes contain necessary columns and format as needed""" - raise NotImplementedError("ensure_columns has been deprecated. Use plant.validate instead.") - - def validate(self, schema=None): - - """Validate this plant data object against its schema. Returns True if valid, Rasies an exception if not valid.""" - - if not schema: - schema = self._schema - - for field in schema["fields"]: - if field["type"] == "timeseries": - attr = "_{}".format(field["name"]) - if not getattr(self, attr).is_empty(): - getattr(self, attr).validate(field) - - return True - - def merge_asset_metadata(self): - """Merge metadata from the asset table into the scada and tower tables""" - if not (self._scada.is_empty()) and (len(self._asset.turbine_ids()) > 0): - self._scada.pandas_merge( - self._asset.df, - [ - "latitude", - "longitude", - "rated_power_kw", - "id", - "nearest_turbine_id", - "nearest_tower_id", - ], - "left", - on="id", - ) - if not (self._tower.is_empty()) and (len(self._asset.tower_ids()) > 0): - self._tower.pandas_merge( - self._asset.df, - [ - "latitude", - "longitude", - "rated_power_kw", - "id", - "nearest_turbine_id", - "nearest_tower_id", - ], - "left", - on="id", - ) - - def prepare(self): - """Prepare this object for use by loading data and doing essential preprocessing.""" - self.ensure_columns() - if not ((self._scada.is_empty()) or (self._tower.is_empty())): - self._asset.prepare(self._scada.unique("id"), self._tower.unique("id")) - self.merge_asset_metadata() - 
- @property - def scada(self): - return self._scada - - @property - def meter(self): - return self._meter - - @property - def tower(self): - return self._tower - - @property - def reanalysis(self): - return self._reanalysis - - @property - def status(self): - return self._status - - @property - def asset(self): - return self._asset - - @property - def curtail(self): - return self._curtail - - @classmethod - def from_entr( - cls, - thrift_server_host="localhost", - thrift_server_port=10000, - database="entr_warehouse", - wind_plant="", - aggregation="", - date_range=None, - ): - """ - from_entr - - Load a PlantData object from data in an entr_warehouse. - - Args: - thrift_server_host(str): URL of the Apache Thrift server - thrift_server_port(int): Port of the Apache Thrift server - database(str): Name of the Hive database - wind_plant(str): Name of the wind plant you'd like to load - aggregation: Not yet implemented - date_range: Not yet implemented - - Returns: - plant(PlantData): An OpenOA PlantData object. - """ - from pyhive import hive - - plant = cls( - database, wind_plant - ) # Passing in database as the path and wind_plant as the name for now. - - conn = hive.Connection(host=thrift_server_host, port=thrift_server_port) - - scada_query = f"""SELECT Wind_turbine_name as Wind_turbine_name, - Date_time as Date_time, - cast(P_avg as float) as P_avg, - cast(Power_W as float) as Power_W, - cast(Ws_avg as float) as Ws_avg, - Wa_avg as Wa_avg, - Va_avg as Va_avg, - Ya_avg as Ya_avg, - Ot_avg as Ot_avg, - Ba_avg as Ba_avg - - FROM {database}.{wind_plant} - """ - - plant.scada.df = pd.read_sql(scada_query, conn) - - conn.close() - - return plant - - @classmethod - def from_pandas(cls, scada, meter, status, tower, asset, curtail, reanalysis): - """ - from_pandas - - Create a PlantData object from a collection of Pandas data frames. 
- - Args: - scada: - meter: - status: - tower: - asset: - curtail: - reanalysis: - - Returns: - plant(PlantData): An OpenOA PlantData object. - """ - plant = cls() - - plant.scada.df = scada - plant.meter.df = meter - plant.status.df = status - plant.tower.df = tower - plant.asset.df = asset - plant.curtail.df = curtail - plant.reanalysis.df = reanalysis - - plant.validate() diff --git a/openoa/types/plant_schema.json b/openoa/types/plant_schema.json deleted file mode 100644 index a50d0d0f..00000000 --- a/openoa/types/plant_schema.json +++ /dev/null @@ -1,199 +0,0 @@ -{ - "description": "Schema for OpenOA PlantData objects", - "fields": [ - { - "description": "SCADA data at fixed time interval from all turbines in plant", - "fields": [ - { - "description": "end of bin", - "name": "time", - "required": true, - "type": "datetime64[ns]" - }, - { - "description": "foreign key to asset table", - "name": "id", - "required": true, - "type": "string" - }, - { - "name": "power_kw", - "type": "float64" - }, - { - "name": "windspeed_ms", - "type": "float64", - "unit": "m/s" - }, - { - "name": "winddirection_deg", - "type": "float64", - "unit": "deg" - }, - { - "name": "status_label", - "type": "string" - }, - { - "name": "pitch_deg", - "type": "float64", - "unit": "deg" - }, - { - "name": "temp_c", - "type": "float64", - "unit": "deg celsius" - } - ], - "metadata": [ - { - "description": "Frequency of this table in Hz", - "name": "frequency", - "required": true, - "type": "float64" - } - ], - "name": "scada", - "type": "timeseries" - }, - { - "description": "SCADA data at fixed time interval from met towers included with the plant", - "fields": [ - { - "description": "end of bin", - "name": "time", - "required": true, - "type": "datetime64[ns]" - }, - { - "description": "foreign key to asset table", - "name": "id", - "required": true, - "type": "string" - } - ], - "name": "tower", - "type": "timeseries" - }, - { - "description": "SCADA data at fixed time interval from met 
towers included with the plant", - "fields": [ - { - "description": "end of bin", - "name": "time", - "required": true, - "type": "datetime64[ns]" - }, - { - "description": "power in kw", - "name": "power_kw", - "type": "int64" - }, - { - "name": "energy_kwh", - "type": "float64", - "unit": "kw/h" - } - ], - "name": "meter", - "type": "timeseries" - }, - { - "description": "SCADA data at fixed time interval from met towers included with the plant", - "fields": [ - { - "description": "end of bin", - "name": "time", - "required": true, - "type": "datetime64[ns]" - }, - { - "description": "foreign key to asset table", - "name": "id", - "required": true, - "type": "string" - }, - { - "name": "status_id", - "type": "int64" - }, - { - "name": "status_code", - "type": "int64" - }, - { - "name": "status_text", - "type": "string" - } - ], - "name": "status", - "type": "timeseries" - }, - { - "description": "SCADA data at fixed time interval from met towers included with the plant", - "fields": [ - { - "description": "end of bin", - "name": "time", - "required": true, - "type": "datetime64[ns]" - }, - { - "description": "percent of plant that is curtailed", - "name": "curtailment_pct", - "type": "float64" - }, - { - "description": "kwh of plant that is curtailed", - "name": "curtailment_kwh", - "type": "float64" - }, - { - "name": "availability_pct", - "type": "float64" - }, - { - "name": "availability_kwh", - "type": "float64" - }, - { - "name": "net_energy", - "type": "float64" - } - ], - "name": "curtail", - "type": "timeseries" - }, - { - "description": "SCADA data at fixed time interval from met towers included with the plant", - "fields": [ - { - "description": "primary unique key", - "name": "id", - "required": true, - "type": "string" - }, - { - "name": "latitude", - "type": "float64" - }, - { - "name": "longitude", - "type": "float64" - }, - { - "name": "rated_power_kw", - "type": "float64" - }, - { - "name": "type", - "type": "string" - } - ], - "name": "asset", - 
"type": "table" - } - ], - "name": "PlantData", - "version": 0.1 -} diff --git a/openoa/types/plant_schema_25.json b/openoa/types/plant_schema_25.json deleted file mode 100644 index 8c02ebdc..00000000 --- a/openoa/types/plant_schema_25.json +++ /dev/null @@ -1,168 +0,0 @@ -{ - "description": "Schema for OpenOA PlantData objects", - "fields": [ - { - "description": "SCADA data at fixed time interval from all turbines in plant", - "fields": [ - { - "description": "end of bin", - "name": "time", - "required": true, - "type": "datetime64[ns]" - }, - { - "description": "foreign key to asset table", - "name": "asset_id", - "required": true, - "type": "int64" - }, - { - "name": "status_label", - "type": "string" - }, - { - "name": "wgen_activepw_avg", - "type": "float64", - "unit": "kw" - }, - { - "name": "wrot_bladeposition_avg", - "type": "float64", - "unit": "deg" - }, - { - "name": "wnac_windspeed_avg", - "type": "float64", - "unit": "m/s" - }, - { - "name": "wnac_winddirection_avg", - "type": "float64", - "unit": "deg" - }, - { - "name": "wnac_temout_avg", - "type": "float64", - "unit": "deg celsius" - } - ], - "metadata": [ - { - "description": "Frequency of this table in Hz", - "name": "frequency", - "required": true, - "type": "float64" - } - ], - "name": "scada", - "type": "timeseries" - }, - { - "description": "SCADA data at fixed time interval from met towers included with the plant", - "fields": [ - { - "description": "end of bin", - "name": "time", - "required": true, - "type": "datetime64[ns]" - }, - { - "description": "foreign key to asset table", - "name": "asset_id", - "required": true, - "type": "int64" - }, - { - "name": "wmet_winddirection_avg", - "type": "float64", - "unit": "deg" - } - ], - "name": "tower", - "type": "timeseries" - }, - { - "description": "SCADA data at fixed time interval from met towers included with the plant", - "fields": [ - { - "description": "end of bin", - "name": "time", - "required": true, - "type": "datetime64[ns]" - }, - 
{ - "description": "foreign key to asset table", - "name": "asset_id", - "required": true, - "type": "int64" - }, - { - "name": "wmet_winddirection_avg", - "type": "float64", - "unit": "deg" - } - ], - "name": "meter", - "type": "timeseries" - }, - { - "description": "SCADA data at fixed time interval from met towers included with the plant", - "fields": [ - { - "description": "end of bin", - "name": "time", - "required": true, - "type": "datetime64[ns]" - }, - { - "description": "foreign key to asset table", - "name": "asset_id", - "required": true, - "type": "int64" - } - ], - "name": "status", - "type": "timeseries" - }, - { - "description": "SCADA data at fixed time interval from met towers included with the plant", - "fields": [ - { - "description": "end of bin", - "name": "time", - "required": true, - "type": "datetime64[ns]" - }, - { - "description": "foreign key to asset table", - "name": "asset_id", - "required": true, - "type": "int64" - } - ], - "name": "curtail", - "type": "timeseries" - }, - { - "description": "SCADA data at fixed time interval from met towers included with the plant", - "fields": [ - { - "description": "end of bin", - "name": "time", - "required": true, - "type": "datetime64[ns]" - }, - { - "description": "foreign key to asset table", - "name": "asset_id", - "required": true, - "type": "int64" - } - ], - "name": "asset", - "type": "table" - } - ], - "name": "PlantData", - "version": 0.2 -} diff --git a/openoa/types/plant_v2.py b/openoa/types/plant_v2.py deleted file mode 100644 index e2f713da..00000000 --- a/openoa/types/plant_v2.py +++ /dev/null @@ -1,1826 +0,0 @@ -from __future__ import annotations - -import io -import os -import json -import itertools -from typing import Callable, Optional, Sequence -from pathlib import Path -from dataclasses import dataclass - -import attr -import yaml -import numpy as np -import pandas as pd -import pyspark as spark -from attr import define, fields, fields_dict -from dateutil.parser import parse 
- -import openoa.toolkits.met_data_processing as met -from openoa.types import timeseries_table -from openoa.types.asset import AssetData -from openoa.utils.reanalysis import ReanalysisData -from openoa.toolkits.reanalysis_downloading import download_reanalysis_data_planetos - - -# PlantData V2 with Attrs Dataclass - -# Datetime frequency checks -_at_least_monthly = ("M", "MS", "W", "D", "H", "T", "min", "S", "L", "ms", "U", "us", "N") -_at_least_daily = ("D", "H", "T", "min", "S", "L", "ms", "U", "us", "N") -_at_least_hourly = ("H", "T", "min", "S", "L", "ms", "U", "us", "N") - -ANALYSIS_REQUIREMENTS = { - "MonteCarloAEP": { - "meter": { - "columns": ["energy"], - "freq": _at_least_monthly, - }, - "curtail": { - "columns": ["availability", "curtailment"], - "freq": _at_least_monthly, - }, - "reanalysis": { - "columns": ["windspeed", "density"], - "conditional_columns": { - "reg_temperature": ["temperature"], - "reg_wind_direction": ["windspeed_u", "windspeed_v"], - }, - "freq": _at_least_monthly, - }, - }, - "TurbineLongTermGrossEnergy": { - "scada": { - "columns": ["id", "windspeed", "power"], # TODO: wtur_W_avg vs energy_kwh ? - "freq": _at_least_daily, - }, - "reanalysis": { - "columns": ["windspeed", "wind_direction", "density"], - "freq": _at_least_daily, - }, - }, - "ElectricalLosses": { - "scada": { - "columns": ["energy"], - "freq": _at_least_daily, - }, - "meter": { - "columns": ["energy"], - "freq": _at_least_monthly, - }, - }, -} - - -def analysis_type_validator( - instance: PlantDataV3, attribute: attr.Attribute, value: list[str] -) -> None: - """Validates the input from `PlantDataV3` against the analysis requirements in - `ANALYSIS_REQUIREMENTS`. If there is an error, then it gets added to the - `PlantDataV3._errors` dictionary to be raised in the post initialization hook. - - Args: - instance (PlantDataV3): The PlantData object. - attribute (attr.Attribute): The converted `analysis_type` attribute object. 
- value (list[str]): The input value from `analysis_type`. - """ - if None in value: - UserWarning("`None` was provided to `analysis_type`, so no validation will occur.") - - valid_types = [*ANALYSIS_REQUIREMENTS] + ["all", None] - incorrect_types = set(value).difference(set(valid_types)) - if incorrect_types: - raise ValueError( - f"{attribute.name} input: {incorrect_types} is invalid, must be one of 'all' or a combination of: {[*ANALYSIS_REQUIREMENTS]}" - ) - - -def frequency_validator( - actual_freq: str, desired_freq: Optional[str | set[str]], exact: bool -) -> bool: - """Helper function to check if the actual datetime stamp frequency is valid compared - to what is required. - - Args: - actual_freq (str): The frequency of the datetime stamp, or `df.index.freq`. - desired_freq (Optional[str | set[str]]): Either the exact frequency required, - or a set of options that are also valid, in which case any numeric - information encoded in `actual_freq` will be dropped. - exact (bool): If the provided frequency codes should be exact matches (`True`), - or, if `False`, the check should be for a combination of matches. - - Returns: - bool: If the actual datetime frequency is sufficient, per the match requirements. - """ - if exact: - return actual_freq != desired_freq - - if desired_freq is None: - return True - - actual_freq = "".join(filter(str.isalpha, actual_freq)) - return actual_freq in desired_freq - - -@define(auto_attribs=True) -class FromDictMixin: - """A Mixin class to allow for kwargs overloading when a data class doesn't - have a specific parameter definied. This allows passing of larger dictionaries - to a data class without throwing an error. - - Raises - ------ - AttributeError - Raised if the required class inputs are not provided. - """ - - @classmethod - def from_dict(cls, data: dict): - """Maps a data dictionary to an `attrs`-defined class. 
- TODO: Add an error to ensure that either none or all the parameters are passed in - Args: - data : dict - The data dictionary to be mapped. - Returns: - cls - The `attrs`-defined class. - """ - # Get all parameters from the input dictionary that map to the class initialization - kwargs = { - a.name: data[a.name] - for a in cls.__attrs_attrs__ # type: ignore - if a.name in data and a.init - } - - # Map the inputs must be provided: 1) must be initialized, 2) no default value defined - required_inputs = [ - a.name - for a in cls.__attrs_attrs__ # type: ignore - if a.init and isinstance(a.default, attr._make._Nothing) # type: ignore - ] - undefined = sorted(set(required_inputs) - set(kwargs)) - if undefined: - raise AttributeError( - f"The class defintion for {cls.__name__} is missing the following inputs: {undefined}" - ) - return cls(**kwargs) # type: ignore - - -######################################### -# Define the meta data validation classes -######################################### -@define(auto_attribs=True) -class SCADAMetaData(FromDictMixin): - """A metadata schematic to create the necessary column mappings and other validation - components, or other data about the SCADA data, that will contribute to a larger - plant metadata schema/routine. - - Args: - time (str): The datetime stamp for the SCADA data, by default "time". This data should be of - type: `np.datetime64[ns]`. 
Additional columns describing the datetime stamps - are: `frequency` - """ - - # DataFrame columns - time: str = attr.ib(default="time") - id: str = attr.ib(default="id") - power: str = attr.ib(default="power") - windspeed: str = attr.ib(default="windspeed") - wind_direction: str = attr.ib(default="wind_direction") - status: str = attr.ib(default="status") - pitch: str = attr.ib(default="pitch") - temperature: str = attr.ib(default="temperature") - - # Data about the columns - frequency: str = attr.ib(default="10T") - - # Parameterizations that should not be changed - # Prescribed mappings, datatypes, and units for in-code reference. - name: str = attr.ib(default="scada", init=False) - col_map: dict = attr.ib(init=False) - dtypes: dict = attr.ib( - default=dict( - time=np.datetime64, - id=str, - power=float, - windspeed=float, - wind_direction=float, - status=str, - pitch=float, - temperature=float, - ), - init=False, # don't allow for user input - ) - units: dict = attr.ib( - default=dict( - time="datetim64[ns]", - id=None, - power="kW", - windspeed="m/s", - wind_direction="deg", - status=None, - pitch="deg", - temperature="C", - ), - init=False, # don't allow for user input - ) - - def __attrs_post_init__(self) -> None: - self.col_map = dict( - time=self.time, - id=self.id, - power=self.power, - windspeed=self.windspeed, - wind_direction=self.wind_direction, - status=self.status, - pitch=self.pitch, - temperature=self.temperature, - ) - - -@define(auto_attribs=True) -class MeterMetaData(FromDictMixin): - - # DataFrame columns - time: str = attr.ib(default="time") - power: str = attr.ib(default="power") - energy: str = attr.ib(default="energy") - - # Parameterizations that should not be changed - # Prescribed mappings, datatypes, and units for in-code reference. 
- name: str = attr.ib(default="meter", init=False) - col_map: dict = attr.ib(init=False) - dtypes: dict = attr.ib( - default=dict( - time=np.datetime64, - power=float, - energy=float, - ), - init=False, # don't allow for user input - ) - units: dict = attr.ib( - default=dict( - time="datetim64[ns]", - power="kW", - energy="kW", - ), - init=False, # don't allow for user input - ) - - def __attrs_post_init__(self) -> None: - self.col_map = dict( - time=self.time, - power=self.power, - energy=self.energy, - ) - - -@define(auto_attribs=True) -class TowerMetaData(FromDictMixin): - # DataFrame columns - time: str = attr.ib(default="time") - id: str = attr.ib(default="id") - - # Parameterizations that should not be changed - # Prescribed mappings, datatypes, and units for in-code reference. - name: str = attr.ib(default="tower", init=False) - col_map: dict = attr.ib(init=False) - dtypes: dict = attr.ib( - default=dict( - time=np.datetime64, - id=str, - ), - init=False, # don't allow for user input - ) - units: dict = attr.ib( - default=dict( - time="datetim64[ns]", - id=None, - ), - init=False, # don't allow for user input - ) - - def __attrs_post_init__(self) -> None: - self.col_map = dict( - time=self.time, - id=self.id, - ) - - -@define(auto_attribs=True) -class StatusMetaData(FromDictMixin): - # DataFrame columns - time: str = attr.ib(default="time") - id: str = attr.ib(default="id") - status_id: str = attr.ib(default="status_id") - status_code: str = attr.ib(default="status_code") - status_text: str = attr.ib(default="status_text") - - # Data about the columns - frequency: str = attr.ib(default="10T") - - # Parameterizations that should not be changed - # Prescribed mappings, datatypes, and units for in-code reference. 
- name: str = attr.ib(default="status", init=False) - col_map: dict = attr.ib(init=False) - dtypes: dict = attr.ib( - default=dict( - time=np.datetime64, - id=str, - status_id=np.int64, - status_code=np.int64, - status_text=str, - ), - init=False, # don't allow for user input - ) - units: dict = attr.ib( - default=dict( - time="datetim64[ns]", - id=None, - status_id=None, - status_code=None, - status_text=None, - ), - init=False, # don't allow for user input - ) - - def __attrs_post_init__(self) -> None: - self.col_map = dict( - time=self.time, - id=self.id, - status_id=self.status_id, - status_code=self.status_code, - status_text=self.status_text, - ) - - -@define(auto_attribs=True) -class CurtailMetaData(FromDictMixin): - # DataFrame columns - time: str = attr.ib(default="time") - curtailment: str = attr.ib(default="curtailment") - availability: str = attr.ib(default="availability") - net_energy: str = attr.ib(default="net_energy") - - # Data about the columns - frequency: str = attr.ib(default="10T") - - # Parameterizations that should not be changed - # Prescribed mappings, datatypes, and units for in-code reference. 
- name: str = attr.ib(default="curtail", init=False) - col_map: dict = attr.ib(init=False) - dtypes: dict = attr.ib( - default=dict( - time=np.datetime64, - curtailment=float, - availability=float, - net_energy=float, - ), - init=False, # don't allow for user input - ) - units: dict = attr.ib( - default=dict( - time="datetim64[ns]", - curtailment=float, - availability=float, - net_energy="kW", - ), - init=False, # don't allow for user input - ) - - def __attrs_post_init__(self) -> None: - self.col_map = dict( - time=self.time, - curtailment=self.curtailment, - availability=self.availability, - net_energy=self.net_energy, - ) - - -@define(auto_attribs=True) -class AssetMetaData(FromDictMixin): - # DataFrame columns - id: str = attr.ib(default="id") - latitude: str = attr.ib(default="latitude") - longitude: str = attr.ib(default="longitude") - rated_power: str = attr.ib(default="rated_power") - hub_height: str = attr.ib(default="hub_height") - rotor_diameter: str = attr.ib(default="rotor_diameter") - elevation: str = attr.ib(default="elevation") - type: str = attr.ib(default="type") - - # Parameterizations that should not be changed - # Prescribed mappings, datatypes, and units for in-code reference. 
- name: str = attr.ib(default="asset", init=False) - col_map: dict = attr.ib(init=False) - dtypes: dict = attr.ib( - default=dict( - id=str, - latitude=float, - longitude=float, - rated_power=float, - hub_height=float, - rotor_diameter=float, - elevation=float, - type=str, - ), - init=False, # don't allow for user input - ) - units: dict = attr.ib( - default=dict( - id=None, - latitude="WGS84", - longitude="WGS84", - rated_power="kW", - hub_height="m", - rotor_diameter="m", - elevation="m", - type=None, - ), - init=False, # don't allow for user input - ) - - def __attrs_post_init__(self) -> None: - self.col_map = dict( - id=self.id, - latitude=self.latitude, - longitude=self.longitude, - rated_power=self.rated_power, - hub_height=self.rated_power, - rotor_diameter=self.rated_power, - elevation=self.rated_power, - type=self.type, - ) - - -@define(auto_attribs=True) -class ReanalysisMetaData(FromDictMixin): - # DataFrame columns - time: str = attr.ib(default="time") - windspeed: str = attr.ib(default="windspeed") - windspeed_u: str = attr.ib(default="windspeed_u") - windspeed_v: str = attr.ib(default="windspeed_v") - wind_direction: str = attr.ib(default="wind_direction") - temperature: str = attr.ib(default="temperature") - density: str = attr.ib(default="density") - surface_pressure: str = attr.ib(default="surface_pressure") - - # Data about the columns - frequency: str = attr.ib(default="10T") - - # Parameterizations that should not be changed - # Prescribed mappings, datatypes, and units for in-code reference. 
- name: str = attr.ib(default="reanalysis", init=False) - col_map: dict = attr.ib(init=False) - dtypes: dict = attr.ib( - default=dict( - time=np.datetime64, - windspeed=float, - windspeed_u=float, - windspeed_v=float, - wind_direction=float, - temperature=float, - density=float, - surface_pressure=float, - ), - init=False, # don't allow for user input - ) - units: dict = attr.ib( - default=dict( - time="datetim64[ns]", - windspeed="m/s", - windspeed_u="m/s", - windspeed_v="m/s", - wind_direction="deg", - temperature="K", - density="kg/m^3", - surface_pressure="Pa", - ), - init=False, # don't allow for user input - ) - - def __attrs_post_init__(self) -> None: - self.col_map = dict( - time=self.time, - windspeed=self.windspeed, - windspeed_u=self.windspeed_u, - windspeed_v=self.windspeed_v, - wind_direction=self.wind_direction, - temperature=self.temperature, - density=self.density, - surface_pressure=self.surface_pressure, - ) - - -def convert_reanalysis(value: dict[str, dict]): - return {k: ReanalysisMetaData.from_dict(v) for k, v in value.items()} - - -@define(auto_attribs=True) -class PlantMetaData(FromDictMixin): - """Composese the individual metadata/validation requirements from each of the - individual data "types" that can compose a `PlantData` object. 
- """ - - latitude: float = attr.ib(default=0, converter=float) - longitude: float = attr.ib(default=0, converter=float) - scada: SCADAMetaData = attr.ib(default={}, converter=SCADAMetaData.from_dict) - meter: MeterMetaData = attr.ib(default={}, converter=MeterMetaData.from_dict) - tower: TowerMetaData = attr.ib(default={}, converter=TowerMetaData.from_dict) - status: StatusMetaData = attr.ib(default={}, converter=StatusMetaData.from_dict) - curtail: CurtailMetaData = attr.ib(default={}, converter=CurtailMetaData.from_dict) - asset: AssetMetaData = attr.ib(default={}, converter=AssetMetaData.from_dict) - reanalysis: dict[str, ReanalysisMetaData] = attr.ib(default={}, converter=convert_reanalysis) - - @property - def column_map(self): - values = dict( - scada=self.scada.col_map, - meter=self.meter.col_map, - tower=self.tower.col_map, - status=self.status.col_map, - asset=self.asset.col_map, - curtail=self.curtail.col_map, - reanalysis={k: v.col_map for k, v in self.reanalysis.items()}, - ) - return values - - @property - def type_map(self): - types = dict( - scada=self.scada.dtypes, - meter=self.meter.dtypes, - tower=self.tower.dtypes, - status=self.status.dtypes, - asset=self.asset.dtypes, - curtail=self.curtail.dtypes, - reanalysis={k: v.dtypes for k, v in self.reanalysis.items()}, - ) - return types - - @property - def coordinates(self) -> tuple[float, float]: - """Returns the latitude, longitude pair for the wind power plant. 
- - Returns: - tuple[float, float]: The (latitude, longitude) pair - """ - return self.latitude, self.longitude - - @classmethod - def from_json(cls, metadata_file: str | Path) -> PlantMetaData: - metadata_file = Path(metadata_file).resolve() - if not metadata_file.is_file(): - raise FileExistsError(f"Input JSON file: {metadata_file} is an invalid input.") - - with open(metadata_file) as f: - return cls.from_dict(json.load(f)) - - @classmethod - def from_yaml(cls, metadata_file: str | Path) -> PlantMetaData: - metadata_file = Path(metadata_file).resolve() - if not metadata_file.is_file(): - raise FileExistsError(f"Input YAML file: {metadata_file} is an invalid input.") - - with open(metadata_file) as f: - return cls.from_dict(yaml.safe_load(f)) - - @classmethod - def load(cls, data: str | Path | dict | PlantMetaData) -> PlantMetaData: - if isinstance(data, PlantMetaData): - return data - - if isinstance(data, str): - data = Path(data).resolve() - - if isinstance(data, Path): - if data.suffix == ".json": - return cls.from_json(data) - elif data.suffix in (".yaml", ".yml"): - return cls.from_yaml(data) - else: - raise ValueError("Bad input file extension, must be one of: .json, .yml, or .yaml") - - if isinstance(data, dict): - return cls.from_dict(data) - - raise ValueError("PlantMetaData can only be loaded from str, Path, or dict objects.") - - def frequency_requirements(self, analysis_types: list[str | None]) -> dict[str, set[str]]: - """Creates the frequency requirements - - Args: - analysis_types (list[str | None]): _description_ - - Returns: - dict[str, set[str]]: _description_ - """ - requirements = {key: ANALYSIS_REQUIREMENTS[key] for key in analysis_types} - frequency_requirements = { - key: {name: value["freq"] for name, value in values.items()} - for key, values in requirements.items() - } - frequency = { - k: [] - for k in set( - itertools.chain.from_iterable([[*val] for val in frequency_requirements.values()]) - ) - } - for vals in 
frequency_requirements.values(): - for name, req in vals.items(): - reqs = frequency[name] - if reqs == []: - frequency[name] = set(req) - else: - frequency[name] = reqs.intersection(req) - return frequency - - -#################################################### -# Define the data validator and conversion functions -#################################################### - - -def convert_to_list( - value: Sequence | str | int | float | None, - manipulation: Callable | None = None, -) -> list: - """Converts an unknown element that could be a list or single, non-sequence element - to a list of elements. - - Parameters - ---------- - value : Sequence | str | int | float - The unknown element to be converted to a list of element(s). - manipulation: Callable | None - A function to be performed upon the individual elements, by default None. - - Returns - ------- - list - The new list of elements. - """ - - if isinstance(value, (str, int, float, None)): - value = [value] - if manipulation is not None: - return [manipulation(el) for el in value] - return list(value) - - -def column_validator(df: pd.DataFrame, column_names={}) -> None | list[str]: - """Validates that the column names exist as provided for each expected column. - - Args: - df (pd.DataFrame): The DataFrame for column naming validation - column_names (dict, optional): Dictionary of column type (key) to real column - value (value) pairs. Defaults to {}. - - Returns: - None | list[str]: A list of error messages that can be raised at a later step - in the validation process. - """ - try: - missing = set(column_names.values()).difference(df.columns) - except AttributeError: - # Catches 'NoneType' object has no attribute 'columns' for no data - missing = column_names.values() - if missing: - return list(missing) - return [] - - -def dtype_converter(df: pd.DataFrame, column_types={}) -> None | list[str]: - """Converts the columns provided in `column_types` of `df` to the appropriate data - type. 
- - Args: - df (pd.DataFrame): The DataFrame for type validation/conversion - column_types (dict, optional): Dictionary of column name (key) and data type - (value) pairs. Defaults to {}. - - Returns: - None | list[str]: List of error messages that were encountered in the conversion - process that will be raised at another step of the data validation. - """ - errors = [] - for column, new_type in column_types.items(): - if new_type in (np.datetime64, pd.DatetimeIndex): - try: - df[column] = pd.to_datetime(df[column], utc=True) - except Exception as e: # noqa: disable=E722 - errors.append(column) - continue - try: - df[column] = df[column].astype(new_type) - except: # noqa: disable=E722 - errors.append(column) - - if errors: - return errors - return [] - - -def analysis_filter(error_dict: dict, analysis_types: list[str] = ["all"]) -> dict: - if "all" in analysis_types: - return error_dict - - categories = ("scada", "meter", "tower", "curtail", "reanalysis", "asset") - requirements = {key: ANALYSIS_REQUIREMENTS[key] for key in analysis_types} - column_requirements = { - cat: set( - itertools.chain(*[r.get(cat, {}).get("columns", []) for r in requirements.values()]) - ) - for cat in categories - } - - # Filter the missing columns, so only analysis-specific columns are provided - error_dict["missing"] = { - key: values.intersection(error_dict["missing"].get(key, [])) - for key, values in column_requirements.items() - } - - # Filter the bad dtype columns, so only analysis-specific columns are provided - error_dict["dtype"] = { - key: values.intersection(error_dict["dtype"].get(key, [])) - for key, values in column_requirements.items() - } - - # Filter the incorrect frequencies, so only analysis-specific categories are provided - # TODO - # error_dict["frequency"] = { - # key: value["freq"] for key, value in requirements if value["freq"] not in frequency - # } - - return error_dict - - -def compose_error_message(error_dict: dict, analysis_types: list[str] = ["all"]) -> 
str: - """Takes a dictionary of error messages from the `PlantDataV3` validation routines, - filters out errors unrelated to the intended analysis types, and creates a - human-readable error message. - - Args: - error_dict (dict): See `PlantDataV3._errors` for more details. - analysis_types (list[str], optional): The user-input analysis types, which are - used to filter out unlreated errors. Defaults to ["all"]. - - Returns: - str: The human-readable error message breakdown. - """ - if "all" not in analysis_types: - error_dict = analysis_filter(error_dict, analysis_types) - - messages = [ - f"`{name}` data is missing the following columns: {cols}" - for name, cols in error_dict["missing"].items() - if len(cols) > 0 - ] - messages.extend( - [ - f"`{name}` data columns were of the wrong type: {cols}" - for name, cols in error_dict["dtype"].items() - if len(cols) > 0 - ] - ) - messages.extend([f"`{name}` data is of the wrong frequecy" for name in error_dict["frequency"]]) - return "\n".join(messages) - - -def load_to_pandas(data: str | Path | pd.DataFrame | spark.sql.DataFrame) -> pd.DataFrame | None: - """Loads the input data or filepath to apandas DataFrame. - - Args: - data (str | Path | pd.DataFrame | spark.DataFrame): The input data. - - Raises: - ValueError: Raised if an invalid data type was passed. - - Returns: - pd.DataFrame | None: The passed `None` or the converted pandas DataFrame object. - """ - if data is None: - return data - elif isinstance(data, (str, Path)): - return pd.read_csv(data) - elif isinstance(data, pd.DataFrame): - return data - elif isinstance(data, spark.sql.DataFrame): - return data.toPandas() - else: - raise ValueError("Input data could not be converted to pandas") - - -def rename_columns(df: pd.DataFrame, col_map: dict, reverse: bool = True) -> pd.DataFrame: - """Renames the pandas DataFrame columns using col_map. Intended to be used in - conjunction with the a data objects meta data column mapping (reverse=True). 
- - Args: - df (pd.DataFrame): The DataFrame to have its columns remapped. - col_map (dict): Dictionary of existing column names and new column names. - reverse (bool, optional): True, if the new column names are the keys (using the - xxMetaData.col_map as input), or False, if the current column names are the - values. Defaults to True. - - Returns: - pd.DataFrame: Input DataFrame with remapped column names. - """ - if reverse: - col_map = {v: k for k, v in col_map.items()} - return df.rename(columns=col_map) - - -############################ -# Define the PlantData class -############################ - - -@define(auto_attribs=True) -class PlantDataV3: - """Data object for operational wind plant data, which can serialize all of these - structures and reload them them from the cache as needed. - - This class holds references to all tables associated with a wind plant. The tables - are grouped by type: - - `scada` - - `meter` - - `tower` - - `status` - - `curtail` - - `asset` - - `reanalysis` - - Parameters - ---------- - metadata : PlantMetaData - A nested dictionary of the schema definition for each of the data types that - will be input. See `SCADAMetaData`, etc. for more information. <-- TODO - scada : pd.DataFrame - The SCADA data to be used for analyis. 
See `SCADAMetaData` for more details - on the required columns, and other conventions - TODO: FINISH THE DOCSTRING - - Raises: - ValueError: Raised if any column names are missing in the input data, as - specified in the appropriate schema - """ - - metadata: PlantMetaData = attr.ib( - default={}, converter=PlantMetaData.load, on_setattr=[attr.converters, attr.validators] - ) - analysis_type: list[str] | None = attr.ib( - default=None, - converter=convert_to_list, - validator=analysis_type_validator, - on_setattr=[attr.setters.convert, attr.setters.validate], - ) - scada: pd.DataFrame | None = attr.ib(default=None, converter=load_to_pandas) - meter: pd.DataFrame | None = attr.ib(default=None, converter=load_to_pandas) - tower: pd.DataFrame | None = attr.ib(default=None, converter=load_to_pandas) - status: pd.DataFrame | None = attr.ib(default=None, converter=load_to_pandas) - curtail: pd.DataFrame | None = attr.ib(default=None, converter=load_to_pandas) - asset: pd.DataFrame | None = attr.ib(default=None, converter=load_to_pandas) - reanalysis: dict[str, pd.DataFrame] | None = attr.ib(default=None) - preprocess: Callable | None = attr.ib(default=None) - - # Error catching in validation - _errors: dict[str, list[str]] = attr.ib( - default={"missing": {}, "dtype": {}, "frequency": []}, init=False - ) # No user initialization required - - def __attrs_post_init__(self): - self.reanalysis_validation() - # Check the errors againts the analysis requirements - error_message = compose_error_message(self._errors, analysis_types=self.analysis_type) - if error_message != "": - # raise ValueError("\n".join(itertools.chain(*self._errors.values()))) - raise ValueError(error_message) - self.update_column_names() - - if self.preprocess is not None: - self.preprocess( - self - ) # TODO: should be a user-defined method to run the data cleansing steps - - @scada.validator - @meter.validator - # @tower.validator - @status.validator - @curtail.validator - @asset.validator - def 
data_validator(self, instance: attr.Attribute, value: pd.DataFrame | None) -> None: - """Validator function for each of the data buckets in `PlantData`. - - Args: - instance (attr.Attribute): The `attr` attribute details - value (pd.DataFrame | None): The attributes user-provided value. - """ - if None in self.analysis_type: - return - name = instance.name - if value is None: - self._errors["missing"].update( - {name: list(getattr(self.metadata, instance.name).col_map.values())} - ) - self._errors["dtype"].update( - {name: list(getattr(self.metadata, instance.name).dtypes.keys())} - ) - - else: - self._errors["missing"].update(self._validate_column_names(category=name)) - self._errors["dtype"].update(self._validate_types(category=name)) - - def reanalysis_validation(self) -> None: - """Provides the reanalysis data initialization and validation routine. - - Control Flow: - - If `None` is provided, then run the `data_validator` method to collect - missing columns and bad data types - - If the dictionary values are a dictionary, then the reanalysis data will - be downloaded using the dictionary as kwargs passed to the PlanetOS API - in `openoa.toolkits.reanslysis_downloading`, with the product name and site - coordinates being provided automatically. NOTE: This also calculates the - derived variables such as wind direction upon downloading. - - If a non-dictionary input is provided for a reanalysis product type, then the - `load_to_pandas` method will be called on the input data. - - Raises: - ValueError: Raised if reanalysis input is not a dictionary. 
- """ - if None in self.analysis_type: - return - if self.reanalysis is None: - self.data_validator(PlantDataV3.reanalysis, self.reanalysis) - return - - if not isinstance(self.reanalysis, dict): - raise ValueError( - "Reanalysis data should be provided as a dictionary of product name (keys) and api kwargs or data" - ) - - reanalysis = {} - for name, value in self.reanalysis.items(): - if isinstance(value, dict): - value.update( - dict( - dataset=name, - lat=self.metadata.latitude, - lon=self.metadata.longitude, - calc_derived_vars=True, - ) - ) - reanalysis[name] = download_reanalysis_data_planetos(**value) - else: - reanalysis[name] = load_to_pandas(value) - - self.reanalysis = reanalysis - self._calculate_reanalysis_columns() - - self._errors["missing"].update(self._validate_column_names(category="reanalysis")) - self._errors["dtype"].update(self._validate_types(category="reanalysis")) - self._errors["frequency"].extend(self._validate_frequency(category="reanalysis")) - - @property - def analysis_values(self): - # if self.analysis_type == "x": - # return self.scada, self, self.meter, self.asset - values = dict( - scada=self.scada, - meter=self.meter, - tower=self.tower, - asset=self.asset, - status=self.status, - curtail=self.curtail, - reanalysis=self.reanalysis, - ) - return values - - def _validate_column_names(self, category: str = "all") -> dict[str, list[str]]: - column_map = self.metadata.column_map - - if category == "reanalysis": - missing_cols = { - f"{category}-{name}": column_validator(df, column_names=column_map[category][name]) - for name, df in self.analysis_values[category].items() - } - return missing_cols if isinstance(missing_cols, dict) else {} - - if category != "all": - df = self.analysis_values[category] - missing_cols = {category: column_validator(df, column_names=column_map[category])} - return missing_cols if isinstance(missing_cols, dict) else {} - - missing_cols = { - name: column_validator(df, column_names=column_map[name]) - for 
name, df in self.analysis_values.items() - if name != "reanalysis" - } - missing_cols.update( - { - f"reanalysis-{name}": column_validator( - df, column_names=column_map["reanalysis"][name] - ) - for name, df, in self.analysis_values["reanalysis"].items() - } - ) - return missing_cols if isinstance(missing_cols, dict) else {} - - def _validate_types(self, category: str = "all") -> dict[str, list[str]]: - - # Create a new mapping of the data's column names to the expected dtype - # TODO: Consider if this should be a encoded in the metadata/plantdata object elsewhere - column_name_map = self.metadata.column_map - column_type_map = self.metadata.type_map - column_map = {} - for name in column_name_map: - if name == "reanalysis": - column_map["reanalysis"] = {} - for name in column_name_map["reanalysis"]: - column_map["reanalysis"][name] = dict( - zip( - column_name_map["reanalysis"][name].values(), - column_type_map["reanalysis"][name].values(), - ) - ) - else: - column_map[name] = dict( - zip(column_name_map[name].values(), column_type_map[name].values()) - ) - - if category == "reanalysis": - error_cols = { - f"{category}-{name}": dtype_converter(df, column_types=column_map[category][name]) - for name, df in self.analysis_values[category].items() - } - return error_cols if isinstance(error_cols, dict) else {} - - if category != "all": - df = self.analysis_values[category] - error_cols = {category: dtype_converter(df, column_types=column_map[category])} - return error_cols if isinstance(error_cols, dict) else {} - - error_cols = { - name: dtype_converter(df, column_types=column_map[name]) - for name, df in self.analysis_values.items() - } - return error_cols if isinstance(error_cols, dict) else {} - - def _validate_frequency(self, category: str = "all") -> list[str]: - frequency_requirements = self.metadata.frequency_requirements(self.analysis_type) - actual_frequencies = { - name: df.index.freq for name, df in self.analysis_values if name != "reanalysis" - } - 
actual_frequencies["reanalysis"] = { - name: df.index.freq for name, df in self.analysis_values["reanalysis"] - } - # TODO: ACTUALLY MATCH AGAINST REAL, AND CHECK IF THAT MATTERS, AND IF SO - # CHECK AGAINST THE REQUIREMENTS - if category == "reanalysis": - # Check if this category requires a check - if category not in frequency_requirements: - return {} - invalid_freq = [ - f"{category}-{name}" - for name, df in self.analysis_values[category].items() - if frequency_validator( - df.index.freq, getattr(self.metadata, category)[name].frequency, True - ) - or frequency_validator(df.index.freq, frequency_requirements.get(category), False) - ] - return invalid_freq - - if category != "all": - freq = self.analysis_values[category].index.freq - if frequency_validator( - freq, getattr(self.metadata, category).frequency, True - ) or frequency_validator(freq, frequency_requirements.get(category), False): - return [category] - else: - return [] - - invalid_freq = [ - name - for name, df in self.analysis_values.items() - if frequency_validator(df.index.freq, getattr(self.metadata, name).frequency, True) - or frequency_validator(df.index.freq, frequency_requirements.get(name), False) - ] - invalid_freq.extend( - [ - f"reanalysis-{name}" - for name, df, in self.analysis_values["reanalysis"].items() - if frequency_validator( - df.index.freq, getattr(self.metadata, category)[name].frequency, True - ) - or frequency_validator(df.index.freq, frequency_requirements.get(category), False) - ] - ) - return invalid_freq - - def validate(self, metadata: Optional[dict | str | Path | PlantMetaData] = None) -> None: - """Secondary method to validate the plant data objects after loading or changing - data with option to provide an updated `metadata` object/file as well - - Args: - metadata (Optional[dict]): Updated metadata object, dictionary, or file to - create the updated metadata for data validation. - - Raises: - ValueError: Raised at the end if errors are caught in the validation steps. 
- """ - if metadata is not None: - self.metadata = metadata - - self._errors = { - "missing": self._validate_column_names(), - "dtype": self._validate_types(), - "frequency": self._validate_frequency(), - } - self.reanalysis_validation() - - # TODO: Check for extra columns? - # TODO: Define other checks? - - error_message = compose_error_message(self._errors, self.analysis_type) - if error_message: - raise ValueError(error_message) - - self.update_column_names() - - def _calculate_reanalysis_columns(self) -> None: - """Calculates extra variables such as wind_direction from the provided - reanalysis data if they don't already exist. - """ - if self.reanalysis is None: - return - reanalysis = {} - for name, df in self.reanalysis.items(): - col_map = self.metadata.reanalysis[name].col_map - u = col_map["windspeed_u"] - v = col_map["windspeed_v"] - has_u_v = (u in df) & (v in df) - - ws = col_map["windspeed"] - if ws not in df: - if has_u_v: - df[ws] = np.sqrt(df[u].values ** 2 + df[v].values ** 2) - - wd = col_map["wind_direction"] - if wd not in df: - if has_u_v: - df[wd] = met.compute_wind_direction(df[u], df[v]) - - dens = col_map["density"] - sp = col_map["surface_pressure"] - temp = col_map["temperature"] - if dens not in df: - if (sp in df) & (temp in df): - df[dens] = met.compute_air_density(df[temp], df[sp]) - - reanalysis[name] = df - self.reanalysis = reanalysis - - def update_column_names(self, to_original: bool = False) -> None: - meta = self.metadata - reverse = not to_original - if self.scada is not None: - self.scada = rename_columns(self.scada, meta.scada.col_map, reverse=reverse) - if self.meter is not None: - self.meter = rename_columns(self.meter, meta.meter.col_map, reverse=reverse) - if self.tower is not None: - self.tower = rename_columns(self.tower, meta.tower.col_map, reverse=reverse) - if self.status is not None: - self.status = rename_columns(self.status, meta.status.col_map, reverse=reverse) - if self.curtail is not None: - self.curtail = 
rename_columns(self.curtail, meta.curtail.col_map, reverse=reverse) - if self.asset is not None: - self.asset = rename_columns(self.asset, meta.asset.col_map) - if self.reanalysis is not None: - reanalysis = {} - for name, df in self.reanalysis.items(): - reanalysis[name] = rename_columns( - df, meta.reanalysis[name].col_map, reverse=reverse - ) - self.reanalysis = reanalysis - - # Not necessary, but could provide an additional way in - @classmethod - def from_entr( - cls: PlantDataV3, - thrift_server_host: str = "localhost", - thrift_server_port: int = 10000, - database: str = "entr_warehouse", - wind_plant: str = "", - aggregation: str = "", - date_range: list = None, - ): - """Load a PlantData object from data in an entr_warehouse. - - Args: - thrift_server_url(str): URL of the Apache Thrift server - database(str): Name of the Hive database - wind_plant(str): Name of the wind plant you'd like to load - aggregation: Not yet implemented - date_range: Not yet implemented - - Returns: - plant(PlantData): An OpenOA PlantData object. - """ - return from_entr( - thrift_server_host, thrift_server_port, database, wind_plant, aggregation, date_range - ) - - def turbine_ids(self) -> list[str]: - """Convenience method for getting the unique turbine IDs from the scada data. - - Returns: - list[str]: List of unique turbine identifiers. - """ - return self.scada[self.metadata.scada.id].unique() - - -def from_entr( - thrift_server_host: str = "localhost", - thrift_server_port: int = 10000, - database: str = "entr_warehouse", - wind_plant: str = "", - aggregation: str = "", - date_range: list = None, -): - """ - from_entr - - Load a PlantData object from data in an entr_warehouse. - - Args: - thrift_server_url(str): URL of the Apache Thrift server - database(str): Name of the Hive database - wind_plant(str): Name of the wind plant you'd like to load - aggregation: Not yet implemented - date_range: Not yet implemented - - Returns: - plant(PlantData): An OpenOA PlantData object. 
- """ - from pyhive import hive - - conn = hive.Connection(host=thrift_server_host, port=thrift_server_port) - - scada_query = """SELECT Wind_turbine_name as Wind_turbine_name, - Date_time as Date_time, - cast(P_avg as float) as P_avg, - cast(Power_W as float) as Power_W, - cast(Ws_avg as float) as Ws_avg, - Wa_avg as Wa_avg, - Va_avg as Va_avg, - Ya_avg as Ya_avg, - Ot_avg as Ot_avg, - Ba_avg as Ba_avg - - FROM entr_warehouse.la_haute_borne_scada_for_openoa - """ - - plant = PlantDataV3() - - plant.scada.df = pd.read_sql(scada_query, conn) - - conn.close() - - return plant - - -# PlantData V2 with Python Dataclass -# requirements: -# - Holds 7 dataframes with data about one wind plant -# - Optionally validates data with respect to a schema -# - Can support loading data from multiple sources and saving itself to disk -@dataclass -class PlantDataV2: - scada: pd.DataFrame - meter: pd.DataFrame - tower: pd.DataFrame - status: pd.DataFrame - curtail: pd.DataFrame - asset: pd.DataFrame - reanalysis: pd.DataFrame - - name: str - version: float = 2 - - def __init__(self): - self._dataframe_field_names = [ - "scada", - "meter", - "tower", - "status", - "curtail", - "asset", - "reanalysis", - ] - - def _get_dataframes(self) -> dict[str : pd.DataFrame]: - return {name: getattr(self, name) for name in self._dataframe_fields} - - def validate(self, schema, fail_if_contains_extra_data=False): - """Validate this plant data object against a schema. Returns True if valid, Rasies an exception if not valid. 
- - Example Usage: - ``` - # Plant is automatically validated when an analysis is run - openoa.AEP(plant).run() - - # Manually validate with a schema - schema = openoa.AEP.input_schema # schema is a python dict object - plant.validate(schema) - ``` - """ - errors = [] - - dataframes = self._get_dataframes() - for field in schema["fields"]: - field_df = dataframes[field["name"]] - - # Check the dataframe contains the right columns: - expected_tags = set([field.name for field in field["fields"]]) - present_tags = set(field_df.columns) - - # Missing tags - missing_tags = expected_tags - present_tags - if len(missing_tags) > 0: - errors.append(f"Table {field['name']} missing tags {missing_tags}") - - # Extra tags - if fail_if_contains_extra_data: - extra_tags = present_tags - expected_tags - if len(extra_tags > 0): - errors.append(f"Table {field['name']} contains extra tags {extra_tags}") - - # Special validator for scada - if field["name"] == "scada": - pass - - if len(errors > 0): - for error in errors: - print(error) - raise ValueError(f"Plant {self.name} failed validation") - else: - return True - - def __repr__(self): - print(f"PlantData V{self.version}") - print(f"\tPlant: {self.name}") - print("=======================================") - missing_tables = ["scada", "meter", "tower", "status", "curtail", "asset", "reanalysis"] - - if self.asset is not None and self.asset.shape[0] > 0: - missing_tables.remove("asset") - print("\tAsset Table:") - print(f"\t\tNumber of Assets: {self.asset.shape[0]}") - - if self.scada is not None and self.scada.shape[0] > 0: - missing_tables.remove("scada") - print("\tScada Table:") - print(f"\t\tNumber of Rows: {self.scada.shape[0]}") - print(f"\t\tNumber of Columns: {self.scada.shape[1]}") - print(f"\t\tTags: {self.scada.columns}") - - print(f"Missing or Empty Tables: {missing_tables}") - - def save(self, path): - pass - - @classmethod - def from_save(cls, path): - pass - - @classmethod - def from_entr( - cls, - thrift_server_host: 
str = "localhost", - thrift_server_port: int = 10000, - database: str = "entr_warehouse", - wind_plant: str = "", - aggregation: str = "", - date_range: list = None, - ): - """ - from_entr - - Load a PlantData object from data in an entr_warehouse. - - Args: - thrift_server_url(str): URL of the Apache Thrift server - database(str): Name of the Hive database - wind_plant(str): Name of the wind plant you'd like to load - aggregation: Not yet implemented - date_range: Not yet implemented - - Returns: - plant(PlantData): An OpenOA PlantData object. - """ - from pyhive import hive - - conn = hive.Connection(host=thrift_server_host, port=thrift_server_port) - - scada_query = """SELECT Wind_turbine_name as Wind_turbine_name, - Date_time as Date_time, - cast(P_avg as float) as P_avg, - cast(Power_W as float) as Power_W, - cast(Ws_avg as float) as Ws_avg, - Wa_avg as Wa_avg, - Va_avg as Va_avg, - Ya_avg as Ya_avg, - Ot_avg as Ot_avg, - Ba_avg as Ba_avg - - FROM entr_warehouse.la_haute_borne_scada_for_openoa - """ - - plant = cls() - - plant.scada.df = pd.read_sql(scada_query, conn) - - conn.close() - - return plant - - @classmethod - def from_pandas(cls, scada, meter, status, tower, asset, curtail, reanalysis): - """ - from_pandas - - Create a PlantData object from a collection of Pandas data frames. - - Args: - scada: - meter: - status: - tower: - asset: - curtail: - reanalysis: - - Returns: - plant(PlantData): An OpenOA PlantData object. 
- """ - plant = cls() - - plant.scada = scada - plant.meter = meter - plant.status = status - plant.tower = tower - plant.asset = asset - plant.curtail = curtail - plant.reanalysis = reanalysis - - plant.validate() - - -def from_plantdata_v1(plant_v1: PlantData): - plant_v2 = PlantDataV2() - plant_v2.scada = plant_v1.scada._df - plant_v2.asset = plant_v1.asset._df - plant_v2.meter = plant_v1.meter._df - plant_v2.tower = plant_v1.tower._df - plant_v2.status = plant_v1.status._df - plant_v2.curtail = plant_v1.curtail._df - plant_v2.reanalysis = plant_v1.reanalysis._df - - # copy any other data members to their new location - - # validate(plant_v2) - - return plant_v2 - - -# PlantData -class PlantData(object): - """Data object for operational wind plant data. - - This class holds references to all tables associated with a wind plant. The tables are grouped by type: - - PlantData.scada - - PlantData.meter - - PlantData.tower - - PlantData.status - - PlantData.curtail - - PlantData.asset - - PlantData.reanalysis - - Each table must have columns following the following convention: - - - - The PlantData object can serialize all of these structures and reload them - them from the cache as needed. - - The underlying datastructure is a TimeseriesTable, which is agnostic to the underlying - engine and can be implemented with Pandas, Spark, or Dask (for instance). - - Individual plants will extend this object with their own - prepare() and other methods. - """ - - def __init__(self, path, name, engine="pandas", toolkit=["pruf_analysis"], schema=None): - """ - Create a plant data object without loading any data. 
- - Args: - path(string): path where data should be read/written - name(string): uniqiue name for this plant in case there's multiple plant's data in the directory - engine(string): backend engine - pandas, spark or dask - toolkit(list): the _tool_classes attribute defines a list of toolkit modules that can be loaded - - Returns: - New object - """ - if not schema: - dir = os.path.dirname(os.path.abspath(__file__)) - schema = dir + "/plant_schema.json" - with open(schema) as schema_file: - self._schema = json.load(schema_file) - - self._scada = timeseries_table.TimeseriesTable.factory(engine) - self._meter = timeseries_table.TimeseriesTable.factory(engine) - self._tower = timeseries_table.TimeseriesTable.factory(engine) - self._status = timeseries_table.TimeseriesTable.factory(engine) - self._curtail = timeseries_table.TimeseriesTable.factory(engine) - self._asset = AssetData(engine) - self._reanalysis = ReanalysisData(engine) - self._name = name - self._path = path - self._engine = engine - - self._version = 1 - - self._status_labels = ["full", "unavailable"] - - self._tables = [ - "_scada", - "_meter", - "_status", - "_tower", - "_asset", - "_curtail", - "_reanalysis", - ] - - def amend_std(self, dfname, new_fields): - """ - Amend a dataframe standard with new or changed fields. Consider running ensure_columns afterward to - automatically create the new required columns if they don't exist. - - Args: - dfname (string): one of scada, status, curtail, etc. 
- new_fields (dict): set of new fields and types in the same format as _scada_std to be added/changed in - the std - - Returns: - New data field standard - """ - - k = "_%s_std" % (dfname,) - setattr( - self, k, dict(itertools.chain(iter(getattr(self, k).items()), iter(new_fields.items()))) - ) - - def get_time_range(self): - """Get time range as tuple - - Returns: - (tuple): - start_time(datetime): start time - stop_time(datetime): stop time - """ - return (self._start_time, self._stop_time) - - def set_time_range(self, start_time, stop_time): - """Set time range given two unparsed timestamp strings - - Args: - start_time(string): start time - stop_time(string): stop time - - Returns: - (None) - """ - self._start_time = parse(start_time) - self._stop_time = parse(stop_time) - - def save(self, path=None): - """Save out the project and all JSON serializeable attributes to a file path. - - Args: - path(string): Location of new directory into which plant will be saved. The directory should not - already exist. Defaults to self._path - - Returns: - (None) - """ - if path is None: - raise RuntimeError("Path not specified.") - - os.mkdir(path) - - meta_dict = {} - for ca, ci in self.__dict__.items(): - if ca in self._tables: - ci.save(path, ca) - elif ca in ["_start_time", "_stop_time"]: - meta_dict[ca] = str(ci) - else: - meta_dict[ca] = ci - - with io.open(os.path.join(path, "metadata.json"), "w", encoding="utf-8") as outfile: - outfile.write(str(json.dumps(meta_dict, ensure_ascii=False))) - - def load(self, path=None): - """Load this project and all associated data from a file path - - Args: - path(string): Location of plant data directory. 
Defaults to self._path - - Returns: - (None) - """ - if not path: - path = self._path - - for df in self._tables: - getattr(self, df).load(path, df) - - meta_path = os.path.join(path, "metadata.json") - if os.path.exists(meta_path): - with io.open(os.path.join(path, "metadata.json"), "r") as infile: - meta_dict = json.load(infile) - for ca, ci in meta_dict.items(): - if ca in ["_start_time", "_stop_time"]: - ci = parse(ci) - setattr(self, ca, ci) - - def ensure_columns(self): - """@deprecated Ensure all dataframes contain necessary columns and format as needed""" - raise NotImplementedError("ensure_columns has been deprecated. Use plant.validate instead.") - - def validate(self, schema=None): - - """Validate this plant data object against its schema. Returns True if valid, Rasies an exception if not valid.""" - - if not schema: - schema = self._schema - - for field in schema["fields"]: - if field["type"] == "timeseries": - attr = "_{}".format(field["name"]) - if not getattr(self, attr).is_empty(): - getattr(self, attr).validate(field) - - return True - - def merge_asset_metadata(self): - """Merge metadata from the asset table into the scada and tower tables""" - if not (self._scada.is_empty()) and (len(self._asset.turbine_ids()) > 0): - self._scada.pandas_merge( - self._asset.df, - [ - "latitude", - "longitude", - "rated_power_kw", - "id", - "nearest_turbine_id", - "nearest_tower_id", - ], - "left", - on="id", - ) - if not (self._tower.is_empty()) and (len(self._asset.tower_ids()) > 0): - self._tower.pandas_merge( - self._asset.df, - [ - "latitude", - "longitude", - "rated_power_kw", - "id", - "nearest_turbine_id", - "nearest_tower_id", - ], - "left", - on="id", - ) - - def prepare(self): - """Prepare this object for use by loading data and doing essential preprocessing.""" - self.ensure_columns() - if not ((self._scada.is_empty()) or (self._tower.is_empty())): - self._asset.prepare(self._scada.unique("id"), self._tower.unique("id")) - self.merge_asset_metadata() - 
- @property - def scada(self): - return self._scada - - @property - def meter(self): - return self._meter - - @property - def tower(self): - return self._tower - - @property - def reanalysis(self): - return self._reanalysis - - @property - def status(self): - return self._status - - @property - def asset(self): - return self._asset - - @property - def curtail(self): - return self._curtail - - @classmethod - def from_entr( - cls, - thrift_server_host="localhost", - thrift_server_port=10000, - database="entr_warehouse", - wind_plant="", - aggregation="", - date_range=None, - ): - """ - from_entr - - Load a PlantData object from data in an entr_warehouse. - - Args: - thrift_server_host(str): URL of the Apache Thrift server - thrift_server_port(int): Port of the Apache Thrift server - database(str): Name of the Hive database - wind_plant(str): Name of the wind plant you'd like to load - aggregation: Not yet implemented - date_range: Not yet implemented - - Returns: - plant(PlantData): An OpenOA PlantData object. - """ - from pyhive import hive - - plant = cls( - database, wind_plant - ) # Passing in database as the path and wind_plant as the name for now. - - conn = hive.Connection(host=thrift_server_host, port=thrift_server_port) - - scada_query = f"""SELECT Wind_turbine_name as Wind_turbine_name, - Date_time as Date_time, - cast(P_avg as float) as P_avg, - cast(Power_W as float) as Power_W, - cast(Ws_avg as float) as Ws_avg, - Wa_avg as Wa_avg, - Va_avg as Va_avg, - Ya_avg as Ya_avg, - Ot_avg as Ot_avg, - Ba_avg as Ba_avg - - FROM {database}.{wind_plant} - """ - - plant.scada.df = pd.read_sql(scada_query, conn) - - conn.close() - - return plant - - @classmethod - def from_pandas(cls, scada, meter, status, tower, asset, curtail, reanalysis): - """ - from_pandas - - Create a PlantData object from a collection of Pandas data frames. 
- - Args: - scada: - meter: - status: - tower: - asset: - curtail: - reanalysis: - - Returns: - plant(PlantData): An OpenOA PlantData object. - """ - plant = cls() - - plant.scada.df = scada - plant.meter.df = meter - plant.status.df = status - plant.tower.df = tower - plant.asset.df = asset - plant.curtail.df = curtail - plant.reanalysis.df = reanalysis - - plant.validate() diff --git a/openoa/types/timeseries_table.py b/openoa/types/timeseries_table.py deleted file mode 100644 index bac2c887..00000000 --- a/openoa/types/timeseries_table.py +++ /dev/null @@ -1,413 +0,0 @@ -# timeseries_table.py -""" - -A basic columnar timeseries datastructure whose -underlying dataframe backend can be Pandas, Dask, -or Spark. -# -The assumption here is that there is a single -column that contains the timestamp and the other columns -are various metrics/data. The time stamps may be on -a regular interval, or they may be irregularly spaced. -The dataframe may be somewhat sparse (e.g., not all metrics may -be defined at all time points). 
- -""" - -import datetime -import importlib - -import pandas as pd -from openoa import logging, logged_method_call - - -logger = logging.getLogger(__name__) - - -# The abstract class sets the interface for the timeseries table -class AbstractTimeseriesTable: - df = None - _time_field = "time" - _metric_fields = [] - - def __init__(self): - pass - - # save data from our dataframe into some format that load can read - def save(self, path, name, format): - raise NotImplementedError("Called method on abstract class") - - # load tabular data in some format into our dataframe - def load(self, path, name, format): - raise NotImplementedError("Called method on abstract class") - - # ensure the given columns exist, create them if they do not, drop the rest - def ensure_columns(self, std): - raise NotImplementedError("Called method on abstract class") - - # given a mapping of column names, rename them - def rename_columns(self, mapping): - raise NotImplementedError("Called method on abstract class") - - # given a mapping of column names, rename them - def copy_column(self, to, fro): - raise NotImplementedError("Called method on abstract class") - - # return true if the data frame hasn't been initialized or has no data - def is_empty(self): - raise NotImplementedError("Called method on abstract class") - - # create columns like year/month/day from the time column - def explode_time(self, vars): - raise NotImplementedError("Called method on abstract class") - - # use a strptime string to process a date/time like string into a datetime object - def normalize_time_to_datetime(self, format, col=None): - raise NotImplementedError("Called method on abstract class") - - # convert unix time in seconds to datetime - def epoch_time_to_datetime(self, col=None): - raise NotImplementedError("Called method on abstract class") - - # return the first 5 rows of our dataframe as a pandas dataframe - def head(self): - raise NotImplementedError("Called method on abstract class") - - # apply a 
function to each element in a given column, modifying the column in place - def map_column(self, col, func): - raise NotImplementedError("Called method on abstract class") - - # merge our dataframe with a pandas dataframe - def pandas_merge(self, right, right_cols, how, on): - raise NotImplementedError("Called method on abstract class") - - # Return list of unique values in a given column - def unique(self, col): - raise NotImplementedError("Called method on abstract class") - - # Combine the given timeseries table with this one, row-wise - def rbind(self, tt): - raise NotImplementedError("Called method on abstract class") - - def trim_timeseries(self, start, stop): - raise NotImplementedError("Called method on abstract class") - - @property - def time_field(self): - return self._time_field - - @property - def metric_fields(self): - return self._metric_fields - - @property - def schema(self): - raise NotImplementedError("Called method on abstract class") - - -# These inherited classes implement it -class PandasTimeseriesTable(AbstractTimeseriesTable): - """Pandas based timeseries table""" - - def __init__(self, *args, **kwargs): - self._pd = __import__("pandas", globals(), locals(), [], 0) - - @logged_method_call - def save(self, path, name, format="csv"): - """Write data to file""" - logger.info("save name:{}".format(name)) - if format != "csv": - raise NotImplementedError("Cannot save to format %s yet" % (format,)) - self.df.to_csv("%s/%s.csv" % (path, name)) - - @logged_method_call - def load(self, path, name, format="csv", nrows=None): - """Read data from a file""" - logger.info("Loading name:{}".format(name)) - if format != "csv": - raise NotImplementedError("Cannot save to format %s yet" % (format,)) - self.df = self._pd.read_csv("%s/%s.csv" % (path, name), nrows=nrows) - - def rename_columns(self, mapping): - """Rename columns based on mapping - - Args: - mapping (dict): new and old column names based on {"new":"old"} convention - """ - for k in 
list(mapping.keys()): - if k != mapping[k]: - self.df[k] = self.df[mapping[k]] - self.df[mapping[k]] = None - - def copy_column(self, to, fro): - """Copy column data - - Args: - fro (str): column name to copy data from - to (str): column name to copy to - """ - logger.debug("copying {} to {}".format(fro, to)) - self.df[to] = self.df[fro] - - def ensure_columns(self, std): - """@deprecated Set column types to specified type - - Args: - std (dict): - """ - for col in list(std.keys()): - logging.debug("checking {} is astype {} ".format(col, std[col])) - if col not in self.df.columns: - if std[col] == "float64": - self.df[col] = float("nan") - else: - self.df[col] = None - self.df[col] = self.df[col].astype(std[col]) - self.df = self.df[list(std.keys())] - - @property - def schema(self): - """Return schema of this dataframe as a dictionary. - - Returns: - (dict): {column_name(str): column_type(str)} - """ - return {col: str(t) for col, t in zip(self.df, self.df.dtypes)} - - def validate(self, schema): - """Validate this timeseriestable object against its schema. - - Returns: - (bool): True if valid, Rasies an exception if not valid.""" - if schema["type"] != "timeseries": - raise Exception( - "Incompatible schema type {} applied to TimeseriesTable".format(schema["type"]) - ) - - df_schema = self.schema - for field in schema["fields"]: - if field["name"] in df_schema.keys(): - assert ( - df_schema[field["name"]] == field["type"] - ), "Incompatible type for field {}. 
Expected {} but got {}".format( - field["name"], field["type"], df_schema[field["name"]] - ) - del df_schema[field["name"]] - - assert len(df_schema) == 0, "Extra columns are present in TimeseriesTable: \n {}".format( - df_schema - ) - - return True - - def is_empty(self): - """Test if data is None - - Returs: - (bool): True if None, False if not None - """ - return self.df is None - - def explode_time(self, vars=["year", "month", "day"]): - """Create new columns for components of time - - Args: - vars (list): list of time components - """ - - for v in vars: - self.df[v] = self.df[self._time_field].apply(lambda x: getattr(x, v), 1) - - def normalize_time_to_datetime(self, format="%Y-%m-%d %H:%M:%S", col=None): - """Apply datetime format to timestamp column""" - if col is None: - col = self._time_field - logging.debug("setting {} to datetime ".format(col)) - self.df[col] = self.df[col].apply(lambda x: datetime.datetime.strptime(x, format), 1) - - def to_datetime(self, format="%Y-%m-%d %H:%M:%S", col=None): - """Run pd.to_datetime on timestamp column""" - if col is None: - col = self._time_field - logger.debug("Running pd.to_datetime on {} ".format(col)) - self.df[col] = pd.to_datetime(self.df[col]) - - def epoch_time_to_datetime(self, col=None): - """Format col as datetime""" - if col is None: - col = self._time_field - logger.debug("Running to_datetime on {} ".format(col)) - self.df[col] = self.df[col].apply(lambda x: pd.to_datetime(x, unit="s"), 1) - - def head(self): - """Head data""" - return self.df.head() - - def map_column(self, col, func): - """Apply a function to col""" - logger.debug("Mapping col:{}".format(col)) - if col not in self.df.columns: - self.df[col] = "unknown" - - self.df[col] = self.df[col].apply(func, 1) - - def pandas_merge(self, right, right_cols, how="left", on="id"): - """Run merge with data""" - logger.debug("merging right:{} right_cols:{} ".format(right, right_cols)) - self.df = self.df.merge(right.loc[:, right_cols], how=how, on=on) - 
- def unique(self, col): - """Get unique values of a column""" - logger.debug("unique col:{}".format(col)) - return self.df[col].unique() - - def rbind(self, tt): - """Append data""" - logger.debug("appending tt.df") - self.df = self.df.append(tt.df) - - def to_pandas(self): - """Return data""" - return self.df - - def trim_timeseries(self, start, stop): - """Get time range - - Args: - start (datetime): start of time-sereies trim - stop (datetime): stop of time-sereies trim - """ - logger.debug("trim_timeseries start:{} stop:{} ".format(start, stop)) - self.df = self.df.loc[ - (self.df[self._time_field] >= start) & (self.df[self._time_field] <= stop), : - ] - - def max(self): - """Find maximum timestamp value""" - return self.df[self._time_field].max() - - def min(self): - """Find minimum timestamp value""" - return self.df[self._time_field].min() - - -class SparkTimeseriesTable(AbstractTimeseriesTable): - def __init__(self, *args, **kwargs): - self._f = importlib.import_module("pyspark.sql.functions") - self._t = importlib.import_module("pyspark.sql.types") - self._sql = importlib.import_module("pyspark.sql") - self._pyspark = importlib.import_module("pyspark") - self._sc = self._pyspark.SparkContext.getOrCreate() - self._sqlContext = self._sql.SQLContext.getOrCreate(self._sc) - self.type_map = { - "datetime64[ns]": self._t.TimestampType(), - "string": self._t.StringType(), - "object": self._t.StringType(), - "float64": self._t.DoubleType(), - } - - def save(self, path, name, format="parquet"): - if format != "parquet": - raise NotImplementedError("Cannot save to format %s yet" % (format,)) - self.df.write.mode("overwrite").parquet("%s/%s.parquet" % (path, name)) - - def load(self, path, name, format="parquet", nrows=None): - if format == "parquet": - self.df = self._sqlContext.read.parquet("%s/%s.parquet" % (path, name)) - elif format == "csv": - self.df = ( - self._sqlContext.read.format("com.databricks.spark.csv") - .options(header="true", inferschema="true") - 
.load("%s/%s.csv" % (path, name)) - ) - if nrows is not None: - self.df = self.df.limit(nrows) - - def rename_columns(self, mapping): - for k in list(mapping.keys()): - if k != mapping[k]: - self.df = self.df.withColumnRenamed(mapping[k], k) - - def copy_column(self, to, fro): - self.df = self.df.withColumn(to, self.df[fro]) - - def ensure_columns(self, std): - for col in list(std.keys()): - if col not in self.df.columns: - self.df = self.df.withColumn(col, self._f.lit(None).cast(self._t.StringType())) - else: - cast_to = self.type_map[std[col]] - self.df = self.df.withColumn(col, self.df[col].cast(cast_to)) - self.df = self.df.select(list(std.keys())) - - def is_empty(self): - return self.df is None - - def explode_time(self, vars=["year", "month", "day", "hour"]): - if "year" in vars: - self.df = self.df.withColumn("year", self._f.year(self._time_field)) - if "month" in vars: - self.df = self.df.withColumn("month", self._f.month(self._time_field)) - if "day" in vars: - self.df = self.df.withColumn("day", self._f.dayofmonth(self._time_field)) - if "hour" in vars: - self.df = self.df.withColumn("hour", self._f.hour(self._time_field)) - - def normalize_time_to_datetime(self, format, col=None): - if col is None: - col = self._time_field - raise NotImplementedError("TODO") - - def epoch_time_to_datetime(self, col=None): - if col is None: - col = self._time_field - self.df = self.df.withColumn(col, self._f.from_unixtime(col)) - - def head(self): - return self.df.limit(5).toPandas() - - def map_column(self, col, func): - as_udf = self._f.udf(func, self._t.StringType()) - self.df.withColumn(col, as_udf(col)) - - def pandas_merge(self, right, right_cols, how, on): - right = right.loc[:, right_cols] - schema = [ - self._t.StructField(x, self.type_map[right[x].dtype.name], True) for x in right_cols - ] - schema = self._t.StructType(schema) - right = self._sqlContext.createDataFrame(right, schema) - self.df = self.df.join(right, on, how) - - def unique(self, col): - if 
self.is_empty(): - return [] - else: - self.df.select(col).distinct().rdd.map(lambda x: x[0]).collect() - - def rbind(self, tt): - raise NotImplementedError("TODO") - - def trim_timeseries(self, start, stop): - raise NotImplementedError("TODO") - - -class DaskTimeseriesTable(AbstractTimeseriesTable): - def __init__(self, *args, **kwargs): - raise NotImplementedError("DASK implementation is TBD") - - -# The timeseries table class is a factory -class TimeseriesTable: - _classes = { - "spark": SparkTimeseriesTable, - "pandas": PandasTimeseriesTable, - "dask": DaskTimeseriesTable, - } - - @staticmethod - def factory(engine="pandas", *args, **kwargs): - if engine not in list(TimeseriesTable._classes.keys()): - raise NotImplementedError("Engine %s Not Implemented" % (engine,)) - else: - return TimeseriesTable._classes[engine](*args, **kwargs) diff --git a/readme.md b/readme.md index ca4cfc02..b0a86a44 100644 --- a/readme.md +++ b/readme.md @@ -1,6 +1,6 @@ OpenOA -[![Binder Badge](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/NREL/OpenOA/main?filepath=examples) [![Gitter Badge](https://badges.gitter.im/NREL_OpenOA/community.svg)](https://gitter.im/NREL_OpenOA/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge) [![Journal of Open Source Software Badge](https://joss.theoj.org/papers/d635ef3c3784d49f6e81e07a0b35ff6b/status.svg)](https://joss.theoj.org/papers/d635ef3c3784d49f6e81e07a0b35ff6b) +[![Binder Badge](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/NREL/OpenOA/develop_v3?filepath=examples) [![Gitter Badge](https://badges.gitter.im/NREL_OpenOA/community.svg)](https://gitter.im/NREL_OpenOA/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge) [![Journal of Open Source Software Badge](https://joss.theoj.org/papers/d635ef3c3784d49f6e81e07a0b35ff6b/status.svg)](https://joss.theoj.org/papers/d635ef3c3784d49f6e81e07a0b35ff6b) [![Documentation 
Badge](https://readthedocs.org/projects/openoa/badge/?version=latest)](https://openoa.readthedocs.io) ![Tests Badge](https://github.com/NREL/OpenOA/workflows/Tests/badge.svg?branch=develop) [![Code Coverage Badge](https://codecov.io/gh/NREL/OpenOA/branch/develop/graph/badge.svg)](https://codecov.io/gh/NREL/OpenOA) @@ -22,7 +22,7 @@ The library is written around Pandas Data Frames, utilizing a flexible backend so that data loading, processing, and analysis could be performed using other libraries, such as Dask and Spark, in the future. -If you would like to try out the code before installation or simply explore the possibilities, please see our examples on [Binder](https://mybinder.org/v2/gh/NREL/OpenOA/main?filepath=examples). +If you would like to try out the code before installation or simply explore the possibilities, please see our examples on [Binder](https://mybinder.org/v2/gh/NREL/OpenOA/develop_v3?filepath=examples). If you use this software in your work, please cite our JOSS article with the following BibTex: diff --git a/setup.py b/setup.py index 4c5d2265..8c801980 100644 --- a/setup.py +++ b/setup.py @@ -91,9 +91,8 @@ def read_file(filename): url="https://github.com/NREL/OpenOA", packages=find_packages(exclude=["test"]), include_package_data=True, - data_files=[("openoa/types", ["openoa/types/plant_schema.json"])], install_requires=REQUIRED, extras_require=EXTRAS, tests_require=TESTS, - python_requires=">=3.6, <=3.10", + python_requires=">=3.8, <=3.10", ) diff --git a/sphinx/examples/index.rst b/sphinx/examples/index.rst index 8115a55a..0848e69f 100644 --- a/sphinx/examples/index.rst +++ b/sphinx/examples/index.rst @@ -73,4 +73,4 @@ Table of Contents examplesout -.. _Binder: https://mybinder.org/v2/gh/NREL/OpenOA/main?filepath=examples +.. 
_Binder: https://mybinder.org/v2/gh/NREL/OpenOA/develop_v3?filepath=examples diff --git a/sphinx/getting_started/index.rst b/sphinx/getting_started/index.rst index 397c4d7c..a378f825 100644 --- a/sphinx/getting_started/index.rst +++ b/sphinx/getting_started/index.rst @@ -6,10 +6,20 @@ Getting Started Before installing and diving in, users can interact with our examples and test out the analysis library on our `Binder page`_. + +Installation, Using, and Contributing +************************************* .. toctree:: :maxdepth: 2 install contributing -.. _Binder page: https://mybinder.org/v2/gh/NREL/OpenOA/main?filepath=examples +What's New? +*********** + +.. include:: ../../CHANGELOG.md + :parser: myst_parser.sphinx_ + + +.. _Binder page: https://mybinder.org/v2/gh/NREL/OpenOA/develop_v3?filepath=examples diff --git a/sphinx/index.rst b/sphinx/index.rst index ae4fb0ee..bbfae09b 100644 --- a/sphinx/index.rst +++ b/sphinx/index.rst @@ -29,7 +29,7 @@ and reanalysis products such as Merra2. Analysis routines are provided in analysis classes, which each use the PlantData objects to ingest data. To interact with how each of these components of OpenOA are used, please visit our examples notebooks on -`Binder `_, or view them statically on the +`Binder `_, or view them statically on the `examples page `_. If you use this software in your work, please cite our JOSS article with the following BibTex:: @@ -60,7 +60,7 @@ Table of Contents .. |Binder Badge| image:: https://mybinder.org/badge_logo.svg - :target: https://mybinder.org/v2/gh/NREL/OpenOA/main?filepath=examples + :target: https://mybinder.org/v2/gh/NREL/OpenOA/develop_v3?filepath=examples .. |Gitter Badge| image:: https://badges.gitter.im/NREL_OpenOA/community.svg :target: https://gitter.im/NREL_OpenOA/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge .. |Journal of Open Source Software Badge| image:: https://joss.theoj.org/papers/d635ef3c3784d49f6e81e07a0b35ff6b/status.svg