diff --git a/docs/conf.py b/docs/conf.py
index 324436899..058a39b96 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -61,7 +61,9 @@
 # label references for depth of headers: label name in anchor slug structure
 myst_heading_anchors = 4
 # execute jupter notebooks output before building webpage
-jupyter_execute_notebooks = "off"
+nb_execution_mode = "off"
+nb_execution_excludepatterns = ["*/_build/*"]
+
 # Extentions in myst
 myst_enable_extensions = [
     "dollarmath",
diff --git a/docs/user_manual/dataset-terminology.md b/docs/user_manual/dataset-terminology.md
index e50afb4d3..850be114f 100644
--- a/docs/user_manual/dataset-terminology.md
+++ b/docs/user_manual/dataset-terminology.md
@@ -10,23 +10,105 @@ Some terms regarding the data structures are explained here, including the defin
 
 ## Data structures
 
-- **Dataset:** Either a single or a batch dataset.
-  - **SingleDataset:** A data type storing input data (i.e. all elements of all components) for a single scenario.
-  - **BatchDataset:** A data type storing update and or output data for one or more scenarios. A batch dataset can contain sparse or dense data, depending on the component.
-- **DataArray** A data array can be a single or a batch array. It is a numpy structured array.
-  - **SingleArray** A dictionary where the keys are the component types and the values are one-dimensional structured numpy arrays.
-  - **BatchArray:** An array of dictionaries where the keys are the component types and the values are two-dimensional structured numpy arrays.
-    - **DenseBatchArray:** A two-dimensional structured numpy array containing a list of components of the same type for each scenario.
-    - **SparseBatchArray:** A dictionary with a one-dimensional numpy int64 array and a one-dimensional structured numpy arrays.
-
-### Type of Dataset
-
-The types of `Dataset` include the following: `input`, `update`, `sym_output`, `asym_output`, and `sc_output`:
-Exemplery datasets attributes are given in a dataset containing a `line` component.
-
-- **input:** Contains attributes relevant to configuration of grid.
+```{mermaid}
+graph TD
+    subgraph Other numpy arrays
+        IndexPointer
+        SingleColumn
+        BatchColumn
+    end
+
+    subgraph Datasets
+        Dataset --> SingleDataset
+        Dataset --> BatchDataset
+    end
+
+
+    click Dataset href "../api_reference/python-api-reference.html#power_grid_model.data_types.Dataset"
+    click SingleDataset href "../api_reference/python-api-reference.html#power_grid_model.data_types.SingleDataset"
+    click BatchDataset href "../api_reference/python-api-reference.html#power_grid_model.data_types.BatchDataset"
+
+    click IndexPointer href "../api_reference/python-api-reference.html#power_grid_model.data_types.IndexPointer"
+    click SingleColumn href "../api_reference/python-api-reference.html#power_grid_model.data_types.SingleColumn"
+    click BatchColumn href "../api_reference/python-api-reference.html#power_grid_model.data_types.BatchColumn"
+```
+
+```{mermaid}
+graph TD
+    subgraph Dataset values
+        ComponentData --> DataArray
+        ComponentData --> ColumnarData
+
+        DataArray --> SingleArray
+        DataArray --> BatchArray
+
+        BatchArray --> DenseBatchArray
+        BatchArray --> SparseBatchArray
+
+        ColumnarData --> SingleColumnarData
+        ColumnarData --> BatchColumnarData
+
+        BatchColumnarData --> DenseBatchColumnarData
+        BatchColumnarData --> SparseBatchColumnarData
+    end
+
+    click ComponentData href "../api_reference/python-api-reference.html#power_grid_model.data_types.ComponentData"
+    click DataArray href "../api_reference/python-api-reference.html#power_grid_model.data_types.DataArray"
+    click ColumnarData href "../api_reference/python-api-reference.html#power_grid_model.data_types.ColumnarData"
+    click SingleArray href "../api_reference/python-api-reference.html#power_grid_model.data_types.SingleArray"
+    click BatchArray href "../api_reference/python-api-reference.html#power_grid_model.data_types.BatchArray"
+    click DenseBatchArray href "../api_reference/python-api-reference.html#power_grid_model.data_types.DenseBatchArray"
+    click SparseBatchArray href "../api_reference/python-api-reference.html#power_grid_model.data_types.SparseBatchArray"
+    click SingleColumnarData href "../api_reference/python-api-reference.html#power_grid_model.data_types.SingleColumnarData"
+    click BatchColumnarData href "../api_reference/python-api-reference.html#power_grid_model.data_types.BatchColumnarData"
+    click DenseBatchColumnarData href "../api_reference/python-api-reference.html#power_grid_model.data_types.DenseBatchColumnarData"
+    click SparseBatchColumnarData href "../api_reference/python-api-reference.html#power_grid_model.data_types.SparseBatchColumnarData"
+
+```
+
+- **{py:class}`Dataset `:** Either a single or a batch dataset. It is a dictionary with the component types (e.g. `line`, `node`) as keys and **ComponentData** as values.
+  - **{py:class}`SingleDataset `:** A data type storing input data (i.e. all elements of all components) for a single scenario.
+  - **{py:class}`BatchDataset `:** A data type storing update and/or output data for one or more scenarios. A batch dataset can contain sparse or dense data, depending on the component.
+
+- **{py:class}`ComponentData `:** The data corresponding to the component.
+  - **{py:class}`DataArray `:** A data array can be a single or a batch array. It is a numpy structured array.
+    - **{py:class}`SingleArray `:** A 1D numpy structured array corresponding to a single dataset.
+    - **{py:class}`BatchArray `:** Multiple batches of data can be represented in sparse or dense forms.
+      - **{py:class}`DenseBatchArray `:** A 2D structured numpy array containing a list of components of the same type for each scenario.
+      - **{py:class}`SparseBatchArray `:** A typed dictionary with a 1D numpy array of `IndexPointer` type under the `indptr` key and a `SingleArray` under the `data` key, which holds all components flattened over all batches.
+  - **{py:class}`ColumnarData `:** A dictionary of attributes as keys and individual numpy arrays as values.
+    - **{py:class}`SingleColumnarData `:** A dictionary of attributes as keys and `SingleColumn` as values in a single dataset.
+    - **{py:class}`BatchColumnarData `:** Multiple batches of data can be represented in sparse or dense forms.
+      - **{py:class}`DenseBatchColumnarData `:** A dictionary of attributes as keys and 2D/3D numpy arrays of `BatchColumn` type as values in a batch dataset.
+      - **{py:class}`SparseBatchColumnarData `:** A typed dictionary with a 1D numpy array of `IndexPointer` type under the `indptr` key and `SingleColumn` values under the `data` key, which hold all components flattened over all batches.
+
+- **{py:class}`IndexPointer `:** A 1D numpy array of int64 type used to specify sparse batches. It indicates the range of components within each scenario. For example, an index pointer of [0, 1, 3, 3] indicates 3 scenarios: the element with index 0 in the 1st scenario, the elements with indices [1, 2] in the 2nd scenario, and no elements in the 3rd scenario.
+- **{py:class}`SingleColumn `:** A 1D/2D numpy array of values corresponding to a specific attribute.
+- **{py:class}`BatchColumn `:** A 2D/3D numpy array of values corresponding to a specific attribute.
+
+### Dimensions of numpy arrays
+
+The dimensions of numpy arrays and the interpretation of each dimension are as follows.
+
+| **Data Type**       | **1D**                           | **2D**                                            | **3D**                                                               |
+|---------------------|----------------------------------|---------------------------------------------------|----------------------------------------------------------------------|
+| **SingleArray**     | Corresponds to a single dataset. | ❌                                                | ❌                                                                   |
+| **DenseBatchArray** | ❌                               | Batch number $\times$ Component within that batch | ❌                                                                   |
+| **SingleColumn**    | Component within that batch.     | Component within that batch $\times$ Phases ✨    | ❌                                                                   |
+| **BatchColumn**     | ❌                               | Batch number $\times$ Component within that batch | Batch number $\times$ Component within that batch $\times$ Phases ✨ |
+
+```{note}
+✨ The "Phases" dimension is optional and is available only when the attributes are asymmetric.
+```
+
+### Type of Dataset
+
+The types of `Dataset` include the following: `input`, `update`, `sym_output`, `asym_output`, and `sc_output`. They are included under the enum {py:class}`DatasetType `.
+Exemplary dataset attributes are given for a dataset containing a `line` component.
+
+- **input:** Contains attributes relevant to configuration of grid.
   - Example: `id`, `from_node`, `from_status`
-- **update:** Contains attributes relevant to multiple scenarios.
+- **update:** Contains attributes relevant to multiple scenarios.
   - Example: `from_status`,`to_status`
 - **sym_output:** Contains attributes relevant to symmetrical steady state output of power flow or state estimation calculation.
   - Example: `p_from`, `p_to`
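
Reviewer note: the `IndexPointer` description added in this patch may be easier to follow with a small worked example. The sketch below uses plain numpy only. The dtype `line_update_dtype` and the helper `split_scenarios` are hypothetical names introduced for illustration and are not part of the library's API. It assumes the usual compressed-row convention, in which scenario `i` owns the slice `data[indptr[i]:indptr[i + 1]]`.

```python
import numpy as np

# Hypothetical, simplified dtype for illustration; the library defines its own structured dtypes.
line_update_dtype = np.dtype([("id", "i4"), ("from_status", "i1")])

# All updated components of all scenarios, flattened into one 1D structured array.
data = np.array([(4, 0), (5, 1), (6, 0)], dtype=line_update_dtype)

# indptr has n_scenarios + 1 entries; scenario i owns data[indptr[i]:indptr[i + 1]].
indptr = np.array([0, 1, 3, 3], dtype=np.int64)

# Shape of a sparse batch as described in the text: an "indptr" key and a "data" key.
sparse_batch = {"indptr": indptr, "data": data}


def split_scenarios(batch):
    """Return one 1D structured array per scenario from a sparse batch (illustrative helper)."""
    ptr = batch["indptr"]
    return [batch["data"][ptr[i]:ptr[i + 1]] for i in range(len(ptr) - 1)]


scenarios = split_scenarios(sparse_batch)
ids_per_scenario = [s["id"].tolist() for s in scenarios]
print(ids_per_scenario)  # [[4], [5, 6], []]
```

With `indptr = [0, 1, 3, 3]` there are three scenarios: the first holds one component, the second holds two, and the third holds none, which is why a sparse representation can be more compact than a dense 2D array when scenarios update different numbers of components.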