diff --git a/docs/advanced_documentation/native-data-interface.md b/docs/advanced_documentation/native-data-interface.md
index 0b17b8056..a47af7e39 100644
--- a/docs/advanced_documentation/native-data-interface.md
+++ b/docs/advanced_documentation/native-data-interface.md
@@ -55,21 +55,57 @@ node_dtype = np.dtype(
 To recreate the same node input dataset, we just create a `numpy` array using this special defined `dtype`.
 The `numpy` array has exactly the same data layout as the `std::vector` above.
 
-
 ```python
 node = np.empty(shape=2, dtype=node_dtype)
 node['id'] = [1, 2]
 node['u_rated'] = [150e3, 10e3]
 ```
 
+## Columnar data format
+
+Additionally, we can represent only specific attributes of the `NodeInput` struct mentioned in [Structured Array](#structured-array).
+This is especially useful when the component in question, e.g., a transformer, has many attributes that keep their default values. In that case, the user can save significantly on memory usage. For example, the `u_rated` attribute alone can be described by a type `NodeInputURated`, which is simply a `double`.
+(Note again that the actual representation in the C++ core might differ from `NodeInputURated`.)
+
+One can create a `std::vector<NodeInputURated>` to hold input for multiple nodes.
+Similar to the example above, we create the `u_rated` attribute data of two nodes with 150 kV and 10 kV.
+
+```c++
+using NodeInputURated = double;
+std::vector<NodeInputURated> node_u_rated_input{ 150.0e3 , 10.0e3 };
+```
+
+The same applies to `NodeInputId` and `std::vector<NodeInputId>`.
+
+To recreate this in Python, we create one NumPy array per attribute, each with the correct dtype as mentioned in [Structured Array](#structured-array).
+
+```python
+node_id = np.empty(shape=2, dtype=node_dtype["id"])
+node_id[:] = [1, 2]
+node_u_rated = np.empty(shape=2, dtype=node_dtype["u_rated"])
+node_u_rated[:] = [150e3, 10e3]
+```
+
+## Creating Dataset
+
 We further save this array into a dictionary.
 With other types of components, the dictionary is a valid input dataset for the constructor of `PowerGridModel`,
 see [Python API Reference](../api_reference/python-api-reference.md).
 
+For a row-based data format,
+
 ```python
 input_data = {'node': node}
 ```
 
+or for columnar data format,
+
+```python
+input_data_columnar = {'node': {"id": node_id, "u_rated": node_u_rated}}
+```
+
+A dataset can also combine both row-based and columnar data formats.
+
 In the `ctypes` wrapper the pointers to all the array data will be retrieved and passed to the C++ code.
 This is also true for result dataset.
 The memory block of result dataset is allocated using `numpy`.
@@ -141,9 +177,14 @@ The code below creates an array which is compatible with transformer input datas
 ```python
 from power_grid_model import ComponentType, DatasetType, power_grid_meta_data
 
-transformer = np.empty(shape=5, dtype=power_grid_meta_data[DatasetType.input][ComponentType.transformer]['dtype'])
+transformer_dtype = power_grid_meta_data[DatasetType.input][ComponentType.transformer].dtype
+# Array for row-based data
+transformer = np.empty(shape=5, dtype=transformer_dtype)
+# Array for columnar data
+transformer_tap_pos = np.empty(shape=5, dtype=transformer_dtype["tap_pos"])
+
 # direct string access is supported as well:
-# transformer = np.empty(shape=5, dtype=power_grid_meta_data['input']['transformer']['dtype'])
+# transformer = np.empty(shape=5, dtype=power_grid_meta_data['input']['transformer'].dtype)
 ```
 
 Furthermore, there is an even more convenient function `initialize_array`
diff --git a/docs/user_manual/serialization.md b/docs/user_manual/serialization.md
index d3f1aa417..f5a56902a 100644
--- a/docs/user_manual/serialization.md
+++ b/docs/user_manual/serialization.md
@@ -94,6 +94,9 @@ A [`ComponentDataset`](#json-schema-component-dataset-object) is an array of [`C
 - [`ComponentDataset`](#json-schema-component-dataset-object): `Array`
   - [`ComponentData`](#json-schema-component-data-object): the data per single component.
 
+**NOTE:** The actual deserialized data representation may be row-based or columnar, depending on the `data_filter` provided at deserialization (see {py:function}`json_deserialize ` for an example).
+Regardless of whether the deserialized data representation is row-based or columnar, the serialization format remains the same.
+
 #### JSON schema component data object
 
 A [`ComponentData`](#json-schema-component-data-object) object is either a [`HomogeneousComponentData`](#json-schema-homogeneous-component-data-object) object or an [`InhomogeneousComponentData`](#json-schema-inhomogeneous-component-data-object) object