From db7b0b9a78ca64ef994f9edefa79348b22b239a3 Mon Sep 17 00:00:00 2001
From: Martijn Govers <Martijn.Govers@Alliander.com>
Date: Fri, 25 Oct 2024 12:24:20 +0200
Subject: [PATCH 1/4] add C API columnar buffer documentation

Signed-off-by: Martijn Govers <Martijn.Govers@Alliander.com>
---
 docs/advanced_documentation/build-guide.md |  2 +-
 docs/advanced_documentation/c-api.md       | 78 +++++++++++++++++-----
 2 files changed, 63 insertions(+), 17 deletions(-)
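
Note for reviewers: the component buffer layout example documented in this patch can be sketched in plain C as follows. This is a minimal sketch, not library code. The offsets, size and alignment are the illustrative values from the documentation (assumptions, hard-coded here); real code would retrieve them with the `PGM_meta_*` functions. The helper names `aligned_size`, `set_line` and `get_line_id` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative layout values only (assumption): real values come from PGM_meta_*. */
enum {
    LINE_ID_OFFSET = 0,
    LINE_FROM_STATUS_OFFSET = 4,
    LINE_TO_STATUS_OFFSET = 5,
    LINE_UNALIGNED_SIZE = 6,
    LINE_ALIGNMENT = 4
};

/* Round `size` up to the next multiple of `alignment`. */
size_t aligned_size(size_t size, size_t alignment) {
    assert(alignment > 0);
    return (size + alignment - 1) / alignment * alignment;
}

/* Write one `line` update record at row `idx` of a row-based buffer. */
void set_line(void* buffer, size_t idx, int32_t id, int8_t from_status, int8_t to_status) {
    size_t stride = aligned_size(LINE_UNALIGNED_SIZE, LINE_ALIGNMENT);
    unsigned char* row = (unsigned char*)buffer + idx * stride;
    /* memcpy avoids unaligned or type-punned access through a cast pointer. */
    memcpy(row + LINE_ID_OFFSET, &id, sizeof id);
    memcpy(row + LINE_FROM_STATUS_OFFSET, &from_status, sizeof from_status);
    memcpy(row + LINE_TO_STATUS_OFFSET, &to_status, sizeof to_status);
}

/* Read the `id` of the record at row `idx`. */
int32_t get_line_id(void const* buffer, size_t idx) {
    size_t stride = aligned_size(LINE_UNALIGNED_SIZE, LINE_ALIGNMENT);
    unsigned char const* row = (unsigned char const*)buffer + idx * stride;
    int32_t id;
    memcpy(&id, row + LINE_ID_OFFSET, sizeof id);
    return id;
}
```

With these values, three lines occupy 3 * 8 = 24 bytes, and the third line starts at byte 16.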

diff --git a/docs/advanced_documentation/build-guide.md b/docs/advanced_documentation/build-guide.md
index c2ae5bc3a..6b2ca2dbf 100644
--- a/docs/advanced_documentation/build-guide.md
+++ b/docs/advanced_documentation/build-guide.md
@@ -488,4 +488,4 @@ cd docs/doxygen
 doxygen
 cd ..
 sphinx-build -b html . _build/html
-```
\ No newline at end of file
+```
diff --git a/docs/advanced_documentation/c-api.md b/docs/advanced_documentation/c-api.md
index d34c804bd..fa4577cb7 100644
--- a/docs/advanced_documentation/c-api.md
+++ b/docs/advanced_documentation/c-api.md
@@ -35,7 +35,7 @@ Since the C API is a dynamically linked library, the user is responsible for pla
 and making it available to their binaries, e.g. by adding its location to `PATH` or `RPATH`.
 ```
 
-## Opaque Struct/Pointer
+## Opaque struct/pointer
 
 As a common C API practice, we use [opaque struct/pointer](https://en.wikipedia.org/wiki/Opaque_pointer) in the API. 
 The user creates the object by `PGM_create_*` function and release the object by `PGM_destroy_*` function.
@@ -55,7 +55,7 @@ to check if there is error during the creation and the error message.
 
 If you are calling the C API in multiple threads, each thread should have its own handle object created by `PGM_create_handle`.
 
-## Calculation Options
+## Calculation options
 
 To execute a power grid calculation you need to specify many options, 
 e.g., maximum number of iterations, error tolerance, etc.
@@ -69,45 +69,83 @@ In the `PGM_calculate` function you need to pass a pointer to `PGM_Options`.
 In this way, we can ensure the API backwards compatibility.
 If we add a new option, it will get a default value in the `PGM_create_options` function.
 
-## Buffer and Attributes
+## Buffers and attributes
 
 The biggest challenge in the design of C API is the handling of input/output/update data buffers.
 We define the following concepts in the data hierarchy:
 
 * Dataset: a collection of data buffers for a given purpose. 
-At this moment, we have four dataset types: `input`, `update`, `sym_output`, `asym_output`.
-* Component: a homogeneous data buffer for a component in our [data model](../user_manual/components.md), e.g., `node`.
+  At this moment, we have four dataset types: `input`, `update`, `sym_output`, `asym_output`.
+* Component: the representation of attributes for the same component in our [data model](../user_manual/components.md), e.g., `node`.
 * Attribute: a property of given component. For example, `u_rated` attribute of `node` is the rated voltage of the node.
 
-### Create and Destroy Buffer
+Additionally, at this time, we distinguish two buffer types: [component buffers](#component-buffers) and [attribute buffers](#attribute-buffers).
+
+### Component buffers
+
+These buffers represent component data in a row-based format.
+I.e., all attributes of the same component are represented sequentially.
+The individual components are represented in the buffer with a given size and alignment,
+which can be retrieved from the C API via the `PGM_meta_component_size` and `PGM_meta_component_alignment` functions.
+The type (implying the size) and offset of each attribute can be found using the `PGM_meta_attribute_ctype` and `PGM_meta_attribute_offset` functions.
+
+While we recommend that users create their own buffers using `PGM_meta_component_size` and `PGM_meta_component_alignment`,
+we also provide helper functions to ease the burden (see below), which provide backwards compatibility by design.
+
+#### Component buffer layout example
+
+The following example shows how `line` update data may be represented in the buffer.
+
+```{note}
+These values are for illustration purposes only.
+These may not be the actual values retrieved by the `PGM_meta_*` functions,
+and may vary between power grid model versions, compilers, operating systems and architectures.
+```
+
+* an unaligned size of 6 bytes consisting of:
+  * the `id` (4 bytes, offset by 0 bytes)
+  * the `from_status` (1 byte, offset by 4 bytes)
+  * the `to_status` (1 byte, offset by 5 bytes)
+* an aligned size of 8 bytes
+* a 4-byte alignment
+
+```txt
+<line_0><line_1><line_2>   <-- 3 lines.
+|   |   |   |   |   |   |  <-- alignment: a line may start every 4 bytes.
+iiiift  iiiift  iiiift     <-- data: 6 bytes per line: 4 bytes for the ID, 1 for the from_status and 1 for the to_status.
+|     ..|     ..|     ..|  <-- padding: (4 - (6 mod 4) = 2) bytes after every line.
+|       |       |       |  <-- aligned size: (6 + 2 = 8) bytes every line.
+```
+
+#### Create and destroy buffer
 
 Data buffers are almost always allocated and freed in the heap. We provide two ways of doing so.
 
 * You can use the function `PGM_create_buffer` and `PGM_destroy_buffer` to create and destroy buffer.
-In this way, the library is handling the memory (de-)allocation.
+  In this way, the library is handling the memory (de-)allocation.
 * You can call some memory (de-)allocation function in your own code according to your platform, 
-e.g., `aligned_alloc` and `free`.
-You need to first call `PGM_meta_*` functions to retrieve the size and alignment of a component.
+  e.g., `aligned_alloc` and `free`.
+  You need to first call `PGM_meta_*` functions to retrieve the size and alignment of a component.
 
 ```{warning}
 Do not mix these two methods in creation and destruction.
 You cannot use `PGM_destroy_buffer` to release a buffer created in your own code, or vice versa.
 ```
 
-### Set and Get Attribute
+#### Set and get attribute
 
 Once you have the data buffer, you need to set or get attributes. We provide two ways of doing so.
 
 * You can use the function `PGM_buffer_set_value` and `PGM_buffer_get_value` to get and set values.
 * You can do pointer cast directly on the buffer pointer, by shifting the pointer to proper offset
-and cast it to a certain value type. 
-You need to first call `PGM_meta_*` functions to retrieve the correct offset.
+  and cast it to a certain value type. 
+  You need to first call `PGM_meta_*` functions to retrieve the correct offset.
 
 Pointer cast is generally more efficient and flexible because you are not calling into the 
 dynamic library everytime. But it requires the user to retrieve the offset information first.
 Using the buffer helper function is more convenient but with some overhead.
 
-### Set NaN Function
+#### Set NaN function
 
 In the C API we have a function `PGM_buffer_set_nan` which sets all the attributes in a buffer to `NaN`.
 In the calculation core, if an optional attribute is `NaN`, it will use the default value.
@@ -116,7 +154,7 @@ If you just want to set some attributes and keep everything else as `NaN`,
 calling `PGM_buffer_set_nan` before you set attribute is convenient.
 This is useful especially in `update` dataset because you do not always update all the mutable attributes.
 
-### Backwards Compatibility
+#### Backwards compatibility
 
 If you do want to set all the attributes in a component, you can skip the function call to `PGM_buffer_set_nan`.
 This will provide better performance as there is no use of setting `NaN`.
@@ -134,6 +172,14 @@ You do not need to call `PGM_buffer_set_nan` on output buffers,
 because the buffer will be overwritten in the calculation core with the real output data.
 ```
 
+### Attribute buffers
+
+An attribute buffer contains data for a single component attribute and represents one column in a columnar representation of component data.
+A combination of attribute buffers with the same amount of elements has the power to carry the same information as row-based [component buffers](#component-buffers).
+
+Since all attributes consist of primitive types, operations are straightforward.
+We therefore do not provide explicit interface functionality.
+
 ## Dataset views
 
 For large datasets that cannot or should not be treated independently,
@@ -142,9 +188,9 @@ Currently implemented are `PGM_dataset_const`, `PGM_dataset_mutable`, and `PGM_d
 These three dataset types expose a dataset to the power-grid-model with the following permissions on buffers:
 
 | Dataset interface        | power-grid-model permissions | User permissions    | Treat as        |
-|--------------------------| ---------------------------- | ------------------- | --------------- |
+| ------------------------ | ---------------------------- | ------------------- | --------------- |
 | `PGM_dataset_const_*`    | Read                         | Create, read, write | `const * const` |
-| `PGM_dataset_mutable_*`  | Read, write                  | Create, read, write | `* const` |
+| `PGM_dataset_mutable_*`  | Read, write                  | Create, read, write | `* const`       |
 | `PGM_dataset_writable_*` | Read, write                  | Read                | `* const`       |
 
 A constant dataset is completely user-owned.

From 501647bce2b37840f163c72026bcf7e8c2aead94 Mon Sep 17 00:00:00 2001
From: Martijn Govers <Martijn.Govers@Alliander.com>
Date: Fri, 25 Oct 2024 12:24:55 +0200
Subject: [PATCH 2/4] document UB when permanent update fails

Signed-off-by: Martijn Govers <Martijn.Govers@Alliander.com>
---
 .../power_grid_model_c/include/power_grid_model_c/model.h  | 7 +++----
 src/power_grid_model/core/power_grid_model.py              | 5 +++++
 2 files changed, 8 insertions(+), 4 deletions(-)
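
Note for reviewers: the usage pattern implied by the new wording (discard the model after a failed update) can be sketched as below. This is a self-contained illustration with stand-in types and functions; `DemoModel`, `demo_create`, `demo_update`, `demo_destroy` and `update_or_discard` are all hypothetical and are not part of the C API. Real code would check `PGM_error_code()` and `PGM_error_message()` after the update call.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an opaque model object. */
typedef struct {
    int state;
} DemoModel;

DemoModel* demo_create(void) {
    return calloc(1, sizeof(DemoModel));
}

void demo_destroy(DemoModel* model) {
    free(model);
}

/* Stand-in update: fails for negative values and leaves the model in a
 * state the caller must not rely on, mimicking the documented behavior. */
int demo_update(DemoModel* model, int new_state) {
    if (new_state < 0) {
        model->state = -999; /* arbitrary leftover value */
        return 1;            /* non-zero means error */
    }
    model->state = new_state;
    return 0;
}

/* Apply an update; on failure destroy the model and null the pointer
 * so that it cannot be reused by accident. Returns the error code. */
int update_or_discard(DemoModel** model, int new_state) {
    int error = demo_update(*model, new_state);
    if (error != 0) {
        demo_destroy(*model);
        *model = NULL;
    }
    return error;
}
```

A caller that needs to continue after a failure would create a fresh model from the original input data instead of reusing the failed one.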

diff --git a/power_grid_model_c/power_grid_model_c/include/power_grid_model_c/model.h b/power_grid_model_c/power_grid_model_c/include/power_grid_model_c/model.h
index f577852a1..3827aefb6 100644
--- a/power_grid_model_c/power_grid_model_c/include/power_grid_model_c/model.h
+++ b/power_grid_model_c/power_grid_model_c/include/power_grid_model_c/model.h
@@ -29,17 +29,16 @@ extern "C" {
  * @param input_dataset Pointer to an instance of PGM_ConstDataset. It should have data type "input".
  * @return The opaque pointer to the created model.
  * If there are errors during the creation, a NULL is returned.
- * Use PGM_error_code() and PGM_error_message() to check the error. */
+ * Use PGM_error_code() and PGM_error_message() to check the error.
+ */
 PGM_API PGM_PowerGridModel* PGM_create_model(PGM_Handle* handle, double system_frequency,
                                              PGM_ConstDataset const* input_dataset);
 
 /**
  * @brief Update the model by changing mutable attributes of some elements.
  *
- * All the elements you supply in the update dataset should have valid ids
- * which exist in the original model.
- *
  * Use PGM_error_code() and PGM_error_message() to check if there are errors in the update.
+ * NOTE: The model is left in an undefined state if an error occurs during the update and should be destroyed.
  *
  * @param handle
  * @param model A pointer to an existing model.
diff --git a/src/power_grid_model/core/power_grid_model.py b/src/power_grid_model/core/power_grid_model.py
index ee8632d12..46564faa0 100644
--- a/src/power_grid_model/core/power_grid_model.py
+++ b/src/power_grid_model/core/power_grid_model.py
@@ -129,12 +129,17 @@ def update(self, *, update_data: Dataset):
         """
         Update the model with changes.
 
+        The model will be in an invalid state if the update fails and should be discarded.
+
         Args:
             update_data: Update data dictionary
 
                 - key: Component type
                 - value: Component data with the correct type :class:`ComponentData` (single scenario or batch)
 
+        Raises:
+            PowerGridError: If the update fails. The model is left in an invalid state and should be discarded.
+
         Returns:
             None
         """

From a980e233bfad733a340b2718f4b3312cafa67b17 Mon Sep 17 00:00:00 2001
From: Martijn Govers <Martijn.Govers@Alliander.com>
Date: Fri, 25 Oct 2024 12:38:52 +0200
Subject: [PATCH 3/4] more improvements

Signed-off-by: Martijn Govers <Martijn.Govers@Alliander.com>
---
 docs/advanced_documentation/c-api.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
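
Note for reviewers: the statement that the attribute type implies its size can be illustrated with the sketch below. The enum `DemoCType` is a local stand-in (an assumption) and does not reproduce the library's actual type enumeration; real code would use the value returned by `PGM_meta_attribute_ctype`. The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the attribute type enumeration. */
typedef enum {
    DEMO_INT32,
    DEMO_INT8,
    DEMO_DOUBLE,
    DEMO_DOUBLE3 /* three doubles, e.g. one value per phase */
} DemoCType;

/* Size in bytes of a single attribute value of the given type. */
size_t demo_ctype_size(DemoCType ctype) {
    switch (ctype) {
    case DEMO_INT32:
        return sizeof(int32_t);
    case DEMO_INT8:
        return sizeof(int8_t);
    case DEMO_DOUBLE:
        return sizeof(double);
    case DEMO_DOUBLE3:
        return 3 * sizeof(double);
    }
    return 0; /* unknown type */
}

/* Bytes needed for a contiguous attribute buffer holding `n_elements` values. */
size_t demo_attribute_buffer_bytes(DemoCType ctype, size_t n_elements) {
    return n_elements * demo_ctype_size(ctype);
}
```

Because an attribute buffer holds values of a single primitive type, no padding between elements is needed in this sketch.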

diff --git a/docs/advanced_documentation/c-api.md b/docs/advanced_documentation/c-api.md
index fa4577cb7..e385ab109 100644
--- a/docs/advanced_documentation/c-api.md
+++ b/docs/advanced_documentation/c-api.md
@@ -71,7 +71,10 @@ If we add a new option, it will get a default value in the `PGM_create_options`
 
 ## Buffers and attributes
 
-The biggest challenge in the design of C API is the handling of input/output/update data buffers.
+The biggest challenge in the design of the C API is the handling of input/output/update data communication.
+To this end, data is communicated using buffers, which are contiguous blocks of memory that represent data in a given format.
+For compatibility reasons, that format is dictated by the C API using the `PGM_meta_*` functions and `PGM_def_*` dataset definitions.
+
 We define the following concepts in the data hierarchy:
 
 * Dataset: a collection of data buffers for a given purpose. 
@@ -177,6 +180,8 @@ because the buffer will be overwritten in the calculation core with the real out
 An attribute buffer contains data for a single component attribute and represents one column in a columnar representation of component data.
 A combination of attribute buffers with the same amount of elements has the power to carry the same information as row-based [component buffers](#component-buffers).
 
+The type (implying the size) of each attribute can be found using the `PGM_meta_attribute_ctype`.
+
 Since all attributes consist of primitive types, operations are straightforward.
 We therefore do not provide explicit interface functionality.
 

From 20c97e8a6fef783cd438170561d6fbb5258f019a Mon Sep 17 00:00:00 2001
From: Santiago Figueroa Manrique <santiago.figueroa.manrique@alliander.com>
Date: Mon, 28 Oct 2024 14:20:50 +0100
Subject: [PATCH 4/4] addressed comments

Signed-off-by: Santiago Figueroa Manrique <santiago.figueroa.manrique@alliander.com>
---
 docs/advanced_documentation/c-api.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
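
Note for reviewers: the relation between row-based component buffers and columnar attribute buffers can be sketched as below. This is an illustration only; the struct `DemoLineUpdate` and the function `rows_to_columns` are hypothetical, and the struct layout is an assumption rather than the layout reported by the `PGM_meta_*` functions. In real code, each resulting column would be passed to `PGM_dataset_const_add_attribute_buffer` or `PGM_dataset_mutable_add_attribute_buffer`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical row layout for a `line` update record. */
typedef struct {
    int32_t id;
    int8_t from_status;
    int8_t to_status;
} DemoLineUpdate;

/* Split `n` row-based records into one contiguous array per attribute.
 * Each output array must have room for `n` elements. */
void rows_to_columns(DemoLineUpdate const* rows, size_t n, int32_t* ids, int8_t* from_status,
                     int8_t* to_status) {
    for (size_t i = 0; i < n; ++i) {
        ids[i] = rows[i].id;
        from_status[i] = rows[i].from_status;
        to_status[i] = rows[i].to_status;
    }
}
```

All columns have the same number of elements, so together they carry the same information as the row-based buffer.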

diff --git a/docs/advanced_documentation/c-api.md b/docs/advanced_documentation/c-api.md
index e385ab109..986b8eb8a 100644
--- a/docs/advanced_documentation/c-api.md
+++ b/docs/advanced_documentation/c-api.md
@@ -79,7 +79,7 @@ We define the following concepts in the data hierarchy:
 
 * Dataset: a collection of data buffers for a given purpose. 
   At this moment, we have four dataset types: `input`, `update`, `sym_output`, `asym_output`.
-* Component: the representation of attributes for the same component in our [data model](../user_manual/components.md), e.g., `node`.
+* Component: a data buffer with the representation of all attributes of a physical grid component in our [data model](../user_manual/components.md), e.g., `node`.
 * Attribute: a property of given component. For example, `u_rated` attribute of `node` is the rated voltage of the node.
 
 Additionally, at this time, we distinguish two buffer types: [component buffers](#component-buffers) and [attribute buffers](#attribute-buffers).
@@ -183,7 +183,7 @@ A combination of attribute buffers with the same amount of elements has the powe
 The type (implying the size) of each attribute can be found using the `PGM_meta_attribute_ctype`.
 
 Since all attributes consist of primitive types, operations are straightforward.
-We therefore do not provide explicit interface functionality.
+We therefore do not provide explicit interface functionality to create an attribute buffer. Instead, call `PGM_dataset_const_add_buffer` or `PGM_dataset_mutable_add_buffer` with empty data (`NULL`) to set a component buffer for data in columnar format, and then use `PGM_dataset_const_add_attribute_buffer` or `PGM_dataset_mutable_add_attribute_buffer` to add the attribute buffers directly to the dataset.
 
 ## Dataset views