ffi: Add class for key-value pair log events. #507

LinZhihao-723 · 2024-08-04T00:43:01Z

Description

This PR introduces a new class KeyValuePairLogEvent. This class represents a key-value pair log event serialized by the key-value pair IR format. It consists of the following members:

An unordered map of key-value pairs. Each key-value pair consists of an id (as the key) and a Value, where the id refers to the schema tree node id.
A shared pointer to the schema tree.
A UTC offset.

An instance must be created using the factory function. The factory function will validate whether the value's type matches the corresponding key. The input of the factory function should come from the deserialization methods, which will be used in the coming PRs.
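For readers who want a concrete picture of the factory-plus-validation idea, here is a minimal sketch. It is not the PR's implementation: `NodeType`, `SchemaTree`, `Value`, `LogEvent`, and `value_matches_type` are simplified, hypothetical stand-ins, and a `std::variant` is used as the result type purely to keep the example self-contained.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <optional>
#include <string>
#include <system_error>
#include <unordered_map>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical, simplified stand-ins for the schema tree and value types.
enum class NodeType { Int, Str, Obj };
using NodeId = std::size_t;
using Value = std::variant<std::int64_t, std::string>;
using NodeIdValuePairs = std::unordered_map<NodeId, std::optional<Value>>;

struct SchemaTree {
    std::vector<NodeType> node_types;  // Indexed by node ID
};

// Helper that only deals with non-empty values.
[[nodiscard]] inline auto value_matches_type(Value const& value, NodeType type) -> bool {
    switch (type) {
        case NodeType::Int:
            return std::holds_alternative<std::int64_t>(value);
        case NodeType::Str:
            return std::holds_alternative<std::string>(value);
        default:
            return false;
    }
}

class LogEvent {
public:
    // The factory is the only way to get an instance, so every instance has been validated.
    [[nodiscard]] static auto create(
            std::shared_ptr<SchemaTree const> schema_tree,
            NodeIdValuePairs pairs,
            int utc_offset
    ) -> std::variant<LogEvent, std::errc> {
        if (nullptr == schema_tree) {
            return std::errc::invalid_argument;
        }
        for (auto const& [node_id, value] : pairs) {
            if (node_id >= schema_tree->node_types.size()) {
                return std::errc::operation_not_permitted;
            }
            auto const type{schema_tree->node_types[node_id]};
            if (false == value.has_value()) {
                // Empty value: only allowed on object-type nodes
                if (NodeType::Obj != type) {
                    return std::errc::protocol_error;
                }
                continue;
            }
            if (false == value_matches_type(*value, type)) {
                return std::errc::protocol_error;
            }
        }
        return LogEvent{std::move(schema_tree), std::move(pairs), utc_offset};
    }

    [[nodiscard]] auto get_pairs() const -> NodeIdValuePairs const& { return m_pairs; }
    [[nodiscard]] auto get_utc_offset() const -> int { return m_utc_offset; }

private:
    LogEvent(std::shared_ptr<SchemaTree const> schema_tree, NodeIdValuePairs pairs, int utc_offset)
            : m_schema_tree{std::move(schema_tree)},
              m_pairs{std::move(pairs)},
              m_utc_offset{utc_offset} {}

    std::shared_ptr<SchemaTree const> m_schema_tree;
    NodeIdValuePairs m_pairs;
    int m_utc_offset;
};
```

Keeping the constructor private means callers cannot bypass the checks, which is the main reason to route creation through a factory.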

Validation performed

Ensured the class builds successfully.
Verified in development branches that this class works with the key-value pair IR format deserializer.
Added unit tests to cover errors on creation.

kirkrodrigues · 2024-08-14T01:22:01Z

components/core/src/clp/ffi/KeyValuePairLogEvent.cpp

+            if (false == value.has_value()) {
+                // Empty value
+                if (SchemaTreeNode::Type::Obj != type) {
+                    return std::errc::protocol_error;
+                }


I kinda feel like this can be moved into the value type validation method.

I'd prefer to leave it outside for two reasons:

An empty value isn't a Value type.

This simplifies the helper function: it doesn't have to check has_value(), and we don't need two symbols inside the helper (one for the std::optional<Value> and the other for the reference to the underlying Value).

haiqi96

Just quickly skimmed through the class and haven't read the unit tests yet.

haiqi96 · 2024-08-14T15:14:14Z

components/core/src/clp/ffi/KeyValuePairLogEvent.cpp

+ * `node_id_value_pairs`. A node is considered a leaf if none of its descendants appear in the
+ * `node_id_value_pairs`.
+ */
+[[nodiscard]] auto is_leaf_node(


Disclaimer: I only have a rough understanding of how the schema tree works.
This function is a bit confusing to me.
"A node is considered a leaf if none of its descendants appear in the node_id_value_pairs." <- This doesn't seem to be the normal definition of a leaf node. Is this expected?

In addition, regarding "the sub schema tree defined by node_id_value_pairs": is there any guarantee that the nodes in node_id_value_pairs will form a subtree? Just looking at the argument without thinking about how it is generated, it is possible that node_id_value_pairs contains a list of unconnected nodes.

Lastly, a high-level comment: I feel this function should belong to the schema tree class.

This is a validation method: if an object-type node appears in the key-value pairs, it must be a leaf node. We store a key-value pair log event's schema as leaf nodes only, so the paths from the root to those leaves are implied. In the merged schema tree, an object-type node is usually a non-leaf node, but in a single log event's schema it can be a leaf as long as its value is {} or null. Our goal is to validate that if such a node appears, none of its descendants in the merged schema tree appear in the schema of the given key-value pair log event. The concern that "it is possible node_id_value_pairs contains a list of unconnected nodes" doesn't apply, since every valid leaf node is connected to the root. This function validates the pairs using the schema tree's information, so I think it should stay separate from the schema tree's methods.
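To make the "leaf within this log event" rule concrete, here is a rough sketch of such a check. It assumes a hypothetical minimal tree where `children[id]` lists a node's child IDs, and a plain set of the node IDs that appear in one log event; the names `SchemaTree` and `is_leaf_in_event` are illustrative only and not the function from the diff.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

using NodeId = std::size_t;

// Hypothetical minimal merged schema tree: `children[id]` holds the child IDs of node `id`.
struct SchemaTree {
    std::vector<std::vector<NodeId>> children;
};

// Returns true if no descendant of `node_id` (in the merged schema tree) appears among the node
// IDs of the given log event, i.e. the node is a leaf of the event's own schema.
[[nodiscard]] inline auto is_leaf_in_event(
        SchemaTree const& tree,
        NodeId node_id,
        std::unordered_set<NodeId> const& event_node_ids
) -> bool {
    // Iterative depth-first search over the node's descendants
    std::vector<NodeId> stack(tree.children.at(node_id));
    while (false == stack.empty()) {
        auto const id{stack.back()};
        stack.pop_back();
        if (event_node_ids.count(id) > 0) {
            return false;
        }
        auto const& grandchildren{tree.children.at(id)};
        stack.insert(stack.end(), grandchildren.begin(), grandchildren.end());
    }
    return true;
}
```

With a tree `0 -> {1, 2}`, `1 -> {3}`, an event containing only node 1 treats node 1 as a leaf, while an event containing both 1 and 3 does not.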

haiqi96 · 2024-08-14T15:20:17Z

components/core/src/clp/ffi/KeyValuePairLogEvent.cpp

+            auto const& node{schema_tree.get_node(node_id)};
+            auto const node_type{node.get_type()};
+            if (false == value.has_value()) {
+                // Value is an empty object (`{}`, which is not the same as `null`)


I assume this comment tries to explain why the line doesn't use value.is_null(); in a way similar to line 68?
I don't think I follow the comment though. Can you clarify a bit?

And we can combine the two if statements?

{} and null are two different types of values.

I'd prefer to handle them separately. Combining these two statements will make it very hard to read

I still don't get when there will be {} and when there will be null (and why we only check for {} but not null in this function)
Is it because null is already handled in the validation? If so, I think you should leave this information in the comment.

{} is treated as a special case, whereas null is a valid primitive value (like an int or a string). Both are user-input values.
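One way to picture the distinction, as a sketch only: an absent `std::optional` stands for `{}`, while `null` is an ordinary value stored inside it. The `Value` alias below is a hypothetical simplification that uses `std::monostate` to model `null`; `describe` is an illustrative helper, not part of the PR.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <variant>

// Hypothetical simplified value type: `std::monostate` models `null`.
using Value = std::variant<std::monostate, std::int64_t, std::string>;

[[nodiscard]] inline auto is_null(Value const& value) -> bool {
    return std::holds_alternative<std::monostate>(value);
}

// Classifies what a node ID is paired with.
[[nodiscard]] inline auto describe(std::optional<Value> const& value) -> std::string {
    if (false == value.has_value()) {
        // No value at all: the empty object `{}`
        return "{}";
    }
    if (is_null(*value)) {
        // A real value that happens to be `null`
        return "null";
    }
    return "primitive";
}
```

Under this model, `{}` never reaches the value-type check, while `null` goes through it like any other primitive.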

haiqi96 · 2024-08-14T15:22:05Z

components/core/src/clp/ffi/KeyValuePairLogEvent.hpp

+              m_utc_offset{utc_offset} {}
+
+    // Variables
+    std::shared_ptr<SchemaTree> m_schema_tree;


Technically, it is possible to modify the original schema tree through this pointer inside the class, right?
Should we consider using std::shared_ptr<SchemaTree const>?
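As a sketch of what the suggestion buys, the following uses hypothetical stand-in types (`SchemaTree` with an `add_node` method, `LogEvent`). A `std::shared_ptr<SchemaTree>` converts implicitly to `std::shared_ptr<SchemaTree const>`, so callers keep their mutable handle while the event can only read.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the schema tree.
struct SchemaTree {
    std::vector<std::string> node_names;

    auto add_node(std::string name) -> void { node_names.push_back(std::move(name)); }
    [[nodiscard]] auto size() const -> std::size_t { return node_names.size(); }
};

class LogEvent {
public:
    // Accepts a pointer-to-const; a `std::shared_ptr<SchemaTree>` argument converts implicitly.
    explicit LogEvent(std::shared_ptr<SchemaTree const> schema_tree)
            : m_schema_tree{std::move(schema_tree)} {}

    [[nodiscard]] auto get_schema_tree() const -> SchemaTree const& { return *m_schema_tree; }

private:
    // Calling `m_schema_tree->add_node(...)` here would not compile.
    std::shared_ptr<SchemaTree const> m_schema_tree;
};
```

Note that the owner of the non-const pointer can still grow the tree, and the event observes those changes; only mutation through the event is ruled out.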

kirkrodrigues · 2024-08-15T07:32:28Z

components/core/src/clp/ffi/KeyValuePairLogEvent.cpp

+        KeyValuePairLogEvent::NodeIdValuePairs const& node_id_value_pairs
+) -> std::optional<std::errc> {
+    std::optional<std::errc> ret_val;
+    std::unordered_map<SchemaTreeNode::id_t, std::unordered_set<std::string_view>>


Let's move this into the try to reduce its scope.

kirkrodrigues · 2024-08-15T07:35:40Z

components/core/src/clp/ffi/KeyValuePairLogEvent.cpp

+ * @param node_id_value_pairs
+ * @return `std::nullopt` if the inputs are valid, or an error code indicating the failure:
+ * - std::errc::operation_not_permitted if a node ID doesn't represent a valid node in the
+ *   schema tree, or a non-leaf node ID is paired with a value.


A non-leaf node paired with a value actually returns protocol_not_supported. That said, I think it should return operation_not_permitted since if it was permitted, node_id_value_pairs wouldn't be a tree.

kirkrodrigues · 2024-08-15T07:37:50Z

components/core/tests/test-ffi_KeyValuePairLogEvent.cpp

+    }
+}
+
+auto clone_value(Value const& value) -> Value {


Hm, I missed this in the previous PRs but it seems like we should just have copy constructors for Value and EncodedTextAst. It feels like an overoptimization to force people to never use copy construction unless they write a method manually like this.

We can add a copy constructor for EncodedTextAst. However, I'd prefer not to have a copy constructor for Value, to enforce move semantics. How about making this clone method a member of Value, so that people can explicitly copy a value when they really need to?

What's bad about copying a Value? Is it not similar to copying a string?

Copying string and EncodedTextAst can be expensive in deserialization. With copy disabled, we are more confident that there's no copy overhead in deserialization.

Adding move constructor for EncodedTextAst and Value
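For reference, the "move-only with an explicit clone" alternative discussed in this thread could look roughly like the sketch below. `Payload` is a hypothetical type used only for illustration; it is not `Value` or `EncodedTextAst`.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical move-only type: copies must be requested explicitly via `clone()`.
class Payload {
public:
    explicit Payload(std::string text) : m_text{std::move(text)} {}

    // Disable copy constructor and assignment operator
    Payload(Payload const&) = delete;
    auto operator=(Payload const&) -> Payload& = delete;

    // Default move constructor and assignment operator
    Payload(Payload&&) = default;
    auto operator=(Payload&&) -> Payload& = default;

    ~Payload() = default;

    // The only way to duplicate the object, so every copy is visible at the call site.
    [[nodiscard]] auto clone() const -> Payload { return Payload{m_text}; }

    [[nodiscard]] auto get_text() const -> std::string const& { return m_text; }

private:
    std::string m_text;
};

static_assert(false == std::is_copy_constructible_v<Payload>);
static_assert(std::is_move_constructible_v<Payload>);
```

The trade-off is the one raised above: accidental copies become compile errors, but every legitimate copy needs an explicit call.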

kirkrodrigues · 2024-08-15T07:38:26Z

components/core/tests/test-ffi_KeyValuePairLogEvent.cpp

+    }
+
+    {
+        // Empty invalid_node_id_value_pairs


Overzealous refactoring.

kirkrodrigues · 2024-08-15T07:40:17Z

components/core/tests/test-ffi_KeyValuePairLogEvent.cpp

+}
+
+// NOLINTNEXTLINE(readability-function-cognitive-complexity)
+TEST_CASE("ffi_KeyValuePairLogEvent_create", "[ffi]") {


You can reduce the amount of code for this test by using SECTION and even nested SECTION macros. Recall that Catch2 runs every SECTION from the beginning of the test case, so the schema tree and node_id_value_pairs would be reset for each outer SECTION.

E.g., instead of cloning valid_node_id_value_pairs, you could just create it outside a SECTION and use it inside different SECTIONs.

I actually thought about using SECTION but I didn't. Two reasons:

I don't want to rerun everything from the beginning

I want to test that the same schema tree is used across different log events (which have different life cycles)

But by using sections, you'd only run the things outside the section(s) from the beginning, right?

Testing the same instance of the schema tree across all of the tests seems a bit extra since it should be const in the class anyway (and if it wasn't const, you're still relying on finding a bug in the implementation with a set of test cases that isn't exhaustive). Imo, having less code in the test is more valuable than testing that corner case.

From my reading, you could do something like:

Create schema tree

Section (optional): Test empty id-value pairs

Section (optional): Test mismatched types

Section (optional): Test valid id-value pairs

Section: Test duplicate key conflicts

Section: Test implicit key conflicts

Section (optional): Test out of bound node ID

That would allow you to at least get rid of the clone_node_id_value_pairs function (and make the code a bit easier to read).
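The reason no cloning helper is needed can be modeled in plain C++. The sketch below is not Catch2 code and not how Catch2 is implemented; `run_sections`, `Pairs`, and `Section` are hypothetical names that only mimic the observable behavior, namely that the shared setup is re-executed before each section, so one section's mutations never leak into the next.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

using Pairs = std::unordered_map<int, std::string>;
using Section = std::function<bool(Pairs&)>;

// Runs each section against freshly built pairs, mimicking how the code outside a section is
// re-run from the top for every section.
[[nodiscard]] inline auto run_sections(
        std::function<Pairs()> const& setup,
        std::vector<Section> const& sections
) -> std::vector<bool> {
    std::vector<bool> results;
    results.reserve(sections.size());
    for (auto const& section : sections) {
        auto pairs{setup()};  // Fresh state for this section
        results.push_back(section(pairs));
    }
    return results;
}
```

In a real test, the setup would be the code at the top of the TEST_CASE and each lambda would be the body of a SECTION.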

kirkrodrigues · 2024-08-15T20:21:41Z

components/core/src/clp/ffi/Value.hpp

-    auto operator=(Value const&) -> Value& = delete;
-
-    // Default move constructor and assignment operator
+    // Default copy/move constructor and assignment operator


Since everything except the constructor is default, we don't need to say they're default explicitly, right?
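A quick way to check this claim is with type traits. In the sketch below, `Text` is a hypothetical class that declares only a converting constructor; the compiler still provides the copy and move operations implicitly, so the explicit `= default` declarations are redundant for a class like this.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical class that declares nothing but a constructor.
class Text {
public:
    explicit Text(std::string text) : m_text{std::move(text)} {}

    [[nodiscard]] auto get() const -> std::string const& { return m_text; }

private:
    std::string m_text;
};

// The copy/move constructors and assignment operators are all implicitly declared.
static_assert(std::is_copy_constructible_v<Text>);
static_assert(std::is_copy_assignable_v<Text>);
static_assert(std::is_move_constructible_v<Text>);
static_assert(std::is_move_assignable_v<Text>);
```

This only holds while no destructor, copy operation, or move operation is user-declared; declaring any of them changes which of the others are generated.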

kirkrodrigues · 2024-08-15T20:21:52Z

components/core/src/clp/ir/EncodedTextAst.hpp

-    auto operator=(EncodedTextAst const&) -> EncodedTextAst& = delete;
-
-    // Default move constructor and assignment operator
+    // Default copy/move constructor and assignment operator


Ditto about default.

kirkrodrigues · 2024-08-15T20:23:20Z

components/core/tests/test-ffi_KeyValuePairLogEvent.cpp

+        REQUIRE_FALSE(result.has_error());
+
+        SECTION("Test duplicated key conflict on node #3") {
+            auto conflict_pairs{valid_node_id_value_pairs};


I thought the idea of using nested sections (besides breaking the code up) is that you wouldn't need to duplicate valid_node_id_value_pairs since it would start with a fresh state in each section, no? (You'd obviously call it something other than valid_node_id_value_pairs, but still.)

LinZhihao-723 added 15 commits July 26, 2024 01:04

Implement CLP string

0da3f44

Remove unused line

141f954

Add value template file

f3525aa

Implement more helper templates...

bbd3a56

Implement the super type

b1fa61c

Merge from main

e148af7

Add unit tests

f41e5d6

Fix cmake

d29a17a

... class

12679f7

Implement kv pair class

534abf0

linting...

b7bed71

Update KeyValuePair implementation

88aa36a

Merge

92c3388

Add missing comments

698b406

Update doc string

bc565f5

LinZhihao-723 changed the title ~~[WIP] ffi: Add class for key-value pair log events.~~ ffi: Add class for key-value pair log events. Aug 8, 2024

Update comments

2393038

LinZhihao-723 mentioned this pull request Aug 10, 2024

ffi: Add support for serializing a KeyValuePairLogEvent as a nlohmann::json object. #512

Merged

kirkrodrigues requested changes Aug 14, 2024

LinZhihao-723 and others added 4 commits August 13, 2024 21:47

Apply suggestions from code review

285aab3

Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>

Apply code review comments and add unit tests

7623d76

Reorder a little bit...

ef0dd80

Fix macos build

54ec36a

haiqi96 reviewed Aug 14, 2024

Use const shared ptr and rely on type conversion

941fcbc

kirkrodrigues requested changes Aug 15, 2024

LinZhihao-723 and others added 3 commits August 15, 2024 14:06

Apply suggestions from code review

6623b5a

Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>

Apply review comments to the src file

3ae389a

Apply comments on the unit tests

a066c67

kirkrodrigues requested changes Aug 15, 2024

LinZhihao-723 and others added 2 commits August 15, 2024 16:39

Apply suggestions from code review

353dccd

Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>

Apply code review comments

09dd01a

LinZhihao-723 requested a review from kirkrodrigues August 15, 2024 20:42

LinZhihao-723 added 2 commits August 15, 2024 16:50

Remove default declaration since they are not implicitly declared alr…

83ae4e0

…eady

Rename

7cd1ea7

kirkrodrigues previously approved these changes Aug 15, 2024

Fix the error

6dd95a6

LinZhihao-723 dismissed kirkrodrigues’s stale review via 6dd95a6 August 15, 2024 23:18

kirkrodrigues approved these changes Aug 15, 2024

LinZhihao-723 merged commit 1b67fdb into y-scope:main Aug 15, 2024
12 of 13 checks passed

jackluo923 pushed a commit to jackluo923/clp that referenced this pull request Dec 4, 2024

ffi: Add class for key-value pair log events. (y-scope#507)

aad78ce

Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
