
fix: properly track json memory usage #3641

Merged · merged 15 commits into main from kpr1 · Sep 18, 2024
Conversation

@kostasrim (Contributor) commented Sep 3, 2024:

There is a problem with how we track memory usage for the JSON family. Specifically, the expression below:

return zmalloc_size(u_.json_obj.json_ptr);

is wrong: it always returns the same fixed size (16) regardless of the actual memory used by the JSON object. It returns the size of the block pointed to by json_ptr, but it does not account for the objects it allocates internally (its internal pointers). For that to happen we would need to track the JSON object's memory somehow, but this is complicated (see comments below).

  • add JsonMemTracker
  • add logic based on MiMallocResource deltas that calculates json object usage
  • add test

Addresses #3661

@@ -1037,7 +1040,8 @@ size_t CompactObj::MallocUsed() const {

   if (taglen_ == JSON_TAG) {
     DCHECK(u_.json_obj.json_ptr != nullptr);
-    return zmalloc_size(u_.json_obj.json_ptr);
+    auto* ptr = u_.json_obj.json_ptr;
+    return zmalloc_usable_size(ptr) + ptr->capacity();
@kostasrim (Contributor, Author) commented Sep 3, 2024:

This is also wrong. ptr->capacity() does not return the actual memory consumed in bytes, so clearly we have a problem. The main issues are:

  1. There does not seem to be a function in jsoncons that returns the size in bytes used by the object.
  2. We can't approximate this easily, since JSON elements are polymorphic and the object does not use flat storage internally, so we cannot use zmalloc_usable_size() on an internal pointer.

We could in theory write a visitor that iterates over all of its elements and aggregates their sizes, but that would be extremely slow, especially when the INFO MEMORY command is used, because then we would need to iterate over all the JSON keys in the data store -- obviously a no-no.

And we also don't really control json since it's an external library.

So my idea is to wrap class MiMemoryResource and introduce a MiJsonMemoryResource. This allocator will use the same heap as CompactObject::memory_resource(), with one difference: it will have a member that keeps track of how much memory it has allocated/deallocated. Once a JSON operation finishes, we will reset this member (so that subsequent operations on different JSON keys get accurate memory usage) and add/subtract it from a size field we will introduce in JsonWrapper. That way, when we ask for memory we will return this field without calculating anything, and memory consumption will be properly tracked by the allocator, eliminating the need for manual tracking of an object that belongs to a library we don't own.

WDYT? @chakaz @adiholden @romange

Collaborator commented:

Generally, using a custom memory resource for JSON sounds like a good solution. Indeed it would just call the data heap so all data will be in the same place as it was before.

This however won't play very nicely with the way we compute object memory today, and in many places we want to know the size of a specific object. So I wonder, perhaps we want to have a memory_resource per JSON object? It has the cost of 2 pointers per object, so not sure if it's worth it..

@kostasrim (Contributor, Author) commented Sep 4, 2024:

> know the size of a specific object.

Yep, and that's why we will introduce a single size_t per JSON object in CompactObj. So the overhead will only be one element holding the size (which is fine, because we need a way to track memory for JSON objects).

@@ -696,6 +697,8 @@ void CompactObj::SetJson(JsonType&& j) {
SetMeta(JSON_TAG);
u_.json_obj.json_len = j.size();
u_.json_obj.json_ptr = AllocateMR<JsonType>(std::move(j));
DCHECK(std::pmr::polymorphic_allocator<char>{memory_resource()} ==
@kostasrim (Contributor, Author) commented:

Bingo.

2024-09-03T16:48:59.0738486Z 60: F20240903 16:48:59.073283 22070 compact_object.cc:700] Check failed: std::pmr::polymorphic_allocator<char>{memory_resource()} == u_.json_obj.json_ptr->get_allocator() 
2024-09-03T16:48:59.0740855Z 60: *** Check failure stack trace: ***
2024-09-03T16:48:59.0761492Z 60:     @     0x564941768dbb  google::LogMessage::Fail()
2024-09-03T16:48:59.0781948Z 60:     @     0x564941768d02  google::LogMessage::SendToLog()
2024-09-03T16:48:59.0802555Z 60:     @     0x5649417684f7  google::LogMessage::Flush()
2024-09-03T16:48:59.0824658Z 60:     @     0x56494176c20e  google::LogMessageFatal::~LogMessageFatal()
2024-09-03T16:48:59.0869710Z 60:     @     0x5649412c6855  dfly::CompactObj::SetJson()
2024-09-03T16:48:59.0895859Z 60:     @     0x564940be862b  dfly::RdbLoaderBase::OpaqueObjLoader::HandleBlob()
2024-09-03T16:48:59.0930073Z 60:     @     0x564940be4b97  dfly::RdbLoaderBase::OpaqueObjLoader::operator()()
2024-09-03T16:48:59.0972643Z 60:     @     0x564940c0b783  std::__invoke_impl<>()
2024-09-03T16:48:59.1012552Z 60:     @     0x564940c05aaa  std::__invoke<>()
2024-09-03T16:48:59.1059136Z 60:     @     0x564940bffb53  std::__detail::__variant::__gen_vtable_impl<>::__visit_invoke_impl()
2024-09-03T16:48:59.1091480Z 60:     @     0x564940bffb96  std::__detail::__variant::__gen_vtable_impl<>::__do_visit_invoke()

memory_resource was not propagated.

@kostasrim (Contributor, Author) commented Sep 3, 2024:

From my understanding, the allocator used should have been CompactObject::memory_resource(). I haven't checked this specific case, but based on the semantics of the json family this should be the case. Let me know if I am missing something.

Collaborator commented:

If you think this is a problem with jsoncons, I'd try to create a simple standalone example of it and report it in their repo.
We've already reported similar issues before, and Daniel (the maintainer) has always helped us.

@kostasrim kostasrim self-assigned this Sep 3, 2024
@romange (Collaborator) commented Sep 5, 2024:

In the worst case we can handle JSON differently from the other objects by using a separate MiMemoryResource with the same data heap. MiMemoryResource counts used memory, so we will know the aggregate of all JSON objects.

We won't know "memory usage" per key, but this is a much smaller problem.

@kostasrim (Contributor, Author) commented:

> In the worst case we can handle json differently from the other objects by using a separate MiMemoryResource with the same data heap. MiMemoryResource will count used memory, so will know the aggregate of JSON objects.
>
> We won't know "memory usage" but this is much smaller problem.

That's the plan and it's almost ready. But I found bugs in their JSON implementation: it does not propagate allocators properly. I am patching this and will raise an issue on the jsoncons repo.

Signed-off-by: kostas <kostas@dragonflydb.io>
struct JsonConsT {
JsonType* json_ptr;
// Owning and must be explicitly deallocated
MiMemoryResource* mr;
@kostasrim (Contributor, Author) commented:

We don't really need this but I want to keep the DCHECKS. If in the future we decide we need an extra 8 bytes for JsonConsT, we can remove this 🕺

-    return decoder.get_result();
+    auto res = decoder.get_result();
+    // This should not trigger once pmr is fixed in jsoncons
+    DCHECK(res.get_allocator().resource() == mr);
@kostasrim (Contributor, Author) commented:

we are blocked on this until it's addressed by jsoncons

Collaborator commented:

so add an issue in the comment and comment out the DCHECK I guess?

@kostasrim (Contributor, Author) commented:

That will cause a memory leak -- we will lose this mr. Maybe we can release it here, but then the tracking will be wrong. :/

struct JsonConsT {
JsonType* json_ptr;
// Owning and must be explicitly deallocated
MiMemoryResource* mr;
Collaborator commented:

I am not sure why you keep MiMemoryResource* mr; inside each CompactObject.

  1. JsonType already has allocator in it
  2. All json objects will use the same MiMemoryResource instance unless I am missing something

@kostasrim (Contributor, Author) commented:

I know about this. I keep it because I want to DCHECK the hell out of it. It just breaks with pmr on so many levels that even if we rolled this into production I would add a LOG(WARNING) to alert us that the pmr went brrr.

@@ -14,12 +14,17 @@
#include <jsoncons_ext/jsonpath/jsonpath.hpp>
#include <jsoncons_ext/jsonpointer/jsonpointer.hpp>
#include <jsoncons_ext/mergepatch/mergepatch.hpp>
#include <memory>
#include <memory_resource>
Collaborator commented:

Please do not include <memory_resource> directly, and in general, can you please disable the include-what-you-use rule in your environment? It's too aggressive.

}
pv.SetJsonSize(net_positive, zero_diff, diff);
// Under any flow we must not end up with this special value.
DCHECK(pv.MallocUsed() != 0);
@kostasrim (Contributor, Author) commented:

Maybe worth adding LOG(ERROR) for that case so if it ever happens we get an alert.

@kostasrim (Contributor, Author) commented:

I tested this locally, but I am unsure how I can write explicit tests for memory usage that will work on CI. My assumption is that different platforms won't have the exact same memory usage, but maybe I am wrong. WDYT @romange @chakaz

@kostasrim kostasrim changed the title [Do not review -- PR only for discussion] fix: properly track json memory usage fix: properly track json memory usage Sep 12, 2024
@@ -50,6 +52,11 @@ size_t QlMAllocSize(quicklist* ql) {
return res + ql->count * 16; // we account for each member 16 bytes.
}

uint32_t JsonEnconding() {
static bool is_experimental = absl::GetFlag(FLAGS_experimental_flat_json);
@kostasrim (Contributor, Author) commented Sep 12, 2024:

maybe worth making it thread_local, since the compiler injects a mutex on initialization to make the static thread-safe 🤷

Collaborator commented:

I'd do that, why not?

@@ -50,6 +52,11 @@ size_t QlMAllocSize(quicklist* ql) {
return res + ql->count * 16; // we account for each member 16 bytes.
}

uint32_t JsonEnconding() {
static bool is_experimental = absl::GetFlag(FLAGS_experimental_flat_json);
Collaborator commented:

Suggested change:
- static bool is_experimental = absl::GetFlag(FLAGS_experimental_flat_json);
+ static bool is_flat = absl::GetFlag(FLAGS_experimental_flat_json);


Comment on lines 721 to 728
if (u_.json_obj.cons.bytes_used == 0) {
u_.json_obj.cons.bytes_used = size;
} else if (net_positive) {
u_.json_obj.cons.bytes_used += size;
} else {
DCHECK(u_.json_obj.cons.bytes_used >= size);
u_.json_obj.cons.bytes_used -= size;
}
Collaborator commented:

Why not accept size (call it delta) as some signed type (int64_t or whatever), and then just do u_.json_obj.cons.bytes_used += delta;?
Why all the ifs and extra parameter?

Collaborator commented:

Also specifically this first if is unneeded right?

@kostasrim (Contributor, Author) commented:

Yes, I agree we don't need the first case. Furthermore, I was afraid of overflow/underflow, but we do the exact same thing you suggest in AccountObjectMemory, so I guess it should not be an issue. Delta also sounds simpler. I will apply this.

@@ -445,13 +448,21 @@ class CompactObj {
};
} __attribute__((packed));

struct JsonConsT {
JsonType* json_ptr;
size_t bytes_used;
Collaborator commented:

why not reduce it to uint32_t and keep the encoding inside the object?

@kostasrim (Contributor, Author) commented:

uint32_t limits us to 4GB. It's extreme to expect any object to take more than that, but you never know. As for the encoding, I don't think it makes much sense to store it: it's fixed once set, and that's also one of the reasons I refactored it.

Collaborator commented:

we can also use uint48:=uint8_t[6]

Collaborator commented:

The reason for storing it per object is that I expect we will want a heuristic for the encoding at some point, managed on a per-object basis, like we do with hash maps, sets, etc. Flat is awful for mutable operations, for example. We don't have to do it now, though.

@kostasrim (Contributor, Author) commented:

I wanted to pack this from the beginning, and in fact I already mentioned to @adiholden that if in the future we want more space in the JSON we store within a compact object, we could get some. However, I thought size_t suffices for now, and until we actually need this there is no reason to take the bitfield route.

@kostasrim (Contributor, Author) commented Sep 13, 2024:

Also, if we ever want a heuristic, how many bits would that require? 4-8 max? For that use case we would just make bytes_used (and the length for flat json) a bitfield and mask off the last n bits we want to use for the encoding. It would be a simple change that would serve our need.

What I am trying to say is that I already thought about us needing a few more bits in the future, and it's more than doable :)

};
uint32_t json_len = 0;
Collaborator commented:

why not keep both fields common for both encodings?

@kostasrim (Contributor, Author) commented:

IMO len is how many characters; the other one is how many bytes. I find it a little clearer 🤷

// It uses RAII to set the memory usage of json objects.
class JsonMemTracker {
public:
explicit JsonMemTracker(const OpArgs& args, std::string_view key, bool op_set = false)
Collaborator commented:

please refrain from using default arguments

@kostasrim (Contributor, Author) commented:

sure will remove

Collaborator commented:

reminder :)

explicit JsonMemTracker(const OpArgs& args, std::string_view key, bool op_set = false)
: op_args_(args), key_(key), op_set_{op_set} {
start_size_ = static_cast<MiMemoryResource*>(CompactObj::memory_resource())->used();
auto it_res = op_args_.GetDbSlice().FindMutable(op_args_.db_cntx, key_, OBJ_JSON);
Collaborator commented:

we now call Find 3 more times?

@kostasrim (Contributor, Author) commented:

2 more times in the worst case (when the json does not exist). Otherwise it's one.

Collaborator commented:

is it possible to use a similar pattern with the already existing iterator? i.e. avoid the additional finds?

@kostasrim (Contributor, Author) commented Sep 13, 2024:

That's a good question. The short answer is yes, we can, but it would clutter the code and I wanted to avoid that (unless what I do here is a huge regression on these flows, which IMO is not the case, as we already do this in a few places -- OpMerge). Take for example JSON.SET and its implementation. It has two main flows:

  1. If the path is the root element, call FindReadOnly and then SetJson, which calls AddOrFind().
  2. If the path is not root, call UpdateEntry().

The thing is that we need to track the memory from the start, at the moment we call a Find variant, because it could be that we replace the old json with a new one (case 1 above), and then we need to track the memory as (mem_of_prev_json + new_mem_delta), where new_mem_delta can be negative.

So if I replaced the RAII class I introduced to avoid the Find at the start of a json operation, we would need to carefully go over the json_family functions and explicitly track the starting memory after we call Find on each flow, which would be really error-prone. I understand that we don't have many paths like this, since most operations are built around UpdateEntry, but there is also another issue.

To further complicate things, we usually call ShardJsonFromString and std::move() its result around in the interesting flows until we either store it as is, or use it to update existing jsons. This is the json item whose memory we need to track. The problem is that we construct this item before we call Find on the flows I described above, but we need to know the memory before we construct this item.

So, to fully answer: maybe we could avoid those finds, but that would require a complicated refactor that seems cumbersome, as it requires explicit control of memory tracking around each json flow.

Collaborator commented:

Can we modify JsonMemTracker to use the compact object directly (i.e. not to call Find()) and modify UpdateEntry() to use it?
And even if I missed some call that it won't work for, we can code around that instead, because it does seem like all (the majority?) of calls use it

@@ -50,6 +52,8 @@ size_t QlMAllocSize(quicklist* ql) {
return res + ql->count * 16; // we account for each member 16 bytes.
}

const thread_local uint32_t kJsonEnconding = absl::GetFlag(FLAGS_experimental_flat_json);
Collaborator commented:

This won't work, I'm afraid.
Globals are initialized before main() runs, not lazily. This means it would run before flags parsing.
The easiest workaround IMO would be to wrap this in a function and declare the thread_local variable inside it. That would evaluate it lazily.

@kostasrim (Contributor, Author) commented:

Yes, you are right. For some reason I always forget about the order of initialization for globals. Same function as I had before, just with a thread_local inside it :)


@chakaz (Collaborator) left a comment:

Generally LG! Please see some last nits

@@ -50,6 +52,11 @@ size_t QlMAllocSize(quicklist* ql) {
return res + ql->count * 16; // we account for each member 16 bytes.
}

uint32_t JsonEnconding() {
static thread_local uint32_t json_enc = absl::GetFlag(FLAGS_experimental_flat_json);
Collaborator commented:

Suggested change:
- static thread_local uint32_t json_enc = absl::GetFlag(FLAGS_experimental_flat_json);
+ static thread_local uint32_t json_enc = absl::GetFlag(FLAGS_experimental_flat_json) ? kEncodingJsonFlat : kEncodingJsonCons;

@@ -36,7 +38,7 @@
 ABSL_FLAG(bool, jsonpathv2, true,
           "If true uses Dragonfly jsonpath implementation, "
           "otherwise uses legacy jsoncons implementation.");
-ABSL_FLAG(bool, experimental_flat_json, false, "If true uses flat json implementation.");
+ABSL_DECLARE_FLAG(bool, experimental_flat_json);
Collaborator commented:

I'd expose the function instead of using the flag here directly

void SetJsonSize(PrimeValue& pv, bool is_op_set) {
const size_t current = static_cast<MiMemoryResource*>(CompactObj::memory_resource())->used();
int64_t diff = static_cast<int64_t>(current) - static_cast<int64_t>(start_size_);
// If the diff is 0 it means the object uses the same memory as before. No actio needed.
Collaborator commented:

Suggested change:
- // If the diff is 0 it means the object uses the same memory as before. No actio needed.
+ // If the diff is 0 it means the object uses the same memory as before. No action needed.

@kostasrim kostasrim enabled auto-merge (squash) September 18, 2024 12:45
@kostasrim kostasrim merged commit 6e45c9c into main Sep 18, 2024
12 checks passed
@kostasrim kostasrim deleted the kpr1 branch September 18, 2024 13:08
3 participants