Flamegraph generation memory usage is quite high for large inputs #201
That is interesting indeed. Are you seeing this problem with …? As far as the blow-up goes, a 1.4× blowup is unfortunate, though surprisingly good given how little attention has been paid to optimizing memory use; after all, the input file has to be in memory to sort it! I'm a little strapped for time, but if you want to do some digging, this article has some good tips on profiling memory use in Rust!
Pre-sorting might help in my case, yeah. But consider an input that looks like this:
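For illustration, a folded-stack input of the shape being described might look like this (hypothetical frame names and counts):

```
A;B;C 1000
A;B;D 2000
A;B;E 500
```

Each line is a semicolon-separated stack followed by a sample count; the leading frames A and B appear in every line.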
The strings A and B repeat. In fact, most of the memory used by the loaded lines is repeated data, which is what makes me think you can go not just from 1.4× to 1×, but plausibly to 0.1× or even better.
That's a neat idea. I wonder how well it'll turn out in practice though. Currently we store a single string
versus
Of course the benefits add up the more strings there are, but if the strings are generally short (like
Going to try to do this.
Or at least, try to find some way to reduce memory usage; I'm seeing hundreds of MB of memory use.
It's possible that it's easier for me to just do this on my side, where each unique frame text is mapped to a unique Unicode character, and then I search-and-replace on the resulting SVG. That feels terrible, but is plausibly less work. Will think about it some more as I read the code.
After further thought, I'm going to go back and see why my files are so big; the input file sizes do seem excessive even for a worst-case scenario.
Memory usage for inferno-flamegraph, and the equivalent Rust API usage, is proportional to input file size. A 44MB file results in 60MB of memory usage for me; a 3KB input file results in 3MB (presumably the minimum). I discovered this when processing a 440MB file, which resulted in hundreds of MB of RAM usage, which is embarrassing when one is implementing a memory profiler 😁 So now I'm prefiltering out tiny irrelevant frames, which is why it's 44MB and not 440MB. Still, less memory usage would be nice.
Now, the output file is typically more like a megabyte or less, because all those repeating frames in the input file get combined into a graph in the output. So it ought to be possible to reduce memory usage quite a lot in the internal representation as well.
My completely unverified guess as to the problem: my input files have quite long strings for frame names, and multiple copies of each string are being stored in memory as the data structures are built up. If so, using a string interner in the right place might help quite a lot, and could even speed up runtime because more data would fit in the CPU caches.