Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate data with zero allocs #554

Open
matheusd opened this issue Feb 27, 2024 · 10 comments
Open

Generate data with zero allocs #554

matheusd opened this issue Feb 27, 2024 · 10 comments

Comments

@matheusd
Copy link
Contributor

How can I generate/encode data with zero heap allocations?

Example use case: I want to generate data to write into a file, so I need to allocate a single structure, message and arena. I can pre-allocate the arena buffer. But all my attempts at writing a loop fail to make it zero alloc (either the arena grows or message escapes to heap).

Here's the set of test benchmarks: https://github.com/matheusd/capnptest01/blob/master/bench_test.go

And here are my results:

BenchmarkSetText01-7     6966074               174.9 ns/op            99 B/op          0 allocs/op
BenchmarkSetText02-7     1000000              1100 ns/op             352 B/op          5 allocs/op
BenchmarkSetText03-7     2315973               517.9 ns/op            72 B/op          2 allocs/op
BenchmarkSetText04-7     2661026               429.9 ns/op           260 B/op          0 allocs/op

While benchmarks 01 and 04 are shown with 0 allocs, that's actually just the amortized growth of the arena which ends up becoming large in those tests.

@lthibault
Copy link
Collaborator

If you call Release() on your message, the underlying buffers are returned to a sync.Pool. This is the recommended way to avoid allocations. I am not certain that there is a way to guarantee zero-allocation (un)marshalling, though the Arena API is designed to support it in principle.

@matheusd
Copy link
Contributor Author

matheusd commented Mar 2, 2024

AFAICT, my BenchmarkTest03() was supposed to do that (through the use of msg.Reset(arena)) and indeed, the arena itself does not grow, but some objects (looks to be the *Segment created inside Message.setSegment() ) still escape to the heap.

func BenchmarkSetText03(b *testing.B) {
	var msg capnp.Message
	arena := capnp.SingleSegment(nil)

	b.ReportAllocs()
	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		seg, err := msg.Reset(arena)
		if err != nil {
			b.Fatal(err)
		}

		tx, err := NewTransaction(seg)
		if err != nil {
			b.Fatal(err)
		}

		err = tx.SetDescription("my own descr")
		if err != nil {
			b.Fatal(err)
		}
	}

	// b.Log(arena.String())
}

profile001

@matheusd
Copy link
Contributor Author

matheusd commented Mar 2, 2024

I notice that API wise, msg.Reset() has an issue, where I can't be completely in control of the arena. msg.Reset() forcibly calls arena.Release which puts the arena back to the sync pool, even if I know that I'll reuse the arena without necessitating going through the arena sync pool.

This is easily solved by instantiating my own arena type that NOPs on Release().

@matheusd
Copy link
Contributor Author

matheusd commented Mar 2, 2024

Ok, so the first escape to heap is caused because the second time Message.Segment() is executed, m.segs != nil but len(m.segs) == 0, meaning the first segment (which is statically allocated) has been removed from the segs map, causing a new Segment to be initialized. The following diff fixes this issue:

diff --git a/message.go b/message.go
index 5ae7f3e..4bcceaa 100644
--- a/message.go
+++ b/message.go
@@ -100,6 +100,9 @@ func (m *Message) Release() {
 func (m *Message) Reset(arena Arena) (first *Segment, err error) {
 	m.capTable.Reset()
 	for k := range m.segs {
+		if k == 0 && m.segs[k] == &m.firstSeg {
+			continue
+		}
 		delete(m.segs, k)
 	}
 
@@ -113,6 +116,7 @@ func (m *Message) Reset(arena Arena) (first *Segment, err error) {
 		DepthLimit:    m.DepthLimit,
 		capTable:      m.capTable,
 		segs:          m.segs,
+		firstSeg:      Segment{msg: m},
 	}
 
 	if arena != nil {

I noticed that Message.allocSegment requires m.segs to be non-nil, so it seems to me that Reset should just directly create m.segs if it is nil and populate it with the first seg, removing the need to consider the two cases throughout the code and make it easier to reason about. What do you think?

If the previous diff is reasonable, I can send it as a proper PR.

@matheusd
Copy link
Contributor Author

matheusd commented Mar 2, 2024

Ok, to fix the second escape to heap I had to create a new Arena that does not use the bufferpool to manage its memory: https://github.com/matheusd/capnptest01/blob/fd2e71a57e5c1ccf8a42f28804ba8f85129002d9/arena.go#L35

This makes it possible to encode messages without any heap allocations (other than the initial buffer if correctly sized).

Is ManualSegmentArena is something useful to be contributed to go-capnp?

@lthibault
Copy link
Collaborator

@matheusd Both of your suggestions seem sensible to me, and a PR is most welcome. Please be aware that I am currently traveling, so I will be slower to respond/review than usual.

Looking forward to the PR :)

@matheusd
Copy link
Contributor Author

(No rush to respond, I'm aware you're travelling)

I've sent the first PR to address this (#555).

I previously wrote:

I noticed that Message.allocSegment requires m.segs to be non-nil, so it seems to me that Reset should just directly create m.segs if it is nil and populate it with the first seg, removing the need to consider the two cases throughout the code and make it easier to reason about

But the more I think about it, looking at a cpu profile with the heap allocs fixed, the more I think that segs should really be a function of (and therefore moved to) the Area implementations.

In particular, this would allow implementing an arena that forgoes the Message.mu protection, getting a performance boost at the expense of the caller having to ensure no concurrent access to the same message.

For example, in the following profile of my benchmark, Reset() is over 50% of the time while it's basically useless (because the size of the arena is accounted for, only a single segment will ever be used and access is all done in a single goroutine).

gocapnp02

Rewriting to use an Arena that does not use the map and mutex would make Reset() essentially free.

@lthibault
Copy link
Collaborator

Hey @matheusd Thanks for investigating this. Your proposal sounds reasonable. I don't think this will break any existing behavior ... wanna give it a shot?

@matheusd
Copy link
Contributor Author

Yeah, I've started work on this. I only work on this as time permits, so no hard deadline on when I'll submit, but I think I've got a rough high level design going already.

Thank you for the support!

@lthibault
Copy link
Collaborator

Absolutely no rush! I'm very, very glad you're able to contribute some cycles 😃
Please let me know if I can be of help!

@matheusd matheusd mentioned this issue May 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants