Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: BatchIt #677

Merged
merged 19 commits into from
Sep 23, 2024
Merged

WIP: BatchIt #677

merged 19 commits into from
Sep 23, 2024

Conversation

nwf
Copy link
Collaborator

@nwf nwf commented Sep 18, 2024

This recreates the current contents of #637 from a different repository, since the lifetime of the @nwf-msr account is now pretty limited. (I couldn't find a way to move branches between repos in a way that swung the PR, so had to settle for this low-tech solution.)

Copy link
Member

@mjp41 mjp41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great stuff, @nwf-msr and @nwf. A few minor things, but it is really clean. Thanks for making it so easy to review.

src/snmalloc/ds/allocconfig.h Outdated Show resolved Hide resolved
src/snmalloc/mem/corealloc.h Outdated Show resolved Hide resolved
src/snmalloc/mem/corealloc.h Outdated Show resolved Hide resolved
src/snmalloc/mem/corealloc.h Outdated Show resolved Hide resolved
src/snmalloc/mem/metadata.h Outdated Show resolved Hide resolved
src/snmalloc/mem/remoteallocator.h Outdated Show resolved Hide resolved
src/snmalloc/mem/remoteallocator.h Show resolved Hide resolved
src/snmalloc/mem/remoteallocator.h Show resolved Hide resolved
src/snmalloc/mem/remotecache.h Show resolved Hide resolved
src/snmalloc/mem/remotecache.h Show resolved Hide resolved
@nwf
Copy link
Collaborator Author

nwf commented Sep 21, 2024

Pushing with some changes applied to get CI's opinion. Will address the rest of your review comments soon.

nwf and others added 8 commits September 21, 2024 15:57
Unlike its brethren, `dealloc_local_object` and
`dealloc_local_object_slow`, the `dealloc_local_object_slower` method
does not take a pointer to free space.  Make this slightly more apparent
by renaming it and adding some commentary to both definition and call
site.
Make both _fast() and _slow() arms take the meta as an argument; _meta()
already did.
Plumb its use around remoteallocator and remotecache
This prepares the recipient to process a batched message.
Exercise recipient machinery by having the senders collect adjacent frees to
the same slab into a batch.
@nwf
Copy link
Collaborator Author

nwf commented Sep 22, 2024

OK, I think that's all the review comments.

Also, huh, I guess my time at MS using this account (what a mistake) reset the CLA bot, which I appeased all the way back in #41 (how time flies!).
I think the right thing to say here is:

@microsoft-github-policy-service agree

@nwf
Copy link
Collaborator Author

nwf commented Sep 22, 2024

I'm unsure what I could have done to make Windows unhappy, but that suggest I've broken something subtle. :(

I no longer have a Windows machine on which to test things; can I ask you to have a look, @mjp41?

Comment on lines 518 to 532
if (SNMALLOC_LIKELY(_batch < _meta->needed()))
{
// Will not hit threshold for state transition
_meta->needed() -= _batch;
return false;
}

/*
* If we've already taken a step and there are no more unreturned
* objects, then we don't need to take another step. It can happen
* that we get here because _meta->needed() is still 0, tho', so
* bail.
*/
if (!first && _batch == 0)
return false;
Copy link
Member

@mjp41 mjp41 Sep 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure what I could have done to make Windows unhappy, but that suggest I've broken something subtle. :(

I no longer have a Windows machine on which to test things; can I ask you to have a look, @mjp41?

I haven't tried on Windows yet. But just inspecting the CI, I am wondering if it is valid to read needed after we have performed the last meta data update. Could another thread have reused the meta-data by this point or turned the page behind the meta-data off? Perhaps, we need to use the batch size first in the cases where we can. Which is effectively what your original code did as it would only look another time if it was non-zero.

Suggested change
if (SNMALLOC_LIKELY(_batch < _meta->needed()))
{
// Will not hit threshold for state transition
_meta->needed() -= _batch;
return false;
}
/*
* If we've already taken a step and there are no more unreturned
* objects, then we don't need to take another step. It can happen
* that we get here because _meta->needed() is still 0, tho', so
* bail.
*/
if (!first && _batch == 0)
return false;
/*
* Batch is non-zero for non-first iterations, so only check if it isn't the
* first call.
*/
SNMALLOC_ASSERT(!first || (batch != 0));
if (!first && _batch == 0)
return false;
if (SNMALLOC_LIKELY(_batch < _meta->needed()))
{
// Will not hit threshold for state transition
_meta->needed() -= _batch;
return false;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, updated the suggestion above to fix my failure to do basic logic.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's a good point. Lemme apply that fix.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that SNMALLOC_ASSERT is right, though. The thing the new condition is guarding against is precisely the case that we have taken the first step (and enough subsequent steps) to have returned all objects.

Copy link
Member

@mjp41 mjp41 Sep 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm a bit confused why the first one would be empty? Perhaps, applying de-Morgan to the assert would be helpful:

   SNMALLOC_ASSERT(!(first && batch == 0));

This holds when I build locally.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I just couldn't read. I'll add the assert.

@nwf
Copy link
Collaborator Author

nwf commented Sep 22, 2024

Looks like that was it!

This might involve multiple (I think at most two, at the moment) transitions in
the slab lifecycle state machine.  Towards that end, return indicators to the
caller that the slow path must be taken and how many objects of the original
set have not yet been counted as returned.
We can, as Matt so kindly reminds me, go get them from the pagemap.  Since we
need this value only when closing a ring, the read from over there is probably
not very onerous.  (We could also get the slab pointer from an object in the
ring, but we need that whenever inserting into the cache, so it's probably more
sensible to store that locally?)
Move ring set bits and associativity knobs to allocconfig and expose them via
CMake.  If associtivity is zero, use non-batched implementations of the
`RemoteMessage` and `RemoteDeallocCacheBatching` classes.

By default, kick BatchIt on when we have enough room in the minimum allocation
size to do it.  Exactly how much space is enough is a function of which
mitigations we have enabled and whether or not we are compiling with C++20.

This commit reverts the change to `MIN_ALLOC_SIZE` made in "Introduce
RemoteMessage structure" now that we have multiple types, and zies, of
remote messages to choose from.
There's no need for a full pointer here, it'd just make the structure larger on
CHERI.
In order not to thwart `mitigations(random_preserve)` too much, if it's on in
combination with BatchIt, roll the dice every time we append to a batch to
decide if we should stochastically evict this batch.  By increasing the number
of batches, we allow the recipient allocator increased opportunity to randomly
stripe batches across the two `freelist::Builder` segments associated with each
slab.
@mjp41
Copy link
Member

mjp41 commented Sep 23, 2024

@nwf want to click the merge button?

@nwf nwf merged commit fb776da into microsoft:main Sep 23, 2024
52 checks passed
@nwf nwf deleted the batchit branch September 23, 2024 18:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants