Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vmm: Avoid making guest capabilities visible in the bhyve process context #2274

Open
wants to merge 7 commits into
base: dev
Choose a base branch
from

Conversation

markjdb
Copy link
Contributor

@markjdb markjdb commented Dec 19, 2024

No description provided.

lib/libvmmapi/vmmapi.c Outdated Show resolved Hide resolved
@@ -415,7 +415,7 @@ setup_memory_segment(struct vmctx *ctx, vm_paddr_t gpa, size_t len, char *base)
flags |= MAP_NOCORE;

/* mmap into the process address space on the host */
ptr = mmap(base + gpa, len, PROT_RW, flags, ctx->fd, gpa);
ptr = mmap(base + gpa, len, PROT_RW | PROT_CAP, flags, ctx->fd, gpa);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably broke this in d2f37a3. With this change we generally don't add capability permissions to non-anonymous mappings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, this is the intended behaviour?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it probably is. I think we probably believe that MAP_SHARED file mappings that need capability permissions will be sufficiently rare that requiring an explicit PROT_CAP is not a deal breaker in terms of compatibility.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MAP_SHARED mappings with should have to request PROT_CAP to ensure the developer means it as MAP_SHARED mappings in general present a way for capabilities to leak between address spaces where they may cause aliasing and temporal safety violations.

That does make me wonder if it's actually safe to sweep these mappings during revocation. If they only contain kernel pointers then it should be fine, but if they contain user pointers we might blow things up quite excitingly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I hadn't really considered that. This is a mapping of the guest physical address space, so I see no reason that we wouldn't encounter user pointers. We can't safely sweep this mapping, nor can we leave it alone with PROT_CAP set without breaking temporal safety within bhyve itself.

Since PROT_CAP is needed only for the gdb stub, at least so far, perhaps the solution is to ask the kernel to fetch the capability tag for us?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also avoids confused deputy attacks on the userspace part of the hypervisor by being able to inject capabilities via this mapping.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A vmmdev ioctl is fine, yes, that's what I meant by syscall, but that was poorly worded on my part.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a working implementation of such an ioctl now.

While working on this, I realized that the vm_get_register() and vm_get_register_set() ioctls should also be returning tags out-of-band to avoid undesirable interactions with the revoker.

It can be done by extending the structures in question, which would be a compatibility break. On arm64, those ioctls are only used by the gdb stub I believe - do you think it's worth adding compat shims given that the scope of the break will be fairly small?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need compat shims for adding tags, no. Given that the vCPU has to be stopped to query registers, you could probably get by with new vm_get_register_tag() and vm_get_register_tags() that just return packed bitmasks of tags. That would be less of a diff relative to upstream.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea, thanks. The PR now implements three ioctls and converts gdb.c to use them instead of exposing guest capabilities to the revoker.

lib/libvmmapi/vmmapi.c Outdated Show resolved Hide resolved
@markjdb markjdb force-pushed the dev-libvmmapi-prot-cap branch from e84bc1d to ebb6b74 Compare December 20, 2024 14:30
This is needed by the gdb stub when loading a capability from guest
memory.  The capability itself is fetched via a mapping of the guest
address space, but this mapping necessarily clears tags, as otherwise
it'd be subject to revocation sweeps.
The mapping of the guest address space clears tags, so we need to fetch
them out of band.
The VM_GET_REGISTER and VM_GET_REGISTER_SET ioctls return guest register
values; on arm64 they are used only by bhyve's gdb stub.

The ioctls may return valid capabilities, which is a mistake since they
belong to a different address space.  Modify the ioctl handlers to strip
capability tags before returning to userspace.

Since the gdb stub still wants the tag values, provide two new ioctls
for this purpose: VM_GET_REGISTER_CHERI_CAPABILITY_TAG and
VM_GET_REGISTER_CHERI_CAPABILITY_TAG_SET.  They are not atomic, but for
practical purposes the target vCPU will be stopped anyway, and this lets
us side-step compatibility concerns.
The vm_get_register() and vm_get_register_set() functions now always
return capabilities with tags cleared.  Use new libvmmapi functions to
fetch tags where necessary.
@markjdb markjdb force-pushed the dev-libvmmapi-prot-cap branch from ebb6b74 to 85ec018 Compare January 3, 2025 19:27
@markjdb markjdb changed the title libvmmapi: Map the guest physical address space with PROT_CAP vmm: Avoid making guest capabilities visible in the bhyve process context Jan 3, 2025
uintcap_t *cap;
void *cookie;

if ((vt->gpa & sizeof(*cap) - 1) != 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!__is_aligned(vt->gpa, sizeof(*cap)?

@@ -1570,8 +1570,9 @@ _vm_gpa_hold(struct vm *vm, vm_paddr_t gpa, size_t len, int reqprot,
void * __capability gpap;

#if __has_feature(capabilities)
gpap = cheri_setaddress(vmm_gpa_root_cap,
trunc_page(gpa));
gpap = cheri_setbounds(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could presumably be cheri_setboundsexact()? I prefer *exact() where possible as you don't have to convince yourself there's no padding involved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants