More API updates for 2.0 #10317

Open
wants to merge 6 commits into
base: main
j-xiong
Contributor

@j-xiong j-xiong commented Aug 19, 2024

Reworked the dynamic MR commit and expanded the PR to include more pending 2.0 API changes:

  • core: Move flags only used for memory registration calls to fi_domain.h
  • core: Introduce Sub-MR
  • core: Define flag for single use MR
  • core: Define capability bit for tagged multi receive
  • core: Define capability for tagged message only directed recv
  • core: Define capability for directed receive without wildcard src_addr

Member

@shefty shefty left a comment

Stopping review to move discussion to PR

@shefty
Member

shefty commented Aug 20, 2024

This change is kind of like memory windows, but not. It requires that the provider support keys of 64 bits or smaller. And it's actually combining 2 concepts.

FI_MR_SINGLE_USE is really independent and could apply to existing registrations. There's no way for a provider to indicate that it supports single use regions, outside of just trying the flag in a call and seeing if it works. I don't know if this should be a domain capability, but maybe...

The dynamic keys are basically trying to create a new MR object that references the same pinned pages. That's equivalent to a memory window. However, MWs are created by posting to a QP/EP, whereas this is an MR/domain operation. In either case, the user should be given a fid_mr structure here, not a u64 key. That allows integration with the other fi_mr calls (map, unmap, bind, refresh, enable).

This is probably doable by adding fid_mr to the fi_mr_attr, uhm, somewhere. The user can just create/destroy the extra MRs through the existing calls, with the same capabilities/restrictions that the provider has for other MRs (user or provider selected keys).

I don't know if we would actually need a new capability in the latter case. A provider could always perform a second registration, in which case, saving on the page pinning is simply an optimization.

@j-xiong
Contributor Author

j-xiong commented Aug 20, 2024

@shefty Thanks for the feedback. Yes, single use and dynamic key assignment did come up as two separate issues, but I combined them here on the assumption that single use could be simpler with a key instead of an entire MR. I agree that this is mostly equivalent to memory windows. One consideration about bulk key allocation is that it may avoid kernel involvement when the key is assigned (in case the provider doesn't allow user-selected keys). But how much can be gained from that is unclear. I am going to rework the patch, maybe along the lines of a more MW-like approach.

@shefty
Member

shefty commented Aug 21, 2024

Bulk key allocation could be defined as an attribute when the original MR is created. I agree that a single use MR is less useful than a single use window/key.

Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
@j-xiong j-xiong changed the title from "core: Introduce MR with dynamic keys" to "API updated for 2.0" Sep 30, 2024
@j-xiong
Contributor Author

j-xiong commented Sep 30, 2024

PR updated. The first three commits cover the rework of the original dynamic MR proposal. Four more 2.0 API related commits are added. PR title updated to reflect the new scope.

@j-xiong
Contributor Author

j-xiong commented Oct 1, 2024

PR updated to address comments.

Memory registration consists of two parts: mapping/pinning the memory for
local access and exporting it with a key for remote access. The first part
is usually heavyweight and requires kernel involvement. The second part is
less expensive and can be further separated into key allocation and key
assignment. Key allocation may need kernel involvement, but key assignment
can be done in user space. Here sub-MR is introduced as a way to allow
separation of the aforementioned two parts, and key reservation is added
to further optimize sub-MR creation.

A sub-MR is created from an existing MR (the base MR). It inherits the
memory mapping/pinning of the base MR but has its own access key. The
address range exposed can be the same as that of the base MR or a subrange
of it. The access rights can be different, too.

Now the base MR can be created with a few extra keys reserved. These reserved
keys will be automatically used for sub-MR registration. This only applies
to FI_MR_PROV_KEY mode.

Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
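
The sub-MR and key reservation ideas in this commit message can be modelled in a few lines of plain C. This is only a conceptual sketch, not the libfabric API: every `sketch_*` name, the struct layouts, and the fixed-size key pool are assumptions made for illustration. It shows the two properties the message describes: a sub-MR must lie within the base MR's pinned range, and it takes its key from the keys reserved when the base MR was created.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_RESERVED 4   /* hypothetical pool size, illustration only */

/* Hypothetical model of a base MR: a pinned range plus keys reserved up front. */
struct sketch_base_mr {
    uintptr_t addr;                               /* start of pinned range */
    size_t len;                                   /* length of pinned range */
    uint64_t access;                              /* access rights of the base MR */
    uint64_t reserved_keys[SKETCH_MAX_RESERVED];  /* keys reserved at creation */
    size_t reserved_count;                        /* reserved keys still unused */
};

/* Hypothetical model of a sub-MR: shares the pinning, owns its key. */
struct sketch_sub_mr {
    const struct sketch_base_mr *base;
    uintptr_t addr;
    size_t len;
    uint64_t access;   /* may differ from the base MR's access rights */
    uint64_t key;
};

/*
 * Create a sub-MR covering [addr, addr + len) of the base MR.
 * Returns 0 on success, -1 if the range falls outside the base MR or no
 * reserved key is left. (In this sketch an exhausted pool is an error; a
 * real provider could instead fall back to allocating a fresh key.)
 */
int sketch_sub_mr_create(struct sketch_base_mr *base, uintptr_t addr,
                         size_t len, uint64_t access,
                         struct sketch_sub_mr *out)
{
    /* Overflow-safe containment check. */
    if (addr < base->addr || len > base->len ||
        addr - base->addr > base->len - len)
        return -1;

    if (base->reserved_count == 0)
        return -1;

    out->base = base;
    out->addr = addr;
    out->len = len;
    out->access = access;
    /* Key assignment is a pure user-space operation in this model. */
    out->key = base->reserved_keys[--base->reserved_count];
    return 0;
}
```

No pinning or key allocation happens in `sketch_sub_mr_create`, which is the point of separating the two parts: the expensive steps are paid once on the base MR.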
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Currently FI_MULTI_RECV is effectively defined for untagged messages
only. Simply expanding the definition to tagged messages would cause
difficulties in either provider support or discovery.

Define FI_TAGGED_MULTI_RECV to indicate that multi recv is supported for
tagged messages as well. This is only used as a capability bit. The op
flag and cq flag continue to use FI_MULTI_RECV.

Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
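
A minimal sketch of how an application might gate multi recv on these capability bits, assuming the two bits are checked independently per message type. The `SKETCH_*` names and bit positions are placeholders chosen for illustration; the actual definitions are the ones in include/rdma/fabric.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions, illustration only (not the fabric.h values). */
#define SKETCH_MULTI_RECV         (1ULL << 0)  /* stands in for FI_MULTI_RECV */
#define SKETCH_TAGGED_MULTI_RECV  (1ULL << 1)  /* stands in for FI_TAGGED_MULTI_RECV */

enum sketch_msg_kind { SKETCH_UNTAGGED, SKETCH_TAGGED };

/* True if the granted capabilities allow multi recv for this message kind. */
bool sketch_multi_recv_allowed(uint64_t caps, enum sketch_msg_kind kind)
{
    uint64_t needed = (kind == SKETCH_TAGGED) ? SKETCH_TAGGED_MULTI_RECV
                                              : SKETCH_MULTI_RECV;
    return (caps & needed) != 0;
}

/*
 * The new bit is a capability only: the flag passed with the receive
 * operation is the existing multi recv flag for both message kinds.
 */
uint64_t sketch_multi_recv_op_flag(enum sketch_msg_kind kind)
{
    (void) kind;
    return SKETCH_MULTI_RECV;
}
```

A provider that only supports untagged multi recv would then report the first bit alone, and an application can discover that without trial and error.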
FI_DIRECTED_RECV covers both untagged and tagged messages. However, the
most common use case is tagged messages. Having a separate bit for tagged
messages allows the provider to optimize the untagged message
implementation while maintaining support for directed recv over tagged
messages.

Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
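
The relationship between the two bits can be sketched as a small predicate, assuming FI_DIRECTED_RECV implies source filtering for both message types and the new bit covers tagged messages only. The `SKETCH_*` names and bit positions are placeholders, not the values from fabric.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions, illustration only (not the fabric.h values). */
#define SKETCH_DIRECTED_RECV         (1ULL << 0)  /* stands in for FI_DIRECTED_RECV */
#define SKETCH_TAGGED_DIRECTED_RECV  (1ULL << 1)  /* stands in for FI_TAGGED_DIRECTED_RECV */

/* True if receives of the given kind are filtered by source address. */
bool sketch_directed_recv_applies(uint64_t caps, bool tagged)
{
    if (caps & SKETCH_DIRECTED_RECV)
        return true;   /* covers untagged and tagged messages */
    return tagged && (caps & SKETCH_TAGGED_DIRECTED_RECV) != 0;
}
```

With only the tagged bit set, the untagged receive path never has to look at the source address, which is the optimization the commit message refers to.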
The new bit FI_EXACT_DIRECTED_RECV is similar to FI_DIRECTED_RECV, but
requires an exact source address; i.e., the wildcard address
FI_ADDR_UNSPEC is not allowed.

It can be used alone, or together with FI_DIRECTED_RECV or
FI_TAGGED_DIRECTED_RECV as a modifier.

Not allowing a wildcard source address allows the provider to better
optimize the receive handling.

Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
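
The effect of the exact variant on posted receives can be sketched as follows. This is an illustrative model only: the `SKETCH_*` names, the bit position, and the use of `UINT64_MAX` as the wildcard are assumptions standing in for FI_EXACT_DIRECTED_RECV and FI_ADDR_UNSPEC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholders, illustration only (not the fabric.h definitions). */
#define SKETCH_EXACT_DIRECTED_RECV  (1ULL << 2)  /* stands in for FI_EXACT_DIRECTED_RECV */
#define SKETCH_ADDR_UNSPEC          UINT64_MAX   /* stands in for FI_ADDR_UNSPEC */

/* True if a receive may be posted with this src_addr under the given caps. */
bool sketch_src_addr_acceptable(uint64_t caps, uint64_t src_addr)
{
    if ((caps & SKETCH_EXACT_DIRECTED_RECV) && src_addr == SKETCH_ADDR_UNSPEC)
        return false;   /* wildcard source is not allowed in exact mode */
    return true;
}

/* Source matching for a posted receive: wildcard matches any sender. */
bool sketch_source_matches(uint64_t posted_src, uint64_t sender)
{
    return posted_src == SKETCH_ADDR_UNSPEC || posted_src == sender;
}
```

When every posted receive passes `sketch_src_addr_acceptable` in exact mode, the wildcard branch in `sketch_source_matches` is never taken, so a provider could, for example, keep one receive queue per source address instead of scanning a shared list.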