-
Notifications
You must be signed in to change notification settings - Fork 376
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve fabtest latency tests #10394
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fun change! Have you run with other providers? Which see an improvement and which see degradation?
not necessarily related to this change, but I notice with the fabtests you're requesting tcp;rxm. tcp does directly support rdm endpoints now, so this is not necessary and could remove some overheads if the rxm layering is not needed (it should likely only be needed for backward compatibility). additionally, if your mpi runs are not using the rxm layering (I'm not sure how you're requesting provider in that case) that could contribute to some differences as well. |
I have run with RxM + TCP and CXI providers. Both saw improvements with posting TX before RX. |
Thinking out loud..... The reason why I did not want to delete the existing logic is because I cannot guarantee the new logic will not introduce a regression for some providers (hence the -T arg). Do we want to consider removing old pingpong logic? |
We need to keep the old logic. The same code is also used by fi_msg_pingpong. Verbs requires preposting receive buffers w/o rxm.
From: iziemba ***@***.***>
Sent: Wednesday, September 18, 2024 11:27 AM
To: ofiwg/libfabric ***@***.***>
Cc: Xiong, Jianxin ***@***.***>; Review requested ***@***.***>
Subject: Re: [ofiwg/libfabric] Improve fabtest latency tests (PR #10394)
Thinking out loud..... The reason why I did not want to delete the existing logic is because I cannot guarantee the new logic will not introduce a regression for some providers (hence the -T arg). Do we want to consider removing old pingpong logic?
—
Reply to this email directly, view it on GitHub<#10394 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ACA7MLMZUVUKXPLXDZXG2ZTZXHAYFAVCNFSM6AAAAABOODBPXGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGNJZGE2DCMZQG4>.
You are receiving this because your review was requested.Message ID: ***@***.******@***.***>>
|
574c7ab
to
7110984
Compare
Posting the Tx prior to posting the Rx effectively hides the latency of posting the Rx, assuming the RTT is longer than the Rx posting time. If an Rx is posted prior to entering the pingpong loop, this should work for verbs without verbs hitting RNR NACK. |
Move the current pingpong logic into a pingpong pre-posted RX function. This better describes the behavior of this pingpong test. In addition, this change will allow for a pingpong without pre-posted RX buffers to be defined. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
The run() logic in rdm_pingpong and rdm_tagged_pingpong is the same. Consolidate this logic into a single run_pingpong() function. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Split out the ft_sync logic into two separate functions: inband sync (ft_sync_inband) and out-of-band sync (ft_sync_oob). The inband sync supports the option to conditionally repost buffers after the sync. The breaking out of the sync logic is needed to support fi_rdm_pingpong/fi_rdm_tagged_pingpong with a no pre-posted RX buffer option. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
The new pingpong test allows for TX operations to be posted and processed, if necessary, before post the RX buffer. This better aligns to how OSU latency works. By doing this, fi_rdm_tagged_latency is now lower than OSU latency which makes sense since less SW is involved. The no prepost RX pingpong test can be enabled by using the -r option. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
7110984
to
353d62a
Compare
@iziemba I am fixing Intel CI and will replay your PR when I am finished. Sorry for the delays |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
Added condition to make preposted rx buffers based on FT_OPT_NO_PRE_POSTED_RX in case of msg endpoint based init code in fabtests. Changes similar to ofiwg#10394 Signed-off-by: Nikhil Nanal <nikhil.nanal@intel.com>
This patch adds the missing inband sync in ft_fabric_init_cm to handle the case where rx buffers are not pre-posted by the application. The default behaviour in fabtests is to pre-post a rx buffer. This change enables fabtests using ft_fabric_init_cm to consume the posted receive with an inband sync by setting the test option FT_OPT_NO_PRE_POSTED_RX. Similar changes have been made to ft_init_fabric ofiwg#10394 Signed-off-by: Nikhil Nanal <nikhil.nanal@intel.com>
The issue with current fi_rdm_pingpong and fi_rdm_tagged_pingpong logic is reported latency is higher than running OSU latency with with MPI + libfabric. Because of this, these pingpong benchmarks are not particularly useful.
Looking at how OSU latency + MPI + libfabric works, OSU latency initiator issues an MPI_Send before the MPI_Recv. Taking a similar approach in fabtests shows decreased latency. The following are examples with rxm;tcp.
Default (pre-post RX) pingpong
TX First pingpong
The -T option is used to enable this behavior. Since this option requires no pre-posting of buffers, inband address exchange cannot be used.