Notes 2017 02 28


Marc-Andre Hermanns edited this page Mar 8, 2017 · 2 revisions

Participants

Forum Meeting Participants
Marc-Andre Hermanns

Notes

MPI Events

Marc-Andre is planning to move the current events draft code to the Forum github
- Current draft API lives in a repo on the University GitLab
- People who need access should sign in via GitHub OAuth and let Marc-Andre know
Review of the current spec
- Events now have a notion of sources, the same event type could be raised from different sources/devices
- get_info lists source IDs so tools can query additional info and set up per-source data structures
- Sources should enable chronological order in event streams
  - A source can have different ordering semantics
    - MPI_T_ORDER_NONE - Source does not give any guarantees on ordering
    - MPI_T_ORDER_BEST_EFFORT - Source will provide ordering on a best effort basis, single events may be reported out-of-order
    - MPI_T_ORDER_STRICT - Source guarantees a strict ordering of events
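
To make the three ordering levels concrete, here is a minimal sketch in C. Only the enumerator names come from the notes; the type name, the numeric values, and the helper `sketch_needs_reordering` are assumptions made for illustration and are not part of the draft API.

```c
#include <assert.h>

/* Hypothetical sketch: enumerator names are taken from the notes,
 * everything else (type name, values, helper) is assumed. */
typedef enum {
    MPI_T_ORDER_NONE,        /* source gives no ordering guarantees */
    MPI_T_ORDER_BEST_EFFORT, /* mostly ordered, single events may be out of order */
    MPI_T_ORDER_STRICT       /* source guarantees ordered delivery */
} sketch_source_order_t;

/* Returns 1 if a tool that needs chronological order has to reorder
 * this source's events itself, 0 if it can consume them as delivered. */
static int sketch_needs_reordering(sketch_source_order_t order)
{
    assert(order <= MPI_T_ORDER_STRICT); /* reject values outside the enum */
    return order != MPI_T_ORDER_STRICT;
}
```

A tool could evaluate this once per source at setup time and allocate a reorder buffer only for sources that do not promise strict ordering.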
Is this the best API design choice?
- What about dynamic sources?
  - get_info might not know all sources at the time of query
    - e.g., dynamic loading of modules in Open MPI
  - If the tool requires ordering, an additional source that is reported ad hoc in the callback would require memory allocation inside the callback
  - The current approach assumes tool gets sources at init time
  - What happens if you get a new source at run time?
What about one event per source?
- Event id is unique for (event, source id)
- How would the registration work for this? Wouldn't this lead to a proliferation of event types?
If a new source is added then how do you detect that?
- After init, tool interaction happens almost entirely in callbacks
- Do you have to check from PMPI all the time?
- Dynamic creation of a source could be an event
  - Do we need to standardize at least this one?
- Then you could allocate memory to deal with the new event stream from the source
  - Allocation would have to be delayed until it is safe to allocate memory
  - Is this okay in a callback?
  - Marshal the task off to a thread to register the callback and allocate memory for the event stream?
    - Separate thread may perform allocation
  - Would lengthen the time that you would be missing events before you get registered/set up
    - Could MPI or source buffer them for you until you are set up?
    - What if tool does not consume buffered events?
    - Need way for tool to decide to buffer; discard as default?
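
The buffering question above can be sketched as a small fixed-size holding area with a per-tool policy that defaults to discard. This is only an illustration under stated assumptions: all names (`sketch_pending_t`, `sketch_hold_event`, `sketch_drain`), the capacity, and the use of an `int` as the event payload are hypothetical and do not appear in the draft API.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PENDING_CAP 4 /* assumed capacity, fixed so no allocation is needed */

typedef enum { SKETCH_DISCARD = 0, SKETCH_BUFFER = 1 } sketch_policy_t;

typedef struct {
    sketch_policy_t policy;          /* zero-initialized means discard */
    int    events[SKETCH_PENDING_CAP];
    size_t count;                    /* events currently held */
    size_t dropped;                  /* events lost before the tool was ready */
} sketch_pending_t;

/* Called for an event from a source the tool has not set up yet.
 * Returns 1 if the event was held, 0 if it was dropped. */
static int sketch_hold_event(sketch_pending_t *p, int event_id)
{
    assert(p != NULL);
    if (p->policy == SKETCH_BUFFER && p->count < SKETCH_PENDING_CAP) {
        p->events[p->count++] = event_id;
        return 1;
    }
    p->dropped++;
    return 0;
}

/* Copies up to out_cap held events to out, oldest first, and removes
 * them from the holding area. Returns the number of events copied. */
static size_t sketch_drain(sketch_pending_t *p, int *out, size_t out_cap)
{
    assert(p != NULL && out != NULL);
    size_t n = p->count < out_cap ? p->count : out_cap;
    for (size_t i = 0; i < n; i++)
        out[i] = p->events[i];
    /* Shift any events that did not fit to the front. */
    for (size_t i = n; i < p->count; i++)
        p->events[i - n] = p->events[i];
    p->count -= n;
    return n;
}
```

The `dropped` counter gives the tool a way to learn how many events it missed, which also covers the case where the tool never consumes the buffer and it fills up.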
Do we need a new source for each thread in a multithreaded app?
- What about endpoints?
Event order enum to describe the event ordering "strictness" per source
- Would all sources just say "best effort"?
- What do we do with out-of-order events? Drop them?
  - What about tools that rely on event pair matching (e.g. request and completion)? Can't drop in that case
- What about OpenMP tasks? An event stream for each task? Too many
- How to enforce ordering of events across sources (threads) that occur in the same task? Can we?
  - Does it depend on the level of the tool?
Is this discussion getting out of hand?
- Discussion started to address the problem of different hardware sources
- Are we trying to solve this problem at the wrong level?
  - A stream/source means different things at different levels (user level tool, system tool, etc)
  - Should the ordering definition go the other way? Instead of MPI to tool, it would go tool to MPI, to optimize what is done at run time
Ordering of events in streams may be OK as a postprocessing step in most cases
What is a source anyway?
- Thread? Endpoint?
- Network device?
- How often would a device change? What about a long-running service that hot-swaps a device?
- A source is only a medium to enable event ordering for tools that need it
Could we use sequence IDs versus timestamps?
- Two different kinds of information
- Sequence IDs would allow recovery of unordered streams
- With timestamps you don't know whether any events could have occurred between two events (time can be split infinitely)
- With sequence IDs, a gap means something is missing (tool could buffer -> TCP sliding window approach)
- Some events might not have timestamps, so a sequence ID could be useful
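
The sliding-window idea mentioned above can be sketched as follows. This is an assumed illustration, not part of the draft: the window size, the type `sketch_reorder_t`, the function `sketch_push`, and the `int` payload are all hypothetical. Events are held until every lower sequence ID has arrived, so a gap is visible as a call that releases nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_WINDOW 8 /* assumed window size */

typedef struct {
    uint64_t next_seq;               /* next sequence ID to release */
    int      present[SKETCH_WINDOW]; /* 1 if the slot holds a waiting event */
    int      payload[SKETCH_WINDOW]; /* stand-in for the event data */
} sketch_reorder_t;

/* Accepts one event and writes every event that is now in order to out,
 * which must have room for SKETCH_WINDOW entries. Returns the number of
 * events released, or -1 if seq is already released or beyond the window. */
static int sketch_push(sketch_reorder_t *r, uint64_t seq, int payload, int *out)
{
    assert(r != NULL && out != NULL);
    if (seq < r->next_seq || seq - r->next_seq >= SKETCH_WINDOW)
        return -1;

    size_t slot = (size_t)(seq % SKETCH_WINDOW);
    r->present[slot] = 1;
    r->payload[slot] = payload;

    /* Release the contiguous run starting at next_seq; stops at the first gap. */
    int released = 0;
    while (r->present[r->next_seq % SKETCH_WINDOW]) {
        size_t s = (size_t)(r->next_seq % SKETCH_WINDOW);
        out[released++] = r->payload[s];
        r->present[s] = 0;
        r->next_seq++;
    }
    return released;
}
```

A return value of 0 tells the tool that an earlier sequence ID is still missing, which is exactly the information a timestamp alone cannot provide.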
What about not having sources predefined and querying the event information when parsing an event?
- Get timestamp, seq id, some notion of source (source id)
- Up to tool to do all the event ordering across streams after the fact, postprocessing
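
As a rough sketch of the postprocessing variant, a tool could record the three queried fields per event and sort afterwards. The record layout, the names `sketch_event_t` and `sketch_sort_events`, and the tie-breaking order (timestamp, then source ID, then sequence ID) are assumptions for illustration; the sketch also assumes every recorded event carries a timestamp, which the notes say may not always hold.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-event record filled in while parsing an event. */
typedef struct {
    uint64_t timestamp;
    uint64_t seq;        /* per-source sequence ID */
    int      source_id;
    int      event_type;
} sketch_event_t;

/* Orders by timestamp, then source ID, then sequence ID. */
static int sketch_cmp(const void *a, const void *b)
{
    const sketch_event_t *x = a;
    const sketch_event_t *y = b;
    if (x->timestamp != y->timestamp)
        return x->timestamp < y->timestamp ? -1 : 1;
    if (x->source_id != y->source_id)
        return x->source_id < y->source_id ? -1 : 1;
    if (x->seq != y->seq)
        return x->seq < y->seq ? -1 : 1;
    return 0;
}

/* Sorts a recorded trace in place after the run has finished. */
static void sketch_sort_events(sketch_event_t *ev, size_t n)
{
    assert(ev != NULL || n == 0);
    qsort(ev, n, sizeof *ev, sketch_cmp);
}
```

Because the sort runs after the fact, nothing here has to be safe to call from inside an event callback.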
UID - Universal IDs for cvar/pvar/events
- Define a unique identifier for each MPI implementation to differentiate variables with the same or similar names across MPIs and versions
Need to get back to the idea of how to describe datatypes in the interface