Provide API to map real device ID to parsec ID and back #660

Open
devreal opened this issue Jun 4, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@devreal
Contributor

devreal commented Jun 4, 2024

Description

PaRSEC numbers its devices differently from the rest of the world (0 for the host, possibly 1 for the recursive device, then 2 and up for accelerators). CUDA and HIP number their devices starting from 0.

Describe the solution you'd like

An efficient API to map between CUDA/HIP IDs and parsec device IDs.

Describe alternatives you've considered

TTG currently implements this mapping, but it requires us to find the first device ID and then add that to the CUDA/HIP ID to get the parsec ID. That seems brittle and would fail if we ever support multiple device types.
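
For illustration, here is a minimal sketch of the offset scheme described above and of how it would misfire once two accelerator types coexist. Everything in it is hypothetical: `dev_kind_t`, `first_accelerator`, `offset_to_runtime`, and the two device tables are made up for this example and are not PaRSEC or TTG identifiers. The ordering of devices in the `mixed` table is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device kinds; illustrative names, not PaRSEC's. */
typedef enum { DEV_HOST, DEV_RECURSIVE, DEV_CUDA, DEV_HIP } dev_kind_t;

/* Return the runtime index of the first accelerator, or -1 if there is none. */
static int first_accelerator(const dev_kind_t *devs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (devs[i] == DEV_CUDA || devs[i] == DEV_HIP) {
            return (int)i;
        }
    }
    return -1;
}

/* Offset scheme: runtime index = index of first accelerator + native ID.
 * The accelerator type is never consulted, which is the weak point. */
static int offset_to_runtime(const dev_kind_t *devs, size_t n, int native_id)
{
    int base = first_accelerator(devs, n);
    if (base < 0 || native_id < 0 || (size_t)(base + native_id) >= n) {
        return -1;
    }
    return base + native_id;
}

/* One accelerator type: the offset lands on the intended device. */
static const dev_kind_t cuda_only[] = {
    DEV_HOST, DEV_RECURSIVE, DEV_CUDA, DEV_CUDA
};

/* Two accelerator types (assumed ordering): CUDA device 0 sits at
 * runtime index 4, but the offset scheme would report index 2. */
static const dev_kind_t mixed[] = {
    DEV_HOST, DEV_RECURSIVE, DEV_HIP, DEV_HIP, DEV_CUDA, DEV_CUDA
};
```

With `cuda_only`, `offset_to_runtime(cuda_only, 4, 1)` yields index 3, which is a CUDA device. With `mixed`, `offset_to_runtime(mixed, 6, 0)` yields index 2, which is a HIP device, even though the caller asked for CUDA device 0.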


@devreal devreal added the enhancement New feature or request label Jun 4, 2024
@bosilca
Contributor

bosilca commented Jun 4, 2024

Why "different than the rest of the world"? PaRSEC numbers its devices starting from 0 and includes all supported devices, exactly like all the others.

As a user you are not supposed to address an accelerator directly, everything you can do should happen in the context provided by the runtime. In this context you do have access to the real accelerator id via the device_t.

What exactly are you trying to do that would require the real device number? And in what context?

@therault
Contributor

therault commented Jun 4, 2024

The context is a TTG application.

The need is to be able to advise the runtime to schedule a given task on a given device.

The issue is that PaRSEC does not exist for the programmer of a TTG application: only TTG exists. So, it doesn't make sense for a TTG application programmer to find a parsec_device_t.

So, TTG needs to expose a concept of device or device index, and there should be a portable way (for the TTG implementation with the PaRSEC backend) to convert between a TTG device or device index and the actual runtime device or device index.

@bosilca
Contributor

bosilca commented Jun 7, 2024

PaRSEC has its own dialect to identify devices, a dialect (and the API going with it) that TTG chooses not to expose to users. Thus, it seems more reasonable to have TTG provide the conversion between TTG and PaRSEC devices instead of forcing PaRSEC to speak the TTG dialect for naming devices.

@therault
Contributor

therault commented Jun 7, 2024

But it can also be useful for most applications that do device binding, whatever the DSL.

If we take gemm_gpu in DPLASMA, for example, we start by counting how many CUDA GPUs we have (nbgpus), then we create an array that maps the space [0, nbgpus-1] to the actual device number. In the PR from Qinglei to give scheduling advice to POTRF, he needs to do the same thing. Most GPU-enabled tests also compute the number of GPUs available, and some need to create a map in order to easily express that they want to bind data or a task to a specific GPU.

In the end, every time the user wants to map either data or tasks to a specific device, they end up re-inventing a way to do this mapping. The proposal is just to provide an API that they can use (or ignore).
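
As a rough sketch of what such a shared helper could look like, the code below builds the lookup table once and answers queries in both directions, keyed by accelerator type. All names (`devmap_t`, `devmap_build`, `devmap_to_runtime`, `devmap_to_native`, `MAX_DEVICES`) are hypothetical and do not exist in PaRSEC. The sketch also assumes that native IDs follow the order in which devices of a given type appear in the runtime's list; a real implementation would presumably read the native ID from the device descriptor instead.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEVICES 16  /* arbitrary bound for this sketch */

/* Hypothetical device kinds; illustrative names, not PaRSEC's. */
typedef enum {
    DEV_HOST, DEV_RECURSIVE, DEV_CUDA, DEV_HIP, DEV_KIND_COUNT
} dev_kind_t;

typedef struct {
    size_t     count;                                   /* number of runtime devices */
    dev_kind_t kind[MAX_DEVICES];                       /* runtime index -> kind */
    int        native_id[MAX_DEVICES];                  /* runtime index -> native ID */
    int        runtime_id[DEV_KIND_COUNT][MAX_DEVICES]; /* (kind, native ID) -> runtime index */
    int        per_kind[DEV_KIND_COUNT];                /* devices seen per kind */
} devmap_t;

/* Build both directions of the map in one pass over the device list.
 * Assumption: native IDs are assigned in order of appearance per kind. */
static int devmap_build(devmap_t *map, const dev_kind_t *devs, size_t n)
{
    if (n > MAX_DEVICES) {
        return -1;
    }
    map->count = n;
    for (int k = 0; k < DEV_KIND_COUNT; k++) {
        map->per_kind[k] = 0;
    }
    for (size_t i = 0; i < n; i++) {
        dev_kind_t k = devs[i];
        int native = map->per_kind[k]++;
        map->kind[i] = k;
        map->native_id[i] = native;
        map->runtime_id[k][native] = (int)i;
    }
    return 0;
}

/* (kind, native ID) -> runtime index, or -1 if no such device. */
static int devmap_to_runtime(const devmap_t *map, dev_kind_t kind, int native)
{
    if (native < 0 || native >= map->per_kind[kind]) {
        return -1;
    }
    return map->runtime_id[kind][native];
}

/* runtime index -> native ID (kind returned through *kind), or -1 if out of range. */
static int devmap_to_native(const devmap_t *map, int runtime, dev_kind_t *kind)
{
    if (runtime < 0 || (size_t)runtime >= map->count) {
        return -1;
    }
    *kind = map->kind[runtime];
    return map->native_id[runtime];
}
```

Both lookups are constant time after the one-off build, and because the table is keyed by device type, a list that mixes CUDA and HIP devices resolves each native ID within its own type.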

@abouteiller
Contributor

also seen in ICLDisco/dplasma#118
