Add parsec_advise_data_on_device for zpotrf_L #118
Conversation
Force-pushed from eeb82ec to 811e173.
Force-pushed from a4b81a9 to f676e32.
I uploaded the performance results. By the way, I updated to the latest PaRSEC.
Force-pushed from f676e32 to a597442.
I understand the performance benefits, but I have many concerns about the approach. We (or at least I) have claimed for a long time that JDF algorithms express the dataflow and are independent of other platform-related constraints. This PR is clearly at odds with that statement, and I'm not sure we want to go that route.
As discussed, these changes will be moved to the wrapper to keep the JDF clean.
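As a rough illustration of what "moving it to the wrapper" could look like, here is a minimal sketch, assuming a PaRSEC advice call of the form parsec_advise_data_on_device(data, device, advice). The helper name, the tile-to-GPU mapping, the advice constant, and the exact type/header names are assumptions for illustration, not the code of this PR:

    #include "dplasma.h"
    #include "parsec/mca/device/device.h"   /* declaration location may vary across PaRSEC versions */

    /* Hypothetical wrapper-side helper: advise the lower-triangular tiles of A
     * onto the local accelerators before the zpotrf_L taskpool is submitted,
     * so the JDF itself carries no device-placement logic. */
    static void
    advise_lower_tiles_on_devices( parsec_tiled_matrix_t *A,
                                   int *dev_index, int nb_dev )
    {
        parsec_data_collection_t *dc = (parsec_data_collection_t *)A;
        for( int n = 0; n < A->nt; n++ ) {
            for( int m = n; m < A->mt; m++ ) {                /* lower triangle only */
                if( dc->rank_of(dc, m, n) != dc->myrank )     /* advise local tiles only */
                    continue;
                parsec_advise_data_on_device(
                    dc->data_of(dc, m, n),
                    dev_index[(m + n) % nb_dev],              /* simple cyclic tile-to-GPU mapping */
                    PARSEC_DEV_DATA_ADVICE_PREFERRED_DEVICE );
            }
        }
    }

The wrapper would call such a helper right before handing the taskpool to the runtime, which keeps the placement policy entirely outside the JDF.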
Force-pushed from c67b4b5 to 6743a5a.
How about adding this feature in PaRSEC instead of in DPLASMA?
Can you give us a hint on how that would look in PaRSEC?
I'm thinking of providing this feature in PaRSEC along these lines:
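Purely as a hypothetical illustration of such a PaRSEC-side hook (every name and signature below is an assumption, not the actual proposal), the idea would be for the runtime to walk a data collection and apply the placement advice itself:

    /* Hypothetical PaRSEC-level API (illustrative only): let the runtime spread
     * the tiles of a data collection over the available accelerators, so each
     * DPLASMA wrapper no longer has to do it by hand. */
    int parsec_data_collection_advise_on_devices( parsec_data_collection_t *dc,
                                                  int advice );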
Force-pushed from 252d978 to 726d4ab.
Other programming paradigms call this a mapper. It would be nice to reuse a concept that is familiar to [at least some] users.
src/dplasmaaux.c (outdated):

    #if defined(DPLASMA_HAVE_CUDA) || defined(DPLASMA_HAVE_HIP)

    /* Find all devices */
    void dplasma_find_nb_devices(int **dev_index, int *nb) {
This function has a generic name suggesting it gets all devices, when internally it only selects the accelerators. Please change the name, and please also add support for ZE.
src/dplasmaaux.c (outdated):

    if((*nb) == 0) {
        char hostname[256];
        gethostname(hostname, 256);
        fprintf(stderr, "No CUDA device found on rank %d on %s\n",
We have proper methods in PaRSEC to output warnings. Please use parsec_warning instead.
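For example, a sketch of the same message going through PaRSEC's warning facility, assuming the surrounding function already has the MPI rank and hostname at hand:

    #include "parsec/utils/debug.h"   /* parsec_warning() */

    /* Sketch: replaces the fprintf(stderr, ...) shown in the excerpt above. */
    parsec_warning( "No CUDA device found on rank %d on %s", rank, hostname );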
In general I don't think it is a good idea to return low-level memory upstream; the ownership is unclear. Since we are talking about a very small amount of memory here, I would suggest requiring the caller to provide an array of size parsec_nb_devices to this call, and having this function do a single loop to fill that array. All memory ownership then stays on the caller's side.
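A minimal sketch of that suggestion, combined with the renaming/ZE comment above (the function name and the PARSEC_DEV_LEVEL_ZERO constant are assumptions; parsec_nb_devices and parsec_mca_device_get come from the PaRSEC device layer):

    #include "parsec/mca/device/device.h"

    /* Sketch: the caller owns dev_index and must size it to at least
     * parsec_nb_devices entries; this function only fills it. */
    static void
    dplasma_find_accelerators( int *dev_index, int *nb )
    {
        *nb = 0;
        for( int i = 0; i < (int)parsec_nb_devices; i++ ) {
            parsec_device_module_t *device = parsec_mca_device_get(i);
            if( (PARSEC_DEV_CUDA == device->type) ||
                (PARSEC_DEV_HIP  == device->type) ||
                (PARSEC_DEV_LEVEL_ZERO == device->type) ) {   /* ZE support */
                dev_index[(*nb)++] = i;
            }
        }
    }

The caller allocates the array once (e.g. parsec_nb_devices ints on the stack or heap), fills it before building the taskpool, and frees it itself, so no ownership crosses the call boundary.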
Force-pushed from 6813296 to 40a87d4.
There will be some memory leak as …
Please fix it in PaRSEC; there is absolutely no reason for PaRSEC to free something it didn't allocate. Thus it is a bug, not a feature.
Sure. I will create an issue and fix that in PaRSEC.
ICLDisco/parsec#676 has been merged.
Force-pushed from 1314a0e to 25acbe2.
src/zplghe_wrapper.c (outdated):

    {
        zplghe_args_t *params = (zplghe_args_t*)malloc(sizeof(zplghe_args_t));

        params->bump = bump;
Suggested change: keep the same dplasma_zplghe_New prototype and the malloc here too, and do the free in dplasma_zplghe_Destruct; you can get it from tp->_g_op_args if you cast tp to a parsec_apply_taskpool_t *.
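A sketch of that suggestion, assuming the zplghe taskpool is the generated apply taskpool referenced above (the real dplasma_zplghe_Destruct may do more, e.g. release arenas):

    void
    dplasma_zplghe_Destruct( parsec_taskpool_t *tp )
    {
        parsec_apply_taskpool_t *apply_tp = (parsec_apply_taskpool_t *)tp;

        /* The args struct was malloc'ed in dplasma_zplghe_New, so allocation
         * and release both stay on the DPLASMA side. */
        free( apply_tp->_g_op_args );
        parsec_taskpool_free( tp );
    }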
Force-pushed from bd37e51 to a5e1f57.
Force-pushed from a5e1f57 to 5cf5a0b.
These are the performance comparisons.
I changed the Cholesky test to Lower and will create an issue to track why Upper behaves differently.