-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Flow Probes #543
base: development
Are you sure you want to change the base?
Flow Probes #543
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only comment really is that this feels complicated. We do probes in amr-wind with particles and it is fast and very scalable. I would have recommended that route before but you are quite far along here so if this works, maybe it's fine. My recommendation would be to try a test with millions of probes (users are going to do that to you) and make sure you don't spend too much time in init and timestepping.
amrex::Real alpha, beta, gama; | ||
amrex::Real value = 0.0; | ||
|
||
alpha = 0.0; | ||
beta = 0.0; | ||
gama = 0.0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all this should be defined as one line. And use ternary expressions for beta and (renamed?) gamma.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the variables alpha, beta and gama and replaced with an array from the code snippet you suggested.
const int XDIR = 0; | ||
const int YDIR = 1; | ||
const int ZDIR = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these exist in other place. Also you can see a trilinear interpolation (with a for loop) here: https://github.com/AMReX-Combustion/PeleC/blob/development/Exec/RegTests/HIT/prob.H#L67
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed XDIR, YDIR and ZDIR and added a loop to populated the values in the slp array
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @marchdf says, this won't scale for very large numbers of probes, but I'm fine with that for now.
Thanks Marc and Bruce for your suggestions. I did think about using particles as probes, but I was under the impression that including particle functionalities (say in an otherwise non-particle simulation) might cause more overheads. Also, tbh, I didn't foresee anything more than 10-20 probes (max) to be used in practical simulations. Maybe, we can develop the same functionality using particles in the future? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, let's leave this implementation for now and consider a particle based implementation in the future.
amrex::ParallelDescriptor::ReduceRealSum( | ||
m_values_at_probe_d.data(), static_cast<int>(m_values_at_probe_d.size())); | ||
|
||
amrex::Gpu::copy( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only IO rank needs to do this copy right?
}); | ||
} | ||
|
||
amrex::ParallelDescriptor::ReduceRealSum( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the kinda thing that is crushing users in amr-wind (until I fixed it ;) ). Each rank is holding way more data than it actually needs. Just a comment, not something to fix here.
0.0}; // Neighbour cell solution values | ||
int stIdx = idx_d_p[n]; | ||
|
||
if (m_interpType_d == Linear) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would pull this if out of the parallelfor and have 2 parallelfors. This is to avoid potential thread divergence on the GPU. There's very little shared code so I think it would be easy to do.
Created a new diagnostic to probe flow variables (user-defined) at user-specified step intervals. The ASCII output data is written to "./temporals/" folder.