Skip to content

Commit

Permalink
Attempt to workaround gpu issues in axisymmetric Tomboulides
Browse files Browse the repository at this point in the history
The u_next_rad_comp_gf_ ParGridFunction seems to be causing problems
on some gpu systems.  This is intended to alias the first
(i.e. radial) component of the velocity grid function.  It is
constructed via

u_next_rad_comp_gf_ = new ParGridFunction(pfes_, *u_next_gf_);

But, this leads to failures like the following:

Verification failed: (IsHostMemory(h_mt)) is false

in mfem::MFEM_VERIFY_TYPES, which is called (eventually) from
Vector::HostRead.  Initially I thought this had to do with a failure
to properly call Vector::SyncMemory, since the base memory (in
u_next_gf_) is being modified.  But, an attempt to use this had no
impact on the problem.

So, in this workaround, I eliminate the alias and simply copy the
first component into a new grid function.  This seems to work, but it
would be nice to understand the problem with the original approach,
b/c clearly we'd rather not copy when it shouldn't be necessary.
  • Loading branch information
trevilo committed Jul 2, 2024
1 parent 304fe7b commit 0286e96
Showing 1 changed file with 29 additions and 2 deletions.
31 changes: 29 additions & 2 deletions src/tomboulides.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -253,10 +253,13 @@ void Tomboulides::initializeSelf() {

mu_total_gf_ = new ParGridFunction(pfes_);

pp_div_rad_comp_gf_ = new ParGridFunction(pfes_, *pp_div_gf_);
u_next_rad_comp_gf_ = new ParGridFunction(pfes_, *u_next_gf_);
// pp_div_rad_comp_gf_ = new ParGridFunction(pfes_, *pp_div_gf_);
// u_next_rad_comp_gf_ = new ParGridFunction(pfes_, *u_next_gf_);

if (axisym_) {
pp_div_rad_comp_gf_ = new ParGridFunction(pfes_);
u_next_rad_comp_gf_ = new ParGridFunction(pfes_);

utheta_gf_ = new ParGridFunction(pfes_);
utheta_next_gf_ = new ParGridFunction(pfes_);
}
Expand Down Expand Up @@ -1365,7 +1368,19 @@ void Tomboulides::step() {
// Add axisymmetric "forcing" term to rhs
if (axisym_) {
pp_div_gf_->HostRead();
{
// Copy radial components of pp_div_gf_. We should be able to
// just make pp_div_rad_comp_gf_ wrap pp_div_gf_, but that is
// causing problems (that aren't understood) on some gpu
// systems. As a workaround, we copy instead.
auto d_pp_div_rad = pp_div_rad_comp_gf_->Write();
auto d_pp_div = pp_div_gf_->Read();
MFEM_FORALL(i, pp_div_rad_comp_gf_->Size(), {
d_pp_div_rad[i] = d_pp_div[i];
});
}
pp_div_rad_comp_gf_->HostRead();

Faxi_poisson_form_->Update();
Faxi_poisson_form_->Assemble();
Faxi_poisson_form_->ParallelAssemble(Faxi_poisson_vec_);
Expand Down Expand Up @@ -1469,6 +1484,18 @@ void Tomboulides::step() {
evaluateVelocityGradient();

if (axisym_) {
u_next_gf_->HostRead();
{
// Copy radial components of u_next_gf_. Same comments here
// about pp_div_rad_comp_gf_ above.
auto d_u_next_rad = u_next_rad_comp_gf_->Write();
auto d_u_next = u_next_gf_->Read();
MFEM_FORALL(i, u_next_rad_comp_gf_->Size(), {
d_u_next_rad[i] = d_u_next[i];
});
}
u_next_rad_comp_gf_->HostRead();

// Update the Helmholtz operator and inverse
Hv_bdfcoeff_.constant = coeff_.bd0 / dt;
Hs_form_->Update();
Expand Down

0 comments on commit 0286e96

Please sign in to comment.