Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DDotMPI and DGEMVMPI implementation #45

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

CYChenFudan
Copy link
Contributor

MPI versions for the blas subroutines ddot and dgemv are presented. They are included in lib/mpif90/mpiu.f90, then are called by dvdson module.

00. src/tool: Makefile, add "chmod a+x $(GRASP)/bin/rsave"
01. jj2lsj90: In jj2lsj_code.f90, lines 220 and 222 are changed to deal with both parities present.
03. rcsfzerofirst90: In RCSFzerofirst.f90, the number of CSFs in zero-order space
are output. The informations could be used conveniently in RCSF and RCI calculations.
04. rmcdhf90: In newco.f90, weighted average energy is output to monitor the energy convergence.
05. rangular90_mpi: getinf.90 modified for iccut=1
06. rci90_mpi: setham_gg.f90, only myid=0 reports how far the calculation has proceeded.
07. libdvd90, dvdson.f90: make lines 1119 and 1120 work for convergence
monitor. The output messages are very useful for extremely large-scaled calculations.
08. libdvd90, dvdson.f90: function TSTSEL is modified, so that dvdson performs at least TWO
iteration calculations. Or, error results could be obtained in some cases (in
O-like isoelectronic sequence?).
09. coming dvdson--mpi
@jongrumer
Copy link
Member

jongrumer commented Sep 22, 2020

Ok, I'm slowly getting up to speed after summer vacation. Sorry for the delay!

You have done many things here Chen, very nice! Just summarizing the changes you have made here, so it's easy for everyone to see:

Note: [output] means that the edit only affects output from the code to screen, and is therefore a less sensitive modification.

  1. src/tool: Makefile, add "chmod a+x $(GRASP)/bin/rsave" [already implemented]
  2. jj2lsj90: In jj2lsj_code.f90, lines 220 and 222 are changed to deal with both parities present. [rejected]
  3. rcsfzerofirst90: In RCSFzerofirst.f90, the number of CSFs in zero-order space
    are output. The informations could be used conveniently in RCSF and RCI calculations. [output, ok]
  4. rmcdhf90: In newco.f90, weighted average energy is output to monitor the energy convergence. [output, ok]
  5. rangular90_mpi: getinf.90 modified for iccut=1 [output, ok]
  6. rci90_mpi: setham_gg.f90, only myid=0 reports how far the calculation has proceeded. [output, ok]
  7. libdvd90, dvdson.f90: make lines 1119 and 1120 work for convergence
    monitor. The output messages are very useful for extremely large-scaled calculations. [output, ok]
  8. libdvd90, dvdson.f90: function TSTSEL is modified, so that dvdson performs at least TWO iteration calculations. Or, error results could be obtained in some cases (in O-like isoelectronic sequence?). [check, cff]
  9. The rscfgenerate bug that was discussed during the spring (issue bug in the generation of csf's using rcsfgenerate #26) [already merged through separate PR]
  10. And finally the MPI versions of DDOT and DGEMV - leading to a parallell version of DVDSON

I need help with testing these changes out properly before we add them to master - any of the more senior members perhaps? @tspejo @gaigalas @cffischer or others?

No 1 (jj2lsj) [rejected], 4 (rangular) and 6-7 (dvdson) are the critical ones of the first group.
Then we have the rcsfgenerate fix in no 8, and the larger update with new parallell versions of DDOT and DGEMV in No 9 which will require some testing.

@gaigalas - could you perhaps verify that update no 1 in jj2lsj on both parities is ok? [done] And maybe also the other updates you feel you can verify easily.
@cffischer - could you have a look at update 7 perhaps?

Copy link
Member

@mortenpi mortenpi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CYChenFudan if you'd like, I think me and/or @jongrumer can also make the necessary changes for that, since they're pretty straightforward and do not affect the core PR.

@@ -60,8 +60,9 @@ SUBROUTINE GETINF
do i = 1,nblock
write(istde,*) 'Give ICCUT for block',i
1 READ *, ICCUT(i)
IF ((ICCUT(i) <= 1).OR.(ICCUT(i) >= ncfblk(i))) THEN
WRITE (istde,*) 'GETINF: ICCUT must be greater than 1',&
!cychen IF ((ICCUT(i) <= 1).OR.(ICCUT(i) >= ncfblk(i))) THEN
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest removing these commented out lines with the old code before merging.

@@ -26,4 +33,4 @@ export FC_MPI="mpifort" # MPI
export FC_MPIFLAGS="${FC_FLAGS}" # Parallel code compiler flags
export FC_MPILD=${FC_LD} # Serial linker flags
# -------------------------------------------------------------------------------------------------------------------
export MPI_TMP="${HOME}/grasp_mpi_tmp" # Location for temporary files
export MPI_TMP="${HOME}/tmp" # Location for temporary files
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR needs to be brought up to speed with the latest master before merging. E.g. make_environment_gfortran is not a thing anymore. I don't think any of the substantial parts of the PR have conflicts though, just the build files.

Copy link
Member

@jongrumer jongrumer Sep 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and the changes with copying blas and lapack libraries from a given folder are way too case specific. Better to set e.g. the LD_LIBRARY_PATH flag or let CMake do the job.

@CYChenFudan
Copy link
Contributor Author

CYChenFudan commented Sep 24, 2020 via email

@jongrumer
Copy link
Member

@CYChenFudan Me and @mortenpi realized that you have submitted this pull request from your forked master branch of grasp. This is not ideal perhaps since any edits we will do now on the patch, will also directly affect your personal main version of the code.

For this time around to get this done as soon as possible, is it ok with you that we go ahead and do the necessary edits and deletes on your master branch? We will try to do as few edits/deletes as possible.

For the next time, first create a separate dev branch on your fork that you then submit a pull request from. We will probably squash-merge this PR, which will create a single commit on master that is different from the numerous ones that exist on your forked master branch. This means that from Git's point of view, the CompAS master and you master will diverge once we merge. So you will have to reset you master to the actual master from that on. For this reason, usually, pull requests are preferably made from separate branches to avoid that.

Oh, and try to reply on the github online chat instead of by email to make the thread less cluttered: here #45

@CYChenFudan
Copy link
Contributor Author

@jongrumer @mortenpi
It is OK that you go ahead and do the necessary edits and deletes on my master branch.

@jongrumer
Copy link
Member

@CYChenFudan Great, we'll get started with the stuff we actually can deal with now then, like sorting out the conflicting files. We need to wait for input from the other CompAS members on the larger changes like the one related to DVDSON.

@gaigalas
Copy link
Member

@gaigalas - could you perhaps verify that update no 1 in jj2lsj on both parities is ok? And maybe also the other updates you feel you can verify easily.
This is not good idea for many reasons. So I am not planning to check what is not efficient.

@jongrumer
Copy link
Member

jongrumer commented Sep 30, 2020

@gaigalas Ok great - I trust your judgement of course, so we shall remove the edits to jj2lsj (part 1 in the list above) from this patch. Let us know if you have opinions on any of the other edits. But I think we can handle them now, this was the most critical to get info on. Thanks a lot!

@gaigalas
Copy link
Member

OK. I will look other issue, too. But, there are the files *.f90 for 2, 3, 5 ... ?

@jongrumer
Copy link
Member

jongrumer commented Sep 30, 2020

@gaigalas Great thanks GG 👍 I think I have most of them under control, I updated the list above now with "status comments", but if you could have a look at the edits to no 7 libdvd90, dvdson.f90: function TSTSEL and if possible also the MPI-libraries in no 9, that would be very helpful. My feeling is that we can merge 7 but have to wait with 9 until tested more.

You can see the changes to the files directly here on github quite easily by clicking on the "Files changed" tab up on the top of this page. This takes you here: https://github.com/compas/grasp/pull/45/files where you can scroll through the files. The files related to dvdson and mpi-blas are a bit down. Red lines are those that are removed, and green the new ones. If you have any direct comments on a certain line of code, you can press the blue "plus"-sign in the source code. This opens upp a message box where you can write your comment related to that particular line.

If you want to have a look at the files locally on your own computer, you have to download Chens version of the codes, either by pressing the green button "Code" --> "Download ZIP" at his page https://github.com/CYChenFudan/grasp or by cloning the repository directly to some folder at your computer with:

git clone https://github.com/CYChenFudan/grasp.git

Let me know if you have any further questions, and thanks again for the help!

Revert changes, this is move to a separate PR (jg/chen_rcsfgenerate) to get this into master as soon as possible.
@jongrumer
Copy link
Member

I've now split out the rcsfgenerate fix to a separate PR (#55) that I will merge now directly since this is the most crucial fix to get into master as soon as possible as far as I can tell.

I will edit my first summarizing comment accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug in the generation of csf's using rcsfgenerate jj2lsj's parity limitation
4 participants