I have some questions #198
Comments
Thank you very much! Another question: what degree of parallelism and parallel efficiency does this software achieve on CPU/GPU?
Currently three different modes of parallelism are supported: MPI, CUDA, and MPI+CUDA. Each of these can have different parallelism characteristics, which are also affected by the device, OS, compiler, and so on. Simulation parameters can have a significant effect too: grid size, which modes are used, sizes of parallel buffers, and so on. So it's better to test on your device of interest. For some simple scenarios, here's the fdtd3d benchmark page: https://zer011b.github.io/fdtd3d/.
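When testing on your own device, speedup and parallel efficiency can be computed from measured wall-clock times. Here is a minimal Python sketch using the standard definitions (speedup = serial time / parallel time, efficiency = speedup / number of processes). The function name and the timings are hypothetical placeholders, not fdtd3d results.

```python
def speedup_and_efficiency(t_serial, timings):
    """Map process count -> (speedup, efficiency).

    t_serial: wall-clock time of the single-process run, in seconds.
    timings:  dict {number_of_processes: wall-clock time in seconds}.
    """
    results = {}
    for procs, t_parallel in sorted(timings.items()):
        speedup = t_serial / t_parallel
        efficiency = speedup / procs  # 1.0 would be ideal linear scaling
        results[procs] = (speedup, efficiency)
    return results


# Hypothetical timings for illustration only (not measured with fdtd3d).
measured = {2: 55.0, 4: 30.0, 8: 18.0}
table = speedup_and_efficiency(100.0, measured)
for procs, (s, e) in table.items():
    print(f"{procs} procs: speedup {s:.2f}, efficiency {e:.2f}")
```

Running the same simulation with each mode and a few process counts gives a table like this for your own machine.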
Thank you. Moreover, I would like to know which kind of virtual topology of the grid (x, y, z, xy, xz, yz, xyz) across all computational nodes has the best efficiency for a 3D model on a supercomputer. If I split only along x, y, or z, is there an obvious difference between them? I guess xyz should be better, but I'm not sure whether the more complicated connections and communication between neighboring nodes will decrease the efficiency instead.
This question doesn't have a simple answer. First of all, it depends on the actual simulation area; consider the following simulation examples:
But this is not all, because the virtual topology also depends on the characteristics of the target device. For example:
For simulation examples above in case of 2 nodes you can split just one axis, so it'll be Besides, ideally virtual topology should somehow match the physical one, because otherwise data sharing might become ineffective (for example, data might be sent through multiple transit nodes). For homogeneous architectures there's an automatic virtual topology selection built-in in
If any of those is not met,
Yes, potentially this might affect the optimal virtual topology, but I think it's not the case for now with PML, because non-PML and PML grid points have the same amount of computation. Yet this is the case for TFSF. Anyway, a dynamic grid is planned for such cases, but for now only manual setup of the topology is available. So, to sum up, you need to test it out on your simulation and on your device.
Thank you very much. I tested 200×200×200/1000×1000×1000 models on my homogeneous supercomputer. However, I found that the efficiency is best when I set the virtual topology as x > y > z rather than x = y = z. Is that normal? Besides, I would like to know whether there are papers studying virtual topology, especially ones with mathematical formulas. For example, the conclusion of this paper, https://ieeexplore.ieee.org/document/1606757, is "As to the same dimensional virtual topology, the topology scheme should be created along the directions where the amount of the FDTD grids is larger." Does it make sense? Thank you!
As I mentioned in #198 (comment), it depends on the actual device that you use. Even if all nodes have the same performance, the virtual topology should match the physical one (the actual connections between nodes), because communication speed between nodes can be different.
You can check the papers mentioned in the README (https://github.com/zer011b/fdtd3d?tab=readme-ov-file#how-to-cite); there's a mathematical analysis of the best virtual topology. And fdtd3d uses the same logic in code to identify it.
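To illustrate how unequal communication speeds can shift the best topology, here is a minimal Python sketch. It is a toy model, not fdtd3d's selection logic and not the analysis from the papers: it assumes the cost of the busiest node is the sum over axes of (per-axis link cost) × (number of neighbours along that axis, at most 2) × (area of the face shared with each neighbour), and it only considers splits that divide the grid evenly. The function name `best_topology_3d` and the `link_cost` weights are hypothetical.

```python
from itertools import product


def best_topology_3d(grid, nodes, link_cost=(1.0, 1.0, 1.0)):
    """Return (cost, (px, py, pz)) minimizing the toy halo cost of the busiest node."""
    best = None
    for px, py in product(range(1, nodes + 1), repeat=2):
        if nodes % (px * py):
            continue  # px * py must divide the node count
        parts = (px, py, nodes // (px * py))
        if any(g % p for g, p in zip(grid, parts)):
            continue  # keep the sketch simple: even splits only
        chunk = [g // p for g, p in zip(grid, parts)]
        cost = 0.0
        for axis in range(3):
            neighbours = min(2, parts[axis] - 1)  # 0 if the axis is not split
            face = chunk[(axis + 1) % 3] * chunk[(axis + 2) % 3]
            cost += link_cost[axis] * neighbours * face
        if best is None or cost < best[0]:
            best = (cost, parts)
    return best


grid = (1000, 1000, 1000)
# Equal link costs: the cubic split wins in this model.
print(best_topology_3d(grid, 64))                              # (375000.0, (4, 4, 4))
# Hypothetical weights where y and z links are slower than x links.
print(best_topology_3d(grid, 64, link_cost=(1.0, 2.0, 4.0)))   # (625000.0, (8, 4, 2))
```

In this model, once links along different axes have different costs, an x > y > z split can beat x = y = z even on a cubic grid. Whether this is what happens on a particular machine still has to be checked by measurement.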
It's true, but it lacks details on when this holds and when it doesn't. For example, for a 2D grid with x=10000, y=10 it's clear that the x axis is the one to divide, whether between 10 nodes or 100 nodes. However, this doesn't say what to do in the x=1200, y=600 case with 6 nodes. Should it be x=200, y=600 for each node, or x=400, y=300 for each node? The papers I've mentioned above describe how to choose between these.
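The trade-off can be explored with a minimal Python sketch. It is a simplified model, not the formulas from the papers and not fdtd3d's code: it assumes the cost is the number of boundary cells the busiest node exchanges (one layer per neighbour, at most 2 neighbours per axis), that all links are equally fast, and that only even splits are allowed. The function name `topologies_2d` is hypothetical.

```python
def topologies_2d(size_x, size_y, nodes):
    """List (halo_cells, px, py, chunk_x, chunk_y) for every even px * py == nodes split,
    sorted from the smallest halo to the largest."""
    results = []
    for px in range(1, nodes + 1):
        if nodes % px:
            continue
        py = nodes // px
        if size_x % px or size_y % py:
            continue  # skip uneven splits to keep the sketch simple
        chunk_x, chunk_y = size_x // px, size_y // py
        # Busiest node: up to 2 neighbours along each split axis.
        halo = min(2, px - 1) * chunk_y + min(2, py - 1) * chunk_x
        results.append((halo, px, py, chunk_x, chunk_y))
    return sorted(results)


for halo, px, py, cx, cy in topologies_2d(1200, 600, 6):
    print(f"{px}x{py} nodes, chunk x={cx}, y={cy}: {halo} halo cells")
# 3x2 nodes, chunk x=400, y=300: 1000 halo cells
# 6x1 nodes, chunk x=200, y=600: 1200 halo cells
# 2x3 nodes, chunk x=600, y=200: 1400 halo cells
# 1x6 nodes, chunk x=1200, y=100: 2400 halo cells
```

Under these assumptions the x=400, y=300 split exchanges slightly fewer cells than x=200, y=600, while for x=10000, y=10 with 10 nodes the same function puts the pure x split first. With unequal link speeds or a different cost model the ranking can change, which is why the detailed analysis matters.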
Thank you for this great software, I have some questions:
Thank you