F90OPTFLAGS = -r8 -i4 -ftz -IPF_fma -IPF_fltacc -WB -O2
Name | Description |
---|---|
# of Cores | Number of used CPU cores |
# of Processes | Number of MPI processes |
# of Threads (# of SMP) | Number of OpenMP threads per process |
$N_{c}$ | Truncation level for Chebyshev polynomials |
$l_{max}$ | Truncation level for spherical harmonics |
$(N_{r},N_{\theta},N_{\phi})$ | Number of grid points in spherical coordinates |
Elapsed | Elapsed (wall clock) time for one time step |
Nonlinear | Elapsed (wall clock) time for nonlinear terms (including communications) |
Solver | Elapsed (wall clock) time for linear calculation |
Comm. | Elapsed (wall clock) time for data communication |
Efficiency | Parallel efficiency |
SUs | Service units for $10^{4}$ time steps (core hours) |
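The Efficiency and SUs columns can be reproduced from the Elapsed and # of Cores columns. The sketch below is a reading of the tabulated numbers, not a formula given on this page: efficiency is taken relative to the smallest run of each series, and the SU values only match if the core count is rounded up to a multiple of 16 (an assumption inferred from the data, presumably whole-node charging). The function names are hypothetical.

```python
import math

STEPS = 10_000        # the SUs column is quoted for 10^4 time steps
CORES_PER_NODE = 16   # assumption inferred from the tabulated SUs, not stated in the source


def parallel_efficiency(t_ref, n_ref, t, n):
    """Strong-scaling efficiency of a run (t, n) relative to a reference run."""
    return (t_ref * n_ref) / (t * n)


def service_units(elapsed, cores, cores_per_node=CORES_PER_NODE):
    """Core hours for STEPS time steps, charging whole nodes (assumed)."""
    charged = math.ceil(cores / cores_per_node) * cores_per_node
    return elapsed * STEPS * charged / 3600.0


# 64-core run compared with the 32-core reference run
print(parallel_efficiency(0.719657, 32, 0.545835, 64))
# 64 cores fill whole nodes; 86 cores are charged as 96 under the assumption
print(service_units(0.545835, 64))
print(service_units(2.47102, 86))
```

With the node-rounding assumption, the 86-core run is charged as 96 cores, which is why its SU value is larger than elapsed time times 86 would suggest.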
$N_{c}$ | $l_{max}$ | $(N_{r},N_{\theta},N_{\phi})$ |
---|---|---|
192 | 170 | (193,256,256) |
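The resolutions listed on this page appear to follow two simple relations: $N_{r} = N_{c} + 1$ and $l_{max} = \lfloor 2 N_{\theta} / 3 \rfloor$. These are not stated in the source; they are an observation that fits every row, and the helper names below are hypothetical. A minimal check:

```python
def radial_points(n_c):
    """Radial grid points for Chebyshev truncation n_c (observed relation)."""
    return n_c + 1


def lmax_from_ntheta(n_theta):
    """Spherical harmonic truncation for n_theta latitudes (observed relation)."""
    return (2 * n_theta) // 3


# (N_c, l_max, (N_r, N_theta, N_phi)) as they appear in the tables
cases = [
    (192, 170, (193, 256, 256)),
    (256, 341, (257, 512, 512)),
    (256, 42, (257, 64, 64)),
    (256, 85, (257, 128, 128)),
    (16, 341, (17, 512, 512)),
]

for n_c, l_max, (n_r, n_theta, n_phi) in cases:
    assert radial_points(n_c) == n_r
    assert lmax_from_ntheta(n_theta) == l_max
print("all tabulated resolutions match the two relations")
```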
# of Cores | # of Processes | # of SMP | Elapsed | Nonlinear | Solver | Comm. | Efficiency | SUs |
---|---|---|---|---|---|---|---|---|
32 | 32 | 1 | 0.719657 | 0.518729 | 0.200927 | 0.182378 | 1 | 63.9695 |
64 | 64 | 1 | 0.545835 | 0.336805 | 0.209030 | 0.21079 | 0.659226 | 97.0373 |
128 | 128 | 1 | 0.285552 | 0.19960 | 0.085952 | 0.174127 | 0.630058 | 101.529 |
$N_{c}$ | $l_{max}$ | $(N_{r},N_{\theta},N_{\phi})$ |
---|---|---|
256 | 341 | (257,512,512) |
# of Cores | # of Processes | # of SMP | Elapsed | Nonlinear | Solver | Comm. | Efficiency | SUs |
---|---|---|---|---|---|---|---|---|
86 | 86 | 1 | 2.47102 | 1.88142 | 0.589604 | 0.379399 | 1 | 658.939 |
128 | 128 | 1 | 2.32883 | 1.91045 | 0.418383 | 0.857392 | 0.712896 | 828.03 |
129 | 129 | 1 | 2.09683 | 1.68765 | 0.409184 | 0.663364 | 0.785635 | 838.734 |
256 | 256 | 1 | 1.41293 | 1.13385 | 0.279076 | 0.675334 | 0.58751 | 1004.75 |
257 | 257 | 1 | 1.33908 | 1.06534 | 0.273739 | 0.621596 | 0.61991 | 1011.75 |
Elapsed (wall clock) time for strong scaling. The number of OpenMP threads is indicated by the numbers in the plot. Ideal scaling is plotted as a dotted line.
# of Cores | # of Processes | # of SMP | $N_{c}$ | $l_{max}$ | $(N_{r},N_{\theta},N_{\phi})$ | Elapsed | Nonlinear | Solver | Comm. | SUs |
---|---|---|---|---|---|---|---|---|---|---|
4 | 4 | 1 | 256 | 42 | (257,64,64) | 0.322194 | 0.165516 | 0.156679 | 0.0525075 | 14.3197 |
16 | 16 | 1 | 256 | 85 | (257,128,128) | 0.406717 | 0.204322 | 0.202394 | 0.0716188 | 18.0763 |
64 | 64 | 1 | 256 | 170 | (257,256,256) | 0.743692 | 0.44779 | 0.295902 | 0.312471 | 132.212 |
256 | 256 | 1 | 256 | 341 | (257,512,512) | 1.41293 | 1.13385 | 0.279076 | 0.675334 | 1004.75 |
5 | 5 | 1 | 256 | 42 | (257,64,64) | 0.252394 | 0.127396 | 0.124998 | 0.0395718 | 11.2175 |
17 | 17 | 1 | 256 | 85 | (257,128,128) | 0.359973 | 0.180849 | 0.179124 | 0.0811399 | 31.9976 |
65 | 65 | 1 | 256 | 170 | (257,256,256) | 0.639472 | 0.363557 | 0.275915 | 0.227634 | 142.105 |
257 | 257 | 1 | 256 | 341 | (257,512,512) | 1.33908 | 1.06534 | 0.273739 | 0.621596 | 1011.75 |
Elapsed time for weak scaling in the horizontal directions. The $O(N_{core}^{1/2})$ scaling (ideal scaling for the Legendre transform) is plotted as a dotted line.
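To compare the measured times with the $O(N_{core}^{1/2})$ reference, one can scale the smallest run by the square root of the core-count ratio. The sketch below uses the 4, 16, 64, and 256 core rows of the table above. Anchoring the reference curve at the 4-core run is an assumption (the page does not say how the dotted line is normalized), and `ideal_elapsed` is a hypothetical helper.

```python
# (cores, elapsed seconds per time step) from the horizontal weak-scaling table
runs = [(4, 0.322194), (16, 0.406717), (64, 0.743692), (256, 1.41293)]


def ideal_elapsed(cores, ref_cores, ref_elapsed, exponent=0.5):
    """Reference time following O(cores**exponent), anchored at a reference run."""
    return ref_elapsed * (cores / ref_cores) ** exponent


ref_cores, ref_elapsed = runs[0]  # anchor at the smallest run (assumption)
for cores, elapsed in runs:
    ideal = ideal_elapsed(cores, ref_cores, ref_elapsed)
    print(f"{cores:4d} cores: measured {elapsed:.3f} s, "
          f"N^(1/2) reference {ideal:.3f} s, ratio {elapsed / ideal:.2f}")
```

A ratio below 1 means the measured time grows more slowly than the $N_{core}^{1/2}$ reference when anchored this way.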
# of Cores | # of Processes | # of SMP | $N_{c}$ | $l_{max}$ | $(N_{r},N_{\theta},N_{\phi})$ | Elapsed | Nonlinear | Solver | Comm. | SUs |
---|---|---|---|---|---|---|---|---|---|---|
8 | 8 | 1 | 16 | 341 | (17,512,512) | 1.28171 | 1.20689 | 0.0748194 | 0.372245 | 56.9651 |
16 | 16 | 1 | 32 | 341 | (33,512,512) | 1.52159 | 1.42569 | 0.0958951 | 0.315524 | 67.6261 |
32 | 32 | 1 | 64 | 341 | (65,512,512) | 1.61663 | 1.45842 | 0.158208 | 0.389299 | 143.7 |
64 | 64 | 1 | 128 | 341 | (129,512,512) | 1.75436 | 1.50559 | 0.248772 | 0.453414 | 311.887 |
128 | 128 | 1 | 256 | 341 | (257,512,512) | 2.32883 | 1.91045 | 0.418383 | 0.857392 | 828.03 |
9 | 9 | 1 | 16 | 341 | (17,512,512) | 1.05288 | 0.993848 | 0.0590345 | 0.207514 | 46.7948 |
17 | 17 | 1 | 32 | 341 | (33,512,512) | 1.03108 | 0.94969 | 0.0813889 | 0.194925 | 91.6515 |
33 | 33 | 1 | 64 | 341 | (65,512,512) | 1.25166 | 1.10257 | 0.149091 | 0.218563 | 166.888 |
65 | 65 | 1 | 128 | 341 | (129,512,512) | 1.43519 | 1.19399 | 0.241198 | 0.262446 | 318.931 |
129 | 129 | 1 | 256 | 341 | (257,512,512) | 2.09683 | 1.68765 | 0.409184 | 0.663364 | 838.734 |
Elapsed time for weak scaling in the radial direction. The $O(N_{core}^{1/2})$ scaling (ideal scaling for the Legendre transform) is plotted as a dotted line.