F90OPTFLAGS = -O3 -g -xhost
Name | Description |
---|---|
# of Cores | Number of CPU cores used |
# of parallel FEM | Number of subdomains in the meridional plane |
# of parallel FFT | Degree of parallelization for the FFT |
$N_{med}$ | Number of nodes for the fluid in a meridional plane |
$N_{\phi}$ | Number of nodes (modes) in the longitudinal direction |
Elapsed time | Elapsed (wall clock) time for one time step |
Solver time | Elapsed (wall clock) time for the linear solver (including communication) |
Comm. time | Elapsed (wall clock) time for data communication |
Efficiency | Parallel efficiency |
SUs | Service units for $10^{4}$ time steps (core hours) |
Elapsed time is averaged over 100 time steps and over all cores, using the output in “fort.702”.
Solver time is averaged over 100 time steps and over all cores, using the output in “fort.705”.
Comm. time is averaged over 100 time steps and over all cores, using the output in “fort.703”.
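The SUs column in the tables on this page can be reproduced from the elapsed time per step and the core count. The sketch below assumes that the elapsed time is given in seconds, which is consistent with the tabulated values but is not stated explicitly; the function name `service_units` is chosen here for illustration only.

```python
def service_units(elapsed_per_step_s, n_cores, n_steps=10_000):
    """Core hours needed for n_steps time steps.

    elapsed_per_step_s: wall clock time for one time step (assumed seconds)
    n_cores:            number of CPU cores used
    """
    core_seconds = elapsed_per_step_s * n_cores * n_steps
    return core_seconds / 3600.0  # seconds -> hours


# 64 cores (8 FEM subdomains x 8 FFT), 3.26451 per step; the table lists 580.357 SUs
print(round(service_units(3.26451, 64), 3))
```

Because the SUs scale with the product of elapsed time and core count, a configuration with a shorter elapsed time can still cost more core hours if it uses many more cores.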
$N_{med}$ | $N_{\phi}$ |
---|---|
53280 | 32 |
# of Cores | # of parallel FEM | # of parallel FFT | Elapsed time | Solver time | Comm. time | Efficiency | SUs |
---|---|---|---|---|---|---|---|
64 | 8 | 8 | 3.26451 | 0.112633 | 0.0521764 | 0.958659 | 580.357 |
64 | 16 | 4 | 5.39355 | 0.134547 | 0.0933418 | 0.580239 | 958.854 |
128 | 8 | 16 | 1.85212 | 0.108784 | 0.0344249 | 0.844854 | 658.533 |
128 | 16 | 8 | 1.84533 | 0.0709971 | 0.0290995 | 0.847966 | 656.116 |
256 | 8 | 32 | 1.18584 | 0.427434 | 0.0484521 | 0.659775 | 843.263 |
256 | 16 | 16 | 1.04266 | 0.0631451 | 0.023694 | 0.750375 | 741.448 |
256 | 32 | 8 | 1.20359 | 0.0506269 | 0.0211725 | 0.650045 | 855.886 |
512 | 16 | 32 | 0.707405 | 0.248459 | 0.0314082 | 0.552998 | 1006.09 |
512 | 32 | 16 | 0.732251 | 0.0445155 | 0.0205402 | 0.534234 | 1041.42 |
512 | 64 | 8 | 0.649872 | 0.119073 | 0.00713214 | 0.601955 | 924.262 |
$N_{med}$ | $N_{\phi}$ |
---|---|
132587 | 128 |
# of Cores | # of parallel FEM | # of parallel FFT | Elapsed time | Solver time | Comm. time | Efficiency | SUs |
---|---|---|---|---|---|---|---|
512 | 32 | 16 | 5.347 | 0.195319 | 0.162323 | 0.852482 | 7604.62 |
512 | 16 | 32 | 4.55822 | 0.225011 | 0.18032 | 1 | 6482.8 |
1024 | 64 | 16 | 3.99642 | 0.162858 | 0.117593 | 0.570288 | 11367.6 |
1024 | 32 | 32 | 2.30393 | 0.126079 | 0.107139 | 0.989228 | 6553.4 |
1024 | 16 | 64 | 3.2378 | 0.214095 | 0.144102 | 0.703906 | 9209.75 |
2048 | 64 | 32 | 1.40748 | 0.0818107 | 0.0677741 | 0.809641 | 8007.01 |
2048 | 32 | 64 | 1.51652 | 0.111759 | 0.0867726 | 0.75143 | 8627.29 |
2048 | 16 | 128 | 2.39379 | 0.338092 | 0.107656 | 0.476046 | 13618 |
4096 | 128 | 32 | 1.02445 | 0.0618993 | 0.0500832 | 0.55618 | 11655.9 |
4096 | 64 | 64 | 0.954898 | 0.0737486 | 0.0606541 | 0.59669 | 10864.6 |
4096 | 32 | 128 | 1.12072 | 0.180406 | 0.0659313 | 0.508405 | 12751.3 |
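The Efficiency column of the table above is consistent with the usual strong-scaling definition, in which the core-seconds of a reference run are divided by the core-seconds of the run in question. This is inferred from the tabulated values, with the 512-core run using 32 parallel FFT (efficiency 1) as the reference; the function name below is illustrative.

```python
def parallel_efficiency(t_ref, n_ref, t, n):
    """Strong-scaling efficiency relative to a reference run.

    t_ref, n_ref: elapsed time per step and core count of the reference run
    t, n:         elapsed time per step and core count of the run to evaluate
    """
    return (t_ref * n_ref) / (t * n)


# Reference: 512 cores, 16 parallel FEM x 32 parallel FFT
T_REF, N_REF = 4.55822, 512

# 1024 cores (32 x 32); the table lists an efficiency of 0.989228
print(parallel_efficiency(T_REF, N_REF, 2.30393, 1024))

# 4096 cores (64 x 64); the table lists an efficiency of 0.59669
print(parallel_efficiency(T_REF, N_REF, 0.954898, 4096))
```

The reference run for the smaller case ($N_{med}$ = 53280) is not among the listed rows, so its efficiencies cannot be recomputed from that table alone.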
Elapsed (wall clock) time for the strong scaling. The numbers show the degree of parallelization for the FFT.
Parallel efficiency for the strong scaling. The numbers show the degree of parallelization for the FFT.
# of Cores | # of parallel FEM | # of parallel FFT | $N_{med}$ | $N_{\phi}$ | Elapsed time | Solver time | Comm. time | SUs |
---|---|---|---|---|---|---|---|---|
128 | 32 | 4 | 132587 | 16 | 4.45475 | 0.100929 | 0.0287522 | 1583.91 |
256 | 32 | 8 | 132587 | 32 | 2.82371 | 0.101547 | 0.0439264 | 2007.97 |
512 | 32 | 16 | 132587 | 64 | 2.41257 | 0.117023 | 0.0525979 | 3431.22 |
1024 | 32 | 32 | 132587 | 128 | 2.30393 | 0.126079 | 0.107139 | 6553.4 |
2048 | 32 | 64 | 132587 | 256 | 2.39006 | 0.126997 | 0.149923 | 13596.8 |
Elapsed time for the weak scaling in the zonal direction.
# of Cores | # of parallel FEM | # of parallel FFT | $N_{med}$ | $N_{\phi}$ | Elapsed time | Solver time | Comm. time | SUs |
---|---|---|---|---|---|---|---|---|
256 | 16 | 16 | 7620 | 64 | 0.498492 | 0.0331437 | 0.0233114 | 354.484 |
1024 | 64 | 16 | 30667 | 64 | 0.734665 | 0.0432489 | 0.0237405 | 2089.71 |
2304 | 144 | 16 | 67412 | 64 | 1.05416 | 0.0557186 | 0.0527788 | 6746.65 |
4096 | 256 | 16 | 119590 | 64 | 1.26638 | 0.0689046 | 0.0430469 | 14408.5 |
Elapsed time for the weak scaling in the meridional direction.
The dotted line shows $O(N_{core}^{1/2})$ scaling.
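For comparison with the square-root scaling, the sketch below computes a reference curve $t_0 \sqrt{N_{core}/N_0}$ next to the measured elapsed times. Anchoring the curve at the smallest run is an assumption made here; the source does not say how the dotted line in the plot was normalized. The core count of each run is taken as the product of parallel FEM and parallel FFT.

```python
import math

# (parallel FEM, parallel FFT, elapsed time per step) from the
# meridional weak-scaling table above
runs = [
    (16, 16, 0.498492),
    (64, 16, 0.734665),
    (144, 16, 1.05416),
    (256, 16, 1.26638),
]


def sqrt_reference(n_cores, n0, t0):
    """Reference time growing as the square root of the core count."""
    return t0 * math.sqrt(n_cores / n0)


# Anchor the reference curve at the smallest run (an assumption)
n0 = runs[0][0] * runs[0][1]
t0 = runs[0][2]

for fem, fft, t in runs:
    n = fem * fft  # total cores = FEM subdomains x FFT parallelization
    ref = sqrt_reference(n, n0, t0)
    print(f"{n:5d} cores: measured {t:.3f}, sqrt reference {ref:.3f}")
```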