[[wg:dynamo:Performance_results|Back to performance benchmark lists]] \\

===== Compile options =====

F90OPTFLAGS = -O3 -xhost

===== Definition of columns =====

^ Name ^ Definition ^
| # of Cores | Number of CPU cores used |
| # of Processes | Number of MPI processes |
| # of SMP | Number of threads for each process |
| $l_{max}$ | Truncation level for spherical harmonics |
| $(N_{r},N_{\theta},N_{\phi})$ | Number of grid points in spherical coordinates |
| Elapsed | Elapsed (wall clock) time for one time step |
| Nonlinear | Elapsed (wall clock) time for nonlinear terms (including communications) |
| Solver | Elapsed (wall clock) time for linear calculation |
| Comm. | Elapsed (wall clock) time for data communication |
| Efficiency | Parallel efficiency |
| SUs | Service units for $10^{4}$ time steps (core hours) |

===== Single Processor Result =====

^ $l_{max}$ ^ $(N_{r},N_{\theta},N_{\phi})$ ^ Elapsed ^ Nonlinear ^ Solver ^ Comm. ^ SUs ^
| 47 | (73,72,144) | 1.604797 | 1.56274 | 0.042059 | 0.508469 | 71.3243 |

===== Strong Scaling Results =====

^ $l_{max}$ ^ $(N_{r},N_{\theta},N_{\phi})$ ^
| 47 | (48,72,144) |

^ # of Cores ^ # of Processes ^ # of SMP ^ Elapsed ^ Nonlinear ^ Solver ^ Comm. ^ Efficiency ^ SUs ^
| 1 | 1 | 1 | 1.04900 | 0.880913 | 0.168083 | 0.360225 | 1.0 | 2.91389 |
| 2 | 2 | 1 | 0.538092 | 0.453756 | 0.0843343 | 0.194296 | 0.97474 | 2.9894 |
| 4 | 4 | 1 | 0.274424 | 0.23125 | 0.0431727 | 0.0996035 | 0.955636 | 3.04916 |
| 8 | 8 | 1 | 0.145301 | 0.122894 | 0.0224057 | 0.0558862 | 0.902437 | 3.22891 |
| 16 | 16 | 1 | 0.095041 | 0.0821946 | 0.0128446 | 0.0449899 | 0.689833 | 4.22404 |

^ $l_{max}$ ^ $(N_{r},N_{\theta},N_{\phi})$ ^
| 127 | (256,192,384) |

^ # of Cores ^ # of Processes ^ # of SMP ^ Elapsed ^ Nonlinear ^ Solver ^ Comm. ^ Efficiency ^ SUs ^
| 4 | 4 | 1 | 17.9454 | 16.918 | 1.02739 | 6.01468 | 1.45157 | 1749.79 |
| 8 | 8 | 1 | 9.98835 | 9.45138 | 0.536971 | 3.40571 | 1.30397 | 1749.79 |
| 16 | 16 | 1 | 6.51225 | 6.2059 | 0.00297473 | 2.40569 | 1.0 | 1749.79 |
| 32 | 32 | 1 | 3.30412 | 3.15372 | 0.00297473 | 1.23981 | 0.985473 | 1749.79 |
| 64 | 64 | 1 | 1.71612 | 1.64141 | 0.00297473 | 0.6747 | 0.948685 | 1749.79 |
| 128 | 128 | 1 | 0.91125 | 0.870656 | 0.00297473 | 0.383074 | 0.893313 | 1749.79 |

^ $l_{max}$ ^ $(N_{r},N_{\theta},N_{\phi})$ ^
| 255 | (513,384,768) |

^ # of Cores ^ # of Processes ^ # of SMP ^ Elapsed ^ Nonlinear ^ Solver ^ Comm. ^ Efficiency ^ SUs ^
| 128 | 128 | 1 | 10.4146 | 10.0946 | 0.320031 | 4.01002 | 1.0 | 3702.97 |
| 256 | 256 | 1 | 5.4918 | 5.33942 | 0.152376 | 2.2764 | 0.948195 | 3905.28 |

{{wg:dynamo:Performance_results:ETH:ETH_Elapsed.png?480}}\\
Elapsed (wall clock) time for the strong scaling. Ideal scaling is shown by the dotted line.

{{wg:dynamo:Performance_results:ETH:ETH_efficiency.png?480}}\\
Parallel efficiency for the strong scaling.

===== Weak Scaling Results =====

^ # of Cores ^ # of Processes ^ # of SMP ^ $l_{max}$ ^ $(N_{r},N_{\theta},N_{\phi})$ ^ Elapsed ^ Nonlinear ^ Solver ^ Comm. ^ SUs ^
| 4 | 4 | 1 | 31 | (256,48,96) | 0.367794 | 0.302528 | 0.0652638 | 0.13866 | 4.0866 |
| 16 | 4 | 1 | 63 | (256,96,192) | 0.829839 | 0.757857 | 0.0719786 | 0.333987 | 36.8817 |
| 64 | 16 | 1 | 127 | (256,192,384) | 1.71612 | 1.64141 | 0.0747133 | 0.674700 | 305.089 |
| 256 | 64 | 1 | 255 | (256,384,768) | 2.74791 | 2.67373 | 0.0741758 | 1.13636 | 1954.07 |

{{wg:dynamo:Performance_results:ETH:eth_weak_sph.png?480}}\\
Elapsed time for the weak scaling in the horizontal resolution. The ideal scaling for the Legendre transform is shown by the dotted line. \\

^ # of Cores ^ # of Processes ^ # of SMP ^ $l_{max}$ ^ $(N_{r},N_{\theta},N_{\phi})$ ^ Elapsed ^ Nonlinear ^ Solver ^ Comm. ^ SUs ^
| 32 | 32 | 1 | 255 | (64,384,768) | 5.15760 | 4.98965 | 0.167950 | 1.90546 | 458.454 |
| 64 | 64 | 1 | 255 | (128,384,768) | 5.15654 | 4.99703 | 0.159511 | 1.92557 | 916.718 |
| 128 | 128 | 1 | 255 | (256,384,768) | 5.30425 | 5.14686 | 0.157383 | 2.07861 | 1885.96 |
| 256 | 256 | 1 | 255 | (512,384,768) | 5.49180 | 5.33942 | 0.152376 | 2.2764 | 3905.28 |

{{wg:dynamo:Performance_results:ETH:eth_weak_r.png?480}}\\
Elapsed time for the weak scaling in the radial resolution. Ideal scaling corresponds to a constant elapsed time.\\

[[wg:dynamo:Performance_results|Back to performance benchmark lists]] \\
[[wg:dynamo:Performance_results:ETH:files|files]]
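
The Efficiency and SUs columns can be cross-checked from the Elapsed column. The sketch below is not taken from the benchmark scripts; it assumes two formulas inferred from the tabulated values: SUs = cores × elapsed seconds per step × $10^{4}$ steps / 3600, and efficiency = (reference cores × reference elapsed) / (cores × elapsed), with the reference taken as the row whose efficiency is listed as 1.0. The function names are hypothetical, and the input rows are the $l_{max}$ = 47 strong scaling results.

```python
SECONDS_PER_HOUR = 3600.0


def service_units(cores, elapsed_per_step, steps=10_000):
    """Core hours for `steps` time steps (assumed definition of the SUs column)."""
    return cores * elapsed_per_step * steps / SECONDS_PER_HOUR


def parallel_efficiency(ref_cores, ref_elapsed, cores, elapsed):
    """Strong-scaling efficiency relative to a reference run (assumed definition)."""
    return (ref_cores * ref_elapsed) / (cores * elapsed)


# (cores, elapsed seconds per step) from the l_max = 47 strong scaling table
rows = [
    (1, 1.04900),
    (2, 0.538092),
    (4, 0.274424),
    (8, 0.145301),
    (16, 0.095041),
]

# The 1-core run is the reference (its efficiency is listed as 1.0)
ref_cores, ref_elapsed = rows[0]

for cores, elapsed in rows:
    eff = parallel_efficiency(ref_cores, ref_elapsed, cores, elapsed)
    su = service_units(cores, elapsed)
    print(f"{cores:3d}  efficiency={eff:.6f}  SUs={su:.5f}")
```

Under these assumptions the printed values agree with the Efficiency and SUs columns of that table to within rounding; for the $l_{max}$ = 127 and 255 tables the reference would be the 16-core and 128-core rows, respectively.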