Computational Infrastructure for Geodynamics Wiki

Compile options

F90FLAGS = -FR -fpp -r8 -O3 -xAVX -shared_intel -I$(MKLROOT)/include -I$(MKLROOT)/include/fftw

Definition of columns

Name	Definition
# of Cores	Number of used CPU cores
# of Processes	Number of MPI processes
# of Threads	Number of threads for each process (the "# of SMP" column in the tables below)
$l_{max}$	Truncation level for spherical harmonics
$N_{C}$	Truncation level for Chebyshev polynomials
$(N_{r},N_{\theta},N_{\phi})$	Number of grid points in spherical coordinates
Elapsed	Elapsed (wall clock) time for one time step
Nonlinear	Elapsed (wall clock) time for nonlinear terms (including communications)
Solver	Elapsed (wall clock) time for linear calculation
Comm.	Elapsed (wall clock) time for data communication
Efficiency	Parallel efficiency relative to the 64-core run
SUs	Service units (core hours) for $10^{4}$ time steps
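
The SUs column appears to follow directly from the elapsed time per step: the tabulated values match elapsed seconds × cores × $10^{4}$ steps ÷ 3600. Below is a minimal Python sketch of that arithmetic. The function name `service_units` and its parameters are illustrative and are not part of the benchmarked code, and the formula itself is inferred from the numbers rather than stated on this page.

```python
def service_units(elapsed_per_step, cores, steps=10_000):
    """Core hours consumed by `steps` time steps.

    Assumption: SUs = elapsed seconds per step * cores * steps / 3600.
    This is inferred from the tabulated values, not stated on the page.
    """
    return elapsed_per_step * cores * steps / 3600.0


# 64-core strong scaling run: 5.9551 s per time step
print(round(service_units(5.9551, 64), 2))  # prints 1058.68
```

The single processor entry (71.3243) equals `service_units(1.604797, 16)`, which would be consistent with 16 cores being charged for that run, although the page does not say so.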

Single Processor Result

$N_{C}$	$l_{max}$	$(N_{r},N_{\theta},N_{\phi})$	Elapsed	Nonlinear	Solver	Comm.	SUs
48	47	( 73,72,144)	1.604797	1.56274	0.042059	0.508469	71.3243

Strong Scaling Results

$N_{C}$	$l_{max}$	$(N_{r},N_{\theta},N_{\phi})$
	255	(512,384,768)

# of Cores	# of Processes	# of SMP	Elapsed	Nonlinear	Solver	Comm.	Efficiency	SUs
64	64	1	5.9551	2.97054	0.77414	2.21042	1	1058.68
128	128	1	3.12294	1.67361	0.436457	1.01287	0.953444	1110.38
256	256	1	1.49993	0.793621	0.224334	0.481976	0.992562	1066.62
512	512	1	0.901664	0.441342	0.145236	0.315087	0.82557	1282.37
1024	1024	1	0.457387	0.219177	0.0752801	0.16293	0.813738	1301.01
2048	2048	1	0.322376	0.136652	0.0529017	0.132822	0.577267	1833.96
4096	4096	1	0.185785	0.0691138	0.0287718	0.0878993	0.500839	2113.82
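
The Efficiency column is consistent with strong scaling efficiency measured against the 64-core run, that is $(T_{64} \times 64) / (T_{N} \times N)$. The sketch below recomputes it from the Elapsed column of the table above. The helper name `parallel_efficiency` and the choice of the first row as the baseline are assumptions inferred from the numbers, not something the page states.

```python
def parallel_efficiency(base_cores, base_elapsed, cores, elapsed):
    """Strong scaling efficiency relative to a baseline run.

    Assumption: efficiency = (base_elapsed * base_cores) / (elapsed * cores).
    """
    return (base_elapsed * base_cores) / (elapsed * cores)


# (cores, elapsed seconds per time step) copied from the table above
runs = [(64, 5.9551), (128, 3.12294), (256, 1.49993), (512, 0.901664),
        (1024, 0.457387), (2048, 0.322376), (4096, 0.185785)]

base_cores, base_elapsed = runs[0]
for cores, elapsed in runs:
    eff = parallel_efficiency(base_cores, base_elapsed, cores, elapsed)
    print(f"{cores:5d}  {eff:.4f}")
```

Each printed value can be compared against the Efficiency column to check the assumed definition.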

Elapsed (wall clock) time for the strong scaling. The numbers indicate the number of OpenMP threads. The dotted line shows ideal scaling.

Parallel efficiency for the strong scaling. The numbers indicate the number of OpenMP threads.

Weak Scaling Results

# of Cores	# of Processes	# of SMP	$l_{max}$	$(N_{r},N_{\theta},N_{\phi})$	Elapsed	Nonlinear	Solver	Comm.	SUs
16	4	4	31	(513,48,96)	0.345327	0.334976	0.0103503	0.0346996	15.3479
64	16	4	63	(513,96,192)	0.377983	0.367511	0.0104712	0.0701478	67.197
256	64	4	127	(513,192,384)	0.506746	0.496344	0.0104008	0.215548	360.352
1024	256	4	255	(513,384,768)	0.523838	0.513385	0.0104525	0.175344	1490.03
4096	1024	4	511	(513,768,1536)	0.744473	0.733799	0.010673	0.386788	8470.44

Elapsed time for the weak scaling in the horizontal resolution. Results with 4 OpenMP threads are shown. The dotted line shows ideal scaling for the Legendre transform.

# of Cores	# of Processes	# of SMP	$l_{max}$	$(N_{r},N_{\theta},N_{\phi})$	Elapsed	Nonlinear	Solver	Comm.	SUs
128	32	4	255	(33,384,768)	0.253204	0.248189	0.00501417	0.100608	90.0281
256	64	4	255	(65,384,768)	0.261203	0.256194	0.00500897	0.0883585	185.744
512	128	4	255	(129,384,768)	0.266168	0.261061	0.00510643	0.0962178	378.549
1024	256	4	255	(257,384,768)	0.303394	0.298234	0.00515566	0.145423	760.043
2048	512	4	255	(513,384,768)	0.276864	0.271508	0.00535592	0.118606	1575.05
4096	1024	4	255	(1025,384,768)	0.279425	0.27406	0.00536459	0.127257	3179.23

Elapsed time for the weak scaling in the radial resolution. The results with 4 OpenMP threads are shown.

