Computational Infrastructure for Geodynamics Wiki

Compile options

F90OPTFLAGS = -r8 -i4 -ftz -IPF_fma -IPF_fltacc -WB -O2

Definition of columns

Name	Definition
# of Cores	Number of CPU cores used
# of Processes	Number of MPI processes
# of Threads	Number of threads per process
$N_{c}$	Truncation level for Chebyshev polynomials
$l_{max}$	Truncation level for spherical harmonics
$(N_{r},N_{\theta},N_{\phi})$	Number of grid points in spherical coordinates
Elapsed	Elapsed (wall clock) time for one time step
Nonlinear	Elapsed (wall clock) time for nonlinear terms (including communications)
Solver	Elapsed (wall clock) time for linear calculation
Comm.	Elapsed (wall clock) time for data communication
Efficiency	Parallel efficiency
SUs	Service units for $10^{4}$ time steps (core hours)
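
The page does not state the formulas behind the Efficiency and SUs columns. The sketch below is an assumption inferred from the tabulated values: efficiency is measured relative to the smallest run in each table (4 cores), and SUs assume the elapsed time is in seconds per time step. The function names are hypothetical and used only for illustration.

```python
def parallel_efficiency(t_ref, p_ref, t, p):
    """Strong-scaling efficiency relative to a reference run.

    t_ref, p_ref: elapsed time per step and core count of the reference run.
    t, p:         elapsed time per step and core count of the run being rated.
    A value of 1 means ideal scaling; above 1 means superlinear scaling.
    """
    return (t_ref * p_ref) / (t * p)


def service_units(elapsed_per_step, cores, steps=10_000):
    """Core hours consumed by `steps` time steps (elapsed time in seconds)."""
    return elapsed_per_step * steps * cores / 3600.0


if __name__ == "__main__":
    # Values taken from the (N_c, l_max) = (128, 128) strong scaling table.
    eff_8 = parallel_efficiency(13.6562, 4, 5.27784, 8)
    su_4 = service_units(13.6562, 4)
    print(f"efficiency on 8 cores: {eff_8:.5f}")
    print(f"SUs on 4 cores:        {su_4:.3f}")
```

With these assumed definitions, the 8-core efficiency and the 4-core SUs agree with the tabulated 1.29373 and 151.736 to the printed precision.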

Single Processor Result

$N_{c}$	$l_{max}$	$(N_{r},N_{\theta},N_{\phi})$	Elapsed	Nonlinear	Solver	Comm.	SUs
47	42	(48,64,129)	0.460322	0.378743	0.0565741	0.005765

Strong Scaling Results

$N_{c}$	$l_{max}$	$(N_{r},N_{\theta},N_{\phi})$
128	128	(129,192,385)

# of Cores	# of Processes	# of SMP	Elapsed	Nonlinear	Solver	Comm.	Efficiency	SUs
4	4	1	13.6562	12.3256	1.19259	8.13115	1	151.736
8	8	1	5.27784	4.71791	0.559926	1.8018	1.29373	117.285
16	16	1	2.83756	2.56503	0.240695	1.55375	1.20317	126.114
32	32	1	1.41826	1.2796	0.12139	1.18198	1.20361	126.067
64	64	1	2.73806	2.66596	0.0632037	2.87671	0.311722	486.766
128	128	1	6.82779	6.79328	0.0298972	7.1671	0.0625029	2427.66

$N_{c}$	$l_{max}$	$(N_{r},N_{\theta},N_{\phi})$
192	128	(193,192,385)

# of Cores	# of Processes	# of SMP	Elapsed	Nonlinear	Solver	Comm.	Efficiency	SUs
4	4	1	21.845	19.1889	2.41306	12.8516	1	242.723
8	8	1	8.69087	7.22496	1.34667	2.81284	1.25678	193.13
16	16	1	4.43835	3.84567	0.535587	2.32362	1.23047	197.26
32	32	1	2.31319	2.01689	0.266718	1.90409	1.18046	205.617
64	64	1	4.14431	3.99401	0.135566	4.28234	0.329443	736.767
128	128	1	10.8246	10.7533	0.0646424	11.3395	0.0630654	3848.74

Elapsed (wall clock) time for strong scaling in the $(N_{c}, l_{max}) = (192, 128)$ case. The numbers indicate the number of OpenMP threads. The dotted line shows ideal scaling.

Parallel efficiency for strong scaling in the $(N_{c}, l_{max}) = (192, 128)$ case.
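
The two curves described in the captions can be approximated from the $(N_{c}, l_{max}) = (192, 128)$ table. The following is a minimal sketch, assuming the ideal-scaling line is anchored at the 4-core run; that anchor is an assumption, not something the page states, and the variable names are illustrative only.

```python
# (cores, elapsed time per step) from the (N_c, l_max) = (192, 128) table.
runs = [
    (4, 21.845),
    (8, 8.69087),
    (16, 4.43835),
    (32, 2.31319),
    (64, 4.14431),
    (128, 10.8246),
]

# Reference point for ideal scaling: the smallest run (assumed anchor).
ref_cores, ref_time = runs[0]


def ideal_time(cores):
    """Elapsed time if the work scaled perfectly from the reference run."""
    return ref_time * ref_cores / cores


# One row per run: cores, measured time, ideal time, efficiency.
rows = [(p, t, ideal_time(p), ideal_time(p) / t) for p, t in runs]

# The run with the shortest measured time per step.
fastest = min(runs, key=lambda run: run[1])

if __name__ == "__main__":
    print("cores  measured     ideal  efficiency")
    for p, t, t_ideal, eff in rows:
        print(f"{p:5d}  {t:8.5f}  {t_ideal:8.5f}  {eff:10.5f}")
    print(f"fastest measured run: {fastest[0]} cores")
```

For this table the shortest measured time per step is at 32 cores. At 64 and 128 cores the elapsed time rises again, which matches the sharp drop in the Efficiency column.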

Weak Scaling Results

# of Cores	# of Processes	# of SMP	$N_{c}$	$l_{max}$	$(N_{r},N_{\theta},N_{\phi})$	Elapsed	Nonlinear	Solver	Comm.	SUs
32	32	1	192	128	(193,192,385)	2.31319	2.01689	0.266718	1.90409	205.617

