Computational Infrastructure for Geodynamics Wiki

Compile options

F90OPTFLAGS = -O3 -warn all -g -xhost -openmp

Notes

Nonlinear terms are calculated twice for each time step
All processes hold the full matrix for all spherical harmonic degrees
LU decomposition is done for the full matrix
Time integration is done by a banded matrix solver

Definition of columns

Name	Description
# of Cores	Number of used CPU cores
# of Processes	Number of MPI processes
# of Threads	Number of threads for each process (listed as # of SMP in the tables)
$l_{max}$	Truncation level for spherical harmonics
$(N_{r},N_{\theta},N_{\phi})$	Number of grid points in spherical coordinates
Elapsed	Elapsed (wall clock) time for one time step
Nonlinear	Elapsed (wall clock) time for nonlinear terms (including communications)
Solver	Elapsed (wall clock) time for linear calculation
Comm.	Elapsed (wall clock) time for data communication
Init.	Elapsed (wall clock) time for initialization
Efficiency	Parallel efficiency
SUs	Service units for $10^{4}$ time steps (core hours)
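
As a rough sketch, the SUs column can be related to the Elapsed column by charging every core for $10^{4}$ time steps. The function name `service_units` is hypothetical, and the formula itself is an assumption: it agrees with the SUs values listed for the strong scaling runs, but other runs may have been charged differently (for example, per full node).

```python
def service_units(elapsed_per_step, n_cores, n_steps=10_000):
    """Core hours for n_steps time steps (assumed charging rule)."""
    total_seconds = elapsed_per_step * n_steps
    return total_seconds / 3600.0 * n_cores


# Strong scaling run on 64 cores: 6.40703 s per step (the table lists 1139.03 SUs)
su_64 = service_units(6.40703, 64)
print(f"{su_64:.2f} core hours")
```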

Single Processor Result

$l_{max}$	$(N_{r},N_{\theta},N_{\phi})$	Elapsed	Nonlinear	Solver	Comm.	Init.	SUs
47	(73,72,144)	0.678760	0.488277	0.190479	0.029903	2.4152	30.1671

Strong Scaling Results

$l_{max}$	$(N_{r},N_{\theta},N_{\phi})$
255	(256,384,768)

# of Cores	# of Processes	# of SMP	Elapsed	Nonlinear	Solver	Comm.	Init.	Efficiency	SUs
64	8	8	6.40703	5.89877	0.508254	1.07863	3554.5	1	1139.03
128	16	8	3.54131	3.2940	0.247309	0.890222	3552.51	0.904612	1259.13
256	32	8	1.86101	1.7352	0.125808	0.475738	3550.63	0.860692	1323.39
1024	64	8	1.04298	0.977307	0.0656672	0.361399	3552.23	0.383937	2966.7

Elapsed (wall clock) time for the strong scaling. The dotted line shows ideal scaling.

Parallel Efficiency for the strong scaling.
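
The Efficiency column of the strong scaling table appears to be measured relative to the 64-core run. The sketch below assumes the usual definition, $(T_{ref} N_{ref}) / (T N)$, and recomputes the column from the Elapsed values; the helper name `parallel_efficiency` is hypothetical.

```python
def parallel_efficiency(t_ref, n_ref, t, n):
    """Efficiency relative to a reference run: (t_ref * n_ref) / (t * n)."""
    return (t_ref * n_ref) / (t * n)


# (cores, elapsed seconds per step) taken from the strong scaling table
runs = [(64, 6.40703), (128, 3.54131), (256, 1.86101), (1024, 1.04298)]
n_ref, t_ref = runs[0]

efficiencies = {n: parallel_efficiency(t_ref, n_ref, t, n) for n, t in runs}
for n, eff in efficiencies.items():
    print(f"{n:5d} cores: efficiency {eff:.4f}")
```

Under this assumption the recomputed values agree with the listed column to about four decimal places.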

Weak Scaling Results

# of Cores	# of Processes	# of SMP	$l_{max}$	$(N_{r},N_{\theta},N_{\phi})$	Elapsed	Nonlinear	Solver	Comm.	Init.	SUs
2	1	2	31	(256,48,96)	0.551937	0.391925	0.160007	0.0214999	68.4944	15.3479
8	1	8	63	(256,96,192)	1.12191	0.939082	0.182825	0.0476302	92.4576	67.197
32	4	8	127	(256,192,384)	1.81109	1.62482	0.186259	0.27049	271.017	360.352
128	16	8	255	(256,384,768)	2.62543	2.43587	0.189552	0.624057	969.922	1490.03
512	64	8	511	(256,768,1536)	4.65903	4.45921	0.199815	2.01773	3545.45	8470.44

Elapsed time for the weak scaling in the horizontal resolution. Elapsed time for each time step is plotted in black, and initialization time is plotted in red. Dotted lines show $O(N_{core}^{1/2})$ scaling.
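
One way to compare the measured weak scaling with the $N_{core}^{1/2}$ reference is to fit a power law to the Elapsed column. The sketch below is only an illustration: it uses an ordinary least-squares fit in log-log space, and the helper name `loglog_slope` is hypothetical.

```python
import math


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den


# Cores and elapsed seconds per step from the horizontal weak scaling table
cores = [2, 8, 32, 128, 512]
elapsed = [0.551937, 1.12191, 1.81109, 2.62543, 4.65903]

slope = loglog_slope(cores, elapsed)
print(f"fitted exponent: {slope:.2f}")
```

A fitted exponent below 0.5 would mean the elapsed time grows somewhat more slowly than the dotted reference line.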

# of Cores	# of Processes	# of SMP	$l_{max}$	$(N_{r},N_{\theta},N_{\phi})$	Elapsed	Nonlinear	Solver	Comm.	Init.	SUs
96	12	8	383	(64,576,1152)	2.76226	2.59632	0.165937	0.509965	43.1944	736.604
192	24	8	383	(128,576,1152)	2.81134	2.63732	0.174011	0.613411	484.779	1499.38
384	48	8	383	(256,576,1152)	3.63671	3.44881	0.187897	1.27930	6423.87	3879.16

Elapsed time for the weak scaling in the radial resolution. The results with 4 OpenMP threads are shown.
