ALI Performance Tests on Weaver

Introduction

These tests run the Greenland Ice Sheet (GIS) problem in Albany Land Ice (ALI) on Nvidia Tesla V100 GPUs on Weaver. Note: Waterman was moved to a different network and renamed Weaver. The last Waterman test was executed on 6/16/2020.

Architectures:

Name          Weaver (P9/V100)
CPU           Dual-socket IBM POWER9
GPU           Nvidia Tesla V100
Cores/Node    40
Threads/Core  8
GPUs/Node     4
Memory/Node   319 GB
Interconnect  Mellanox EDR IB (100 Gb/s)
Compiler      gcc 7.2.0
GPU Compiler  cuda 10.1.105
MPI           openmpi 4.0.1

Cases:

Case Name                 Processes (np)  Description
green-3-20km_vel_fea_1ws  4               Unstructured 3-20km GIS, velocity problem, finite element assembly only, no memoization
green-3-20km_vel_fea_mem  4               Unstructured 3-20km GIS, velocity problem, finite element assembly only, memoization
green-3-20km_vel_muk      4               Unstructured 3-20km GIS, velocity problem, MueLu w/ Kokkos
green-3-20km_ent_fea_1ws  4               Unstructured 3-20km GIS, enthalpy problem, finite element assembly only, no memoization
green-3-20km_ent_fea_mem  4               Unstructured 3-20km GIS, enthalpy problem, finite element assembly only, memoization
green-3-20km_ent_muk      4               Unstructured 3-20km GIS, enthalpy problem, MueLu w/ Kokkos
humboldt-1-10km_cop_fea   4               Unstructured 1-10km Humboldt Glacier, coupled problem, finite element assembly only

Timers:

Timer Name                             Level  Description
Albany Total Time                      0      Total wall-clock time of simulation
Albany: Setup Time                     1      Preprocessing
Albany: Total Fill Time                1      Finite element assembly
Albany Fill: Residual                  2      Residual assembly
Albany Residual Fill: Evaluate         3      Compute the residual, local/global assembly
Albany Residual Fill: Export           3      Update global residual across MPI ranks
Albany Fill: Jacobian                  2      Jacobian assembly
Albany Jacobian Fill: Evaluate         3      Compute the Jacobian, local/global assembly
Albany Jacobian Fill: Export           3      Update global Jacobian across MPI ranks
NOX Total Preconditioner Construction  1      Construct preconditioner
NOX Total Linear Solve                 1      Linear solve
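
The Level column can be read as a nesting depth: each timer sits under the most recent timer listed with a smaller level. The following Python sketch rebuilds that parent/child structure from the listing. It assumes this reading of the levels is correct and that levels never increase by more than one between consecutive rows; the function name `build_timer_parents` is hypothetical and not part of Albany.

```python
# (name, level) pairs copied from the timer listing, in listed order.
TIMERS = [
    ("Albany Total Time", 0),
    ("Albany: Setup Time", 1),
    ("Albany: Total Fill Time", 1),
    ("Albany Fill: Residual", 2),
    ("Albany Residual Fill: Evaluate", 3),
    ("Albany Residual Fill: Export", 3),
    ("Albany Fill: Jacobian", 2),
    ("Albany Jacobian Fill: Evaluate", 3),
    ("Albany Jacobian Fill: Export", 3),
    ("NOX Total Preconditioner Construction", 1),
    ("NOX Total Linear Solve", 1),
]


def build_timer_parents(rows):
    """Map each timer name to its parent timer name (None for the root).

    Assumption: a timer's parent is the closest preceding timer whose
    level is one less than its own.
    """
    parents = {}
    stack = []  # stack[i] is the most recent timer seen at level i
    for name, level in rows:
        del stack[level:]  # drop timers at this level or deeper
        parents[name] = stack[-1] if stack else None
        stack.append(name)
    return parents


parents = build_timer_parents(TIMERS)
for name, level in TIMERS:
    # Indent by level to show the nesting.
    print("  " * level + name)
```

Under this reading, the two Evaluate/Export pairs are components of the residual and Jacobian fills, and the NOX timers sit directly under the total time rather than under the fill time.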

Specifications

Performance Timelines

Plot of wall-clock times or memory for nightly runs

Changepoints are estimated using a generalized likelihood ratio method on each timer, and then merged over all timers for a given test case.
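
As a rough illustration of the idea, the sketch below scans a series of nightly timings for a single shift in mean using a likelihood-ratio-type statistic with a pooled variance estimate, in the spirit of the changepoint formulation in the references. It is not the implementation behind these plots: the function name `glr_changepoint`, the `min_segment` parameter, and the sample timings are all hypothetical, no detection threshold is applied, and the merging of changepoints across timers is not shown.

```python
import math


def glr_changepoint(values, min_segment=2):
    """Return (split_index, statistic) for the most likely single mean shift.

    For each candidate split k, the series is divided into values[:k] and
    values[k:]. The statistic is the absolute two-sample t-type statistic
    with pooled variance; the split with the largest value is returned.
    """
    n = len(values)
    if min_segment < 2 or n < 2 * min_segment:
        raise ValueError("need min_segment >= 2 and at least 2 * min_segment points")

    best_k, best_stat = None, -1.0
    for k in range(min_segment, n - min_segment + 1):
        left, right = values[:k], values[k:]
        mean_l = sum(left) / k
        mean_r = sum(right) / (n - k)
        # Pooled within-segment sum of squares, n - 2 degrees of freedom.
        ss = sum((v - mean_l) ** 2 for v in left)
        ss += sum((v - mean_r) ** 2 for v in right)
        pooled_var = ss / (n - 2)
        if pooled_var == 0.0:
            # Both segments are perfectly flat.
            stat = math.inf if mean_l != mean_r else 0.0
        else:
            scale = math.sqrt(k * (n - k) / n)
            stat = abs(mean_r - mean_l) * scale / math.sqrt(pooled_var)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


# Made-up wall-clock times (seconds) with a slowdown starting at the sixth run.
times = [10.0, 10.1, 9.9, 10.0, 10.2, 12.0, 12.1, 11.9, 12.0, 12.2]
k, stat = glr_changepoint(times)
print(f"most likely change before run index {k}, statistic {stat:.2f}")
```

In practice the statistic would be compared against a threshold chosen for an acceptable false-alarm rate before a changepoint is reported, and the scan would be repeated as new nightly results arrive.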


Pollak, Moshe; Siegmund, D. Sequential Detection of a Change in a Normal Mean when the Initial Value is Unknown. Ann. Statist. 19 (1991), no. 1, 394--416. doi:10.1214/aos/1176347990. https://projecteuclid.org/euclid.aos/1176347990

Siegmund, D.; Venkatraman, E. S. Using the Generalized Likelihood Ratio Statistic for Sequential Detection of a Change-Point. Ann. Statist. 23 (1995), no. 1, 255--271. doi:10.1214/aos/1176324466. https://projecteuclid.org/euclid.aos/1176324466

Hawkins, D. M.; Zamba, K. D. Statistical Process Control for Shifts in Mean or Variance Using a Change Point Formulation. Technometrics 47 (2005), 164--173.

Hawkins, D. M.; Qiu, P.; Kang, C. W. The Changepoint Model for Statistical Process Control. Journal of Quality Technology 35 (2003), no. 4, 355--366.