We describe a minor format change for representing a symmetric band matrix AB using the same array space specified by LAPACK. In LAPACK, band codes operating on the lower part of a symmetric matrix reference matrix element (i, j) as $AB_{1+i-j,\,j}$. The format change we propose allows LAPACK band codes to reference the (i, j) element as $AB_{i,j}$. Doing this yields lower band codes that use standard
matrix terminology so that they become clearer and hence easier to understand. As a second contribution, we simplify the LAPACK Cholesky Band Factorization routine pbtrf by reducing the number of subroutine calls needed in a right-looking block factorization step from six to three. Our new routines perform exactly the same number of floating-point arithmetic operations as the current LAPACK routine pbtrf, and they almost always deliver higher performance. The experimental results show that this is especially true on SMP platforms, where parallelism is obtained via the use of level-3 multithreaded BLAS. We only consider the lower triangular case of the factorization here; the upper triangular case is currently under investigation.
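The two ways of referencing element (i, j) mentioned above can be illustrated with a small sketch. It packs the lower band of a symmetric matrix into a column-major buffer with kd+1 rows, as LAPACK lower band storage does, and then reads the buffer back in two ways. The function names (`pack_lower_band`, `get_lapack`, `get_direct`) are hypothetical. The second accessor assumes that the (i, j) reference is realized by viewing the same buffer with a leading dimension one smaller than LAPACK's; the abstract does not spell out the mechanism, so this is only one plausible reading, based on the identity (i-j) + (j-1)·ldab = (i-1) + (j-1)·(ldab-1).

```python
def pack_lower_band(a, kd):
    """Pack the lower band of symmetric `a` (list of rows) into a flat
    column-major buffer with ldab = kd + 1 rows and n columns."""
    n = len(a)
    ldab = kd + 1
    buf = [0.0] * (ldab * n)
    for j in range(1, n + 1):                      # 1-based column index
        for i in range(j, min(n, j + kd) + 1):     # rows inside the lower band
            row = 1 + i - j                        # LAPACK row index in AB
            buf[(row - 1) + (j - 1) * ldab] = a[i - 1][j - 1]
    return buf, ldab


def get_lapack(buf, ldab, i, j):
    """Reference element (i, j) the LAPACK way: AB(1+i-j, j)."""
    return buf[(i - j) + (j - 1) * ldab]


def get_direct(buf, ldab, i, j):
    """Reference element (i, j) as AB(i, j), assuming the same buffer is
    viewed with leading dimension ldab - 1 (an assumption, see lead-in)."""
    return buf[(i - 1) + (j - 1) * (ldab - 1)]


# Small symmetric band matrix whose entries encode their own indices.
n, kd = 5, 2
a = [[float(10 * max(i, j) + min(i, j)) if abs(i - j) <= kd else 0.0
      for j in range(1, n + 1)] for i in range(1, n + 1)]

buf, ldab = pack_lower_band(a, kd)

# Both accessors should agree with the dense matrix everywhere in the band.
ok = all(
    get_lapack(buf, ldab, i, j) == get_direct(buf, ldab, i, j) == a[i - 1][j - 1]
    for j in range(1, n + 1)
    for i in range(j, min(n, j + kd) + 1)
)
print(ok)
```

Both accessors touch the same memory location, so no data is moved; only the index expression in the calling code changes.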
Series: D T U Compute. Technical Report