Maintaining high performance in the QR factorization while scaling both problem size and parallelism
QR factorization is a fundamental linear algebra operation used in solving systems of linear equations, particularly least-squares problems, and in finding eigenvalues and eigenvectors. This thesis details the author's contribution of performance-efficient QR routines to ATLAS (Automatically Tuned Linear Algebra Software), an open-source linear algebra library intended for high-performance computing. The author has added new implementations for four types/precisions (single real, double real, single complex, and double complex) in four variants of the factorization (QR, RQ, QL, and LQ). QR factorization involves two operations: a panel factorization and a trailing matrix update. A statically blocked algorithm is used for the full matrix factorization, and a recursive formulation is implemented for the QR panel factorization, providing more robust performance. Together, these techniques yield a substantial performance improvement over the LAPACK version.
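The structure described above (a fixed-width panel loop, a recursive panel factorization, and a trailing matrix update) can be illustrated with a minimal sketch. This is not the ATLAS implementation: for brevity it orthogonalizes with block Gram-Schmidt projections, whereas LAPACK-style routines use Householder reflectors, and it assumes a real matrix with full column rank and at least as many rows as columns. The names `update`, `factor_panel`, `blocked_qr`, and the panel width `nb` are hypothetical and chosen only for this illustration.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def update(cols, R, q0, q1, t0, t1):
    """Project the orthonormal columns q0..q1-1 out of columns t0..t1-1."""
    for j in range(t0, t1):
        # Coefficients of column j along the finished columns (a block of R),
        # all computed before column j is modified.
        coeffs = [dot(cols[i], cols[j]) for i in range(q0, q1)]
        for i, c in zip(range(q0, q1), coeffs):
            R[i][j] = c
            cols[j] = [a - c * q for a, q in zip(cols[j], cols[i])]

def factor_panel(cols, R, j0, j1):
    """Recursively orthonormalize columns j0..j1-1 in place."""
    if j1 - j0 == 1:
        # Base case: a single column is simply normalized.
        R[j0][j0] = math.sqrt(dot(cols[j0], cols[j0]))
        cols[j0] = [a / R[j0][j0] for a in cols[j0]]
        return
    mid = (j0 + j1) // 2
    factor_panel(cols, R, j0, mid)       # factor the left half
    update(cols, R, j0, mid, mid, j1)    # update the right half with it
    factor_panel(cols, R, mid, j1)       # factor the right half

def blocked_qr(A, nb):
    """Return (Q, R) with A = Q R, where A is given as a list of rows."""
    m, n = len(A), len(A[0])
    cols = [[float(A[i][j]) for i in range(m)] for j in range(n)]
    R = [[0.0] * n for _ in range(n)]
    for j0 in range(0, n, nb):           # static blocking: fixed panel width nb
        j1 = min(j0 + nb, n)
        factor_panel(cols, R, j0, j1)    # panel factorization
        update(cols, R, j0, j1, j1, n)   # trailing matrix update
    Q = [[cols[j][i] for j in range(n)] for i in range(m)]
    return Q, R
```

Calling `blocked_qr(A, nb)` on a tall matrix returns a `Q` with orthonormal columns and an upper-triangular `R`. Note that the same `update` routine serves both inside the recursion and as the trailing matrix update, which is why most of the arithmetic in such a scheme ends up in matrix-matrix style operations.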