Abstract
Implementations of QR decomposition. At the moment, these are not very fast for single large matrices, but they are serviceable. Performance is quite good on "batches" of many smaller matrices (i.e. when you map QR decomposition over an array of matrices), where "small" means less than 16x16 or 32x16.
Much of this code is based on work by Kasper Unn Weihe, Kristian Quirin Hansen, and Peter Kanstrup Larsen. See their report for details.
Synopsis
module mk_block_householder : (T: ordered_field) -> {

module mk_gram_schmidt : (T: ordered_field) -> {

Description
module mk_block_householder
QR decomposition via the blocked Householder transform. The block size affects performance, although usually only slightly; 16 is a reasonable default. At the moment, the input size must be a multiple of the block size.
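One possible way to satisfy the size restriction is sketched below, in Python rather than Futhark. It is not taken from the library. The idea is to embed a square n x n matrix A in the top-left corner of a larger matrix whose size is the next multiple of the block size, with an identity block in the bottom-right corner. For a nonsingular A, any QR decomposition of the padded matrix is block diagonal, so its top-left n x n blocks form a QR decomposition of A. The helper name `pad_to_block_multiple` is hypothetical, and the sketch assumes square input stored as a list of rows.

```python
def pad_to_block_multiple(a, block_size):
    """Embed square matrix `a` in diag(a, I), sized to a multiple of block_size.

    Hypothetical helper for illustration; assumes `a` is a square list of rows.
    """
    n = len(a)
    # Round n up to the next multiple of block_size (ceiling division).
    padded = -(-n // block_size) * block_size
    out = [[0.0] * padded for _ in range(padded)]
    # Copy the original matrix into the top-left corner.
    for i in range(n):
        for j in range(n):
            out[i][j] = float(a[i][j])
    # Fill the rest of the diagonal with ones so the padding stays nonsingular.
    for k in range(n, padded):
        out[k][k] = 1.0
    return out


a = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 10.0]]
p = pad_to_block_multiple(a, 4)
```

After decomposing the padded matrix, the caller would keep only the top-left n x n blocks of the two factors. A matrix whose size is already a multiple of the block size comes back unchanged in shape.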
module mk_gram_schmidt
QR decomposition with the Gram-Schmidt process. Note: this method is very numerically unstable.
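The instability can be illustrated with a small standalone sketch in Python (not Futhark, and not the library's code). It assumes the classical variant of Gram-Schmidt, which may differ from what this module implements, and applies it to three nearly parallel columns. The function names and the test matrix are chosen for illustration only.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def classical_gram_schmidt(columns):
    """Orthonormalise a list of column vectors with classical Gram-Schmidt."""
    qs = []
    for a in columns:
        # Classical variant: all projection coefficients are computed
        # from the original column, before anything is subtracted.
        coeffs = [dot(q, a) for q in qs]
        v = list(a)
        for c, q in zip(coeffs, qs):
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        qs.append([vi / norm for vi in v])
    return qs


# Three nearly parallel columns; eps**2 is below double-precision resolution.
eps = 1e-8
columns = [
    [1.0, eps, 0.0, 0.0],
    [1.0, 0.0, eps, 0.0],
    [1.0, 0.0, 0.0, eps],
]
q1, q2, q3 = classical_gram_schmidt(columns)

# For an orthonormal basis this would be (close to) zero.
loss = abs(dot(q2, q3))
print(loss)
```

In IEEE double precision the second and third computed vectors end up far from orthogonal, even though the input columns are linearly independent. Householder-based methods do not suffer from this loss of orthogonality, which is why the blocked Householder module is the safer choice when accuracy matters.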