An efficient out-of-core implementation of block Cholesky decomposition on a multi-GPU system

with Lin Cheng, Peter Yoon and Jiajia Zhao

Paper presented at IASTED PDCS (2012)
Poster presented at IEEE EMBS (2012)



We use the computing power of general-purpose GPUs to accelerate Cholesky decomposition, a dense linear algebra routine. Our implementation removes the memory-capacity limit by storing the system matrix on hard disk and loading only parts of it into main memory at a time.
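The out-of-core idea can be illustrated with a minimal CPU-only sketch. It is not the implementation from the paper: it uses no GPU, and it assumes a right-looking blocked variant, a file layout of square b x b tiles stored one after another, and a matrix size divisible by b. All function names (`read_tile`, `write_tile`, `potrf`, `trsm`, `gemm_sub`, `out_of_core_cholesky`, `store_tiled`, `load_lower`, `demo`) are hypothetical and chosen only for this example. The point it shows is that at most three tiles need to be in main memory at any time, while the full matrix stays on disk.

```python
import math
import os
import tempfile
from array import array

ITEM = array("d").itemsize  # bytes per stored double

def read_tile(f, i, j, nb, b):
    """Load tile (i, j) from disk as a flat row-major array of b*b doubles."""
    f.seek((i * nb + j) * b * b * ITEM)
    t = array("d")
    t.fromfile(f, b * b)
    return t

def write_tile(f, i, j, nb, b, t):
    """Store tile (i, j) back into its slot in the file."""
    f.seek((i * nb + j) * b * b * ITEM)
    t.tofile(f)

def potrf(t, b):
    """In-place unblocked Cholesky of one tile; the strict upper part is zeroed."""
    for j in range(b):
        d = math.sqrt(t[j * b + j] - sum(t[j * b + p] ** 2 for p in range(j)))
        t[j * b + j] = d
        for i in range(j + 1, b):
            s = sum(t[i * b + p] * t[j * b + p] for p in range(j))
            t[i * b + j] = (t[i * b + j] - s) / d
        for i in range(j):
            t[i * b + j] = 0.0

def trsm(l, a, b):
    """In place: a := a * inv(l)^T, where l is lower triangular."""
    for r in range(b):
        for c in range(b):
            s = sum(a[r * b + p] * l[c * b + p] for p in range(c))
            a[r * b + c] = (a[r * b + c] - s) / l[c * b + c]

def gemm_sub(c, a, bt, b):
    """In place: c := c - a * bt^T."""
    for r in range(b):
        for q in range(b):
            c[r * b + q] -= sum(a[r * b + p] * bt[q * b + p] for p in range(b))

def out_of_core_cholesky(path, n, b):
    """Overwrite the lower tiles of the tiled file with L, where A = L L^T."""
    nb = n // b  # assumption: n is a multiple of b
    with open(path, "r+b") as f:
        for k in range(nb):
            lkk = read_tile(f, k, k, nb, b)
            potrf(lkk, b)
            write_tile(f, k, k, nb, b, lkk)
            for i in range(k + 1, nb):        # panel below the diagonal tile
                aik = read_tile(f, i, k, nb, b)
                trsm(lkk, aik, b)
                write_tile(f, i, k, nb, b, aik)
            for i in range(k + 1, nb):        # trailing update, lower tiles only
                lik = read_tile(f, i, k, nb, b)
                for j in range(k + 1, i + 1):
                    ljk = read_tile(f, j, k, nb, b)
                    aij = read_tile(f, i, j, nb, b)
                    gemm_sub(aij, lik, ljk, b)
                    write_tile(f, i, j, nb, b, aij)

def store_tiled(path, rows, b):
    """Write a dense matrix (list of rows) to disk, tile by tile."""
    nb = len(rows) // b
    with open(path, "wb") as f:
        for i in range(nb):
            for j in range(nb):
                array("d", (rows[i * b + r][j * b + c]
                            for r in range(b) for c in range(b))).tofile(f)

def load_lower(path, n, b):
    """Read the lower tiles back into a dense list of rows (upper part left 0)."""
    nb = n // b
    out = [[0.0] * n for _ in range(n)]
    with open(path, "rb") as f:
        for i in range(nb):
            for j in range(i + 1):
                t = read_tile(f, i, j, nb, b)
                for r in range(b):
                    for c in range(b):
                        out[i * b + r][j * b + c] = t[r * b + c]
    return out

def demo():
    """Factor a small 4 x 4 matrix built from a known L, using 2 x 2 tiles."""
    l_true = [[2, 0, 0, 0], [6, 1, 0, 0], [-8, 5, 3, 0], [1, 2, 3, 4]]
    n = 4
    a = [[sum(l_true[i][p] * l_true[j][p] for p in range(n)) for j in range(n)]
         for i in range(n)]
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        store_tiled(path, a, 2)
        out_of_core_cholesky(path, n, 2)
        return load_lower(path, n, 2)
    finally:
        os.remove(path)
```

Calling `demo()` returns the lower-triangular factor that was used to build the test matrix. In a GPU setting the three tile kernels would presumably be replaced by device routines and the tile reads overlapped with computation, but those details are outside what this sketch assumes.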

Publication Details

