Authors
Srinidhi Kestur, John D Davis, Eric S Chung
Publication date
2012/4/29
Conference
2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines
Pages
9-16
Publisher
IEEE
Description
We present the design and implementation of a universal, single-bitstream library for accelerating matrix-vector multiplication using FPGAs. Our library handles multiple matrix encodings, ranging from dense to multiple sparse formats. A key novelty in our approach is the introduction of a hardware-optimized sparse matrix representation called Compressed Variable-Length Bit Vector (CVBV), which reduces storage and bandwidth requirements by up to 43% (25% on average) compared to compressed sparse row (CSR) across all the matrices from the University of Florida Sparse Matrix Collection. Our hardware incorporates a runtime-programmable decoder that performs on-the-fly decoding of various formats such as Dense, COO, CSR, DIA, and ELL. The flexibility and scalability of our design are demonstrated across two FPGA platforms: (1) the BEE3 (Virtex-5 LX155T with 16GB of DRAM) and (2) ML605 (Virtex-6 …
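As a rough software illustration of two of the encodings named in the description, the sketch below computes the same matrix-vector product from a COO and a CSR representation. It is not the authors' hardware design; the function names and the 3x3 example matrix are assumptions made for illustration, and CVBV is omitted because the description does not specify its layout.

```python
# Illustrative only: software matrix-vector multiply over two sparse formats.
# Function names and the example matrix are hypothetical, not from the paper.

def spmv_coo(rows, cols, vals, x, n_rows):
    """y = A @ x with A stored as COO (row, column, value) triples."""
    y = [0.0] * n_rows
    for r, c, v in zip(rows, cols, vals):
        y[r] += v * x[c]
    return y


def spmv_csr(row_ptr, col_idx, vals, x):
    """y = A @ x with A stored as CSR (row pointers, column indices, values)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * x[col_idx[k]]
    return y


# Example matrix: [[4, 0, 0],
#                  [0, 0, 2],
#                  [1, 0, 3]]
vals = [4.0, 2.0, 1.0, 3.0]
cols = [0, 2, 0, 2]
coo_rows = [0, 1, 2, 2]      # COO stores one row index per nonzero
csr_row_ptr = [0, 1, 2, 4]   # CSR stores one offset per row, plus an end marker
x = [1.0, 2.0, 3.0]

y_coo = spmv_coo(coo_rows, cols, vals, x, n_rows=3)
y_csr = spmv_csr(csr_row_ptr, cols, vals, x)
```

Both calls produce the same result vector; the formats differ only in how row positions are encoded, which is the kind of per-format metadata a runtime decoder would have to interpret.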
Total citations
[Chart: citations per year, 2012–2024]
Scholar articles
S Kestur, JD Davis, ES Chung - 2012 IEEE 20th International Symposium on Field …, 2012