FPGASort: A High Performance Sorting Architecture Exploiting Run-time Reconfiguration on FPGAs for Large Problem Sorting

Abstract

This paper analyses different hardware sorting architectures in order to implement a highly scalable sorter that solves huge problems, up to the GB range, at high performance and in linear time complexity. We show that combining a FIFO-based merge sorter with a tree-based merge sorter gives the best performance at low cost. Moreover, we demonstrate how partial run-time reconfiguration can be used either to save almost half the FPGA resources or, alternatively, to improve the speed. Experiments show a sustained sorting throughput of 2 GB/s for problems fitting into the on-chip FPGA memory and 1 GB/s when using external memory. These values surpass the best published results on large problem sorting implementations on FPGAs, GPUs, and the Cell processor.
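
To make the two-stage idea in the abstract more concrete, the following is a minimal software sketch in Python. It is not the hardware design from the paper: it only models the data flow under the assumption that a first stage sorts small blocks by merging runs held in queues (standing in for the FIFO-based merge sorter) and a second stage combines the resulting sorted runs through a binary tree of 2-way mergers (standing in for the tree-based merge sorter). All names (`merge2`, `fifo_merge_sort`, `merge_tree`, `sort_large`) and the `block_size` parameter are hypothetical and chosen for illustration only.

```python
from collections import deque
from typing import Iterable, Iterator, List


def merge2(a: Iterable[int], b: Iterable[int]) -> Iterator[int]:
    """Merge two sorted streams, emitting one key per step (a 2-way merger)."""
    ia, ib = iter(a), iter(b)
    x, y = next(ia, None), next(ib, None)
    while x is not None and y is not None:
        if x <= y:
            yield x
            x = next(ia, None)
        else:
            yield y
            y = next(ib, None)
    # Drain whichever input still holds keys.
    while x is not None:
        yield x
        x = next(ia, None)
    while y is not None:
        yield y
        y = next(ib, None)


def fifo_merge_sort(block: List[int]) -> List[int]:
    """Stage 1 (assumed model): sort one small block by repeatedly
    merging the two oldest runs in a queue until one run remains."""
    runs = deque([key] for key in block)
    if not runs:
        return []
    while len(runs) > 1:
        first, second = runs.popleft(), runs.popleft()
        runs.append(list(merge2(first, second)))
    return runs[0]


def merge_tree(runs: List[List[int]]) -> Iterator[int]:
    """Stage 2 (assumed model): combine sorted runs through a binary
    tree of 2-way mergers, producing a single sorted output stream."""
    streams: List[Iterator[int]] = [iter(run) for run in runs]
    if not streams:
        return iter(())
    while len(streams) > 1:
        paired = [merge2(streams[i], streams[i + 1])
                  for i in range(0, len(streams) - 1, 2)]
        if len(streams) % 2:
            paired.append(streams[-1])  # odd stream passes to the next level
        streams = paired
    return streams[0]


def sort_large(keys: List[int], block_size: int = 4) -> List[int]:
    """Split the input into blocks, sort each block in stage 1,
    then merge all sorted blocks in stage 2."""
    runs = [fifo_merge_sort(keys[i:i + block_size])
            for i in range(0, len(keys), block_size)]
    return list(merge_tree(runs))
```

Calling `sort_large(keys, block_size=4)` sorts each group of four keys in the first stage and then merges the groups in the second. In this sketch `block_size` loosely plays the role of the problem size that fits into on-chip memory; the throughput figures and the reconfiguration scheme described in the abstract are properties of the hardware and are not modelled here.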

Bibtex

@INPROCEEDINGS{fpga09koch,
        AUTHOR             = {{Koch}, {Dirk} and {Torresen}, {Jim}},
        ADDRESS            = {{Monterey, California, USA}},
        BOOKTITLE          = {{Proceedings of the 19th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2011)}},
        PUBLISHER          = {ACM},
        MONTH              = feb,
        PAGES              = {45--54},
        TITLE              = {{FPGASort: A High Performance Sorting Architecture Exploiting Run-time Reconfiguration on FPGAs for Large Problem Sorting}},
        YEAR               = {2011}
}
Published Feb. 28, 2010 12:16 AM - Last modified Aug. 30, 2011 12:12 PM