Genomics is the foundation of precision medicine, global food security, and virus surveillance. Exact matching is one of the most essential operations in genomics, used in almost every step, including alignment, assembly, annotation, and compression. Modern genomics adopts the Ferragina-Manzini Index (FM-Index), which augments the space-efficient Burrows-Wheeler transform (BWT) with additional data structures to permit ultra-fast exact-match operations. However, FM-Index is notorious for its poor spatial locality and random memory access pattern. Prior works build GPU-, FPGA-, ASIC-, and even processing-in-memory (PIM)-based accelerators to boost FM-Index search throughput. Although FM-Index PIMs achieve state-of-the-art search throughput, they, like all prior conventional accelerators, process only one DNA symbol after each DRAM row activation and therefore suffer from poor memory bandwidth utilization. In this paper, we propose a hardware accelerator, EXMA, to enhance FM-Index search throughput. We first create a novel EXMA table with a multi-task-learning (MTL)-based index to process multiple DNA symbols with each DRAM row activation. We then build an accelerator to search over an EXMA table. We propose 2-stage scheduling to increase the cache hit rate of our accelerator, and we introduce a dynamic page policy to improve the row buffer hit rate of DRAM main memory. We also present CHAIN compression to reduce the size of EXMA tables. Compared to state-of-the-art FM-Index PIMs, EXMA improves search throughput by $4.9\times$ and search throughput per Watt by $4.8\times$.
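For readers unfamiliar with the baseline, the following is a minimal sketch of textbook FM-Index backward search, not of EXMA itself. It is meant only to illustrate why a conventional search consumes one DNA symbol per step: each iteration performs two rank (Occ) lookups at data-dependent positions, which is the source of the random memory accesses described above. The function names (`build_fm_index`, `backward_search`, `find_all`), the naive suffix-array construction, and the uncompressed Occ table are assumptions chosen for brevity and would not scale to real genomes.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM-Index (suffix array, C table, Occ table) for text."""
    text += "$"  # sentinel; assumed to sort before A, C, G, T
    # Naive suffix array construction; acceptable for a short illustration only.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the symbol preceding each sorted suffix (wraps to "$" for suffix 0).
    bwt = "".join(text[i - 1] for i in sa)
    # c_table[c]: number of symbols in text strictly smaller than c.
    counts = Counter(text)
    c_table, total = {}, 0
    for sym in sorted(counts):
        c_table[sym] = total
        total += counts[sym]
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {sym: [0] * (len(bwt) + 1) for sym in counts}
    for i, ch in enumerate(bwt):
        for sym in occ:
            occ[sym][i + 1] = occ[sym][i] + (ch == sym)
    return sa, c_table, occ


def backward_search(pattern, c_table, occ, n):
    """Return the half-open suffix-array interval [lo, hi) matching pattern."""
    lo, hi = 0, n
    for sym in reversed(pattern):  # exactly one symbol consumed per iteration
        if sym not in c_table:
            return 0, 0
        # Two Occ lookups at unrelated positions: poor spatial locality.
        lo = c_table[sym] + occ[sym][lo]
        hi = c_table[sym] + occ[sym][hi]
        if lo >= hi:
            return 0, 0
    return lo, hi


def find_all(text, pattern):
    """Return sorted start positions of every exact match of pattern in text."""
    sa, c_table, occ = build_fm_index(text)
    lo, hi = backward_search(pattern, c_table, occ, len(sa))
    return sorted(sa[lo:hi])


print(find_all("GATTACAGATTACA", "TTA"))  # prints [2, 9]
```

In this sketch a query of length m always takes m dependent iterations, so throughput is bounded by the latency of each lookup rather than by memory bandwidth; processing several symbols per lookup, as the abstract proposes, shortens that dependency chain.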