DCIR Apr 25, 2018

Giving Text Analytics a Boost

arXiv:1806.01103v1, 14 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of handling "Big Data" in text analytics for users of IBM's SystemT. The advance is incremental, since it builds on existing compilation and communication methods.

The paper tackles the inefficiency of traditional server architectures when analyzing large-scale textual data with IBM's SystemT software. Using a streaming hardware accelerator, it improves the throughput of information extraction queries by an order of magnitude.

The amount of textual data has reached a new scale and continues to grow at an unprecedented rate. IBM's SystemT software is a powerful text analytics system, which offers a query-based interface to reveal the valuable information that lies within these mounds of data. However, traditional server architectures are not capable of analyzing the so-called "Big Data" efficiently, despite the high memory bandwidth that is available. We show that by using a streaming hardware accelerator implemented in reconfigurable logic, the throughput rates of SystemT's information extraction queries can be improved by an order of magnitude. We present how such a system can be deployed by extending SystemT's existing compilation flow and by using a multi-threaded communication interface that can efficiently use the bandwidth of the accelerator.

