
Deliver unmatched processing throughput to your Spark workloads

The Speedata APU achieves its breakthrough throughput by mapping the required processing onto its internal hardware pipeline. The Speedata Dash software automatically configures a dataflow in silicon, where row processing is broken into hundreds of stages, each passing its output to the next on every hardware clock cycle. At any given time, hundreds of rows are therefore at different stages of processing in parallel, resulting in a throughput of over a billion rows per second.
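As a back-of-the-envelope sketch of where that number comes from (the 1 GHz clock and the pipeline depth below are illustrative assumptions, not published Speedata specifications):

```scala
// Steady-state throughput of a fully pipelined dataflow.
// Assumptions (illustrative, not Speedata specs): a 1 GHz hardware clock
// and one row completing per clock cycle once the pipeline is full.
object PipelineThroughput extends App {
  val clockHz       = 1000000000L // assumed 1 GHz hardware clock
  val rowsPerCycle  = 1L          // one row exits the pipeline each cycle
  val pipelineDepth = 400         // illustrative: "hundreds of steps"

  // The depth determines how many rows are in flight at once...
  println(s"Rows in flight at any instant: $pipelineDepth")
  // ...but steady-state throughput depends only on clock * rows/cycle.
  println(s"Throughput: ${clockHz * rowsPerCycle} rows/second")
}
```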

[Illustration: the APU hardware pipeline, spanning decompression, decoding, columnar processing, row assembly, and joins and aggregations]

Accelerated Parquet processing in hardware

Parquet is the leading file format for analytics. Speedata’s APU efficiently processes Parquet files as part of its hardware pipeline, from decompressing and decoding columns, through columnar filters and projections, to row assembly and flattening of nested data (EXPLODE).

Decompression: uncompressing Parquet columns

Decoding: decoding Parquet columns

Columnar processing: computing column-level filters and projections

Row assembly: assembling columns into rows

Joins and aggregations: computing joins and aggregations

Shuffle preparation: preparing Spark task output
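To make the mapping concrete, here is an ordinary Spark job whose plan exercises each of these stages; the paths and column names are invented for illustration:

```scala
// An ordinary Spark job whose plan walks the stages above: Parquet scan
// (decompression + decoding), columnar filter and projection, EXPLODE,
// row assembly, a join, an aggregation, and per-task shuffle output.
// Paths and column names are hypothetical.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object ParquetStagesDemo extends App {
  val spark = SparkSession.builder.appName("parquet-stages").getOrCreate()
  import spark.implicits._

  val events = spark.read.parquet("s3://bucket/events") // scan: decompress + decode
  val users  = spark.read.parquet("s3://bucket/users")

  val revenue = events
    .filter($"event_date" >= "2024-01-01")            // columnar filter
    .select($"user_id", explode($"items").as("item")) // projection + EXPLODE
    .join(users, Seq("user_id"))                      // join
    .groupBy($"country")
    .agg(sum($"item.price").as("revenue"))            // aggregation, then shuffle

  revenue.show()
  spark.stop()
}
```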

Seamless integration with Apache Spark

Speedata’s Dash software plugs transparently into the Spark Catalyst optimizer to automatically identify compute-intensive work and offload it to the APU, delivering dramatic acceleration for Apache Spark 3.x workloads on Kubernetes, YARN, and standalone cluster managers.
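In practice, this kind of integration is configuration rather than code changes. The sketch below uses Spark 3.x's standard extension hooks; the com.speedata.* class names are hypothetical placeholders, not Speedata's published API:

```scala
// A minimal sketch of configuration-only integration via Spark 3.x's
// standard extension points (spark.sql.extensions and spark.plugins).
// The com.speedata.* class names are hypothetical placeholders.
import org.apache.spark.sql.SparkSession

object AcceleratedSession extends App {
  val spark = SparkSession.builder
    .appName("accelerated-job")
    .config("spark.sql.extensions", "com.speedata.dash.DashSparkExtensions")
    .config("spark.plugins", "com.speedata.dash.DashPlugin")
    .getOrCreate()

  // Existing queries run unchanged; Catalyst rules injected through the
  // extension decide which operators are offloaded to the APU.
  spark.sql("SELECT count(*) FROM parquet.`s3://bucket/events`").show()
  spark.stop()
}
```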

[Diagram: Dash integration with the Apache Spark stack]

Analytics at the Speed of Silicon

Accelerate Apache Spark by 100x, right at the hardware layer, with zero code changes.
