The Future of Intelligent Systems: The Intersection of Silicon, Data and AI

Adi Fuchs
Feb 11

As technology advances at a staggering pace, the world is bracing for a multi-trillion-dollar economic transformation unlike anything we’ve seen before. At the center of this shift is the rise of intelligent systems, along with the massive compute infrastructure required to power them.

These systems mark a real break from the technological rules we’ve grown used to: they run not on compute alone, but on a three-part foundation of compute, intelligence, and data.

You don’t get intelligent systems unless all three work together.

SQL has become the bridge between LLMs and real enterprise systems, powering agent-driven exploration, model grounding, and continuous ML pipelines. But AI-native workloads generate bursts of iterative, high-concurrency queries that CPUs struggle to handle and GPUs were never designed for, turning SQL performance into a system-level constraint.

In this post, we’ll share how we at Speedata help enable that foundation with our APU (Analytics Processing Unit), the world’s first ASIC purpose-built to accelerate data analytics workloads.

3 Trends Shaping SQL for AI

The venerable Structured Query Language (SQL) has long been the workhorse of enterprise data and is now experiencing a resurgence thanks to AI. As AI-driven analytics take center stage, SQL is evolving from a back-end utility into a core component of AI pipelines. This matters because organizations face non-trivial challenges in bringing GenAI into their business, and the vast majority fail at it because their pipelines for accessing organizational data are immature.

At Speedata, where we build ASICs to accelerate SQL queries, we see three emerging trends reshaping how organizations leverage SQL alongside AI:

1. AI Agents Commoditizing & Supercharging SQL

Much as coding copilots have developers running more Python/C++ code, AI assistants are making SQL a commodity and causing more SQL queries to be generated and executed. AI is turning SQL into an execution substrate for agents, not a legacy interface for analysts.

The significance of this shift is hard to overstate: a recently published Databricks study reports that 80% of databases are now created by AI agents.

Practically every AI company now supports a text-to-SQL capability:

Google’s Gemini in BigQuery, AWS’ “Q Generative SQL”, Microsoft Fabric, Snowflake Copilot, Databricks LakehouseIQ, and OpenAI’s GPT Action for SQL.

As “data copilots” and text-to-SQL models get embedded into warehouses, lakehouses, and BI tools, the bottleneck shifts from writing queries to running many more of them: iterative query refinement, auto-generated joins/aggregations, schema exploration, and tool-driven “plan-and-execute” loops that look a lot like how coding agents spam compilers and interpreters.

The practical outcome inside hyperscalers and modern data stacks is higher query concurrency and burstier workloads. LLMs don’t run one perfect query; they run a chain of “probe → validate → expand → drill down” queries until the answer is good enough.
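To make the “probe → validate → expand → drill down” chain concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table, its contents, and the run helper are hypothetical stand-ins; a real agent would generate these statements with an LLM against a production engine.

```python
import sqlite3

# Hypothetical toy table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("EU", "widget", 120.0), ("EU", "gadget", 80.0),
     ("US", "widget", 200.0), ("US", "gadget", 50.0), ("US", "widget", 30.0)],
)

query_log = []  # every statement the "agent" issues

def run(sql, params=()):
    """Execute one query and record it, mimicking an agent's tool call."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()

# probe: discover which columns exist
columns = [row[1] for row in run("PRAGMA table_info(orders)")]
# validate: confirm the table has rows and see which regions are present
row_count = run("SELECT COUNT(*) FROM orders")[0][0]
regions = [r[0] for r in run("SELECT DISTINCT region FROM orders ORDER BY region")]
# expand: aggregate across all regions
totals = run("SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region")
# drill down: focus on the region with the highest total
top_region = max(totals, key=lambda t: t[1])[0]
detail = run(
    "SELECT product, SUM(amount) FROM orders WHERE region = ? "
    "GROUP BY product ORDER BY product",
    (top_region,),
)

print(f"{len(query_log)} queries issued for one question")
print(top_region, detail)
```

Even in this toy case, a single business question turns into five statements, which is the multiplication effect described above.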

Furthermore, there is a recent trend of integrating AI functions directly into distributed query engines. Syntactically, these “AI functions” embed AI tool calls into the dataflow of a SQL query itself.

Platforms such as Databricks, Snowflake, and Azure already support native AI functions that allow users or agents to invoke LLMs from within a standard SELECT statement. As a result, a single query can join relational tables while simultaneously classifying sentiment in a text column or extracting structured entities from an unstructured blob.
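As a rough local illustration of the idea, the sketch below registers a Python function as a SQL function in SQLite and calls it inside a join. The ai_sentiment function is a hypothetical keyword-based stub standing in for an LLM call; it is not the API of Databricks, Snowflake, or Azure, and the tables are made up.

```python
import sqlite3

def ai_sentiment(text):
    """Hypothetical stand-in for an LLM call: a trivial keyword rule."""
    negative_words = ("broke", "late", "refund")
    return "negative" if any(w in text.lower() for w in negative_words) else "positive"

conn = sqlite3.connect(":memory:")
conn.create_function("ai_sentiment", 1, ai_sentiment)  # expose the stub to SQL
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE reviews  (product_id INTEGER, body TEXT);
    INSERT INTO products VALUES (1, 'kettle'), (2, 'toaster');
    INSERT INTO reviews VALUES
        (1, 'Broke after a week'),
        (1, 'Works great'),
        (2, 'Arrived late, asked for a refund'),
        (2, 'Handle broke off');
""")

# One statement mixes a relational join/aggregation with a per-row "AI" call.
rows = conn.execute("""
    SELECT p.name, COUNT(*) AS negative_reviews
    FROM reviews AS r
    JOIN products AS p ON p.id = r.product_id
    WHERE ai_sentiment(r.body) = 'negative'
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
print(rows)
```

The join, filter, and group-by are ordinary SQL work, while the function call is where model inference would happen in a production engine.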

These AI functions create a hybrid execution model that blends structured and unstructured computation. This model is most efficient when each component runs on the silicon best suited to it: LLM inference on GPUs, and classical SQL primitives, such as joins, aggregations, and group-bys, on APUs.

Example of a SQL query with AI functions: aggregating the reasons for negative product reviews. Source: Databricks.

2. LLMs Will Be Grounded by SQL Using Real Data

LLM-native analytics is inherently stochastic, and enterprises don’t get paid for plausible data; they get paid for correct data. So organizations are pairing LLMs with real-time SQL databases to ensure answers are accurate, current, and trustworthy. Vector databases and embedding-based retrieval work well in some cases, but they have known limits and generally can’t replace “classical” accurate table retrieval and deterministic querying.

That’s why the winning pattern is “LLM for intent + SQL for truth”; the model interprets the question, selects governed data assets, generates (or retrieves) a SQL plan, and the database returns deterministic results.
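A minimal sketch of this division of labor, assuming a hypothetical catalog of pre-reviewed query templates and a trivial interpret function that stands in for the LLM:

```python
import sqlite3

# Governed, pre-reviewed query templates keyed by intent (hypothetical names).
GOVERNED_QUERIES = {
    "revenue_by_region": (
        "SELECT region, SUM(amount) FROM sales "
        "WHERE year = ? GROUP BY region ORDER BY region"
    ),
}

def interpret(question):
    """Stand-in for the LLM step: map a question to an intent and parameters."""
    text = question.lower()
    if "revenue" in text and "region" in text:
        year = next(int(tok) for tok in text.replace("?", "").split() if tok.isdigit())
        return "revenue_by_region", (year,)
    raise ValueError("no governed query matches this question")

def answer(conn, question):
    intent, params = interpret(question)
    sql = GOVERNED_QUERIES[intent]               # the model picks the asset
    return conn.execute(sql, params).fetchall()  # the database supplies the facts

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("EU", 2024, 10.0), ("EU", 2024, 5.0), ("US", 2024, 7.0), ("US", 2023, 99.0)],
)

result = answer(conn, "What was revenue by region in 2024?")
print(result)
```

The point of the design is that the stochastic component only chooses which query to run; every number in the answer comes from the database.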

The reliability layer is where teams are investing: semantic models and metric definitions, row-level access controls, query linting and sandboxing, lineage, caching, and explainability that cites the exact tables and filters used. In this setup, SQL makes AI usable for revenue, compliance, and real-time decisioning.
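One small piece of that reliability layer, query sandboxing, can be sketched with SQLite’s authorizer hook, which Python exposes as Connection.set_authorizer. The allowlist and table names below are hypothetical, and production systems typically enforce this in the warehouse’s own access-control layer rather than in application code.

```python
import sqlite3

ALLOWED_TABLES = {"orders"}  # hypothetical allowlist of governed tables

def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Permit plain SELECTs over allowlisted tables; deny everything else."""
    if action == sqlite3.SQLITE_SELECT:
        return sqlite3.SQLITE_OK
    if action == sqlite3.SQLITE_READ and arg1 in ALLOWED_TABLES:
        return sqlite3.SQLITE_OK  # arg1 is the table being read
    return sqlite3.SQLITE_DENY

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (region TEXT, amount REAL);
    CREATE TABLE salaries (employee TEXT, amount REAL);
    INSERT INTO orders VALUES ('EU', 10.0), ('US', 20.0);
    INSERT INTO salaries VALUES ('alice', 1.0);
""")
conn.set_authorizer(read_only_authorizer)  # the sandbox applies from here on

def run_sandboxed(sql):
    """Return rows for permitted statements, or None if the statement is rejected."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None

print(run_sandboxed("SELECT region, amount FROM orders ORDER BY region"))
print(run_sandboxed("DELETE FROM orders"))      # write attempt is rejected
print(run_sandboxed("SELECT * FROM salaries"))  # table outside the allowlist
```

This sketch deliberately allows only column reads; a realistic policy would also need rules for SQL functions, views, and row-level filters.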

3. SQL in ML/RL Pipelines

From ETL for model training to reinforcement learning feedback loops, SQL remains essential for preparing and managing post-training data, ensuring AI models can scale and adapt with clean, well-structured data. The building blocks of post-training and RL pipelines (data quality, labeling, feature extraction, sampling, evaluation, and monitoring) all boil down to joins, filters, aggregations, and time-windowed transforms, which are classic “SQL territory” at massive scale. RLHF-style loops need constant ETL over interaction logs and feedback signals; evaluation needs reproducible slices and regressions; production needs continuous drift and KPI tracking with strict definitions.
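As a small illustration of how such a feedback loop reduces to SQL primitives, the sketch below joins an interaction log with feedback signals and aggregates them per day. The schema, column names, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE interactions (id INTEGER PRIMARY KEY, ts TEXT, prompt TEXT);
    CREATE TABLE feedback (interaction_id INTEGER, thumbs_up INTEGER);
    INSERT INTO interactions VALUES
        (1, '2025-01-01 09:00:00', 'a'),
        (2, '2025-01-01 17:30:00', 'b'),
        (3, '2025-01-02 08:15:00', 'c'),
        (4, '2025-01-02 11:45:00', 'd');
    INSERT INTO feedback VALUES (1, 1), (2, 0), (3, 1), (4, 1);
""")

# Join + filter + daily time bucket + aggregation: a typical monitoring slice.
daily = conn.execute("""
    SELECT date(i.ts) AS day,
           COUNT(*) AS rated,
           AVG(f.thumbs_up) AS approval_rate
    FROM interactions AS i
    JOIN feedback AS f ON f.interaction_id = i.id
    WHERE i.ts >= '2025-01-01'
    GROUP BY day
    ORDER BY day
""").fetchall()
print(daily)
```

At production scale the same query shape runs over billions of log rows, which is where scan and aggregation throughput starts to dominate.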

When you combine agentic analytics (more queries) with continuous learning (more pipelines), SQL throughput becomes a strategic lever. That’s exactly where hardware acceleration matters: if AI is going to multiply query volume, then reducing latency and cost per query, especially for scan/aggregate-heavy workloads, turns SQL performance into an AI-systems advantage, not just a database tuning concern.

Heterogeneous Silicon: Supercharging AI Cycles

GPUs have been key enablers of modern AI, helping end decades of “AI winters” and bringing AI to center stage in real-world practice. The common thread connecting AlexNet, the neural network that kicked off the “deep learning era,” and Transformers, which became the foundation of LLMs and generative AI, is that both were specifically designed to run on GPUs.

A few years after Transformers arrived, scaling laws turned silicon fleets into a powerful engine: if you feed more compute to larger models that process more data, you get better-quality AI, and GPUs were the default substrate for turning that equation into reality.

Nowadays, the AI computation stack is shifting toward inference, and the hardware story is entering a new heterogeneous chapter. LLM inference is not a single application from a compute-demand perspective.

It splits into two stages:

(a) the compute-heavy “prefill” stage (processing the prompt and building the KV cache),

(b) the bandwidth-heavy “decode” stage (autoregressive token generation).

This split matters because a chip’s balance of bandwidth and compute is fixed at fabrication time and cannot be dynamically traded after the fact.
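A back-of-the-envelope roofline calculation shows why the two stages land on opposite sides of a chip’s compute/bandwidth balance. The sketch below models a single dense layer in 16-bit precision; the layer size, prompt length, and peak hardware numbers are hypothetical round figures, not the specifications of any real chip.

```python
def arithmetic_intensity(tokens, d_model, bytes_per_elem=2):
    """FLOPs per byte for one d_model x d_model dense layer over `tokens` tokens."""
    flops = 2 * tokens * d_model * d_model  # one multiply-add per weight per token
    # Bytes moved: the weights once, plus input and output activations.
    bytes_moved = bytes_per_elem * (d_model * d_model + 2 * tokens * d_model)
    return flops / bytes_moved

def bound(intensity, peak_flops, peak_bandwidth):
    """Classify a workload against the chip's ridge point (FLOPs per byte)."""
    ridge = peak_flops / peak_bandwidth
    return "compute-bound" if intensity >= ridge else "bandwidth-bound"

PEAK_FLOPS = 1.0e15      # hypothetical: 1,000 TFLOP/s
PEAK_BANDWIDTH = 3.0e12  # hypothetical: 3 TB/s

prefill = arithmetic_intensity(tokens=2048, d_model=4096)  # whole prompt at once
decode = arithmetic_intensity(tokens=1, d_model=4096)      # one token per step

print("prefill:", prefill, bound(prefill, PEAK_FLOPS, PEAK_BANDWIDTH))
print("decode: ", decode, bound(decode, PEAK_FLOPS, PEAK_BANDWIDTH))
```

Under these assumptions prefill performs about a thousand FLOPs per byte moved while decode performs about one, so no single fixed ratio of compute to bandwidth serves both well. Batching several decode requests raises decode intensity, but the asymmetry remains.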

NVIDIA’s recent moves are a public acknowledgment of this challenge: after years of advocating monolithic GPU fleets as an “all-in-one” solution for AI, it has introduced Rubin CPX alongside the Rubin GPU line. CPX is specialized for the compute-heavy prefill phase in an ASIC-like manner.

On the other side of the serving pipeline, NVIDIA’s reported $20B licensing agreement with Groq reads like a strategic hedge toward decode-optimized silicon, as decode-side economics become a first-class constraint.

The bottom line is that the future stack will be inherently heterogeneous. There is no one-size-fits-all chip, and infrastructure economics will be shaped by our ability to run each phase of computation on the chip that serves it best.

The Compute Problem of SQL

However, unlike AI, SQL has a silicon problem. CPUs are inefficient for many large-scale analytics workloads, and GPUs are architecturally mismatched for core SQL execution patterns.

While more than 90% of modern high-end GPU throughput is concentrated in Tensor Cores, which are dense matrix engines, SQL tends to stress very different patterns: irregular (scatter/gather) memory access and divergent control flow. These map directly to the two “high-priority” hazards NVIDIA flags for CUDA performance: non-coalesced global memory access and warp divergence. GPU stacks like RAPIDS/cuDF and GPU SQL engines like HeavyDB can accelerate many tabular primitives, but these access and control-flow patterns remain a poor fit for hardware built around dense matrix math.
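A toy model in plain Python can show both hazards in a hash-join probe. This is not GPU code: the warp size, the number of slots per memory transaction, the hash function, and the key sets are all simplifying assumptions chosen for illustration.

```python
WARP_SIZE = 32       # lanes that execute in lockstep
SLOTS_PER_LINE = 32  # toy assumption: slots served by one memory transaction

def hash_slot(key):
    """Multiplicative hash into a 65,536-slot table (illustrative choice)."""
    return ((key * 2654435761) % (1 << 32)) >> 16

def lines_touched(slots):
    """Count distinct memory lines a warp must fetch for these slot indices."""
    return len({s // SLOTS_PER_LINE for s in slots})

probe_keys = list(range(WARP_SIZE))  # one probe key per lane
build_keys = set(range(0, 1000, 2))  # build side holds even keys only

# Columnar scan: adjacent lanes read adjacent slots.
scan_lines = lines_touched(range(WARP_SIZE))
# Hash probe: each lane jumps to wherever its key hashes.
gather_lines = lines_touched(hash_slot(k) for k in probe_keys)

# Lanes whose key matches take the "emit row" branch; the others skip it.
matched = [k in build_keys for k in probe_keys]
divergent = 0 < sum(matched) < WARP_SIZE  # True when lanes disagree on the branch

print(scan_lines, gather_lines, divergent)
```

In this model the scan is served by a single memory line, while the probe scatters across many, and half the lanes take a different branch from the other half. Those are the non-coalesced access and divergence patterns described above.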

The Solution: The APU (Analytics Processing Unit)

That SQL gap motivates compute architectures designed from the ground up for SQL execution, which is exactly what the APU targets. Unlike monolithic acceleration approaches, the APU is a fundamentally heterogeneous processor that integrates specialized engines for the most performance‑critical data‑processing tasks, including tabular compression/decompression, advanced hashing for joins, hardware‑driven partitioning and shuffling, and efficient handling of columnar and row‑based data formats.

By replacing general‑purpose CPU instructions with dedicated hardware pathways, the APU removes the three core bottlenecks of analytics: I/O, memory bandwidth, and compute throughput. As the world’s first ASIC optimized specifically for SQL and large‑scale data processing, the APU delivers order‑of‑magnitude improvements in performance and cost for organizations running analytics at hyperscale.

Conclusion

In the next era of intelligent systems, performance and economics will be set by these three rules:

Big compute requires specialized silicon.
Fragmented compute requires heterogeneous silicon.
Run each workload on the silicon that serves it best.

As LLMs and analytics agents proliferate, SQL won’t fade; it will be amplified, because agents don’t issue one perfect query. They iterate, probe, validate, and drill down, driving higher concurrency and burstier query traffic across the enterprise. At the same time, SQL will increasingly be fused with AI, either via in-query AI functions or as the deterministic backbone of RL/post-training pipelines.

The consequence is straightforward: if AI multiplies the number of queries and pipelines, then accelerating SQL primitives moves from “database optimization” to a first-class AI-systems advantage, and that’s exactly the gap Speedata’s APU is built to close.
