Databricks Lakehouse//RT: New Reyden Engine Aims to Make Dedicated Real-Time Databases Obsolete
Summary
At the Data + AI Summit 2026, Databricks introduced Lakehouse//RT, a new real-time analytics engine powered by Reyden, a vector engine built from the ground up. It promises millisecond query performance directly on open formats like Delta Lake and Apache Iceberg without copying data. The move directly challenges dedicated real-time OLAP databases such as Apache Pinot (StarTree), ClickHouse, and Apache Druid. By keeping data in the lakehouse, Databricks aims to eliminate the operational complexity of separate serving layers. StarTree quickly issued a response outlining key architectural differences.
What happened?
- Product Announcement: Databricks officially announced Lakehouse//RT. The engine supports query response times down to 10 milliseconds on smaller datasets and sub-100 milliseconds at scale.
- New “Reyden” Engine: The core of Lakehouse//RT is a brand-new compute engine named Reyden (nicknamed “Reynold’s Dream Engine” after co-founder Reynold Xin), optimized from the ground up for extreme concurrency and low latency.
- Competitor Reaction: StarTree, the company founded by the original creators of Apache Pinot, immediately published a comparison highlighting the differences and arguing that dedicated real-time systems remain necessary.
- Heise Coverage: Heise Online published independent coverage of the summit, focusing on Databricks’ ambition to make separate real-time databases redundant.
Why it matters
Previously, companies requiring real-time analytics for user-facing dashboards, fraud detection, or AI agents had to export data from the lakehouse to specialized database clusters like ClickHouse or Apache Pinot. This introduced data duplication, ETL pipelines, and governance fragmentation. Lakehouse//RT promises to handle these workloads directly in Delta Lake or Apache Iceberg, governed by Unity Catalog. This could significantly simplify modern data architectures and reduce infrastructure costs.
Evidence
- Benchmark Claims: Databricks asserts that Lakehouse//RT is up to 72% faster than SQL Serverless at median latency while supporting tens of thousands of concurrent users and AI agents.
- Compatibility: The system operates zero-copy directly on standard Delta Lake and Iceberg tables, requiring no format conversions.
- StarTree Comparison: In its analysis, StarTree argues that dedicated engines like Pinot, with advanced indexing (e.g., star-tree indexes), remain superior for strict SLAs at massive scale, but acknowledges that Lakehouse//RT is compelling for standard workloads.
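A note on reading the benchmark claim: "up to 72% faster" can mean two different things, and the gap matters when comparing vendor numbers. The sketch below is plain arithmetic; the 100 ms baseline is an assumed figure for illustration, not a number published by Databricks, and the function names are hypothetical.

```python
def latency_if_reduced(baseline_ms: float, pct: float) -> float:
    """Reading A: latency itself drops by pct percent of the baseline."""
    return baseline_ms * (1 - pct / 100)


def latency_if_speedup(baseline_ms: float, pct: float) -> float:
    """Reading B: query speed (1 / latency) rises by pct percent."""
    return baseline_ms / (1 + pct / 100)


baseline_ms = 100.0  # assumed baseline median latency, for illustration only

print(round(latency_if_reduced(baseline_ms, 72), 1))  # 28.0
print(round(latency_if_speedup(baseline_ms, 72), 1))  # 58.1
```

Under the first reading the same baseline lands at roughly 28 ms, under the second at roughly 58 ms, so it is worth checking which definition a published benchmark uses before comparing it with other systems.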
Analysis
Databricks’ move illustrates the convergence of analytical and transactional patterns, sometimes referred to as LTAP (Lake Transactional/Analytical Processing). With the Reyden engine, the boundary between the classic data lake and the real-time serving layer is blurring. For specialized database vendors like StarTree or ClickHouse, this poses a major threat in the mainstream enterprise market. For extreme concurrency workloads (e.g., LinkedIn feeds with millions of active users), however, Pinot is likely to remain dominant: scanning open formats like Delta/Iceberg runs into storage I/O limits that even the Reyden engine cannot completely bypass.
Practical Takeaways
- Architecture Review: Evaluate whether your current real-time pipelines that copy data to ClickHouse, Pinot, or Redis can be consolidated on Lakehouse//RT to reduce licensing costs and maintenance overhead.
- SLA Evaluation: For mission-critical, customer-facing applications with strict sub-second latency and high concurrency requirements, dedicated OLAP databases like StarTree/Pinot remain the safer choice.
- Governance Utilization: Leverage Lakehouse//RT’s deep integration with Unity Catalog to apply security and governance policies to real-time data without extra configuration.
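For the SLA evaluation, vendor figures quoted as medians say little about tail latency, which is what a strict SLA usually constrains. A minimal sketch of such a check follows, assuming you have already collected per-query latencies from your own load test; the sample values, the 100 ms budget, and the helper names are hypothetical and not tied to any Databricks or Pinot API.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def meets_sla(samples_ms, p99_budget_ms):
    """True if the 99th-percentile latency stays within the budget."""
    return percentile(samples_ms, 99) <= p99_budget_ms


# Hypothetical measurements: 98 fast queries and 2 slow outliers.
samples_ms = [12.0] * 98 + [240.0] * 2

print(percentile(samples_ms, 50))    # 12.0
print(percentile(samples_ms, 99))    # 240.0
print(meets_sla(samples_ms, 100.0))  # False
```

In this made-up sample the median looks excellent while the p99 misses a 100 ms budget, which is the kind of gap to look for when testing a candidate engine against your own workload.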
Open Questions
- How does the Reyden engine perform in independent, third-party TPC-DS benchmarks compared to ClickHouse and Apache Pinot under real production loads?
- What are the licensing and consumption pricing models for Lakehouse//RT within the Databricks platform?
- Can Databricks fully compensate for the physical limitations of cloud object storage (S3/Blob Storage) under extreme query throughput using purely software-level optimizations?