Show HN: Rocky – Rust SQL engine with branches, replay, column lineage (github.com)
67 points by hugocorreia90 a day ago | 10 comments
Hi HN, I'm Hugo. I've been building Rocky over the past month, shipping fast in the open. The binary is on GitHub Releases, `dagster-rocky` on PyPI, and the VS Code extension on the Marketplace. I held off on a broader announcement until the trust-system surface was coherent enough to talk about as one thing.

The governance waveplan (column classification, per-env masking, 8-field audit trail on every run, `rocky compliance` rollup, role-graph reconciliation, retention policies) landed end-to-end last week in engine-v1.16.0 and rounded out in v1.17.4 (tagged 2026-04-26). That's the milestone I'd been waiting for.

The pitch: keep Databricks or Snowflake. Bring Rocky for the DAG.

Rocky is a Rust-based control plane for warehouse pipelines. Storage and compute stay with your warehouse. Rocky owns the graph: dependencies, compile-time types, drift, incremental logic, cost, lineage, governance. The things your current stack can't give you because it doesn't own the DAG.

A few things I think are interesting:

- Branches + replay. `rocky branch create stg` gives you a logical copy of a pipeline's tables (schema-prefix today; native Delta SHALLOW CLONE and Snowflake zero-copy are next). `rocky replay <run_id>` reconstructs which SQL ran against which inputs. Git-grade workflow on a warehouse.
- Column-level lineage from the compiler, not a post-hoc graph crawl. The type checker traces columns through joins, CTEs, and windows. VS Code surfaces it inline via LSP.
- Governance as a first-class surface. Column classification tags plus per-env masking policies, applied to the warehouse via Unity Catalog (Databricks) or masking policies (Snowflake). 8-field audit trail on every run. `rocky compliance` rollup that CI can gate on. Role-graph reconciliation via SCIM + per-catalog GRANT. Retention policies with a warehouse-side drift probe.
- Cost attribution. Every run produces per-model cost (bytes, duration). `[budget]` blocks in `rocky.toml`; breaches fire a `budget_breach` hook event.
- Compile-time portability + blast radius. Dialect-divergence lint across Databricks / Snowflake / BigQuery / DuckDB (12 constructs). `SELECT *` downstream-impact lint.
- Schema-grounded AI. Generated SQL goes through the compiler: AI suggestions type-check before they can land.

What Rocky isn't:

- Not a warehouse; it's the control plane on top.
- Not a Fivetran replacement. `rocky load` handles files (CSV/Parquet/JSONL); for SaaS sources use Fivetran, Airbyte, or warehouse-native CDC.
- Not dbt Cloud: no hosted UI, no managed scheduler. First-class Dagster integration if you need orchestration.

Adapters: Databricks (GA), Snowflake (Beta), BigQuery (Beta), DuckDB (local dev / playground). Apache 2.0.

I'd love feedback on the trust-system framing, the governance surface (particularly classification-to-masking resolution in `rocky compile` and the `rocky compliance` CI gate), the branches/replay design, the cost-attribution primitives, or anything else that catches your eye. Happy to go deep in the thread.

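For readers trying to picture the "schema-prefix" style of branching mentioned in the post, here is a minimal Python sketch of the general idea: each table in a pipeline is mapped to a same-named table in a branch-prefixed schema. This is not Rocky's implementation. The `TableRef` type, the helper functions, and the `<branch>__<schema>` naming convention are all assumptions made up for illustration; the post does not say how Rocky actually names branch schemas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableRef:
    """Hypothetical three-part table reference (catalog.schema.table)."""
    catalog: str
    schema: str
    table: str

    def fqn(self) -> str:
        return f"{self.catalog}.{self.schema}.{self.table}"


def branch_table(ref: TableRef, branch: str) -> TableRef:
    # Assumed convention: prefix the schema with the branch name.
    # The real naming scheme in Rocky may differ.
    return TableRef(ref.catalog, f"{branch}__{ref.schema}", ref.table)


def branch_pipeline(tables: list[TableRef], branch: str) -> dict[str, str]:
    """Map every source table name to its branch-local counterpart."""
    return {t.fqn(): branch_table(t, branch).fqn() for t in tables}


if __name__ == "__main__":
    pipeline = [
        TableRef("main", "sales", "orders"),
        TableRef("main", "sales", "customers"),
    ]
    for src, dst in branch_pipeline(pipeline, "stg").items():
        print(f"{src} -> {dst}")
    # main.sales.orders -> main.stg__sales.orders
    # main.sales.customers -> main.stg__sales.customers
```

A mapping like this only renames targets; whether the branch tables are populated by a full copy, a rebuild, or a native clone is a separate decision that the sketch does not model.
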
ramon156 3 hours ago
If your introduction message already includes a bunch of uncurated claims and LLM smells, then what does that say about the code I'm about to run?

mollerhoj 3 hours ago
It's a bit confusing to claim that "The things your current stack can't give you because it doesn't own the DAG" and use Databricks as your example: Databricks includes jobs and pipelines, so it very much owns the DAG, no?

hasyimibhar 4 hours ago
Looks cool, I've been waiting for someone to build this since the dbt and SQLMesh acquisitions. It would be great to have model versioning and support for ClickHouse SQL.

mergisi 4 hours ago
* * *