tremon 6 days ago

You don't actually need to physically copy data. Just create a view for every table that performs a replacing merge between the original read-only data and the developer's own copy, then put a trigger on the view to redirect writes to the same private-copy table, making the whole thing transparent to the user.
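This view-plus-trigger pattern can be sketched in a few lines with SQLite, which supports `INSTEAD OF` triggers on views. The table and column names here are illustrative, not from any particular system:

```python
# Sketch of a "replacing merge" view over a read-only base table plus a
# private delta table, with INSTEAD OF triggers redirecting writes.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The shared, read-only original data.
CREATE TABLE customers_base (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO customers_base VALUES (1, 'Alice'), (2, 'Bob');

-- The developer's private copy: only the rows they have touched.
CREATE TABLE customers_delta (id INTEGER PRIMARY KEY, name TEXT);

-- Replacing merge: private rows win over the originals.
CREATE VIEW customers AS
    SELECT * FROM customers_delta
    UNION ALL
    SELECT * FROM customers_base
    WHERE id NOT IN (SELECT id FROM customers_delta);

-- Redirect writes on the view into the private table, so the
-- setup is transparent to the user.
CREATE TRIGGER customers_upd INSTEAD OF UPDATE ON customers
BEGIN
    INSERT OR REPLACE INTO customers_delta VALUES (NEW.id, NEW.name);
END;

CREATE TRIGGER customers_ins INSTEAD OF INSERT ON customers
BEGIN
    INSERT INTO customers_delta VALUES (NEW.id, NEW.name);
END;
""")

conn.execute("UPDATE customers SET name = 'Alicia' WHERE id = 1")
conn.execute("INSERT INTO customers VALUES (3, 'Carol')")

rows = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
print(rows)   # merged view: the edit and the insert are both visible
base = conn.execute("SELECT id, name FROM customers_base ORDER BY id").fetchall()
print(base)   # the original data is untouched
```

A real implementation would also need a tombstone mechanism (an `INSTEAD OF DELETE` trigger writing to a deleted-keys table) to make deletions private as well.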

Not disputing that Oracle might have had something like this built-in, but it sounds like something that I could have whipped up in a day or so as a custom solution. I actually proposed a similar system to create anonymized datasets for researchers when I worked at a national archive institute.

TheMrZZ 6 days ago | parent [-]

Snowflake uses a similar system with its zero-copy cloning. A clone starts by referencing the original table's partitions and keeps track of the delta created by subsequent operations. I've always found that built-in mechanism pretty neat!
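The copy-on-write idea behind this can be shown with a toy Python class (this is an illustration of the concept, not Snowflake's actual implementation; the key-value "table" stands in for partitioned storage):

```python
# Toy copy-on-write clone: the clone shares the original's (large) base
# storage and only records its own (small) delta of changes.

class Table:
    def __init__(self, rows=None):
        self._base = dict(rows or {})   # shared storage, never mutated
        self._delta = {}                # changes made after construction

    def clone(self):
        # Zero-copy: the clone points at the SAME base mapping and only
        # copies the (typically tiny) delta metadata.
        c = Table.__new__(Table)
        c._base = self._base
        c._delta = dict(self._delta)
        return c

    def put(self, key, value):
        self._delta[key] = value        # writes land in the delta only

    def get(self, key):
        # Delta wins over the shared base (a replacing merge).
        return self._delta.get(key, self._base.get(key))

orig = Table({"a": 1, "b": 2})
dev = orig.clone()
dev.put("a", 99)                        # edit is visible only in the clone

print(dev.get("a"), orig.get("a"))      # 99 1
print(dev._base is orig._base)          # True: storage is shared
```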

gregw2 4 days ago | parent [-]

I first heard about this feature from Snowflake, but there are similar options in other ecosystems that may be of interest to someone here, plus one thing to keep in mind even with Snowflake's implementation...

Snowflake's implementation only works within a single Snowflake account, not cross-account, which means that if you want to clone across dev/QA/prod, you must manage those environments within a single Snowflake account.

BigQuery has a very similar "table clone" feature. It works across GCP projects (accounts) but not across organizations.

Redshift and Azure Synapse do not really have this feature at all.

Databricks, Microsoft Fabric, and the Nessie catalog for Iceberg do support something similar, often called shallow cloning.

(Nobody really supports cross-region cloning... which makes sense if you think about it.)