egruy 5 days ago

I am building Tracelake - a data quality solution for SAP replications.

https://tracelake.com/

Why? SAP holds the most important data for the companies that use it, but it's notoriously difficult to replicate that data consistently into a data analytics platform (think Snowflake, Redshift, etc.).

A couple of companies specialize in SAP replication, but it's hard to validate the correctness of the replicated data, because:

- the SAP data is changing continuously and rapidly

- there are hundreds of tables and TBs of data

Usually it's the consumers of data downstream who notice that the data just "doesn't feel right".

Tracelake adds a validation layer on top of the SAP-to-X replication. It periodically compares the data in the source and the target and tells you about any missing or incorrect data, so you can tackle data quality issues proactively.
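
To make the idea of a source/target comparison concrete, here is a minimal sketch in Python. It is only an illustration and not a description of how Tracelake works internally: the function names (`row_digest`, `compare_tables`), the in-memory lists of rows, and the single-column primary key are all assumptions made for the example.

```python
import hashlib
from typing import Hashable, Iterable, Mapping


def row_digest(row: Mapping[str, object]) -> str:
    """Hash a row's values in a stable column order (hypothetical helper)."""
    canonical = "|".join(f"{col}={row[col]!r}" for col in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_tables(
    source: Iterable[Mapping[str, object]],
    target: Iterable[Mapping[str, object]],
    key: str,
) -> dict[str, list[Hashable]]:
    """Compare two snapshots of a table by primary key and row digest."""
    src = {row[key]: row_digest(row) for row in source}
    tgt = {row[key]: row_digest(row) for row in target}
    shared = src.keys() & tgt.keys()
    return {
        # Present in the source but never arrived in the target.
        "missing": sorted(src.keys() - tgt.keys()),
        # Present in the target but not (or no longer) in the source.
        "unexpected": sorted(tgt.keys() - src.keys()),
        # Same key on both sides, but the column values differ.
        "mismatched": sorted(k for k in shared if src[k] != tgt[k]),
    }


if __name__ == "__main__":
    # Made-up sample rows, purely for illustration.
    source_rows = [
        {"id": 1, "amount": 100},
        {"id": 2, "amount": 200},
        {"id": 3, "amount": 300},
    ]
    target_rows = [
        {"id": 1, "amount": 100},
        {"id": 2, "amount": 250},
        {"id": 4, "amount": 400},
    ]
    print(compare_tables(source_rows, target_rows, key="id"))
```

Because the SAP data changes continuously, a real check would also have to compare both sides as of the same point in time, and with TBs of data it could not load whole tables into memory like this sketch does.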