RobinL 4 days ago

We use AWS Glue for Spark (but are increasingly moving towards DuckDB because it's faster for our workloads and easier to test and deploy).

For Spark, Glue works quite well. We use it as 'Spark as a service', keeping our code as close to vanilla PySpark as possible. This leaves us free to write our code in normal Python files, write our own (tested) libraries which are used in our jobs, use GitHub for version control and CI, and so on.
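
A rough sketch of what that layout could look like. The function names, the `name` column and the paths are hypothetical, chosen only for illustration. The idea is that the logic lives in a plain Python function that can be unit tested without a cluster, and the job entry point only takes an ordinary SparkSession, so nothing in it is Glue-specific.

```python
# transforms.py: a hypothetical library module shared by several jobs.

def clean_name_expr(col: str) -> str:
    """Return a Spark SQL expression that trims and lower-cases a column."""
    return f"lower(trim({col}))"


def run_job(spark, input_path: str, output_path: str) -> None:
    """Job entry point. `spark` is a plain SparkSession, however it was created."""
    # Imported lazily so the pure helper above can be tested without Spark installed.
    from pyspark.sql import functions as F

    df = spark.read.parquet(input_path)
    df = df.withColumn("name_clean", F.expr(clean_name_expr("name")))
    df.write.mode("overwrite").parquet(output_path)
```

A CI run can then check `clean_name_expr` with a normal test runner, and `run_job` can be exercised against a local SparkSession before it is deployed.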