skeeter2020 10 hours ago
>> The .csv parser is amazing

Their CSV support, coupled with lots of functions and fast, easy iterative data discovery, has totally changed how I approach investigation problems. I used to spend a significant amount of time up front trying to understand the underlying schema of the problem space, and often there really wasn't one, but you didn't find that out easily. Now I start by pulling in data, writing exploratory queries to validate my assumptions, then cleaning and transforming the data and creating new tables from that state; rinse and repeat. Aside from getting much deeper much quicker, you also hit dead ends sooner, which saves a lot of otherwise wasted time.

There's an interesting paper out there on how the CSV parser works, with some ideas for future enhancements. I couldn't seem to find it, but maybe someone else can?
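For anyone who wants to see the shape of that load, explore, clean, repeat loop, here is a minimal sketch. It is not DuckDB: it uses Python's standard-library `csv` and `sqlite3` as stand-ins so it runs without installing anything, and the sample data, the table names (`raw`, `clean`), and the `load_csv` helper are all made up for illustration. In DuckDB the load step would presumably be a `read_csv` call with type detection instead of the all-TEXT import shown here.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a real CSV export.
RAW_CSV = """host,status,latency_ms
web-1,200,12
web-2,500,
web-1,200,15
web-2,200,n/a
"""


def load_csv(con, name, text):
    """Step 1: pull the data in as-is, every column as TEXT, no schema up front."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    cols = ", ".join(f'"{c}" TEXT' for c in header)
    con.execute(f'CREATE TABLE "{name}" ({cols})')
    marks = ", ".join("?" for _ in header)
    con.executemany(f'INSERT INTO "{name}" VALUES ({marks})', body)


con = sqlite3.connect(":memory:")
load_csv(con, "raw", RAW_CSV)

# Step 2: exploratory query to test an assumption
# ("latency_ms is always a non-empty string of digits").
bad = con.execute(
    "SELECT COUNT(*) FROM raw "
    "WHERE latency_ms = '' OR latency_ms GLOB '*[^0-9]*'"
).fetchone()[0]

# Step 3: clean and transform into a new table, then keep exploring from there.
con.execute(
    """
    CREATE TABLE clean AS
    SELECT host,
           CAST(status AS INTEGER) AS status,
           CAST(latency_ms AS INTEGER) AS latency_ms
    FROM raw
    WHERE latency_ms <> '' AND latency_ms NOT GLOB '*[^0-9]*'
    """
)
summary = con.execute(
    "SELECT host, COUNT(*), AVG(latency_ms) FROM clean "
    "GROUP BY host ORDER BY host"
).fetchall()

print("rows violating the assumption:", bad)
print("per-host summary:", summary)
```

The point of the sketch is only the order of operations: the assumption about `latency_ms` gets checked with a query after loading, rather than being baked into a schema before any data is seen.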
tosh 10 hours ago
not a paper but I found this: https://duckdb.org/2025/04/16/duckdb-csv-pollock-benchmark