csjh 8 hours ago
Obligatory DuckDB solution:

    duckdb -s "COPY (SELECT url[20:] AS url, date, count(*) AS c FROM read_csv('data.csv', columns = { 'url': 'VARCHAR', 'date': 'DATE' }) GROUP BY url, date) TO 'output.json' (ARRAY)"

Takes about 8 seconds on my M1 MacBook. The JSON isn't in the right format, but fixing that wouldn't dominate the execution time.
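For reference, the aggregation that query performs can be sketched in plain Python with only the stdlib (csv + Counter). The sample rows and the output shape are assumptions for illustration, not taken from the challenge spec:

```python
import csv
import io
import json
from collections import Counter

# Hypothetical stand-in for data.csv: url,date rows.
sample = """https://example.com/blog/post-one,2024-01-01
https://example.com/blog/post-one,2024-01-01
https://example.com/blog/post-two,2024-01-02
"""

counts = Counter()
for url, date in csv.reader(io.StringIO(sample)):
    # url[19:] mirrors DuckDB's 1-based url[20:], dropping the
    # "https://example.com" prefix to keep only the path.
    counts[(url[19:], date)] += 1

# One object per (url, date) group, like the query's SELECT list.
rows = [{"url": u, "date": d, "c": c} for (u, d), c in counts.items()]
print(json.dumps(rows))
```

DuckDB does the same grouping in parallel over the whole file, which is why it stays in the single-digit-seconds range on 100 million rows while a pure-Python loop would not.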
cess11 7 hours ago | parent
This log in one of the PRs claims a 5.4s running time on some Mac. https://github.com/tempestphp/100-million-row-challenge/pull...