ramses0 3 hours ago

I'll share this comment from 7 months ago with you:

https://news.ycombinator.com/item?id=40100069

"prefer shallow arrays of 'records', possibly with a deeply nested 'uri'-style identifier"

...the clutch result is: "it can be loaded into a database and treated as a table".

The origin of this technique for me was something someone said back around 2000 (effectively modernized here):

    sqlite-utils insert example.db ls_lart <( jc ls -lart )
    sqlite3 example.db --json \
      "SELECT COUNT(*) AS c, flags FROM ls_lart GROUP BY flags"
    [
      {
        "c": 9,
        "flags": "-rw-r--r--"
      },
      {
        "c": 2,
        "flags": "drwxr-xr-x"
      }
    ]
...this is a 'trivial' example, but it puts a really fine point on the capabilities it unlocks. You're not restricted to building a single pipeline: you can use full relational queries (eg: `... WHERE date > ...`, `... LEFT JOIN files ON git_status...`), and you can refer to things by column names rather than weird regexes or `awk` scripts.
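
To make the `WHERE` / `LEFT JOIN` point concrete, here's a rough sketch of the same idea using only Python's stdlib `sqlite3`. Everything in it is made up for illustration: the sample records, the `git_status` table, the `size` and `status` columns, and the `load` helper are all assumptions, not real `jc` or `sqlite-utils` output.

```python
import json
import sqlite3

# Hypothetical records, shaped like a shallow array that a tool such as `jc` might emit.
LS_JSON = """[
  {"filename": "README.md", "flags": "-rw-r--r--", "size": 1200},
  {"filename": "main.py", "flags": "-rw-r--r--", "size": 3400},
  {"filename": "src", "flags": "drwxr-xr-x", "size": 96}
]"""

# Hypothetical second source (imagine it was parsed from `git status`).
GIT_JSON = """[
  {"filename": "main.py", "status": "modified"}
]"""


def load(conn, table, records):
    """Create `table` from a shallow list of dicts and insert every record."""
    columns = list(records[0])
    col_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE "{table}" ({col_list})')
    conn.executemany(
        f'INSERT INTO "{table}" ({col_list}) VALUES ({placeholders})',
        [tuple(r[c] for c in columns) for r in records],
    )


conn = sqlite3.connect(":memory:")
load(conn, "ls_lart", json.loads(LS_JSON))
load(conn, "git_status", json.loads(GIT_JSON))

# Same shape as the GROUP BY query in the shell example.
by_flags = conn.execute(
    "SELECT COUNT(*) AS c, flags FROM ls_lart GROUP BY flags ORDER BY flags"
).fetchall()

# A filter plus a join across two "record arrays", referring to columns by name.
joined = conn.execute(
    """
    SELECT l.filename, l.size, g.status
    FROM ls_lart AS l
    LEFT JOIN git_status AS g ON g.filename = l.filename
    WHERE l.size > 1000
    ORDER BY l.filename
    """
).fetchall()

print(by_flags)
print(joined)
```

Because both inputs are flat arrays of records, each one becomes a table with no reshaping, and the join is a single query instead of a sort-then-merge pipeline.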

This particular example is "dumb" (but ayyyy, I didn't get a UUOC cat award!) in that you can easily muddle through it in different (existing pipeline) ways, but SQL crushes the primitive POSIX relational tooling (so old, ugly, and unused that the tools are tough to find!), eg: `comm`, `paste`, `uniq`, `awk`.