noosphr 5 days ago
The AI company I've been working at ran out of money last week, so I'm taking a month-long break. I've been playing around with defining an easy-to-implement standard for serializing tabular data using the ASCII delimiters. So far I've got:
Which seems like a good way to avoid all the trouble of escaping separators in CSV files, if a bit clunky, since you need to end each record with US RS and each file with US RS GS.

I also accidentally found another test that _all_ LLMs fail (including all the reasoning models): deciding whether a given string is derivable from a grammar. I was asking for tests before I started coding, and _every_ frontier model gave me obvious garbage. I haven't seen such bad performance on such low-hanging fruit for automated training in over a year.
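For anyone who wants to poke at the idea, here is a minimal sketch of the terminator scheme described above, with a membership check for the "is this string derivable" question. The grammar is an assumption reconstructed from the description (every field ends with US, every record ends with RS, the file ends with GS), not the actual draft, and the names `encode`, `decode`, and `is_derivable` are made up for illustration.

```python
# Hypothetical sketch. Assumed grammar (not the commenter's actual spec):
#   file   := record* GS
#   record := field* RS
#   field  := <any chars except US, RS, GS> US
US = "\x1f"  # Unit Separator: terminates each field
RS = "\x1e"  # Record Separator: terminates each record
GS = "\x1d"  # Group Separator: terminates the file


def encode(rows):
    """Serialize a list of rows (each a list of strings)."""
    out = []
    for row in rows:
        for field in row:
            # No escaping exists in this scheme, so delimiters in data are rejected.
            if any(d in field for d in (US, RS, GS)):
                raise ValueError("field contains a delimiter character")
            out.append(field + US)
        out.append(RS)
    out.append(GS)
    return "".join(out)


def decode(text):
    """Parse text into rows; raise ValueError if it does not match the assumed grammar."""
    if not text.endswith(GS):
        raise ValueError("missing trailing GS")
    body = text[:-1]
    if GS in body:
        raise ValueError("GS before end of file")
    rows, row, field = [], [], []
    for ch in body:
        if ch == US:
            row.append("".join(field))
            field = []
        elif ch == RS:
            if field:
                raise ValueError("field not terminated by US before RS")
            rows.append(row)
            row = []
        else:
            field.append(ch)
    if field or row:
        raise ValueError("last record not terminated by RS")
    return rows


def is_derivable(text):
    """Return True if text can be derived from the assumed grammar."""
    try:
        decode(text)
    except ValueError:
        return False
    return True
```

Under these assumptions, commas and newlines inside fields need no escaping, and any non-empty table whose last row has at least one field ends in US RS GS, matching the description. Whether an empty file (a lone GS) should be accepted is a design choice this sketch makes arbitrarily.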
mac3n 4 days ago | parent
Hey, good to see someone using ASCII. Don't forget File Separator, 0x1c.