▲ | aaroninsf 4 days ago | |
Funny, I wrote my own, so had to click, but mine was for a very different use case: converting extremely varied date strings into date ranges, where a significant % of cases are large number are human-entered human-readable date and date range specifiers, as used in periodicals and other material dating back a century or two. I.e. I had correctly interpret not just ISO dates, but, ambiguous dates and date ranges as accepted in (library catalog) MARC records, which allows uncertain dates such as "[19--]" and "19??", and, natural language descriptors such such as "Winter/Spring 1917" and "Third Quarter '43" and "Easter 2001." In as many languages as possible for the corpus being digitized. Output was a date range, where precision was converted into range. I'd like to someday enhance things to formalize the distinction between ambiguity and precision, but, that's a someday. When schema is totally uncontrolled, many cases are ambiguous without other context (e.g. XX-ZZ-YYYY could be month-day-year or day-month-year for a large overlap); and some require fun heuristics (looking up Easter in a given year... but did they mean orthodox or...) and arbitrary standards (when do seasons start? what if a publication is from the southern hemisphere?) and policies for illegal dates (Feburary 29 on non-leap-years being a surprisingly common value)... In a dull moment I should clean up the project (in Python) and package it for general use... |