AusiasTsel 3 days ago

Author here. The piece is about bibliographic infrastructure, but the finding that surprised me most while building the dataset was language-specific: Catalan/Valencian (~10M speakers) jumped from near-invisibility in commercial aggregators to 8th place globally once nine national library catalogues were cross-referenced. Bengali, Thai and Urdu, all with substantial publishing industries, remained near the bottom, not because translations don't exist but because the institutions documenting them haven't been connected yet. The 97% figure (editions appearing in only one of 14 sources) held across every sample I could run. Happy to answer questions about methodology, source coverage, or why ISBN metadata is such a mess.

btrettel 2 hours ago

Have you all considered adding scientific articles to your bibliographic database? Finding existing translations of scientific articles can be a real pain. I know because I spent a lot of time doing that during my PhD [1].

For a while I was collaborating with Victor Venema in the volunteer organization Translate Science [2] to try to create a bibliographic database of scientific translations, but unfortunately Victor died, and I became too busy to continue.

[1] https://academia.stackexchange.com/a/93209/31143

[2] https://translate-science.codeberg.page/