amluto 5 hours ago

Do you know what kinds of features the model is picking up on to distinguish ink from papyrus? And did you have any labeled data (images where a human expert has identified ink or perhaps a scan of a burnt scroll with known content) to help train it?

Certainly my Mark 1 eyeballs would not obviously perform better than random guessing at this task. Although my eyeballs are, if nothing else, nerfed by only being able to see a 2D slice of the data.

verditelabs 4 hours ago | parent [-]

Yes. Most of the ink we have come across is carbon-based. It leaves a certain texture on the scrolls that is recoverable and viewable with fairly basic physically based rendering, though how much ink is recoverable varies greatly from one character to the next. I don't have links handy, but we just published updates to the data viewer page on our website. Pherc.Paris.4, I believe, has the best overlay of ink.

A lot of labeled data is available on our FTP server, which has public access.

amluto 2 hours ago | parent | next [-]

When you say "physically based rendering" do you mean that one could build a PBR model based on the (unrolled?) xray data, render that model, and be able to see the ink?

edit: I found this:

https://scrollprize.org/data_browser#/samples/PHercParis4/se...

The JSON seems to suggest that I'm mostly looking at ink detection output, but I could easily be using the tool wrong.

But I also found this awesome explanation:

https://scrollprize.org/data_fragments

I guess a bunch of the training was done using fragments of scrolls where ground truth data is available from IR photography.

Also... that xray resolution is absolutely amazing!

verditelabs an hour ago | parent [-]

Some images on that page, specifically the "alpha composite" and "combined alpha" images, are a pretty simple PBR (if it's even that complex; it's just a composite rendering over a 3d array to a 2d image) rendering with no ML based ink detection in the input.
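For anyone wondering what "a composite rendering over a 3d array to a 2d image" could look like, here is a minimal sketch of front-to-back alpha compositing in NumPy. It is not the code behind the images on that page. The function name `alpha_composite`, the `lo`/`hi` opacity thresholds, the (depth, height, width) axis order, and the assumption that intensities are normalized to [0, 1] are all hypothetical choices made for illustration.

```python
import numpy as np

def alpha_composite(volume, lo=0.3, hi=0.8):
    """Collapse a (depth, height, width) volume to a 2D image.

    Hypothetical sketch: intensities are assumed to lie in [0, 1], and
    lo/hi are made-up thresholds that map intensity to opacity.
    """
    vol = np.asarray(volume, dtype=np.float32)
    # Linear ramp: intensity <= lo is transparent, >= hi is fully opaque.
    alpha = np.clip((vol - lo) / (hi - lo), 0.0, 1.0)

    image = np.zeros(vol.shape[1:], dtype=np.float32)
    transmittance = np.ones(vol.shape[1:], dtype=np.float32)

    # Walk the layers front to back along axis 0.
    for layer, a in zip(vol, alpha):
        image += transmittance * a * layer   # light contributed by this layer
        transmittance *= 1.0 - a             # what still passes through it
    return image

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.random((16, 4, 4), dtype=np.float32)  # synthetic stand-in volume
    print(alpha_composite(demo).shape)
```

With a real volume you would pass a small stack of layers around the segmented papyrus surface, and the thresholds would need tuning per scan.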

londons_explore 3 hours ago | parent | prev [-]

I assume that's because the writer, shortly after re-inking the writing instrument, was sometimes putting down a 10x thicker layer...