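Reproducing the text of a book in the library is equivalent to identifying the book: an identifier that picks out exactly one book has to carry enough information to reconstruct its text. So finding a short identifier for a book is really just "text compression", which is a well-studied field.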
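To make the equivalence concrete, here is a minimal sketch in Python, assuming we use the standard `zlib` module as the compressor and treat the compressed bytes as one large integer "identifier". The function names `book_to_id` and `id_to_book` are hypothetical, chosen only for this illustration; they are not how any particular library implementation works.

```python
import zlib


def book_to_id(text: str) -> int:
    """Compress the text and read the compressed bytes as one big integer."""
    packed = zlib.compress(text.encode("utf-8"), 9)
    # Prefix a 0x01 byte so any leading zero bytes survive the integer round trip.
    return int.from_bytes(b"\x01" + packed, "big")


def id_to_book(book_id: int) -> str:
    """Recover the text from the identifier by undoing the steps above."""
    raw = book_id.to_bytes((book_id.bit_length() + 7) // 8, "big")
    # Drop the 0x01 marker byte, then decompress.
    return zlib.decompress(raw[1:]).decode("utf-8")


if __name__ == "__main__":
    text = "the quick brown fox jumps over the lazy dog. " * 50
    book_id = book_to_id(text)
    assert id_to_book(book_id) == text
    id_bytes = (book_id.bit_length() + 7) // 8
    print(len(text.encode("utf-8")), "bytes of text ->", id_bytes, "bytes of identifier")
```

The identifier is only shorter than the text when the text has structure the compressor can exploit, as the repetitive sample here does. For text that looks random, the identifier will be about as long as the book itself, or slightly longer.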