Fast Concordance: Instant concordance on a corpus of >1,200 books (iafisher.com)
34 points by evakhoury 4 days ago | 3 comments
simonw 4 hours ago
This is a neat brute-force search system - it uses goroutines, one for each of the 1,200 books in the corpus, and has each one do a regex search against the in-memory text for that book. Here's a neat trick I picked up from the source code:
An earlier comment explains this:
So instead of `\bWORD\b` it does the simplest possible match and then checks whether the character one index before the match or one index after it is a letter. If either is, it skips the match.
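For readers who want to see the shape of that approach, here is a minimal Go sketch. It is not the site's actual code: the names `findWholeWord` and `searchCorpus`, the map of book texts, and the use of `unicode.IsLetter` for the neighbor check are all assumptions made for illustration. It matches the literal word with a plain regex, skips any hit that has a letter directly before or after it, and runs one goroutine per book.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
	"unicode"
	"unicode/utf8"
)

// findWholeWord (hypothetical helper) returns the byte offsets of every
// occurrence of word in text that is not directly preceded or followed
// by a letter. It stands in for a `\bWORD\b` pattern.
func findWholeWord(text, word string) []int {
	// Simplest possible pattern: the literal word, no boundary assertions.
	re := regexp.MustCompile(regexp.QuoteMeta(word))
	var hits []int
	for _, m := range re.FindAllStringIndex(text, -1) {
		// Decode the runes adjacent to the match. At the edges of the text
		// these calls return utf8.RuneError, which is not a letter.
		before, _ := utf8.DecodeLastRuneInString(text[:m[0]])
		after, _ := utf8.DecodeRuneInString(text[m[1]:])
		if unicode.IsLetter(before) || unicode.IsLetter(after) {
			continue // the match is part of a longer word, so skip it
		}
		hits = append(hits, m[0])
	}
	return hits
}

// searchCorpus (hypothetical) starts one goroutine per book and collects
// the hit offsets for each book that contains the word.
func searchCorpus(books map[string]string, word string) map[string][]int {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	results := make(map[string][]int)
	for title, text := range books {
		wg.Add(1)
		go func(title, text string) {
			defer wg.Done()
			hits := findWholeWord(text, word)
			if len(hits) == 0 {
				return
			}
			mu.Lock()
			results[title] = hits
			mu.Unlock()
		}(title, text)
	}
	wg.Wait()
	return results
}

func main() {
	// Made-up sample texts, purely for demonstration.
	books := map[string]string{
		"Book A": "the cat concatenated; cat",
		"Book B": "no felines here",
	}
	fmt.Println(searchCorpus(books, "cat"))
}
```

In this sketch the hit inside "concatenated" is dropped because the character before it is a letter, while the two standalone occurrences are kept.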
2b3a51 5 hours ago
It is, indeed, impressively fast. The results seem to be sorted by first name of author. Is that a deliberate choice?
drivebyhooting 2 hours ago
It seems to work at the word level. Why not use a precomputed posting list?
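As a rough illustration of what a precomputed posting list could look like, here is a minimal Go sketch of an inverted index that maps each lowercased word to the books and byte offsets where it occurs. Nothing in the thread says the site works this way. The `Posting` and `Index` types, the `buildIndex` function, and the rule that a word is a run of letters are all assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// Posting (hypothetical type) records one occurrence of a word.
type Posting struct {
	Book   int // index of the book in the corpus slice
	Offset int // byte offset of the word within that book's text
}

// Index (hypothetical type) maps a lowercased word to its posting list.
type Index map[string][]Posting

// buildIndex scans every book once and records where each word starts.
// A "word" here is assumed to be a maximal run of letters.
func buildIndex(books []string) Index {
	idx := make(Index)
	for b, text := range books {
		start := -1 // byte offset of the word in progress, or -1 if none
		for i, r := range text {
			if unicode.IsLetter(r) {
				if start < 0 {
					start = i
				}
				continue
			}
			if start >= 0 {
				w := strings.ToLower(text[start:i])
				idx[w] = append(idx[w], Posting{Book: b, Offset: start})
				start = -1
			}
		}
		if start >= 0 { // the text ended in the middle of a word
			w := strings.ToLower(text[start:])
			idx[w] = append(idx[w], Posting{Book: b, Offset: start})
		}
	}
	return idx
}

func main() {
	// Made-up sample texts, purely for demonstration.
	idx := buildIndex([]string{"The cat sat.", "A cat; the end"})
	fmt.Println(idx["cat"])
}
```

With an index like this, a single-word lookup becomes a map access instead of a scan, at the cost of building and holding the index in memory. Multi-word phrases or arbitrary patterns would need extra work on top, such as intersecting posting lists by offset.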