kordlessagain 8 hours ago

This is really cool...great job! It's a favorite pastime of mine to index various large corpora.

As for speed, this might help for code referencing: https://github.com/deepbluedynamics/lume

Blog post: https://deepbluedynamics.com/blog/lume-retrieval-primitives

I use a small local model to extract entities for the graph, but it's not necessary.

You can optionally use GTR-T5, which is a few years old now but still good for generating fast, free embeddings. If you run it in hybrid mode, that step only runs once.
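To make the "only run once" point concrete, here is a rough sketch of the general pattern, not lume's actual code: embed the corpus once, cache the vectors on disk, and blend a keyword score with embedding similarity at query time. Everything named here (`toy_embed`, `load_or_build_index`, `hybrid_search`, the `alpha` weight, the JSON cache file) is hypothetical and for illustration only. `toy_embed` is a hashed bag-of-words stand-in so the example runs without downloads; in practice you would pass in a real embedder such as GTR-T5. The sketch also assumes the corpus does not change after the cache is written.

```python
import hashlib
import json
import math
from pathlib import Path


def toy_embed(text, dim=64):
    """Stand-in embedder (hypothetical): hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for tok in text.lower().split():
        idx = int(hashlib.md5(tok.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def load_or_build_index(docs, cache_path, embed=toy_embed):
    """Embed every document once; later calls reuse the cached vectors."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        return json.loads(cache_path.read_text())
    index = {doc_id: embed(text) for doc_id, text in docs.items()}
    cache_path.write_text(json.dumps(index))
    return index


def keyword_score(query, text):
    """Fraction of query tokens that also appear in the document."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_search(query, docs, index, embed=toy_embed, alpha=0.5):
    """Rank doc ids by alpha * keyword score + (1 - alpha) * cosine similarity."""
    qv = embed(query)
    scored = []
    for doc_id, text in docs.items():
        # Vectors are unit length, so the dot product is the cosine similarity.
        cos = sum(a * b for a, b in zip(qv, index[doc_id]))
        scored.append((alpha * keyword_score(query, text) + (1 - alpha) * cos, doc_id))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]
```

Only the query gets embedded per search; the document vectors come from the cache, which is why the embedding cost is paid a single time.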

Feel free to take it and remix or use it!