simlevesque 2 hours ago
I've been making skills from arXiv papers for a while. I have one for multi-object tracking, for example. It has a SKILL.md describing all the important papers (over 30) on the subject, and a folder with each paper's full content as reStructuredText. For feeding arXiv papers to LLMs, I found that RST gives the best token-count/fidelity ratio: Markdown lacks precision, and LaTeX is too verbose.

I have a script with the papers' URLs, names, and dates that downloads the LaTeX zips from arXiv, extracts them, transforms them to RST, and adds them to the right folder. Then I ask an LLM to make a summary from the full text, then I give other LLMs the full paper again along with the summary and ask them to improve and proofread it. While this goes on I read the papers myself, and at the end I read the summaries; if I approve them, I add them to the skill. I also add, for each paper, info on how well the algorithms described do on common benchmarks.

I highly recommend doing something similar if you're working in a cutting-edge domain. Also, I'd like to know if anyone has recommendations to improve what I do.
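The download → extract → convert part of that pipeline could be sketched roughly like this. This is a minimal sketch, not the commenter's actual script: the `PAPERS` entry, the folder layout, and the use of pandoc for the LaTeX-to-RST conversion are all my assumptions (pandoc must be on PATH).

```python
"""Sketch: fetch arXiv LaTeX sources and convert them to RST for a skill folder."""
import subprocess
import tarfile
import urllib.request
from pathlib import Path

# (arXiv id, short name, year) -- a hypothetical example entry
PAPERS = [
    ("2110.06864", "bytetrack", 2021),
]


def eprint_url(arxiv_id: str) -> str:
    """arXiv serves each paper's LaTeX source archive at /e-print/<id>."""
    return f"https://arxiv.org/e-print/{arxiv_id}"


def ingest(arxiv_id: str, name: str, skill_dir: Path) -> Path:
    """Download one paper's source, extract it, and convert the main file to RST."""
    work = skill_dir / "sources" / name
    work.mkdir(parents=True, exist_ok=True)
    archive = work / "source.tar.gz"
    urllib.request.urlretrieve(eprint_url(arxiv_id), archive)
    with tarfile.open(archive) as tar:
        tar.extractall(work)
    # Heuristic: assume the largest .tex file in the archive is the main one.
    main_tex = max(work.glob("**/*.tex"), key=lambda p: p.stat().st_size)
    out = skill_dir / "papers" / f"{name}.rst"
    out.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["pandoc", str(main_tex), "-f", "latex", "-t", "rst", "-o", str(out)],
        check=True,
    )
    return out
```

Usage would be a loop over `PAPERS` calling `ingest(arxiv_id, name, Path("skills/multi-object-tracking"))`; the summarize-and-proofread steps the commenter describes would then run over the resulting `.rst` files.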
paulluuk 2 hours ago
This sounds like it would work, but honestly, if you've already read all 30 papers fully, what do you still need the LLM to do for you? Just the boilerplate?
| ||||||||||||||||||||
ctoth an hour ago
I've been working on ctoth/research-papers-plugin, a pipeline to actually get LLMs to extract the notes. I really like your insight re: RST over Markdown! It sounds like we're working on similar stuff, and I'll absolutely reach out :)
| ||||||||||||||||||||
satvikpendem an hour ago
Does that even fit in the context window? It seems like 30 papers' worth of content would just overflow it.
| ||||||||||||||||||||
alex000kim 2 hours ago
Sounds similar to "LLM Knowledge Bases": https://xcancel.com/karpathy/status/2039805659525644595
MrLeap 2 hours ago
What is RST? | ||||||||||||||||||||
| ||||||||||||||||||||