letaem77 6 days ago
This is how I do it:

1. Pack the whole repository into a single text file with Repomix (formerly Repopack): https://github.com/yamadashy/repomix

2. Compress the file with LLMLingua-2 to reduce the token count: https://github.com/microsoft/LLMLingua (fewer tokens = more context fits in the LLM's window = the LLM understands your repository better)

3. Paste the compressed text as-is into ChatGPT's input field, or into a local LLM, as context.

4. Ask the LLM to generate the documentation. For example: "This is a repository's source code. Given this context, generate a table of contents." You will get a ToC. If it looks good, ask for the first chapter, and keep going until the whole documentation is done.

If you are documenting a TypeScript/JavaScript codebase, you can also use a bundler such as esbuild in step 2, which helps reduce the token count.

If you are interested in the LLMLingua-2 part of step 2, check out my TypeScript port, which runs without any installation: https://atjsh.github.io/llmlingua-2-js/
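For anyone who wants to script step 4 instead of pasting by hand, here is a rough Python sketch of the ToC-then-chapters loop. Everything in it is my own assumption, not part of any of the tools above: the function names (`toc_prompt`, `chapter_prompt`, `generate_docs`) are made up, the prompt wording is only an example, and `ask` stands for whatever function you use to send a prompt to your LLM and get text back.

```python
from typing import Callable, List


def toc_prompt(context: str) -> str:
    """Build the first prompt: packed repository text plus a request for a ToC."""
    return (
        "This is a repository's source code.\n\n"
        f"{context}\n\n"
        "Given this context, generate a table of contents for its documentation, "
        "one chapter title per line."
    )


def chapter_prompt(context: str, toc: List[str], index: int) -> str:
    """Build the prompt for one chapter, repeating the context and the full ToC."""
    outline = "\n".join(f"{i + 1}. {title}" for i, title in enumerate(toc))
    return (
        "This is a repository's source code.\n\n"
        f"{context}\n\n"
        f"Table of contents:\n{outline}\n\n"
        f"Write chapter {index + 1}: {toc[index]}"
    )


def generate_docs(context: str, ask: Callable[[str], str]) -> List[str]:
    """Ask for a ToC, then ask for each chapter in order.

    `ask` is a hypothetical callable (prompt in, reply text out); plug in
    your own ChatGPT or local-LLM client here.
    """
    toc_reply = ask(toc_prompt(context))
    # Treat every non-empty line of the reply as one chapter title.
    toc = [line.strip() for line in toc_reply.splitlines() if line.strip()]
    return [ask(chapter_prompt(context, toc, i)) for i in range(len(toc))]
```

Note that this skips the "if it looks good" check: in practice you would probably want to read the ToC yourself before generating chapters. It also resends the whole context with every prompt, which is one more reason the compression in step 2 helps.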