Show HN: ChunkHound, a local-first tool for understanding large codebases

ChunkHound’s goal is simple: local-first codebase intelligence that helps you pull deep, core-dev-level insights on demand, generate always-up-to-date docs, and scale from small repos to enterprise monorepos — while staying free + open source and provider-agnostic (VoyageAI / OpenAI / Qwen3, Anthropic / OpenAI / Gemini / Grok, and more).

I’d love your feedback — and if you have, thank you for being part of the journey!

▲

goda90 3 hours ago | parent | next [-]

A few years ago I set out to refactor some of my team's code that I wasn't particularly familiar with, but that we wanted to modularize and reuse in more places. The primary file alone was 18k+ lines of TypeScript that was a terrible mess of spaghetti. Most of it had been written in JavaScript but later converted haphazardly. I ended up writing myself a little app that used the TypeScript compiler APIs to help me explore the many branches of the code and annotate how I would refactor different parts. It helped a bit, but I never got time to add some of the more intelligent features I wanted, like finding every execution path between two points.

	▲	henryhale an hour ago | parent [-]
		give depgraph a try - https://github.com/henryhale/depgraph - i'd like to learn about how i could improve it.

▲

henryhale an hour ago | parent | prev | next [-]

I have been working on depgraph (https://github.com/henryhale/depgraph) for a while now. It is truly local, with several output options (json, mermaid, jsoncanvas). Multiple languages are supported (js, go, c), and I'm expanding the list slowly but surely.

▲

dmos62 22 minutes ago | parent | prev | next [-]

Will try this out. Was always envious of how Augment was able to do this. Kudos.

▲

dcreater 4 hours ago | parent | prev | next [-]

You say "local-first" but have made the Voyage API the default for embeddings (I had to go to the website and dig to find that you can in fact use local embedding models). Please fix.

▲

apgwoz 3 hours ago | parent | prev | next [-]

Perhaps I am missing something, but this seems to require a Lemon (LLM)? Is the idea that the Lemon is used to help build an index AOT that can be queried locally, after?

I want to figure out how to build advanced tools, potentially by leveraging Lemons to iterate quickly, that allow us all to rely _less_ on Lemons but still get 10, 20, 30x efficiency gains when building software, without needing to battle the ethics of it all.

▲

conception an hour ago | parent | prev | next [-]

I have ChunkHound in a few projects, and it’s noted in both the agent md file as well as MCP, and Claude never uses it. Ever. Never once.

Is there a prompt special sauce y’all use to get it to use it?

▲

Neywiny 5 hours ago | parent | prev | next [-]

Might give this a try to experiment if it's really free to use (I'll have to read up on that, I guess). The qemu codebase is huge and every contributor seems to solve problems in slightly different ways. Would be nice if this tool could help distill it.

▲

dogman123 5 hours ago | parent | prev | next [-]

Is there a way to have the model inside of Codex make use of ChunkHound instead of its “built in” search/explore functionality with rg? Whenever I spin up a new agent using xhigh thinking, it spins its wheels for a while to get up to speed, so I'm wondering if ChunkHound can make this process faster.

▲

CamperBob2 2 hours ago | parent | prev | next [-]

Looks like the tutorial link is broken.

▲

bravura 4 hours ago | parent | prev [-]

Can you please expose the functionality as a self-documenting CLI command with machine-readable output? (Or did I misunderstand, and MCP isn't the only way to use it?)

I am curious to try it but do not want to adopt MCP servers.

Telling Claude to call the CLI tool is more efficient.

▲

blackqueeriroh 2 hours ago | parent | next [-]

Am I confused or is this not an open-source project on GitHub?

You have every ability to make these modifications yourself; is there a reason you feel the need to require the creator to do so?

	▲	from_memory an hour ago | parent [-]
		I think the term is "Instrumentalism".

▲

dcreater 4 hours ago | parent | prev [-]

Agreed. And to make CLI usage more effective/efficient, it would be excellent if you could publish a skill.