▲ Show HN: Sourcebot – Self-hosted Perplexity for your codebase (github.com)
101 points by bshzzle 5 days ago | 27 comments
Hi HN,

We’re Brendan and Michael, the creators of Sourcebot (https://www.sourcebot.dev/), a self-hosted code understanding tool for large codebases. We originally launched on HN 9 months ago with code search (https://news.ycombinator.com/item?id=41711032), and we’re excited to share our newest feature: Ask Sourcebot.

Ask Sourcebot is an agentic search tool that lets you ask complex questions about your entire codebase in natural language, and returns a structured response with inline citations back to your code. Some types of questions you might ask:

- “How does authentication work in this codebase? What library is being used? What providers can a user log in with?” (https://demo.sourcebot.dev/~/chat/cmdpjkrbw000bnn7s8of2dm11)
- “When should I use channels vs. mutexes in Go? Find real usages of both and include them in your answer” (https://demo.sourcebot.dev/~/chat/cmdpiuqhu000bpg7s9hprio4w)
- “How are shards laid out in memory in the Zoekt code search engine?” (https://demo.sourcebot.dev/~/chat/cmdm9nkck000bod7sqy7c1efb)
- “How do I call C from Rust?” (https://demo.sourcebot.dev/~/chat/cmdpjy06g000pnn7ssf4nk60k)

You can try it yourself on our demo site (https://demo.sourcebot.dev/~) or check out our demo video (https://youtu.be/olc2lyUeB-Q).

How is this any different from existing tools like Cursor or Claude Code?

- Sourcebot focuses solely on code understanding. We believe that, more than ever, the main bottleneck development teams face is not writing code, it’s acquiring the necessary context to make quality changes that are cohesive within the wider codebase. This is true regardless of whether the author is a human or an LLM.
- As opposed to being in your IDE or terminal, Sourcebot is a web app. This allows us to play to the strengths of the web: rich UX and ubiquitous access. We put a ton of work into taking the best parts of IDEs (code navigation, file explorer, syntax highlighting) and packaging them with a custom UX (rich Markdown rendering, inline citations, @ mentions) that is easily shareable between team members.
- Sourcebot can maintain an up-to-date index of thousands of repos hosted on GitHub, GitLab, Bitbucket, Gerrit, and other hosts. This allows you to ask questions about repositories without checking them out locally. This is especially helpful when ramping up on unfamiliar parts of the codebase or working with systems that are typically spread across multiple repositories, e.g., microservices.
- You can BYOK (Bring Your Own API Key) to any supported reasoning model. We currently support 11 different model providers (like Amazon Bedrock and Google Vertex), and plan to add more.
- Sourcebot is self-hosted, fair source, and free to use.

Under the hood, we expose our existing regular expression search, code navigation, and file reading APIs to an LLM as tool calls. We instruct the LLM via a system prompt to gather the necessary context via these tools to sufficiently answer the user’s question, and then to provide a concise, structured response. This includes inline citations, which are just structured data that the LLM can embed into its response and that can then be identified on the client and rendered appropriately.

We built this on some amazing libraries like the Vercel AI SDK v5, CodeMirror, react-markdown, and Slate.js, among others.

This architecture is intentionally simple. We decided not to introduce any additional techniques like vector embeddings, multi-agent graphs, etc. since we wanted to push the limits of what we could do with what we had on hand. We plan on revisiting our approach as we get user feedback on what works (and what doesn’t).

We are really excited about pushing the envelope of code understanding. Give it a try: https://github.com/sourcebot-dev/sourcebot. Cheers!
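For readers who want a concrete picture of the "tools plus inline citations" design described in the post, here is a minimal Python sketch. It is not Sourcebot's implementation: the in-memory `REPO`, the tool names `search_code` and `read_file`, the tool-call dictionary shape, and the `@cite{path:start-end}` marker format are all assumptions made up for illustration. It only shows the general pattern of dispatching model-requested tool calls and then pulling structured citations out of the final answer text.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical in-memory "index" mapping path -> file contents.
# A real deployment would query a search backend instead.
REPO = {
    "auth/session.go": (
        "package auth\n\n"
        "func NewSession(user string) *Session {\n"
        "\treturn &Session{User: user}\n"
        "}\n"
    ),
    "auth/login.go": (
        "package auth\n\n"
        "func Login(user, pass string) error {\n"
        "\treturn nil\n"
        "}\n"
    ),
}


def search_code(pattern: str) -> list[dict]:
    """Regex search tool: return one hit per matching line, ordered by path."""
    rx = re.compile(pattern)
    hits = []
    for path, text in sorted(REPO.items()):
        for lineno, line in enumerate(text.splitlines(), start=1):
            if rx.search(line):
                hits.append({"path": path, "line": lineno, "text": line.strip()})
    return hits


def read_file(path: str) -> str:
    """File reading tool: return the full contents of one indexed file."""
    return REPO[path]


# Registry of tools the model is allowed to call (names are assumptions).
TOOLS = {"search_code": search_code, "read_file": read_file}


def dispatch(tool_call: dict):
    """Run one tool call shaped like {"name": ..., "arguments": {...}}."""
    fn = TOOLS[tool_call["name"]]
    return fn(**tool_call["arguments"])


@dataclass(frozen=True)
class Citation:
    path: str
    start: int
    end: int


# Assumed marker format embedded in the answer text: @cite{path:start-end}
CITE_RE = re.compile(r"@cite\{([^:}]+):(\d+)-(\d+)\}")


def extract_citations(answer: str) -> list[Citation]:
    """Client side: find citation markers so the UI can render them as links."""
    return [
        Citation(m.group(1), int(m.group(2)), int(m.group(3)))
        for m in CITE_RE.finditer(answer)
    ]
```

In an agent loop, the model would emit tool calls, the server would run each through `dispatch` and feed the results back, and the client would run `extract_citations` on the final answer to turn each marker into a link to the cited file and line range.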
▲ perelin a day ago
Just recently discovered Devin's DeepWikis and love them. Same idea, talk to your repo, right? What does Sourcebot do differently / better? https://deepwiki.org/
▲ drcongo 3 days ago
This looks pretty neat. Just spotted in the docs that it has an MCP server too. However, I haven't found anything in the docs about using a locally hosted model. Running this on a box in the corner of the office would be great, but external AI providers would be a deal breaker.
▲ nkmnz 3 days ago
How does this compare to ingesting all your code into some RAG tool and using that in a chat? I understand the citations part, which is a cool feature indeed, but tools for graph-RAG in particular, such as graphiti (https://github.com/getzep/graphiti), can deliver much more information by storing it in a graph than the code repository alone can, such as info about collaborators, infrastructure, metrics, logs, etc.
▲ cobbzilla 3 days ago
Love this idea. The docs are good, I just need to read them better :) Trying it out now. Keep it fully open source and nicely pluggable and I'll keep being a fan!
▲ prepend 3 days ago
So can I use Functional Source licensed code in internal products if I’m a commercial org?
▲ hahaxdxd123 3 days ago
I got this set up and working in basically 5 minutes. Going to try to set it up at work. Super cool! It seems like the open source version already has a bunch of features. How do you plan on making sure you can sustainably support it?
▲ er0k 2 days ago
congrats guys, this new feature looks really cool :)
▲ witnessme 2 days ago
I see you use the Zoekt project for code search. Why did you choose this over alternatives, and how has your experience been so far?
▲ dchuk 3 days ago
In reading the docs, it doesn't look like the MCP server supports the Ask Sourcebot capability. Is that correct or am I missing something in the docs? Is that planned to be added?
▲ pkz1234 2 days ago
Just tried it, very cool!
▲ cuzinluver 4 days ago
Love that it’s free to use
▲ Alifatisk 3 days ago
I thought this had something to do with Perplexity
▲ dvhull2 3 days ago
Nioce