| ▲ | Show HN: I ported Tree-sitter to Go (github.com) |
| 114 points by odvcencio 2 hours ago | 37 comments |
| This started as a hard requirement for my TUI-based editor application, and it ended up going in a few different directions.
A suite of tools that help with semantic code entities: https://github.com/odvcencio/gts-suite
A next-gen version control system called Got: https://github.com/odvcencio/got
I think this has some pretty big potential! I think there are many classes of application (particularly legacy architecture) that can benefit from this kind of analysis tooling. My next post will be about composing all of these together, an exciting project I call GotHub. Thanks! |
|
| ▲ | acedTrex 14 minutes ago | parent | next [-] |
| Claude attempted a treesitter to go port
Better title |
| |
| ▲ | gritzko 8 minutes ago | parent | next [-] | | I work on a revision control system project, except merge is CRDT. On Feb 22 there was a server break-in (I did not keep unencrypted sources on the client, server login was YubiKey only, but that is not a 100% guarantee). I reported the break-in to my Telegram channel that day. My design docs: https://replicated.wiki/blog/partII.html
I used tree-sitter for the coarse AST. Some key parts were missing from the server as well, because I expected problems (had lots of adventures in East Asia, evil maids, various other incidents on a regular basis). When I saw the "tree-sitter in go" title, I was very glad initially. Solves some problems for me. Then I saw the full picture. | |
| ▲ | red_hare 9 minutes ago | parent | prev | next [-] | | How is OP using Claude relevant? | | | |
| ▲ | odvcencio 11 minutes ago | parent | prev [-] | | well how did it do? | | |
|
|
| ▲ | trickypr 4 minutes ago | parent | prev | next [-] |
| Do you have an equivalent of TreeCursors or tree-sitter-generate? There are at least some use cases where neither queries nor walks are suitable. And I have run into cases where being able to regenerate and compile grammars on the fly is immeasurably helpful. At least for my use cases, this would be unusable.
Also, what the hell is this:
> partial [..] missing external scanner
Why do you have a parsing mode that guarantees incorrect outputs on some grammars (html comes to mind) and then use it as your “90x faster” benchmark figure? |
|
| ▲ | sluongng 2 hours ago | parent | prev | next [-] |
| Oh this is really neat for the Bazel community, as depending on tree-sitter to build a gazelle language extension, with Gazelle written in Go, requires you to use CGO. Now perhaps we can get rid of the CGO dependency and make it pure Go instead.
I have pinged some folks to take a look at it. |
| |
| ▲ | odvcencio an hour ago | parent [-] | | thanks so much for the note! i really appreciate it. i built this precisely for folks like yourself with this specific pain, thanks again! |
|
|
| ▲ | shayief 8 minutes ago | parent | prev | next [-] |
| This is great, I was looking for something like this, thanks for making this! I imagine this can be very useful for Go-based forges that need syntax highlighting (e.g. Gitea, Forgejo). I have a strict no-cgo requirement, so I might use it in my project, which is a Git+JJ forge: https://gitncoffee.com |
| |
| ▲ | odvcencio 5 minutes ago | parent [-] | | thank you for the kind words! Very cool project! Very happy you can find some utility in it |
|
|
| ▲ | 3rly an hour ago | parent | prev | next [-] |
| Wouldn't `got` be confused with OpenBSD's Got: https://gameoftrees.org/index.html |
| |
| ▲ | odvcencio an hour ago | parent [-] | | oh wow! i really thought i was being too clever but i shouldve assumed nothing new under the sun. well im taking name suggestions now! | | |
| ▲ | allknowingfrog 39 minutes ago | parent | next [-] | | Well, find and sed have modern "fd" and "sd" alternatives. Naming it "gt" allows you to claim that your version saves 33% compared to typing "git". | |
| ▲ | boobsbr an hour ago | parent | prev | next [-] | | Goty McGotface | |
| ▲ | Imustaskforhelp 40 minutes ago | parent | prev | next [-] | | uGOT / uGOTme? (sort of like the idea behind uTorrent) but I will agree that sbankowi's idea of Yet another got is great as well. +1 to that as well. | |
| ▲ | sbankowi an hour ago | parent | prev [-] | | YAGOT (Yet Another GOT) | | |
| ▲ | bityard 37 minutes ago | parent [-] | | Probably taken already, better use YAGOT-NG (Next Generation) just to be safe. | | |
|
|
|
|
| ▲ | conartist6 28 minutes ago | parent | prev | next [-] |
| It looks like porting the custom C lexers was a big part of the trouble you had to go to for this. |
| |
| ▲ | odvcencio 22 minutes ago | parent [-] | | yes basically about 70% of the engineering effort was spent porting the external scanners and ensuring parity with original (C) tree-sitter |
|
|
| ▲ | jbreckmckye 31 minutes ago | parent | prev | next [-] |
| Interesting. I have a similar use case but intended to use CGo tree-sitter with Zig.
Are these pretty up-to-date grammars? I'm awfully tempted to switch to your project.
How large are your binaries getting? I was concerned about the size of some of the grammars. |
| |
| ▲ | odvcencio 24 minutes ago | parent [-] | | 206 binary blobs = 15MB, so not crazy but i built for this use case where you can declare the registry of languages you want to load and not have to own all the grammar binaries by default | | |
| ▲ | jbreckmckye 23 minutes ago | parent [-] | | If all the languages together add up to 15MB that is a game changer for me. It means the CLI I am working on can ship support for many languages whilst still being a smallish (sub 50mb) download.
I shall definitely check it out! | | |
| ▲ | odvcencio 21 minutes ago | parent [-] | | re: up to date grammars, yes i found the official grammars in use by the original tree-sitter library today |
|
|
|
|
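The "declare the registry of languages you want to load" idea from the sub-thread above can be sketched roughly as follows. This is only an illustration of the pattern and not gotreesitter's actual API: `Grammar`, `Loader`, `Registry`, and `fakeLoader` are all hypothetical names, and the blobs are placeholder byte slices rather than real grammar tables.

```go
package main

import (
	"fmt"
	"sort"
)

// Grammar stands in for one compiled grammar blob (hypothetical type).
type Grammar struct {
	Name string
	Blob []byte
}

// Loader builds a Grammar on demand, e.g. by decoding an embedded blob.
type Loader func() (*Grammar, error)

// Registry holds only the languages the application chose to declare.
// It is not safe for concurrent use; a real one would need a mutex.
type Registry struct {
	loaders map[string]Loader
	loaded  map[string]*Grammar
}

func NewRegistry() *Registry {
	return &Registry{loaders: map[string]Loader{}, loaded: map[string]*Grammar{}}
}

// Register declares a language without loading it yet.
func (r *Registry) Register(name string, l Loader) { r.loaders[name] = l }

// Get loads a declared language on first use and caches the result.
func (r *Registry) Get(name string) (*Grammar, error) {
	if g, ok := r.loaded[name]; ok {
		return g, nil
	}
	load, ok := r.loaders[name]
	if !ok {
		return nil, fmt.Errorf("language %q is not registered", name)
	}
	g, err := load()
	if err != nil {
		return nil, fmt.Errorf("loading %q: %w", name, err)
	}
	r.loaded[name] = g
	return g, nil
}

// Loaded lists the languages that have actually been loaded so far.
func (r *Registry) Loaded() []string {
	names := make([]string, 0, len(r.loaded))
	for n := range r.loaded {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// fakeLoader fabricates a placeholder blob of the given size (demo only).
func fakeLoader(name string, size int) Loader {
	return func() (*Grammar, error) {
		return &Grammar{Name: name, Blob: make([]byte, size)}, nil
	}
}

func main() {
	reg := NewRegistry()
	reg.Register("go", fakeLoader("go", 64))
	reg.Register("json", fakeLoader("json", 16))

	if _, err := reg.Get("go"); err != nil {
		fmt.Println("error:", err)
	}
	if _, err := reg.Get("html"); err != nil {
		fmt.Println("error:", err) // html was never declared
	}
	fmt.Println("loaded:", reg.Loaded())
}
```

In a layout like this, each grammar would presumably live in its own package so that a binary only pays for the languages it imports and registers; whether gotreesitter is organized that way is an assumption here.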
| ▲ | gritzko an hour ago | parent | prev | next [-] |
| That is very very interesting. I work on a similar project: https://replicated.wiki/blog/partII.html
I use CRDT merge though, cause 3-way metadata-less merges only provide very incremental improvements over e.g. git+mergiraf. How do you see got's main improvement over git? |
| |
| ▲ | odvcencio 40 minutes ago | parent [-] | | primarily, got is a structural VCS intended for concurrent edits of the same file. it does this via gotreesitter and gts-suite abstractions that enable it to have:
- entity-aware diffs
  - not line by line but function by function
- structural blame
  - attribution resolution for the lifetime of the entity
- semver from structure
  - it can recommend bumps because it knows what is a breaking change vs minor vs patch
- entity history
  - because entities are tracked independently, file renames or moves dont affect the entity's history

when gotreesitter cant parse a language, the 3way text merge happens as a fallback. what the structural merge enables is no conflicts unless the same entity has conflicting changes | | |
| ▲ | gritzko 36 minutes ago | parent | next [-] | | I think I understand the situation. | |
| ▲ | odvcencio 36 minutes ago | parent | prev [-] | | gah, sincere apologies for the formatting of this post. i have been on HN for basically 10 years now without ever having made a post (: | | |
|
|
|
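To make the "function by function, not line by line" diff described in the sub-thread above more concrete, here is a minimal sketch. It is not how got or gotreesitter work internally: it assumes Go source only and uses the standard library's `go/parser` in place of a tree-sitter grammar, and `entities`, `diffEntities`, `oldSrc`, and `newSrc` are hypothetical names made up for the example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
)

const oldSrc = `package demo

func A() int { return 1 }

func B() int { return 2 }
`

// newSrc adds C, moves B, and changes the body of A.
const newSrc = `package demo

func C() int { return 3 }

func B() int { return 2 }

func A() int { return 10 }
`

// entities maps each top-level function or method to its exact source text.
// The text is sliced by byte offset, so where a function sits in the file
// does not affect the result, but whitespace edits inside it do.
func entities(src string) (map[string]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return nil, err
	}
	text := func(n ast.Node) string {
		return src[fset.Position(n.Pos()).Offset:fset.Position(n.End()).Offset]
	}
	out := make(map[string]string)
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok {
			continue // only functions count as entities in this sketch
		}
		name := fn.Name.Name
		if fn.Recv != nil && len(fn.Recv.List) > 0 {
			name = text(fn.Recv.List[0].Type) + "." + name
		}
		out[name] = text(fn)
	}
	return out, nil
}

// diffEntities reports added, removed, and changed entities, sorted by label.
func diffEntities(oldSrc, newSrc string) ([]string, error) {
	oldE, err := entities(oldSrc)
	if err != nil {
		return nil, err
	}
	newE, err := entities(newSrc)
	if err != nil {
		return nil, err
	}
	var changes []string
	for name, body := range oldE {
		nb, ok := newE[name]
		switch {
		case !ok:
			changes = append(changes, "removed "+name)
		case nb != body:
			changes = append(changes, "changed "+name)
		}
	}
	for name := range newE {
		if _, ok := oldE[name]; !ok {
			changes = append(changes, "added "+name)
		}
	}
	sort.Strings(changes)
	return changes, nil
}

func main() {
	changes, err := diffEntities(oldSrc, newSrc)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, c := range changes {
		fmt.Println(c)
	}
}
```

Because entities are keyed by name rather than by line position, moving `B` within the file produces no entry, while a line-based diff would show it as a removal plus an addition. Tracking renames, as the entity history feature would need, is out of scope for this sketch.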
| ▲ | skybrian an hour ago | parent | prev | next [-] |
| How about making 'got' compatible with git repos like jujutsu? It would be a lot easier to try out. |
| |
| ▲ | odvcencio 38 minutes ago | parent [-] | | it is interoperable with git. we like git when its good but attempted to ease the pains in UX somewhat. you can take advantage of got locally but still push it to git remote forges just the same. when you pull stuff in this way, got will load the entity history into the git repo ensuring that you can still do got stuff locally (inspect entity histories, etc) |
|
|
| ▲ | irishcoffee 37 minutes ago | parent | prev [-] |
| Is it a go-ism that source for implementation and test code lives in the root of the repo or is this an LLM thing? |
| |
| ▲ | odvcencio 33 minutes ago | parent [-] | | yeah the tests live with the implementation code always (Go thing) and the repo root thing is like a preference, main is an acceptable package to put stuff in (Go thing), i see this a lot with smaller projects or library type projects |
|