Show HN: CLI tool for detecting non-exact code duplication with embedding models (github.com)
38 points by rkochanowski 3 hours ago | 13 comments
rkochanowski 3 hours ago
I built Slopo to solve one specific problem: finding the similar code that is hardest for other tools, AI coding agents, and humans to detect. It finds similar-looking code using embeddings, which catches more than copy-paste clones or clones with minor changes. The trade-off is that similar code is often not a clone worth refactoring, so initial results need to be verified, but coding agents can do this quickly. Example prompts are available at https://slopo.dev. Additionally, similar code that is far apart in the codebase is ranked higher, to focus on less obvious duplication. The results differ a lot depending on the codebase. Sometimes most of the detected duplicates are false positives, but the remaining ones are strong refactoring candidates or even bugs. In other codebases it reveals much more real duplication.
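For readers who want a concrete picture of the idea described above, here is a minimal Python sketch of embedding-based similarity with a bonus for pairs that are far apart in the codebase. It is not Slopo's actual implementation: the function names, the `threshold` and `distance_weight` parameters, the path-based distance measure, and the toy three-dimensional vectors are all assumptions made for illustration. A real tool would get its vectors from an embedding model.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def path_distance(p, q):
    """Hypothetical distance: path components left after the shared prefix."""
    a, b = p.split("/"), q.split("/")
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)


def rank_candidates(snippets, threshold=0.9, distance_weight=0.02):
    """Return (score, similarity, path_a, path_b) tuples, best first.

    Pairs below `threshold` are dropped; the rest get a small bonus
    proportional to how far apart the two files are (an assumed formula).
    """
    results = []
    for (p, u), (q, v) in combinations(snippets.items(), 2):
        sim = cosine(u, v)
        if sim >= threshold:
            score = sim + distance_weight * path_distance(p, q)
            results.append((score, sim, p, q))
    return sorted(results, reverse=True)


# Toy embeddings standing in for real model output (assumed values).
snippets = {
    "billing/invoice.py": [0.9, 0.1, 0.3],
    "billing/receipt.py": [0.9, 0.1, 0.3],
    "shipping/labels/render.py": [0.88, 0.12, 0.31],
    "auth/login.py": [0.1, 0.9, 0.2],
}

ranked = rank_candidates(snippets)
for score, sim, p, q in ranked:
    print(f"{score:.3f}  sim={sim:.3f}  {p}  <->  {q}")
```

With these toy inputs, the pairs involving `shipping/labels/render.py` rank above the identical pair inside `billing/`, because the distance bonus outweighs the small difference in similarity. Every surviving pair is still only a candidate that needs review, as noted above.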
forhadahmed 30 minutes ago
Self-plug for a similar tool: https://github.com/forhadahmed/refactor
BrandiATMuhkuh 44 minutes ago
What a simple and smart idea. Wonderful.
SpyCoder77 an hour ago
I think this is pretty cool, but is there any reason why we would want to remove similar or possibly duplicate code?
philajan an hour ago
This is neat. Have you noticed any difference in duplicate detection between strongly typed and loosely typed languages / code bases?
murats 2 hours ago
Nice idea. I can see this being useful before refactors, especially when the duplication is semantic rather than copy-paste.
hdz an hour ago
Very nice. I can imagine putting this into a pre-push hook to keep things clean after an initial sweep.