I de-vibed a vibe-coded NLP app (theasymptotic.substack.com)
1 point by tipoffdosage904 6 hours ago | 2 comments

tipoffdosage904 6 hours ago:
Earlier this year I built a local app called How I Prompt that analyzes your AI coding conversations and gives you a prompt/persona breakdown. The first version was very much vibe-coded. It looked good, but once I audited it properly I found several issues:

- the behavioral axes used to generate prompting personas were heavily correlated and mostly measuring prompt length
- one persona was mathematically unreachable
- the pipeline was counting logs, tool noise, and machine-generated text as human prompts (cleaning this up was the highest level of effort for me)
- the whole thing had a stronger appearance of rigor than actual rigor

I wrote up the rebuild here: https://theasymptotic.substack.com/p/how-i-de-vibed-a-vibe-c...

Repo: https://github.com/eeshansrivastava89/howiprompt

What I changed in v2:

- much more aggressive data cleaning
- simpler feature-based scoring using logistic regression instead of embeddings (something I understand better)
- external prompt datasets for broader validation
- a more transparent two-axis system that seems to behave much better than the original

It runs locally and doesn't upload your prompt data anywhere. Point your agent at the repo to validate it yourself.

I would especially love feedback from people who have worked on behavioral measurement, NLP evaluation, or human/AI interaction; I'm definitely not a domain expert. One of the main things I wanted to document here was the difference between "AI helped me ship a prototype fast" and "this is actually a sound measurement system."
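To make the "axes were mostly measuring prompt length" failure concrete, here is a minimal sketch of the kind of audit that exposes it: if each behavioral axis correlates strongly with prompt length, the axes are redundant and you effectively have one dimension. The axis names and simulated data below are hypothetical, not taken from the actual repo.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)
lengths = [rng.randint(5, 500) for _ in range(200)]  # prompt lengths in tokens

# Simulated v1-style axes: two secretly derive from prompt length,
# one is genuinely independent. Names are made up for illustration.
axes = {
    "specificity": [l * 0.8 + rng.gauss(0, 20) for l in lengths],
    "context_given": [l * 1.1 + rng.gauss(0, 30) for l in lengths],
    "politeness": [rng.gauss(50, 10) for _ in lengths],
}

for name, values in axes.items():
    r = pearson(lengths, values)
    flag = "mostly measuring length" if abs(r) > 0.9 else "ok"
    print(f"{name}: r={r:+.2f} ({flag})")
```

In this toy run the two length-derived axes come back with |r| near 1 and get flagged, while the independent axis survives; the same check against real per-prompt scores is a cheap way to catch this class of bug before shipping.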
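In the spirit of the v2 change (hand-crafted features plus logistic regression rather than embeddings), here is a minimal sketch of feature-based prompt scoring. The features, weights, and bias are invented for illustration; in a real system the weights would be fit on labeled prompts rather than set by hand.

```python
import math

def features(prompt: str) -> dict:
    """Extract a few interpretable, hand-crafted features (hypothetical set)."""
    words = prompt.split()
    return {
        "log_len": math.log(len(words) + 1),          # dampened prompt length
        "question": 1.0 if "?" in prompt else 0.0,    # is the user asking?
        "has_code_fence": 1.0 if "```" in prompt else 0.0,
    }

# Illustrative weights; a fitted logistic regression would learn these.
WEIGHTS = {"log_len": 0.6, "question": 1.2, "has_code_fence": -0.4}
BIAS = -1.5

def score(prompt: str) -> float:
    """Logistic score in (0, 1): weighted feature sum squashed by a sigmoid."""
    z = BIAS + sum(WEIGHTS[k] * v for k, v in features(prompt).items())
    return 1 / (1 + math.exp(-z))

print(score("Why does the test fail only on CI?"))  # fuller question scores higher
print(score("fix it"))                              # terse command scores lower
```

The appeal over embeddings is exactly what the post gets at: every score decomposes into named features with inspectable weights, so when the output looks wrong you can see which term pushed it there.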