nicce 9 hours ago

LLMs still do not have a proper contextual understanding of their own solutions. Just a couple of days ago I was using GPT 5.5 with xhigh to vibe code an application, and it still defaulted to sorting dates from new to old using plain string comparison. That was just one of many bugs.
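To illustrate that class of bug, here is a minimal sketch. The DD/MM/YYYY format, the sample values, and the function names are all assumptions made for the example, not taken from the actual application. Plain string comparison orders by the leading characters (the day here), while parsing first orders by the actual date.

```python
from datetime import datetime

# Hypothetical sample data; the DD/MM/YYYY format is an assumption for illustration.
dates = ["02/01/2024", "15/03/2023", "10/12/2023"]

def sort_newest_first_naive(values):
    # Buggy: strings are compared character by character,
    # so the day field dominates instead of the year.
    return sorted(values, reverse=True)

def sort_newest_first(values, fmt="%d/%m/%Y"):
    # Parse each string into a datetime and sort on the parsed value.
    return sorted(values, key=lambda s: datetime.strptime(s, fmt), reverse=True)

print(sort_newest_first_naive(dates))  # oldest date ends up first
print(sort_newest_first(dates))        # 2024 entry comes first, as intended
```

String comparison only happens to work when the format is zero-padded and ordered from largest unit to smallest (for example ISO 8601, YYYY-MM-DD), which is probably why this bug slips through casual testing.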

properbrew 9 hours ago | parent | next [-]

This absolutely fascinates me. Yesterday a friend needed subtitle files generated from audio for use in CapCut, but none of the available tools were suitable, so he asked if I could adapt some of my software to export subtitles.

2 hours later he's got a fully working piece of local software that does exactly what he wants, yet yours can't even sort dates correctly. Feel free to download it if you want to see for yourself. I didn't even do any UI tweaks, as this was just a tool for him to use:

Linux - https://downloads.blazingbanana.com/whistle-subtitles/unstab...

Windows - https://downloads.blazingbanana.com/whistle-subtitles/unstab...

Mac - https://downloads.blazingbanana.com/whistle-subtitles/unstab...

How can there be such a massive gap in what can be produced?

nicce 8 hours ago | parent [-]

> How can there be such a massive gap in what can be produced?

What I was building looks really nice and mostly works on the surface, but the bugs show up in the corner cases. On another day, while doing security analysis, I was able to generate a Frida script with LLM help that bypasses Dart certificate pinning/validation and proxies all the traffic by injecting the runtime binaries, on Android with the latest Flutter/Dart version.

properbrew 8 hours ago | parent [-]

Ahhh ok, I totally understand what you mean. Yeah, the edge cases are absolutely where you start to feel the pain; things look good on the surface until you dig in. I think even in the age of LLMs the adage that 90% of the time is spent on the last 10% will ring true.

Sure, an app can be built and spun up in an afternoon, but are you willing to spend another 6 months ironing out all those little bugs, tuning it a bit, testing, tweaking, testing again, etc.?

christkv 5 hours ago | parent | prev [-]

LLMs are a little like the guy in Memento, right? In every single conversation they are just looking at the compressed scribbles of the context.