Tips for building performant LLM applications (moduloware.ai)
4 points by zuzuen_1 11 hours ago | 2 comments
zuzuen_1 11 hours ago
I've been building Modulo AI for the past year - an AI system that fixes GitHub issues. Early versions took 5+ minutes to analyze a single issue. After months of optimization, we're now sub-60 seconds with better accuracy.

This presentation covers what we learned about the performance characteristics of production LLM systems that nobody talks about:

- Strategies for faster token throughput
- Strategies for quicker time to first token
- Effective context window management
- Model routing strategies

If you're interested in building AI agents, I'm sure you'll find some useful insights in it.

Install and try out our GitHub application: https://github.com/apps/solve-bug

Try Modulo via browser at: https://moduloware.ai

Here are the code examples for the presentation: https://github.com/kirtivr/pydelhi-talk

What performance issues have you been seeing in your AI agents, and how did you tackle them?
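To make the last bullet concrete, here's a minimal sketch of model routing: send simple requests to a fast, cheap model and reserve the large model for hard ones. The model names and the complexity heuristic are hypothetical placeholders, not Modulo's actual logic - the point is only the routing shape.

```python
# Hypothetical model names; swap in whatever providers/models you use.
FAST_MODEL = "small-fast-model"    # low latency, lower accuracy
STRONG_MODEL = "large-slow-model"  # higher latency, higher accuracy

def estimate_complexity(issue_text: str) -> float:
    """Crude proxy: long issues with 'hard' signals score higher (0..1)."""
    length_score = min(len(issue_text) / 4000, 1.0)
    signals = sum(kw in issue_text.lower()
                  for kw in ("stack trace", "race condition", "refactor"))
    return min(length_score + 0.2 * signals, 1.0)

def route(issue_text: str, threshold: float = 0.5) -> str:
    """Pick a model per request instead of always paying for the big one."""
    if estimate_complexity(issue_text) >= threshold:
        return STRONG_MODEL
    return FAST_MODEL

print(route("Typo in README"))  # short, no signals -> fast model
print(route("Intermittent failure, stack trace attached. " * 200))  # -> strong model
```

In practice the classifier can itself be a tiny LLM call or a learned model; the win is that median latency tracks the fast model while worst-case quality tracks the strong one.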
badabidi 4 hours ago
[dead]