HugoDias a day ago
Can you elaborate on that? In which part of the RAG pipeline did GPT-4.1 perform better? I would expect GPT-5 to perform better on longer-context tasks, especially when it comes to understanding the pre-filtered results and reasoning about them.
tifa2up a day ago
With large contexts (up to 100K tokens in some cases). We found that GPT-5: a) is worse at instruction following and doesn't follow the system prompt; b) produces very long answers, which resulted in bad UX; c) has a 125K context window, so extreme cases resulted in an error.
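
Point (c) is the kind of failure that can be caught before the request is sent. A minimal sketch, assuming the retrieved chunks arrive ranked best-first and using a rough 4-characters-per-token estimate rather than a real tokenizer; `estimate_tokens`, `fit_chunks`, and the `reserved` headroom are hypothetical names and values for illustration, not anything from the pipeline described here:

```python
def estimate_tokens(text: str) -> int:
    """Crude token estimate (assumption: about 4 characters per token).

    Swap in the model's actual tokenizer for real use.
    """
    return max(1, len(text) // 4)


def fit_chunks(chunks, context_window=125_000, reserved=8_000):
    """Keep the highest-ranked chunks that fit in the context window.

    `reserved` is headroom (an assumed value) for the system prompt,
    the user question, and the model's answer.
    Returns the kept chunks and the estimated tokens they use.
    """
    budget = context_window - reserved
    kept, used = [], 0
    for chunk in chunks:  # assumed to be ordered best-first
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            break  # stop at the first chunk that would overflow the budget
        kept.append(chunk)
        used += cost
    return kept, used


if __name__ == "__main__":
    retrieved = ["x" * 400] * 10  # ten chunks of roughly 100 tokens each
    kept, used = fit_chunks(retrieved, context_window=550, reserved=100)
    print(len(kept), used)
```

Dropping the lowest-ranked chunks trades some recall for a request that always fits, which may be preferable to an outright error in the extreme cases.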