dominotw 2 hours ago

> The mentioned other models that are trained for the same purpose with close to the same capabilities.

Well, they don't tell you that, do they? There is no way to tell what a model can and cannot do unless you extensively test it yourself and pray for the best.

sajithdilshan 2 hours ago | parent [-]

If you don’t have a proper testing mechanism for your product, regardless of the model, you shouldn’t be shipping it. Praying for the best is not a strategy, and don’t blame your lack of a testing strategy on an LLM capability mismatch.

dominotw 2 hours ago | parent [-]

Even Anthropic uses 'user reports' in its alignment system card.

Do they lack a "testing strategy" to test their own alignment?

Can you share the testing strategies that are letting you plug and play models?