-_- 5 days ago
DSPy is great for prompt optimization but not so much for RL fine-tuning (their support is "extremely EXPERIMENTAL"). The nice thing about RL is that the exact prompts don't matter so much. You don't need to spell out every edge case, since the model will get an intuition for how to do its job well via the training process.
nextworddev 5 days ago
Isn’t the latest trend in RL mostly about prompt optimization, as opposed to full fine-tuning?