folderquestion 4 hours ago
I think that LLMs are composed of little windows, or brains, that are isolated from one another, and RLHF is an averaging tool. One example: when prompted to be a critic, the LLM can tell you that what you are trying to do is a dead end, yet at the same time it encourages you to keep going in the same direction. The critic and the follower are two isolated faces with no continuity between them, so today there is no self in LLMs.