Remix.run Logo
GolDDranks 18 hours ago

Why aren't the role tags preprocessed algorithmically/deterministically and then fed in as one-hot-encoded vectors alongside the semantic word embeddings? I'd imagine that it would be easier to train to _stay_ in the role an not confuse it, if the current role marker is explicitly set as a part of each input token, and not just implied by some past token. Plus a input separate from the word embedding would be unforgeable.

peterldowns 18 hours ago | parent [-]

Always wondered this. Must have been tried and not worked?