raincole 3 days ago
> Another issue is that some of the words are segmented very unnaturally

I immediately noticed that too. Are the "gaps" generated by an LLM? I think the model might not understand Japanese very well.
yorwba 3 days ago
It's a bit like segmenting "don't see" into "don't" and "see." ません is the negative of the auxiliary ます, just as "don't" is the negative of the auxiliary "do." If you have to split Japanese text into words and want to be principled about it, treating ません as a separate word is not a bad way to go about it. But of course there are other ways, so a "fill in the blank" question with two gaps right next to each other is generally a bad idea.
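
To make the last point concrete, here is a rough sketch of how a quiz generator could collapse adjacent gaps into a single blank, so the answer does not depend on where the segmenter happened to cut. The function names, the example sentence, and its token list are all made up for illustration; the segmentation is written by hand and is not the output of any particular tokenizer or of the app being discussed.

```python
def merge_adjacent_gaps(tokens, gap_indices):
    """Group consecutive gap positions so that each run becomes one blank.

    Returns a list of (answer, first_index, last_index) tuples.
    """
    runs = []
    for i in sorted(set(gap_indices)):
        if runs and i == runs[-1][-1] + 1:
            runs[-1].append(i)   # extends the current run of adjacent gaps
        else:
            runs.append([i])     # starts a new run
    return [("".join(tokens[i] for i in run), run[0], run[-1]) for run in runs]


def render_question(tokens, gap_indices, blank="____"):
    """Render the sentence, printing one blank per run of adjacent gaps."""
    gaps = set(gap_indices)
    out = []
    for i, tok in enumerate(tokens):
        if i in gaps:
            if i - 1 not in gaps:  # only the first gap of a run prints a blank
                out.append(blank)
        else:
            out.append(tok)
    return "".join(out)


# Hypothetical hand-written segmentation that treats ません as its own word.
tokens = ["私", "は", "テレビ", "を", "見", "ません"]

print(merge_adjacent_gaps(tokens, [4, 5]))
print(render_question(tokens, [4, 5]))
```

With the gaps at positions 4 and 5, the learner sees a single blank and is asked for 見ません as a whole, whichever way the text was split underneath.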