▲ | NitpickLawyer 3 days ago | |
> I'm pretty sure Llama itself trained on a bunch of copyrighted data. Every good, "SotA" model is trained on copyrighted data. This fact becomes apparent when models are released with everything public (i.e. training data) and they score significantly behind in every benchmark. | ||
▲ | tough a day ago | parent [-] | |
A research team from O'Reilly found that OpenAI trained on copyrighted books, prob got a sub... https://ssrc-static.s3.us-east-1.amazonaws.com/OpenAI-Traini... |