Joel_Mckay 6 hours ago

Indeed, isomorphic plagiarism by its nature forms strong vector search paths built from stolen material: global websites, real people's work, and LLM user-base input/markdown.

However, reasoning models that add a random typo to seem less automated still do not hide the fairly repeatable quantized artifacts from the training process. For LLMs, it is rather trivial to find where people originally scraped the data from, provided they still have the annotated training metadata.

Finally, LLM output is usually easy to spot once one abandons the trap of thinking "I think the author meant [this/that]" and recognizes that a work's tone reads like a fake author had a stroke [0]. =3

[0] https://en.wikipedia.org/wiki/Stroke