| ▲ | thelucent 2 days ago |
| I tried writing a short novel using Claude Opus 4.6. I gave it an outline and a raw draft, and the style is very similar to this writing. I tried to steer it away from this kind of writing because it feels weird, but it always tries to output something similar. Or maybe I'm just not used to reading novels. So I was curious what kind of training data Claude was trained on that makes it so hard to steer out of this style. I opened my Kindle and looked through the recommended popular novels, just reading their free samples. And the similarities are striking. Now, I don't know whether the recommended novels are in the training data, or whether they were actually written by an LLM, or whether that's just how novelists write. I even tried writing a full chapter from scratch and asked Claude to ghostwrite the second chapter for me in my writing style. It still wouldn't follow my style and kept writing in the kind of style from the article. I'm not accusing the article of using an LLM to ghostwrite; even so, it's fine to use an LLM that way. It's just one anecdote from my side on how an LLM fails to follow my writing style and keeps coming back to its training data. |
|
| ▲ | torben-friis 2 days ago | parent | next [-] |
| Don't take this as a defense of LLMs, because it absolutely isn't, but: >Or maybe I am just not used to reading novel. If you're not even used to reading novels, how can you judge the results of writing one? That is one hell of a confession for someone who's trying to write fiction. |
| |
| ▲ | thelucent 2 days ago | parent | next [-] | | Thanks for sharing your perspective. The quote you referenced is about “the weird” feeling. Maybe it felt weird because I haven’t read many novels, so that’s something entirely personal to me; weird-for-me doesn’t mean bad writing. However, I do read a lot of LLM-generated output. I spent weeks tinkering with LLMs while asking them to ghostwrite my novel, so I was exposed to a lot of text that has this weird feeling, which I eventually felt again when I read this article. It’s like hearing a song with the same chords so many times that when you listen to another song with those chords, you can tell they’re kind of the same, even if you don’t listen to a lot of songs. | |
| ▲ | KineticLensman 2 days ago | parent | prev [-] | | > That is one hell of a confession for someone who's trying to write fiction. Indeed. A significant part of gaining skill in creative writing is learning to 'read as a writer': examining classic texts to understand how to develop scenes, characters, narrative styles, and so on. | | |
| ▲ | moritzwarhier 2 days ago | parent [-] | | An important part of writing is also to write as the reader, eschewing meaningless fluff and sentences that use bombastic emotional language without really communicating. The latter is prevalent in LLM writing. Imitating "poetry" without the feeling is something the default, "aligned" chat models tuned with reinforcement all do in one way or another. It's hard to get even a technical essay without empty emotional language. And I'm only speaking for myself; I like reading novels, but it's perfectly possible to have a slop-meter without doing so. My own signal-to-noise ratio in writing is also often bad, but with today's "frontier" LLM output I feel there's a specific tendency toward this harmless, empty, flowery language full of false dichotomies and rhetorical devices devoid of any communicative purpose. A model trained and fine-tuned to generate divisive Reddit threads surely has different tendencies. But for the friendly assistants, there's often this solipsistic, pseudo-poetic aspect. Related, although just tangentially:
https://www.astralcodexten.com/p/the-claude-bliss-attractor And, regardless of the generation aspect: An essay that starts with > On bronze pirates, cloudy days, and the roads we do not know we are walking just sounds pretentious to me and doesn't spark my interest. |
|
|
|
| ▲ | computably 2 days ago | parent | prev | next [-] |
| > And the similarities are striking. Now, I dont know whether the recommended novel is the training data, or its actually written by LLM. Or maybe its just how novelist writes. For traditionally published works, it's trivial to exclude LLM-written content: just look for anything published before Nov 30, 2022 (ChatGPT's launch date). |
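The cutoff above amounts to a simple date filter. A minimal sketch, assuming a hypothetical catalog of (title, publication date) pairs; the names here are illustrative, not from any real API:

```python
from datetime import date

# ChatGPT's public launch: works published before this predate mass LLM output.
CUTOFF = date(2022, 11, 30)

def predates_llm_era(published: date) -> bool:
    """True if a work was published before the ChatGPT launch date."""
    return published < CUTOFF

# Hypothetical catalog entries: (title, publication date).
catalog = [
    ("Novel A", date(2019, 5, 1)),
    ("Novel B", date(2023, 8, 15)),
]

pre_llm = [title for title, pub in catalog if predates_llm_era(pub)]
# pre_llm keeps only "Novel A"
```

The same predicate works for any metadata source that exposes a publication date, which is why the heuristic is "trivial" for traditionally published books but much weaker for web content with unreliable timestamps.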
| |
| ▲ | elcapitan 2 days ago | parent | next [-] | | Which is also a good filter for web searches to exclude a lot of garbage results (if the specific search makes sense for non-recent results) | | |
| ▲ | dijit 2 days ago | parent [-] | | Except many search engines have a recency bias. That was a sane default previously, since news and the status quo change, but now it makes you even more likely to encounter slop. | | |
| ▲ | elcapitan 2 days ago | parent [-] | | Not sure how that changes the fact that you can filter by date range in searches where you don't actually need anything recent? |
|
| |
| ▲ | kbrkbr 2 days ago | parent | prev | next [-] | | I think we are discussing the wrong problem here. I have no solution to offer, but I think the problem is not so much generated content as the surroundings in which it can thrive and become the content you see everywhere. If we hadn't removed the gatekeepers everywhere (and I know there are problems with them, too), then all that technology would not be able to do much harm. It might also have to do with incentives: the incentives in our economy are not to help and advance society, the invisible hand notwithstanding. | |
| ▲ | fingerlocks 2 days ago | parent | prev | next [-] | | Why stop with traditionally published works? Before dead-internet-day, very-nearly all forms of writing were guaranteed to be hand crafted, organic, and made with 100% Natural Intelligence. The artificial stuff often has an odd taste, but boy it sure is quick and convenient. | | |
| ▲ | throw-the-towel 2 days ago | parent | next [-] | | Don't you remember the endless SEO spam that swamped the Net even before GPT, allegedly written by real humans? | |
| ▲ | ares623 2 days ago | parent | prev [-] | | You joke, but I bet every person in this forum, when presented with the choice between a bot-filled forum and a guaranteed human-only* forum, would go with the latter. * This is a hypothetical scenario. I don't know any guaranteed human-only digital forums. | | |
| ▲ | sigbottle 2 days ago | parent | next [-] | | I converse with LLMs enough for research at this point that I feel I have a good enough structure to hop on and off them to primary sources and the like, so I don't get annoyed with them too easily. Whereas I haven't seriously reflected on my social media consumption habits in over 15 years, and over the years I'm getting more and more annoyed at social media. Not to be a bit misanthropic, but there's something seriously wrong with my social media usage, especially when I know there's a real human on the other side, combined with ever-increasing annoyance toward commenters and just the feelings I get after reading social media. It may be dopamine- or self-help-related, but actually I think all of that is part of the issue (I discovered that in high school, when it was taking off). Something about the way I'm fundamentally interacting with the medium seems more horrible and icky the more I mature. | |
| ▲ | fnordian_slip 2 days ago | parent | prev | next [-] | | I agree with you, but as to your addendum: niche hobbyist forums are still safe, for now. There's just not enough commercial interest in petroleum lantern restoration to make it worth anyone's time to poison this particular well. Even some larger niche hobbies like the saltwater aquarium community seem pretty safe for now (though it also helps that many forums have members who visit each other to trade corals and admire each other's tanks). | |
| ▲ | fingerlocks 2 days ago | parent | prev [-] | | On the contrary! The dead-day theorem established earlier states that an 11/22 date filter is a necessary condition for verifiable human-only content, when filtered by content-creation date. A weaker theorem can be postulated that any such filter provides a second-order sufficient condition. This means we can filter content by account creation date, for example, by hiding all posts and comments from accounts created after the digital death event. This won't always guarantee human-only content, but it gets you closer than otherwise. But then we wouldn't be having this most definitively human-to-human conversation, right? |
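The account-age heuristic described above could be sketched like this; a toy example only, with hypothetical names, assuming each post carries its author's account-creation date:

```python
from datetime import date

# The "digital death event" cutoff from the comment above (ChatGPT's launch).
DEAD_INTERNET_DAY = date(2022, 11, 30)

def visible(account_created: date) -> bool:
    """Hide posts from accounts created after the cutoff.

    Not a guarantee of human authorship (old accounts can post LLM
    output too), just a stronger prior, as the comment notes.
    """
    return account_created <= DEAD_INTERNET_DAY

# Hypothetical feed: (author, account-creation date).
feed = [
    ("alice", date(2015, 3, 2)),
    ("bot4711", date(2024, 7, 9)),
]

shown = [author for author, created in feed if visible(created)]
# shown keeps only "alice"
```

This is the "weaker theorem" in code: filtering on account age rather than content age trades completeness (new human accounts disappear) for a lower false-positive rate on bots.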
|
| |
| ▲ | echelon 2 days ago | parent | prev [-] | | Is the ChatGPT launch the "low background steel" date for writing? What are the dates for images and video? Nano Banana Pro and Seedance 2.0? And code? Opus 4.6? | |
| ▲ | alex43578 2 days ago | parent [-] | | It's not the launch of ChatGPT, but probably around GPT-4 or 4o that it really became solid. I also don't think video is there just yet, at least for video over 10 seconds. | |
| ▲ | operatingthetan 2 days ago | parent | next [-] | | Is it "solid" if people can read it and instantly know it's generated content? | | |
| ▲ | kaashif 2 days ago | parent | next [-] | | No. But you can easily make and post content that is not easily detectable as generated. You only notice plastic surgery when it's bad, but that doesn't mean all plastic surgery looks bad... | |
| ▲ | alex43578 2 days ago | parent | prev [-] | | Who's "people"? The bottom X% (40%?) of the population is already falling for AI slop video scams, but before that, they were also falling for pig butchering and Nigerian prince scams, so the "average" person benchmark has already been passed for text, photos, videos, etc. For more astute consumers, video isn't there yet. There's also the question of whether people are even trying to disguise AI content, and how effective that disguise is. Are you or I missing the AI-generated text that just has a veneer of disguise on it? | |
| ▲ | operatingthetan 2 days ago | parent [-] | | >Who's "people"? If you follow this thread up you will see the context is 'people who want to read content written by humans.' |
|
| |
| ▲ | throawayonthe 2 days ago | parent | prev [-] | | Why does it matter when it "became solid"? There was plenty of slop generated with ChatGPT; that really was the turning point (because of public access) |
|
|
|
|
| ▲ | postsantum 2 days ago | parent | prev | next [-] |
| Four-to-five-word sentences, TED-talk style, yes. I hated it even when humans were doing it. It's like motivational speakers trying their hand at writing novels. |
|
| ▲ | fabioz 2 days ago | parent | prev | next [-] |
| I agree it's hard to get it to output things in different styles... I started a side project for writing with LLMs (ailivrum.com); my main focus right now is doing some writing/reading for my younger daughter, although I'm structuring it for others to use too. So far I've found that prompt engineering does not yield great results: LLMs just go with their own style regardless, and I haven't had much luck changing it. They can produce some interesting stories, but it's far from outline + prompt > gen story to get something that's readable (on the good side, there are many LLMs, so testing a different provider may give better results). |
|
| ▲ | wormpilled 2 days ago | parent | prev [-] |
| This is such an obnoxious reply, holy crap... Why is it upvoted to the top of the thread? |