| ▲ | sharkjacobs 3 hours ago |
| > Blanchard's account is that he never looked at the existing source code directly. He fed only the API and the test suite to Claude and asked it to reimplement the library from scratch. This feels sort of like saying "I just blindly threw paint at that canvas on the wall and it came out in the shape of Mickey Mouse, and so it can't be copyright infringement because it was created without the use of my knowledge of Mickey Mouse." Blanchard is, of course, familiar with the source code; he's been its maintainer for years. The premise is that he prompted Claude to reimplement it without using his own knowledge of it to direct or steer. |
|
| ▲ | dathinab 3 hours ago | parent | next [-] |
| > Blanchard is, of course, familiar with the source code, he's been its maintainer for years. I would argue it's irrelevant whether they looked or didn't look at the code, as well as whether he was or wasn't familiar with it. What matters is that they fed the original code into a tool which they set up to make a copy of it. How that tool works doesn't really matter. Neither does it make a difference if you obfuscate that it's a copy. If I blindfold myself when making copies of books with a book scanner + printer, I'm still engaging in copyright infringement. If AI is a tool, that should hold. If it isn't "just" a tool, then it did engage in copyright infringement (as it created the new output side by side with the original) in the same way an employee might do so on command of their boss. That still makes the boss/company liable for copyright infringement, and in general, just because you weren't the one who created an infringing product doesn't mean you aren't just as liable for distributing it as if you had created it yourself. |
| |
| ▲ | Legend2440 12 minutes ago | parent | next [-] | | >that they fed the original code into a tool which they set up to make a copy of it Well, no. They fed the spec (test cases, etc.) into a tool which made a new program matching the spec. This is not a copy of the original code. But this also feels like arguing over the color of the iceberg while the Titanic sinks. If you have a tool that can make code to spec, what is the value of source code anymore? Even if your app is closed-source, you can just tell Claude to write new code that does the same thing. | |
| ▲ | spullara 2 hours ago | parent | prev | next [-] | | if the actual text of the code isn't the same or obviously derivative, copyright doesn't apply at all. | | |
| ▲ | sigseg1v 2 hours ago | parent | next [-] | | What does derivative mean here? Because IMO it means that the existing work was used as input. So if you used an LLM and it was trained on the existing work, that's a derivative work. If you rot13-encode something as input, so you can't personally read it, and then a device applies rot13 to it again and outputs it, that's a derivative work. | | |
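The ROT13 round trip described above is easy to demonstrate: the cipher is its own inverse, so applying it twice recovers the original text exactly, which is why the intermediate "unreadable" form doesn't make the output any less a copy. A minimal Python sketch (the sample string is arbitrary):

```python
import codecs

def rot13(text: str) -> str:
    # Rotate each ASCII letter by 13 places; ROT13 is its own inverse.
    return codecs.encode(text, "rot13")

original = "Copyrighted input text"
encoded = rot13(original)   # unreadable at a glance
decoded = rot13(encoded)    # applying ROT13 again undoes it

print(encoded)              # -> Pbclevtugrq vachg grkg
print(decoded == original)  # -> True
```

No information is lost at any step, which is the commenter's point: an obfuscating intermediate representation doesn't launder the provenance of the input.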
| ▲ | spullara 2 hours ago | parent | next [-] | | In order for it to be creatively derivative you would need to copy the structure, logic, organization, and sequence of operations, not just reimplement the functionality. It is pretty clear that in this case that wasn't done. | |
| ▲ | ghostpepper 2 hours ago | parent | prev | next [-] | | As a cynical person I assume all the frontier LLMs were trained on datasets that include every open source project, but as a thought experiment: if an LLM was trained on a dataset that included every open source project _except_ chardet, do you think said LLM would still be able to easily implement something very similar? | | | |
| ▲ | nicole_express 2 hours ago | parent | prev | next [-] | | Of course, the problem with this interpretation is that all modern LLMs are derivatives of huge amounts of text under completely different licenses, including "All rights reserved," and therefore cannot be used for any purpose. I'm not sure how you square the circle of "it's alright to use the LLM to write code, unless the code is a rewrite of an open source project to change its license." | |
| ▲ | satvikpendem 2 hours ago | parent | prev | next [-] | | > Because IMO it means that the existing work was used as input That's your opinion (since you said "IMO"), not the actual legal definition. | |
| ▲ | bmcahren 2 hours ago | parent | prev | next [-] | | LLMs do not encode nor encrypt their training data. The fact that they can recite training data is a defect, not a default. You can see this by comparing the model's size against its training data compressed with a fantasy algorithm 50% better than SOTA. You'll find you'd still be missing 80-90% of the training data, even if the model were as much of a stochastic parrot as you may be implying. The outputs of AI are not derivative just because the model saw training data including the original library. Then onto prompting: "He fed only the API and (his) test suite to Claude." This is Google v. Oracle all over again - are APIs copyrightable? | | |
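That back-of-envelope comparison can be made concrete. Every number below is an illustrative assumption (parameter count, corpus size, compression ratios), not a measured figure for any real model:

```python
# Back-of-envelope: can a model's weights hold its training corpus?
# All numbers here are illustrative assumptions, not published figures.

params = 70e9                  # assumed parameter count
bytes_per_param = 2            # fp16/bf16 storage
model_gb = params * bytes_per_param / 1e9   # 140 GB of weights

corpus_gb = 10_000             # assumed ~10 TB training corpus
sota_ratio = 4                 # rough lossless text compression (~4:1)
fantasy_ratio = sota_ratio * 2 # even more generous than "50% better than SOTA"

compressed_gb = corpus_gb / fantasy_ratio   # 1250 GB after fantasy compression

missing = 1 - model_gb / compressed_gb      # share that cannot fit: ~0.89

print(f"weights: {model_gb:.0f} GB, compressed corpus: {compressed_gb:.0f} GB")
print(f"share of corpus the weights cannot store verbatim: {missing:.0%}")
```

Under these assumed sizes, roughly 89% of the corpus could not fit in the weights even with compression twice as good as today's, which is the 80-90% figure the comment gestures at; different assumed sizes shift the number but not the conclusion.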
| ▲ | satvikpendem an hour ago | parent [-] | | > This is Google v. Oracle all over again - are APIs copyrightable? Yes, this is the best way to ask the question. If I take a public-facing API and reimplement everything, whether by human or machine, that should be permissible. After all, that's what Google did, and it's not like their engineers never read a single line of the Java source code. Even in "clean room" implementations, a human might still have remembered or recalled a previous implementation of some function they had encountered before. |
| |
| ▲ | wizzwizz4 2 hours ago | parent | prev [-] | | See also: https://monolith.sourceforge.net/, which seeks to ask the question: > But how far away from direct and explicit representations do we have to go before copyright no longer applies? |
| |
| ▲ | yorwba an hour ago | parent | prev | next [-] | | Copyright protects even very abstract aspects of human creative expression, not just the specific form in which it is originally expressed. If you translate a book into another language, or turn it into a silent movie, none of the actual text may survive, but the story itself remains covered by the original copyright. So when you clone the behavior of a program like chardet without referencing the original source code except by executing it to make sure your clone produces exactly the same output, you may still be infringing its copyright if that output reflects creative choices made in the design of chardet that aren't fully determined by the functional purpose of the program. | |
| ▲ | NSUserDefaults an hour ago | parent | prev [-] | | If you pirate a movie and reencode it, does that apply as well? You can still watch the movie and it is “obviously” the same movie. Here you can use the program and it is, to the user, also the same. |
| |
| ▲ | margalabargala 2 hours ago | parent | prev [-] | | > If it isn't "just" a tool, then it did engage in copyright infringement Copyright infringement is a thing humans do. It's not a human. Just like how the photos taken by a monkey with a camera have no copyright. Human law binds humans. | | |
| ▲ | malicka an hour ago | parent [-] | | Correct. The human who shares the copy is the one who engages in copyright infringement. | | |
| ▲ | margalabargala an hour ago | parent [-] | | So, let's say that rather than actually touching any copyrighted material, a human merely tells an AI how to go onto the internet and find copyrighted material, download it, and ingest it for training. The AI, fully autonomously, does so, and after training itself on the material deletes it, so no human ever downloads, consumes, or shares it. If we are saying AI is "more than a tool", which seems to be the direction courts are leaning since they've ruled AI output without direct human involvement is not copyrightable[0], then the above seems like it would be entirely legal. [0] https://www.copyright.gov/newsnet/2025/1060.html |
|
|
|
|
| ▲ | logicprog 3 hours ago | parent | prev | next [-] |
| I just don't see how it's relevant whether he did look or didn't. In my opinion, it's not just legally valid to make a re-implementation of something if you've seen the code as long as it doesn't copy expressive elements. I think it's also ethically fine as well to use source code as a reference for re-implementing something as long as it doesn't turn into an exact translation. |
| |
| ▲ | simonw 2 hours ago | parent | next [-] | | Right. The alternative is that we reward Dan for his 14 years of volunteer maintenance of a project... by banning him from working on anything similar under a different license for the rest of his life. | |
| ▲ | atomicnumber3 2 hours ago | parent | prev | next [-] | | It's actually not legally fine, or at least it's extremely dangerous. Projects that re-implement APIs presented by extremely litigious companies specifically do not allow people who, for instance, have seen the proprietary source code to then work on the project. | | |
| ▲ | jpc0 2 hours ago | parent | next [-] | | I don't think fear of legal action makes it illegal. Suppose I know it is legal to make a turn at a red light, and I know a court will uphold that I was in the right, but a police officer will fine me regardless and I would need to actually pursue some legal remedy. I'm unlikely to make the turn regardless of whether it is legal, because it is expensive, if not in money then in time. Copyright lawsuits are notoriously expensive and long, so even if a court would eventually deem it fine, why take the chance? | |
| ▲ | sunshowers an hour ago | parent | prev [-] | | My understanding is that that is a maximalist position for the avoidance of risk, and is sufficient but probably not necessary. |
| |
| ▲ | sarchertech 2 hours ago | parent | prev [-] | | Ignoring the legal or ethical concerns. Let’s say we live in a world where the cost of copying code is so close to zero that it’s indistinguishable from a world without copyright. Anything you put out can and will be used by whatever giant company wants to use it with no attribution whatsoever. Doesn’t that massively reduce the incentive to release the source of anything ever? | | |
| ▲ | satvikpendem an hour ago | parent | next [-] | | No, because (most) people don't work on OSS for vanity; they do it to help other people, whether individuals or groups of individuals, i.e. corporations. It's the same question as: if an AI can generate "art", or photographers can capture a scene better than any (realistic) painter, will people still create art? Obviously yes, and we have seen it since Stable Diffusion was released three years ago: people are still creating. | |
| ▲ | pocksuppet 2 hours ago | parent | prev | next [-] | | Yes, and it reduces the incentives to release binaries too. Such a world will be populated by almost entirely SaaS, which can still compete on freedom. | |
| ▲ | intrasight 2 hours ago | parent | prev [-] | | Most commercial software that I've used follows the model of a legal moat around a pretty crappy database schema. The non-IP protection has largely been the effort involved in replicating an application's behavior, and that effort is dropping precipitously. |
|
|
|
| ▲ | axus 2 hours ago | parent | prev | next [-] |
Oracle had its day in court with Google over the Java APIs. Reimplementing APIs can be done without copyright infringement, but Oracle must have tried to find real infringement during discovery. In this case, we could theoretically prove that the new chardet is a clean reimplementation: Blanchard can provide all of the prompts necessary to re-implement it again, and for the cost of the tokens anyone can reproduce the results.
|
| ▲ | Aurornis 2 hours ago | parent | prev | next [-] |
| Can anyone find the actual quote where Blanchard said this? My understanding was that his claim was that Claude was not looking at the existing source code while writing it. |
| |
| ▲ | pklausler 2 hours ago | parent | next [-] | | Conveniently ignoring the likelihood that Claude had been trained on the freely accessible source code. | |
| ▲ | mrgoldenbrown 2 hours ago | parent | prev [-] | | Does he have access to Claude's training data? How can he claim Claude wasn't trained on the original code? |
|
|
| ▲ | SpicyLemonZest 2 hours ago | parent | prev | next [-] |
| Isn't this a red herring? An API definition is fair use under Google v. Oracle, but the test suite is definitely copyrightable code! |
|
| ▲ | 3 hours ago | parent | prev | next [-] |
| [deleted] |
|
| ▲ | esafak 2 hours ago | parent | prev | next [-] |
If you only stick to the API and ignore the implementation, it is not Mickey Mouse any more but a rodent. If it were just a clone it wouldn't be 50x as fast. Nevertheless, APIs apparently can be copyrightable. I generally disagree with this; it's how PC compatibles took off, giving consumers better options.
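The "stick to the API" point is concrete with chardet: its public surface is essentially a `detect()` function returning a small result dict. A toy, deliberately non-equivalent stand-in can honor that interface with a completely different implementation (the heuristics and confidence values below are made up for illustration; real chardet uses statistical models):

```python
def detect(byte_str: bytes) -> dict:
    """Toy chardet-style detector: same result-dict shape, different internals.

    The real library's heuristics are far more sophisticated; this sketch
    only distinguishes ASCII from valid UTF-8, with a made-up fallback.
    """
    try:
        byte_str.decode("ascii")
        return {"encoding": "ascii", "confidence": 1.0, "language": ""}
    except UnicodeDecodeError:
        pass
    try:
        byte_str.decode("utf-8")
        return {"encoding": "utf-8", "confidence": 0.99, "language": ""}
    except UnicodeDecodeError:
        # Arbitrary illustrative fallback, not what chardet would report.
        return {"encoding": "windows-1252", "confidence": 0.3, "language": ""}

print(detect(b"hello"))                    # ascii
print(detect("na\u00efve".encode("utf-8")))  # utf-8
```

Any caller written against the `detect()` interface works unchanged against either implementation; whether honoring the interface alone keeps a rewrite clear of the original's copyright is exactly what the thread is debating.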
| |
| ▲ | amarant 2 hours ago | parent | next [-] | | Wait, what? Didn't Oracle lose the case against Google? Have I been living in an alternate reality where API compatibility is fair use? |
| ▲ | Copyrightest 2 hours ago | parent | prev [-] | | [dead] |
|
|
| ▲ | re-thc 3 hours ago | parent | prev | next [-] |
| > This feels sort of like saying "I just blindly threw paint at that canvas on the wall..." > He fed only the API and the test suite to Claude and asked it... The difference being that Claude looked, so it's not blind. The equivalent is more like "I blindly took a photo of it and then used that to..." Technically, it did look. |
| |
| ▲ | amarant 2 hours ago | parent [-] | | The article is poorly written. Blanchard was a chardet maintainer for years; of course he had looked at its code! What he claimed, and what was interesting, was that Claude didn't look at the code, only the API and the test suite. The new implementation is all Claude. And the implementation is different enough to be considered original: completely different structure, design, and hey, a 48x improvement in performance! It's just API-compatible with the original, which as per the 2021 Google v. Oracle decision is to be considered fair use. | | |
| ▲ | mrgoldenbrown 2 hours ago | parent | next [-] | | did he claim that Claude wasn't trained on the original? Or just that he didn't personally provide Claude with a copy? | | |
| ▲ | amarant 2 hours ago | parent [-] | | I reckon the latter; how would he know what was in Claude's training data? |
| |
| ▲ | re-thc 2 hours ago | parent | prev [-] | | > What he claimed, and what was interesting, was that Claude didn't look at the code Who opened the PR? Who co-authored the commits? It's clearly on GitHub. > Blanchard was a chardet maintainer for years. Of course he had looked at its code! So there you have it: he looked, and he co-authored, so there's that. | | |
| ▲ | kjksf 2 hours ago | parent [-] | | If I put my signature on a Picasso painting, it doesn't make me a co-author of said painting. Blanchard is very clear that he didn't write a single line of code. He isn't an author, he isn't a co-author. Signing a GitHub commit doesn't change that. | | |
| ▲ | re-thc an hour ago | parent [-] | | > Blanchard is very clear that he didn't write a single line of code He used Claude to write it. What's the difference? Is the fact that I wrote it on a notepad vs. printed it out supposed to mean I didn't do it? > Signing a GitHub commit doesn't change that. That's the equivalent of me saying I didn't kill anyone; the fingerprints on the knife don't change that. | | |
| ▲ | satvikpendem an hour ago | parent [-] | | I'll take a commit authored by someone else and then git amend the author to myself, did I write that commit then? By your logic I did apparently. |
|
|
|
|
|
|
| ▲ | babypuncher 2 hours ago | parent | prev [-] |
| What if we said that generative AI output is simply not copyrightable. Anything an AI spits out would automatically be public domain, except in cases where the output directly infringes the rights of an existing work. This would make it so relicensing with AI rewrites is essentially impossible unless your goal is to transition the work to be truly public domain. I think this also helps somewhat with the ethical quandary of these models being trained on public data while contributing nothing of value back to the public, and disincentivize the production of slop for profit. |
| |
| ▲ | kjksf an hour ago | parent | next [-] | | We did in fact say so. https://www.carltonfields.com/insights/publications/2025/no-... > No Copyright Protection for AI-Assisted Creations: Thaler v. Perlmutter > A recent key judicial development on this topic occurred when the U.S. Supreme Court declined to review the case of Thaler v. Perlmutter on March 2, 2026, effectively upholding lower court rulings that AI-generated works lacking human authorship are not eligible for copyright protection under U.S. law | | |
| ▲ | pseudalopex an hour ago | parent [-] | | > > A recent key judicial development on this topic occurred when the U.S. Supreme Court declined to review the case of Thaler v. Perlmutter on March 2, 2026, effectively upholding lower court rulings that AI-generated works lacking human authorship are not eligible for copyright protection under U.S. law This was AI summary? Those words were not in the article. The courts said Thaler could not have copyright because he refused to list himself as an author. |
| |
| ▲ | idle_zealot an hour ago | parent | prev [-] | | > This would make it so relicensing with AI rewrites is essentially impossible unless your goal is to transition the work to be truly public domain. That's not true at all. Anyone could follow these steps: 1. Have the LLM rewrite GPL code. 2. Do not publish that public domain code. You have no obligation to. 3. Make a few tweaks to that code. 4. Publish a compiled binary/use your code to host a service under a proprietary license of your choice. |
|