ainch 5 hours ago
The human genome contains around 1.5GB of information, while DeepSeek v3 weighs in at around 800GB, so it's a bit apples-to-oranges. As you say, what's been evolved over hundreds of millions of years is the learning apparatus and architecture; from there we largely learn online (with some built-in behaviours like reflexes). It's a testament to the robustness of our brains that the overwhelming majority of humans learn pretty effectively. I suspect LLM training runs are substantially more volatile (as well as suffering from the obvious data-efficiency issues).

If you'd like an unsolicited recommendation, 'A Brief History of Intelligence' by Max Bennett is a good, accessible book on this topic. It explicitly draws parallels between the brain's evolution and modern AI.
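For a rough sense of scale, here's a back-of-envelope sketch in Python. The ~3.1B base-pair count, the 2-bits-per-base encoding, and the assumption that DeepSeek-V3's ~671B parameters are stored at one byte each (FP8) are my own additions, not figures from the comment above:

    # Back-of-envelope sizes; both figures are approximations.
    BASE_PAIRS = 3.1e9     # haploid human genome, ~3.1 billion bases
    BITS_PER_BASE = 2      # 4 nucleotides -> 2 bits each
    genome_gb = BASE_PAIRS * BITS_PER_BASE / 8 / 1e9
    print(f"genome:  ~{genome_gb:.2f} GB")   # ~0.78 GB haploid

    PARAMS = 671e9         # DeepSeek-V3 parameter count
    BYTES_PER_PARAM = 1    # FP8 storage (assumption)
    print(f"weights: ~{PARAMS * BYTES_PER_PARAM / 1e9:.0f} GB")  # ~671 GB

At 2 bits per base the haploid genome comes out closer to ~0.8GB; figures like 1.5GB usually count the diploid genome or a less compact encoding, so the exact number depends on the accounting but the orders of magnitude stand.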
jack_pp 3 hours ago
And the information contained in an LLM is a compression of how many terabytes of training data? Maybe future models will be an order of magnitude smaller while still performing better. My point is that you can't judge the information in the genome purely by counting its bytes.
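As a hedged back-of-envelope on that compression ratio: the 14.8T-token training-set size is the figure DeepSeek reported for V3, but the ~4 bytes of raw text per token and the ~700GB weight size are my rough assumptions:

    # Rough compression ratio: training text vs. model weights.
    TOKENS = 14.8e12       # DeepSeek-V3's reported training-token count
    BYTES_PER_TOKEN = 4    # rough average for English text (assumption)
    corpus_tb = TOKENS * BYTES_PER_TOKEN / 1e12
    weights_tb = 0.7       # ~700 GB of FP8 weights (assumption)
    print(f"corpus: ~{corpus_tb:.0f} TB, ratio ~{corpus_tb / weights_tb:.0f}:1")
    # -> corpus: ~59 TB, ratio ~85:1

So on these assumptions the weights hold something like one byte for every ~85 bytes of training text, which is the sense in which the 800GB figure understates how much data went into it.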
idiotsecant an hour ago
The human genome isn't a self-contained thing; the genome as a static sequence is really just an abstraction. What actually functions as the heritable unit includes epigenetic marks, non-coding RNA regulation, 3D chromatin structure, and mitochondrial DNA. In the real biological world there are very few sharp edges: systems bleed into one another, so defining something like 'the number of bits in the human genome' is very difficult, but it's undoubtedly far bigger than you posit here.