| ▲ | bob1029 2 days ago |
> Where does the genome, genetic representation, you are evolving come from?

The instruction set of the program that is being searched for. This is probably the best publicly available summary of the idea I am pursuing: https://github.com/kurtjd/brainfuck-evolved
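For illustration only, here is a minimal sketch of the kind of genome this implies, assuming (as in the linked repo) that the instruction set is Brainfuck's eight symbols; the names are placeholders, not taken from that repo:

```python
import random

# Brainfuck's instruction set: the genome is just a variable-length
# sequence of these symbols.
INSTRUCTIONS = "><+-.,[]"

def random_genome(length: int) -> list[str]:
    """Create a random candidate program of the given length."""
    return [random.choice(INSTRUCTIONS) for _ in range(length)]

# Example: a random 20-instruction candidate program.
candidate = random_genome(20)
print("".join(candidate))
```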
|
| ▲ | esafak 2 days ago | parent | next [-] |
A program is composed of arbitrarily many instructions from your set. How are you accounting for this; by trying every possible program length? And you are considering the simpler case where the search space is discrete, unlike the continuous spaces in most machine learning problems. I think you need to think this through some more. You may find there is a reason nobody uses genetic algorithms for real-world tasks.
| |
| ▲ | bob1029 2 days ago | parent [-] |
> How are you accounting for this; by trying every possible program length?

Part of the mutation function involves probabilistically growing and shrinking the program size (i.e., inserting and removing random instructions).

> And you are considering the simpler case where the search space is discrete, unlike the continuous spaces in most machine learning problems.

All "continuous spaces" that embody modern machine learning techniques are ultimately discrete.
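A rough sketch of what such a grow/shrink mutation could look like, reusing the genome representation above; the probabilities are arbitrary placeholders, not values from the commenter's setup:

```python
import random

INSTRUCTIONS = "><+-.,[]"  # assumed instruction set, as in the sketch above

def mutate(program: list[str],
           p_insert: float = 0.01,
           p_delete: float = 0.01,
           p_point: float = 0.01) -> list[str]:
    """Probabilistically insert, delete, and overwrite instructions,
    so program length itself is under evolutionary control."""
    result = []
    for op in program:
        if random.random() < p_delete:
            continue                            # shrink: drop this instruction
        if random.random() < p_point:
            op = random.choice(INSTRUCTIONS)    # point mutation: overwrite
        result.append(op)
        if random.random() < p_insert:
            result.append(random.choice(INSTRUCTIONS))  # grow: insert after
    if not result:                              # never return an empty program
        result.append(random.choice(INSTRUCTIONS))
    return result
```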
| ▲ | esafak 2 days ago | parent [-] |
No, they are not. Model outputs can be discretized, but the model parameters (excluding hyperparameters) are typically continuous. That's why we can use gradient descent.
| ▲ | bob1029 2 days ago | parent [-] |
Where are the model parameters stored, and how are they represented?
| ▲ | esafak 2 days ago | parent [-] |
On disk or in memory as multidimensional arrays ("tensors" in ML speak).
| ▲ | bob1029 2 days ago | parent [-] |
Do we agree that these memories consist of a finite number of bits?
| ▲ | esafak a day ago | parent [-] |
Yes, of course. Consider a toy model with just 1000 double-precision (64-bit) parameters, i.e., 64,000 bits in total. If you're going to randomly flip bits over this 2^64,000 search space while you evaluate a nontrivial fitness function, genetic style, you'll be waiting for a long time.
| ▲ | bob1029 a day ago | parent [-] |
I agree that if you approach it naively you will accomplish nothing. With some optimization, you can evolve programs with search spaces of 10^10,000 states (i.e., 10 unique instructions, 10,000 instructions long) and beyond. Visiting every possible combination is not the goal here.
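For concreteness, the arithmetic behind that figure: with an alphabet of 10 instructions and a fixed length of 10,000, there are 10^10,000 distinct programs, so the search has to rely on local mutation and selection rather than enumeration. A quick check:

```python
alphabet = 10       # unique instructions
length = 10_000     # instructions per program

# Number of distinct fixed-length programs: alphabet ** length.
# Python handles the big integer exactly; we just report its magnitude.
states = alphabet ** length
print(f"search space has 10^{len(str(states)) - 1} states")  # -> 10^10000
```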
|
|
|
|
|
|
|
|
| ▲ | dartos 2 days ago | parent | prev [-] |
You're talking specifically about using genetic programming to create new programs, as opposed to gradient descent in LLMs to minimize a loss function, right? How would you construct a genetic algorithm to produce natural language like LLMs do?

Forgive me if I'm misunderstanding, but in programming we have "tokens," which are minimal meaningful bits of code. For natural languages it's harder. "Words" are not super meaningful on their own, I don't think (at least not as much as a token). So how would you break down natural language for a genetic algorithm?
| |
| ▲ | bob1029 2 days ago | parent [-] |
> how would you break down natural language for a genetic algorithm?

The entire point is that you do not bother trying. From an information theory and computational perspective, raw UTF-8 bytes can work just as well as "tokens". The program that is being evolved is expected to develop whatever strategy is best suited to providing the desired input/output transformation. Back to the bitter lesson on this one.
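A sketch of what "raw UTF-8 bytes in, raw UTF-8 bytes out" could mean for the fitness function, assuming a hypothetical interpreter `run_program(program, input_bytes) -> bytes` that executes a candidate genome; nothing here is taken from the linked repo:

```python
def fitness(program: list[str],
            examples: list[tuple[bytes, bytes]],
            run_program) -> float:
    """Score a candidate by how closely its output bytes match the
    expected output bytes across a set of (input, expected) pairs.
    `run_program` is an assumed interpreter for the evolved genome."""
    score = 0.0
    for input_bytes, expected in examples:
        output = run_program(program, input_bytes)
        # Reward byte-for-byte agreement; no tokenizer, no vocabulary.
        matches = sum(a == b for a, b in zip(output, expected))
        score += matches - abs(len(output) - len(expected))
    return score
```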
| ▲ | dartos a day ago | parent [-] |
I'll need to read up on genetic algorithms, I think. That sounds really cool, but coming from training other statistical models, I'm having a hard time imagining what the training loop looks like.
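For what it's worth, a bare-bones sketch of a generic evolutionary "training loop" built from the pieces described in this thread (random genomes, grow/shrink mutation, a byte-level fitness function), reusing `random_genome`, `mutate`, and `fitness` from the earlier sketches; this is a generic truncation-selection scheme, not the commenter's actual setup:

```python
import random

def evolve(examples, run_program,
           population_size: int = 100,
           generations: int = 1000):
    """Generic loop: score the population, keep the best, refill by mutation."""
    population = [random_genome(random.randint(10, 100))
                  for _ in range(population_size)]
    for _ in range(generations):
        scored = sorted(population,
                        key=lambda p: fitness(p, examples, run_program),
                        reverse=True)
        parents = scored[:population_size // 10]   # truncation selection
        population = parents + [mutate(random.choice(parents))
                                for _ in range(population_size - len(parents))]
    return max(population, key=lambda p: fitness(p, examples, run_program))
```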
|
|