esafak 2 days ago

No, they are not. Model outputs can be discretized but the model parameters (excluding hyperparameters) are typically continuous. That's why we can use gradient descent.
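A minimal sketch of the point in plain Python, with a made-up one-parameter loss (nothing here is from a real model): gradient descent works because the parameter can move by arbitrarily small continuous steps.

    # Toy loss L(w) = (w - 3)^2, minimized at w = 3.
    def grad(w):
        return 2.0 * (w - 3.0)   # dL/dw

    w = 0.0        # a continuous 64-bit float parameter
    lr = 0.1       # learning rate
    for _ in range(100):
        w -= lr * grad(w)        # small continuous update, no bit flipping

    print(w)       # converges toward 3.0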

bob1029 2 days ago | parent [-]

Where are the model parameters stored and how are they represented?

esafak 2 days ago | parent [-]

On disk or in memory as multidimensional arrays ("tensors" in ML speak).

bob1029 a day ago | parent [-]

Do we agree that these memories consist of a finite # of bits?

esafak a day ago | parent [-]

Yes, of course.

Consider a toy model with just 1000 double-precision (64-bit) parameters, i.e. 64 kilobits of state. If you're going to randomly flip bits over this 2^64,000 search space while evaluating a nontrivial fitness function, genetic-algorithm style, you'll be waiting a long time.
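Just to put the arithmetic in one place (a back-of-the-envelope sketch in Python, not anyone's actual setup):

    # 1000 parameters * 64 bits each = 64,000 bits of state.
    n_params = 1000
    bits_per_param = 64
    total_bits = n_params * bits_per_param     # 64,000

    # Number of distinct bit patterns a blind bit-flip search ranges over.
    search_space = 2 ** total_bits
    print(total_bits)                # 64000
    print(len(str(search_space)))    # roughly 19,000 (decimal digits in 2^64000)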

bob1029 a day ago | parent [-]

I agree that if you approach it naively, you will accomplish nothing.

With some optimization, you can evolve programs with search spaces of 10^10000 states (i.e., programs 10,000 instructions long drawn from 10 unique instructions) and beyond.

Visiting every possible combination is not the goal here.
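A rough sketch of what guided search (as opposed to enumeration) can look like: a simple (1+1) mutate-and-select loop over an instruction sequence. The instruction set and fitness function here are made up purely for illustration.

    import random

    INSTRUCTIONS = list(range(10))   # 10 hypothetical opcodes
    PROGRAM_LEN = 10_000             # 10^10000 possible programs in total

    def fitness(program):
        # Placeholder: reward a target opcode. A real setup would
        # execute the program and score its behavior.
        return sum(1 for op in program if op == 7)

    def mutate(program, n_changes=5):
        # Copy the parent and reassign a handful of instructions.
        child = program[:]
        for _ in range(n_changes):
            child[random.randrange(PROGRAM_LEN)] = random.choice(INSTRUCTIONS)
        return child

    # (1+1) evolutionary loop: keep the child only if it is no worse.
    parent = [random.choice(INSTRUCTIONS) for _ in range(PROGRAM_LEN)]
    best = fitness(parent)
    for _ in range(1_000):
        child = mutate(parent)
        score = fitness(child)
        if score >= best:
            parent, best = child, score

    print(best)   # improves steadily on the random baseline without enumerating anything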