If you squint, it's a fixed-iteration ODE solver. I'd love to see a generalization of this and the Universal Transformer mentioned, re-envisioned as flow-matching/optimal-transport models.

kevmo314 4 days ago | parent | next [-]

How would flow matching work? In language we have inputs and outputs but it's not clear what the intermediate points are since it's a discrete space.

Etheryte 4 days ago | parent [-]

One of the core ideas behind LLMs is that language is not treated as a discrete space, but as a continuous, high-dimensional vector space where you can easily interpolate as needed. It's one of the reasons LLMs readily make up words that don't exist when translating text, for example.
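To make the "intermediate points" question concrete, here is a minimal sketch of how a flow-matching setup could define them if you work on embeddings rather than tokens. It assumes the common straight-line path between a source and a target vector; the 3-dimensional embeddings and the function names are hypothetical, and nothing here is taken from the paper under discussion.

```python
import random

def interpolate(x0, x1, t):
    # Point on the straight-line path between two embeddings at time t in [0, 1].
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def flow_matching_pair(x0, x1, t):
    # For the linear path, the regression target is the constant velocity x1 - x0.
    x_t = interpolate(x0, x1, t)
    target_velocity = [b - a for a, b in zip(x0, x1)]
    return x_t, target_velocity

# Hypothetical 3-dimensional embeddings for a source and a target.
source = [0.0, 1.0, -1.0]
target = [1.0, 0.0, 1.0]

t = random.random()  # training would typically sample t uniformly per example
x_t, v = flow_matching_pair(source, target, t)

# Sanity check: following the target velocity for the remaining time lands on the target.
reached = [p + (1.0 - t) * vi for p, vi in zip(x_t, v)]
```

A model would be trained to predict `v` from `x_t` and `t`; whether the endpoints can live in embedding space at all is a separate question.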

kevmo314 4 days ago | parent | next [-]

Not the input and output though, which is the important part for flow matching modeling. Unless you're proposing flow matching over the latent space?

Xmd5a 4 days ago | parent | prev [-]

[flagged]

cfcf14 4 days ago | parent | prev [-]

This makes me think it would be nice to see some kind of child of the modern transformer architecture and neural ODEs. There was such interesting work a few years ago on how neural ODEs/PDEs could be seen as a sort of continuous limit of layer depth. Maybe models could learn cool stuff if the embeddings were somehow solutions of a dynamical model or something.
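A rough sketch of the "continuous limit of layer depth" idea: a stack of weight-tied residual updates x <- x + h * f(x) is forward Euler on dx/dt = f(x), so adding layers (with a smaller step h) should approach the exact ODE solution. The function `f` below is a hypothetical stand-in for a learned block, chosen as dx/dt = -x only because its exact solution is known.

```python
import math

def f(x):
    # Hypothetical stand-in for a learned block; here simply dx/dt = -x.
    return [-xi for xi in x]

def residual_stack(x, num_layers, total_time=1.0):
    # Weight-tied residual updates x <- x + h * f(x): forward Euler with step h.
    h = total_time / num_layers
    for _ in range(num_layers):
        x = [xi + h * fi for xi, fi in zip(x, f(x))]
    return x

def max_error(num_layers, x0):
    # Compare the stack's output with the exact solution x0 * exp(-1) at time 1.
    exact = [xi * math.exp(-1.0) for xi in x0]
    out = residual_stack(x0, num_layers)
    return max(abs(a - b) for a, b in zip(out, exact))

x0 = [1.0, 2.0]
for n in (4, 16, 64):
    print(n, max_error(n, x0))
```

The printed error shrinks as the number of layers grows, which is the sense in which depth can be read as a discretization of time.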