antognini 2 days ago

It's hard to prove rigorously, which is why people usually refer to it as the "manifold hypothesis." But it is reasonable to suppose that (most) data does live on a manifold in the strict sense of the term. If you imagine the pixels associated with a handwritten "6", you can smoothly deform the 6 into a variety of appearances where all the intermediate stages are recognizable as a 6.
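
Here's a rough numpy sketch of that intuition (a synthetic rendered stroke, not real MNIST; the shape parameters are made up for illustration): slowly rotating a drawn "6" gives a one-parameter family of images, and consecutive images stay close in pixel space, i.e. the family traces a smooth 1-D curve embedded in 784-dimensional pixel space.

    import numpy as np

    def render_six(theta, size=28, sigma=1.2):
        # A crude "6": a loop plus a tail, drawn as a parametric stroke,
        # rotated by theta and rasterized with a Gaussian brush so that
        # pixel intensities vary smoothly with theta.
        t = np.linspace(0.0, 2.0 * np.pi, 160)
        loop = np.stack([5.0 * np.cos(t), 5.0 * np.sin(t) - 5.0], axis=1)
        s = np.linspace(0.0, 1.0, 60)
        tail = np.stack([4.0 + 2.0 * s, -1.0 + 11.0 * s], axis=1)
        stroke = np.vstack([loop, tail])
        rot = np.array([[np.cos(theta), -np.sin(theta)],
                        [np.sin(theta),  np.cos(theta)]])
        stroke = stroke @ rot.T
        xs = np.linspace(-13.5, 13.5, size)
        gx, gy = np.meshgrid(xs, xs)
        grid = np.stack([gx.ravel(), gy.ravel()], axis=1)
        d2 = ((grid[:, None, :] - stroke[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * sigma ** 2)).max(axis=1)  # 784-vector

    # Walk along the one-parameter family: every image is still a "6", and
    # consecutive images are close in pixel space (a smooth 1-D curve in R^784).
    thetas = np.linspace(0.0, 0.4, 9)
    images = np.stack([render_six(th) for th in thetas])
    print(np.round(np.linalg.norm(np.diff(images, axis=0), axis=1), 3))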

However, the embedding space of a typical neural network representing the data is not a manifold. If you use ReLU activations, the kinks that the ReLU function creates break the smoothness. (Though if you exclusively used a smooth activation function like swish, you could maintain a manifold structure.)
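
A quick way to see the kink (a small numpy sketch): take one-sided numerical derivatives of ReLU and of swish, x * sigmoid(x), at the origin. ReLU's slope jumps from 0 to 1, while swish's slope is continuous, about 0.5 on both sides.

    import numpy as np

    def relu(x):
        return np.maximum(x, 0.0)

    def swish(x):
        # swish(x) = x * sigmoid(x); smooth everywhere
        return x / (1.0 + np.exp(-x))

    # One-sided numerical derivatives at the origin.
    h = 1e-6
    for name, f in [("relu", relu), ("swish", swish)]:
        left = (f(0.0) - f(-h)) / h
        right = (f(h) - f(0.0)) / h
        print(f"{name:5s}: slope just left of 0 = {left:.4f}, just right = {right:.4f}")

    # relu's slope jumps from 0 to 1 across the origin (a kink); swish's
    # slope is continuous, so a network built entirely from smooth
    # activations keeps a smooth structure.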

macleginn 2 days ago

People also apply the notion of a data manifold to language data (which is fundamentally discrete), and even for images the smoothness is hard to come by (e.g., "images of cars" is not smooth because of shape and colour discontinuities). I guess the best we can do is to hope that there is an underlying virtual "data manifold" from which our datapoints have been "sampled", and that knowing its structure may be useful.
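
That "sampled from an underlying manifold" picture can be made concrete with a small numpy toy (made up for illustration, not a claim about real language data): observations are integer-quantized features of a hidden continuous coordinate, yet distances between the discrete observations still track distances along the hidden coordinate.

    import numpy as np

    rng = np.random.default_rng(1)

    # A hidden continuous "meaning" coordinate z that is never observed.
    z = rng.uniform(0.0, np.pi, 400)

    # Observations are discrete: smooth features of z, quantized to a small
    # integer "vocabulary" (a stand-in for tokenized data).
    smooth = np.stack([np.cos(z), np.sin(z)], axis=1)
    tokens = np.round(3.0 * smooth).astype(int)

    # Compare distances between random pairs in latent space vs. in the
    # discrete observation space.
    i, j = rng.integers(0, len(z), size=(2, 2000))
    latent_d = np.abs(z[i] - z[j])
    token_d = np.linalg.norm(tokens[i] - tokens[j], axis=1)
    print("corr(latent distance, token distance) =",
          np.round(np.corrcoef(latent_d, token_d)[0, 1], 3))

    # The correlation is strongly positive: even though each observation is
    # discrete, the samples inherit the neighborhood structure of the
    # underlying continuous manifold they were drawn from.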

hansvm 2 days ago

Those are less problematic than you might imagine.

- For language, individual words might be discrete, but the concepts being communicated have more nuance and fill in the gaps.

- For language, even to the extent that discreteness applies, you can treat the data as being sampled from a coarser manifold and still extract a lot of meaningful structure.

- Images of cars are more continuous than you might imagine because of hue differences induced by time of day, camera lens, shadows, etc.

- Images of cars are potentially smooth even when considering shape and color discontinuities. Manifolds don't have to be globally connected. Local differentiability is usually the thing people are looking for in practical applications.
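
A tiny numpy sketch of that last point (toy data, not real images of cars): two disjoint noisy circles in R^3 form a disconnected set, yet local PCA around any point finds one dominant direction, i.e. each piece looks locally like a smooth 1-D manifold.

    import numpy as np

    rng = np.random.default_rng(0)

    def circle_cloud(center, radius, n=300, noise=0.01):
        # Noisy samples from a circle embedded in R^3 (a 1-D manifold).
        t = rng.uniform(0.0, 2.0 * np.pi, n)
        pts = np.stack([center[0] + radius * np.cos(t),
                        center[1] + radius * np.sin(t),
                        np.full(n, center[2])], axis=1)
        return pts + noise * rng.standard_normal(pts.shape)

    # Two disconnected components -- think "hatchbacks" vs. "pickup trucks":
    # no smooth path joins them, but each piece is smooth on its own.
    data = np.vstack([circle_cloud((0.0, 0.0, 0.0), 1.0),
                      circle_cloud((10.0, 0.0, 3.0), 2.0)])

    def local_spectrum(data, idx, k=20):
        # Singular values of the k nearest neighbors after centering
        # (local PCA); a single dominant value means locally ~1-dimensional.
        dists = np.linalg.norm(data - data[idx], axis=1)
        nbrs = data[np.argsort(dists)[:k]]
        centered = nbrs - nbrs.mean(axis=0)
        s = np.linalg.svd(centered, compute_uv=False)
        return s / s.max()

    for idx in (0, 350):  # one point from each component
        print(f"point {idx}: normalized local singular values = "
              f"{np.round(local_spectrum(data, idx), 3)}")

    # In both components one direction dominates: the data is locally a
    # smooth curve even though it is not globally connected.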