ericdoerheit 2 days ago

Thank you for your work! I would be interested to see what this means for a CNN architecture. Maybe the whole architecture wouldn't need to be based on uGMM-NNs, only the last layers?

zakeria 2 days ago

Thanks, good question. In theory, the uGMM layer could complement CNNs in different ways. For example, as you mentioned, one could imagine:

using standard convolutional layers for feature extraction,

then replacing the final dense layers with uGMM neurons to enable probabilistic inference and uncertainty modeling on top of the learned features.
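As a rough illustration of that second step, here is a minimal pure-Python sketch of a mixture-style output head sitting on top of features that a convolutional backbone would produce. Everything in it is an assumption made for illustration: the names `mixture_neuron` and `mixture_head`, the choice of one Gaussian component per input feature, and the parameterization by `means`, `log_sigmas`, and `mix_logits` are hypothetical and may not match the actual uGMM neuron definition.

```python
import math

def log_normal_pdf(x, mean, log_sigma):
    """Log-density of a univariate Gaussian with std exp(log_sigma)."""
    z = (x - mean) / math.exp(log_sigma)
    return -0.5 * z * z - log_sigma - 0.5 * math.log(2.0 * math.pi)

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]

def mixture_neuron(features, means, log_sigmas, mix_logits):
    """Hypothetical mixture neuron (assumption, not the author's definition).

    Uses one Gaussian component per input feature and returns
    log sum_k pi_k * N(features[k]; means[k], sigma_k^2),
    where pi = softmax(mix_logits).
    """
    log_pi = log_softmax(mix_logits)
    terms = [
        lp + log_normal_pdf(x, mu, ls)
        for lp, x, mu, ls in zip(log_pi, features, means, log_sigmas)
    ]
    m = max(terms)  # log-sum-exp for stability
    return m + math.log(sum(math.exp(t - m) for t in terms))

def mixture_head(features, params):
    """One mixture neuron per class.

    Returns the per-class log-densities and the class probabilities
    obtained by normalizing them with a softmax.
    """
    scores = [mixture_neuron(features, *p) for p in params]
    probs = [math.exp(v) for v in log_softmax(scores)]
    return scores, probs

# Stand-in for pooled CNN features; a real model would compute these.
features = [0.2, -1.0, 0.5]
# Made-up parameters for two classes: (means, log_sigmas, mix_logits).
params = [
    ([0.0, -1.0, 0.5], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]),
    ([2.0, 1.0, -0.5], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]),
]
scores, probs = mixture_head(features, params)
```

In this sketch the per-class log-densities are kept alongside the normalized probabilities, since a uniformly low density across all classes could serve as a crude signal that the features lie far from anything the head was fitted on. Whether the real uGMM layer exposes uncertainty this way is not something the thread states.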

My current focus, however, is exploring how uGMMs translate into Transformer architectures, which could open up interesting possibilities for probabilistic reasoning in attention-based models.