zakeria | 2 days ago
Thanks for the comment. Just to clarify, the uGMM-NN isn't simply "Gaussian sampling along the parameters of nodes." Each neuron is a univariate Gaussian mixture with learnable means, variances, and mixture weights. This lets the network perform probabilistic inference natively inside its architecture, rather than approximating uncertainty after the fact.

The work isn't framed as "replacing MLPs." The motivation is to bridge two research traditions:

- probabilistic graphical models and probabilistic circuits (relatively newer)
- deep learning architectures

That's why the Iris dataset (despite being simple) was included - not as a discriminative benchmark, but to show the model can be trained generatively in a way similar to PGMs, something a standard MLP cannot do. This is also where the other benefits of the approach mentioned in the paper come from.
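[Editor's note: a minimal sketch of the idea as described above, assuming each unit scores its scalar inputs under learnable univariate Gaussian mixtures and combines them in log-space. The class name, parametrization, and product-over-inputs combination are assumptions for illustration, not the paper's reference implementation.]

    # Sketch: a layer whose units are univariate Gaussian mixtures with
    # learnable means, variances, and mixture weights, evaluated in log-space
    # for numerical stability.
    import math
    import torch
    import torch.nn as nn

    class UGMMLayer(nn.Module):
        def __init__(self, in_features, out_features, n_components=3):
            super().__init__()
            shape = (out_features, in_features, n_components)
            self.means = nn.Parameter(torch.randn(shape))
            self.log_vars = nn.Parameter(torch.zeros(shape))
            self.logit_weights = nn.Parameter(torch.zeros(shape))

        def forward(self, x):
            # x: (batch, in_features) -> per-unit log-likelihood scores (batch, out_features)
            x = x[:, None, :, None]                              # (B, 1, in, 1)
            log_comp = -0.5 * ((x - self.means) ** 2 / self.log_vars.exp()
                               + self.log_vars + math.log(2 * math.pi))
            log_w = torch.log_softmax(self.logit_weights, dim=-1)
            log_mix = torch.logsumexp(log_w + log_comp, dim=-1)  # mixture over components
            return log_mix.sum(dim=-1)                           # combine inputs (product of densities)

Under this reading, training the layer generatively would amount to maximizing the summed log-likelihood rather than a discriminative loss, which is consistent with the generative-training point made above.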
vessenes | 2 days ago | parent
Thanks for writing back! I appreciate the plan to integrate the two architectures. On that front, it might be interesting to add a future-research section - what would be uniquely good about this architecture if scaled up?

On "usefulness" I think I'm still at my original question: it seems like an open theoretical question whether the architecture you describe - which carries roughly a tripled-or-greater training budget, a correspondingly larger data budget, and probably close to triple or greater inference cost - cannot be closely approximated by a "fair equivalent"-sized MLP. I hear you that the architecture can do more, but can you speak to this fair-size question? That is, if a PGM of the same size as your original network in terms of weights and depth is just as effective, then we'd still save space by simply running the two networks (MLP and PGM) side by side.

Thanks again for publishing!
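[Editor's note: a back-of-the-envelope parameter count behind the "roughly 3x" framing in this question, using the per-component mean/variance/weight accounting from the sketch above; these figures are an assumption for illustration, not numbers from the paper.]

    def mlp_layer_params(d_in, d_out):
        return d_in * d_out + d_out          # weights + biases

    def ugmm_layer_params(d_in, d_out, k=3):
        return 3 * k * d_in * d_out          # mean, variance, mixture weight per component

    print(mlp_layer_params(64, 64))          # 4160
    print(ugmm_layer_params(64, 64, k=1))    # 12288 - roughly 3x even with a single component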