Mamba-3 (together.ai)
67 points by matt_d 3 days ago | 5 comments
nl an hour ago
I'm looking forward to comparing this to Inception 2 (the text diffusion model), which in my experience is very fast and reasonably high quality.
robofanatic an hour ago
> Mamba-3 is a new state space model (SSM) designed with inference efficiency as the primary goal, a departure from Mamba-2, which optimized for training speed. The key upgrades are a more expressive recurrence formula, complex-valued state tracking, and a MIMO (multi-input, multi-output) variant that boosts accuracy without slowing down decoding.

Why can’t they simply say: "Mamba-3 focuses on being faster and more efficient when making predictions, rather than just being fast to train like Mamba-2"?