marvin-hansen 4 days ago
No surprise. About a year ago, I looked at fly.io because of its low pricing, and I was wondering where they were cutting corners to still make some money. Ultimately, I found the answer in their tech docs, where it was spelled out clearly that a Fly instance is hardwired to one physical server and thus cannot fail over if that server dies. Not sure if that part is still in the official documentation. In practice, that means if a server goes down, they have to load the last snapshot of that instance from backup, push it onto a new server, update the network path, and hope that no more servers fail than there is spare capacity. Otherwise you have to wait for a restore until the datacenter has mounted a few more boxes in the rack. That goes a long way toward explaining the randomness of those outage reports, e.g. "my app is down" vs. "mine is fine", or "mine came back in 5 minutes" vs. "mine took forever". As a business on a budget, I think almost anything else, e.g. a small civo cluster, serves you better.
ignoramous 4 days ago
Fly.io can migrate vm+volume now: https://fly.io/docs/reference/machine-migration/ / https://archive.md/rAK0V

> a fly instance is hardwired to one physical server and thus cannot fail over

I'm having trouble understanding how else this is supposed to work. I understand that live migration is a thing, but even in those cases, a VM is "hardwired" to some physical server, no?
dilyevsky 4 days ago
> Ultimately, I found the answer in their tech docs where it was spelled out clearly that a Fly instance is hardwired to one physical server and thus cannot fail over in case that server dies.

The majority of EC2 instance types did not have live migration until very recently. Some probably still don't (they don't really spell out how and when it's supposed to work). It is also not free: there's a noticeable brown-out when your VM gets migrated on GCP, for example.
pier25 4 days ago
If you want HA on Fly, you need to deploy an app to multiple regions (multiple machines). Fly might still go down completely if their proxy layer fails, but that's much less common.
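For a rough feel of why multiple machines help and why a shared proxy layer still caps the result, here is a back-of-the-envelope sketch. It assumes machines fail independently of each other, which is a simplification, and every number in it is hypothetical rather than taken from Fly's docs or status history. The function name `combined_availability` is made up for this illustration.

```python
def combined_availability(machine_availability: float,
                          machines: int,
                          proxy_availability: float = 1.0) -> float:
    """Estimate app availability for redundant machines behind one shared proxy.

    Assumes independent machine failures (a simplification) and that the app
    is reachable only when the proxy is up AND at least one machine is up.
    """
    if machines < 1:
        raise ValueError("need at least one machine")
    # Probability that every machine is down at the same time.
    all_down = (1.0 - machine_availability) ** machines
    # The shared proxy is a single point of failure in front of all machines.
    return proxy_availability * (1.0 - all_down)


if __name__ == "__main__":
    # Hypothetical figures: 99% per machine, 99.9% for the shared proxy layer.
    for n in (1, 2, 3):
        estimate = combined_availability(0.99, n, proxy_availability=0.999)
        print(f"{n} machine(s): {estimate:.6f}")
```

Under these assumptions, going from one machine to two removes most of the per-machine risk, while adding a third barely moves the estimate because the shared proxy term dominates.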
fulafel 4 days ago
The status tells a story about a failure in a high-availability/clustering system, so I think in this case the problem is rather that the complexity of the HA machinery is hurting the system's availability, compared with something like a simple VPS.