misswaterfairy 8 hours ago
Part of the up-time solution is keeping as much of your app and infrastructure within your control, rather than being at the mercy of mega-providers, as we've witnessed in the past month with Cloudflare and AWS. Probably:

- a couple of tower servers, running Linux or FreeBSD, backed by a UPS and an auto-start generator with 24 hours' worth of diesel (depending on where you are and the local area's propensity for natural disasters, maybe 72 hours);
- Caddy for the reverse proxy, Apache for the web server, PostgreSQL for the database;
- behind a router with sensible security settings that can also load-balance between the two servers (for availability rather than scaling; a Caddyfile sketch of doing this at the proxy layer is below);
- on static WAN IPs;
- with dual redundant WAN connections (different ISPs/network providers);
- a regular and strictly followed patch and hardware maintenance cycle;
- located in an area resistant to wildfire, civil unrest, and riverine or coastal flooding.

I'd say that'd get you close to five 9s (no more than ~5 minutes of downtime per year), though I'd pretty much guarantee five 9s (maybe even six 9s, no more than ~32 seconds of downtime per year) if the two machines were physically separated by a few hundred kilometres, each with its own supporting infrastructure as above, sans the load balancing (see below), reachable through two separate network routes.

Load balancing would become human-driven in this 'physically separate' example (cheaper, less complex): if your-site-1.com fails, simply re-point your browser to your-site-2.com, which routes to the other redundant server on a different network (a small probe script for checking which one is up is sketched below). The hard part now will be picking network providers that don't share the same pipes/cables or upstream dependencies, e.g. both ultimately relying on Cloudflare or AWS... Keep the WAN IPs written down in case DNS fails.

PostgreSQL can do master-master replication, though I understand it's a pain to set up (a rough sketch of the built-in logical-replication pieces is below).
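For the load-balancing item, here's a minimal Caddyfile sketch, not anyone's actual config: the hostname, the back-end addresses (192.0.2.x documentation range), and the /healthz path are placeholders. Caddy terminates TLS and prefers the first healthy Apache back-end, falling over when active health checks mark it down:

```
your-site-1.com {
    reverse_proxy 192.0.2.10:8080 192.0.2.11:8080 {
        # Prefer the first upstream; fall over only when it is unhealthy.
        lb_policy first

        # Active health checks so a dead back-end is taken out of rotation.
        health_uri      /healthz
        health_interval 10s
        health_timeout  2s
    }
}
```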
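The 'human-driven' failover and the 'keep the WAN IPs written down' advice are easy to script. Below is a minimal sketch assuming made-up hostnames and documentation-range IPs (203.0.113.x): it probes each site by name, then by raw IP over TCP 443, so you can tell whether the box, the network path, or DNS is the thing that broke.

```python
#!/usr/bin/env python3
"""Tiny reachability probe for the human-driven failover described above.

The hostnames and static WAN IPs below are placeholders: check each site by
name first, then by its written-down IP (in case DNS is the part that failed),
and tell the human which URL to point their browser at.
"""
import socket
import urllib.request

# (URL by name, static WAN IP) pairs -- assumed values, replace with your own.
ENDPOINTS = [
    ("https://your-site-1.com", "203.0.113.10"),
    ("https://your-site-2.com", "203.0.113.20"),
]


def http_up(url: str, timeout: float = 5.0) -> bool:
    """True if the URL answers with a successful HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except Exception:
        return False


def tcp_up(ip: str, port: int = 443, timeout: float = 5.0) -> bool:
    """True if a TCP connection to ip:port succeeds (works even if DNS is down)."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for url, ip in ENDPOINTS:
        if http_up(url):
            print(f"OK    {url}")
        elif tcp_up(ip):
            print(f"DNS?  {url} fails by name, but {ip}:443 still answers")
        else:
            print(f"DOWN  {url} (and {ip})")
```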
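On the last point: what PostgreSQL ships natively is logical replication (publications and subscriptions), which gives you one direction per pair of commands; 'master-master' means mirroring the setup in the other direction and then dealing with conflicts and replication loops yourself, or reaching for an extension, which is where the pain lives. A one-direction sketch with placeholder names and connection details:

```sql
-- On server 1 (postgresql.conf must have wal_level = logical):
CREATE PUBLICATION site1_pub FOR ALL TABLES;

-- On server 2 (connection values are placeholders):
CREATE SUBSCRIPTION site1_sub
    CONNECTION 'host=site1.example.org dbname=app user=replicator password=secret'
    PUBLICATION site1_pub;
```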