m000 16 hours ago

Deploying on Kubernetes using Helm solves a lot of these cases: migrations are run at the init stage of the pods. If successful, pods of the new version are started one by one, while the pods of the old version are shut down. For a short period, you have pods of both versions running.
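
To make the "both versions running" window concrete, here is a toy simulation of a one-at-a-time rollout. It is only a sketch: `rolling_update` and the version labels are made up for illustration, and it ignores real Deployment settings such as `maxSurge` and `maxUnavailable`.

```python
def rolling_update(replicas: int, old: str = "v1", new: str = "v2"):
    """Yield the pod versions after each step of a one-at-a-time rollout."""
    pods = [old] * replicas
    yield list(pods)  # state before the rollout starts
    for i in range(replicas):
        # A new-version pod becomes ready and one old-version pod is terminated.
        pods[i] = new
        yield list(pods)


steps = list(rolling_update(3))
# Steps where old and new code serve traffic against the same (already migrated) schema.
mixed_steps = [s for s in steps if len(set(s)) > 1]

for step in steps:
    print(step)
```

Every entry in `mixed_steps` is a moment where old code talks to the new schema, which is where the compatibility question comes from.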

When you add new stuff or make benign modifications to the schema (e.g. add an index somewhere), you won't notice a thing.

If the introduced schema changes are not compatible with the old code, you may get a few ProgrammingErrors raised from the old pods before they are replaced, which is usually acceptable.
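
A rough way to reproduce that failure mode without a cluster, using only the standard library `sqlite3` module instead of Django. The table, the column rename and both lookup functions are hypothetical. SQLite reports the missing column as `sqlite3.OperationalError`; with PostgreSQL behind Django you would typically see it surface as a `ProgrammingError` instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Schema as it looks AFTER a hypothetical migration renamed "name" to "full_name".
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT)")
conn.execute("INSERT INTO users (full_name) VALUES ('Ada')")


def old_code_lookup(conn):
    # The old release still selects the pre-migration column.
    return conn.execute("SELECT name FROM users").fetchall()


def new_code_lookup(conn):
    # The new release matches the migrated schema.
    return conn.execute("SELECT full_name FROM users").fetchall()


try:
    old_code_lookup(conn)
    old_failed = False
except sqlite3.OperationalError as exc:
    old_failed = True
    print("old pod:", exc)

print("new pod:", new_code_lookup(conn))
```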

There are still some changes that may require planning for downtime, or some other sort of special handling. E.g. upgrading a SmallIntegerField to an IntegerField in a frequently written table with millions of rows.
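
One common shape for that "special handling" is to add a wider column next to the old one and backfill it in small batches, so no single statement holds the table for long. The sketch below assumes that approach (it is not something Django or Helm does for you), uses `sqlite3` purely so it runs anywhere, and invents the `events` table, the `hits_wide` column and the `backfill` helper. It leaves out the parts that matter in production: keeping both columns in sync for rows written during the backfill, and the final column swap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, hits SMALLINT)")
conn.executemany(
    "INSERT INTO events (hits) VALUES (?)", [(i % 100,) for i in range(1000)]
)

# Step 1: add the wider column alongside the old one (nullable, so it is cheap).
conn.execute("ALTER TABLE events ADD COLUMN hits_wide INTEGER")


def backfill(conn, batch_size=100):
    """Copy hits into hits_wide one id range at a time; return the batch count."""
    max_id = conn.execute("SELECT COALESCE(MAX(id), 0) FROM events").fetchone()[0]
    batches = 0
    for start in range(1, max_id + 1, batch_size):
        conn.execute(
            "UPDATE events SET hits_wide = hits "
            "WHERE id >= ? AND id < ? AND hits_wide IS NULL",
            (start, start + batch_size),
        )
        conn.commit()  # keep each transaction short
        batches += 1
    return batches


batches = backfill(conn)
remaining = conn.execute(
    "SELECT COUNT(*) FROM events WHERE hits_wide IS NULL"
).fetchone()[0]
print(batches, remaining)
```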

ndr 16 hours ago

Without care, the new schema will make old code fail user requests; that is not zero downtime.

m000 14 hours ago

A request not being served can happen for a multitude of reasons (many of them totally beyond your control) and the web architecture is designed around that premise.

So, if some of your pods fail a fraction of the requests they receive for a few seconds, this is not considered downtime for 99% of the use cases. The service never really stopped serving requests.
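
Back-of-the-envelope version of that argument. All the numbers here are invented for illustration (request rate, length of the deploy window, share of requests that fail during it), and `success_ratio` is just a helper for this sketch.

```python
def success_ratio(total_requests: int, failed_requests: int) -> float:
    """Fraction of requests that were served successfully."""
    return 1 - failed_requests / total_requests


# Hypothetical traffic profile.
requests_per_second = 200
deploy_window_seconds = 10
failure_fraction = 0.05  # share of requests failing while old pods drain

window_total = requests_per_second * deploy_window_seconds
failed = round(window_total * failure_fraction)

# Same failures, viewed inside the deploy window and over a 30-day month.
window_ratio = success_ratio(window_total, failed)
month_total = requests_per_second * 30 * 24 * 3600
month_ratio = success_ratio(month_total, failed)

print(f"during deploy: {window_ratio:.4%}")
print(f"over 30 days:  {month_ratio:.6%}")
```

With these assumed figures a single deploy barely moves the monthly success rate, although the picture changes if you deploy many times a day or the incompatible window lasts longer.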

The problem is not unique to Django by any means. If you insist on being a purist, sure, count it as downtime. But you will have a hard time even measuring it.