Here's a basic comparison:
Many small boxes
Pros:
- Cost. Small boxes cost less. Spend less or spend more as needed. If one dies, cheaper to replace.
- Efficiency. Scaling *down* saves money when load is low. Can schedule specific loads to specific boxes.
- Redundancy. Multiple VMs, OSes, network paths, etc. No single point of failure.
- Zero-downtime. Rolling deployments and upgrades mean changes with no user impact.
- System bandwidth. More network links, CPUs, kernels, disks, etc. = more bandwidth and capacity.
- Performance resilience. A heavily loaded app on one server doesn't affect others.
- Immutability. Treating boxes as "cattle" rather than "pets" uses automation to reduce maintenance/instability.
- Scalability. When you run out of resources, adding more is easy, with zero impact on what is already running.
Cons:
- Does not work with applications that require more memory/CPU than one small box provides.
- Inefficient for apps that require shared filesystem access (as opposed to database).
- Requires smarter architecture to reduce long tail of cross-host calls.
- More transient network path failures, troubleshooting issues.
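To put a rough number on the "long tail of cross-host calls" point: if one request fans out to several hosts and must wait for all of them, the chance that at least one call is slow grows quickly with the fan-out. A minimal sketch, assuming the calls are independent and each has the same probability of being slow (the function name and the 1% figure are illustrative assumptions, not measurements):

```python
def prob_any_slow(fanout: int, p_slow: float) -> float:
    """Probability that at least one of `fanout` independent calls is slow.

    Assumption: every call is slow with the same probability `p_slow`,
    independently of the others.
    """
    return 1.0 - (1.0 - p_slow) ** fanout


if __name__ == "__main__":
    # Hypothetical: each host answers slowly 1% of the time.
    for fanout in (1, 10, 100):
        print(f"fan-out {fanout:>3}: {prob_any_slow(fanout, 0.01):.1%} of requests hit a slow call")
```

Under these assumptions, a request touching 100 hosts sees a slow call roughly 63% of the time, which is why designs built on many small boxes tend to limit fan-out, cache, or hedge requests.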
One big box
Pros:
- Allows applications that require large memory/CPU.
- More efficient for apps that share a filesystem.
- Simpler architecture.
- Fewer network path failures.
Cons:
- Large cost that you can't easily reduce as needed.
- Waste (in unused resources) unless load is constant.
- Single point of failure, for reliability and security.
- Upgrades require reboots. The app goes down, and the server might not boot up properly afterwards.
- Single network, CPU, kernel, disk(s), etc. become bottlenecks.
- A single heavily loaded process, excess interrupts, etc. can bring down the performance of the entire system.
- Often treated as "pet" rather than "cattle"; creates more maintenance, instability.
- Not scalable: once the box is maxed out, the only option is a bigger box.
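The cost and waste points can be illustrated with a back-of-the-envelope comparison. This is only a sketch under loud assumptions: the load profile is made up, pricing is assumed linear (a big box sized for peak costs the same per hour as the equivalent number of small boxes), and at least two small boxes are kept running for redundancy. All names here are hypothetical.

```python
import math


def small_box_hours(hourly_load, capacity_per_box, min_boxes=2):
    """Box-hours used when the fleet is resized every hour to fit the load.

    Assumption: scaling is instant and at least `min_boxes` always run.
    """
    return sum(
        max(min_boxes, math.ceil(load / capacity_per_box))
        for load in hourly_load
    )


def big_box_equivalent_hours(hourly_load, capacity_per_box):
    """Box-hours paid for when capacity is fixed at the peak all day."""
    peak_boxes = math.ceil(max(hourly_load) / capacity_per_box)
    return peak_boxes * len(hourly_load)


if __name__ == "__main__":
    # Hypothetical day: 16 quiet hours at 100 req/s, 8 busy hours at 800 req/s.
    load = [100] * 16 + [800] * 8
    capacity = 100  # req/s one small box can serve (assumed)

    print("small boxes:", small_box_hours(load, capacity), "box-hours")
    print("one big box:", big_box_equivalent_hours(load, capacity), "box-hours")
```

With this made-up profile, the scaled fleet uses half the box-hours of the peak-sized box. If the load were constant, the two numbers would converge, which matches the "unless load is constant" caveat above.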