toenail 11 hours ago
Dunno, I've had three-node clusters running very stable for years. Which issues did you have that required a full team?
PedroBatista 11 hours ago
Even most toy databases "built in a weekend" can be very stable for years if:

- No edge case is thrown at them
- No part of the system is stressed (software modules, OS, firmware, hardware)
- No plug is pulled

Crank the requests to 11 or import a billion rows of data with another billion relations and watch what happens. The main problem isn't the system refusing to serve a request or throwing "No soup for you!" errors; it's data corruption and/or wrong responses.
unethical_ban 11 hours ago
To be fair, I think it is chronically underprovisioned clusters that get overwhelmed by log forwarding. I wasn't on the team that managed the ELK stack a decade ago, but I remember our SOC having two people whose full-time job was curating the infrastructure to keep it afloat. Now I work for a company whose log storage product has ES inside, and it seems to shit the bed more often than it should - again, could be bugs, could be running "clusters" of 1 or 2 nodes instead of 3.