▲ | the_duke 9 hours ago |
I'm curious: are many people here actually still running mainline Prometheus rather than one of the numerous compatible solutions that are more scalable and have better storage? (Mimir, Victoria, Cortex, OpenObserve, ...)
▲ | robinhoodexe 8 hours ago | parent | next [-]
We’re running standard Prometheus on Kubernetes (14 on-prem Talos clusters, for a total of 191 nodes, 1.1k CPU cores, 4.75TiB memory and 4k pods). We use Thanos to store metrics in self-hosted S3 (seaweedfs) with 30 days retention, downsampling aggressively after 3 days. It works pretty well tbh. I’m excited about upgrading to version 3, as it does take a lot of resources to keep going, especially on clusters with a lot of pods being spawned all the time.
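A retention/downsampling policy like the one described above is typically expressed as Thanos compactor retention flags. This is a sketch, not the poster's actual config: the flag names are real Thanos flags, but the paths and the objstore config file are assumptions, and Thanos downsamples on its own schedule (raw → 5m → 1h) rather than at an exact 3-day mark.

```shell
# Sketch of a Thanos compactor invocation (durations illustrative).
# Raw samples are dropped after 3 days; downsampled series are kept 30 days.
thanos compact \
  --data-dir=/var/thanos/compact \
  --objstore.config-file=/etc/thanos/s3.yaml \
  --retention.resolution-raw=3d \
  --retention.resolution-5m=30d \
  --retention.resolution-1h=30d
```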
▲ | aorth 7 hours ago | parent | prev | next [-]
Using Victoria Metrics here. Very easy to set up and run. I monitor under 100 hosts; resource usage is low and performance is good. One gripe is that they recently stopped publishing tarballs for LTS versions, which caused some grumbling in the community. Fair enough since they are developing for free, but it felt like a bait and switch.
▲ | majewsky 6 hours ago | parent | prev | next [-]
Regular Prometheus inside clusters for collection and alerting, Thanos for cross-cluster aggregation and long retention.
▲ | never_inline 7 hours ago | parent | prev | next [-]
I'm curious to hear from people on this forum: at what point do you practically hit the limits of Prometheus, where straightforward division (e.g. a separate Prometheus per cluster or environment) no longer works?
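Before hitting that wall, one common "straightforward division" technique is hashmod relabeling, which splits scrape targets across N Prometheus replicas. A sketch of one shard's config, assuming 2 shards and a hypothetical `node` job (`source_labels`, `modulus`, `hashmod` and `keep` are real Prometheus relabel fields):

```yaml
# prometheus.yml fragment for shard 0 of 2 (values illustrative).
scrape_configs:
  - job_name: node
    relabel_configs:
      # Hash each target address into one of 2 buckets...
      - source_labels: [__address__]
        modulus: 2
        target_label: __tmp_shard
        action: hashmod
      # ...and keep only bucket 0 on this server; shard 1 keeps regex "1".
      - source_labels: [__tmp_shard]
        regex: "0"
        action: keep
```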
▲ | raffraffraff 9 hours ago | parent | prev | next [-]
Nope. Mimir. Before that, Thanos.
▲ | rad_gruchalski 8 hours ago | parent | prev [-]
Mimir |