This is one of those areas where the gap between theory and practice is enormous...

hedora · 2026-03-14T04:30:30 1773462630

One (expensive) vendor claimed to support this stuff, and it flat out didn't work. After lots and lots of escalations, the fix ended up being a massive Terraform diff for us to deploy to prod.

The problem had nothing to do with our network infrastructure, deployment, etc. It was completely on their end.

I get the impression this standard is 10x more complicated than it needs to be, but the stuff it replaced was 100x too complicated.

Honestly though, I feel like it's all just a big hack working around the inability of Splunk and similar systems to understand summary statistics.

Less politely: if I had shell on the telemetry servers, I could probably get the same functionality from a Perl one-liner that's easier to understand and maintain.