How do you monitor and manage API performance at scale?

I’m exploring ways to improve API reliability and performance monitoring, especially in high-traffic systems such as video streaming platforms.

Would love to hear:

- What monitoring tools are you using (e.g., Prometheus, Postman, API Gateway logs)?
- How do you handle downtime alerts and performance logging?

I can share what we’ve implemented as well (rate limiting, gateway config, uptime monitoring) if helpful.
Looking forward to your thoughts!