The Art of Fleet-Wide Kubernetes Observability


Core strategies for monitoring Kubernetes at scale, from identifying key metrics to implementing fleet-wide observability. Gain insights into actionable alerts, correlation techniques, and the future of observability.

  • Kubernetes
  • Observability
  • Monitoring
  • Scalability
  • Alerts

Uploaded on Apr 03, 2025 | 0 Views



Presentation Transcript


  1. The Art of Fleet-Wide Kubernetes Observability: 3 Core Strategies
     FOSDEM 2025, Monitoring and Observability Track
     Pratik Panda, Site Reliability Engineer, Red Hat
     Mitali Bhalla, Site Reliability Engineer, Red Hat

  2. Agenda
     • What is Fleet-Wide Observability?
     • The Observability Challenge at Scale
     • Looking at the 3 Strategies
         • Metrics: Identifying What Matters
         • Alerts: From Noise to Actionable Signals
         • Correlation: Connecting the Dots
     • Implementing Fleet-Wide Observability
     • Looking Ahead: The Future of Observability
     • Q&A

  3. What is Fleet-Wide Observability?

  4. The Observability Challenge at Scale

  5. Metrics: Identifying What Matters

  6. The Metrics That Matter in Kubernetes

  7. What Makes a Metric Useful?

  8. Things That Mislead and How to Avoid Them
     • Overcollection: Focus on high-signal, high-value metrics.
     • Lack of Standardization: Adopt a consistent monitoring framework.
     • Ignoring Cardinality: Be selective with label usage.
     • Reactive Monitoring: Prioritize proactive, SLO-driven metrics.
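
The cardinality point on slide 8 can be made concrete with a rough calculation. The sketch below is not from the presentation; it assumes a single metric whose series count is bounded by the product of the distinct values of each label, and the label names and counts are made-up illustrative numbers.

```python
from math import prod


def max_series(label_values: dict[str, int]) -> int:
    """Upper bound on time series for one metric.

    Each key is a label name and each value is the number of distinct
    values that label can take. The bound is the product of those counts.
    """
    return prod(label_values.values())


# Hypothetical fleet: 200 clusters, 50 namespaces each, 5 status codes.
base_labels = {"cluster": 200, "namespace": 50, "status_code": 5}

# Same metric with an extra per-pod label (1000 distinct pods, assumed).
with_pod_label = {**base_labels, "pod": 1000}

print(max_series(base_labels))     # 50000
print(max_series(with_pod_label))  # 50000000
```

Adding one high-cardinality label multiplies the worst-case series count by the number of values that label can take, which is why being selective with labels matters more as the fleet grows.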

  9. Alerts: From Noise to Actionable Signals

  10. What Makes an Alert Actionable?

  11. Journey to Alerting Effectiveness

  12. Correlation: Connecting the Dots

  13. Logs and Traces: The Context and The Journey

  14. The Complete Fleet Observability Picture

  15. Implementing Fleet-Wide Observability: Leveraging the concepts of SLIs, SLOs, and SLAs

  16. Setting SLOs for key services

  17. Benefits of an SLO-Driven Approach
     • Proactive Issue Management
     • Scalable Reliability Management
     • Consistency Across Services
     • Error Budget Framework
     • Alerting and Insights
     • Enhanced User Experience
     • Aligned Goals Across Teams
     • Key Focus on Customer Impact
     • Unified Metrics
     • Reduced Downtime
     • Prioritized Efforts
     • Continuous Improvement
     • Business-Driven Observability
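
The error budget idea listed on slide 17 can be illustrated with a small calculation. This is a minimal sketch and not the speakers' implementation; it assumes a request-based availability SLO, and the function names, the 99.9% target, and the request counts are hypothetical.

```python
def error_budget(slo_target: float, total_requests: int) -> float:
    """Number of requests that may fail in the window without breaching the SLO."""
    return (1.0 - slo_target) * total_requests


def budget_remaining(slo_target: float, total_requests: int, failed: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget(slo_target, total_requests)
    if budget == 0:
        return 0.0  # a 100% target leaves no budget to spend
    return (budget - failed) / budget


def burn_rate(slo_target: float, observed_error_ratio: float) -> float:
    """How fast the budget is being consumed; 1.0 means exactly on budget."""
    return observed_error_ratio / (1.0 - slo_target)


# Assumed example: 99.9% SLO, 1,000,000 requests in the window, 250 failures.
slo = 0.999
print(error_budget(slo, 1_000_000))           # about 1000 failed requests allowed
print(budget_remaining(slo, 1_000_000, 250))  # about 0.75 of the budget left
print(burn_rate(slo, 0.01))                   # about 10x the sustainable rate
```

Alerting on burn rate rather than on raw error counts is one common way to connect an SLO to actionable alerts, since the same threshold can be applied consistently across services with different traffic volumes.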

  18. Looking Ahead: The Future of Observability
     • AI-Driven Insights
     • Cloud Native and Cross Cloud Provider
     • Automated Remediation
     • Open Standards & Interoperability

  19. Q&A

  20. Thank You!
     • linkedin.com/company/red-hat
     • youtube.com/user/RedHatVideos
     • facebook.com/redhatinc
     • twitter.com/RedHat
