The Art of Fleet-Wide Kubernetes Observability
Core strategies for monitoring Kubernetes at scale, from identifying key metrics to implementing fleet-wide observability. Gain insights into actionable alerts, correlation techniques, and the future of observability.
Presentation Transcript
The Art of Fleet-Wide Kubernetes Observability: 3 Core Strategies
FOSDEM 2025, Monitoring and Observability Track
Pratik Panda, Site Reliability Engineer, Red Hat
Mitali Bhalla, Site Reliability Engineer, Red Hat
Agenda
What Is Fleet-Wide Observability?
The Observability Challenge at Scale
Looking at the 3 Strategies
  Metrics: Identifying What Matters
  Alerts: From Noise to Actionable Signals
  Correlation: Connecting the Dots
Implementing Fleet-Wide Observability
Looking Ahead: The Future of Observability
Q&A
Metrics: Identifying What Matters
Things That Mislead and How to Avoid Them
Overcollection: Focus on high-signal, high-value metrics.
Lack of Standardization: Adopt a consistent monitoring framework.
Ignoring Cardinality: Be selective with label usage.
Reactive Monitoring: Prioritize proactive, SLO-driven metrics.
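To illustrate why label selectivity matters for cardinality, here is a minimal Python sketch. It is not from the talk: the function name, the label names, and the per-label counts are all hypothetical, and the product of distinct label values is only an upper bound on the number of time series a single metric can produce.

```python
from math import prod


def estimated_series(label_values: dict[str, int]) -> int:
    """Upper bound on time series for one metric.

    Each key is a label name, each value is the number of distinct
    values that label can take. The worst case is their product.
    """
    return prod(label_values.values())


# Hypothetical label sets for one request-counter metric across a fleet.
lean = {"cluster": 50, "namespace": 20, "status_class": 5}
with_pod = {**lean, "pod": 200}  # adding a single high-churn label

print(estimated_series(lean))      # 5000
print(estimated_series(with_pod))  # 1000000
```

Under these assumed numbers, one extra per-pod label multiplies the worst-case series count by 200, which is the kind of growth that being selective with labels is meant to avoid.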
Alerts: From Noise to Actionable Signals
Correlation: Connecting the Dots
Implementing Fleet-Wide Observability
Leveraging the concepts of SLIs, SLOs, and SLAs
Benefits of an SLO-Driven Approach
Proactive Issue Management
Scalable Reliability Management
Consistency Across Services
Error Budget Framework
Alerting and Insights
Enhanced User Experience
Aligned Goals Across Teams
Key Focus on Customer Impact
Unified Metrics
Reduced Downtime
Prioritized Efforts
Continuous Improvement
Business-Driven Observability
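The error budget idea listed among these benefits can be made concrete with a small calculation. The Python sketch below is an illustration under stated assumptions, not the speakers' implementation: it assumes a request-based availability SLI (good requests over total requests), and the function names and sample numbers are hypothetical.

```python
def error_budget_remaining(slo_target: float, total: int, failed: int) -> float:
    """Fraction of the error budget left in the window.

    1.0 means nothing has been spent, 0.0 means the budget is exhausted,
    and a negative value means the SLO has been breached.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total == 0:
        return 1.0  # no traffic, so no budget consumed
    allowed_failures = (1.0 - slo_target) * total
    return 1.0 - failed / allowed_failures


def burn_rate(slo_target: float, total: int, failed: int) -> float:
    """Observed error rate divided by the error rate the SLO allows.

    A value above 1.0 means the budget is being spent faster than planned.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total == 0:
        return 0.0
    return (failed / total) / (1.0 - slo_target)


# Hypothetical window: 99.9% SLO, one million requests, 250 failures.
print(round(error_budget_remaining(0.999, 1_000_000, 250), 3))  # 0.75
print(round(burn_rate(0.999, 1_000_000, 250), 3))               # 0.25
```

Alerting on burn rate rather than on raw error counts is one common way to tie alerts to customer impact, since the same threshold then applies to any service that declares an SLO.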
Looking Ahead: The Future of Observability
AI-Driven Insights
Cloud Native and Cross Cloud Provider
Automated Remediation
Open Standards & Interoperability
Thank You!
linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHat