Development of Monitoring Service for BM@N Information Systems


"Explore the implementation and features of a monitoring service for BM@N information systems, ensuring stability, reliability, and efficient resource usage. Learn about automated configuration generation, tool deployment, and the architecture of the interconnection between systems."

  • Monitoring Service
  • BM@N Systems
  • Information Systems
  • Development
  • Technology


Presentation Transcript


  1. The XXVII International Scientific Conference of Young Scientists and Specialists (AYSS-2023). Development of Monitoring Service for BM@N information systems. O. Nemova, P. Klimai, K. Gertsenberger. November 2023

  2. Brief BM@N infrastructure description
     • Key features of the BM@N information systems: Big Data systems, stability is critical, collection of information is expensive, immediate alerting is needed.
     • Requirements for these systems: high stability and reliability, quick alerting of failures, flexible usage of resources, easy configuration, reduced time to repair, etc.
     • Some of these requirements may be partially met by implementing a Monitoring Service for the BM@N information systems.

  3. Implemented features for the monitoring service (for checking stability and reliability)
     • Endpoint state: network interfaces (ping checks, Telegraf data); memory (time series, Telegraf data); disk (time series, Telegraf data); CPU (time series, Telegraf data).
     • Endpoint availability: deployed database (e.g. PostgreSQL), using the TIG (InfluxDB + Telegraf + Grafana) stack, health checks via Python (InfluxDB time series), alerting via Grafana and an SMTP server.
     • Service availability: web interfaces, checked with HTTP requests (e.g. a GET request).

  4. Automated configuration generation (JSON)
     • Rendering tools are used for flexible configuration of the monitoring tools (e.g. Grafana).
     • A Python script was implemented for the rendering.
     • Alerts and dashboards are currently configured for the environment using Jinja2.

  5. Deployment of the tools for the monitoring service

  6. Architecture of the interconnection between BM@N monitoring system and information systems

  7. BM@N monitoring clients view (Grafana dashboard: Default)

  8. [Slide without text]

  9. BM@N monitoring alerting (Client mail)

  10. Directions for improvement
     • [Experiment] Swapping the health check service with deploying.
     • Friendly configuration: a unified, user-friendly configuration file format (YAML). Having many configuration files is undesirable (good examples: Avito actions or GitHub Actions).
     • Increasing flexibility for: other database types (besides PostgreSQL), other information system parts, the OS type of deployed endpoints.

  11. Stack (directions for improvement)
     • For more comprehensive reporting, the following could be considered: the ELK stack (Elasticsearch + Logstash + Kibana) [good for text information], the TICK stack (Telegraf + InfluxDB + Chronograf + Kapacitor), etc.
     • Agents: [good in Grafana bundle].
     • Virtual machines: [Experiment] containers (migrating to Docker Compose [additional DOCKER HOST configuration]).

  12. Conclusions
     • The monitoring service for BM@N information systems has been implemented.
     • The service provides comprehensive information on the deployed endpoints of the information systems.
     • Information about the endpoints is obtained continuously.
     • Email alerting has been implemented.
     • Directions for improving the service have been highlighted.

  13. Thank you for your attention!

  14. Abstract. The software infrastructure of the BM@N experiment contains a set of information systems that are essential for working with experimental or simulated data at all processing stages, including collection, storage, intermediate processing and physics analysis. Examples of such systems are the Electronic Logbook Platform, the Condition Database and the Event Metadata System. If one of these systems stops functioning, work with BM@N data by collaboration members becomes either impossible or, at least, much less productive. Timely detection of possible failures in the systems, caused by software or hardware faults, is therefore important. The Monitoring Service described in the report is used to check the availability and health status of the information systems. This includes measuring, storing, visualizing and sending alert notifications on monitored parameters, such as CPU, memory and disk utilization, DBMS functioning parameters, response times of databases and API endpoints, ping round-trip times, and so on. The current implementation of the BM@N monitoring service is discussed in detail. A related task of building highly available information services is also briefly noted.
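
The service-availability check from slide 3 (an HTTP GET request whose result is stored as an InfluxDB time series) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `check_http` and `to_line_protocol`, the measurement name `http_check` and any URL passed in are hypothetical, and the sketch only formats a point in InfluxDB line protocol instead of writing it to a database.

```python
import time
import urllib.error
import urllib.request


def check_http(url, timeout=5.0):
    """Send one GET request; return (is_up, status_code, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # the server answered, but with an error status
    except OSError:
        status = 0  # no answer at all: DNS failure, refused connection, timeout
    elapsed = time.monotonic() - start
    return 200 <= status < 400, status, elapsed


def _escape_key(text):
    """Escape characters that are special in tag keys, tag values and field keys."""
    return str(text).replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def _format_field(value):
    """Render a field value using the line protocol type syntax."""
    if isinstance(value, bool):  # must come before int: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an 'i' suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # strings are double-quoted


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build one line: measurement,tag=value field=value timestamp.

    Assumes the measurement name itself contains no commas or spaces.
    """
    tag_part = "".join(
        f",{_escape_key(key)}={_escape_key(value)}"
        for key, value in sorted(tags.items())
    )
    field_part = ",".join(
        f"{_escape_key(key)}={_format_field(value)}"
        for key, value in fields.items()
    )
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"
```

A periodic job could call `check_http` for each web interface, pass the returned values to `to_line_protocol` together with `time.time_ns()`, and hand the resulting line to whatever writes into InfluxDB; a Grafana alert rule on the `up` field would then cover the alerting step described in the abstract.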
