Enhancing Network Connectivity with Ceph Checker Solutions

The Network Connectivity Checker team (Manjunath Shettar, Jayashankar Tekkedatha, and Samhith Venkatesh) presents a connectivity checker for Ceph networks that performs point-to-point OSD connectivity checks and topology-aware mesh connectivity checks. The slides cover the initial approach, the design, and the implementation.

  • Network Connectivity
  • Ceph Checker
  • Point-to-Point
  • Mesh Connectivity
  • Design Strategies

Presentation Transcript


  1. Network Connectivity Checker. Team: Manjunath Shettar, Jayashankar Tekkedatha, Samhith Venkatesh

  2. Agenda: Overview, Design, Implementation, Demo

  3. Ceph Network

  4. Objective: Network Connectivity Checker
     • Point-to-point OSD connectivity check (front network and back network)
     • Topology-aware mesh connectivity check

  5. Initial Approach
     • Start a network server external to the OSD process during OSD init.
     • The physical node's connectivity can still be checked even if the OSD crashes.
     • Reason: the socket infrastructure is to be associated with the OSD.
     • We have to assume that if an OSD is down, the node is unreachable.
     • Other OSD processes ping the above socket to check connectivity.
     • Ceph daemon commands are employed to perform the ping test.
     • Reason: daemon commands are local to the OSD nodes.

  6. Design
     • Point-to-point check: the existing heartbeat mechanism is employed, and the ceph tell command is used for ping.
     • Mesh check: uses the Ceph OSD tree topology generated by CRUSH; a wrapper reuses the point-to-point checks for the mesh check.

  7. Design: Point-to-Point Check. Commands introduced:
     • ceph tell osd.<source-osd> nc_check ping <destination-osd>
     • ceph tell osd.<source-osd> nc_check ping_front <destination-osd>
     • ceph tell osd.<source-osd> nc_check ping_back <destination-osd>
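The three commands on slide 7 differ only in the subcommand, so a small helper can assemble them. The following is a minimal sketch, not the team's code: `nc_check_argv` is a hypothetical name, and passing the destination as a bare numeric OSD id is an assumption, since the slide does not show the exact form of `<destination-osd>`. Note that `nc_check` is the command introduced by this project; it is not assumed to exist in a stock Ceph build.

```python
from typing import List, Optional

# Subcommands named on slide 7. None selects the plain `ping` variant.
_SUBCOMMANDS = {None: "ping", "front": "ping_front", "back": "ping_back"}


def nc_check_argv(source_osd: int, dest_osd: int,
                  network: Optional[str] = None) -> List[str]:
    """Build the argv for one point-to-point check (hypothetical helper).

    Assumption: the destination is passed as a bare numeric OSD id.
    """
    if network not in _SUBCOMMANDS:
        raise ValueError(f"unknown network: {network!r}")
    return ["ceph", "tell", f"osd.{source_osd}", "nc_check",
            _SUBCOMMANDS[network], str(dest_osd)]


if __name__ == "__main__":
    # Print the three variants for a check from osd.1 to OSD 2.
    for net in (None, "front", "back"):
        print(" ".join(nc_check_argv(1, 2, net)))
```

Returning an argv list rather than a shell string lets a caller hand it straight to `subprocess.run` without quoting concerns.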

  8. Design: Point-to-Point Check. Message structure

  9. Design: Point-to-Point Check. Piggyback on the heartbeat infrastructure

  10. Design: Point-to-Point Check. Check back and front networks

  11. Design: Point-to-Point Check. Ping response

  12. General Ceph Topology
     • Hierarchy levels: Root, Datacenter, Room, Row, Rack, Host, OSD
     • The CRUSH hierarchy is aligned with the physical infrastructure.
     • Ceph allows creation of buckets to define the hierarchy.
     • ceph osd tree --format json-pretty

  13. Design: Mesh Check. Steps:
     • Obtain the CRUSH hierarchy using ceph osd tree --format json-pretty.
     • Parse the JSON output and traverse the entire topology.
     • Use the ceph tell osd.<id> version command to validate the status of each OSD.
     • Cross ping check between entities of the same level (both front and back networks).
     • Add each active OSD as a representative for its parent and its ancestors.
     • Assumption: OSDs once verified as active during the mesh check will not go down until it completes.
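The first two steps (obtain the hierarchy, parse and traverse it) can be sketched as follows. This is an illustration under stated assumptions, not the presenters' implementation: the embedded `SAMPLE` is a hand-written, trimmed stand-in for `ceph osd tree --format json-pretty` output, and the field names it relies on (`nodes`, `id`, `type`, `children`) are assumed to match what a real cluster prints, so verify them against your own output. The helper names are hypothetical.

```python
import json
from typing import Dict, List

# Hand-written sample shaped like the OSD tree JSON (an assumption; real
# output carries more fields per node, such as weights and status details).
SAMPLE = """
{
  "nodes": [
    {"id": -1, "name": "default", "type": "root", "children": [-2, -3]},
    {"id": -2, "name": "host-1", "type": "host", "children": [0, 1]},
    {"id": -3, "name": "host-2", "type": "host", "children": [2]},
    {"id": 0, "name": "osd.0", "type": "osd"},
    {"id": 1, "name": "osd.1", "type": "osd"},
    {"id": 2, "name": "osd.2", "type": "osd"}
  ],
  "stray": []
}
"""


def index_nodes(tree_json: str) -> Dict[int, dict]:
    """Map node id -> node record for constant-time child lookups."""
    return {node["id"]: node for node in json.loads(tree_json)["nodes"]}


def find_roots(nodes: Dict[int, dict]) -> List[int]:
    """Roots are the nodes that no other node lists as a child."""
    child_ids = {c for n in nodes.values() for c in n.get("children", [])}
    return [node_id for node_id in nodes if node_id not in child_ids]


def osds_under(nodes: Dict[int, dict], node_id: int) -> List[int]:
    """Depth-first traversal collecting the OSD ids below a bucket."""
    node = nodes[node_id]
    if node["type"] == "osd":
        return [node["id"]]
    found: List[int] = []
    for child_id in node.get("children", []):
        found.extend(osds_under(nodes, child_id))
    return found


if __name__ == "__main__":
    nodes = index_nodes(SAMPLE)
    for root in find_roots(nodes):
        print(nodes[root]["name"], osds_under(nodes, root))
```

In a live setup the JSON string would come from running the command (for example via `subprocess.run(..., capture_output=True)`) instead of the embedded sample.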

  14. Design: Mesh Check. Example topology (diagram): Room > Row > Rack-1, Rack-2 > Host-1 to Host-4 > OSD-1 to OSD-8

  15. Design: Mesh Check. Mesh check traverses to the OSD level and issues ceph tell osd.<id> version for the OSD process.

  16. Design: Mesh Check. OSD identified as active. Active children: 1

  17. Design: Mesh Check. Active children: 1, 2

  18. Design: Mesh Check. Performing cross ping check: ping check between active children.

  19. Design: Mesh Check. Ping check successful.

  20. Design: Mesh Check. OSD failure scenario. Active children: 4

  21. Design: Mesh Check. Performing cross ping check across hosts: ping check between active children.

  22. Design: Mesh Check. Ping successful. Active children: 1, 2, 4

  23. Design: Mesh Check. Mesh check in the other rack. Active children: 5, 6, 7, 8

  24. Design: Mesh Check. Performing cross ping check across racks. Active children: 1, 2, 4 and 5, 6, 7, 8

  25. Design: Mesh Check. Ping successful.

  26. Design: Mesh Check. Mesh check completes and reports OSD and link failures.

  27. Mesh Check: recursive function call

  28. Mesh Check: ping check across OSDs in a host

  29. Mesh Check: ping check across representative OSDs at a hierarchy level
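The transcript does not preserve the code shown on slides 27 to 29, so the following is only a sketch of how a recursive mesh check with representative OSDs might look. Several things here are assumptions: the `TOPOLOGY` mapping of two OSDs per host is inferred from the "Active children" lists in the walkthrough, the choice of the first active OSD as a subtree's representative is one simplified interpretation, and `is_active` and `ping` are hypothetical callables standing in for `ceph tell osd.<id> version` and the `nc_check` ping commands.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple, Union

Node = Union[str, int]  # bucket name or OSD id

# Example topology from the walkthrough slides. The OSD-to-host mapping
# is an assumption inferred from the "Active children" lists.
TOPOLOGY: Dict[str, List[Node]] = {
    "room": ["row"],
    "row": ["rack-1", "rack-2"],
    "rack-1": ["host-1", "host-2"],
    "rack-2": ["host-3", "host-4"],
    "host-1": [1, 2],
    "host-2": [3, 4],
    "host-3": [5, 6],
    "host-4": [7, 8],
}


def new_report() -> Dict[str, list]:
    return {"osd_failures": [], "link_failures": []}


def mesh_check(node: Node,
               topology: Dict[str, List[Node]],
               is_active: Callable[[int], bool],
               ping: Callable[[int, int], bool],
               report: Dict[str, list]) -> List[int]:
    """Return the active OSDs below `node`, recording failures in `report`."""
    if isinstance(node, int):  # leaf: a single OSD
        if is_active(node):
            return [node]
        report["osd_failures"].append(node)
        return []
    # Recurse first so every child subtree is checked before this level.
    groups = [mesh_check(child, topology, is_active, ping, report)
              for child in topology[node]]
    # One representative per child that still has an active OSD.
    reps = [group[0] for group in groups if group]
    # Cross ping between entities of the same level.
    for a, b in combinations(reps, 2):
        if not ping(a, b):
            report["link_failures"].append((a, b))
    # All active OSDs are carried up to the parent and its ancestors.
    return [osd for group in groups for osd in group]


if __name__ == "__main__":
    report = new_report()
    active = mesh_check("room", TOPOLOGY,
                        is_active=lambda osd: osd != 3,  # OSD 3 is down
                        ping=lambda a, b: True,          # all links healthy
                        report=report)
    print(active, report)
```

With OSD 3 marked down, the call returns [1, 2, 4, 5, 6, 7, 8], which matches the active children shown for Rack-1 and Rack-2 in the walkthrough, and the report lists OSD 3 as the only failure. At host level each child is a single OSD, so the same loop covers the ping check across OSDs in a host.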

  30. DEMO

  31. Thank you
