
BGP Safety Challenges and Convergence Conditions
Explore the complexities of Border Gateway Protocol (BGP) in autonomous systems, focusing on safety challenges and convergence conditions. Learn about interdomain routing, business-driven policies, and the importance of safety verification for network operators.
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
BGP Safety with Spurious Updates The Conditions of BGP Convergence Martin Suchara in collaboration with: Alex Fabrikant and Jennifer Rexford January 12, 2011
Local Control vs. Global Properties The Internet is a network of networks ~35,000 independently administered ASes Competitive cooperation to find routes Local Control Interdomain policies, Intradomain routing Global Properties Stability, reliability, performance, etc. 2
Interdomain Routing Autonomous systems (AS) have different goals Different views on which path is best Interdomain routing: agree on a set of paths 2 1 4 3 Prefer 5346 to 5326 5 Prefer 326 to 346 6 Has to use : 5 3 2 6 3 client
The Border Gateway Protocol (BGP) ASes exchange information about paths I can reach d via AS 3 I can reach d 3 1 2 Data traffic d Policy configurations provided by AS operators Path selection: which path do I choose? Path export: which neighbors do I tell? 4
Business Driven Policies of ASes Customer-Provider Relationship Provider exports its customer s routes to everybody Customer exports provider s routes only to downstream customers Peer-Peer Relationship Export only customer routers to a peer Export peer routes only to customers 5
BGP Safety Challenges 35,000 ASes and 300,000 address blocks Flexible AS policies Routing convergence usually takes minutes But the system does not always converge Prefer 120 to 10 Prefer 210 to 20 1 2 Use 20 Use 210 Use 10 Use 120 0 6 d
Results on BGP Safety Safety verification important to network operators Absence of a dispute wheel sufficient for safety (Griffin, Shepherd, Wilfong, 2002) Necessary or sufficient conditions of safety (Gao and Rexford, 2001), (Gao, Griffin and Rexford, 2001), (Griffin, Jaggard and Ramachandran, 2003), (Feamster, Johari and Balakrishnan, 2005), (Sobrinho, 2005), (Fabrikant and Papadimitriou, 2008), (Cittadini, Battista, Rimondini and Vissicchio, 2009), 7
Models of BGP Existing models (variants of SPVP) Widely used to analyze BGP properties Simple but do not capture spurious behavior of BGP This work A new model of BGP with spurious updates Spurious updates have major consequences More accurate model makes proofs easier! 8
Overview I. Classical model of BGP: the SPVP II. Spurious BGP updates: what are they? III. The surprise: networks believed to be safe oscillate! IV. The consequences: applicability of earlier results V. Convergence conditions: polynomial time verifiable VI. Conclusion 9
SPVP Traditional Model of BGP (Griffin and Wilfong, 2000) The higher the more preferred Permitted paths 120 10 210 20 Always includes the empty path 1 2 0 The destination Network topology 10
SPVP Traditional Model of BGP (Griffin and Wilfong, 2000) 120 10 210 20 Selected paths 1 2 0 11
SPVP Traditional Model of BGP (Griffin and Wilfong, 2000) 120 10 210 20 Activation 1 2 0 Activation models the processing of BGP update messages sent by neighbors Vertex or edge activations 12
SPVP Traditional Model of BGP (Griffin and Wilfong, 2000) Switch to best available 120 10 210 20 Activation 1 2 0 Activation models the processing of BGP update messages sent by neighbors Vertex or edge activations 13
SPVP Traditional Model of BGP (Griffin and Wilfong, 2000) Switch to best available 120 10 210 20 Activation 1 2 0 System is safe if all fair activation sequences lead to a stable path assignment 14
Overview I. Classical model of BGP: the SPVP II. Spurious BGP updates: what are they? III. The surprise: networks believed to be safe oscillate! IV. The consequences: applicability of earlier results V. Convergence conditions: polynomial time verifiable VI. Conclusion 15
What are Spurious Updates? A phenomenon: router announces a route other than the highest ranked one Spurious BGP update 230: 230 1230 10 210 20 230 Selected path: 20 1 2 30 3 0 Behavior not allowed in SPVP 16
What Causes Spurious Updates? 1. Limited visibility to improve scalability Internal structure of ASes Cluster-based router architectures 2. Timers and delays to prevent instabilities and reduce overhead Route flap damping MRAI timers Grouping updates to priority classes Finite size message queues in routers 17
Cause 1 Limited Visibility The internal structure of ASes improves scalability while reducing visibility Autonomous system (AS) RR 5 Route reflector RR r1 r2 r3 r3 r1 r2 Router Route announcement A A B ri r3 r1 r2 r1 r3 1 2 3 4 After route r1 is withdrawn, router B temporarily announces r3 18
Cause 1 Limited Visibility The internal structure of ASes improves scalability while reducing visibility Autonomous system (AS) RR 5 Route reflector RR r1 r2 r3 Router Route announcement A A B ri r2 r1 r3 1 2 3 4 After withdrawals of routes r1 and r3, router B temporarily withdraws the route 19
Cause 2 Delays Route flap damping temporarily suppresses all routes learned from a neighbor Autonomous system (AS) 4 Router Route announcement A r1 r2 r3 r3 r1 r2 A ri r3r2 r3 r1 r2 2 3 1 After the update r2 r1 the less preferred route r3 is temporarily selected 20
DPVP A More General Model of BGP DPVP = Dynamic Path Vector Protocol Generalizes the earlier model (SPVP) Spurious update with a less preferred route that was recently available Spurious updates allowed in transient period after last route change Safety is independent of numerical value 21
DPVP A More General Model of BGP The permitted paths and their ranking Spurious update Selected path: 210 120 10 210 20 Remember all recently available paths (e.g. 20, 210) 20 1 2 StableTime = after last path change 0 Spurious updates are allowed only if current time < StableTime Spurious updates may include paths that were recently available or the empty path 22
DPVP A More General Model of BGP Behavior captured irrespective of cause Simple future-proof model independent of underlying network technologies For every allowed spurious behavior in DPVP we can find a possible cause Details in our technical report: TR-881-10, Dept. of Comp. Sci., Princeton, July 2010 www.cs.princeton.edu/~msuchara/publications.html 23
Overview I. Classical model of BGP: the SPVP II. Spurious BGP updates: what are they? III. The surprise: networks believed to be safe oscillate! IV. The consequences: applicability of earlier results V. Convergence conditions: polynomial time verifiable VI. Conclusion 24
Consequences of Spurious Updates Spurious behavior is temporary Tempting to conclude that it cannot have long term consequences The surprise: spurious behavior may trigger permanent oscillations! 25
The Surprise: Spurious Announcements Trigger Permanent Oscillations! Safe instance in all classical models of routing: 210 20 2 130 10 1 Stable outcome 0 3210 320 310 30 3 Permanent oscillations with spurious behavior 26
Example of Oscillation 210 20 RR 2 B 130 10 1 A 0 Autonomous system (AS) 1 3210 320 310 30 3 Route reflector RR Router A AS activation 1 27 Route announcement
Example of Oscillation 210 20 RR 2 B 130 10 1 A 0 Autonomous system (AS) 1 3210 320 310 30 3 Route reflector RR Router A AS activation 1 28 Route announcement
Example of Oscillation 210 Did not learn about withdrawal of 210 yet 210 20 RR 2 B 130 10 1 Does not know any route 210 A 0 Autonomous system (AS) 1 3210 320 310 30 3 Route reflector RR Router A AS activation 1 29 Route announcement
Example of Oscillation 210 Did not learn about withdrawal of 210 yet 210 20 RR 2 B 130 10 1 Does not know any route 210 A 0 Autonomous system (AS) 1 3210 320 310 30 3 Route reflector RR Router A AS activation 1 30 Route announcement
Example of Oscillation 210 Did not learn about withdrawal of 210 yet 210 20 RR 2 B 130 10 1 Does not know any route 210 A 0 Autonomous system (AS) 1 3210 320 310 30 3 Route reflector RR Router A AS activation 1 31 Route announcement
Example of Oscillation 20 210 20 RR 2 B 130 10 20 1 20 A 0 Autonomous system (AS) 1 3210 320 310 30 3 Route reflector RR Router A AS activation 1 32 Route announcement
Example of Oscillation 210 20 RR 2 B 130 10 1 A 0 Autonomous system (AS) 1 3210 320 310 30 3 Route reflector RR Router A AS activation 1 33 Route announcement
Example of Oscillation 210 20 RR 2 B 130 10 1 A 0 Autonomous system (AS) 1 3210 320 310 30 3 Route reflector RR Router A AS activation 1 34 Route announcement
Example of Oscillation 210 20 RR 2 B 130 10 1 A 0 Autonomous system (AS) 1 3210 320 310 30 3 Route reflector RR Router A AS activation 1 35 Route announcement
Example of Oscillation 210 20 RR 2 B 130 10 1 A 0 Autonomous system (AS) 1 3210 320 310 30 3 Route reflector RR Router A AS activation 1 36 Route announcement
Consequences of Spurious Updates Temporary behavior may cause permanent oscillations The number of oscillating nodes and / or frequency of oscillations may increase Which results do not hold in the new model? 37
Overview I. Classical model of BGP: the SPVP II. Spurious BGP updates: what are they? III. The surprise: networks believed to be safe oscillate! IV. The consequences: applicability of earlier results V. Convergence conditions: polynomial time verifiable VI. Conclusion 38
Which SPVP Results Hold in DPVP? Most previous results in SPVP also hold for DPVP Formal justification later in the talk Some results cannot be extended Slightly different conditions of convergence Exponentially slower convergence possible 39
Case Study I Different Conditions of Convergence Safety under filtering: is instance safe under any filtering? 3210 320 310 30 3 Filter arbitrary subset of routes Subgraph with specific properties Absence of a dispute reel necessary and sufficient for safety under filtering in SPVP (Cittadini et al., 2009) Our result: permanent oscillations in DPVP even without a reel 40
Case Study I Different Conditions of Convergence Example of a safe topology, Cittadini et al.: 320 1320 10 130 3210 30 320 1 3 130 0 210 2130 20 210 2 Spurious updates cause oscillations 41
Case Study II Exponential Slowdown of Convergence BGP converges in 2l + 2 phases (Sami et al., 2009) l: length of longest customer-provider chain Phase: each node processes and sends updates Assumes standard business relationships With spurious updates exponential slowdown to (2k + 1)l-2phases k: max. # of spurious updates after route change 42
Overview I. Classical model of BGP: the SPVP II. Spurious BGP updates: what are they? III. The surprise: networks believed to be safe oscillate! IV. The consequences: applicability of earlier results V. Convergence conditions: polynomial time verifiable VI. Conclusion 43
Convergence Conditions Absence of a dispute wheel sufficient for safety in SPVP (Griffin, Shepherd, Wilfong, 2002) One of the most cited results Absence of a dispute wheel is still sufficient for safety in DPVP Most of the previous results of the past decade still hold under DPVP! Other stronger results in DPVP next 44
Why are Proofs Easier in DPVP? No need to prove that: Announced route is the highest ranked one Announced route is the last one learned from the downstream neighbor We changed the problem PSPACE complete vs. NP complete Next the necessary and sufficient conditions 45
Necessary and Sufficient Conditions How can we prove a system may oscillate? Classify each node as stable or coy At least one coy node exists Prove that stable nodes must be stable Prove that coy nodes may oscillate Easy in a model with spurious announcements Next: a formal definition of a construction that captures this intuition 46
Necessary and Sufficient Conditions Definition:CoyOTEisatriple(C, S, ) satisfyingseveralconditions 1230 10 210 20 230 Coy nodes may make spurious announcements 1 2 30 Stable nodes have a permanent path 3 One path assigned to each node proves if the node is coy or stable 0 Theorem: DPVP oscillates if and only if it has a CoyOTE 47
Necessary and Sufficient Conditions Definition: CoyOTE satisfies these conditions: 1) The best stable path assigned to each stable node 1230 10 210 20 230 1 2 2) Coy node is assigned a coy path: more preferred than the best stable path consistent with the paths of stable nodes 30 3 0 = coy 3) Origin is stable 48 = stable
Verifying the Convergence Conditions = Finding a CoyOTE In general an NP-hard problem Compact inputs with regular expressions Can be checked in polynomial time for most reasonable network configurations! e.g. (i) filter paths violating business relationships (ii) prefer paths not containing certain AS numbers (iii) prefer paths from certain groups of neighbors (iv) prefer shorter paths over longer ones (v) prefer paths from a lowest AS number neighbor 49
DeCoy Safety Verification Algorithm Goal: verify safety in polynomial time Main idea: find the maximal stable set S by expanding it in a greedy fashion If all nodes are stable system is safe Otherwise system may oscillate 50