
Parallel BFS Bag in Cilk++ for Efficient Graph Traversal
"Explore the efficient parallel Breadth-First Search algorithm in Cilk++, thanks to Charles E. Leiserson, for level-by-level graph traversal with associative reduce functions for optimized set merges."
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
CS 140 : Breadth-first search in Cilk++ Thanks to Thanks to Charles E. Charles E. Leiserson Leiserson for for some of some of these slides these slides 1
Breadth First Search Level-by-level graph traversal Serially complexity: (m+n) 1 2 3 4 Graph: G(E,V) E: E: Set of edges (size m) V: G(E,V) 5 6 7 8 V: Set of vertices (size n) 9 10 11 12 16 17 18 19 2
Breadth First Search Level-by-level graph traversal Serially complexity: (m+n) Level 1 1 1 2 3 4 5 6 7 8 9 10 11 12 16 17 18 19 3
Breadth First Search Level-by-level graph traversal Serially complexity: (m+n) Level 1 1 1 2 3 4 Level 2 5 2 5 6 7 8 9 10 11 12 16 17 18 19 4
Breadth First Search Level-by-level graph traversal Serially complexity: (m+n) Level 1 1 1 2 3 4 Level 2 5 2 5 6 7 8 Level 3 6 3 9 9 10 11 12 16 17 18 19 5
Breadth First Search Level-by-level graph traversal Serially complexity: (m+n) Level 1 1 1 2 3 4 Level 2 5 2 5 6 7 8 Level 3 6 3 9 9 10 11 12 Level 4 4 16 10 7 16 17 18 19 6
Breadth First Search Level-by-level graph traversal Serially complexity: (m+n) Level 1 1 1 2 3 4 Level 2 5 2 5 6 7 8 Level 3 6 3 9 9 10 11 12 Level 4 4 4 16 16 10 7 16 17 18 19 Level 5 11 8 17 7
Breadth First Search Level-by-level graph traversal Serially complexity: (m+n) Level 1 1 1 2 3 4 Level 2 5 2 5 6 7 8 Level 3 6 3 9 9 10 11 12 Level 4 4 4 16 16 10 7 16 17 18 19 Level 5 11 8 17 Level 6 18 12 8
Breadth First Search Who is parent(19)? If we use a queue for expanding the frontier? Does it actually matter? Level 1 1 1 2 3 4 Level 2 5 2 5 6 7 8 Level 3 6 3 9 9 10 11 12 Level 4 4 4 16 16 10 7 16 17 18 19 Level 5 11 8 17 Level 6 18 12 9
Parallel BFS Bag<T> has an associative reduce function that merges two sets Bag<T> has an associative reduce function that merges two sets Way #1: A custom reducer void BFS(Graph *G, Vertex root) { Bag<Vertex> frontier(root); while ( ! frontier.isEmpty() ) { cilk::hyperobject< Bag<Vertex> > succbag(); cilk_for (int i=0; i< frontier.size(); i++) { for( Vertex v in frontier[i].adjacency() ) { if( ! v.unvisited() ) succbag() += v; } } frontier = succbag.getValue(); } } operator+=(Vertex & rhs) also marks rhs visited operator+=(Vertex & rhs) also marks rhs visited 10
Parallel BFS Way #2: Concurrent writes + List reducer void BFS(Graph *G, Vertex root) { list<Vertex> frontier(root); Vertex * parent = new Vertex[n]; while ( ! frontier.isEmpty() ) { cilk_for (int i=0; i< frontier.size(); i++) { for( Vertex v in frontier[i].adjacency() ) { if ( ! v.visited() ) parent[v] = frontier[i]; } } ... } } An intentional data race An intentional data race How to generate the new frontier? How to generate the new frontier? 11
Parallel BFS Run cilk_for loop again Run cilk_for loop again void BFS(Graph *G, Vertex root) { ... while ( ! frontier.isEmpty() ) { ... hyperobject< reducer_list_append<Vertex> > succlist(); cilk_for (int i=0; i< frontier.size(); i++) { for( Vertex v in frontier[i].adjacency() ) { if ( parent[v] == frontier[i] ) { succlist.push_back(v); v.visit(); } } } frontier = succlist.getValue(); } } // Mark visited !v.visited() check is not necessary. Why? !v.visited() check is not necessary. Why? 12
Parallel BFS Each level is explored with (1) span Graph G has at most d, at least d/2 levels Depending on the location of root d=diameter(G) Work: Span: Work: Span: T1(n) = (m+n) T (n) = (d) T1(n) T (n) Parallelism: Parallelism: = ((m+n)/d) 13
Parallel BFS Caveats d is usually small d = lg(n) for scale-free graphs But the degrees are not bounded Parallel scaling will be memory-bound Lots of burdened parallelism, Loops are skinny Especially to the root and leaves of BFS-tree 14