Google PageRank Algorithm - Understanding How Rankings Work

Google PageRank Algorithm - Understanding How Rankings Work
Slide Note
Embed
Share

From the PageRank algorithm to the implementation methods, delve into how Google ranks and displays search results. Discover the significance of links, influential vertices, convergence, and matrix solvers. Uncover insights through a toy example, exploring the intricacies of determining importance in graphs.

  • Google
  • PageRank
  • Algorithm
  • Ranking
  • Search

Uploaded on Feb 18, 2025 | 3 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Page Rank CSCI 476: Parallel Programming Question: How does Google Rank/Display Search Results?

  2. The PageRankAlgorithm Give ranks (scores) based on links to them A page that has: Links from many nodes high rank Link from a high-ranking node high rank

  3. The PageRankAlgorithm The PageRank algorithm can find important or influential vertices in a graph. Which node is more important or influential? 3

  4. The PageRankAlgorithm Node 0 passes its score to Node 2. Node 1 passes its score to Node 0 and Node 2. Node 2 passes its score to Node 1. Iteration 0 Iteration 1 0.333 0.333 0.191 0.475 0.333 0.333 4

  5. The PageRankAlgorithm When the score of every node does not change across iterations, we refer to it as the algorithm converged. Iteration k Iteration k+1 0.217 0.396 0.217 0.396 0.386 0.386 5

  6. The PageRankAlgorithm Adjacent matrix: Transition matrix: (row sums to 1) What if a node has zero outgoing links? Set each element of that row to 1/n, where n is the number of nodes in the graph. 6

  7. The PageRankAlgorithm PageRank algorithm can usually converge in 10 iterations. Two ways to implement Step 3: matrix solver for-loop solver (required) 7

  8. MatrixSolver Matrix implementation can be very fast. But it might not be applied to very large graphs due to the memory constraint. Use sparse matrix to store M. 8

  9. For-loopSolver Less efficient but scalable to very large graphs. You are required to implement the for-loop solver. 9

  10. A Toy Example -1 10

  11. A Toy Example -1 11

  12. A Toy Example -1 12

  13. A Toy Example -1 13

  14. A Toy Example -2 dangling user isolated user 14

  15. A Toy Example -2 15

  16. A Toy Example -2 16

  17. Bonus Task: Speed upSpark Hints: Eliminate repeated calculations in PageRank. Monitor your instances to make sure they are fully utilized. Develop a better understanding of RDD manipulations. Understand the "lazy" transformation in Spark. Spark is a tunable framework where there are many parameters that you can configure to make the best use of the resources. Be careful with repartition on your RDDs. 17

  18. PageRankEvaluation I have a sample datafile in a prescribed format [id1] [site1] [idN] [siteN] [idA] [idB] [idX] [idY] <# sites> <# links> Check if the sum of PageRank score is 1.0 at each iteration Bonus: tune your system to run as fast as possible 18

  19. Lab Handout Coming

More Related Content