Timely Identification and Delivery of Trending Search Content to Mobile Users


PocketTrend delivers trending search content to mobile users ahead of time, improving real-time user experience and saving energy by reducing radio activations. A data-driven analysis of Bing search queries examines what to push, when to push it, and to whom, using events such as the 2012 US presidential election, the Boston Marathon bombing, and the 2013 papal election. The system targets the trend-related query spikes that the slides present as a limitation of PocketSearch.

  • Trending Searches
  • Mobile Users
  • Data Analysis
  • Search Content
  • User Experience

Uploaded on Mar 14, 2025 | 0 Views


Presentation Transcript


  1. PocketTrend: Timely Identification and Delivery of Trending Search Content to Mobile Users. Authors: Gennady Pekhimenko, Dimitrios Lymberopoulos, Oriana Riva, Karin Strauss, Doug Burger.

  2. Pocket Cloudlets [ASPLOS11]

  3. PocketSearch: local search versus web search. Stability of the search results: a small subset of queries covers most of the searches, e.g., a 55% hit rate with 2,500 search queries (1 MB of space). Repetitive queries from the same user.

  4. PocketSearch Limitations. [Figure: # of queries per hour (0 to 100,000) over time, 11/5/12 to 11/6/12, USA Presidential Elections, comparing total volume with volume without trend-related queries; the plot carries a 30% annotation.]

  5. Motivation for PocketTrend. User perspective: improve real-time user experience by delivering trending queries and content ahead of time; extend battery life by decreasing the number of radio activations. Data center perspective: avoid worst-case scenarios with higher-than-normal peaks; potential energy savings from servicing fewer queries.

  6. Data Analysis. Data-driven analysis of search queries from Bing users: 1 million unique users in the US, 2 months of data. Information available: user ID (encrypted and hashed), search query, full URLs visited (clicks), timestamp, and geographical location.

  7. What to Push? [Figure: % of total clicks (0% to 35%) versus search results (URLs) ranked 1 to 49, for the Boston Marathon, President Elections, and Pope Election.] Very few URLs cover most of the clicks.

  8. When to Push? [Figure: % of queries (0% to 10%) per hour of day for the Pope Election in Rome, 13/3/2013.] Small window for update: push immediately.

  9. When to Push? [Figure: % of queries (0% to 35%) per hour of day for the US Presidential Elections, 6-Nov-12.] Larger window for update: less aggressive pushes.

  10. Whom to Push? [Figure: % of users (0% to 50%) versus query volume (0 to > 90) for pairs of events: Marathon-Pope, President-Marathon, President-Pope.] The same users are interested in multiple trends.

  11. Whom to Push? (2) [Figure: % of users (70% to 100%) versus query volume (0 to > 90) for the Boston Marathon, President Elections, and Pope Election.] Users with a higher query volume are more likely to be interested in a trending event.

  12. PocketTrend: Analysis Summary. There is a window of several hours in which to start pushing the content. Target push receivers based on user search volume. A small subset of queries/URLs covers most of the accesses.

  13. Outline. Motivation & Background. PocketTrend: Data Analysis. PocketTrend: Implementation. Evaluation. Conclusion. Future Work.

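  14. PocketTrend: Key Idea. [Diagram: data flows through Trend Detection and Query Cache Formation (example queries: boston+bomb, boston+marathon, boston+explosion). The diagram is annotated with four questions: what to push, when to push, whom to push, and how to push (compression, delta encoding).]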

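  15. Step #1 out of 5 (pipeline: Trend Detection, Trend Identification). Trending Event Detection: Finding Trending Keywords. The counts for the current hour are compared with a reference hour to produce the trending words:

     Keyword    | Boston | Facebook | explosion | cnn  | ...
     Curr. hour | 150    | 4000     | 100       | 1100 | ...
     Ref. hour  | 80     | 3900     | 1         | 800  | ...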
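
The comparison on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name `find_trending_keywords` is hypothetical, and the thresholds (5x relative frequency, 100 queries absolute) are taken from the backup Trend Detection slide, read here as "at least", which is an assumption because the comparison symbols did not survive in the transcript.

```python
from typing import Dict, List


def find_trending_keywords(
    curr: Dict[str, int],
    ref: Dict[str, int],
    min_ratio: float = 5.0,   # assumed reading of the "5x" threshold
    min_count: int = 100,     # assumed reading of the "100 queries" threshold
) -> List[str]:
    """Return keywords whose current-hour count is unusually high."""
    trending = []
    for word, count in curr.items():
        # Treat unseen words as one reference occurrence to avoid dividing by zero.
        ref_count = max(ref.get(word, 0), 1)
        if count >= min_count and count / ref_count >= min_ratio:
            trending.append(word)
    return trending


# Sample counts copied from the slide.
curr_hour = {"boston": 150, "facebook": 4000, "explosion": 100, "cnn": 1100}
ref_hour = {"boston": 80, "facebook": 3900, "explosion": 1, "cnn": 800}
print(find_trending_keywords(curr_hour, ref_hour))
```

Under these assumed thresholds only "explosion" qualifies from the sample counts; "facebook" is frequent in both hours, so its high absolute count alone does not make it trending.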

  16. Step #2 out of 5 (pipeline: Trend Detection, Trend Identification). Trending Event Detection: Forming and Merging Trends. [Diagram: trending words are grouped into trends such as explosion + marathon and cnn + fox + news, and trends are then merged.]
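
A minimal sketch of the merge step, assuming (as the backup Trend Detection slide states) that trends sharing at least one word are merged. The function name `merge_trends` and the third example trend are hypothetical and only serve the illustration.

```python
from typing import List, Set


def merge_trends(trends: List[Set[str]]) -> List[Set[str]]:
    """Merge word sets that share at least one word."""
    merged: List[Set[str]] = []
    for trend in trends:
        current = set(trend)
        remaining = []
        for existing in merged:
            if existing & current:
                # Overlap found: absorb the existing trend into the current one.
                current |= existing
            else:
                remaining.append(existing)
        remaining.append(current)
        merged = remaining
    return merged


trends = [
    {"explosion", "marathon"},
    {"cnn", "fox", "news"},
    {"news", "boston", "marathon"},  # hypothetical trend that bridges the two
]
print(merge_trends(trends))
```

The first two sets stay separate until a set that overlaps both arrives, at which point all three collapse into a single trend.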

  17. Step #3 out of 5 (pipeline: Trend Detection, Trend Identification). Trending Content Identification: Forward Pass. Starting from the trending words, the forward pass over the search logs yields the trending queries and URLs. Example (query: URLs clicked): boston: url1, url2, url1; bomb+boston: url1, url2, url2.

  18. Step #4 out of 5 (pipeline: Trend Detection, Trend Identification). Trending Content Identification: Backward Pass. Starting from the trending queries and URLs, the backward pass yields an extended set of trending queries and URLs. Example (query: URLs clicked): bomb in boston: url1, ...; explosion at marathon: url1, ...
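
The forward and backward passes can be sketched as two scans over a click log. This is an illustration under stated assumptions rather than the authors' code: the log format, the function names, and the definition of a "strongly matching" query (every word of the query is a trending word) are all hypothetical, since the slides do not define them.

```python
from typing import Iterable, Set, Tuple

LogEntry = Tuple[str, str]  # (query, clicked URL); assumed log format


def forward_pass(log: Iterable[LogEntry], trend_words: Set[str]) -> Set[str]:
    """Collect URLs clicked from queries made up only of trending words."""
    urls = set()
    for query, url in log:
        words = query.replace("+", " ").split()
        if words and all(w in trend_words for w in words):
            urls.add(url)
    return urls


def backward_pass(log: Iterable[LogEntry], trending_urls: Set[str]) -> Set[str]:
    """Collect every query that led to a click on a trending URL."""
    return {query for query, url in log if url in trending_urls}


log = [
    ("boston", "url1"),
    ("boston", "url2"),
    ("bomb+boston", "url1"),
    ("bomb in boston", "url1"),
    ("explosion at marathon", "url1"),
    ("facebook login", "url9"),
]
trend_words = {"boston", "bomb", "explosion", "marathon"}

urls = forward_pass(log, trend_words)
queries = backward_pass(log, urls)
print(sorted(urls), sorted(queries))
```

The backward pass picks up queries such as "bomb in boston" that contain non-trending words but still lead to the same URLs, which is why it widens the cache beyond the forward-pass queries.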

  19. Step #5 out of 5 (pipeline: Trend Detection, Trend Identification). Trending Content Identification: Identify & Compress Cache Content. The trending queries and URLs are turned into the trending search content.

  20. Trend Detection Example: Boston Marathon Bombing. [Timeline (PDT) with marks at 11:49, 12:09, 12:11, 12:19, 13:00, and 14:00, showing the initial trend detection (boston, explosion, marathon) and the formation of Cache V.1 and Cache V.2.] Trending words: marathon, boston, explosion, news, fox, bomb, cnn.

  21. Typical Trends

     Trend Name                 | Duration   | # Trending Words | Trending Word List
     USA Presidential Elections | 60+ hours  | 5-53             | vote, election, polls, results, presidential
     Boston Marathon Bombing    | 120+ hours | 7-18             | boston, marathon, bomb, explosion
     Pope Election              | 10+ hours  | 2-8              | pope, elected, francis, new, cardinal, jorge
     Lil Wayne Hospitalization  | 30+ hours  | 3-4              | lil, wayne, hospitalization
     Father's Day               | 10+ hours  | 3-4              | fathers, happy, father's, day
     Gandolfini Death           | 20+ hours  | 3                | gandolfini, james, death
     4th July                   | 20+ hours  | 3                | 4th, july, fireworks
     San Francisco Plane Crash  | 10+ hours  | 3                | crash, plane, francisco

  22. Trend Development over Time. [Figure: number of trending keywords (0 to 20) over time, 4/15/13 to 4/21/13, Boston Marathon Bombing. Annotations mark keywords joining the trend: + bombing, Boston; + fbi, suspect; + jfk, explosions, library; + mit, watertown.]

  23. Different Update Strategies. Passive updates: update a user who arrives with any query by sending the whole cache. Pros: simple to implement and energy efficient (no additional radio activations). Cons: potential increase in bandwidth, and may be too slow to update some users in time. Active updates: send a cache to specific users, e.g., chosen by overall user search volume. Pros: highest hit rate. Cons: energy inefficient (additional radio activations).
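
The two strategies differ mainly in who triggers the transfer. The sketch below is a hedged illustration: the function names, the version counters, and the choice of `k` are hypothetical, and selecting the top-`k` users by search volume is just one way to read "specific users, e.g., chosen by overall user search volume".

```python
import heapq
from typing import Dict, List


def needs_passive_update(user_cache_version: int, current_version: int) -> bool:
    """Passive: piggyback the cache on a query the user is already sending."""
    return user_cache_version < current_version


def active_push_targets(query_volume: Dict[str, int], k: int) -> List[str]:
    """Active: pick the k users with the highest overall search volume."""
    return heapq.nlargest(k, query_volume, key=query_volume.get)


volumes = {"u1": 12, "u2": 95, "u3": 40, "u4": 3}
print(needs_passive_update(user_cache_version=1, current_version=2))
print(active_push_targets(volumes, k=2))
```

The passive check costs no extra radio activation because it runs only when a query arrives, whereas every user returned by the active selection requires a dedicated push.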

  24. Methodology. In-house infrastructure replays the sequence of search queries. Mobile volume: up to 100k queries per hour. The cache version is updated every hour; updating more frequently is possible in practice and should lead to a better cache hit ratio.

  25. Results: Presidential Elections. [Figure: # of requests per hour (30,000 to 100,000) over hours 12 to 23, for NoCaching, PT-UpdatesOnly, PT-5k, and PT-IdealCache.] The passive updates strategy is effective.

  26. Results: Boston Marathon. [Figure: # of requests per hour (40,000 to 70,000) over hours 11 to 21, for NoCaching, PT-UpdatesOnly, PT-5k, and PT-IdealCache.]

  27. Effectiveness Analysis. [Figure: eliminated requests per cache transfer (0.00 to 0.20) over time, 11/6/12 to 11/7/12, for PT-UpdatesOnly and PT-5k.] Passive updates usually eliminate more requests per cache transfer than active updates.

  28. Cache Effectiveness. How many users benefit from the trending cache? It depends on how long the event lasts: ~19.5% of users for the Boston Marathon Bombing and ~10.7% of users for the Presidential Elections. The passive update strategy (UpdatesOnly) is better in relative terms (%); the active update strategy (5K) is better in absolute numbers.

  29. Cache Size Sensitivity. [Figure: # of requests per hour (30,000 to 100,000) over hours 13 to 23, for NoCaching, PT-10, PT-50, PT-100, PT-1000, PT-Unlimited, and PT-IdealCache.]

  30. Conclusions. PocketTrend is a new system that effectively caches dynamically evolving trends (both search queries and dynamic web content) and benefits both mobile users and data centers. Most of the benefits are possible with minimal overhead: small storage for caches and minimal energy and bandwidth overheads.

  31. PocketTrend: Timely Identification and Delivery of Trending Search Content to Mobile Users. Authors: Gennady Pekhimenko, Karin Strauss, Dimitrios Lymberopoulos, Oriana Riva, Doug Burger.

  32. Trend Detection. Step #1: Detect the keywords that exceed their usual number of appearances in searches, using the relative frequency versus the same hour on the reference day (threshold: 5x) and absolute counters (threshold: 100 queries) for statistical significance. Step #2: Group together the words that are frequently searched together (threshold: 20% of the searches that include one word also include the second word in the same query); for example, marathon and explosion: 93%. Step #3: If multiple trends share a word, merge them; for example, news, cnn, fox is first detected as a separate trend but is later joined with the rest of the Boston Marathon bombing trend.
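
Step #2 above can be sketched as a pairwise co-occurrence count. The function name `cooccurring_pairs`, the whitespace tokenization, and the sample queries are hypothetical; accepting a pair when the 20% condition holds in either direction is also an assumption, since the slide does not say which word is the reference.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Set, Tuple


def cooccurring_pairs(
    queries: Iterable[str],
    trending: Set[str],
    min_fraction: float = 0.2,  # the 20% threshold from the slide
) -> List[Tuple[str, str]]:
    """Return pairs of trending words that are frequently searched together."""
    containing: Counter = Counter()  # queries that contain each trending word
    both: Counter = Counter()        # queries that contain both words of a pair
    for query in queries:
        present = sorted(set(query.split()) & trending)
        containing.update(present)
        both.update(combinations(present, 2))
    pairs = []
    for (a, b), n in both.items():
        # Assumption: the condition may hold relative to either word.
        if n / containing[a] >= min_fraction or n / containing[b] >= min_fraction:
            pairs.append((a, b))
    return sorted(pairs)


sample_queries = [
    "marathon explosion",
    "marathon explosion news",
    "marathon results",
    "cnn news",
    "explosion video",
]
print(cooccurring_pairs(sample_queries, {"marathon", "explosion", "cnn"}))
```

The resulting pairs can then be treated as word sets and merged whenever they share a word, as Step #3 describes.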

  33. Trend Detection - 2. Step #4: Evaluate the overall importance of the resulting set of words, based on the number of strongly matching queries (forward pass); thresholds: 1000 matching queries per hour, and matching queries making up 0.5% of all queries per hour. Step #5: For all strongly matching queries, find all resulting clicks, and then perform the backward pass to find all queries that lead to these clicks. Step #6: Form a corresponding cache for the trend.
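
The importance check in Step #4 reduces to two comparisons. In the sketch below the function name is hypothetical, and requiring both thresholds at once, each read as "at least", is an assumption: the transcript lost the comparison symbols and does not say whether the conditions are combined with "and" or "or".

```python
def is_significant_trend(
    matching_queries: int,
    total_queries: int,
    min_matching: int = 1000,   # matching queries per hour (from the slide)
    min_share: float = 0.005,   # 0.5% of all queries per hour (from the slide)
) -> bool:
    """Decide whether a candidate trend is important enough to cache."""
    if total_queries <= 0:
        return False
    share = matching_queries / total_queries
    return matching_queries >= min_matching and share >= min_share


print(is_significant_trend(1200, 100_000))  # 1.2% share, above both thresholds
print(is_significant_trend(1200, 400_000))  # 0.3% share, below the share threshold
print(is_significant_trend(400, 50_000))    # 0.8% share, but too few queries
```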

  34. Trend Development over Time - 2. [Figure: number of trending keywords (0 to 60) over time, 11/6/12 to 11/8/12, USA Presidential Elections.]

  35. Cache Size Sensitivity. [Figure: # of requests per hour (0 to 100,000) over time, 11/5/12 to 11/8/12, for IdealCache, cache sizes 10, 50, 100, 1000, and Unlimited, and NoCaching.]

  36. How about Web Content? Reuse Distance Study. [Figure: # of users (0 to 60,000) versus time in hours (1 to 43), USA Presidential Elections.] Users tend to use the trending cache within the first 10 hours, hence PocketSearch is not going to be effective for Web content.

  37. Comparison with PocketSearch. [Figure: # of requests per hour (0 to 100,000) over time, 11/5/12 to 11/8/12, for NoCaching, PocketSearch, PocketTrend, and PocketSearch+Trend.] PocketTrend helps when there is an active trend, both with and without PocketSearch.

  38. Comparison with PocketSearch (2). [Figure: # of requests per hour (0 to 80,000) over time, 4/15/13 to 4/18/13, Boston Marathon Bombing, for NoCaching, PocketSearch, PocketTrend, and PocketSearch+Trend.]

  39. Overhead Analysis. [Figure: cumulative # of cache transfers (0 to 700,000) over time, 11/6/12 to 11/7/12, for PT-UpdatesOnly and PT-5k.] The passive updates strategy is more efficient.

  40. Comparison to Prior Work. End-to-end system/evaluation.

  41. PocketTrend: Key Ideas. Detect the current trend based on unusual word frequencies, e.g., boston+marathon+bombing. Collect the top search queries and website clicks that belong to the trend and cache them. Deliver the cache to mobile phone users (either actively or lazily, with compression and diffing or delta encoding). Perform periodic updates with the new cache version (if needed).
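
Delivering a new cache version as a compressed delta can be sketched as follows. Everything here is illustrative: the cache layout (query mapped to a list of URLs), the function names, and the JSON wire format are hypothetical, and `zlib` is used only as a readily available stand-in for the XPRESS9 compressor named in the slides.

```python
import json
import zlib
from typing import Dict, List

Cache = Dict[str, List[str]]  # query -> ranked URLs; assumed cache layout


def make_delta(old: Cache, new: Cache) -> dict:
    """Describe how to turn the old cache version into the new one."""
    return {
        "set": {q: urls for q, urls in new.items() if old.get(q) != urls},
        "del": sorted(q for q in old if q not in new),
    }


def apply_delta(old: Cache, delta: dict) -> Cache:
    """Rebuild the new cache version on the device."""
    cache = {q: urls for q, urls in old.items() if q not in delta["del"]}
    cache.update(delta["set"])
    return cache


def pack(delta: dict) -> bytes:
    """Serialize and compress a delta (zlib stands in for XPRESS9)."""
    return zlib.compress(json.dumps(delta, sort_keys=True).encode("utf-8"))


def unpack(blob: bytes) -> dict:
    return json.loads(zlib.decompress(blob).decode("utf-8"))


v1 = {"boston marathon": ["url1", "url2"], "boston bomb": ["url1"]}
v2 = {"boston marathon": ["url1", "url3"], "boston explosion": ["url2"]}

blob = pack(make_delta(v1, v2))
restored = apply_delta(v1, unpack(blob))
print(restored == v2)
```

Only changed and removed entries travel over the radio, so a device that already holds the previous version downloads far less than the full cache when a trend evolves slowly.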

  42. Effect of Compression. Search queries: up to 5x with XPRESS9 level 12. Web links: up to 4.5x with XPRESS9 level 12.

  43. Future Work. Explore web-content opportunities. Delta encoding and diffing of web pages. Compression opportunities for similar web pages. Comparison with desktop traffic. Searching for more trending events over longer periods of time.
