Session 20 - Todays Agenda and Training Opportunities

Session 20 - Todays Agenda and Training Opportunities
Slide Note
Embed
Share

"Session 20 featuring announcements, group discussions, and Blackboard orientation. Training opportunities include DE Rubric, Bb Collaborate classes, and IET Webinar Series. Register for online standards courses by OAE for professional development."

  • Session
  • Training
  • Blackboard
  • Online Courses
  • Webinar

Uploaded on Apr 04, 2025 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Edge computing (1) Content Distribution Networks Chen Qian Department of Computer Science and Engineering qian@ucsc.edu https://users.soe.ucsc.edu/~qian/

  2. Algorithmic Nuggets in Content Delivery Bruce M. Maggs Ramesh K. Sitaraman

  3. Overview Background Representative research Conclusion 3

  4. Web caches (proxy server) goal: satisfy client request without involving origin server user sets browser: Web accesses via cache browser sends all HTTP requests to cache object in cache: cache returns object else cache requests object from origin server, then returns object to client proxy server client origin server client origin server Application Layer 4

  5. More about Web caching cache acts as both client and server server for original requesting client client to origin server typically cache is installed by ISP (university, company, residential ISP) why Web caching? reduce response time for client request reduce traffic on an institution s access link When is cache not good? Every client of the ISP requests different content. Waste time on visiting cache server Application Layer 5

  6. Background Content delivery network (CDN) A geographically distributed network of proxy servers and their data centers. Distribute service spatially relative to end-users e.g. Service for DNS query 6

  7. Background Top-three objectives of CDN High reliability Fast and consistent performance Low operating cost Representative research on CDN Global load balancing Load balancing within a single cluster of servers Bloom filters for CDN Overlay routing Leader election and consensus 7

  8. Overview Background Representative research Conclusion 8

  9. Representative research on CDN Global load balancing Load balancing within a single cluster of servers Bloom filters for CDN Leader election and consensus 9

  10. Global Load Balancing Purpose: Map clients to the server clusters of the CDN. Clusters assignments are made at the granularity of map units. <IP address prefix, traffic class> e.g. <1.2.3.4/24, video> Question: How to assign each map unit ??,? ? ? to a server cluster ??,? ? ? 10

  11. Global Load Balancing Preference Each map unit has preferences for clusters, higher preference indicates better predicted performance. Each server cluster has preferences regarding which map units it would like to serve. Constraints Each map unit is associated with a demand ?. Each cluster has a notion of capacity ?. 11

  12. Global Load Balancing Stable allocations Blocking pairs: ?? prefers ?? over its current partner ?? & ?? prefers ?? over its current partner ?? Stable: There are no blocking pairs in the allocations ?1:(?2,?1,?3) ?1:(?2,?1,?3) ?2:(?1,?2,?3) ?2:(?2,?1,?3) ?3:(?2,?3,?1) ?3:(?1,?3,?2) 12

  13. Gale-Shapley algorithm Gale-Shapley algorithm is a distributed algorithm to find a stable allocations. (Propose-And-Reject algorithm) man(who proposes)-optimal map-unit-optimal 13

  14. Some Limitations Unequal number of map units and clusters More map units than clusters Partial preference lists Tens of millions of map units VS Thousands of clusters Rank for each map unit the top dozen clusters that are likely to provide the best performance Modeling integral demands and capacities A server cluster cannot be accurately modeled as a single resource with a single number capacity 14A Survey of the Stable Marriage Problem and Its Variants

  15. Resource Trees Bps: the rate at which data can be sent out of the cluster modeled. Fps: the capacity of non-network serve resources such as the processor, memory and disk. 50 Bps A Violation B 30 Fps 25 Fps Apps Video Web 1. 20 units of demand from a video map, each unit requires 0.25 Fps and 1 Bps. (5 Fps & 20 Bps) C E D 30 Fps 40 Fps 30 Fps 2. 26 units of demand from application map, each unit requires 1 Fps and 0.25 Bps. (26 Fps & 6.5 Bps) 5 Fps & 20 Bps 26 Fps & 6.5Bps 25 Fps 4 Fps 15

  16. Resource Trees If cluster has a higher preference for map units with application traffic than video traffic 50 Bps A Violation B 30 Fps 25 Fps Evict a lower preference map unit Apps Video Web e.g. 4 units of video demand (1 Fps) are evicted. C E D 30 Fps 40 Fps 30 Fps 5 Fps & 20 Bps 26 Fps & 6.5Bps 25 Fps 4 Fps 16

  17. Implementation Challenges Complexity and scale Tens of millions of map units Thousands of clusters Over a dozen traffic classes Time to solve Map unit assignment must be recomputed every 10 to 30 seconds Demand and capacity estimation Incremental and persistent allocation 17

  18. Representative research on CDN Global load balancing Load balancing within a single cluster of servers Bloom filters for CDN Leader election and consensus 18

  19. Problem Statement In a traditional hash table, objects from a universe ? are mapped to a set of buckets ?. In CDN, an object is a file such as a JPEG image or HTML page; a bucket is the cache of a distinct web server. Naive method: Use hash functions to directly map objects to buckets. If the servers fail? 19

  20. Problem Statement Solution 1: Simply remap objects in the lost bucket to another bucket One bucket stores double the expected load Solution 2: Renumber the existing buckets and rehash the elements using a new hash function Many objects will have to be transferred between buckets. 20

  21. Each object is mapped to the next bucket that appears in clockwise order on the unit circle. Consistent Hashing Server Object 21

  22. Consistent Hashing Improvements Map each bucket to multiple locations (instances) on the unit circle to improve the balance When a server fails All of the corresponding bucket s instances are removed from the unit circle The objects that were in the buckets are remapped to other buckets. 22

  23. Consistent Hashing Popular objects It is not possible for a single server within a cluster to satisfy all of the requests for a popular object Naive method: map a popular object to the next k servers that appear in clockwise order Problem: If two popular objects happen to hash to nearby positions, the buckets that they map to will highly overlap CDN approach Use a separate mapping of the buckets for each popular object 23

  24. Representative research on CDN Global load balancing Load balancing within a single cluster of servers Bloom filters for CDN Leader election and consensus 25

  25. Bloom filters for CDN Bloom filters are useful in two different contexts Content summarization Content filtering Content summarization Use Bloom filters to succinctly store the set of objects stored in a CDN server s cache Use counting Bloom filters to support elements update (deletions and insertions) 26

  26. Content Filtering Use Bloom filters to determine what objects to cache in the first place Motivation 74% of the roughly 400 million objects in cache were accessed only once (one-hit- wonders) 90% were accessed less than four times No need to cache one-hit- wonders. 27

  27. Content Filtering Cache-on-second-hit rule Use Bloom filters to store accessed objects Server checks Bloom filters to see whether the object has been accessed before Server caches the objects have been accessed before False positives The probability of false positives increases with more objects are added to a Bloom filter. Use two Bloom filters to circumvent the problem 28

  28. Content Filtering New objects Primary Bloom filter It reaches a threshold for maximum number of objects Then new objects Secondary Bloom filter Check both the primary and secondary Bloom filters to see if the object has been accessed in the recent past 29

  29. Content Filtering Benefits Byte hit rates increased when cache filtering was turned on 30

  30. Content Filtering Benefits Not having to store the one-hit-wonders in cache reduces the disk writes by nearly one-half 31

  31. Representative research on CDN Global load balancing Load balancing within a single cluster of servers Bloom filters for CDN Overlay routing Leader election and consensus 33

  32. Overview Background Representative research Conclusion 34

  33. Conclusion This paper explores some research problems in CDN. The purpose of the paper is to illustrate How research influenced the design of CDN How the system-building challenges inspired more research in CDN 35

  34. Thank You Chen Qian cqian12@ucsc.edu cqian12@ucsc.edu https://users.soe.ucsc.edu/~qian/ https://users.soe.ucsc.edu/~qian/

Related


More Related Content