Scaling Distributed Machine Learning with the Parameter Server

Explore the unique architecture of a Parameter Server in distributed machine learning, comparing it to traditional KV stores, leveraging domain knowledge, and treating key-value pairs as sparse matrices. Discover how user-defined functions and flexible consistency play crucial roles in optimizing data aggregation and updates within the system.

  • Machine Learning
  • Distributed Systems
  • Parameter Server
  • Sparse Matrices
  • Flexible Consistency


Presentation Transcript


  1. Scaling Distributed Machine Learning with the Parameter Server. By M. Li, D. Andersen, J. Park, A. Smola, A. Ahmed, V. Josifovski, J. Long, E. Shekita, and B. Su. EECS 582 W16

  2. Outline: Motivation • Parameter Server architecture • Why is it special? • Evaluation

  3. Motivation

  4. Motivation

  5. Motivation

  6. Parameter Server

  7. Parameter Server: How is this different from a traditional KV store?

  8. Key-Value Stores (diagram: Key-Value Store, Clients)

  9. Differences from a KV store (leveraging domain knowledge):
     • Treat KVs as sparse matrices
     • Computation occurs on the PSs
     • Flexible consistency
     • Intelligent replication
     • Message compression

  10. KVs as Sparse Matrices

  11. KVs as Sparse Matrices
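Because parameters are addressed by ordered keys, a server shard can treat its (key, value) pairs as a slice of a sparse vector or matrix and serve whole ranges in a single request. A minimal Python sketch of that idea (the class and method names here are illustrative, not the paper's actual API):

```python
# Minimal sketch: a server shard stores (key -> value) pairs but exposes
# them as a sparse vector segment, so updates and pulls can happen in bulk.
class SparseShard:
    def __init__(self):
        self.weights = {}          # key (int) -> value (float); zeros omitted

    def pull_range(self, lo, hi):
        """Return all non-zero entries with lo <= key < hi as one message."""
        return {k: v for k, v in self.weights.items() if lo <= k < hi}

    def push_range(self, updates):
        """Apply a batch of (key, delta) pairs, e.g. one gradient segment."""
        for k, delta in updates.items():
            self.weights[k] = self.weights.get(k, 0.0) + delta

shard = SparseShard()
shard.push_range({3: 0.5, 1000: -0.2})   # a sparse gradient segment
print(shard.pull_range(0, 4096))          # {3: 0.5, 1000: -0.2}
```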

  12. User-defined Functions: PSs aggregate data from the workers. In distributed gradient descent, workers push gradients that the PSs use to update the parameters. Users can also supply user-defined functions to run on the server.
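As a rough illustration of that split, the sketch below has workers push gradients while the server applies a user-supplied update function to the aggregated result. The class name, `sgd_update`, and the fixed learning rate are assumptions for illustration, not the paper's interface.

```python
import numpy as np

def sgd_update(weights, aggregated_grad, lr=0.1):
    """User-defined function run on the server: a plain SGD step."""
    return weights - lr * aggregated_grad

class ParameterServer:
    def __init__(self, dim, update_fn):
        self.weights = np.zeros(dim)
        self.update_fn = update_fn
        self.pending = []                     # gradients pushed this round

    def push(self, grad):
        self.pending.append(grad)

    def apply_round(self):
        """Aggregate worker gradients, then let the UDF update the weights."""
        aggregated = np.sum(self.pending, axis=0)
        self.weights = self.update_fn(self.weights, aggregated)
        self.pending.clear()

    def pull(self):
        return self.weights.copy()

ps = ParameterServer(dim=4, update_fn=sgd_update)
ps.push(np.array([0.0, 1.0, 0.0, -2.0]))      # worker 1's gradient
ps.push(np.array([1.0, 0.0, 0.0,  0.0]))      # worker 2's gradient
ps.apply_round()
print(ps.pull())
```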

  13. Flexible Consistency

  14. Flexible Consistency
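In the paper, flexible consistency lets each algorithm choose how stale a worker's view of the parameters may be, ranging from sequential (wait for every push to finish) through bounded delay to eventual (never wait). A rough sketch of a bounded-delay check, with names assumed for illustration:

```python
# Rough sketch of bounded-delay consistency: a worker may start iteration t
# only if every iteration up to t - tau has been acknowledged by the servers.
# tau = 0 behaves like sequential consistency; a very large tau behaves like
# eventual consistency.
def may_start(iteration, acked_iterations, tau):
    oldest_unacked = min(
        (t for t in range(iteration) if t not in acked_iterations),
        default=iteration,
    )
    return iteration - oldest_unacked <= tau

acked = {0, 1}               # pushes 0 and 1 finished, push 2 still in flight
print(may_start(3, acked, tau=1))   # True: only iteration 2 is outstanding
print(may_start(4, acked, tau=1))   # False: iteration 2 is now too stale
```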

  15. Intelligent Replication

  16. Intelligent Replication
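A key idea behind the replication design is that servers can replicate aggregated results rather than every individual worker push, so one replication message covers many updates. A hedged sketch of that idea (the replica interface here is assumed, not the paper's protocol):

```python
class Replica:
    """Stand-in backup server that just stores the latest replicated value."""
    def __init__(self):
        self.value = None
    def store(self, value):
        self.value = value

class ReplicatedServer:
    """Sketch: aggregate first, then replicate once per round, not per push."""
    def __init__(self, replicas):
        self.value = 0.0
        self.pending = []
        self.replicas = replicas

    def push(self, delta):
        self.pending.append(delta)        # no replication traffic yet

    def apply_round(self):
        self.value += sum(self.pending)
        self.pending.clear()
        for replica in self.replicas:     # one message regardless of #pushes
            replica.store(self.value)

backup = Replica()
server = ReplicatedServer([backup])
for delta in (0.1, -0.3, 0.2):            # three worker pushes
    server.push(delta)
server.apply_round()                      # a single replication message
print(server.value, backup.value)
```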

  17. Message Compression

  18. Message Compression:
     • Training data does not change: only send a hash of it on subsequent iterations
     • Values are often 0: only send the non-zero values
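A small sketch of both compression ideas (the message format and the hash choice are assumptions for illustration): the sender keeps only non-zero values and, once the receiver has cached a key list, replaces that list with its hash on later iterations.

```python
import hashlib, json

def key_signature(keys):
    """Stable hash identifying an (unchanging) key list."""
    return hashlib.sha1(json.dumps(list(keys)).encode()).hexdigest()

def compress(keys, values, receiver_cache):
    """Keep only non-zero entries (as index, value pairs into the key list),
    and replace the key list with its hash once the receiver has cached it."""
    nonzero = [(i, v) for i, v in enumerate(values) if v != 0.0]
    sig = key_signature(keys)
    if sig in receiver_cache:
        return {"key_hash": sig, "nonzero": nonzero}   # key list omitted
    receiver_cache[sig] = list(keys)
    return {"keys": list(keys), "nonzero": nonzero}

cache = {}
print(compress([7, 3, 9], [0.0, 1.5, -0.2], cache))  # first send: full key list
print(compress([7, 3, 9], [0.3, 0.0,  0.0], cache))  # later sends: hash only
```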

  19. Evaluation (logistic regression)

  20. Evaluation (logistic regression)

  21. Evaluation (logistic regression)

  22. Evaluation (logistic regression)

  23. Evaluation (Topic Modeling)

  24. Conclusion: The Parameter Server model is a much better model for machine learning computation than competing models.

  25. Conclusion: Designing a system around a particular problem exposes optimizations which dramatically improve performance.
