Apache ZooKeeper: Distributed Coordination Service


Apache ZooKeeper is an open-source server for highly reliable distributed coordination. It provides configuration information, distributed synchronization, and group services through a simple interface from which more powerful abstractions can be built. The presentation covers the core design: a shared hierarchical namespace of data registers called znodes, the quorum, the namespace, znode properties, and watchers. Clients get high-throughput, low-latency, highly available, and ordered access to znodes, which makes ZooKeeper a foundation for building more complex distributed applications.

  • Apache ZooKeeper
  • Distributed Coordination
  • Reliable Service
  • Distributed Computing
  • Open-source


Presentation Transcript


  1. Apache ZooKeeper. CMSC 491: Hadoop-Based Distributed Computing, Spring 2016. Adam Shook.

  2. What is it? Apache ZooKeeper is an effort to develop and maintain an open-source server which enables highly reliable distributed coordination. It is simple, replicated, ordered, and fast.

  3. Provides: configuration information, distributed synchronization, and group services. Each of these services is used in some form by distributed applications.

  4. Interface: ZooKeeper provides a very simple interface to a highly reliable, distributed service. Powerful abstractions can be built from this very simple interface. Interfaces currently exist for Java and C; the project wants to expand to Python, Perl, and REST.

  5. The Core: a shared hierarchical namespace of data registers, called znodes. Unlike typical file systems, ZooKeeper provides clients with high-throughput, low-latency, highly available, and ordered access to znodes.

  6. Quorum

  7. Namespace

  8. znodes hold meta-information: configuration, status information, location information, or whatever you want (that's small).

  9. znodes: Each node acts as both a file and a directory. 1 MB maximum per znode. Persistent vs. ephemeral. Sequential znodes. Referenced by full paths. An optional chroot suffix can be appended to the connection string, e.g. 127.0.0.1:3000,127.0.0.1:3002/app/a
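
To make the connection-string format concrete, here is a minimal Python sketch that splits such a string into its server list and chroot suffix. The function name `parse_connect_string` is hypothetical and is not part of any ZooKeeper client; the fallback to port 2181 assumes ZooKeeper's default client port when none is given.

```python
def parse_connect_string(connect_string):
    """Split 'host:port,host:port/chroot' into (servers, chroot).

    Hypothetical helper for illustration; real clients parse this internally.
    """
    # The chroot, if present, starts at the first '/' and applies to every server.
    slash = connect_string.find("/")
    if slash == -1:
        hosts_part, chroot = connect_string, ""
    else:
        hosts_part, chroot = connect_string[:slash], connect_string[slash:]

    servers = []
    for entry in hosts_part.split(","):
        host, _, port = entry.strip().partition(":")
        # Assumption: fall back to the default client port when none is given.
        servers.append((host, int(port) if port else 2181))
    return servers, chroot


servers, chroot = parse_connect_string("127.0.0.1:3000,127.0.0.1:3002/app/a")
print(servers, chroot)
```

With a chroot of /app/a, a client that asks for /foo is actually operating on /app/a/foo on the servers.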

  10. Watchers: tied to each znode; a one-time trigger; sent to the client; the event carries the reason it was sent.

  11. That's It. In a nutshell: a very basic service, from which powerful abstractions can be built. Let's talk about how good it is! That is, if you don't have any questions right now. You can ask. I don't bite. Really. Promise.
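
The one-time-trigger behavior can be illustrated with a small in-memory model. This is only a sketch: `TinyNode` is a hypothetical stand-in for a single znode, not the ZooKeeper client API, and the event is reduced to a plain string.

```python
class TinyNode:
    """In-memory stand-in for one znode with one-time data watches."""

    def __init__(self, data=b""):
        self.data = data
        self._watchers = []

    def get(self, watcher=None):
        # Reading with a watcher registers it for the next change only.
        if watcher is not None:
            self._watchers.append(watcher)
        return self.data

    def set(self, data):
        self.data = data
        # Fire every registered watcher once, then forget them all.
        fired, self._watchers = self._watchers, []
        for watcher in fired:
            watcher("NodeDataChanged")


events = []
node = TinyNode(b"v1")
node.get(watcher=events.append)  # register a watch
node.set(b"v2")                  # the watch fires
node.set(b"v3")                  # no watch is registered any more
print(events)
```

The event only says what happened; a client that wants to keep observing has to read the node again and set a new watch.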

  12. Use Case: Location Data. Servers store their machine hostname as ephemeral znodes, e.g. /app1/machine1, /app1/machine87, /app1/machine4. When a server is added, it creates a new znode. When a server is removed, its znode is deleted. When a server fails, ZK deletes the ephemeral node. This allows for dynamic throttling of resources. Clients can choose a hostname from the children of /app1 to connect to. A client that sets a child watch on /app1 receives a notification if a server goes down and can choose a new server.
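
A rough in-memory sketch of this use case follows. The `Registry` class and its session IDs are hypothetical and only model the idea that ephemeral znodes vanish with the session that created them; no real ZooKeeper calls are involved.

```python
import random


class Registry:
    """Models ephemeral znodes: each path is owned by the session that created it."""

    def __init__(self):
        self._owner = {}  # path -> session id

    def register(self, session_id, path):
        self._owner[path] = session_id

    def expire(self, session_id):
        # When a session ends (server failure), all of its ephemeral nodes are removed.
        self._owner = {p: s for p, s in self._owner.items() if s != session_id}

    def children(self, parent):
        prefix = parent.rstrip("/") + "/"
        return sorted(
            p[len(prefix):]
            for p in self._owner
            if p.startswith(prefix) and "/" not in p[len(prefix):]
        )


registry = Registry()
registry.register(1, "/app1/machine1")
registry.register(2, "/app1/machine87")
registry.register(3, "/app1/machine4")

registry.expire(2)  # machine87 fails; its ephemeral node disappears
alive = registry.children("/app1")
chosen = random.choice(alive)  # a client picks any remaining server
print(alive, chosen)
```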

  13. Performance

  14. Performance

  15. Command Line Interface: interactive usage of the namespace in a shell. Commands: create [path] [data]; delete [path]; get [path]; set [path] [data]; ls [path]; rmr [path]; and a number of other commands. Tab completion!

  16. API: the current stable release is v3.4.6 (March 2014). Requires only a list of ZK servers to connect. IMO, a good but messy interface. Recommend building a nice wrapper API for getting/setting POD (plain old data) types and handling exceptions.
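
As one possible shape for such a wrapper, the sketch below layers typed getters and setters over a raw bytes store. A plain dict stands in for ZooKeeper here, and the class and method names are hypothetical; a real wrapper would delegate to the client library and translate its exceptions.

```python
import struct


class TypedStore:
    """Typed get/set helpers over raw bytes (a dict stands in for ZooKeeper)."""

    def __init__(self):
        self._raw = {}  # path -> bytes, as znode data would be

    def set_int(self, path, value):
        # Encode as a big-endian signed 64-bit integer.
        self._raw[path] = struct.pack(">q", value)

    def get_int(self, path, default=None):
        raw = self._raw.get(path)
        return default if raw is None else struct.unpack(">q", raw)[0]

    def set_str(self, path, value):
        self._raw[path] = value.encode("utf-8")

    def get_str(self, path, default=None):
        raw = self._raw.get(path)
        return default if raw is None else raw.decode("utf-8")


store = TypedStore()
store.set_int("/app/retries", 3)
store.set_str("/app/owner", "ops")
print(store.get_int("/app/retries"), store.get_str("/app/owner"))
```

Returning a default for a missing path is one design choice; raising a single wrapper-specific exception would be another.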

  17. Recipes! We are going to talk about these: configuration, distributed locks, and distributed queues.

  18. Configuration: Configuration is often driven through key/value pairs stored in a file, which can get messy when configuration is dynamic. Implementation is very straightforward, as it is what ZooKeeper was designed for. Each full-pathed znode is the key, and the data associated with the znode is the value.
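
The path-as-key idea can be sketched by flattening a nested configuration into full paths and byte values. The function `to_znodes` and the `/config` root are assumptions made for illustration; nothing is written to a real ZooKeeper ensemble.

```python
def to_znodes(config, root="/config"):
    """Flatten a nested dict into {full znode path: data bytes}."""
    znodes = {}
    for key, value in config.items():
        path = f"{root}/{key}"
        if isinstance(value, dict):
            # Nested sections become deeper levels of the namespace.
            znodes.update(to_znodes(value, path))
        else:
            # znode data is a byte array, so values are stored as UTF-8 text.
            znodes[path] = str(value).encode("utf-8")
    return znodes


flat = to_znodes({"db": {"host": "db1", "port": 5432}, "debug": False})
for path, data in sorted(flat.items()):
    print(path, data)
```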

  19. Variables. Static variables: the ones that are probably never going to change (not as much fun). Dynamic variables: changed by hand via the command line or by the application itself; used to track the status of processes and update historical data.

  20. Use of Watchers: Applications can change configuration on the fly for some variables. Whenever a variable changes, clients watching that node are notified, can re-read the variable, and make the correct changes. Very useful for long-running applications that require the most up-to-date information.

  21. Distributed Locks: a means for distributed processes to acquire a lock for some operation, e.g. throttled updating of a database. Your use case here! An implementation exists in ZooKeeper's recipes directory and is distributed with the release: src/recipes/lock

  22. Algorithm: Define a znode to hold the lock, say /dlock.
        1. mypath = create("/dlock/lock-"), with the sequence and ephemeral flags set
        2. children = getChildren("/dlock"), no watch
        3. If mypath has the lowest number suffix in children, the client holds the lock; exit
        4. Call exists() with the watch flag set on the node from children with the next lowest sequence number below mypath
             e.g., if mypath is /dlock/lock-6 and children contains 3, 4, 6, 7, call exists() on /dlock/lock-4
        5. If exists() returns false, go to step 2
        6. If exists() returns true, wait for the watch trigger before going to step 2
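
The decision made in steps 3 and 4 can be isolated as a pure function. This is a minimal sketch under stated assumptions: the helper names are hypothetical, child names are given relative to /dlock, and the numeric suffix is whatever follows the last hyphen (ZooKeeper itself appends a zero-padded 10-digit counter to sequential znodes).

```python
def sequence_number(name):
    """Numeric suffix of a sequential znode name such as 'lock-0000000006'."""
    return int(name.rsplit("-", 1)[1])


def node_to_watch(my_name, children):
    """Return None if my_name holds the lock, else the child to call exists() on."""
    mine = sequence_number(my_name)
    lower = [child for child in children if sequence_number(child) < mine]
    if not lower:
        return None  # step 3: lowest suffix, so this client holds the lock
    # Step 4: watch only the immediate predecessor, not the lock holder.
    return max(lower, key=sequence_number)


children = ["lock-3", "lock-4", "lock-6", "lock-7"]
print(node_to_watch("lock-6", children))
print(node_to_watch("lock-3", children))
```

Watching only the immediate predecessor means a release wakes a single waiting client instead of all of them.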

  23. Distributed Queues: a means to allow clients to asynchronously add elements to a queue and have a single processor application dequeue and process them. I can't remember the last time I needed a queue. Maybe you have a few.

  24. Algorithm: Designate a znode to hold the queue, say /dqueue.
        Enqueue: create("/dqueue/queue-"), with the sequence and ephemeral flags set. Returns a real path node /dqueue/queue-X, where X is a monotonically increasing number
        Dequeue: getChildren("/dqueue"), watch set to true
          Process these nodes with the lowest number first
          No need to call getChildren() again until the current received list is exhausted
          If no children are in the queue, wait for the watch notification before checking again
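
The dequeue side can be sketched as draining one snapshot of children in sequence order. The `drain` function and the sample names are hypothetical; the snapshot stands in for the list a real getChildren() call would return.

```python
from collections import deque


def drain(children, process):
    """Process one snapshot of queue children, lowest sequence number first."""
    pending = deque(sorted(children, key=lambda name: int(name.rsplit("-", 1)[1])))
    while pending:
        # The snapshot is exhausted before the consumer would ask for children again.
        process(pending.popleft())


processed = []
snapshot = ["queue-0000000012", "queue-0000000003", "queue-0000000007"]
drain(snapshot, processed.append)
print(processed)
```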

  25. Priority Queue Extension: two simple modifications to this algorithm! When enqueuing, the pathname ends with queue-ZZ, where ZZ is the priority of the element; the lower the number, the higher the priority. When dequeuing, if the watch notification is triggered on the /dqueue node, the client needs to call getChildren() again and re-sort by priority.
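
A sketch of the re-sort step is below. It assumes the sequence flag is still set on enqueue, so each name is queue-ZZ followed by ZooKeeper's zero-padded 10-digit sequence counter; the helper `priority_key` and the sample names are hypothetical.

```python
SEQ_DIGITS = 10  # width of the counter ZooKeeper appends to sequential znodes


def priority_key(name):
    """Sort key (priority, sequence) for names like 'queue-050000000001'."""
    suffix = name.rsplit("-", 1)[1]
    priority, sequence = suffix[:-SEQ_DIGITS], suffix[-SEQ_DIGITS:]
    return int(priority), int(sequence)


children = ["queue-050000000001", "queue-010000000003", "queue-010000000002"]
ordered = sorted(children, key=priority_key)
print(ordered)
```

Elements with equal priority keep their arrival order because the sequence number is the second part of the key.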

  26. Other Recipes: group membership, barriers, two-phase commit, leader election.

  27. Apache Curator: "Curator n ˈkyoor͝ˌātər: a keeper or custodian of a museum or other collection - A ZooKeeper Keeper." Contains: recipes, framework, utilities, client, errors, extensions.

  28. References: http://zookeeper.apache.org and http://curator.apache.org
