Data Storage

Several ZooKeeper daemons can be started on different machines to form a cluster. Clients can connect to any daemon in the cluster, and they see the same view of the ZooKeeper world regardless of which daemon they connect to. User data is stored in objects (nodes) that are addressed hierarchically, much like paths in a conventional file system. The hierarchy has a root node named /, and additional nodes extend from it in a tree-like fashion. Each node of the tree can both hold data (i.e., act like a file) and have child nodes (i.e., act like a directory). The amount of data that can be stored in a node is small: there is a hard cap of 1 MB. The cap exists so that the entire data store fits in the RAM of the ZooKeeper machines.
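The addressing scheme can be pictured with a toy in-memory model. The sketch below is only an illustration of the ideas in the paragraph above (a root named /, nodes that hold data and also have children, and a 1 MB cap on each node's data); it is not ZooKeeper's implementation, and the names Node, Tree, and MAX_DATA_BYTES are hypothetical.

```python
MAX_DATA_BYTES = 1024 * 1024  # assumed value for the 1 MB per-node cap


class Node:
    """A tree node that holds data (like a file) and children (like a directory)."""

    def __init__(self, data=b""):
        self.data = data
        self.children = {}


class Tree:
    """Toy hierarchical store rooted at "/". Hypothetical, for illustration only."""

    def __init__(self):
        self.root = Node()

    @staticmethod
    def _split(path):
        if not path.startswith("/"):
            raise ValueError("paths must be absolute")
        return [part for part in path.split("/") if part]

    def _lookup(self, path):
        node = self.root
        for name in self._split(path):
            node = node.children[name]  # raises KeyError if the node is missing
        return node

    def create(self, path, data=b""):
        if len(data) > MAX_DATA_BYTES:
            raise ValueError("node data exceeds the 1 MB cap")
        parts = self._split(path)
        if not parts:
            raise ValueError("the root node already exists")
        parent = self._lookup("/" + "/".join(parts[:-1]))
        if parts[-1] in parent.children:
            raise KeyError("node already exists: " + path)
        parent.children[parts[-1]] = Node(data)

    def get(self, path):
        return self._lookup(path).data

    def get_children(self, path):
        return sorted(self._lookup(path).children)


tree = Tree()
tree.create("/app", b"top-level data")         # /app holds data ...
tree.create("/app/config", b"/etc/app.conf")   # ... and also has a child
```

Here /app behaves like a file and a directory at the same time: tree.get("/app") returns its data, while tree.get_children("/app") lists its child nodes.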

Keeping the entire data store in memory allows requests to be served with high throughput. Changes are written to disk for durability, but read requests are handled entirely from the in-memory copy. The size cap is usually not a major limitation, because the data stored at a node is intended to be used as a pointer. For example, a node might hold the name of a file in a conventional file system, and that file contains the authoritative configuration for a distributed system. The components of the distributed system first contact ZooKeeper to get the definitive filename and then fetch that file to read the configuration.
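The two-step lookup in this example can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed client: load_config is a hypothetical helper, and zk_get stands for any callable that takes a node path and returns a (data, stat) pair with data as bytes.

```python
def load_config(zk_get, pointer_path):
    """Read the filename stored at a node, then read the file it points to.

    zk_get is an assumed callable: given a node path, it returns a
    (data, stat) pair in which data is the node's contents as bytes.
    """
    # Step 1: ask the coordination service for the definitive filename.
    data, _stat = zk_get(pointer_path)
    filename = data.decode("utf-8")

    # Step 2: fetch the authoritative configuration from that file.
    with open(filename, "r", encoding="utf-8") as config_file:
        return config_file.read()
```

With the kazoo client library, for instance, KazooClient.get(path) returns such a (data, stat) tuple, so the get method of a connected client could be passed as zk_get. The node stores only a short filename, well under the size cap, while the configuration itself can be arbitrarily large.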