Notes from Zhonghua
This paper presents ZooKeeper, a service used at Yahoo! for coordinating the processes of distributed applications. We may see this service as an enhanced file system with key/value storage and a per-client guarantee of FIFO execution. The service is "wait-free" because reads are served from a local replica of the data, and changes reach clients by a "push" method ("watch" in the paper: whenever a znode, i.e. the value for a key, changes, a notification is pushed to the clients watching it). Write requests are sent to the service and executed in sequence. (So some replicas may not be updated in time, and a value read from them may be stale. The authors solve this by adding a "sync" before the read, which requires all writes issued before it to be applied first. This is something like a "flush" operation, but it may make such a read no longer wait-free.)
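The stale-read problem and the effect of "sync" can be sketched with a small in-memory model. This is only an illustration under assumptions, not ZooKeeper's implementation: the `Replica` class, the shared `log` list and the `sync_read` method are hypothetical names invented for this sketch.

```python
class Replica:
    """Hypothetical in-memory replica that applies writes in log order."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries applied so far

    def catch_up(self, log):
        # Apply, in order, every committed write this replica has not seen yet.
        for key, value in log[self.applied:]:
            self.data[key] = value
        self.applied = len(log)

    def read(self, key):
        # Local read: fast, but possibly stale.
        return self.data.get(key)

    def sync_read(self, key, log):
        # "sync" then read: first apply all writes committed so far.
        self.catch_up(log)
        return self.read(key)


log = []                          # totally ordered write log (assumed)
fast, slow = Replica(), Replica()

log.append(("/config", "v1"))
fast.catch_up(log)
slow.catch_up(log)

log.append(("/config", "v2"))
fast.catch_up(log)                # slow has not applied the second write yet

stale = slow.read("/config")            # still sees the old value
fresh = slow.sync_read("/config", log)  # waits for pending writes first
print(stale, fresh)
```

The plain read never waits for other replicas, which is where the wait-free property comes from; the synced read has to wait until the replica has caught up.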
1, As the system scales up, a single znode may become a hot spot that is watched by a large number of clients. In this case, sending a notification to every one of these clients whenever the znode is updated may introduce too much overhead.
2, Suppose some clients are waiting for a frequently changing znode to reach a particular value. For example, the znode represents an event code: some clients want to execute some code when event 5 happens, others are waiting for event 19, and so on. Every change of the event code then has to be broadcast to all of these clients, whether or not a client actually needs that event, and every client has to read the value and set the watch again. Would it be better to add an event-driven watch API, so that a client receives a notification only when the value of the znode changes to an expected value?
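To make the difference concrete, here is a rough sketch that counts notifications under both schemes. Everything in it is hypothetical: `Znode`, `watch` and `watch_for` are names made up for this note, and `watch_for` is the proposed API, not something the paper provides. For brevity the callback receives the new value directly, whereas a real client would need an extra read after each notification.

```python
from collections import defaultdict


class Znode:
    """Hypothetical znode holding one value, with two kinds of watches."""

    def __init__(self):
        self.value = None
        self.one_shot = []                    # fired once, on any change
        self.conditional = defaultdict(list)  # expected value -> callbacks
        self.notifications = 0

    def watch(self, callback):
        self.one_shot.append(callback)

    def watch_for(self, expected, callback):
        # Proposed API (assumption): fire only when value == expected.
        self.conditional[expected].append(callback)

    def set(self, value):
        self.value = value
        fired, self.one_shot = self.one_shot, []
        fired += self.conditional.pop(value, [])
        for callback in fired:
            self.notifications += 1
            callback(value)


def run_plain(targets, events):
    node, done = Znode(), []

    def make(target):
        def callback(value):
            if value == target:
                done.append(target)
            else:
                node.watch(callback)  # not our event: set the watch again
        return callback

    for target in targets:
        node.watch(make(target))
    for event in events:
        node.set(event)
    return node.notifications, done


def run_conditional(targets, events):
    node, done = Znode(), []
    for target in targets:
        node.watch_for(target, done.append)
    for event in events:
        node.set(event)
    return node.notifications, done


plain_count, plain_done = run_plain([5, 19], range(1, 21))
cond_count, cond_done = run_conditional([5, 19], range(1, 21))
print(plain_count, cond_count)
```

With two clients waiting for events 5 and 19 while the znode steps through events 1 to 20, the plain one-shot watches deliver 24 notifications in this model and the conditional watches deliver 2.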
3, In Section 4.3, the authors describe the snapshots used to recover state after a crash. Fuzzy snapshots are a good approach. But the authors claim that "since state changes are idempotent, we can apply them twice as long as we apply the state changes in order". Does this work for "sequential" znodes? If we apply a create of a sequential znode twice, the two resulting znodes should be different, because each should "have the value of a monotonically increasing counter appended to its name" (as in Section 2.1).
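One possible answer, sketched below, is that the log does not record the client request "create a sequential znode" but a state change in which the final name has already been chosen. This is my reading and an assumption, not something I have checked against the implementation; the function names, the dict-based state and the 10-digit suffix are all made up for the illustration.

```python
def leader_make_txn(state, parent, prefix):
    """Turn a 'create sequential' request into a state change with a fixed name."""
    counter = state["counters"].get(parent, 0)
    return {
        "path": f"{parent}/{prefix}{counter:010d}",  # name decided once, here
        "parent": parent,
        "next_counter": counter + 1,
    }


def apply_txn(state, txn):
    # Applying the same state change again overwrites with identical values.
    state["nodes"][txn["path"]] = b""
    state["counters"][txn["parent"]] = txn["next_counter"]


def apply_request(state, parent, prefix):
    # Replaying the *request* picks a fresh name each time: not idempotent.
    apply_txn(state, leader_make_txn(state, parent, prefix))


# Replaying the logged state change twice leaves one znode.
state = {"nodes": {}, "counters": {}}
txn = leader_make_txn(state, "/locks", "lock-")
apply_txn(state, txn)
apply_txn(state, txn)
txn_nodes = sorted(state["nodes"])

# Replaying the original request twice creates two znodes.
naive = {"nodes": {}, "counters": {}}
apply_request(naive, "/locks", "lock-")
apply_request(naive, "/locks", "lock-")
request_nodes = sorted(naive["nodes"])

print(txn_nodes)
print(request_nodes)
```

Under this assumption the claim holds for sequential znodes too, as long as the state changes are replayed in order, since a later change would otherwise have its counter overwritten by an earlier one.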
4, In Section 2.2, "Each request instead includes the full path of the znode being operated on. Not only does this choice simplifies the APIs, but it also eliminates extra state that the server would need to maintain". This is quite similar to the stateless design in FS.
2, Developing a basic service and letting users build their own primitives on top of it is a good choice. But how can we justify that it is sufficient to express other primitives? Or sufficient to express at least certain kinds of primitives?
Reference 6 of the paper:
M. Burrows. The Chubby lock service for loosely-coupled distributed systems. In OSDI '06.