My friend (and CouchDB committer) Chris just posted an excellent overview of the application-hosting potential of CouchDB on his blog. My first response was: okay, you’ve convinced me. Post-election, I’m porting the minimal Sinatra app backing Misfict to CouchDB, since it’s really just a minimal JSON storage engine at its core.
My second reaction was to find it a bit funny to see E4X making an appearance in this day and age; as with most XML-centric tech, I had assumed that the coming of JSON and YAML had more or less killed it, at least amongst the web-dev early adopters. I guess it just goes to show that everything old is new again, especially in the fast-moving world of web development tools.
Regardless, perhaps the most compelling picture Chris paints in his post is the idea of capitalizing on the off-line replication features of CouchDB to allow groups of people to separately work on a collection of documents, then merge their changes together at some point in the future. He leans heavily on a classroom metaphor, but I think the real potential may be more in the area of groupware and collaborative editing. Knowledge workers have been looking for the “holy grail” tool which combines the power of Word’s “track changes” with mixed on- and off-line authoring for a long time, and I think we’re finally building the infrastructure that will make that class of application relatively easy to build.
Looking over the CouchDB documentation, though, I still think there’s one major piece missing from their replication and conflict-resolution story: automatic merging of non-conflicting edits. Unlike a DVCS like Git, CouchDB still doesn’t (AFAIK) allow multiple contributors to edit different elements of a single document, and then commit those changes, without manually replaying edits from other contributors.
Since JSON is much more structured than raw text (which Git and other DVCSs handle well enough), it seems tractable to examine potentially conflicting updates and see whether they’re isolated to different child nodes of the JSON document. Furthermore, given the degree to which CouchDB has already embraced the map/reduce model, I think you should be able to distill the conflict-resolution algorithm down to two steps: a “map” step that generates a “diff”, noting just the original document ID and the changed attribute/subtree elements, and a “reduce” step that attempts to create a new document by applying those changes to the original document.
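To make that two-step idea a little more concrete, here is a rough sketch in plain Python. It is not CouchDB's view API or anything CouchDB ships; the names `diff`, `overlaps`, and `merge` are made up for illustration. It assumes each document is a JSON object (a dict at the top level), that all edits started from the same base revision, and that lists are treated as atomic values rather than merged element by element.

```python
import copy

_DELETED = object()  # hypothetical sentinel: marks a key removed by an edit


def diff(base, edited, path=()):
    """'Map' step: return {path: new_value} for each subtree that changed."""
    if isinstance(base, dict) and isinstance(edited, dict):
        changes = {}
        for key in base.keys() | edited.keys():
            if key not in edited:
                changes[path + (key,)] = _DELETED
            elif key not in base:
                changes[path + (key,)] = edited[key]
            else:
                changes.update(diff(base[key], edited[key], path + (key,)))
        return changes
    # Anything that is not a pair of dicts is compared as a whole value.
    return {} if base == edited else {path: edited}


def overlaps(a, b):
    """Two paths collide if they are equal or one is a prefix of the other."""
    n = min(len(a), len(b))
    return a[:n] == b[:n]


def merge(base, diffs):
    """'Reduce' step: apply all diffs to a copy of base, or raise on conflict."""
    merged = copy.deepcopy(base)
    applied = {}
    for changes in diffs:
        for path, value in changes.items():
            for seen, seen_value in applied.items():
                identical = path == seen and value == seen_value
                if overlaps(path, seen) and not identical:
                    where = "/".join(map(str, path))
                    raise ValueError(f"conflicting edits at {where}")
            applied[path] = value
            # Walk to the parent of the changed node, then set or delete it.
            node = merged
            for key in path[:-1]:
                node = node[key]
            if value is _DELETED:
                node.pop(path[-1], None)
            else:
                node[path[-1]] = copy.deepcopy(value)
    return merged


# Two contributors edit different parts of the same base document.
base = {"title": "Draft", "body": {"intro": "Hi", "end": "Bye"}}
alice = {"title": "Final", "body": {"intro": "Hi", "end": "Bye"}}
bob = {"title": "Draft", "body": {"intro": "Hello", "end": "Bye"}}
merged = merge(base, [diff(base, alice), diff(base, bob)])
```

Since Alice and Bob touched different child nodes, the merge goes through and keeps both changes. If both had changed `title` to different values, `merge` would raise a `ValueError`, and that is the point where a human (or some smarter policy) would have to step in.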
Regardless, I think it’s an interesting time to be involved in web development. The idea that you could grab just a subset of a larger data store, work with it both on- and off-line, then share your changes with a group of colleagues is a powerful one, and I applaud anyone (like Chris and the rest of the CouchDB team) working to make it possible.



