Local-first database: RxDB + PouchDB

May 12, 2020 · local-first, offline-first

PouchDB, “The Database that Syncs!”, is probably the most well-known local-first database solution. It began (I believe) as a JavaScript implementation of Apache CouchDB, but the home page now says it is “inspired by Apache CouchDB”, so maybe it has diverged somewhat. At any rate, it’s another software library that I keep coming back to, because it has such a solid value proposition.

RxDB is a reactivity-focused layer on top of PouchDB that adds a lot of nice affordances and documentation, and has also seen a lot more recent activity than PouchDB, so I’ll be evaluating the two together.

I want to mention hood.ie here too; they’re one of the earliest examples I can remember that worked toward a truly “offline first” experience, way back in 2013. Unfortunately the project appears mostly dormant at the moment, but they have certainly given me (and I’m sure many others) a lot of inspiration in this space. If you haven’t read their Dreamcode blog post, it’s well worth a look.

I’ll be using my local-first database criteria for this evaluation. You can also look at the evaluations of gun-js and remoteStorage.js to see how they compare.

Correctness

  • How are conflicts handled?

    Conflicts are not handled automatically; the client has to deal with them using bespoke merging code. If two clients change the same document, one of the revisions wins (chosen by a deterministic algorithm, so every replica picks the same one). A client must then explicitly fetch the conflicts for a given document to resolve them. I haven’t seen much discussion of conflict resolution strategies, and it’s possible that most clients just ignore conflicts, since that’s the default behavior. I’ve thought about using a state-based json-crdt for each document, which would allow a client implementation to automatically resolve conflicts in a way that better preserves intent, but I haven’t attempted this. One option that’s mentioned in the docs is delta-pouch, which “stores every change as its own document”, and then reads out those changes to construct the “current” state of a document. I’m a little wary about the “production-readiness” of this plugin, though, as shoehorning a delta-based system onto PouchDB seems like it would introduce some pretty significant performance penalties.
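
    To make the “deterministic choosing algorithm” a bit more concrete, here is a minimal sketch of the ordering as it is commonly described for CouchDB-style revisions: a live revision beats a deleted one, then the longer revision history (the number before the dash in `_rev`) wins, then the higher revision hash breaks the tie. The `Leaf` type and the `parseRev`, `compareLeaves` and `pickWinner` names are hypothetical and not part of PouchDB's API, and the real implementation walks a revision tree rather than a flat list.

```typescript
// Hypothetical shape for one conflicting leaf revision; not a PouchDB type.
interface Leaf {
  rev: string;      // e.g. "3-abc": generation number, dash, hash
  deleted: boolean;
}

// Split "3-abc" into its generation number and hash.
function parseRev(rev: string): { pos: number; hash: string } {
  const dash = rev.indexOf("-");
  return { pos: parseInt(rev.slice(0, dash), 10), hash: rev.slice(dash + 1) };
}

// Returns a negative number when `a` should rank ahead of `b`.
function compareLeaves(a: Leaf, b: Leaf): number {
  if (a.deleted !== b.deleted) return a.deleted ? 1 : -1; // live revisions first
  const pa = parseRev(a.rev);
  const pb = parseRev(b.rev);
  if (pa.pos !== pb.pos) return pb.pos - pa.pos;           // longer history first
  if (pa.hash === pb.hash) return 0;
  return pa.hash > pb.hash ? -1 : 1;                       // higher hash first
}

// Every replica applies the same ordering, so every replica picks the same winner.
function pickWinner(leaves: Leaf[]): Leaf {
  if (leaves.length === 0) throw new Error("no revisions to choose from");
  return leaves.reduce((best, leaf) => (compareLeaves(leaf, best) < 0 ? leaf : best));
}
```

    Nothing in this ordering looks at the content of the documents, which is why the “winner” is arbitrary from the user's point of view.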

  • How “bullet proof” is it? How easy is it to get it into a broken state (e.g. where different clients continue to see inconsistent data despite syncing)?

    PouchDB (and CouchDB) have been used in production for many years, and I assume they are quite robust in their syncing strategies. I haven’t observed any broken states in my testing.

  • Is there consistency verification built-in, to detect if you’re in a broken state?

    I’m not sure; I haven’t found any documentation of the “inner workings” of the synchronization protocol that’s used.

  • How well does sync preserve intent? In what cases would a user’s work be “lost” unexpectedly?

    The out-of-the-box behavior is to not handle conflicts, so if two clients change the same document at the same time, one of those changes is effectively lost unless a client goes looking for the conflicting revision. As described above, you can implement custom conflict resolution code to mitigate this.
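
    A rough sketch of what such custom resolution code could look like follows. It relies on three calls that PouchDB provides (`get` with the `conflicts` and `rev` options, `put`, and `remove` with an id and a revision), but wraps them in a cut-down `ConflictStore` interface so the example stands alone. That interface, the `updatedAt` field and the “newest document wins per field” merge rule are all assumptions made for illustration; a real app would pick a merge rule that fits its data.

```typescript
// Cut-down document and database shapes, modeled on PouchDB but hypothetical.
interface Doc {
  _id: string;
  _rev: string;
  _conflicts?: string[];
  updatedAt?: number; // hypothetical timestamp maintained by the app
  [key: string]: unknown;
}

interface ConflictStore {
  get(id: string, opts?: { conflicts?: boolean; rev?: string }): Promise<Doc>;
  put(doc: Doc): Promise<unknown>;
  remove(id: string, rev: string): Promise<unknown>;
}

// Keep every field from both documents; on a clash the newer document wins.
function mergeNewestWins(a: Doc, b: Doc): Doc {
  const [older, newer] = (a.updatedAt ?? 0) <= (b.updatedAt ?? 0) ? [a, b] : [b, a];
  return { ...older, ...newer };
}

async function resolveConflicts(
  db: ConflictStore,
  id: string,
  merge: (a: Doc, b: Doc) => Doc = mergeNewestWins
): Promise<Doc> {
  // Ask for the winning revision plus the ids of the revisions that lost.
  const winner = await db.get(id, { conflicts: true });
  const losingRevs = winner._conflicts ?? [];
  if (losingRevs.length === 0) return winner;

  // Fetch the body of each losing revision and fold it into the winner.
  const losers = await Promise.all(losingRevs.map((rev) => db.get(id, { rev })));
  const merged = losers.reduce((acc, loser) => merge(acc, loser), winner);

  // Save the merged body on top of the winning revision...
  const resolved: Doc = { ...merged, _id: id, _rev: winner._rev };
  delete resolved._conflicts;
  await db.put(resolved);

  // ...then delete the losing revisions so the conflict is cleared.
  await Promise.all(losingRevs.map((rev) => db.remove(id, rev)));
  return resolved;
}
```

    The returned object still carries the old `_rev`; callers that need the new revision id would read it from the result of `put`.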

Cost

Storage

  • How much data does the client need to store to fully replicate?

    It looks like the client stores the latest version of each document, and then a “change log” recording changes to documents that happen while the client has an active connection to the server. The change log has a default max length of 1000, and specifying too small a limit could result in spurious conflicts.
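
    PouchDB exposes this limit as the `revs_limit` option. The toy model below illustrates why a small limit can produce spurious conflicts: if the history sent along with a revision has been truncated so far that the receiving peer no longer recognizes any ancestor, the peer cannot tell that the new revision descends from the one it already has. It assumes a simple linear history, and `truncateHistory` and `isRelated` are hypothetical helpers, not PouchDB functions; the real library tracks a revision tree.

```typescript
// Keep only the newest `limit` revision ids of a linear history (oldest first).
function truncateHistory(history: string[], limit: number): string[] {
  return history.slice(-Math.max(1, limit));
}

// An incoming history can be attached to what a peer already has only if
// the peer recognizes at least one revision id in it.
function isRelated(knownRevs: string[], incomingHistory: string[]): boolean {
  return incomingHistory.some((rev) => knownRevs.indexOf(rev) !== -1);
}

const fullHistory = ["1-a", "2-b", "3-c", "4-d", "5-e", "6-f"];
const peerKnows = ["1-a"]; // the peer last synced at the first revision

// With a generous limit the peer sees "1-a" in the history: a normal update.
const withDefaultLimit = isRelated(peerKnows, truncateHistory(fullHistory, 1000)); // true

// With a limit of 3 only "4-d".."6-f" are sent, so "6-f" looks unrelated to
// "1-a" and would be stored as a conflicting branch.
const withTinyLimit = isRelated(peerKnows, truncateHistory(fullHistory, 3)); // false
```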

  • How much data does the server need to store?

    The server maintains a complete change history of all documents.

  • How complicated is the server logic?

    There’s quite a lot going on in the server – CouchDB is a large and sophisticated database with a lot of features.

Code / implementation

  • RxDB

    • tests: 779 nodejs tests & 666 in-browser (karma) tests

    • coverage: not tracked (there were some indications in the source code that coverage was tracked at one point, but I couldn’t get it working).

    • community: 2 contributors in the past month

  • PouchDB

    • tests: 2804 tests passing / 320 pending

    • coverage: 100% statement coverage (97.5% branch coverage)

    • community: 2 contributors in the past 6 months

Other notes

A common complaint I’ve seen with PouchDB is “long startup time” due to having to replicate the entire change history, but it looks like that has been fixed now.

Flexibility

  • How does it react to schema changes? If you need to add an attribute to an object, can you?

    RxDB has client-defined schemas with validation, as well as support for data migration when your schemas change. These are both layers on top of PouchDB, which does no schema validation.
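
    The migration idea can be sketched without RxDB itself. RxDB lets you register migration strategies keyed by schema version; the code below is a simplified stand-in for that mechanism, not RxDB's implementation. `PlainDoc`, `Migration` and `migrateDoc` are hypothetical names, and the example schema changes (adding `done`, renaming `title` to `name`) are made up for illustration.

```typescript
type PlainDoc = Record<string, unknown>;
type Migration = (oldDoc: PlainDoc) => PlainDoc;

// Run every strategy between the document's stored version and the target
// version, in order, so old documents are upgraded one step at a time.
function migrateDoc(
  doc: PlainDoc,
  fromVersion: number,
  toVersion: number,
  strategies: Record<number, Migration>
): PlainDoc {
  let current = doc;
  for (let version = fromVersion + 1; version <= toVersion; version++) {
    const step = strategies[version];
    if (!step) throw new Error(`no migration strategy for schema version ${version}`);
    current = step(current);
  }
  return current;
}

// Strategies are keyed by the schema version they upgrade *to*.
const strategies: Record<number, Migration> = {
  // v1 added a required `done` flag.
  1: (oldDoc) => ({ ...oldDoc, done: false }),
  // v2 renamed `title` to `name`.
  2: (oldDoc) => {
    const { title, ...rest } = oldDoc;
    return { ...rest, name: title };
  },
};

const migrated = migrateDoc({ id: "t1", title: "write post" }, 0, 2, strategies);
// migrated is { id: "t1", done: false, name: "write post" }
```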

  • Is the shape of data restricted to anything less than full JSON? e.g. are nested objects, and arrays supported?

    Full JSON is supported.

  • Can it be used with an existing (server-side or client-side) database (sqlite, postgres, etc.) or do you have to use a whole new data storage solution?

    There is an adapter for using sqlite as the backend, but I haven’t tried it. Generally the idea is that you’re using this new database instead of something else.

  • Can it sync with Google Drive, Dropbox, etc. such that each user manages their own backend storage?

    No.

  • Does it require all data to live in memory, or can it work with mostly-persisted data? (such that large datasets are possible)

    No, it doesn’t need everything in memory; data is loaded into memory on demand.

  • Does it support e2e encryption?

    There’s a plugin that encrypts your changes before replicating them to remote peers, and decrypts data that is received. This is done using a single secret key (as opposed to a keypair), and you’d have to take care of key management yourself.

  • Is multi-user collaboration possible, where some users only have access to a subset of the data? (think firebase access rules)

    Yes. CouchDB has a sophisticated security setup that makes this work.

  • Is collaborative text editing supported?

    Not that I could find.

  • Does it have the concept of “undo” built-in?

    You can query past revisions of a document and use those to “revert” to previous versions, which is similar to an “undo” feature.
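
    As a sketch of how such a revert might work: PouchDB's `get` accepts a `revs_info` option that lists a document's revisions (newest first) with a status for each, and a `rev` option to fetch one specific revision. The `RevisionStore` interface below is a cut-down, hypothetical stand-in for the database, and `previousAvailableRev` and `undoLastChange` are illustrative names. Bodies of old revisions are not kept forever (compaction discards them), so this only works while the previous revision is still marked as available.

```typescript
// Cut-down shapes modeled on PouchDB; hypothetical, for illustration only.
interface RevInfo {
  rev: string;
  status: string; // "available", "missing" or "deleted"
}

interface VersionedDoc {
  _id: string;
  _rev: string;
  _revs_info?: RevInfo[];
  [key: string]: unknown;
}

interface RevisionStore {
  get(id: string, opts?: { revs_info?: boolean; rev?: string }): Promise<VersionedDoc>;
  put(doc: VersionedDoc): Promise<unknown>;
}

// Find the newest revision older than `currentRev` whose body still exists.
function previousAvailableRev(revsInfo: RevInfo[], currentRev: string): string | undefined {
  let seenCurrent = false;
  for (const info of revsInfo) {
    if (seenCurrent && info.status === "available") return info.rev;
    if (info.rev === currentRev) seenCurrent = true;
  }
  return undefined;
}

// Returns false when there is nothing to go back to.
async function undoLastChange(db: RevisionStore, id: string): Promise<boolean> {
  const current = await db.get(id, { revs_info: true });
  const target = previousAvailableRev(current._revs_info ?? [], current._rev);
  if (target === undefined) return false;

  const previous = await db.get(id, { rev: target });
  // Saving the old body on top of the current revision creates a new
  // revision, so the "undo" itself becomes part of the history.
  await db.put({ ...previous, _id: id, _rev: current._rev });
  return true;
}
```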

  • Does it support a fully p2p network setup (no central authority / server)?

    Potentially? There’s a 7-year-old plugin that is working in this direction, but it’s certainly not a well-established use-case.

Production-ready

  • Is it being used in production?

    Definitely.

  • How well does it handle offline behavior?

    Very well.

  • Does it correctly handle working on multiple tabs in the same browser session?

    Yes!

  • Does it bake in auth, or can you use an existing authentication setup?

    The express-pouchdb server has authentication/authorization functionality built-in, but you can disable it and do your own thing if you want. The frontend (pouchdb/rxdb) doesn’t handle authentication out of the box, but a community plugin adds a nice db.logIn function that takes care of that for you. Sign up is something you still have to build yourself on the client, however.

Conclusion

PouchDB is certainly the most battle-tested solution that I’ve encountered, with lots of production users over many years, and RxDB adds a nice fresh interface over the top. The two main downsides I see are lack of support for automatic conflict resolution (which makes it much more difficult to use in a highly collaborative app) and lack of support for real-time collaborative text editing. But, if your app doesn’t rely heavily on collaboration, PouchDB is probably your most reliable option.

Please drop me a note on twitter if there’s anything I should add or correct!