Local-first database: gun.js

May 1, 2020 · local-first

Gun.js is an open-source project I’ve had my eye on for several years now, every once in a while checking back to see how it’s progressed, and whether the project I’m working on at the time is a good fit for it.

Gun stands out as one of the few players that’s actually delivering a peer-to-peer / decentralized system that works on a large scale.

Using my criteria for a local-first database, here’s how it stacks up:

Correctness

  • How are conflicts handled?

    Looks like they’ve got a CRDT setup where they denormalize any nested data structures, so you can modify one attribute on one client and another attribute on another, and the changes will be merged. For a given atomic attribute, it uses “last-write-wins” (for a best-effort definition of “last” based on a hybrid logical clock).

  • Is there consistency verification built-in, to detect if you’re in a broken state? (e.g. merkle tries or similar hashing method to ensure that peers’ data haven’t been corrupted somehow)

    I couldn’t see anything in the docs about this, and couldn’t figure out anything from the source code. I asked about this in the community chat, and apparently there’s some hashing that’s used to determine when to “stop syncing” between two peers.

  • How well does sync preserve intent?

    From their CRDT docs it looks like they use a hybrid logical clock, so intent should be preserved pretty well.
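
To make the "merge per attribute, last-write-wins within an attribute" idea from the conflict-handling answer concrete, here's a minimal sketch. It is not Gun's actual conflict-resolution code: the `mergeNodes` name, the `{ value, state }` shape, and the tie-break rule are all my own assumptions for illustration.

```javascript
// Hypothetical sketch of per-attribute last-write-wins merging.
// NOT Gun's real algorithm; names and data shapes are illustrative assumptions.

// A node maps each attribute to { value, state }, where `state` is the
// clock reading of that attribute's most recent write.
function mergeNodes(local, incoming) {
  const merged = { ...local };
  for (const [key, theirs] of Object.entries(incoming)) {
    const ours = merged[key];
    if (ours === undefined || theirs.state > ours.state) {
      merged[key] = theirs; // the newer write wins for this attribute only
    } else if (theirs.state === ours.state) {
      // Clock tie: break it deterministically so every peer picks the
      // same winner no matter which update arrived first.
      if (JSON.stringify(theirs.value) > JSON.stringify(ours.value)) {
        merged[key] = theirs;
      }
    }
  }
  return merged;
}

const base = {
  name: { value: "Ada", state: 1 },
  city: { value: "London", state: 1 },
};
const fromClientA = { name: { value: "Ada L.", state: 2 } }; // edits `name`
const fromClientB = { city: { value: "Paris", state: 3 } };  // edits `city`

// Apply the two updates in both orders; the results should agree.
const ab = mergeNodes(mergeNodes(base, fromClientA), fromClientB);
const ba = mergeNodes(mergeNodes(base, fromClientB), fromClientA);
```

Because each attribute is resolved independently, the two clients' edits both survive, and applying them in either order gives the same node. Two concurrent edits to the same attribute would still lose one of the writes, which is the trade-off of last-write-wins.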

Cost

Storage

  • How much data does the client need to store to fully replicate?

    The client stores a copy of the data it has used (which could be a subset of the full graph), and also keeps track of “unsynced changes”.

  • How much data does the server need to store?

    O(size of the data) - which means the server doesn’t retain a history of changes, and so is able to keep size down.

  • How complicated is the server logic?

    It appears there’s a fair amount of communication back & forth, negotiating what nodes need to be replicated, as well as connecting peers to each other – “relaying” along data that one peer needs that another one has.
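
Here's a rough sketch of the client-side storage model described above: a local copy of only the data that's been used, plus a list of unsynced changes. The `PartialReplica` class and its methods are hypothetical, and the "remote peer" is just a `Map` standing in for a server. Gun's real sync negotiates and merges rather than blindly overwriting.

```javascript
// Hypothetical sketch of a partial replica with an unsynced-changes queue.
// Not Gun's implementation; the class, methods, and Map-based "remote" are assumptions.
class PartialReplica {
  constructor(remote) {
    this.remote = remote;   // stand-in for a remote peer (Map of key -> value)
    this.cache = new Map(); // only the keys this client has read or written
    this.pending = [];      // local writes not yet pushed to the remote
    this.online = true;
  }

  read(key) {
    // While online, refresh the local copy from the remote.
    if (this.online && this.remote.has(key)) {
      this.cache.set(key, this.remote.get(key));
    }
    // While offline, only previously cached keys are available.
    return this.cache.get(key);
  }

  write(key, value) {
    this.cache.set(key, value);
    this.pending.push({ key, value });
    if (this.online) this.flush();
  }

  flush() {
    // A real system would merge here; this sketch simply overwrites.
    for (const { key, value } of this.pending) this.remote.set(key, value);
    this.pending = [];
  }

  setOnline(online) {
    this.online = online;
    if (online) this.flush();
  }
}

const server = new Map([["post/1", "hello"], ["post/2", "world"]]);
const client = new PartialReplica(server);

client.read("post/1");                              // cached while online
client.setOnline(false);
const cachedRead = client.read("post/1");           // still readable offline
const uncachedRead = client.read("post/2");         // never read, so unavailable offline
client.write("post/3", "draft");                    // queued locally
const queuedWhileOffline = client.pending.length;
client.setOnline(true);                             // queue is pushed to the remote
```

Note that the server side in this model only ever holds the current value per key, which lines up with the O(size of the data) storage cost mentioned above.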

Code / implementation

  • tests: npm test runs 127 passing tests

  • coverage: not reported

  • community: 4 committers in the past month, pretty active discord server.

  • The code is written in a quite idiosyncratic style, such that I had a good amount of difficulty deciphering it 😅. It doesn’t use consistent formatting, and makes extensive use of mutation, single-character variable names, custom async execution flow, etc. Additionally, some core parts of the code depend on accessing & mutating (monkey-patching) global objects. Now, aesthetics certainly isn’t everything, but code that’s readable, well-commented, etc. lends a level of “production-ready” polish to a project that I really appreciate.

  • A community member has created a typescript rewrite of gun.js, using async/await, more modern formatting, etc. I don’t know if there are plans to give it official status though, or how complete it is.

Flexibility

  • How does it react to schema changes?

    It puts no restrictions on schema changes. This also means there’s no schema validation in place, however.

  • Is the shape of data restricted to anything less than full JSON?

    Yes, no arrays. GUN has the concept of a “set” built in, which works as a kind of “unordered array”, but you need to “add” each item individually, and the items are required to be unique, and of course you have to figure out how to sort them yourself. (Further investigation turned up this experimental project which appears to be working toward sorted-array support in Gun, but it’s certainly not in a reusable/production state at the moment).

    Additionally, GUN enforces denormalization, so when you retrieve an object that has “nested” objects inside of it, you need to make another async call to fetch the nested object. You can optionally include a script to give you a .open() function, which does the extra calls for you, but the docs steer you away from it (if your “graph” is deeply nested, recursively loading everything can be quite expensive).

  • Can it be used with an existing database?

    The “storage backends” that are supported out of the box are flat files (for nodejs) and localStorage or IndexedDB in the browser. It looks like you’d be able to write a storage adapter for sqlite / postgres / etc., but you’d be blazing your own trail.

  • Can it sync with Google Drive, Dropbox, etc.? (Such that users “bring their own storage backend”)

    Nope, it requires an active server.

  • Does the client implementation require all data to live in memory, or can it work with mostly-persisted data?

    With the default (localStorage) persistence, all data lives in memory, but apparently (again, not documented but I asked in the community chat) with other persistence backends it doesn’t need all data to live in memory.

  • Does it support e2e encryption?

    Yes! By default, data that’s added under your “user” sub-graph is cryptographically signed, but publicly readable. You can optionally encrypt the data & decrypt it as well with your user’s keypair. Depending on how you encrypt, it could prevent “merging” your objects however (for example, if you encrypt whole objects as in this example, the system can’t merge encrypted objects).

  • Is multi-user collaboration possible, where some users only have access to a subset of the data?

    Yes, but you’ll be implementing a lot of custom encryption / key management logic yourself. Out of the box, GUN gives you user auth, where each user has a public and private key which can be used for encryption. If user A wants to allow “read” access to user B, one way to do that is to get user B’s public key and use that to encrypt the data (see this remarkably in-depth cartoon explainer for more on that). However, if they then want to give read access to user C as well, they would need to re-encrypt it with user C’s public key. Alternatively, they could generate some shared secret, use the secret to encrypt the data, and then send the secret, encrypted with the corresponding public keys, to users B and C. As you can imagine, this gets complicated quickly, and requires a decent amount of specialized knowledge to get right.

  • Is collaborative text editing supported?

    Not currently. There are a number of proof-of-concept-level implementations, but nothing reusable that I could find.

  • Does it have the concept of “undo” built-in?

    No. Change history is not saved.

  • Does it support a fully p2p network setup?

    Yes, that’s what it’s designed for.
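
The enforced denormalization mentioned in the data-shape answer is easier to picture with a small example. The sketch below flattens a nested object into separate nodes connected by links, then resolves those links recursively, which is roughly the job the text describes for `.open()`. The `{ "#": id }` link shape is modeled on how Gun represents references, but the `flatten` and `open` functions and the path-based id scheme are my own assumptions, not Gun's API.

```javascript
// Hypothetical sketch of denormalizing nested objects into a flat graph.
// `flatten`, `open`, and the "parent/key" id scheme are illustrative assumptions.
function flatten(id, obj, graph = {}) {
  const node = {};
  for (const [key, value] of Object.entries(obj)) {
    if (value !== null && typeof value === "object") {
      if (Array.isArray(value)) {
        // Mirrors the "no arrays" restriction discussed above.
        throw new TypeError(`arrays are not supported (at ${id}/${key})`);
      }
      const childId = `${id}/${key}`;
      flatten(childId, value, graph);   // the nested object becomes its own node
      node[key] = { "#": childId };     // the parent keeps only a link to it
    } else {
      node[key] = value;
    }
  }
  graph[id] = node;
  return graph;
}

// Recursively follow links to rebuild the nested object.
// On a deep graph this touches every reachable node, which is why
// loading everything eagerly can get expensive. (No cycle handling here.)
function open(graph, id) {
  const out = {};
  for (const [key, value] of Object.entries(graph[id])) {
    const isLink = value !== null && typeof value === "object" && "#" in value;
    out[key] = isLink ? open(graph, value["#"]) : value;
  }
  return out;
}

const graph = flatten("user/alice", {
  name: "Alice",
  address: { city: "Berlin", geo: { lat: 52.52, lon: 13.4 } },
});
const rebuilt = open(graph, "user/alice");
```

In this sketch one nested object turns into three flat nodes, and reading `address` from the root node gives you a link rather than the data itself; that link is what forces the extra fetch per level.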

Production-ready

  • Is it being used in production?

    Yes! There are a number of gun-based applications being run in production, most notably notabug (a reddit clone that’s resistant to censorship/moderation) and Iris (a social networking / social media app). The discord bot claims “8M users”, although I had some trouble verifying that number. Archive.org is listed (part of their dweb initiative) but that doesn’t appear to be running anymore. dtube is also listed, but gun was removed as a dependency in March, and it doesn’t look like it was used in the source code since 2018 or so (apparently the feature they were using gun for didn’t get enough usage to justify the maintenance). HackerNoon is apparently also using gun in some capacity, but their site isn’t open source so I couldn’t verify that.

  • How well does it handle offline behavior?

    Quite well! The local instance only replicates the parts of the graph that you’ve accessed, so when you’re offline you can only read things you’ve already read. But you can update & add things without limit.

  • Does it correctly handle working on multiple tabs in the same browser session?

    Not when disconnected from a server (remote peer). When connected to a server, updates between tabs propagate correctly, as each tab is synchronizing independently with the server.

  • Does it bake in auth, or can you use an existing authentication setup?

    Auth is included as an integral part of the system. Some interesting aspects of the authentication system are that usernames are not guaranteed to be unique, and that there is no established recourse for lost passwords (see this FAQ page for some discussion of possible remedies, but there’s nothing established as far as I can tell). If you wanted to have a “private graph”, you could probably gate the synchronization with some custom authentication, but it’s clearly not something that’s been designed for.

  • How is the documentation?

    There is a lot of documentation, but there are some indications that it’s not complete (I came across several API methods that weren’t documented, but appear to be public, for example).

Other notes

Under the default configuration, a “server” (relay peer) doesn’t put any restrictions on what data it will replicate, meaning that any developer could point their app at your instance and store data on it. Some apps have added a layer of security on top of GUN – for example, here’s a community plugin that extends Gun to reject data that doesn’t match a given schema. This does go somewhat against the “vision” of Gun, however, where the idea is to have one universal connected graph instead of many small walled-gardens.
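
As a rough idea of what "reject data that doesn't match a given schema" could look like on a relay, here's a minimal sketch. It is not the community plugin's API: the `validateWrite` function, the key-prefix convention, and the schema format are all assumptions I've made up for illustration.

```javascript
// Hypothetical sketch of a relay-side gate that rejects writes not matching a schema.
// Not the community plugin's API; the schema format and function are assumptions.
const schema = {
  // key prefix -> allowed fields and their expected typeof
  "posts/": { title: "string", body: "string", votes: "number" },
};

function validateWrite(schema, key, node) {
  const prefix = Object.keys(schema).find((p) => key.startsWith(p));
  if (prefix === undefined) {
    return { ok: false, reason: `no schema for key "${key}"` };
  }
  const fields = schema[prefix];
  for (const [field, value] of Object.entries(node)) {
    // Own-property check so inherited names like "toString" are not accepted.
    if (!Object.prototype.hasOwnProperty.call(fields, field)) {
      return { ok: false, reason: `unknown field "${field}"` };
    }
    if (typeof value !== fields[field]) {
      return { ok: false, reason: `field "${field}" must be a ${fields[field]}` };
    }
  }
  // Partial updates pass: fields are type-checked but not required.
  return { ok: true };
}

const accepted = validateWrite(schema, "posts/1", { title: "Hi", votes: 3 });
const wrongType = validateWrite(schema, "posts/1", { title: 42 });
const foreignApp = validateWrite(schema, "someone-elses-app/1", { anything: "goes" });
```

The last case is the one that matters for the "any developer could point their app at your instance" problem: writes under keys your app doesn't know about simply get dropped.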

Conclusion

Gun is much more than a database; I don’t think it’s an exaggeration to say that Mark Nadal is working on a whole new internet, a global decentralized peer-to-peer graph that requires and creates a new and different paradigm. So, if you’re looking for a normal database, this might not be the one for you 😉. On the other hand, if you’re looking for a decentralized, peer-to-peer data management system, Gun is just about the only player that is achieving the vision, in production, and on a large scale.

The biggest caveat I’d give for working with Gun is: be prepared to implement a lot of things yourself, and bend your mind to a new distributed, p2p-graph paradigm. Gun provides an impressive set of underlying primitives, and there are lots of folks building very cool things with them, but, as with many “cutting edge” technologies, you’ll need to be prepared to get a few cuts, and get your hands dirty molding it into the application platform that your app needs.

Don’t let that put you off, though! You’ll find the community extremely welcoming and helpful in your journey – especially the “BDFL” Mark Nadal, who is enthusiastic, passionate about the project, and definitely in this for the long haul.

Please drop me a note on twitter if there’s anything I should add or correct!