I was reading this piece from Martin Kleppmann asking us to stop using the CAP theorem to talk about distributed systems. It’s of interest to me because we often see the theorem applied to databases, and I’ve seen database platforms described as fitting along one of the sides of the CAP triangle.
The complaint is good, although tough, reading. As I read through it, a number of the concepts, tradeoffs, and concerns were similar to what I see discussed in Azure systems, or any “cloud-like” distributed systems. We expect consistency and availability, as well as scaling through partitions. However, it seems that we are always trading consistency (which Kleppmann notes is really linearizability) against availability in the data world. We simply can’t guarantee both at once.
Or can we? At some point, perfection isn’t required, and we can tolerate some level of inconsistency in the results clients might receive. I’m sure many of you have dealt with this in the past, and perhaps even gone so far as to add the report execution time to outputs to reduce client complaints. If two sets of data are compiled at different times, most clients tend to understand some discrepancies.
Certainly many of us are starting to consider using database platforms that might not work like SQL Server. In those cases, availability and scale are often touted as the reasons to abandon an RDBMS. On one hand, the lack of linearizability across nodes is often tolerable, and many of our businesses aren’t affected dramatically by it. On the other hand, if you can tolerate some delays in nodes, then perhaps SQL Server can work for you with multiple nodes in a replicated or AlwaysOn scenario.
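To make that trade-off a little more concrete, here is a minimal Python sketch of routing a read based on how much staleness a client says it can tolerate. Everything in it is hypothetical and for illustration only: the `Node` class, the `lag_seconds` field, and the `choose_read_node` function are names I made up, not part of SQL Server or any driver, and a real system would have to measure replica lag itself.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Node:
    """Hypothetical description of one database node."""
    name: str
    is_primary: bool
    lag_seconds: float  # assumed: how far this node trails the primary


def choose_read_node(nodes: Sequence[Node], max_staleness_seconds: float) -> Node:
    """Pick a node to read from, given the staleness the client will accept."""
    # Prefer a secondary whose lag is within the client's tolerance,
    # which keeps read load off the primary.
    fresh_enough = [
        n for n in nodes
        if not n.is_primary and n.lag_seconds <= max_staleness_seconds
    ]
    if fresh_enough:
        return min(fresh_enough, key=lambda n: n.lag_seconds)

    # Otherwise only the primary can satisfy the request.
    for n in nodes:
        if n.is_primary:
            return n

    # No primary and no acceptable replica: refuse rather than return
    # data that is staler than the client asked for.
    raise RuntimeError("no node can satisfy the requested staleness bound")
```

With a primary, a replica 2.5 seconds behind, and a replica 40 seconds behind, a client that tolerates 5 seconds of staleness is sent to the first replica, while a client that tolerates none is sent to the primary. The final `raise` is where the choice shows up: if the primary is unreachable, this sketch gives up availability instead of loosening the bound, and a different business might reasonably choose the opposite.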
Distributed systems are hard, and certainly your decision shouldn’t be a simple one when it comes to choosing an architecture. However, it helps to think deeply about the problem in terms of not only the CAP theorem, but also practical measures such as latency and the failures you are actually likely to see.