The NoSQL "Family Tree"

By Sam Bisbee

A few weeks back, one of our marketing teammates caught me explaining the NoSQL product landscape to some new employees, and they thought it would make a pretty infographic. I use this diagram a lot to help customers and business partners understand some important NoSQL basics:

NoSQL arose from "Big Data" (before it was called "Big Data")

During the late 1990s and 2000s, Google, Amazon, and Facebook were growing through the roof. No commercial or open source database could support their growth, either in scale (data volume and number of connections) or in the variety of data structures they processed (web logs, product catalogs, full-text, etc.). So they invented their own and, thankfully, wrote about their successes so that others could build on their work.

As you can see in the diagram, people used these ideas in different ways to create many of today’s popular NoSQL databases. For example, Apache CouchDB™ borrows from Google's MapReduce white paper, and Cloudant borrows from Apache CouchDB and Amazon's Dynamo white paper (among other things). Others, such as MongoDB, sprang up independently of the big web thought leaders.

NoSQL is not "One Size Fits All"

The color coding in the diagram highlights the fact that NoSQL products evolved to meet specialized workloads. They essentially divide into analytic solutions, such as Hadoop and Cassandra, and more operational databases, such as CouchDB, MongoDB, and Riak. Analytic solutions are very good at running ad-hoc queries in business intelligence and data warehousing apps. Operational databases excel at handling high numbers of concurrent user transactions.

That's not to say these solutions aren’t used for multiple purposes. One of our customers, Novartis, described using Cloudant in a data warehousing application. Another example is Cassandra, which has typically blurred the line between operational and data warehouse use cases, often leading to uncomfortable fits.

Vendor-driven versus Community-driven NoSQL

This is the last distinction I’d like to make. Projects like Apache Hadoop, Apache Cassandra, and Apache CouchDB are developed by a community of both individuals and vendors, which requires symbiotic relationships. The projects are sustained, supported, and enhanced collaboratively. I prefer these projects because they are less exposed to the product roadmap and licensing whims that come with single-vendor-backed projects.

In Summary

Hopefully this will help those new to NoSQL understand the playing field a bit better. There are many other NoSQL products, and NewSQL products, not pictured here. I only included the ones I hear about most often. If you’re looking for additional information on the NoSQL landscape, here are some resources I recommend:

"Apache", "Apache CouchDB" and "CouchDB" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
