Wednesday, May 27, 2015

Is it 'NoSQL' or 'Not Only SQL'?


It took a compelling moment to make me answer for myself what NoSQL really is. While I was attending a training session on 'Cassandra Performance Tuning', it was mentioned that Cassandra is a NoSQL database. Despite a fair amount of experience with NoSQL databases such as MongoDB (document), Neo4j (graph) and InfluxDB (time series), I had never given the NoSQL concept serious thought. With CQL, which looks very much like SQL, Cassandra feels just like any SQL (relational) database. That made me wonder what NoSQL really is.

On serious reflection, we find that it is never 'No SQL' but 'Not only SQL'. On closer consideration, though, it is not about SQL at all. It is about data modeling, and more broadly about the principles behind how the data is handled. A normal SQL (relational) database models data as tables, which lets it provide ACID transactions and JOINs over normalized data. Moreover, because of the tabular model, the structured query language can select individual columns of a table. A NoSQL database, by contrast, is usually a distributed implementation that gives up some consistency under the CAP theorem and sacrifices ACID guarantees. The main difference I noticed is that a NoSQL query typically filters only rows (whole records) in the big data, not columns. That is why people generally draw the distinction between relational and non-relational databases rather than between SQL and NoSQL. However, there are also distributed databases that keep ACID and JOIN support while scaling out; these are usually called NewSQL databases.
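To make the modeling difference concrete, here is a minimal sketch using only Python's standard library. The `users` and `orders` tables, the sample names and the `documents` dictionary are all made up for illustration; the dictionary is only a stand-in for a document store, not the API of any real NoSQL product. The relational side keeps the data normalized and assembles the answer with a JOIN, while the non-relational side stores the answer pre-assembled under a single key.

```python
import sqlite3

# Relational side: two normalized tables, combined at query time with a JOIN.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        item TEXT
    );
    INSERT INTO users VALUES (1, 'asha'), (2, 'ravi');
    INSERT INTO orders VALUES (10, 1, 'book'), (11, 1, 'pen'), (12, 2, 'lamp');
    """
)
# The query picks specific columns (name, item) and joins the two tables.
relational_rows = conn.execute(
    "SELECT u.name, o.item FROM users u "
    "JOIN orders o ON o.user_id = u.id "
    "WHERE u.id = ? ORDER BY o.id",
    (1,),
).fetchall()
conn.close()

# Non-relational side (hypothetical stand-in for a document store):
# the orders are embedded in the user record, so one key lookup
# returns the whole record and no JOIN is needed.
documents = {
    1: {"name": "asha", "orders": [{"item": "book"}, {"item": "pen"}]},
    2: {"name": "ravi", "orders": [{"item": "lamp"}]},
}
doc = documents[1]
document_rows = [(doc["name"], order["item"]) for order in doc["orders"]]

print(relational_rows)
print(document_rows)
```

Both sides produce the same pairs of name and item. The difference is where the work happens: the relational database assembles the result at read time, while the denormalized record has to be shaped for this access pattern when it is written.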

There are many kinds of NoSQL (non-relational) databases, distinguished by their data model. Each has its own use cases, and a detailed analysis of them is out of scope here, but a few examples are Column (Cassandra), Document (MongoDB, CouchDB), Key-Value (Redis), Graph (Neo4j) and Multi-model (OrientDB).
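As a rough illustration of how these data models differ, the sketch below shapes similar information in four ways using plain Python structures. Every key, field name and value here is hypothetical, and the structures only imitate the shape of each model; they are not the storage format or API of Cassandra, MongoDB, Redis or Neo4j.

```python
import json

# Key-value: the store sees only a key and an opaque value.
key_value = {"user:1": json.dumps({"name": "asha", "city": "Kochi"})}

# Document: the store understands the nested structure of the value.
document = {"_id": 1, "name": "asha", "address": {"city": "Kochi"}}

# Wide column: partition key -> ordered clustering key -> columns.
wide_column = {
    "asha": {
        "2015-05-27": {"event": "login"},
        "2015-05-28": {"event": "logout"},
    }
}

# Graph: nodes plus explicitly stored relationships (edges).
nodes = {1: {"name": "asha"}, 2: {"name": "ravi"}}
edges = [(1, "FOLLOWS", 2)]

# Each model answers a different kind of question cheaply.
city_from_kv = json.loads(key_value["user:1"])["city"]   # decode the whole value
city_from_doc = document["address"]["city"]              # reach into a nested field
events = [row["event"] for _, row in sorted(wide_column["asha"].items())]
followed = [nodes[dst]["name"] for src, rel, dst in edges
            if src == 1 and rel == "FOLLOWS"]

print(city_from_kv, city_from_doc, events, followed)
```

The point of the sketch is that the access pattern is built into the structure itself, which is why choosing among these models depends on the use case.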

In a nutshell, databases are categorized not by the query language they use but by how they model data. A distributed database that keeps ACID guarantees is usually called NewSQL; one that gives them up is called NoSQL. In relational data modeling, the data storage is a black box to the user: the internals of the structure are concealed behind the interface of a structured query language. A non-relational database, on the other hand, exposes its data structure to the user, which provides flexibility in the interface and in scaling. Beware! Freedom always comes at a price. The developers (users) get more power, but with it comes more responsibility (the need to know the internals) and a long learning curve!

