In the past, developing web applications meant using SQL. For those relying on relational data, this was no problem. But for those with massive amounts of data, this was like steering a barge – a bulky solution, creating drag-on queries when the intended goal was speed and availability. As Whiteboard looks at the architecture of a site, we have many options from which to choose.
With the oppression of limited Databases came a rebellion, and with that rebellion came a movement. In this case, the NoSQL movement, which arrived with a myriad of motivated programmers caused a pendulum swung that cranked out new opportunities…most of them open-source.
Those opportunities have clever names, and were created by a host of wild enthusiasts to handle a huge quantity of data, especially when the data’s nature does not require a relational model. They are Mongo, Cassandra, Riak, Redis, Couch and Neo4J to name a few.
Cassandra (Apache Cassandra), for example, is a NoSQL solution that was initially developed by the people of Facebook as a hybrid database management system that allows for tunable consistency goals. This means that a query may provide different results from different angles, but it is widely available to users – and fast.
Which brings us to CAP Theorem:
In computer science, the CAP Theorem says that it is impossible for a distributed computer system to achieve these three guarantees at once:
A. Consistency (C) – all nodes and queries see the exact same data at the same time.
B. Availability (A)– 100% uptime.
C. Partition tolerance (P) – the system keeps going even when message loss occurs in part of the system.
To try all three would be like placing child seats in a race car, which of course is built for speed, not a daily shopping trip. To try, you would need to dial down your speed, therefore defeating the purpose of having a race car.
Cap Theorem suggests that to gain A, one may need to sacrifice C. To gain C, one may need to sacrifice A and so on…
- SQL allows C and P, but decreases A.
- Riak focuses on C
- Mongo gives A and P while, some say, decreases C.
These are debatable assertions, and often dependent on the programmer who is turning the knobs. But even Birmingham’s own MongoDB claims weakness, as its focus is on flexibility, power, speed, and ease of use, while sometimes sacrificing “fine-grained control and tuning, and overly powerful functionality.” Still, it is the rock star of the NoSQL movement and is now being used by SquareSpace, Craig’s List and MTV.
We often use CouchDB, at Whiteboard-IT. Jacob Kaplan-Moss, author of “The Definitive Guide to Django,” claimed here,“Django may be built for the Web, but CouchDB is built of the Web.” As the web is our native environment, CouchDB is the most natural tool for us to use.
The NoSQL movement is has great momentum, though it has earlier roots. Lotus Notes, for example, was forced to write their own database in 1985, which they called NSF (Notes Storage File). Founding member and former CEO, Tim Halvorsen was NoSQL when NoSQL wasn’t cool. He says,
“…we created it from scratch. At the time, I looked at some of the databases out there (e.g. dBase, etc), and they were all too limited for what we needed. So, we wrote our own. Its a “document database”, not a relational database, with each “document” (aka record) having a variable number of fields. No schema – each record was self-contained, but they could also be indexed (which any database must be capable of).”
History was made and even CouchDB is based on the work accomplished by Lotus Notes.
So – there are many options from which to choose, and if your web designer goes to SQL straight away, it might give reason to ask if others have been considered. Depending on your requirements, you may have another need…the need for speed.