It took a long time, but a fundamental change has happened with how we store data. In fact, it is the definition of a “paradigm shift”. If you’ll bear with me, let me give some background and I’ll explain why this is such a huge game-changer.
For the past few decades, since the 1980s, we’ve stored data in a Relational Database Management System (RDBMS) – like SQL Server, Oracle, DB2, MySQL, etc. This meant that the authority and responsibility for structure, data integrity, and referential integrity were put squarely on the database system. The user interface and any other business logic simply consumed the database. The “application”, though, was really the structure and contents of that database – and the developer code was somewhat of an afterthought on how to present it.
With an RDBMS, the database IS the “application”.
That is the world we’ve known our entire careers. The database is the authority and enforces the structure. The UI and middle tier are unstructured, but do have to abide by all of the rules and boundaries of the database.
When Windows-based user interfaces (UI) came around, there would sometimes be a deviation from this. Sometimes the developer would start by creating a pretty UI, and then figure out what the database should look like. Regardless, the “buck stopped” with the database. Every other part of the system worked around the database.
The vast majority of applications are written in one of two ways: starting with a UI and then figuring out the database, or starting with the database and then figuring out the UI.
Meanwhile, we met Bob Martin, Agile, and SOLID…
Although this programming model worked for several decades, it certainly wasn’t (and isn’t) ideal. Why? Because anyone can crank out v1.0 of an application – but by v3.0, the app has become enormously expensive to maintain, eventually collapses under its own weight, and then you re-write it. We’ve all seen this countless times and have accepted it as “normal”.
Imagine that same scenario applied to other engineering disciplines: imagine you sink money into building a bridge. After a few years, it starts getting weak and the architect comes back and says “this bridge doesn’t have much longer, we should tear it down and build a new one” – and you actually DID that, every several years with every bridge! It’s ridiculous – and having this as the model in software is equally ridiculous.
Your application should be getting better, easier, and cheaper to maintain with every new release!
Put another way, Bob Martin, Agile, and SOLID prescribed a FAR better way to write code. In short: write your “application” in code. Use unit tests to test it. Optionally add in a database and UI later, if you want. The “application”, though, should exist in code.
This represented, for the first time, a shift of responsibility from the database to the code. They suggest that CODE be the “application”. Before, the DATABASE was the “application”.
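To make “the application exists in code” a bit more concrete, here is a minimal sketch in Python. The `Order` class and its rules are hypothetical, made up purely for illustration; the point is only that the business rules live in code and are checked by unit tests, with no database or UI anywhere in sight.

```python
import unittest
from dataclasses import dataclass, field


@dataclass
class Order:
    """Hypothetical domain object: the business rules live here, not in a schema."""
    customer: str
    lines: list = field(default_factory=list)

    def add_line(self, product: str, quantity: int, unit_price: float) -> None:
        # The code, not a database constraint, decides what a valid line is.
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        if unit_price < 0:
            raise ValueError("unit_price must not be negative")
        self.lines.append((product, quantity, unit_price))

    def total(self) -> float:
        # Sum of quantity * unit price over every line on the order.
        return sum(qty * price for _, qty, price in self.lines)


class OrderTests(unittest.TestCase):
    """The unit tests exercise the application without any storage at all."""

    def test_total_sums_all_lines(self):
        order = Order("alice")
        order.add_line("widget", 2, 5.0)
        order.add_line("gadget", 1, 10.0)
        self.assertEqual(order.total(), 20.0)

    def test_rejects_non_positive_quantity(self):
        with self.assertRaises(ValueError):
            Order("alice").add_line("widget", 0, 5.0)
```

Saved as a module, the tests can be run with `python -m unittest`. Where the orders eventually get stored is a decision that can wait.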
Competing for Authority and Responsibility
When you apply this “new way” of writing applications to our old world of RDBMSs, you end up having to write a lot of extra code. The problem is, we want our “application” code to now be the authority for our data. We want to control the structure, the dependencies, the business logic, etc – but then we also have to store that data in a database which ALSO has its own rules on integrity, atomicity, etc.
This meant that if you wanted to write your application the new, better way, following SOLID, Agile, etc – you’d have to work around having two competing authorities: 1) your code and 2) the database schema.
Since we changed the model to having the “application” exist in code, we are now competing for authority and responsibility over the data, with the database.
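Here is a small sketch of what those two competing authorities look like in practice. It uses Python with SQLite (from the standard library) purely as a stand-in RDBMS; the `customer` table, its rules, and the function names are all hypothetical. Notice that the same rules end up written twice: once in the code and once in the schema.

```python
import sqlite3

MAX_NAME_LENGTH = 50


def validate_customer(name: str, email: str) -> None:
    """Authority #1: the application code states the rules."""
    if not name or len(name) > MAX_NAME_LENGTH:
        raise ValueError("name must be 1-50 characters")
    if "@" not in email:
        raise ValueError("email must contain '@'")


# Authority #2: the schema states (nearly) the same rules again.
SCHEMA = """
CREATE TABLE customer (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL CHECK (length(name) BETWEEN 1 AND 50),
    email TEXT NOT NULL UNIQUE CHECK (email LIKE '%@%')
)
"""


def save_customer(conn: sqlite3.Connection, name: str, email: str) -> int:
    """Validate in code, then insert; the database re-checks on its own."""
    validate_customer(name, email)
    cursor = conn.execute(
        "INSERT INTO customer (name, email) VALUES (?, ?)", (name, email)
    )
    return cursor.lastrowid


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    print(save_customer(conn, "Ada", "ada@example.com"))
```

If someone changes the maximum name length in one place and forgets the other, the two authorities disagree, and that is exactly the kind of extra work described above.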
What is NoSQL?
NoSQL is a family of highly scalable database platforms that work in a fundamentally different way than a traditional RDBMS. With an RDBMS, the responsibility for the structure and for references to other data was given to the database system. A NoSQL database has no such authority. There are no tables, no relationships, no schemas, no stored procedures, and no enforced structure.
“Well, that’s stupid – and it sounds like a horrible idea!” is what I thought, at first.
I initially thought that “Oh, this is just more of this dynamic and weak-typing junk that we see being all the rage lately.” It might be, but it is also a huge, huge benefit for at least 2 significant reasons:
- It is far easier to scale up and scale out than any RDBMS that I know of. Scaling an RDBMS out to handle millions of users is difficult and expensive; NoSQL systems are designed to handle that kind of load.
- It shifts the responsibility for the “application” somewhere upstream, because the database isn’t doing it anymore.
Wait a second, just a minute ago we were saying that if we wrote applications the better way (using SOLID, Agile, etc), the “application” would exist in code. The problem was that the “application” had all of the structure, referential integrity, input validation, etc – and it was competing with the RDBMS, which also wanted to play that role.
Well, now we have a database system that isn’t like that anymore. We have this system that simply stores whatever you give it, in whatever format you decide. It doesn’t care about the structure, the content, or anything. It basically has two general constructs: 1) a database and 2) a collection within a database – which is somewhat akin to a “table”, but doesn’t need to be. It’s simply a storage area that you access by name, and you can store whatever you want in it, without restriction or validation of any sort.
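As a rough illustration of that idea, here is a toy in-memory stand-in written in Python. To be clear, this is NOT MongoDB's API; the `DocumentStore` class and its `insert` and `find` methods are hypothetical names I made up for the sketch. It only shows the shape of the thing: named collections that accept any document at all.

```python
from collections import defaultdict


class DocumentStore:
    """Toy in-memory stand-in for a schemaless database (not a real product's API)."""

    def __init__(self):
        # Collections are created on first use; nothing is declared up front.
        self._collections = defaultdict(list)

    def insert(self, collection: str, document: dict) -> None:
        # No schema and no validation: whatever dict arrives is stored as-is.
        self._collections[collection].append(dict(document))

    def find(self, collection: str, **criteria) -> list:
        # Return the documents whose fields equal every given criterion.
        return [
            doc
            for doc in self._collections[collection]
            if all(doc.get(key) == value for key, value in criteria.items())
        ]


if __name__ == "__main__":
    store = DocumentStore()
    # Two documents with completely different shapes in the same collection.
    store.insert("stuff", {"name": "Ada", "age": 36})
    store.insert("stuff", {"title": "Invoice 7", "lines": [1, 2, 3]})
    print(store.find("stuff", name="Ada"))
```

The store never complains about the second document looking nothing like the first. Whether that makes sense is entirely up to the code that wrote it.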
MongoDB (one NoSQL product, amongst many)
I plan to do a handful of blog posts on how to set up and use MongoDB from the .NET platform. I chose MongoDB as that seems to be the most popular, by far – it’s well-documented, and it has a vibrant developer community.
Tying it all together
So what does all of this mean?
For the systems manager – it means a new product to learn. I’ve messed around with the MongoDB server a fair amount and even set up a replica set (for redundancy and failover), and it’s pretty straightforward, easy to script, and easy to manage.
For DBAs – it means a new product to learn. This does away with “data modelers” and with anyone who does anything with T-SQL. The database is now simply a data store that holds data in whatever structure the developer decides on. It is now just a warehouse that holds “stuff” – “stuff” that will make sense to the developer when he/she queries it or manipulates the “records”.
For developers – this is the exciting part. This means that you can finally write code the “right” way (using SOLID, Agile, etc) – and have a much more appropriate back-end for that style of application construction. It means far, far less database code – and it transitions the responsibility and the authority of the structure of the data, to your code.
Hopefully this helps make some sense of this. This is a fundamentally different way to look at a database because the database now has a fundamentally different role. Its role used to be the authority and enforcer of data and referential integrity. Now, with NoSQL, it’s simply a very loose mechanism to store data from the NEW authority, the application – which is code written by the developer. He/she now determines the structure and integrity of the data.
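To sketch what “the code is the authority” might look like, here is a small Python example. Everything in it is hypothetical (the `Customer`, `Invoice`, and `Application` names, and the plain dictionaries standing in for schemaless collections). The storage does no checking at all; the referential integrity rule is enforced by the code before anything is stored.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Customer:
    customer_id: str
    name: str


@dataclass(frozen=True)
class Invoice:
    invoice_id: str
    customer_id: str
    total: float


class Application:
    """Hypothetical application layer that owns structure and integrity."""

    def __init__(self):
        # Plain dicts stand in for schemaless collections that accept anything.
        self.customers = {}
        self.invoices = {}

    def add_customer(self, customer: Customer) -> None:
        if not customer.name.strip():
            raise ValueError("customer name is required")
        self.customers[customer.customer_id] = asdict(customer)

    def place_invoice(self, invoice: Invoice) -> None:
        # Referential integrity, enforced in code rather than by a foreign key.
        if invoice.customer_id not in self.customers:
            raise LookupError(f"unknown customer: {invoice.customer_id}")
        if invoice.total < 0:
            raise ValueError("total must not be negative")
        self.invoices[invoice.invoice_id] = asdict(invoice)


if __name__ == "__main__":
    app = Application()
    app.add_customer(Customer("c1", "Ada"))
    app.place_invoice(Invoice("i1", "c1", 25.0))
    print(app.invoices["i1"])
```

The trade-off is worth keeping in mind: nothing stops some other program from writing junk straight into the store, so every write has to go through the application for this to hold.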
There will be some new posts in the next few days to cover how to get this set up – and how to use it from .NET code.