
Despite having used document-oriented databases for many years (largely because they were shoved down my throat and I inherited someone else's architecture), I never really managed to figure out why people find them so compelling. There has been a shift in the last two years and people have started running away from them. The web-dev crowd in particular adored them, and I guess it's easy to fetch a document in the exact structure you need, but sooner or later you inevitably reach the point where you have to analyze data. And here Mongo (and all the similar alternatives) becomes the biggest pain in the a...neck you can think of. Couchbase tried to tackle this issue with N1QL to a certain degree, but at large scale it is still not particularly useful. To my mind, a relational database with a good architecture can't be matched by any document-oriented database. But getting a large system/database right does take more effort. There are numerous ways to make relational databases incredibly scalable but again, it takes a lot more effort.


There was a time where adding a column to a database was a really big deal. You had to get it past the DBA, and there were real resource constraints on the database system. With a document store the schema is entirely in the hands of the developer.

Also, JSON became the standard way to ship data around, and the RDBMSs of the time couldn't really handle JSON. So you either wrote a bunch of code to map complex nested JSON to relational tables, or just dumped it into an unindexable text column.

There was vendor hype, just like there was around Object databases in the pre-internet days.

If you were starting a new project you needed to decide whether you were going to use a document store and an RDBMS, or just one or the other. If it was just one, you would choose a document store if you anticipated needing to handle a lot of unstructured data.

Today the situation is reversed. A document store only does documents well. A good hybrid database like Postgres gives you the best of both worlds. Throw in hosted database services and resource constraints are much less of an issue. So people aren't running back to an old-school RDBMS. They are moving to a much superior and evolved data store.


I fairly recently _really_ started to understand how important historical reasoning and understanding is in the context of software, technology and science. Your comment is a great example. Tech developments, choices, trends and so on only really make sense in the context of history. And often we forget about history, start to reinvent things or even steer into a completely useless direction because we don't apply temporal reasoning or simply don't learn from the past.

Another benefit of this kind of approach shows up when you start learning a challenging subject. Say you want to deepen your knowledge in a branch of mathematics that you find interesting and useful. The history of that branch will tell you so much more than a typical lecture-style conglomerate of concepts. It provides a great overview of important actors, their relationships, cause and effect of discoveries, the culture, the problems and so on. On top of that, it is easier to remember and internalize concepts if you know the story behind them.


That's only partially true. With document schemas, you simply eliminate the DBA, since whatever you put in there is entirely up to you. In all fairness, I've never dealt with DBAs. I've always managed to get technological freedom and been able to design and organize my databases in whichever way I see fit. I'd generally hate to have to ask someone to clone a table for me or whatever.

JSON is the standard way to ship data around the internet, yes. Though gRPC is catching up, and more and more often I see people relying on gRPC in their architecture. And gRPC is conceptually a lot closer to an RDBMS, given that you have a code generation step and everything in your data needs to be defined (aka statically typed).

Recently I started several personal projects, and though I struggle to find time and motivation to work on them on my own, document-oriented databases are completely out of the question. Postgres, and potentially Redis as a proxy for heavy loads, and that's that. I wouldn't call Postgres a hybrid database. It does support JSON data types natively, but at its core it is the definition of what an RDBMS is. The best example of a hybrid database (from a developer's perspective, since it isn't open source and I do not work for Google in any shape or form) is Spanner.


The DBA problem you're describing is not in the database system, it's in the DBA. You can have the same freedom with a relational database. Or, for some reason, you can also put a DBA between you and Mongo, who won't let you change the schema of your JSON documents (you do have that schema somewhere, it's just not managed by Mongo).

I've worked on plenty of both SQL and Mongo projects, and honestly the process around schema migration is pretty much the same. Just for Mongo you write it in the code instead of SQL.
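To illustrate what "writing it in the code" can look like, here is a minimal sketch of an application-side migration. Everything in it is an assumption for illustration: the field names (`schema_version`, `name`, `first_name`, `last_name`) are hypothetical, and plain dicts stand in for documents so no database driver is involved.

```python
def upgrade_user(doc):
    """Upgrade a user document to the latest (hypothetical) schema version."""
    doc = dict(doc)  # work on a copy, leave the caller's document untouched
    version = doc.get("schema_version", 1)  # documents without a version are v1

    if version < 2:
        # Hypothetical change: v1 stored one "name" field, v2 splits it in two.
        first, _, last = doc.pop("name", "").partition(" ")
        doc["first_name"] = first
        doc["last_name"] = last
        version = 2

    doc["schema_version"] = version
    return doc


old = {"name": "Ada Lovelace"}
new = upgrade_user(old)
print(new)
```

The function can be run on every read (lazy migration) or over the whole collection in a batch job. Either way it plays the same role an `ALTER TABLE` plus `UPDATE` script would play on the SQL side.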


> There was a time where adding a column to a database was a really big deal. You had to get it past the DBA, and there were real resource constraints on the database system. With a document store the schema is entirely in the hands of the developer.

That time is still here if you're running enough read nodes and QPS.


> why people find them so compelling

My theory is that it's easy to add a field by adding logic to the app instead of munging table relationships. It moves the logic to where developers are more comfortable. Scalability/etc. is irrelevant for most use cases anyway.


> Scalability/etc is irrelevant for most use cases anyway.

I literally can't parse what you mean by this


Most apps don’t get a lot of traffic


I'd guess that for 95% of what we use databases for, the actual performance of the database is irrelevant.


It's amazing for three things: search, logging and draft records.

Search: MongoDB can do an $all query, which is hard to replicate at the SQL level without aggregation. However, I'm still waiting for aggregate-level $elemAt.

Logging: you can attach anything to a property, and it'll be queryable.

Draft records: it's easy to just keep inserting records because it's schema-less. Validate during creation and validate again during publishing or approval. It's queryable and you can use a generic collection for that.

For logging and draft records, a SQL JSON field may be able to handle them, though I don't know how good it is at querying.
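To make the $all point concrete, here is a minimal sketch of the relational version, assuming a hypothetical `article_tags(article_id, tag)` table in an in-memory SQLite database. In MongoDB the filter would be along the lines of `{tags: {$all: ["db", "nosql"]}}`. In SQL, the same "has every one of these tags" question typically ends up as a GROUP BY with a HAVING count, which is the aggregation mentioned above.

```python
import sqlite3


def articles_with_all_tags(conn, tags):
    """Return ids of articles that carry every tag in `tags` (hypothetical schema)."""
    tags = list(dict.fromkeys(tags))  # de-duplicate so the count comparison is correct
    if not tags:
        return []
    placeholders = ",".join("?" for _ in tags)
    sql = f"""
        SELECT article_id
        FROM article_tags
        WHERE tag IN ({placeholders})
        GROUP BY article_id
        HAVING COUNT(DISTINCT tag) = ?
        ORDER BY article_id
    """
    return [row[0] for row in conn.execute(sql, (*tags, len(tags)))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article_tags (article_id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO article_tags VALUES (?, ?)",
    [(1, "db"), (1, "nosql"), (2, "db"), (3, "nosql"), (3, "db"), (3, "json")],
)
print(articles_with_all_tags(conn, ["db", "nosql"]))
```

Article 2 is filtered out because it only matches one of the two requested tags, so its distinct-tag count falls short.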


4 things, you missed _crazy_ fast analytics.


I have found myself enjoying using a document database as the online store, and then using a 'big data solution' (we use Presto) for any analytics queries later.

Traditional migrations for relational databases are really painful. Document databases make this much easier, and if you've faced the operational pain of needing to migrate a large database (for example, it's so easy to accidentally lock an entire table in Postgres), you might be pretty compelled.

(That said, I think the pendulum is swinging back away from document databases. So you're in luck ;))


How do you normalize your JSON structure so it can be queried? Do you enforce a schema on your JSON, or do you morph it on export into a common structure?

If you enforce a schema on the JSON structure, how do you handle changes on the live system?


I think the parent is arguing that it's hard to do useful big-data analysis on highly nested structures of data. When your database is storing some immense JSON blob, it's hard to write SQL against it.
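One common workaround is to flatten documents into dotted-path columns on export, so an analytics engine sees ordinary flat rows. The following is only a sketch under that assumption: the `flatten` helper and the sample document are hypothetical and not tied to any particular database or export tool.

```python
def flatten(doc, prefix=""):
    """Flatten nested dicts/lists into one dict keyed by dotted paths."""
    flat = {}
    # Lists are treated like dicts whose keys are the element positions.
    items = doc.items() if isinstance(doc, dict) else enumerate(doc)
    for key, value in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, (dict, list)):
            flat.update(flatten(value, path))  # recurse into nested containers
        else:
            flat[path] = value
    return flat


event = {"user": {"id": 7, "tags": ["a", "b"]}, "ok": True}
print(flatten(event))
```

Note that empty nested dicts and lists simply disappear in this version, and arrays of varying length produce a varying set of columns, which is a large part of why nested data is awkward to analyze with SQL.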


I have found 2 use cases, one of which I've never actually seen in the wild.

The most common use case is, "I need to store data where the schema is unknown or can change without notice, and have my shit not break." This is what we used Mongo for.

The other use case I could see (and this is pretty much only with Dynamo) is, "I want to build an application that's cross-region native. Most of my data is relatively static, so I accept eventual consistency on changes. I will have a separate data store for transactional data and data that cannot be eventually consistent." I want to build this project, but it will never happen because it's too easy to just start with an RDBMS in a single region.


> Despite having used document-oriented databases for many years (largely because they were shoved down my throat and I inherited someone else's architecture), I never really managed to figure out why people find them so compelling.

Well, filesystems are pretty good. It's the only document store I use (and mostly enjoy).

But then you look at the trade-offs with something like Maildir, and you really start to wonder whether this schemaless document store thing is so great.

I suppose the real shame is that proper object DBs like ZODB or GemStone get much less attention. They too have big trade-offs, but I feel they at least give something back in terms of consistency and simplicity.


> I never really managed to figure out why people find them so compelling.

This might sound jaded but my feeling is that a lot of developers just looked at JSON objects that they were already working with and thought to themselves "actually, it would be cool to just store this directly".

Which, in itself, isn't a bad idea but writing a completely new solution from scratch to a problem that's been solved for decades seems a bit like hubris.

AFAIK many relational databases support JSON today, so I'm not sure what the argument would be to choose something like MongoDB today from scratch if you had the choice of anything.


A half-table-oriented, half-document-oriented solution is just odd. I’ve worked with JSONB in Postgres. There are many reasons to dislike Mongo, but “you can do that with PostgreSQL” isn’t one of them.

Querying nested values is nothing like Rethink or Mongo.

Keys could be rows or keys.

You’re still having to make one to many relationships for something that should just be an array.


We are using mongo specifically because it makes it easy to do analytics on large datasets quickly.



