On the 'eventually consistent' point, I just wanted to point out that MarkLogic does NOT use the 'eventually consistent' model. Instead, MarkLogic uses MVCC and is fully serializable when it comes to transactions. In addition, MarkLogic's indexes are updated within the transaction boundary, so as soon as the transaction commits, you know that both the journal and the indexes have been updated.
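To make the 'indexes updated within the transaction boundary' idea concrete, here is a rough toy sketch in Python. It is purely illustrative and says nothing about how MarkLogic is actually implemented: the ToyStore class, its method names and the integer timestamps are all made up for this example. The only point is that a document version and its index entries become visible at the same commit timestamp, and that a reader holding an older timestamp keeps seeing the older state.

```python
import threading


class ToyStore:
    """Toy in-memory store: versioned documents plus a word index.

    Hypothetical illustration only; not MarkLogic's implementation.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._timestamp = 0   # last committed timestamp
        self._versions = {}   # uri -> list of (commit_ts, text)
        self._index = {}      # word -> list of (commit_ts, uri)

    def snapshot(self):
        """Return the timestamp a reader should query at."""
        with self._lock:
            return self._timestamp

    def commit(self, writes):
        """Apply all writes and their index entries as one step."""
        with self._lock:
            ts = self._timestamp + 1
            for uri, text in writes.items():
                self._versions.setdefault(uri, []).append((ts, text))
                for word in set(text.lower().split()):
                    self._index.setdefault(word, []).append((ts, uri))
            # Publishing the new timestamp is the commit point: the
            # documents and their index entries become visible together.
            self._timestamp = ts
            return ts

    def read(self, uri, at):
        """Return the newest version of uri committed at or before `at`."""
        visible = [text for ts, text in self._versions.get(uri, []) if ts <= at]
        return visible[-1] if visible else None

    def search(self, word, at):
        """Return URIs whose version current at `at` contains the word."""
        word = word.lower()
        hits = set()
        for ts, uri in self._index.get(word, []):
            current = self.read(uri, at) or ""
            if ts <= at and word in current.lower().split():
                hits.add(uri)
        return sorted(hits)


store = ToyStore()
before = store.snapshot()
after = store.commit({"/docs/a.xml": "hello world"})
print(store.search("hello", before))  # reader at the old timestamp
print(store.search("hello", after))   # reader at the commit timestamp
```

A search at the commit timestamp finds the new document immediately, with no window in which the document exists but the index has not caught up.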
Combine this with MarkLogic's ability to save any arbitrarily complex search as an alert, and you can power many interesting mission-critical information-sharing applications.
I have a blog entry that describes this: http://adamfowlerml.wordpress.com/2013/11/25/marklogic-huh-what-is-it-good-for/
Also around Serializability: http://adamfowlerml.wordpress.com/2013/01/25/true-acid-compliance/
[Yes, I work for MarkLogic! I work here because it's an awesome piece of tech. ]
FoundationDB actually did a pretty good job of separating out the truth from the hype recently in their post about NoSQL and ACID Compliance: https://foundationdb.com/acid-claims#!
Doesn't mention neo4j specifically - don't think the article authors considered graph/triple stores in their article - but it's a good read nonetheless. The article mentions MarkLogic too. Added my comments on it here: http://adamfowlerml.wordpress.com/2013/11/17/foundationdb-joins-acid-transaction-crusade/
FWIW MarkLogic 7 also is a triple store (not graph store), including support for the W3C Graph Store protocol and SPARQL. http://www.marklogic.com/what-is-marklogic/marklogic-semantics/
[Yes, I do work for MarkLogic - and proudly so!]
As someone who has done databases for a long time, I have very little respect for NoSQL, but that is mostly because everyone keeps trumpeting it as a killer of traditional databases. There are scenarios where NoSQL systems are an ideal fit. However, NONE of those scenarios require data to be very reliably stored in a guaranteed and predictable way.
If you don't get your tweets or your friends' Facebook posts as soon as they are posted, no one will really care. But for something as truly important as health insurance coverage? Are you f__king kidding me? And that's just from a reliability standpoint. Never mind the fact that NoSQL is currently at the wild west stage, where nobody is compatible with anybody else and there is nothing resembling a standard set of APIs between products, which makes it very difficult to develop expertise.
I certainly get where you're coming from with this, but a couple of points to address your concerns:
MarkLogic has been around since 2001, with features needed for enterprise deployments (ACID, HA/DR, security) baked in from the start; it's not just the new hawtness.
NoSQL systems aren't just for lots of data; they are also for data with lots of variety. In those cases, a schemaless approach allows for much more rapid application development. And yes, many of those scenarios do require reliable storage. MarkLogic does ACID transactions. The MarkLogic customer page shows some of the customers who have found this helpful.
Regarding interoperability, MarkLogic has a REST API and a Java API, in addition to the ability to work with XQuery. Custom HTTP service endpoints can be built to fit into an existing environment. It still requires some learning -- a document store is different from relational, so that's unavoidable -- but with a little training developers can use a lot of their current skills.
As many have said, XML is here, there and everywhere, so we need strategies to manage it. Some people use it purely as a data interchange format, which is fine. Others want to manage documents using an XML representation, which is fine too.
MarkLogic takes an XML document and stores a highly compressed binary tree representation of it. It doesn't store 'raw text', or even strings. This makes it storage efficient. MarkLogic also indexes the entire document's structure, values, words and phrases (and stems thereof) for search. The database and the search engine use the same indexes, rather than a set of indexes each.
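As a rough illustration of 'one set of indexes for both database and search', here is a toy Python sketch. It is not MarkLogic's index format: the key shapes ("element", "value", "word"), the function names and the sample documents are all assumptions made for this example, and it skips stemming, phrases and compression entirely. It only shows structure, values and words landing in a single index that one query can combine.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def index_document(uri, xml_text, index):
    """Add one document's element names, values and words to a shared index."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Only an element's leading text is indexed; tail text is ignored
        # to keep the sketch short.
        text = (elem.text or "").strip()
        index[("element", elem.tag)].add(uri)            # structure
        if text:
            index[("value", elem.tag, text)].add(uri)     # element value
            for word in text.lower().split():
                index[("word", word)].add(uri)            # full-text terms


def lookup(index, *keys):
    """Intersect the posting sets for every key: one index, any mix of terms."""
    sets = [index.get(key, set()) for key in keys]
    return sorted(set.intersection(*sets)) if sets else []


index = defaultdict(set)
index_document(
    "/a.xml",
    "<order><status>shipped</status><note>Fragile glass items</note></order>",
    index,
)
index_document("/b.xml", "<memo><note>glass half full</note></memo>", index)

# A word query and a structural constraint resolved against the same index.
print(lookup(index, ("word", "glass"), ("element", "order")))
```

Because the word term and the structural term live in the same index, 'documents containing glass that are orders' is a single set intersection rather than a join between a database and a separate search engine.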
Special range indexes can be created too, for example on a geospatial co-ordinate pair appearing anywhere in the document. As the point's XML element could conceivably be anywhere in the document, creating a traditional schema for this is nigh on impossible. Storing the document itself, with indexes alongside it, is actually quite an elegant solution.
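Here is a small Python sketch of the 'point anywhere in the document' idea. Everything in it is an assumption for illustration: the element name point, the "lat,lon" text format, the function names and the sample documents are invented, and a real range index would keep its values sorted rather than scanning a list. It just shows that the index is keyed on the element, not on a fixed position in a schema.

```python
import xml.etree.ElementTree as ET


def extract_points(uri, xml_text, element_name="point"):
    """Yield (lat, lon, uri) for every matching element at any depth."""
    root = ET.fromstring(xml_text)
    for elem in root.iter(element_name):
        lat, lon = (float(part) for part in elem.text.split(","))
        yield (lat, lon, uri)


def in_box(entries, south, west, north, east):
    """Return URIs with at least one point inside the bounding box."""
    return sorted({uri for lat, lon, uri in entries
                   if south <= lat <= north and west <= lon <= east})


# Two documents with completely different shapes; the point element sits
# at a different depth in each one.
docs = {
    "/report.xml": "<report><body><sighting><point>51.5,-0.12</point>"
                   "</sighting></body></report>",
    "/note.xml": "<note><point>40.7,-74.0</point></note>",
}
entries = [entry for uri, text in docs.items()
           for entry in extract_points(uri, text)]

print(in_box(entries, south=50, west=-1, north=52, east=1))
```

Neither document had to conform to a shared schema for the query to work, which is the property the paragraph above is describing.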
I would agree with your statement if it were 'XML soo sucks for traditional relational db applications', because those RDBMSs require shredding/rebuilding of the documents, or store the XML as 'dumb' column types. Thus you lose the flexibility of the XML. MarkLogic takes advantage of the fact that the data is XML anyway.
Why not use a document database with built-in support for XML to store data that is XML? Seems pretty straightforward to me.
More details here: http://adamfowlerml.wordpress.com/2013/11/25/marklogic-huh-what-is-it-good-for/
MarkLogic has scheduled backups. You can also do a backup through the Admin UI, which is mentioned pretty early in the Administrator's Guide.
Why would anyone design such a product? Why would anyone buy such a product?
Because for many kinds of data, once I became familiar with MarkLogic I found I could build complex applications more rapidly than I could with relational technologies. It scales out, it's schema agnostic, and it's ACID compliant. It has HA/DR features and a robust security model. Customers buy it because it solves their problems. For anyone interested in learning a bit about MarkLogic, the Enterprise NoSQL page is a good place to start. Disclaimer: MarkLogic employee.