dmcassel - Slashdot User

Comment Re:MarkLogic's Pitch (Score 1) 334

by matt turner on Monday November 25, 2013 @10:41AM (#45514287) Attached to: NYT: Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice

When faced with a very complicated software project, use what's been proved to work -> MarkLogic has been proven to work - it's behind some of the largest and most complicated databases of information out there.

Comment Re:NIH syndrome (Score 1) 334

by matt turner on Monday November 25, 2013 @10:37AM (#45514247) Attached to: NYT: Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice

If you are equating NoSQL with unreliable, non-transactional systems with no security, then I can see your point. But as others have pointed out, MarkLogic has ACID, scaling, flexibility and security, and is pretty well proven to be stable and scalable. So now it comes down to the data model -> complex, ever-changing data is a hard task for relational. You have to create the perfect model first and then transform all the data into that model. Several years later, when that is done, you can start again because the data has all changed. NoSQL brings schema flexibility, and with that the whole process is changed -> you can load the data first, execute mappings and normalizations as needed, and get up and running with reliable access much, much faster. You still need a data model, you still need to understand the data, but the whole process takes much less time and, assuming you are using MarkLogic, is as reliable as any RDBMS.
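To make "load the data first, execute mappings and normalizations as needed" concrete, here is a minimal Python sketch. It is a toy, not a MarkLogic API: the in-memory `store`, the field names and the `FIELD_ALIASES` table are all made up for illustration.

```python
# Toy illustration of "load first, normalize as needed".
# Everything here (store, field names, alias table) is hypothetical.
import json

store = {}  # uri -> document, kept exactly as it arrived


def load(uri, doc):
    """Ingest a document without forcing it into a fixed schema."""
    store[uri] = doc


# Mappings are added later, one feed at a time, as each source is understood.
FIELD_ALIASES = {
    "last_name": ["last_name", "surname", "lname"],
    "birth_date": ["birth_date", "dob"],
}


def normalized(doc):
    """Return a view with canonical field names; unmapped fields pass through."""
    view = dict(doc)
    for canonical, aliases in FIELD_ALIASES.items():
        for alias in aliases:
            if alias in doc:
                view[canonical] = doc[alias]
                break
    return view


# Two feeds with different shapes are both loaded as-is.
load("/feed-a/1.json", {"surname": "Smith", "dob": "1970-01-01"})
load("/feed-b/7.json", {"lname": "Jones", "plan": "silver"})

views = {uri: normalized(doc) for uri, doc in store.items()}
print(json.dumps(views, indent=2))
```

The point of the sketch is the ordering: both feeds are queryable the moment they are loaded, and a new alias can be added to the mapping later without reloading anything.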

Comment Re:Blow to NoSQL movement (Score 1) 334

by Adam Fowler on Monday November 25, 2013 @10:31AM (#45514181) Attached to: NYT: Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice

On the 'eventually consistent' point, just wanted to point out that MarkLogic does NOT use the 'eventually consistent' model. Instead MarkLogic uses MVCC and is fully serializable when it comes to transactions. In addition, MarkLogic's indexes are updated within the transaction boundary, so as soon as the transaction commits, you know the journal has been updated and the indexes are updated too.
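For anyone unfamiliar with the idea, here is a toy Python sketch of MVCC with index entries written under the same commit timestamp as the document. It is only an illustration of the concept under simplifying assumptions (single-threaded, in-memory, no conflict detection or journal); the class and method names are made up and say nothing about MarkLogic's internals.

```python
# Toy MVCC store: readers see a consistent snapshot, and index entries
# become visible at the same timestamp as the document version they describe.
# Hypothetical names throughout; not how any real product is implemented.

class MVCCStore:
    def __init__(self):
        self.clock = 0
        self.versions = {}    # uri -> list of (commit_ts, text), oldest first
        self.word_index = {}  # word -> list of (commit_ts, uri)

    def commit(self, uri, text):
        """Write a new version and its index entries under one timestamp."""
        ts = self.clock + 1
        self.versions.setdefault(uri, []).append((ts, text))
        for word in set(text.lower().split()):
            self.word_index.setdefault(word, []).append((ts, uri))
        self.clock = ts  # document and index entries become visible together
        return ts

    def snapshot(self):
        """A reader's snapshot is simply the latest committed timestamp."""
        return self.clock

    def _visible(self, uri, snap):
        """Newest (ts, text) for uri with ts <= snap, or None."""
        older = [v for v in self.versions.get(uri, []) if v[0] <= snap]
        return older[-1] if older else None

    def read(self, uri, snap):
        version = self._visible(uri, snap)
        return version[1] if version else None

    def search(self, word, snap):
        """URIs whose version visible at snap contains the word."""
        hits = set()
        for ts, uri in self.word_index.get(word.lower(), []):
            version = self._visible(uri, snap)
            if version is not None and version[0] == ts:
                hits.add(uri)
        return hits


db = MVCCStore()
db.commit("/claims/1", "pending review")
old = db.snapshot()
db.commit("/claims/1", "approved")
new = db.snapshot()

print(db.read("/claims/1", old), db.search("approved", old))
print(db.read("/claims/1", new), db.search("approved", new))
```

A reader holding the `old` snapshot keeps seeing "pending review" and gets no hit for "approved"; a reader at `new` sees the updated text and the updated index together, never one without the other.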

Combine this with MarkLogic's ability to save any arbitrarily complex search as an alert, and you can power many interesting mission critical information sharing applications.

I have a blog entry that describes this: http://adamfowlerml.wordpress.com/2013/11/25/marklogic-huh-what-is-it-good-for/

Also around Serializability: http://adamfowlerml.wordpress.com/2013/01/25/true-acid-compliance/

[Yes, I work for MarkLogic! I work here because it's an awesome piece of tech. ]

Comment Re: follow the money (Score 1) 334

by matt turner on Monday November 25, 2013 @10:29AM (#45514145) Attached to: NYT: Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice

Spike - it's the data complexity that kills relational - you would have to design the perfect schema up front and then spend years transforming all the data feeds, which by the time you are done would all have changed. Schema flexibility is a huge advantage for complex data of any kind. So the fact that MarkLogic works, scales, is ACID and has flexibility makes it a strong choice for complex data problems.

Comment Re: MarkLogic = NoSQL (Score 1) 334

by Adam Fowler on Monday November 25, 2013 @10:25AM (#45514117) Attached to: NYT: Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice

FoundationDB actually did a pretty good job of separating out the truth from the hype recently in their post about NoSQL and ACID Compliance: https://foundationdb.com/acid-claims#!

Doesn't mention neo4j specifically - don't think the article authors considered graph/triple stores in their article - but it's a good read nonetheless. The article mentions MarkLogic too. Added my comments on it here: http://adamfowlerml.wordpress.com/2013/11/17/foundationdb-joins-acid-transaction-crusade/

FWIW MarkLogic 7 also is a triple store (not graph store), including support for the W3C Graph Store protocol and SPARQL. http://www.marklogic.com/what-is-marklogic/marklogic-semantics/
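As a rough picture of what a triple store does, here is a small Python sketch that keeps subject/predicate/object triples and answers a two-pattern query by joining on a shared variable. The `ex:` identifiers and the `match` helper are invented for illustration; this is neither SPARQL nor a MarkLogic API.

```python
# Toy triple store: a set of (subject, predicate, object) tuples.
# All identifiers below are hypothetical example data.
triples = {
    ("ex:alice", "ex:enrolledIn", "ex:planSilver"),
    ("ex:bob", "ex:enrolledIn", "ex:planGold"),
    ("ex:planSilver", "ex:offeredBy", "ex:insurerA"),
    ("ex:planGold", "ex:offeredBy", "ex:insurerB"),
}


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Roughly the question a SPARQL query would phrase as:
#   ?person ex:enrolledIn ?plan .  ?plan ex:offeredBy ex:insurerA .
plans = {s for s, _, _ in match(p="ex:offeredBy", o="ex:insurerA")}
people = sorted(s for s, _, o in match(p="ex:enrolledIn") if o in plans)
print(people)
```

A real triple store would index the triples rather than scan them, but the query shape (patterns joined on a shared variable) is the same.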

[Yes, I do work for MarkLogic - and proudly so!]

Comment Re:Blow to NoSQL movement (Score 1) 334

by dmcassel on Monday November 25, 2013 @10:23AM (#45514097) Attached to: NYT: Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice

As someone who has done databases for a long time, I have very little respect for NoSQL, but that is mostly because everyone keeps trumpeting it as a killer of traditional databases. There are scenarios where NoSQL systems are an ideal fit. However, NONE of those scenarios require data to be very reliably stored in a guaranteed and predictable way.

If you don't get your tweets or your friends' Facebook posts as soon as they are posted, no one will really care. But for something as truly important as health insurance coverage? Are you f__king kidding me? And that's just from a reliability standpoint. Never mind the fact that NoSQL is currently at the wild west stage where nobody is compatible with anybody else; there is nothing resembling a standard set of APIs between products, making it very difficult to develop expertise.

I certainly get where you're coming from with this, but a couple of points to address your concerns:

MarkLogic has been around since 2001, with features needed for enterprise deployments (ACID, HA/DR, security) baked in from the start; it's not just the new hawtness.

NoSQL systems aren't just for lots of data, they are for data with lots of variety. In those cases, a schemaless approach allows for much more rapid application development. And yes, many of those scenarios do require reliable storage. MarkLogic does ACID transactions. The MarkLogic customer page shows some of the customers who have found this helpful.

Regarding interoperability, MarkLogic has a REST API and a Java API, in addition to the ability to work with XQuery. Custom HTTP service endpoints can be built to fit into an existing environment. It still requires some learning -- a document store is different from relational, so that's unavoidable -- but with a little training developers can use a lot of their current skills.
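To give a feel for the REST side, here is a hedged Python sketch that only builds the HTTP requests and does not send them. The `/v1/documents` and `/v1/search` resource paths are the ones the REST API documentation describes; the host, port, document URI and payload are made up, and authentication is left out, so check the docs for your version before relying on any detail.

```python
# Builds (but does not send) requests against a REST instance.
# BASE, the document URI and the payload are hypothetical.
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://localhost:8000"  # assumed host and port of a REST instance


def put_document(uri, doc):
    """Request that would store a JSON document at the given database URI."""
    query = urlencode({"uri": uri})
    return Request(
        f"{BASE}/v1/documents?{query}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def search(text):
    """Request that would run a simple string search and ask for JSON back."""
    query = urlencode({"q": text, "format": "json"})
    return Request(f"{BASE}/v1/search?{query}", method="GET")


write = put_document("/plans/silver.json", {"tier": "silver"})
find = search("silver")
print(write.get_method(), write.full_url)
print(find.get_method(), find.full_url)
```

Nothing here is specific to Python; any HTTP client a team already uses would do, which is the interoperability point.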

Comment Re:MarkLogic = NoSQL (Score 1) 334

by matt turner on Monday November 25, 2013 @10:20AM (#45514065) Attached to: NYT: Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice

SQL doesn't work well for complex DATA from multiple ever changing sources. You have to create the entire perfect model before you can even load data and by then, with any complex problem, the data has changed and you are in a refactoring death spiral.

Comment Re:MarkLogic = NoSQL (Score 1) 334

by matt turner on Monday November 25, 2013 @10:18AM (#45514051) Attached to: NYT: Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice

There isn't any reason a NoSQL database can't be ACID and scale AND support a complex data problem. The problem is that most of the current systems were built for web scale (cough cough) and when you try to use them for a real project you have to make up all kinds of silly reasons to work around the lack of ACID.

Comment Re: follow the money (Score 1) 334

by Adam Fowler on Monday November 25, 2013 @10:12AM (#45514011) Attached to: NYT: Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice

As many have said, XML is here, there and everywhere, and so we need strategies to manage it. Some people use it purely as a data interchange format, which is fine. Others want to manage documents using an XML representation, which is fine too.

MarkLogic takes an XML document and stores a highly compressed binary tree representation of it. It doesn't store 'raw text', or even strings. This makes it storage efficient. MarkLogic also indexes the entire document's structure, values, words and phrases (and stems thereof) for search. The database and the search engine use the same indexes, rather than a set of indexes each.

Special range indexes can be created too, e.g. for a geospatial co-ordinate pair appearing anywhere in the document. As the point's XML element could conceivably be anywhere in the document, creating a traditional schema for this is nigh on impossible. Storing the document itself, with indexes alongside it, is actually quite an elegant solution.
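A minimal Python sketch of the idea, assuming a made-up `<point>` element holding "lat lon" text: the documents have different shapes, yet one word index and one point index cover both, and a query intersects the two. This is a toy built on the standard library, not a description of MarkLogic's actual index structures.

```python
# Toy word index plus a point index over XML documents of differing shapes.
# The element name "point" and its "lat lon" text format are assumptions.
import xml.etree.ElementTree as ET

docs = {
    "/a.xml": "<report><title>Flood warning</title>"
              "<site><point>51.5 -0.12</point></site></report>",
    "/b.xml": "<memo><body>Flood damage near "
              "<point>40.7 -74.0</point> reported</body></memo>",
}

word_index = {}   # word -> set of document URIs
point_index = []  # (lat, lon, uri) for every point found, at any depth

for uri, xml in docs.items():
    root = ET.fromstring(xml)
    # Index every word in the document, regardless of which element holds it.
    for text in root.itertext():
        for word in text.lower().split():
            word_index.setdefault(word, set()).add(uri)
    # Index every <point>, wherever it sits in the tree.
    for point in root.iter("point"):
        lat, lon = (float(part) for part in point.text.split())
        point_index.append((lat, lon, uri))


def in_box(south, west, north, east):
    """URIs with at least one point inside the bounding box."""
    return {
        uri for lat, lon, uri in point_index
        if south <= lat <= north and west <= lon <= east
    }


# Documents mentioning "flood" that also contain a point near London.
hits = word_index["flood"] & in_box(50.0, -1.0, 52.0, 1.0)
print(sorted(hits))
```

Neither document needed a schema declaring where `<point>` may appear; the indexes are built from whatever structure each document happens to have.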

I would agree with your statement if it were 'XML soo sucks for traditional relational db applications', because those RDBMS systems require shredding/rebuilding of the documents, or store the XML as 'dumb' column types. Thus you lose the flexibility of the XML. MarkLogic takes advantage of the fact the data is XML anyway.

Why not use a document database with built-in support for XML to store data that is XML? Seems pretty straightforward to me.

More details here: http://adamfowlerml.wordpress.com/2013/11/25/marklogic-huh-what-is-it-good-for/

Comment Re:MarkLogic is an XML repository, not a RDBMS (Score 1) 334

by matt turner on Monday November 25, 2013 @10:11AM (#45513999) Attached to: NYT: Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice

Yup, and ML indexes that tree and largely uses search-engine-like indexes to resolve queries. What this gives you is schema flexibility, and that is key for bringing together many complicated data sources. With a relational approach you would have to define the entire master schema up front.

Comment Re:Blow to NoSQL movement (Score 1) 334

by matt turner on Monday November 25, 2013 @10:08AM (#45513975) Attached to: NYT: Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice

Except of course that you CAN have actual transactions and simple scale out in NoSQL - it's just that most of the NoSQL databases out there weren't designed to manage that. The database in question here, MarkLogic, has fully ACID transactions and horizontal scale out without any penalty to ingestion speed or query. It's all about the core design being right from the start, and MarkLogic's architecture isn't complicated: it's MVCC with write forward over a shared nothing cluster.

Comment Re:MarkLogic = NoSQL (Score 1) 334

by dmcassel on Monday November 25, 2013 @09:40AM (#45513749) Attached to: NYT: Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice

MarkLogic has scheduled backups. You can also do a backup through the Admin UI, which is mentioned pretty early in the Administrator's Guide.

Comment Re:Blow to NoSQL movement (Score 1) 334

by matt turner on Monday November 25, 2013 @08:56AM (#45513519) Attached to: NYT: Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice

That's only a valid argument if the choice was in fact one of the NoSQL databases that lacks reliability and security (you forgot about that - it's equally appalling in most NoSQL). But there is such a thing as enterprise NoSQL with real transactions and security, and MarkLogic has been replacing Oracle for mission critical systems for over 10 years. The new generation of database with schema flexibility along WITH reliability is a killer combination for hard data aggregation problems.

Comment Re:follow the money (Score 2) 334

by dmcassel on Sunday November 24, 2013 @11:26PM (#45511605) Attached to: NYT: Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice

It's actually really fast. The XML is indexed on ingest and many queries are answered just from the indexes.

Why would anyone design such a product? Why would anyone buy such a product?

Because for many kinds of data, once I became familiar with MarkLogic I found I could build complex applications more rapidly than I could with relational technologies. It scales out, it's schema agnostic, and it's ACID compliant. It has HA/DR features and a robust security model. Customers buy it because it solves their problems. For anyone interested in learning a bit about MarkLogic, the Enterprise NoSQL page is a good place to start. Disclaimer: MarkLogic employee.

Comment Re: follow the money (Score 1) 334

by dmcassel on Sunday November 24, 2013 @10:53PM (#45511439) Attached to: NYT: Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice

MarkLogic is ACID compliant. http://www.marklogic.com/blog/can-you-pass-the-acid-test/
