Comment Interesting but flawed paper (Score 1) 241
This is clearly the question that corporate co-authors Nvidia and Logicblox hoped you would ask.
The paper seems to represent more of an evolutionary than a revolutionary approach, but it suffers from some unfortunate hand-waving: the attempt to wave away the real cost of memory->PCIe transfers (to their credit, they at least call out the latency), the unwillingness to perform comparisons on like-for-like base hardware, and the rather odd choice of front-end environment. Coupled with their odd price-performance metric, I suspect that Nvidia marketing got way up in Gatech's business on this. My suspicion is that there are real use cases where SIMD processing will substantially speed up relational database performance on easily partitioned datasets, but as more vectorization effort is put into the main CPU, the advantage of kicking work off to a coprocessor will eventually go the way of the 387.
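To make that last point concrete, here's a minimal sketch of what I mean by SIMD-friendly relational work on a partitioned column store. This is a toy, not anything from the paper: NumPy stands in for actual SIMD (its whole-column compares and multiplies get vectorized under the hood), and the table and query are made up.

```python
# Toy sketch: column-store selection evaluated "SIMD-style" with NumPy
# standing in for real vector instructions. Columns and query are invented.
import numpy as np

# A tiny fact table stored column-wise (this layout partitions trivially).
price = np.array([5, 12, 7, 30, 18, 3], dtype=np.int32)
qty   = np.array([2,  1, 4,  1,  2, 9], dtype=np.int32)

# SELECT SUM(price * qty) WHERE price > 10
# -- one vector compare over the whole column, no per-row branching.
mask = price > 10
revenue = int(np.sum(price[mask] * qty[mask]))
print(revenue)  # 12*1 + 30*1 + 18*2 = 78
```

The point is that once the hot loop is just "compare a column, mask, multiply, reduce," a decent compiler will emit AVX for it on the host CPU, and the case for shipping the data across PCIe to a coprocessor gets thinner.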