I've only had a quick look at their press release. Is there a pre-print of their paper anywhere?
This looks like a hardware implementation of something like "Grand Central Dispatch", combined with transactional memory.
The basic idea seems to be that you take a serial-ish process and break it up into tasks. You start running the first few tasks that obviously have to run first. Then, if you have spare CPU cores, you also start speculatively executing later tasks. But if one of those speculative tasks hits a conflict in the transactional memory model, its results are thrown away and it has to run again.
So you might see a massive win from running those tasks early. But at worst, you'll still run every task in order.
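To make that concrete, here's a rough software sketch of how I read the idea. It is purely my own toy model, not anything from their press release: `Txn`, `run_speculatively` and the three example tasks are names I made up, and the conflict check (a task's read set overlapping keys written by earlier tasks) is the simplest rule I could think of. Real hardware would presumably detect conflicts at cache-line level and much earlier.

```python
from concurrent.futures import ThreadPoolExecutor


class Txn:
    """Hypothetical transaction: logs reads, buffers writes against a snapshot."""

    def __init__(self, snapshot):
        self.snapshot = snapshot
        self.reads = set()
        self.writes = {}

    def read(self, key):
        if key in self.writes:          # read-your-own-write, not a shared read
            return self.writes[key]
        self.reads.add(key)
        return self.snapshot.get(key, 0)

    def write(self, key, value):
        self.writes[key] = value        # buffered until commit


def run_speculatively(tasks, state, workers=4):
    """Run tasks with serial-order semantics; return how many had to be rerun."""
    snapshot = dict(state)              # every speculative task sees this copy

    def attempt(task):
        txn = Txn(snapshot)
        task(txn)
        return txn

    # Speculative phase: run everything early on whatever workers are spare.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        speculative = list(pool.map(attempt, tasks))

    # Commit phase: strictly in task order.
    dirty = set()                       # keys written since the snapshot
    reruns = 0
    for task, txn in zip(tasks, speculative):
        if txn.reads & dirty:
            # Conflict: the task read a stale value. Throw the result away
            # and rerun it against the up-to-date state.
            txn = Txn(state)
            task(txn)
            reruns += 1
        state.update(txn.writes)
        dirty.update(txn.writes)
    return reruns


def add_ten_to_a(txn):
    txn.write("a", txn.read("a") + 10)


def bump_b(txn):
    txn.write("b", txn.read("b") + 1)


def copy_a_to_c(txn):
    txn.write("c", txn.read("a"))      # depends on add_ten_to_a


state = {"a": 1, "b": 2}
reruns = run_speculatively([add_ten_to_a, bump_b, copy_a_to_c], state)
print(state, reruns)
```

Here `bump_b` is independent, so its early result is kept. `copy_a_to_c` read `a` before `add_ten_to_a` committed, so its speculative result is discarded and it runs again in order. In pure Python the threads won't buy you any real speedup, the point is just the commit/discard logic.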
IMHO getting any kind of speed boost is going to depend on hardware support. But there might be a way to do something similar with OS kernel support.