How about graceful seg faults instead of program crashes? Obviously modern architectures don't really support such things, but one can imagine a processor that detects a bad pointer and lets the program recover instead of crashing it. In fact, each program, or even each transaction, could register a pre-determined fault handler.
What'll happen is:
1. Thread A marks a "start of code snippet" and registers the address of its fault handler.
2. Thread B starts its processing as well.
3. Thread A at some point tries to dereference a pointer at address X.
4. Thread B races ahead and frees the memory at address X.
5. Normally, in protected memory, the processor would throw a fit as thread A tries to access an illegal memory address.
6. Instead, the processor jumps to thread A's custom fault handler.
7. Thread A's fault handler sees "hey, my code snippet tried to access an illegal address and I, the thread, am not guaranteed to be thread-safe". It then rolls back all of the work done since the start of the snippet, up to the instruction that faulted.
8. Thread A tries again, starting from step 1. If it faults too many times, it could give up on the thread-unsafe method and fall back to the old mutex locking method.
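To make the control flow of the steps above concrete, here is a rough single-threaded sketch in Python. Everything in it is made up for illustration: `SimulatedFault` stands in for the hardware fault, `run_optimistically` and `make_snippet` are hypothetical names, and copying a dict stands in for whatever rollback mechanism the imagined processor would provide. It only shows the try / roll back / retry / fall back to a mutex shape, not real concurrency.

```python
import threading


class SimulatedFault(Exception):
    """Stands in for the hardware fault raised on a bad pointer dereference."""


def run_optimistically(state, snippet, lock, max_attempts=3):
    """Run snippet(state) without locking; on a fault, roll back and retry.

    After max_attempts faults, fall back to running under the mutex.
    Returns the name of the path that finally succeeded.
    """
    for _ in range(max_attempts):
        snapshot = dict(state)      # step 1: mark the start (shallow copy of flat state)
        try:
            snippet(state)          # steps 3-5: may hit the freed address
            return "optimistic"
        except SimulatedFault:      # step 6: control lands in the fault handler
            state.clear()           # step 7: undo all work done by this attempt
            state.update(snapshot)
    with lock:                      # step 8: too many faults, use the old mutex path
        snippet(state)
    return "locked"


def make_snippet(faults_before_success):
    """Build a demo snippet that faults a fixed number of times, then succeeds."""
    remaining = [faults_before_success]

    def snippet(state):
        state["partial"] = True     # work done before the dereference
        if remaining[0] > 0:
            remaining[0] -= 1
            raise SimulatedFault("address X was freed by thread B")
        state["result"] = 42

    return snippet
```

Calling `run_optimistically({}, make_snippet(1), threading.Lock())` faults once, rolls back, and succeeds on the second optimistic attempt. With `make_snippet(3)` and the default `max_attempts=3` it gives up and takes the locked path instead.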
The idea is that the majority of the time, thread A and thread B don't actually conflict. Or thread A wins the race. In those cases, you get the speedup of parallel computation without paying for a lock.
It's up to the programmer (or compiler, probably a JIT) to recognize when to exploit this by analyzing the algorithm and the likelihood of conflict. A JIT would probably use profiling information it gets in real time.
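As a rough idea of what "profiling information" could mean here, the sketch below keeps a running fault rate per code snippet and stops choosing the optimistic path once conflicts get too common. The class name `ConflictProfile`, the 10% threshold, and the warmup count are all assumptions picked for illustration, not anything a real JIT is known to do.

```python
class ConflictProfile:
    """Tracks how often the optimistic path faults for one code snippet."""

    def __init__(self, max_fault_rate=0.1, warmup=20):
        self.max_fault_rate = max_fault_rate  # hypothetical tuning knob
        self.warmup = warmup                  # runs to observe before judging
        self.runs = 0
        self.faults = 0

    def record(self, faulted):
        """Record the outcome of one optimistic attempt."""
        self.runs += 1
        if faulted:
            self.faults += 1

    def fault_rate(self):
        """Fraction of recorded attempts that faulted (0.0 if none recorded)."""
        return self.faults / self.runs if self.runs else 0.0

    def use_optimistic(self):
        """Stay optimistic during warmup, then only while conflicts are rare."""
        if self.runs < self.warmup:
            return True
        return self.fault_rate() <= self.max_fault_rate
```

The caller would check `use_optimistic()` before each run and call `record(...)` afterwards, so a snippet that keeps colliding quietly goes back to plain locking.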
Nobody's saying this will replace 100% of all synchronization methods. But it doesn't need to. Technically, you only need to replace one use case to get a speedup. But most likely, you can replace a lot of them (maybe 90%).