ARMv8 is not eliminating them, it's reducing the number of instructions that have them. Conditional instructions are useful because you can eliminate branches and so keep the pipeline full. For example, consider this contrived example:
On ARMv7 and earlier, this would be a conditional add. The pipeline would always be full, the add would always be executed, but the result would only be retired if the condition is true. On MIPS, it would be a branch (complete with the insanity known as branch delay slots, which if you look at the diassembly of most MIPS code typically means with a nop, so you get to waste some i-cache as well) and if it's mispredicted then you get a pipeline stall.
On ARMv8, you don't have a conditional add, but you do have a conditional register-register move and you have twice as many registers. The compiler would still issue the add instruction and then would do a conditional move to put it in the result register. From the compiler perspective, this means that you can lower PHI nodes from your SSA representation directly to conditional moves in a lot of cases.
Basically, 32-bit ARM is designed for assembly writers, ARMv8 is designed for compilers. As a compiler writer, it's hands-down the best ISA I've worked with, although I would prefer to write assembly by hand for ARMv7. I wouldn't want to do either with MIPS, although I currently am working on MIPS-based CPU with some extra extensions.
Actually, ARM's reasoning is that modern branch predictors on high end AP's can do a good enough job of following a test and branch and keeping the pipeline(s) full that there is very little value in conditional instructions on future chips. It's hard to cause a pipeline stall or bubble by branching a few instructions forward or back on these CPUs since they are decoding well in advance of the execution pipelines. Added to that, there is an energy cost in executing an instruction and throwing away the result. Obviously, not all cases are wins. In the example you noted, a register to register mov on a register-renaming system is basically a 0-cycle operation (never makes it out of the instruction decoder), so it's hard to do better than that.