75% of the performance for 25% of the power compared to x86
The problem is they don't provide anything approaching that sort of efficiency.
I've had the, um, privilege of benchmarking a few of the new up-and-coming ARM server systems and chips. It's pretty neat to be able to have four quad-core servers, each with 4GB of memory, pulling a total of 40W or so. That's a great system for a web server farm.
The problem shows up when you compare performance per watt for high-performance computing. For a few workloads the new ARM systems do compare favorably, but the edge in work done per watt is usually less than 5%.
The best case for the ARM systems, under the workloads most ideal for them: 105% of the performance for the same power draw.
If your workload doesn't scale easily to a large number of cores, or involves a lot of inter-process communication, the current ARM systems are hopeless. Even with a supposedly high-performance backplane, a chassis hosting around 40 ARM nodes was soundly trounced by a single x86_64 node.
Note this was a cluster benchmark, so multiple x86_64 and ARM nodes were in use.
One x86_64 node can handle the workload of 40+ ARM nodes. Granted, you can fit 40+ ARM nodes in a 3U chassis, but that's still lower overall compute density than 1U x86_64 nodes offer. Simply throwing more cores at the problem doesn't necessarily give you the gains you'd expect.
While ARM is theoretically capable of better performance/watt, it's impossible to get anything resembling theoretical numbers in a supercomputing application. Any advantage ARM has in performance/watt is eaten up by the overhead of using so many (slower) cores. Very few workloads scale linearly as you add cores; the various overheads (MPI, network, etc.) drag down overall efficiency dramatically.
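To make that concrete, here's a back-of-the-envelope sketch (mine, not from the benchmark above): Amdahl's law with a fixed per-core communication penalty. The serial fraction, the overhead figure, and the 4x per-core speed gap are all illustrative assumptions, not measured values.

```java
// Toy scaling model: Amdahl's law plus a fixed per-core communication penalty.
// All numbers are illustrative assumptions, not measured results.
public class ScalingSketch {
    // Effective speedup on n cores for a workload with serial fraction s,
    // where every additional core adds a fixed communication overhead c.
    static double speedup(int n, double s, double c) {
        double amdahl = 1.0 / (s + (1.0 - s) / n);
        return amdahl / (1.0 + c * (n - 1));
    }

    public static void main(String[] args) {
        double s = 0.05;  // assume 5% of the work is inherently serial
        double c = 0.002; // assume 0.2% MPI/network overhead per extra core

        // One 16-core x86_64 node vs. 40 quad-core ARM nodes (160 cores),
        // assuming each ARM core is 4x slower than an x86_64 core.
        double x86 = 1.0  * speedup(16,  s, c);  // ~8.9x one fast core
        double arm = 0.25 * speedup(160, s, c);  // ~3.4x one fast core

        System.out.printf("x86_64, 16 fast cores: %.1fx%n", x86);
        System.out.printf("ARM, 160 slow cores:   %.1fx%n", arm);
    }
}
```

Even with a tiny serial fraction and a modest overhead estimate, the core-count advantage evaporates: ten times as many cores, a fraction of the throughput.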
Currently, you can't use ARM for memory-hungry applications, as you'll hit 32-bit ARM's 4GB limit. 64-bit ARM is promising, but it's not for sale yet.
The best performance per watt for supercomputing workloads is still found in accelerators, such as GPUs or Intel's Xeon Phi.
ARM is very promising for many datacenter-type workloads with a large number of unrelated, independent processes, such as a farm of web servers. (Any database backend is, however, a different matter.)
While slightly OT (as it's a non-supercomputing application): what if you want to run an application with a server-side Java component? Forget it. The current ARM JVMs (both OpenJDK and Oracle's) appear to lack a JIT compiler; the only way to get similar Java performance/watt between ARM and x86_64 is to disable the JIT on x86_64. This is largely a software issue, but until it's fixed, forget about Java on an ARM server.
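If you want to gauge how much the JIT is worth, a crude way is to run the same microbenchmark on x86_64 with and without HotSpot's standard -Xint flag (interpreter-only mode), which approximates what a JIT-less ARM JVM gives you. The benchmark below is just an illustrative sketch, not a rigorous measurement.

```java
// Crude JIT-vs-interpreter check. Run twice and compare wall time:
//   java JitCheck         (JIT enabled, the default)
//   java -Xint JitCheck   (interpreter only -- roughly a JIT-less JVM)
public class JitCheck {
    public static void main(String[] args) {
        long start = System.nanoTime();
        long sum = 0;
        for (int i = 0; i < 100_000_000; i++) {
            sum += i % 7; // cheap, JIT-friendly inner loop
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("sum=" + sum + ", " + elapsedMs + " ms");
    }
}
```

On typical hardware the -Xint run is many times slower, which is roughly the gap you're signing up for server-side.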