I use OSMAnd on my phone, but my girlfriend recently bought a Windows Phone and I've been very impressed with Nokia's mapping app (I actually like a lot of what Microsoft's done with Windows Phone 8, though it's a strange mix of very polished, well-designed UI parts and completely unfinished parts with missing features). It's good to see more competition for Google Maps, which is becoming increasingly entrenched in spite of the fact that the UI is pretty poor in many regards and the mapping data is terrible. For example, here they're missing (or have in the wrong places) most of the cycle paths, so people who rely on Google regularly get lost, even though all of this data is in OpenStreetMap.
For me, it's the killer app for Android. Offline maps, offline routing, and open source, backed by high-quality mapping data from OpenStreetMap. I use the version from the F-Droid store, which doesn't have the limitations of the free version on Google Play, and it's one of the few open source apps that I've donated money to.
Just curious, what's so wrong with branch delay slots, and isn't that a more natural way to look at a branch?
They're a pain for people on both sides of the ISA.
The compiler has to find an instruction that can run after the branch. This is normally trivial for calls, but for conditional branches within a function it's often difficult to find an instruction that you can put there. It has to be an instruction from before the branch (or one that appears in both basic blocks after it), and one that the branch doesn't depend on, because it executes after the branch instruction. This means that you quite often end up padding delay slots with nops, which bloats your instruction cache usage. On a superscalar implementation that's the only cost, but on a simple in-order pipeline it's also a completely wasted cycle.
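To make that concrete, here's a rough sketch (hypothetical fragments, assuming classic MIPS32 with a single delay slot) of the difference between a filled and a padded delay slot:

    # Filled: an instruction from before the branch (one the branch doesn't
    # depend on) is hoisted into the slot and runs on both paths.
            slt   $t0, $a0, $a1
            beq   $t0, $zero, skip
            addiu $a2, $a2, 4        # delay slot: hoisted, always executes

    # Padded: nothing suitable was found, so the slot is wasted.
            beq   $t1, $zero, skip
            nop                      # delay slot: bloats the I-cache and, on a
                                     # simple in-order core, burns a cycle
    skip: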
On the other side, it's a pain to implement. It made sense for a three-stage pipeline in the original MIPS, because you always knew the next instruction to fetch. A modern simple pipeline is 5-7 stages though, so your branch is still in register fetch (if that) by the time you need to fetch the delay-slot instruction. It doesn't buy you anything, and it means that if you're doing any kind of speculative execution (even simple branch prediction, which you really need to get moderately good performance) then you have an extra dependency to track - you can't just use the branch as the marker and flush everything after it, you need to do some reordering. In a superscalar implementation, you need to do even more complex things in register renaming to make it work.
I didn't look at the floating point stuff in much detail, so there may be something there, but the biggest changes in recent versions of the MIPS specs have been closer alignment with the IEEE floating point standards, so it's hard to imagine much of a difference.
The biggest difference between MIPS64r6 and ARMv8 is that the MIPS spec explicitly reserves some of the opcode space for vendor-specific extensions (we use this space, although our core predates the current spec - it's largely codifying existing opcode use). This allows, for example, Cavium to add custom instructions that are useful for network switches but not very useful for anything else. ARMv8, in contrast, expects any non-standard extensions to take the form of accelerator cores with a completely different ISA. This means that any code compiled for one ARMv8 core should run on any ARMv8 implementation, which is a big advantage. With MIPS, anything compiled for the core ISA should run everywhere, but people using custom variants (e.g. Cisco and Juniper, who use the Cavium parts in some of their products) will ship code that won't run on another vendor's chips.
Historically, this has been a problem for the MIPS ecosystem because each MIPS vendor has forked GCC and GNU binutils, hacked them up to support their extensions, but done so in a way that makes it impossible to merge the code upstream (because they've broken every other MIPS chip in the process), leaving their customers with an ageing toolchain to deal with. I've been working with the Imagination guys to try to make sure that the code in LLVM is arranged in such a way that it's relatively easy to add vendor-specific extensions without breaking everything else.
Imagination doesn't currently have any 64-bit cores to license, but I expect that they will quite soon...
Wouldn't it just be a matter of re-compiling your code, though?
Assuming that your code doesn't do anything that is vaguely MIPS specific. If it does, then there's little benefit in using MIPS32r2 now - ARMv7 is likely to be closer to MIPS32r6 than MIPS32r2 is, in terms of C (or higher-level language) source code compatibility.
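For example, here's a hypothetical sketch of the kind of hand-written MIPS32r2 assembly (summing an array of words) that won't just recompile, because it uses a branch-likely instruction and those are removed entirely in r6:

    # Assumes $a0 = start of a non-empty, word-aligned array, $a1 = one past the end.
            move  $v0, $zero
            lw    $t0, 0($a0)        # preload the first element
    loop:   addu  $v0, $v0, $t0
            addiu $a0, $a0, 4
            bnel  $a0, $a1, loop     # branch-likely: gone in MIPS32r6
            lw    $t0, 0($a0)        # delay slot: annulled on the final iteration,
                                     # so it never loads past the end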
I love MIPS, and that is the case in large part because of its current instruction set. It seems like a bad idea to mess with the current instruction set and break backward compatibility. Why did they decide to do that?
Basically, because the MIPS ISA sucks as a compiler target. Delay slots are annoying and provide little benefit with modern microarchitectures. The only way to do PC-relative addressing is an ugly hack in the ABI, requiring that every call use jalr with the callee's address in $t9, which means that you can't use bal for short calls. The lwl / lwr instructions for unaligned loads are just horrible and introduce nasty pipeline dependencies. The branch likely instructions are almost always misused, but as they're the only way of doing a branch without a delay slot, there's often no alternative.
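To show what that PC-relative hack looks like in practice, here's a rough sketch of the o32 PIC calling sequence (roughly what the .cpload macro expands to; treat it as illustrative rather than exact):

    # Caller: load the callee's address from the GOT and call through $t9.
            lw    $t9, %call16(callee)($gp)
            jalr  $t9                 # has to be jalr via $t9, not bal
            nop                       # delay slot

    # Callee prologue: recompute $gp from the function's own address, which
    # only works because the ABI promises that $t9 holds that address.
    callee: lui   $gp, %hi(_gp_disp)
            addiu $gp, $gp, %lo(_gp_disp)
            addu  $gp, $gp, $t9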