Siri supports US English (spoken in the default female voice everyone has heard), UK English (a low-pitched male voice), AU English (a different female voice, better than the US voice in my opinion), French (an effeminate-sounding male voice, as you would expect from any French guy), and German (the best-sounding female voice, in my opinion).
You are confusing synthesis with recognition.
Siri and Evi both use Nuance's automatic speech recognition (ASR) technology. This technology can support both US and UK English (among many others), depending on which models are used. Presumably, this can be configured by the application software, based on the location of the device and/or user setup (I do not have a smartphone, so I'm not sure whether the latter is supported). The Nuance technology also adapts to the user's acoustics and word usage over time, so, in theory, a UK-accented speaker could start with the US English model, and their speaker-dependent model would eventually have "moved over" into their space. Not optimal, but it can certainly work, given patience on the user's part. With the correct model and/or adaptation, the vast majority of adult speakers will be able to get their words recognized by the ASR technology.
You are correct that the localization issues impact the ability to then respond intelligently to map queries, etc. But that's not all.
The "natural language understanding" (NLU) layer - which includes more general query processing - is also extremely location- and domain-dependent, and the adaptation technology here is much less advanced than in ASR. So the main value added by the Evi application (relative to Siri) is very likely an NLU framework that is regionalized. I'm sure Apple has the capability to make their NLU domain-specific; it's mainly a matter of data collection and training.
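A toy sketch of why the NLU layer is so region-dependent: the same surface query should resolve to different intents per locale. The table below is hand-written for illustration; a real system would learn these mappings from regional training data, which is exactly the data collection and training effort mentioned above.

```python
# Hypothetical locale-sensitive interpretation table. In a deployed NLU
# system, these senses would come from trained models, not a lookup dict.
REGIONAL_SENSES = {
    ("football", "en-US"): "American football (NFL)",
    ("football", "en-GB"): "association football (Premier League)",
    ("chemist", "en-US"): "a chemistry professional",
    ("chemist", "en-GB"): "a pharmacy / pharmacist",
}

def interpret(term: str, locale: str) -> str:
    """Resolve a query term to its regional sense, flagging unknown pairs."""
    return REGIONAL_SENSES.get((term, locale), f"no regional sense known for {term!r}")
```

An ASR engine can get away with one adaptable English model; an NLU engine answering "who won the football?" cannot, because the correct answer differs by region even when the recognized words are identical.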