Exactly what would your non-internet-connected echo-like device that does all processing locally be capable of doing?
FWIW, I'd ideally prefer the arrangement to be a bit different, and I think my ideal setup would probably suit most of those complaining about some cloud-connected, always-on mic. I'd prefer a local voice-to-text piece, a separate array mic piece, and a separate piece to process the commands, send results, and keep state, and I'd want it all open source with no requirement for any external accounts. The end result would almost certainly have far more room for vulnerabilities and exploits. I'd still want it to have an internet connection so it would work with all the wonderful things on the internet. I'd still want it to be able to tie into IFTTT (if you don't know what that is, look it up - it's pretty awesome). I'd definitely want to be able to customize the action word (i.e. use something other than "Alexa", "Amazon", "Echo", "Siri", "OK Google", "Cortana", etc.)... which means that part couldn't use a low-power, hardcoded hardware bit to pick up the keyword, so it would have to listen to and process all speech instead. With all that stuff done locally, there's also a lot more that can break, and it'll need a lot more resources (CPU, RAM, and storage). I'd also want it to train based on what's been said before, which is also built into the Alexa service (you can review every clip it sent, and what it thought you said, and correct them as needed).
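To make the "separate thing to process the commands and keep state" a bit more concrete, here's a rough Python sketch of how I picture that one piece. It assumes the local voice-to-text part already hands over plain transcripts; `CommandProcessor`, `register`, `handle`, and the "hey jarvis" wake word are all names I made up for illustration, not anything from a real product or library.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class CommandProcessor:
    """Hypothetical command piece: finds the wake word, dispatches, keeps state."""
    wake_word: str = "computer"  # user-chosen action word, not baked into hardware
    handlers: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)  # state: commands seen so far

    def register(self, keyword: str, handler: Callable[[str], str]) -> None:
        """Map the first word of a command to a handler function."""
        self.handlers[keyword.lower()] = handler

    def extract_command(self, transcript: str) -> Optional[str]:
        """Return the words after the wake word, or None if it wasn't said."""
        # Every transcript gets scanned, since the wake word is configurable.
        words = re.findall(r"[a-z0-9']+", transcript.lower())
        wake = self.wake_word.lower().split()
        for i in range(len(words) - len(wake) + 1):
            if words[i:i + len(wake)] == wake:
                rest = words[i + len(wake):]
                return " ".join(rest) if rest else None
        return None

    def handle(self, transcript: str) -> Optional[str]:
        """Process one transcript; ignore anything without the wake word."""
        command = self.extract_command(transcript)
        if command is None:
            return None
        self.history.append(command)
        keyword, _, args = command.partition(" ")
        handler = self.handlers.get(keyword)
        if handler is None:
            return f"unknown command: {keyword}"
        return handler(args)


if __name__ == "__main__":
    proc = CommandProcessor(wake_word="hey jarvis")
    proc.register("lights", lambda args: f"lights {args}")
    print(proc.handle("Hey Jarvis, lights on"))
    print(proc.handle("we were just talking about lights"))
```

The point of the sketch is the trade-off from above: because the action word is just a setting, every transcript has to be scanned for it, which is exactly the extra listening and processing a hardcoded keyword chip avoids.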
I don't think everyone in the world should have one, but if you're avoiding them because "there's no way I'm putting an internet-connected microphone in my living room", that just seems silly. You'd be doing it anyway if you had anything with similar features, and you're probably carrying around an internet-connected mic right now, and your laptop, PC, Xbox, PS4, and whatever kit you have connected to your TV probably all have one as well. If you have none of those and you're that paranoid (maybe even justifiably so), that's fine, but obviously this isn't for you then.
If you really would "love to have something like this in my house", then using a low-powered, net-connected device like the Echo is the smart way to go... there's no way I could justify a more-or-less fully fledged PC at every point I wanted to do that stuff. The mic alone pretty much justifies the cost, let alone the "wife factor", so to speak (and sorry to sound sexist). But hey... if you're worried about recording devices, ignore this whole segment of tech forever. If it's voice activated, something has to be listening.