Sphinx by itself is a terrible answer to this problem, unfortunately. The code is free, but good luck finding an appropriate model. Worse, you'll need to train a speaker-dependent model to get any usable results, and this is a VERY non-trivial task given the state the Sphinx tools are in. I spent several years getting paid to adapt Sphinx for commercial purposes, and while it's great for some things, I can say with confidence that it is not the tool you're looking for.
You know what works? Dragon. Hate to say it, but the commercial products here have a gigantic edge on the competition.
That said, I'd love to see someone come up with an open-source speaker-dependent model training system that's friendly enough for app developers (not speech researchers) to roll into projects. I think this is a big open door for contribution to the community. Sphinx isn't the best thing going, but it's certainly usable, and if a real product came into being, I'm sure all the speech wonks would start coming out of the woodwork to improve the algorithms.