I agree. The voice-to-text is remarkably good: it's definitely at the point where it has become a tool and not just a toy. (I won't say that it never makes mistakes, but it's accurate enough that you can dictate a text message and only have to make a small number of fixes, which in many cases makes it faster overall than typing.) The Google Now features, such as asking relatively free-form questions, also work well.
However, the 'embarrassment factor' still looms large: I don't want to use the functionality where it might disturb other people (e.g. at work), and I'm even self-conscious using it when walking around in public. (Yes, it remains ironic that we feel weird talking into our phones.) I also avoid using it when my wife is in a nearby room, because of the "What did you say? Are you talking to me?" factor. And of course, I usually don't want to broadcast my activities for all to hear. As a result, I'm not conditioned to use the feature, and I forget to use it even in cases where it would make sense (e.g. home alone).
I guess what I'm saying is that the adoption of these technologies might well be limited more by social convention than by limitations in the tech itself. I'm not sure whether this is an intrinsic aspect of humanity (that on average people don't like talking to technology, despite what sci-fi has long predicted), or whether it is purely generational, and the next batch of users will be completely comfortable speaking commands to their computers/phones/etc. (in which case the tech will no doubt have to improve, e.g. so that it responds only to the assigned user's voice).