It seems like there are two 'styles' of use (a broad simplification; there are definitely more, or at least cases that mix elements of both): one where it is fairly hard to imagine replacing keyboards, and another that is much more amenable to it (and where it has already partially happened in some cases).
There are the tasks that involve relatively precise symbol manipulation. Programming is probably the most extreme case (human readers might be disgusted by your spelling, grammar, and atrocious taste in formatting, but they are more likely to understand what you meant than the compiler or interpreter is); spreadsheet data munging, word processing, and the like are the other big ones. You can substitute something for a keyboard in these cases, but it is generally pretty clunky and you really need a reason to bother. Speech-to-text, say, works, and can be a valuable assistive technology for those who can't type for one reason or another; but it isn't actually all that impressive compared to typing if you have the option of either. It is somewhat error prone; some operations have extremely terse expressions on the keyboard ("move right one cell" is one touch of an arrow key, which is far faster than saying it, and at least as fast as even a specially defined codeword of some sort); and people, without substantial practice, aren't terribly good at speaking the way they want to write: pauses, 'umm', etc.
Then there are tasks that can be done by manipulating symbols, but are really about snapping together some primitives the system is already familiar with, in one of a reasonably limited number of ways, according to what is basically a template provided by the system. Creating a calendar event or starting a phone call are probably reasonably good examples. For a calendar event, you are snapping together one or more items from your contacts (if it's a 'reminder', it just contains you; if it's a meeting or something, it will have additional participants), a date/time, and a location (sometimes just a human-readable description intended for the participants; in company settings often a conference room or the like, which is also a specialized type of contact known to the system so that room availability tracking works). Placing a phone call is an even simpler case: you are specifying a contact and a known operation to perform against that contact (and possibly an additional detail if the contact has a work, home, and mobile number or the like, in which case the command has to be 'call X at work').
This set of tasks is inherently somewhat limited, because (barring markedly more expert expert systems than we yet enjoy) you can really only perform them if the system already has a template defined. But many of the common cases are really, really common, so it isn't prohibitive to enumerate and support them. That reduces the ambiguity involved and makes it easier for a relatively imperfect input mechanism to assemble the correct answer (or at least recognize that it needs to ask you to repeat yourself), because the context automatically excludes the vast majority of possible inputs.
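To make the 'call X at work' case concrete, here is a rough sketch of how a fixed template plus a known contact list lets a sloppy transcription still land on the right answer. Everything in it is made up for illustration (the contact names, the `resolve` and `parse_call` helpers, the 0.6 similarity cutoff), and it uses Python's standard `difflib` as a stand-in for whatever matching a real assistant does; it is not how any particular product works.

```python
import difflib

# Hypothetical data: the only things the "system" knows about.
CONTACTS = ["Alice Johnson", "Bob Smith", "Carol Diaz"]
NUMBER_TYPES = ["work", "home", "mobile"]


def resolve(heard, candidates, cutoff=0.6):
    """Return the known candidate closest to a noisy transcription, or None."""
    lowered = {c.lower(): c for c in candidates}
    matches = difflib.get_close_matches(
        heard.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None


def parse_call(utterance):
    """Fill the template 'call <contact> [at <number type>]' from noisy text.

    Returns a dict describing the call, or None when the system should
    ask the user to repeat themselves.
    """
    words = utterance.lower().split()
    if not words or words[0] != "call":
        return None  # no template for this input
    rest = words[1:]
    number_type = None
    if "at" in rest:
        # Split on the last "at": contact before it, number type after it.
        i = len(rest) - 1 - rest[::-1].index("at")
        number_type = resolve(" ".join(rest[i + 1:]), NUMBER_TYPES)
        rest = rest[:i]
    contact = resolve(" ".join(rest), CONTACTS)
    if contact is None:
        return None  # nothing in the contact list is close enough
    return {"action": "call", "contact": contact, "number": number_type}
```

With this, `parse_call("call alise jonson at wurk")` still resolves to Alice Johnson's work number, because there are only three contacts and three number types to choose from, while `parse_call("call zzzz")` returns `None`, which is the 'ask you to repeat yourself' case. The same trick gets you nowhere with free-form text, where the set of valid inputs is effectively unbounded.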
If your plan involves a grim future where computers are basically just for scheduling meetings and asking Alexa to buy things, it becomes much easier to imagine replacing the keyboard; but that is much less about improvements in speech-to-text or other new input mechanisms than it is about defining down the list of possible activities until you no longer need precision, general-purpose input, or the other things your alternative input mechanism is bad at.