If all you could do was listen to the network traffic, then yes, you'd be right.
But we're talking about a special case here, where the online banking is being done from within a VM. In that special case, malware installed in the host OS can monitor both the keystrokes and mouse events that are going to the VM in addition to the network traffic.
If I were going to write malware to try to steal usernames and passwords for "interesting websites", I'd wait until I saw network traffic to one of those sites, and *then* start logging keystrokes and mouse events. The fact that the network traffic is HTTPS doesn't matter. All that matters is *where* it's going, and HTTPS doesn't hide that. I don't care about the payload of the packets or what pages you're requesting. All I care about is the DNS name of the computer you're sending data to.
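To make the "HTTPS doesn't hide where it's going" point concrete, here is a minimal sketch. It assumes the lookup goes out as classic unencrypted DNS (not DNS-over-HTTPS or DNS-over-TLS), and the hostname `bank.example.com` and both function names are made up for illustration. It builds a DNS query by hand and then reads the name straight back out of the raw bytes, which is all an observer needs to do.

```python
import struct


def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query (type A, class IN) for hostname."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is sent as length-prefixed labels ending in a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def extract_query_name(packet: bytes) -> str:
    """Read the first question name from a DNS message payload."""
    offset = 12  # the DNS header is always 12 bytes
    labels = []
    while True:
        length = packet[offset]
        if length == 0:
            break
        if length & 0xC0:
            raise ValueError("compressed names are not handled in this sketch")
        offset += 1
        labels.append(packet[offset:offset + length].decode("ascii"))
        offset += length
    return ".".join(labels)


# Hypothetical hostname, used only for illustration.
query = build_dns_query("bank.example.com")
print(extract_query_name(query))  # prints "bank.example.com"
```

Nothing here touches the encrypted payload. The name is sitting in the clear in the lookup that typically precedes the HTTPS connection, which is why encrypting the session alone doesn't hide the destination.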
When the malware is installed in the same machine (real or virtual) as the online banking, you can log only the keyboard and mouse events that are being sent to the web browser and ignore everything else. What I proposed above lets you further limit the data you have to sort through by logging only the keystrokes that are likely to result in data being sent to the websites you care about.
If there's a VM between the malware and the browser, you can no longer monitor just the keystrokes going to the browser -- you have to sift through *everything* that's being sent to the VM. But you can still use the network traffic to give you some context about what is likely to be interesting and what isn't.