Even RSA admits no one should use a 4-digit PIN. A PIN that short is tolerable only because the sole way to test whether a PIN is valid is to use it, together with the token code, to enter a passcode on an authentication site. If you are allowing over a thousand bad guesses, you're doing something else wrong. The PIN is used to modify the 8-digit token code displayed on the screen, and that result is what gets entered. Hardware tokens still have you enter PIN and token code manually in some cases (not all hardware tokens work this way), but the packet is in theory encrypted. You do make them authenticate over an encrypted channel, right?
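To put rough numbers on the "over a thousand bad guesses" point, here is a minimal Python sketch. It is not RSA's implementation; the names `guess_success_probability` and `AttemptLimiter` and the five-failure threshold are made up for illustration. It assumes the worst case where the attacker already has the token code and only the PIN is unknown, and that the PIN is uniformly random.

```python
from fractions import Fraction


def guess_success_probability(pin_digits: int, allowed_attempts: int) -> Fraction:
    """Chance of hitting a uniformly random PIN within the allowed attempts,
    assuming the attacker never repeats a guess."""
    pin_space = 10 ** pin_digits
    return Fraction(min(allowed_attempts, pin_space), pin_space)


class AttemptLimiter:
    """Hypothetical server-side lockout: refuse logins after too many failures."""

    def __init__(self, max_failures: int = 5):
        self.max_failures = max_failures  # illustrative threshold, not a vendor default
        self.failures = {}                # user -> consecutive failed attempts

    def is_locked(self, user: str) -> bool:
        return self.failures.get(user, 0) >= self.max_failures

    def attempt(self, user: str, passcode_ok: bool) -> bool:
        """Return True only if the login is accepted."""
        if self.is_locked(user):
            return False                  # locked accounts are rejected outright
        if passcode_ok:
            self.failures.pop(user, None) # a success resets the counter
            return True
        self.failures[user] = self.failures.get(user, 0) + 1
        return False


if __name__ == "__main__":
    print(guess_success_probability(4, 5))     # 1/2000
    print(guess_success_probability(4, 1000))  # 1/10
```

With a lockout after five failures, a 4-digit PIN gives the attacker a 1 in 2000 chance. Let a thousand guesses through and it becomes 1 in 10, which is the "doing something else wrong" case.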
Yes, someone might compromise the device holding the software token, but that should in theory be hard. That's why people tell you to keep that device better protected than most. Is it perfect? Of course not. We're breaking all six (5+1) rules of computer security (the first being: don't have a computer). The point of this stronger authentication is never perfect security; it is to complicate the attack significantly. Of course, no matter what authentication you use, an attacker who completely compromises the user's source device will get through it.
In my job, whenever people say security must be cumbersome, I'm asked to go in and teach them that, for the level of security appropriate to where I work, we can almost always find a clean solution. Good security, properly done by professionals, hides most of its workings from the user, so that to the user it seems invisible.
Always keep your threat model in mind. Are you trying to protect against selected 3-6 letter government agencies with datacenters full of true supercomputers? Or are you trying to protect against a lesser threat?