If you are designing an API, there are dozens of possible error codes (invalid value, invalid enumerator, out of memory, invalid handle). Sometimes one bad input can trigger three or more errors at once, but it is the specification that states which error takes priority.
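A minimal sketch of that priority rule, with hypothetical error codes and a made-up `set_buffer` call (the names and the priority order are assumptions for illustration, not from any real specification): when several rules are broken at once, the checks run in the spec's priority order and the first failure wins.

```python
# Hypothetical error codes and API -- names and priority order are
# assumptions for illustration, not from any real specification.
from enum import Enum

class Err(Enum):
    OK = 0
    INVALID_HANDLE = 1
    INVALID_ENUM = 2
    INVALID_VALUE = 3
    OUT_OF_MEMORY = 4

VALID_MODES = {"static", "dynamic"}
MAX_SIZE = 4096

def set_buffer(handle, size, mode):
    """One bad call can break several rules at once; our assumed spec
    says invalid handle outranks invalid enum, which outranks invalid
    value, so we check in that order and report the first failure."""
    if handle is None:
        return Err.INVALID_HANDLE   # priority 1
    if mode not in VALID_MODES:
        return Err.INVALID_ENUM     # priority 2
    if not 0 <= size <= MAX_SIZE:
        return Err.INVALID_VALUE    # priority 3
    return Err.OK

# Everything wrong at once: the handle error wins.
print(set_buffer(None, -1, "bogus"))   # Err.INVALID_HANDLE
```

The point of fixing the order in the spec is that callers (and tests) get a deterministic answer no matter how many rules a single call violates.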
To make sure everything works as expected, you do all sorts of things:
Positive testing - making sure things do what they are supposed to do. You look at all the input parameters, work out which combinations are critical, and test those together.
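One common way to cover the critical combinations is a boundary-value sweep: pick the smallest, largest, and a typical value for each parameter, then exercise every combination. A sketch, where `create_texture` and its limits are hypothetical stand-ins for the API under test:

```python
# Positive-test sweep: enumerate critical values per parameter and
# exercise every combination.  create_texture, its limits, and its
# formats are assumptions, not a real API.
from itertools import product

def create_texture(width, height, fmt):
    assert 1 <= width <= 4096 and 1 <= height <= 4096
    assert fmt in ("rgb", "rgba")
    return (width, height, fmt)      # stand-in for a real handle

# Boundary values matter most: smallest, largest, and something typical.
widths  = (1, 256, 4096)
heights = (1, 256, 4096)
formats = ("rgb", "rgba")

results = [create_texture(w, h, f)
           for w, h, f in product(widths, heights, formats)]
print(len(results))   # 3 * 3 * 2 = 18 combinations
```

Even three values per parameter multiplies up quickly, which is exactly why you work out in advance which combinations are actually critical.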
Negative testing - making sure things handle incorrect input. Test every single wrong input one at a time, then move on to pairs of wrong inputs, and finally all inputs bad at once.
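Those three stages can be sketched mechanically: start from a known-good call, corrupt one field, then every pair of fields, then all of them, and check the call is rejected each time. `validate` and its rules are hypothetical:

```python
# Negative testing in stages: each bad input alone, then pairs, then
# everything wrong at once.  validate() and its rules are hypothetical.
from itertools import combinations

GOOD = {"size": 64, "mode": "static", "count": 1}
BAD  = {"size": -1, "mode": "bogus",  "count": 0}

def validate(size, mode, count):
    return size >= 0 and mode in ("static", "dynamic") and count >= 1

def call_with_bad(fields):
    """Start from known-good arguments and corrupt the given fields."""
    args = dict(GOOD)
    for f in fields:
        args[f] = BAD[f]
    return validate(**args)

# Stage 1: one bad input at a time.
for f in GOOD:
    assert call_with_bad((f,)) is False

# Stage 2: every pair of bad inputs.
for pair in combinations(GOOD, 2):
    assert call_with_bad(pair) is False

# Stage 3: all inputs bad at once.
assert call_with_bad(tuple(GOOD)) is False
print("all negative cases rejected")
```

Corrupting fields one combination at a time also tells you *which* bad input the implementation noticed first, which is where the error-priority rules in the spec get exercised.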
Then there are random code and data tests, which just generate random streams of commands. These can surface failures that hand-written tests miss. That method was used to test network device drivers: you just blasted the poor device with a random stream of packets of all values and sizes, and investigated whenever something went wrong. Unfortunately, randomly generated code usually ends up looking more like an entry in an obfuscated coding competition.
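A toy fuzzer in that spirit: blast a parser with random packets of all sizes and values, treat expected rejections as fine, and flag anything else as a real bug. `parse_packet` is a hypothetical stand-in for the driver code; seeding the generator keeps failures reproducible so you can replay them.

```python
# Toy fuzzer: feed random packets to a parser and count unexpected
# failures.  parse_packet and its format are assumptions.
import random

def parse_packet(data: bytes):
    """Hypothetical format: [declared length][payload...]."""
    if not data:
        raise ValueError("empty packet")
    if data[0] != len(data) - 1:
        raise ValueError("length mismatch")
    return data[1:]

random.seed(1234)        # reproducible, so any failure can be replayed
crashes = 0
for _ in range(10_000):
    size = random.randrange(0, 64)
    pkt = bytes(random.randrange(256) for _ in range(size))
    try:
        parse_packet(pkt)
    except ValueError:
        pass             # expected rejection of malformed input
    except Exception:
        crashes += 1     # anything else is a real bug
print("unexpected crashes:", crashes)
```

The try/except split is the heart of it: a rejection is the API doing its job, while any other exception (or a hang, or corruption) is the kind of thing random blasting exists to find.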
Getting real-world data is the best test, especially when the code is multi-threaded; then all sorts of weird behaviour can appear.
The