If a toddler tries to eat a bee, you'd better believe it's never going to try that again, and it won't try to eat anything else insect-like either. And you don't even have to show it thousands of pictures of insects so it 'learns' what they are. That kind of one-shot learning is inherent to intelligence, which this rover does not have.
Even a toddler has had years' worth of dual high-resolution video feeds, plus the experience of motion, ego-motion and interaction with objects in 3D space. For an AI, those pictures are literally all it knows about existence. It's the equivalent of a toddler that's been completely paralyzed since birth, blind in one eye, and sedated while cared for, so that the only visual stimulus it has ever experienced is that slideshow. It's not an apples-to-apples comparison.
We've recently been making great strides in unsupervised pre-training. Basically, we let the model study a ton of unlabeled data first, and it turns out that afterwards we need only a few labeled examples and counterexamples to identify a class. The reason you need the counterexamples is that a handful of positives alone doesn't tell you whether the class you're after is "insect" or "bee", so you also need some information about what's not a bee.
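The pre-train-then-few-shot idea above can be sketched in a few lines. Everything here is a toy stand-in of my own: the "pre-trained encoder" is faked with a fixed random projection so the sketch runs on its own, and the "bee"/"not bee" data are synthetic clusters. The point is only the shape of the recipe: embed everything with the pre-trained model, then classify new inputs against centroids built from a handful of labeled examples and counterexamples.

```python
import numpy as np

# Stand-in for a pre-trained encoder: in reality this would be a network
# trained on a ton of unlabeled data; here a fixed random projection from
# 64-dim "images" to 8-dim features keeps the sketch self-contained.
rng = np.random.default_rng(0)
ENCODER = rng.normal(size=(64, 8))

def embed(x):
    """Map a raw input to a unit-length feature vector."""
    v = x @ ENCODER
    return v / np.linalg.norm(v)

# A few labeled examples AND counterexamples (synthetic clusters):
bee_proto = rng.normal(size=64)
other_proto = rng.normal(size=64)
bees = [bee_proto + 0.1 * rng.normal(size=64) for _ in range(3)]
others = [other_proto + 0.1 * rng.normal(size=64) for _ in range(3)]

# Nearest-centroid classification in embedding space: average the few
# labeled embeddings per class, then compare cosine similarities.
bee_centroid = np.mean([embed(b) for b in bees], axis=0)
other_centroid = np.mean([embed(o) for o in others], axis=0)

def is_bee(x):
    e = embed(x)
    return float(e @ bee_centroid) > float(e @ other_centroid)

# A fresh sample near the bee cluster lands on the bee side.
query = bee_proto + 0.1 * rng.normal(size=64)
print(is_bee(query))
```

Without the `others` counterexamples there would be no second centroid to compare against, which is exactly the "insect vs. bee" ambiguity: positives alone don't pin down the boundary of the class.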
Of course that's still only a fraction of the human mind, but it's sometimes hard to distinguish reasoning ability from information compression. The more you're able to reason about the world, the more you can derive, and the more compact your representation of the world becomes. If a scientist turns raw observations into a formula, we call that intelligence; if a computer creates a compact representation by decomposing a complex result, we don't.
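To make the "observations into a formula" framing concrete, here's a toy example of my own (not from any particular system): a thousand noisy measurements that collapse to two fitted parameters once the underlying law is found. In a minimum-description-length sense, being able to derive the law is what makes the representation compact.

```python
import numpy as np

# 1000 raw observations generated from a simple hidden law plus tiny noise.
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 1000)
y = 3.0 * x + 2.0 + 0.01 * rng.normal(size=x.size)

# Least-squares fit: recover the law from the raw observations.
slope, intercept = np.polyfit(x, y, 1)

# The dataset was 1000 numbers; the "formula" is 2 parameters plus
# residual noise. Deriving more structure means storing less.
print(slope, intercept)
```

Whether we call the fit "reasoning" or "compression" is exactly the distinction the paragraph above is questioning: the operation is the same either way.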