AI cannot extrapolate, by construction. It can interpolate better than anyone, but as soon as you leave its training data, it has no clue anymore.
You can feed the best AI a trillion photos of cats, but if none of them includes a black cat, it will be fundamentally unable to tell you that a picture of a black cat contains a cat.
The illusion that it can extrapolate comes from the fact that these models are fed humongous amounts of data, so even just interpolating is mostly good enough, because you rarely get near the edge of the dataset.
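If you want to see the interpolation/extrapolation point in miniature, here's a toy sketch in Python. To be clear, it's plain curve fitting, nothing to do with image models, and the numbers are made up purely for illustration: the fit is excellent anywhere inside the training range and garbage the moment you step outside it.

```python
# Toy example: a model fit on a bounded range interpolates well
# inside that range and falls apart outside it.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 2 * np.pi, 200)              # the "dataset"
y_train = np.sin(x_train) + rng.normal(0, 0.05, 200)  # noisy observations

# A degree-7 polynomial is flexible enough to interpolate sin(x) nicely here.
model = np.poly1d(np.polyfit(x_train, y_train, deg=7))

print(model(np.pi / 3), np.sin(np.pi / 3))   # inside the data: very close
print(model(4 * np.pi), np.sin(4 * np.pi))   # outside the data: wildly wrong
```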
An artificial algorithmic "mind" knows what it knows.
A human mind can conceive that there are things it does not know.
So AI can sort through and retrieve an item, along with that item's entire ranked list of adjacent items, faster and more reliably than any human brain. But it cannot break its own rules. You can ask AI to generate an image of a cat with nine tails and zebra-striped fur, and it can do that because you - the human mind - prompted it to invoke those specific rules (cat, nine, tails, zebra, fur) and combine them in specific ways. But if you - a human mind - paint a photorealistic image of a cat with nine tails and zebra-striped fur wearing a ballet tutu, a top hat, and a monocle, swimming past a coral reef, then present that image to an AI program and ask it to identify the thing in the picture, it might return anything. And, whatever it outputs as its answer, if you follow up with "Are you sure that's correct? It has a tutu like a ballerina, stripes like a zebra, a top hat like a Victorian gentleman, (etc.)" it is very likely to output a completely different guess, or (if programmed by human minds to do so) to say that it can't be sure.
Meanwhile, every (neurologically typical) human being over the age of 5 will immediately say "a cat," or something like "a cat with zebra fur and a lot of tails and a tutu and...". If you follow up and ask, "Are you sure that's correct? Maybe it's something else," they will say, "It's a weird cat, but yeah, it's a cat" - even though no such cat has ever existed in the history of the world, and thus in any taxonomic, scientific, reality-based sense that image does not contain a cat.
The human mind has the capacity for unknowns, so it can instantly expand to create space where that completely-not-a-real-thing is still a cat.
The intriguing "magic" is that you could train a current-gen AI to recognize that image as a cat. But only IF you first created tens of thousands of images of equally bizarre cats, mixed them with images of equally bizarre dinosaurs, giraffes, cactuses, etc., then had thousands of human minds click CAPTCHA tests on the images, then added that labeled data to the AI's training set. That is, the AI could eventually hit a high accuracy rate on bizarrely visualized cats after being shown thousands or millions of instances of cats. Meanwhile, a 5-year-old human who may have seen only a small handful of cats in their entire life, and never once one wearing a tutu/hat/monocle/stripes while swimming in the ocean, will instantly and with total certainty know "that's a cat".
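For concreteness, the retraining loop described above would look roughly like this. This is a minimal sketch, not anyone's actual pipeline: the bizarre_animals/ folder of human-labeled images is hypothetical, and I'm just fine-tuning a stock pretrained classifier from torchvision on it.

```python
# Sketch of the "label it with CAPTCHAs, then feed it back in" loop:
# fine-tune a pretrained image classifier on human-labeled bizarre-animal images.
# The bizarre_animals/ folder (bizarre_animals/cat, bizarre_animals/giraffe, ...)
# is hypothetical; each subfolder name is the label humans assigned.
import torch
from torch import nn
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
data = datasets.ImageFolder("bizarre_animals/", transform=transform)
loader = torch.utils.data.DataLoader(data, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(data.classes))  # new output head

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```

The point isn't the code; it's that every bit of added "recognition" here is bought with more labeled examples, which is exactly the contrast with the 5-year-old.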
That's one reason I scoff at claims that we're right around the corner from (or already across the line to) human-level general intelligence if we just (n+1) faster. Every big LLM out there has "read" more text than a random sample of a billion human beings. Human understanding is NOT just about adding more instances of a thing. There's something else (and a lot of it) going on that makes human conscious cognition what it is.
What's even more provocative to me is watching what happens when your toddler learns what a thing is. They become obsessed with it. For the next several months, every time a cat (in the flesh or in a picture) comes within view, they point and excitedly say "Cat!!!" "DaDa! Cat!!!" Same thing for every airplane in the sky, every bird at the window. They are hungry for ideas to fill their capacity. The human animal inherently LOVES acquiring concepts. We crave them. They give us pleasure.
It's why elementary-school kids love corny jokes. A joke takes a concept, then jumps from it to a different, unexpected concept. The brain gets the pleasure of knowing the first concept and is following along, then gets a hit of pleasure from suddenly having a new concept thrust into the concept space and recognizing both the nature of the connection and the jump. The greater the apparent distance between the two concepts, the more pleasure we derive from it. Almost all kid jokes (and a huge percentage of grown-up humor) take one of two forms:
1) Banal context followed by a weird, incongruous situation. Knock knock. Who's there? Me. Me who? Meow!
OR
2) Weird context followed by banal situation. Why did the chicken cross the road? To get to the other side!
Five-year-olds absolutely love jokes like #1, precisely because of that pleasure hit from making a new connection. Until they first hear the joke, the word "me" has only ever been used to refer to the self. The word "meow" has only ever been used to refer to the sound a cat makes. The concepts of "self" and "sound of a cat" have zero conceptual connection. That is, semantically, the distance between them is effectively infinite. They connect only through an accident of the phonetics of English and a few other languages in which both words start with the same syllable. And kids get that. They don't know the word "onomatopoeia," but they already understand the concept and know how to work with it.
You can melt every ounce of ore in the world and refine every drop of oil to produce 60 trillion GPUs, then use every square centimeter of land to build one massive datacenter, and the absolute best of our current AI models will still not get the joke. They will have less understanding and less concept-manipulation capacity than a 5-year-old. It's all still just programmatic I/O and logic gating at this point.