[ouch -- overly long comment ahead... I wonder if anyone will read it :( ]
I have also never seen a good popular explanation of entanglement: either they go full weirdo saying things like "instantaneous something" or they don't really explain entanglement, like your example about numbers in a hat. The problem is, to understand exactly what physicists mean by "quantum entanglement", you first must understand measurement, and in particular, that you can choose many different ways to measure each particle from the entangled pair. If you're willing to spend some time, I recommend the excellent series of lectures starting in this video. The material is pretty self-contained, except for some matrix algebra and complex numbers. The lectures are given by Leonard Susskind, a famous physicist.
Now, if you just want to understand why entanglement is different from your "numbers in a hat" example, I'll try to explain as simply as I can with a simple example (which has the basic idea of Bell's inequalities and their violation; you can search Wikipedia if you like).
OK, consider a "game" with these elements:
- (1) two people, Alice and Bob, the players in the game
- (2) Alice and Bob each have a coin. Their coins are fair (i.e., 50/50) and, when flipped, give the result "0" or "1". Call the results of Alice and Bob flipping their coins ca (for "Coin-Alice") and cb (for "Coin-Bob"). Clearly, ca and cb are completely unrelated, because the coins are fair and flipped independently. Alice can see the result of her coin, but NOT Bob's, and vice-versa.
- (3) the hat from your example, which generates either two "0"s or two "1"s. You'll always give one of the numbers to Alice and the other one to Bob, and each of them can NOT see what the other one got. Call the two generated numbers ha and hb (for Hat-Alice and Hat-Bob). So, you'll always have either ha=hb=0 or ha=hb=1.
- (4) Alice and Bob each have a piece of paper where they have to write either "0" or "1". Call the two written numbers pa and pb (for Paper-Alice and Paper-Bob). Neither of them can see the other one's paper.
The game is played by Alice and Bob collaborating to win -- either they both win or they both lose. Before playing, they can discuss their strategy, once they've learned the rules which I'm about to explain. But once the game starts, they can't talk anymore (i.e., no information can pass between them).
The rules of the game are as follows: when the game starts, Alice and Bob are given the numbers from the hat (ha and hb), then flip their coins (ca and cb) and write on their papers (pa and pb). They win if the resulting numbers satisfy the following equation:
pa XOR pb = ca * cb
and lose otherwise. Here, XOR is the usual XOR operation on bits, and "*" is the usual multiplication. Remember that all numbers in the equation (pa, pb, ca, cb) are either "0" or "1". Also note that the numbers from the hat are not used to determine if they win or lose, but Alice and Bob can use the hat numbers when deciding what to write, according to whatever strategy they agreed on.
Now, note that if they both always write "0" on their papers, ignoring the hat completely, they win 75% of the time, because 0 XOR 0 = 0, and ca*cb will only be different from 0 if both coins are 1, which happens only 25% of the time. It's not too hard to prove (mathematically) that if their coins are indeed completely fair, any strategy they use will give them at most a 75% chance to win. It's impossible to do better without cheating.
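If you'd rather check the 75% limit than prove it, here's a rough brute-force sketch in Python. It assumes each player's strategy is a fixed lookup table from (hat number, own coin) to the number they write; the function names are made up for illustration. It tries every pair of such tables and averages over the equally likely hat and coin outcomes.

```python
from itertools import product

def wins(pa, pb, ca, cb):
    """Winning condition from the game: pa XOR pb == ca * cb."""
    return (pa ^ pb) == (ca * cb)

def best_classical_win_rate():
    # One player's deterministic strategy: a table (hat bit, coin bit) -> paper bit.
    # There are 4 possible inputs, so 2**4 = 16 tables per player.
    inputs = list(product((0, 1), repeat=2))
    strategies = [dict(zip(inputs, outs)) for outs in product((0, 1), repeat=4)]

    best = 0.0
    for alice, bob in product(strategies, repeat=2):
        won = 0
        # The hat gives both players the same bit h; the coins are independent.
        for h, ca, cb in product((0, 1), repeat=3):
            won += wins(alice[(h, ca)], bob[(h, cb)], ca, cb)
        best = max(best, won / 8)  # 8 equally likely (h, ca, cb) combinations
    return best

print(best_classical_win_rate())
```

Strategies that involve extra randomness are just mixtures of these fixed tables, so they can't beat the best table either.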
So far, nothing involves anything quantum, and there's no entanglement. What I described is called a "Bell inequality" (win probability <= 75%); this particular version is usually known as the CHSH game. You can look it up on Wikipedia for a much more complicated explanation of basically the same thing.
Now, the real reason for all this business is this: if you replace the hat with a source of entangled photons (or a pair of anything that's entangled), then it's possible to do better than 75%, provided Alice and Bob have a strategy for deciding which measurements to make on the photons that came out of the "hat" based on the results of their coins. This experiment has been done in the laboratory by many different people, and it shows that nature does indeed behave like this, i.e., it's not really behaving like "numbers from a hat" (the best possible win rate is about 85% with the right "strategy", as predicted by the theory that explains entanglement, i.e., Quantum Mechanics).
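For the curious, here's a small sketch of where the quantum number comes from. It takes one fact from Quantum Mechanics on faith: for a pair of polarization-entangled photons in the state (|HH> + |VV>)/sqrt(2), if Alice and Bob measure polarization along angles a and b, they get the same result with probability cos^2(a - b). The angles below are one standard textbook choice, not something from the game's rules, and the names are just for illustration. Each player picks an angle based on their own coin and writes down the measurement result.

```python
import math

def p_same(theta_a, theta_b):
    """Probability that both polarization measurements agree
    (assumed quantum prediction for the state (|HH> + |VV>)/sqrt(2))."""
    return math.cos(theta_a - theta_b) ** 2

# Assumed strategy: measurement angle chosen from each player's own coin.
ALICE_ANGLE = {0: 0.0, 1: math.pi / 4}
BOB_ANGLE = {0: math.pi / 8, 1: -math.pi / 8}

def quantum_win_rate():
    total = 0.0
    for ca in (0, 1):
        for cb in (0, 1):
            same = p_same(ALICE_ANGLE[ca], BOB_ANGLE[cb])
            # They need pa XOR pb == ca * cb:
            # different answers only when both coins are 1, equal answers otherwise.
            p_win = (1 - same) if ca * cb == 1 else same
            total += 0.25 * p_win  # each coin combination has probability 1/4
    return total

print(quantum_win_rate())
```

With these angles every coin combination wins with the same probability, cos^2(pi/8), which is the ideal value from the theory and is comfortably above the 75% limit for the hat. Real experiments have imperfections, so measured values come out somewhat lower.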
Unfortunately, to understand that you must understand what exactly is meant by "measurement", how you define a measurement, etc. That's what's explained in the lectures starting in the video I linked at the start of this (now extremely long) comment.
I hope this helps to at least give a hint about how entanglement can be different than the hat in your example.