While I agree with your general concept (I think), this example doesn't demonstrate it because you don't have enough context.
Why would the child giggle when Dad eats a bite of chocolate? Simply put - the child wouldn't, assuming this is his first observation of the pattern of behavior. The child would have no reason to giggle, because nothing is inherently funny about it. Dad (and Mom, for that matter) goes into the refrigerator to get food all the time. Mom (who most likely does the grocery shopping) puts food in the refrigerator all the time.
The child would giggle if he had observed a pattern of behavior where:
1. Mom puts a "special food item" (like a chocolate bar) in the fridge
2. Dad sneaks a bite of this special item without Mom's awareness
3. Mom later discovers a missing bite of her food
4. Mom (or Dad) responds with some behavior that the little boy decides is funny
So on subsequent repetitions of this pattern, the boy sees steps 1 & 2, and mentally projects steps 3 and 4, causing him to giggle.
Alternatively, Dad cues the little boy during the initial iteration of step 2 that this is a funny action (perhaps by acting in a silly manner, smiling and laughing more than normal, etc.), in which case his laughter has nothing to do with a mental simulation, but is merely reflecting Dad's attitude.
The bottom line is -- humans are REALLY GOOD at pattern recognition, and computers less so (currently). What you call "simulations" I see as simply extrapolations of observed behavior patterns -- and if computers got good at autonomously recognizing human behavior, they'd get good at "simulations" too.
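To make the "extrapolation of observed patterns" idea concrete, here's a minimal sketch (the event labels are made up for illustration): a program that records observed event sequences and, given only a prefix, projects the most frequently observed continuation. Nothing fancier than prefix matching is going on, yet it produces exactly the "see steps 1 & 2, project the rest" behavior described above.

```python
from collections import Counter, defaultdict

class PatternExtrapolator:
    """Learns observed event sequences and predicts likely continuations."""

    def __init__(self):
        # Maps an observed prefix (tuple of events) to counts of what followed.
        self.next_event = defaultdict(Counter)

    def observe(self, sequence):
        # For every prefix of the sequence, record which event came next.
        for i in range(1, len(sequence)):
            prefix = tuple(sequence[:i])
            self.next_event[prefix][sequence[i]] += 1

    def extrapolate(self, prefix):
        # Follow the most common next event until no continuation is known.
        prefix = tuple(prefix)
        result = []
        while self.next_event[prefix]:
            nxt = self.next_event[prefix].most_common(1)[0][0]
            result.append(nxt)
            prefix = prefix + (nxt,)
        return result

# The boy's repeated observations (hypothetical event labels):
pattern = ["mom_stores_treat", "dad_sneaks_bite",
           "mom_finds_bite_missing", "funny_reaction"]
model = PatternExtrapolator()
for _ in range(3):  # several repetitions of the pattern
    model.observe(pattern)

# Having seen only steps 1 and 2, the model "projects" the rest:
print(model.extrapolate(["mom_stores_treat", "dad_sneaks_bite"]))
# -> ['mom_finds_bite_missing', 'funny_reaction']
```

No "simulation" machinery is involved - the projection falls straight out of counting what has previously followed what, which is the point: good autonomous pattern recognition gets you the apparent simulation for free.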