No (Score 1)
"The behavior, known in the research community as sycophancy, stems from how these models are trained: reinforcement learning from human feedback, or RLHF, rewards responses that human evaluators prefer, and humans consistently rate agreeable answers higher than accurate ones."
No, it's because in the training corpus, most of the responses to "are you sure?" that anyone bothered to record involve someone being corrected.