Let's consider two scenarios here.
In the first case, the camera is not panning, just filming the scene as it is, and the projector plays it back at the filmed rate. Viewing the projection is then the same experience as looking at the scene in real life, to within the fidelity of the playback. Notably, there is no depth (or only a poor simulation of depth with forced focus), but apart from that, higher fidelity should be more realistic. The viewer's eyes will be jumping around the big screen and blinking just like normal, so there is absolutely no reason to try to "simulate" that; you have the real-life effect already occurring. The same goes for motion blur: the eye will supply the same amount of blur that it does in real life, so there is no reason to simulate it, beyond compensating for too *low* a frame rate, which requires a longer integration time to avoid appearing choppy.
And yet it is exactly this sort of scene that was causing people to deride 48fps as being "soap opera like". They talked about how watching the Hobbits slowly walk down the hill towards them looked epic in 24fps, and looked like a documentary in 48fps. It destroyed the suspension of disbelief for them, and made them think they were looking at actors, not Hobbits. That has nothing to do with faking limitations of human vision. It is completely psychological; whether that psychological effect is inherent in the medium or the result of prior conditioning is debatable, though.
The second scenario is where the camera is panning, and thus forcing visual motion on the viewer even though they didn't initiate it. This is identical to being smoothly flown around a scene, and how "realistic" it is will depend on whether that would actually happen in real life. In situations where it is realistic, my argument above would apply; the eye will be looking around the moving scene just like it would when looking out a train window.
On the other hand, in situations where panning is being used to simulate human motion, I would argue that 48fps could allow the filmmaker to have more realistic view changes if they want them. A low rate like 24fps forces the director to use slow, gradual pans, lest they create a choppy or blurry mess as a result of the limitations of the rate. However, as you pointed out, the eye doesn't work that way. It jumps around, taking time to settle and focus each time. If you tried to do that at 24fps the viewer would get lost, unable to follow the transitions. In large part this is because in real life they are controlling the transitions, so they know in advance where the view is changing to, but to a lesser extent it is due to the limitations of the frame rate. Faster frame rates will allow for more abrupt transitions that are still possible to follow.
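
To put rough numbers on that last point, here is a minimal sketch of how far the image jumps between consecutive frames during a pan. The function name `pan_step_per_frame` is hypothetical, and the pan rate, field of view, and frame width are made-up illustrative values, not figures from any real production. It also assumes the image shifts linearly with pan angle, which is only approximately true.

```python
def pan_step_per_frame(pan_rate_deg_per_s, fps, fov_deg=40.0, width_px=4096):
    """Approximate horizontal image shift between consecutive frames, in pixels.

    fov_deg and width_px are assumed example values for the lens and the frame.
    """
    degrees_per_frame = pan_rate_deg_per_s / fps
    # Assumption: pixels map linearly onto degrees across the field of view.
    pixels_per_degree = width_px / fov_deg
    return degrees_per_frame * pixels_per_degree


# Compare the same (assumed) 60 degrees-per-second pan at both frame rates.
for fps in (24, 48):
    step = pan_step_per_frame(60.0, fps)
    print(f"{fps} fps: {step:.0f} px per frame")
```

Whatever values you plug in, doubling the frame rate halves the per-frame jump, so under this simple model a pan can be twice as fast at 48fps before it reaches the same frame-to-frame displacement it would have at 24fps.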