The display buffers a frame ahead, and delays audio to match. So it has a start point and an end point.
That's fully consistent with what OP said. The issue is the boundary condition where the pendulum/waving hand/etc. changes direction, where start and end points have little meaning in a vacuum.
For example, say frame 1 shows the pendulum at x:47, frame 2 shows it at x:50 and frame 3 shows it at x:48. You'd need a lot more information and horsepower than these motion smoothing algorithms currently have in order to make anything but a guess as to where the pendulum should appear in the interpolated frames (50, 51, 50, 49, 48? 50, 50, 49, 49, 48? 49, 49, 49, 48, 48?). Wrong guesses register as unnatural in our brains, which are keeping track of the broader context of the motion based on expectations of the underlying mechanics/physics.
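To make the ambiguity concrete, here is a rough Python sketch using the three positions above. It is not how any actual TV does motion smoothing; `lerp_frames` and `quadratic_frames` are hypothetical helpers I made up for illustration. One draws a straight line between frames 2 and 3, the other fits a parabola through all three frames.

```python
def lerp_frames(start, end, steps):
    """Return steps + 1 evenly spaced positions from start to end (straight line)."""
    return [start + (end - start) * i / steps for i in range(steps + 1)]


def quadratic_frames(x0, x1, x2, steps):
    """Fit a parabola through three frames at t = 0, 1, 2 (Lagrange form)
    and sample it between the second and third frame, t in [1, 2]."""
    def position(t):
        return (x0 * (t - 1) * (t - 2) / 2
                - x1 * t * (t - 2)
                + x2 * t * (t - 1) / 2)
    return [position(1 + i / steps) for i in range(steps + 1)]


# Frames 1, 2, 3 from the example: 47, 50, 48 (assumed to be evenly spaced in time).
linear = lerp_frames(50, 48, 4)
quadratic = quadratic_frames(47, 50, 48, 4)

print("linear:   ", linear)
print("quadratic:", quadratic)
```

Both guesses hit frame 2 and frame 3 exactly but disagree in between, and neither knows whether the real pendulum swung past 50 before turning around. The straight-line version can never go beyond 50, so if the true turnaround happened between frames, it is simply lost.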