Simply stated, the problem is that the camera is 4D, because it is a 2D array of 2D cameras, while the image is 3D.
You cannot simply add the dimensions; it depends on how you integrate the image data together. Us people who don't know very much call this integration 3D reconstruction. "The image is 3D" - do you mean the real world is 3D? The image, as you put it, is a projection from 3D onto a 2D plane and is most definitely 2D.
Humans possess a stereoscopic vision system, with each eye capturing a 2D image at any moment in time. I expect you would call that a 4D vision system?
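To make the projection point concrete, here is a rough sketch, assuming an idealised pinhole camera and two rectified views. The function names, the focal length and the baseline are all made up for illustration. It shows that a single view loses depth (two points on the same ray land on the same image coordinates) and that a second 2D view lets you get it back from the disparity.

```python
def project(point, focal_length=1.0, camera_x=0.0):
    """Pinhole projection of a 3D point (X, Y, Z) onto the 2D image plane
    of a camera sitting at (camera_x, 0, 0) and looking along +Z.
    Hypothetical helper, not taken from any library."""
    X, Y, Z = point
    return (focal_length * (X - camera_x) / Z, focal_length * Y / Z)


def depth_from_disparity(u_left, u_right, focal_length=1.0, baseline=1.0):
    """Recover depth Z from the horizontal disparity between two rectified
    views whose cameras are `baseline` apart: Z = f * B / (u_left - u_right)."""
    return focal_length * baseline / (u_left - u_right)


# Two different 3D points on the same ray give the same 2D image point,
# so one image on its own carries no depth.
near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)
same_pixel = project(near) == project(far)

# A second camera 0.5 units to the right sees the near point elsewhere,
# and the shift between the two views encodes its depth.
u_left, _ = project(near, camera_x=0.0)
u_right, _ = project(near, camera_x=0.5)
recovered_z = depth_from_disparity(u_left, u_right, baseline=0.5)
```

Two 2D images plus some integration give you one 3D estimate, which is why counting it as 2 + 2 = 4 dimensions does not work.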
SOLUTIONS: All solutions rely on making the camera 3D instead of 4D.
1. Take a short movie while changing focus on a 2D camera.
So now you add a third dimension - time. And the concept of Depth from Focus is not your idea at all; it has been around for a very long time.
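For anyone who has not met it, Depth from Focus can be sketched in a few lines. This is only a minimal illustration, assuming the "short movie" is a focal stack of greyscale frames stored as nested lists, and using the squared Laplacian as the sharpness measure, which is one common choice among several. The function names are hypothetical.

```python
def laplacian_energy(img):
    """Per-pixel squared 4-neighbour Laplacian of a 2D list of intensities.
    High values mean strong local contrast, i.e. the pixel is in focus.
    Border pixels are left at 0.0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (img[y - 1][x] + img[y + 1][x]
                   + img[y][x - 1] + img[y][x + 1]
                   - 4 * img[y][x])
            out[y][x] = float(lap * lap)
    return out


def depth_from_focus(stack):
    """Given a focal stack (list of equally sized frames, one per focus
    setting), return for each pixel the index of the sharpest frame.
    That index stands in for depth; ties go to the earliest frame."""
    measures = [laplacian_energy(frame) for frame in stack]
    h, w = len(stack[0]), len(stack[0][0])
    depth = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            depth[y][x] = max(range(len(stack)),
                              key=lambda k: measures[k][y][x])
    return depth
```

Turning the frame index into a metric distance needs the lens calibration for each focus setting, which this sketch leaves out.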
Your other idea involving "somehow" doing something cannot be considered prior art. In the grand scheme of things an invention has to be realisable.
I also consider it to be evidence that I am better at this stuff than almost all 3D camera engineers.
If I could see a smiley on the above line I wouldn't think you are an idiot.