What are you basing this assertion on? It sounds believable, but it also sounds like an unproven assumption.
I'm glad you asked.
1. I assert that it will always be more expensive to make an array of individual PV cells, with circuitry to route their output to readout logic and the global power pool, than to make a conventional sensor and a conventional PV cell with larger effective area. I can't "prove" this, but I can't imagine a realistic scenario where it wouldn't be true.
2. I observe that a sensor in an optical assembly, with light only entering through a lens, can only absorb light that falls on the lens. In fact, it can't absorb all of that: the lens will reflect or absorb some of the light, and some of what gets through won't land on the sensor unless the sensor covers the entire image circle the lens projects. So, if your lens has one square centimeter of aperture, and your light level is one milliwatt per square centimeter (pretty bright indoor lighting), you get less than a milliwatt onto the sensor.
3. I assert that any camera will have a housing with more surface area than its lens. That's where you put the (conventional) PV cell(s) to harvest energy.
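To put rough numbers on points 2 and 3, here's a back-of-the-envelope sketch. The 1 cm^2 aperture and 1 mW/cm^2 light level are the figures from point 2; everything else (lens transmission, how much of the image circle the sensor covers, the housing area available for PV, and the conversion efficiency) is a made-up illustrative value, not a measurement. It also assumes the housing PV sees the same light level as the lens, which is generous to the housing in a strongly directional scene.

```python
def harvested_power_mw(area_cm2, irradiance_mw_per_cm2, optical_fraction, efficiency):
    """Electrical power (mW) from a collector of the given area.

    optical_fraction is the share of incident light that actually reaches
    the light-harvesting surface; efficiency is optical-to-electrical.
    """
    return area_cm2 * irradiance_mw_per_cm2 * optical_fraction * efficiency


# Figures taken from point 2 of the post.
IRRADIANCE = 1.0        # mW/cm^2, "pretty bright indoor lighting"
LENS_APERTURE = 1.0     # cm^2

# Everything below is an ASSUMPTION chosen only for illustration.
LENS_TRANSMISSION = 0.9   # fraction not reflected or absorbed by the lens
SENSOR_COVERAGE = 0.6     # fraction of the image circle landing on the sensor
HOUSING_PV_AREA = 10.0    # cm^2 of housing surface available for PV
PV_EFFICIENCY = 0.15      # same efficiency assumed for both collectors

# Harvesting behind the lens: limited by the aperture, then by lens losses.
behind_lens = harvested_power_mw(
    LENS_APERTURE, IRRADIANCE, LENS_TRANSMISSION * SENSOR_COVERAGE, PV_EFFICIENCY
)

# Harvesting on the housing: no lens in the way, and more area to work with.
on_housing = harvested_power_mw(HOUSING_PV_AREA, IRRADIANCE, 1.0, PV_EFFICIENCY)

print(f"behind the lens: {behind_lens:.3f} mW")
print(f"on the housing:  {on_housing:.3f} mW")
print(f"housing advantage: {on_housing / behind_lens:.1f}x")
```

Change the assumed values however you like; as long as the housing has more usable area than the aperture, the exterior cell comes out ahead, and the behind-the-lens figure can never exceed aperture times light level.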
Feel free to poke holes in these assertions and observations, or to point out things that they don't cover. I'll take a crack at it myself:
a. Maybe you've got a situation where most light is coming from a single direction, and your camera faces that direction. You build your camera as a cylinder, perhaps embedded into a wall, so that light falls on no part of it except the lens. In that situation, you don't have a place to put an exterior PV cell. I think this is an unrealistic scenario; if it's built into a wall, plug it in, or surround it with a PV bezel.
b. Maybe your manufacturing is so good and so mature that microelectronics cost a flat rate per square centimeter, whether you're making a simple PV cell, a simple CMOS image sensor, or an integrated light-harvesting imaging array. From what I know of semiconductor manufacturing, this seems unrealistic, too -- but even if we get there, you can still collect more light with an exterior PV cell (from points 2 and 3), as long as you aren't also in contrived scenario a.
c. Maybe you're in a mature IoT scenario, where you've got smart dust everywhere, and it all needs to be self-powered. Here again, though, those dust grains will have more surface area than the lenses integrated into them. In this scenario, I'd expect something more like a fly-eye arrangement anyhow.
Any other ideas?