Aye, it may be impossible the way Tesla is trying to do it. Their original plan was for a coast-to-coast demo in 2017, which obviously failed.
What "failed" is that they had to start over from scratch because Mobileye felt it should own all of the self-driving data, and Tesla disagreed. So it took a few years to get back to where they were in 2016.
They actually could do a coast-to-coast demo now, and have had that capability for about a year. Their current difficulties are the same ones Waymo is having: when you make a left turn, you have to trust that other drivers will actually obey red lights and stop signs, and ignore the fact that their current velocity will cause a crash if they don't slow down or stop. Similarly, merging requires aggressive behavior that will cause an accident if the other driver ignores your attempt to merge.
Their problem is twofold. First, they underestimated the processing power needed to handle images from the cameras. They use neural nets to process them, and the original hardware they shipped (known as AP2) just wasn't powerful enough; they couldn't even get it to compare consecutive images (which helps when you don't have stereo vision). They moved to AP2.5 and now AP3, but it's not clear whether even that is fast enough for what they want to do.
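For context on why comparing consecutive images matters: a single moving camera can estimate depth from motion parallax, much like stereo vision uses the gap between two cameras. Here's a minimal toy sketch of that relation in Python (assuming pure sideways translation and a static scene; this is just the textbook geometry, not anything from Tesla's actual pipeline):

```python
def depth_from_disparity(disparity_px, baseline_m, focal_px):
    """Classic motion-parallax relation: depth = f * B / d.

    disparity_px: how far (in pixels) a feature shifts between two
                  consecutive frames
    baseline_m:   how far the camera moved between those frames
    focal_px:     camera focal length expressed in pixels
    Toy model only: assumes sideways translation and a static scene.
    """
    return focal_px * baseline_m / disparity_px

# A feature shifting 4 px between frames, with 0.5 m of camera motion
# and an 800 px focal length, works out to 100 m away.
print(depth_from_disparity(4.0, 0.5, 800.0))  # 100.0
```

The point is that this only works if the hardware can hold two frames and match features between them fast enough, which is what the AP2 chip reportedly couldn't manage.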
You should watch YouTube videos that show the shadow-mode debugging output, which tracks people, cars, bikes, road markings (lane boundaries, stop lines at lights), etc. in real time across all cameras. The hardware works fine for what it needs to do.
The second problem is that it's just really, really hard to use neural nets to do everything they need. It's not just recognizing objects like cars, signs and traffic lights. It has to see road markings, it has to see traffic police and understand their gestures, and it has to understand complex 3D spaces with no or poor road markings, like car parks and private driveways. It has to recognize small objects that the radar/ultrasonics won't pick up, like toll barriers close to the ground and the overhanging rear ends of trucks.
It isn't as hard as you seem to think, and they aren't using NNs for everything. Also, FSD doesn't have to handle every case: you can geofence it, so it never has to handle private driveways. Something that is level 5 for well-defined common use cases but doesn't do country roads in the middle of nowhere is still a major game changer; or something that announces "in 10 minutes we will be approaching the boundary for FSD, please take over soon".
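Mechanically, a geofence check is nothing exotic: it's just a point-in-polygon test against the boundary of the supported area. A minimal sketch (standard ray-casting algorithm; the San Francisco rectangle is a made-up example fence, not any real deployment boundary):

```python
def inside_geofence(lat, lon, polygon):
    """Ray-casting point-in-polygon test.

    polygon: list of (lat, lon) vertices in order. Coordinates are
    treated as planar, which is a fine approximation for a city-scale
    fence (this toy ignores Earth curvature).
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        y1, x1 = polygon[i]
        y2, x2 = polygon[(i + 1) % n]
        # Only edges that straddle the query latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

# Hypothetical rectangular fence roughly around San Francisco.
sf_fence = [(37.70, -122.52), (37.70, -122.35),
            (37.84, -122.35), (37.84, -122.52)]
print(inside_geofence(37.77, -122.42, sf_fence))  # True (downtown SF)
print(inside_geofence(36.00, -120.00, sf_fence))  # False (rural CA)
```

A production system would presumably use a proper geodesic library and buffer zones so the "please take over soon" warning fires well before the boundary, but the core check is this simple.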
To give you some idea of how far away they are, even the current driver assist parking isn't good enough for full self driving.
The driver-assist parking is an entirely different code base. It uses essentially none of the data that is being used for FSD development. They are parallel development tracks, with almost no resources devoted to the non-FSD stuff.