The deep learning revolution did not solve the problem you're claiming it did. What deep learning does is let more complex piecewise linear functions be modelled efficiently (if you use ReLU, that is, which is the most popular activation (*)). That's both a blessing and a curse.
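To make the piecewise-linear point concrete, here is a minimal NumPy sketch (my own illustration, not from any particular library or paper): a random one-hidden-layer ReLU network from R to R is linear everywhere except at a handful of breakpoints, at most one per hidden unit.

```python
import numpy as np

# Sketch: a tiny random ReLU network is piecewise linear.
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((8, 1)), rng.standard_normal(8)   # hidden layer, 8 units
W2, b2 = rng.standard_normal((1, 8)), rng.standard_normal(1)   # affine readout

def relu_net(x):
    h = np.maximum(0.0, x[:, None] * W1.T + b1)  # ReLU hidden activations
    return h @ W2.T + b2

x = np.linspace(-3, 3, 2001)
y = relu_net(x).ravel()

# Where the second difference is (numerically) nonzero, the slope changes: a "kink".
second_diff = np.abs(np.diff(y, 2))
kinks = np.sum(second_diff > 1e-6)
print(f"slope changes near ~{kinks} grid points out of {len(x)}; linear everywhere else")
```

Adding width and depth multiplies the number of linear regions, which is the "more complex piecewise linear functions" part.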
What actually happened in the deep learning revolution is that humans solved the problem of designing basic features over many generations of papers, progressively simplifying the solution and discovering what is important and what isn't. Algorithms were weeded out until we reached the point we're at now, where the data representation is matched to the algorithm; in this case the algorithm of choice happens to be of the deep learning type. It only looks like deep learning is good for every dataset, but that's not true.
For example, in vision problems, try training a deep network on input that is not in the form of pixels and not in the form of multiple colour planes. It will fail miserably; the quality of recognition will be abysmal. That's why data design is so important: you have to know what the strengths of the AI model actually are. In this case, the statistical regularities between neighbouring pixels are what enable the CNN layers to extract information, and those regularities are an artefact of choosing to stack pixels and colour planes into a rectangular grid. That choice is solving most of the problem.
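As a rough illustration of the "regularities between neighbouring pixels" point, here is a small sketch (synthetic image, NumPy only, everything here is my own stand-in example): the same values laid out on a grid have strong neighbour correlations, while the same values under a fixed random permutation have essentially none, so a convolutional prior has nothing to latch onto.

```python
import numpy as np

# Synthetic "image": a smooth gradient plus noise, as a stand-in for a natural photo.
rng = np.random.default_rng(0)
h, w = 64, 64
yy, xx = np.mgrid[0:h, 0:w]
image = (xx + yy) / (h + w) + 0.05 * rng.standard_normal((h, w))

def neighbour_correlation(img):
    """Correlation between each pixel and its right-hand neighbour."""
    left = img[:, :-1].ravel()
    right = img[:, 1:].ravel()
    return np.corrcoef(left, right)[0, 1]

# Same information, but fed through a fixed random permutation of pixel positions,
# i.e. the "not in the form of pixels on a grid" case.
perm = rng.permutation(h * w)
scrambled = image.ravel()[perm].reshape(h, w)

print("neighbour correlation, grid layout:    ", neighbour_correlation(image))
print("neighbour correlation, permuted layout:", neighbour_correlation(scrambled))
# The grid layout shows strong local correlation; the permuted layout is near zero.
# A CNN's local filters exploit exactly the former and get nothing from the latter.
```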
Now, pixels didn't always exist; they were invented by people quite recently. Try looking up TV technologies of the 1930s and you'll find that it's all about deflecting electron beams. There's really nothing natural about pixels; it's just what our current technologies are based on. And so there's nothing natural about what a deep network does either: it's a system that has been selected for fitness against our current tech stack, for a handful of high-value problem domains. That doesn't imply anything about other problem domains that haven't been studied so intensively.
(*) If you don't use ReLU but some other smooth activation family for your deep network, there will always be a close piecewise linear approximation, since piecewise linear functions are dense (uniformly, on compact sets) among the continuous functions you'd use as activations. So it's not a big loss of generality to assume ReLU everywhere.
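A quick numerical illustration of the footnote (my own sketch, assuming approximation on a bounded interval like [-5, 5]): linear interpolation of tanh between evenly spaced knots is a piecewise linear function, and its maximum error shrinks as knots are added.

```python
import numpy as np

x = np.linspace(-5, 5, 10_001)           # dense evaluation grid
for n_knots in (5, 17, 65, 257):
    knots = np.linspace(-5, 5, n_knots)  # breakpoints of the piecewise linear approximant
    pwl = np.interp(x, knots, np.tanh(knots))
    max_err = np.max(np.abs(pwl - np.tanh(x)))
    print(f"{n_knots:4d} knots -> max |tanh - pwl| = {max_err:.2e}")
# The maximum error keeps shrinking as knots are added: the "dense on a compact interval" claim.
```

Each such piecewise linear interpolant can itself be written as a sum of shifted ReLUs, which is why assuming ReLU costs so little in practice.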