I do network engineering at an ISP. We are small, though I have discussed these things with my peers at larger networks.
Once you scale above a very small network (like your home connection), allowing congestion isn't really okay in practice, even with QoS. When I say it's not "okay" here, I'm speaking purely technically.
It might be possible to let networks congest somewhat if you had a large amount of elastic traffic that you could reliably identify. Netflix, for example, could meet these criteria. But that's not okay politically; that's an example of why net neutrality is good!
QoS in carrier networks is only useful for priority (de-)queuing of traffic to reduce latency and jitter. For example, real-time voice or video traffic could benefit. This is where it'd be nice to actually be able to honor user traffic markings.
It's not (currently at least) practical to make the decisions on a flow-by-flow basis in the core of the network (which is what your proposal would require). This is a hardware scaling issue. To be clear, tracking flows statistically is okay at scale. ISPs do plenty with NetFlow/sFlow. But taking an incoming packet, assigning it to a flow, and marking it appropriately, for every packet, in real time is the scaling challenge.
The following approach would scale perfectly in trusted CPE (ONT/cable modem) or reasonably well in a DSLAM (for DSL). Give each user (for example) two queues. Honor the incoming DSCP markings. Put a small, but reasonable, limit on the size of the priority queue; traffic that overflows it gets re-marked as non-priority and placed into the non-priority queue. Then, honor markings through the rest of the network.
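To make the two-queue idea concrete, here is a rough sketch of what the per-subscriber logic might look like. Everything here is an assumption for illustration: the `SubscriberPort` class, the byte budget per interval, and treating DSCP 46 (EF) as "priority" are my own choices, not how any particular CPE or DSLAM implements it.

```python
from collections import deque
from dataclasses import dataclass, field

EF = 46  # DSCP Expedited Forwarding, treated as "priority" in this sketch
BE = 0   # DSCP best effort

@dataclass
class Packet:
    size: int  # bytes
    dscp: int

@dataclass
class SubscriberPort:
    """Hypothetical per-user port: one priority queue, one default queue."""
    priority_limit_bytes: int          # assumed priority budget per interval
    priority_used: int = 0
    priority_q: deque = field(default_factory=deque)
    default_q: deque = field(default_factory=deque)

    def enqueue(self, pkt: Packet) -> None:
        fits = self.priority_used + pkt.size <= self.priority_limit_bytes
        if pkt.dscp == EF and fits:
            # Honor the user's marking while they are under the limit.
            self.priority_used += pkt.size
            self.priority_q.append(pkt)
        else:
            if pkt.dscp == EF:
                # Over the limit: re-mark so the rest of the network
                # does not treat this packet as priority.
                pkt = Packet(pkt.size, BE)
            self.default_q.append(pkt)

    def dequeue(self):
        # Strict priority: drain the priority queue first.
        if self.priority_q:
            return self.priority_q.popleft()
        if self.default_q:
            return self.default_q.popleft()
        return None

    def new_interval(self) -> None:
        # Reset the budget; a real device would more likely use a token bucket.
        self.priority_used = 0
```

With a 300-byte budget, two 200-byte EF packets followed by a best-effort packet come out as one EF packet and then two best-effort packets, since the second EF packet gets re-marked. The point is that this needs only per-subscriber state, which is why it scales at the edge.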
There are a few problems with even this approach. First off, there are going to be users who legitimately create more high priority traffic than any limit that's acceptable across the board. Is it okay to charge them for a higher limit? If not, how do you avoid gaming the system? If yes, won't that incentivize ISPs to set the limit to zero and charge for all priority? Is that okay? If so, what fraction of people will request and pay for priority in that world? Will that be enough to encourage application developers to mark traffic appropriately? Or does this just degrade into our current zero-priority Internet?
Second, this only gets you one direction (upload). To handle the download direction, you'd need to honor priority bits on your upstream and peering links. But there, you can't trust the markings (unless it's a 1:1 peering link and you are guaranteed your peer implements a compatible policy at their incoming edge), at least without policing. Policing the queues there is easy, but gives you terrible results in real life. If the limit is exceeded with traffic that "should not have been" marked priority, it will destroy the prioritization of "legitimate" priority flows by forcing some fraction of their packets into the non-priority queue. If you accept all (or just a high enough fraction of) incoming traffic as priority traffic, then you have destroyed the prioritization yourself. If you try to mark flows per IP/customer, we're back to that scaling problem.
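A back-of-the-envelope way to see the policing problem on a peering link: assume the policer cannot tell "legitimate" from "bogus" priority packets and demotes the excess in proportion to what is offered. The function name and the numbers below are made up for illustration.

```python
def demoted_fraction(limit_mbps: float, legit_mbps: float, bogus_mbps: float) -> float:
    """Fraction of priority-marked traffic (legit or not) pushed to best effort.

    Assumes an aggregate policer that treats all marked packets alike.
    """
    offered = legit_mbps + bogus_mbps
    if offered <= limit_mbps:
        return 0.0
    return 1.0 - limit_mbps / offered

# 80 Mb/s of legitimate priority traffic fits under a 100 Mb/s limit...
print(demoted_fraction(100, 80, 0))    # 0.0
# ...but add 120 Mb/s of mis-marked traffic and half of every flow is demoted.
print(demoted_fraction(100, 80, 120))  # 0.5
```

Under those assumptions, the legitimate flows lose half their packets to the non-priority queue even though they would have fit under the limit on their own, which is the "terrible results" part.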
It might be possible to do something that involves tracking flows at the customer edge and using the incoming markings for the downstream direction. But this is only prioritizing in the last mile. At best, this is a lot of work for very little benefit.