1) It will run on top of IP, but by design it can work without it as well.
2) Well, you are missing the main reason behind this protocol. It exists because of sites like YouTube, which serve the same content to many users. The way things work now, YouTube sends the same content over and over, once for every single person. I could turn your question around and ask: who is paying for that bandwidth? I'm not just saying that this reduces network traffic and so lets the network handle higher demand. Any ISP that has to buy transit (i.e. is not Tier 1) is paying for access, and in general they pay according to the amount of data they receive. If 100 of their customers request the same data and it only has to be fetched once, the upstream traffic for that content, and with it the cost, drops to roughly 1/100. So the cost of YouTube isn't shifted onto someone else's shoulders; it is simply reduced. Anyway, it's not as if the ISPs wouldn't have control over which names are cached. The only ones who wouldn't be happy are the Tier 1 providers, who get paid by other ISPs for access.
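
To make the 100-requests arithmetic concrete, here is a rough Python sketch of a cache keyed by content name sitting at the ISP. It is only an illustration, not the protocol itself: the `NameCache` class, the `origin` function and the name `/video/abc` are all made up for this example, and a real cache would also need eviction and freshness rules.

```python
class NameCache:
    """Hypothetical ISP-side cache keyed by content name (illustration only)."""

    def __init__(self, fetch_upstream):
        self._fetch_upstream = fetch_upstream  # called only on a cache miss
        self._store = {}                       # name -> cached content
        self.upstream_fetches = 0              # how often we paid for upstream data

    def get(self, name):
        if name not in self._store:
            # Miss: fetch once from upstream and remember the result.
            self._store[name] = self._fetch_upstream(name)
            self.upstream_fetches += 1
        return self._store[name]


def origin(name):
    # Stand-in for the content provider; returns dummy bytes.
    return b"data for " + name.encode()


cache = NameCache(origin)
for _ in range(100):           # 100 customers ask for the same name
    cache.get("/video/abc")    # hypothetical content name

print(cache.upstream_fetches)  # 1
```

So 100 requests from customers turn into a single upstream fetch, which is where the roughly 1/100 figure comes from, assuming all of them ask for exactly the same name while it is still cached.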