I thought there was a Unicode code point shortage?
Nope. Unicode originally had room for only 65,536 code points, but Unicode 2.0 extended that to 1,112,064. And if the Wikipedia page on Unicode is to be believed, only 120,737 of those have been assigned characters as of Unicode 8.0.
Maybe that's just because UTF-8 has to maintain backward compatibility with ASCII.
Nope. The UTF-8 encoding scheme can actually represent even more code points than that, but any byte sequence that decodes to a value above 0x10FFFF is invalid UTF-8.
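To make that concrete, here's a sketch (the function name `utf8_encode_legacy` is my own) of the original, pre-RFC 3629 UTF-8 scheme, which mechanically extends to values well past 0x10FFFF. The byte pattern it produces for 0x110000 is well-formed under the old rules, but a modern strict decoder rejects it:

```python
def utf8_encode_legacy(cp: int) -> bytes:
    """Encode a code point with the original (pre-RFC 3629) UTF-8 scheme,
    which allowed values up to 0x7FFFFFFF via 5- and 6-byte sequences."""
    if cp < 0x80:
        return bytes([cp])  # ASCII range: one octet, equal to the code point
    # (continuation bytes, upper limit, leading-byte marker) for each length
    for n, limit, lead in [(1, 0x800, 0xC0), (2, 0x10000, 0xE0),
                           (3, 0x200000, 0xF0), (4, 0x4000000, 0xF8),
                           (5, 0x80000000, 0xFC)]:
        if cp < limit:
            out = [lead | (cp >> (6 * n))]
            for i in range(n - 1, -1, -1):
                out.append(0x80 | ((cp >> (6 * i)) & 0x3F))
            return bytes(out)
    raise ValueError("code point out of range for legacy UTF-8")

seq = utf8_encode_legacy(0x110000)   # one past the Unicode maximum
# The old scheme happily produces a 4-byte sequence...
assert seq == b'\xf4\x90\x80\x80'
# ...but a modern strict UTF-8 decoder refuses it:
try:
    seq.decode('utf-8')
except UnicodeDecodeError:
    pass  # expected: 0x110000 > 0x10FFFF is not valid UTF-8
```

For code points inside the Unicode range the legacy scheme agrees byte-for-byte with modern UTF-8; the restriction to 0x10FFFF just lops off the longer sequences.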
From what I understand, maintaining that compatibility wastes a few hundred other code points.
Nope. All that "maintaining backward compatibility with ASCII" requires is encoding code points 0x000000 through 0x00007F as a single octet equal to the code point, and never using those octets in the encoding of any other code point. The original encoding scheme could still represent everything up to 0x7FFFFFFF; measured against a full 32-bit space, it loses only the uppermost 2,147,483,648 code points and keeps the lower 2,147,483,648. That's far more than the Unicode limit of 1,112,064 code points.
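Both halves of that compatibility property are easy to check directly with any standard UTF-8 implementation, e.g. Python's built-in codec:

```python
# Property 1: code points 0x00-0x7F encode to a single octet equal to
# the code point itself.
for cp in range(0x80):
    assert chr(cp).encode('utf-8') == bytes([cp])

# Property 2: those octets never appear in the encoding of any other
# code point -- every byte of a multi-byte sequence has its high bit set.
# (A handful of boundary cases; surrogates are skipped since they are
# not encodable code points.)
for cp in [0x80, 0x7FF, 0x800, 0xFFFF, 0x10000, 0x10FFFF]:
    for octet in chr(cp).encode('utf-8'):
        assert octet >= 0x80
```

This is why naive byte-oriented tools (strchr-style scanning, splitting on ASCII delimiters) keep working on UTF-8 text: an ASCII byte in the stream can only ever mean that ASCII character.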