Google does all the heavy lifting on servers, for Glass and pretty much everything else that they do. The Glass just collects data and displays the results. It has barely any compute or storage, just enough to be a client to the web services, really. That's how they got it small, light, and relatively cheap (compared to previous similar devices).
Retinal scanning is great, but the goal is to ID people from a distance, so that the observer is just wearing Glass and watching a stream of people. If they're close enough to do a retinal scan, they might as well stop people and check IDs. Besides which, of course, normal people don't have a reference retinal scan to check against.
I agree that trying to do general population facial recognition would generate too many false positives, wasting everyone's time.
But there's also some room for optimization to improve the odds, and to find a use case that doesn't require perfection. For example, the system can narrow matches down to people with tickets for flights in the next few hours departing from that airport, and flag anyone who doesn't match for an ID check. To be useful it wouldn't have to be 100% accurate, or match everyone in the universe, so it's an easier computation - its value would be in letting security filter out the 80% of people who are known OK travellers, and interview the people who fall outside that profile.
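To make the narrowing idea concrete, here's a rough sketch of what I mean. Everything in it is made up for illustration: the Passenger record, the idea that faces are reduced to feature vectors, the cosine similarity measure, and the 0.9 threshold are all my assumptions, not anything Google or an airport actually uses.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Optional, Sequence


@dataclass
class Passenger:
    # Hypothetical record for someone ticketed on a flight leaving soon.
    name: str
    embedding: Sequence[float]  # assumed face feature vector


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Plain cosine similarity; returns 0.0 if either vector has zero length.
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def screen(face: Sequence[float],
           ticketed: Sequence[Passenger],
           threshold: float = 0.9) -> Optional[Passenger]:
    # Compare only against the small ticketed list, not the whole population.
    # Returns the best match, or None meaning "flag this person for an ID check".
    best = max(ticketed,
               key=lambda p: cosine_similarity(face, p.embedding),
               default=None)
    if best is not None and cosine_similarity(face, best.embedding) >= threshold:
        return best
    return None
```

The point is that the candidate list is a few thousand ticketed passengers rather than everyone in the universe, and a miss just means a manual ID check, so the matcher doesn't need to be perfect.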