Firstly, you apparently didn't read my comment that I wasn't discussing how apt works, only yum.
When Yum downloads something, it fetches a bunch of repo information (like apt-get update), then it downloads files (like apt-get install). To do this, it does... all the shit I described apt doing.
Secondly, the critical issue that you are missing is that if I install a package from an alternate repository (e.g. EPEL), my systems don't tell the main CentOS mirrors about those EPEL packages.
No, of course not. You tell Georgia Tech, the NSA CentOS Mirror, or Microsoft's Redmond CentOS mirror, at random, who you are and what you're downloading.
Multiple distributions and mirror maintainers coordinate in secret to keep security exploit details quiet until a patch is ready from everyone. There's an entire network of quiet discussion that happens, intentionally hidden from everyone, to make sure everyone hits the ground running. If you report a remote exploit in Firefox directly to Mozilla, Debian, Red Hat, Slackware, or Gentoo, marked as a security bug, they will keep the details private until everyone has patches ready; then they all release at once.
So you believe Microsoft is doing secret things dealing your data to secret partners in secret; but that Linux distributions might not be secretly collecting your data, or that various Linux mirrors that aren't controlled by those distributions aren't under the influence of others. That is: although AT&T was sucking up your phone data and piping it to the NSA, they apparently won't collect whatever scraps of OS update telemetry data hit their servers in the same way.
You're basically saying there's no network of bad actors out there, so instead of trusting "Debian", you trust everyone.
Finally, there is no fingerprinting involved in the yum transactions. If I have multiple machines behind a single IP address, the server doesn't have sufficient information to distinguish them. As well as having insufficient information to fingerprint individual systems, no user information is transmitted.
We've been able to identify individuals based on their Internet usage and TV usage, even from the same account, device, and browser. We can tell if your 16-year-old daughter or her 17-year-old sister is currently using the PC or watching TV.
I might have two x86-64 PCs running the same version of Ubuntu, and a Raspberry Pi; you can fingerprint at least three systems out of my usage habits, and identify one distinctly at least.
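To make that concrete, here's a rough sketch of how someone sitting on a mirror's logs could tell those three machines apart behind one IP. Everything in it is made up for illustration: the log format, the IP, the release strings, and the package names are all hypothetical, and a real mirror would have to infer architecture and release from the requested paths rather than read them from tidy fields. It's a toy heuristic, not how any actual mirror works.

```python
from collections import Counter, defaultdict

# Hypothetical update sessions as a mirror operator might reconstruct them:
# (source IP, architecture, release, set of packages fetched in that session).
# All values are invented for illustration.
SESSIONS = [
    ("203.0.113.7", "amd64", "14.04", {"firefox", "libreoffice"}),
    ("203.0.113.7", "amd64", "14.04", {"firefox", "nginx"}),
    ("203.0.113.7", "armhf", "14.04", {"python-rpi.gpio"}),
    ("203.0.113.7", "amd64", "14.04", {"firefox", "libreoffice"}),  # first PC again
]


def profiles_by_ip(sessions):
    """Group sessions by IP and collapse identical (arch, release, packages) profiles."""
    by_ip = defaultdict(set)
    for ip, arch, release, packages in sessions:
        by_ip[ip].add((arch, release, frozenset(packages)))
    return by_ip


def distinct_by_platform(profiles):
    """Return profiles whose (arch, release) pair is unique among the given profiles."""
    counts = Counter((arch, release) for arch, release, _ in profiles)
    return [p for p in profiles if counts[(p[0], p[1])] == 1]


profiles = profiles_by_ip(SESSIONS)["203.0.113.7"]
unique = distinct_by_platform(profiles)
```

Here `profiles` ends up with three entries for the one IP, and `unique` contains only the armhf machine: the two amd64 boxes are separated by what they install, and the Pi stands out on architecture alone.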
Through all of that...
In summary, yes I am leaking some information, but it is benign.
The leaking of what Microsoft software you've installed to Microsoft's servers is benign as well. Who fucking cares that Microsoft knows you have Office 2013 installed?