All I know is that FAT32 is incredibly simple to write to. It just writes the data sequentially into whatever free clusters are available on the disk, recording the address of each file's next cluster in a table (the File Allocation Table) at the beginning of the partition. NTFS must be much more complex in how it handles writes, since it also has to do things like update the MFT and log metadata changes to its journal. When you add several extra operations every time you write to a 4 KB or larger cluster, the delay would seem to build up rather quickly.
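
To make the FAT32 part concrete, here is a rough sketch of the idea in Python. It is a toy model, not the real on-disk layout: the table is just a list, and the names `FREE`, `END_OF_CHAIN`, `allocate_chain`, and `follow_chain` are made up for illustration. Real FAT32 uses 28-bit entries with reserved end-of-chain values, and real drivers do not necessarily scan from the start of the table every time.

```python
# Toy model of a FAT-style allocation table (NOT the real on-disk format).
FREE = 0           # assumed marker for an unused cluster in this toy table
END_OF_CHAIN = -1  # assumed marker for a file's last cluster


def allocate_chain(fat, clusters_needed, first_data_cluster=2):
    """Claim free clusters in ascending order and link each to the next."""
    chain = []
    for cluster in range(first_data_cluster, len(fat)):
        if len(chain) == clusters_needed:
            break
        if fat[cluster] == FREE:
            if chain:
                # Point the previous cluster's table entry at this one.
                fat[chain[-1]] = cluster
            fat[cluster] = END_OF_CHAIN
            chain.append(cluster)
    if len(chain) < clusters_needed:
        # Not enough space: undo the partial allocation.
        for cluster in chain:
            fat[cluster] = FREE
        raise OSError("not enough free clusters")
    return chain


def follow_chain(fat, start):
    """Walk the table from a file's first cluster to its last."""
    chain = []
    cluster = start
    while cluster != END_OF_CHAIN:
        chain.append(cluster)
        cluster = fat[cluster]
    return chain
```

With a 10-entry table where clusters 2 and 4 are already in use, asking for three clusters hands back 3, 5, and 6, which is also how files end up fragmented. The only bookkeeping per cluster is one table entry, which is the simplicity I mean.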