No, most fastloaders actually used only two of the serial lines to transfer data. They transferred byte data synchronously: the two instruction streams were synchronized at the start of each byte, so that a sequence of 4 reads or writes (2 bits each) moved the whole byte. This involved some other black-magic trickery, since the 1541's 6502 processor was completely unfettered, while the C-64's 6510 processor was stalled for sprite DMA and display memory access (basically, 1 out of every 8 scanlines), meaning that you had to either turn off all of that DMA (turning off sprites and blanking the screen) or make sure you ran your transfers when you knew that the DMA wasn't going to take place.
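To make the "4 reads or writes per byte" idea concrete, here is a minimal Python sketch of splitting a byte into four 2-bit pairs and putting it back together. It only models the bit shuffling, not the cycle-exact timing, and the function names and the low-pair-first ordering are my own assumptions; real loaders picked whatever bit order was cheapest to decode.

```python
def split_into_pairs(byte):
    """Split one byte into four 2-bit values, lowest pair first.

    The ordering is an assumption for illustration only.
    """
    return [(byte >> shift) & 0b11 for shift in (0, 2, 4, 6)]


def join_pairs(pairs):
    """Rebuild the byte on the receiving side from four 2-bit values."""
    byte = 0
    for index, pair in enumerate(pairs):
        byte |= (pair & 0b11) << (2 * index)
    return byte


# Each element stands for one read of the two serial lines.
pairs = split_into_pairs(0xA7)
assert join_pairs(pairs) == 0xA7
```

Since neither side handshakes between the four steps, both processors have to hit each step on the same cycle, which is why the DMA stalls mattered so much.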
A couple of other schemes used asynchronous transfers, which were STILL faster than the standard KERNAL code, and a few systems like Copylock and Vorpal used custom sector formats to minimize the amount of processing needed by the 1541 to decode the data. If I recall correctly, the drive I/O on the C-64 is a stripped-down version of an older interface which was actually a parallel 8-bit interface; they simply removed 7 of the lines from the interface but still retained the same control scheme. This resulted in an I/O speed of around 300 bytes/second for a "stock" C64/1541. The 1541's OS itself was able to encode/decode and read/write sectors with an interleave of 3, which is about 7K/second, but the processor-driven serial I/O reduced that by quite a bit. Many fastloaders simply used the existing read/write routines and accelerated the serial I/O, which easily put the drive at an interleave of 8. Some rewrote the GCR routines too, and I've seen those go down to an interleave of 4. Vorpal rewrote the serial I/O and the sector format, disabled all DMA, and synchronized both processors to +1/-0 cycles to run multiple byte transfers, making it the fastest software-based accelerator, with an interleave of 2, reading and transferring an entire track in two revolutions of the disk, or about 12K/second.
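For anyone who wants to check the interleave numbers, here is a rough back-of-the-envelope calculation in Python. It assumes the standard 1541 disk geometry (300 rpm, 256-byte sectors, 17 to 21 sectors per track depending on the zone) and the simplification that reading a whole track at interleave N takes about N revolutions. The function name is mine, and a custom format like Vorpal's would not necessarily have the same sector counts.

```python
RPM = 300            # nominal 1541 spindle speed
SECTOR_BYTES = 256   # bytes per standard sector


def track_throughput(sectors_per_track, interleave):
    """Approximate bytes/second, assuming one track takes `interleave` revolutions."""
    revs_per_second = RPM / 60
    return sectors_per_track * SECTOR_BYTES * revs_per_second / interleave


for interleave in (2, 3, 4, 8):
    inner = track_throughput(17, interleave)   # innermost zone
    outer = track_throughput(21, interleave)   # outermost zone
    print(f"interleave {interleave}: {inner:.0f} to {outer:.0f} bytes/second")
```

Under those assumptions an interleave of 3 works out to roughly 7K to 9K bytes/second and an interleave of 2 to roughly 11K to 13K, which lines up with the figures above.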