I do also suspect that the CR/LF conversion might be the source of his troubles. Thus, it's not a bug, it's a feature of the ftp protocol. I guess you could simply use binary mode for the ftp transfer.
Now, let's assume that it's not a CR/LF problem but that instead, for some unknown reason, the ftp transfers get aborted and thus the file sizes don't match.
Okay, first of all: if you want to guarantee that a file that left one system is the very same file when it arrives on another, file size is a poor check, since two files can have the same length but different contents. That's why a hash such as the one from md5sum is typically used. Or better yet, use both MD5 and SHA-1 hashes, so that somebody tampering with the file would have to produce a meaningful collision for both of them at the same time.
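For what it's worth, here is a minimal sketch of computing both hashes in a single pass over the file, assuming Python's standard hashlib module. The function name file_digests and the chunk size are my own picks, not anything from the original question.

```python
import hashlib

def file_digests(path, chunk_size=64 * 1024):
    """Return (md5_hex, sha1_hex) for the file at `path`.

    `file_digests` and `chunk_size` are illustrative names, not a standard API.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    # Open in binary mode so the bytes are hashed exactly as stored on disk.
    with open(path, "rb") as f:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()
```

Run it on both ends and compare the two pairs; if either value differs, the file changed in transit.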
Now, what programs should be used for the transmission itself? Well, that depends on your requirements: Is confidentiality important or is it really just about integrity and availability? Is speed or link saturation a topic? Like, if your current pipe is already 80% full, you probably cannot afford to encrypt your data. Otherwise, of course you should encrypt, unless e.g. an IPS/IDS maintainer needs to be able to scan the contents. Let's take a look at both possibilities:
- First: Confidentiality not an issue but bandwidth/speed/IDS is
Basically, any TCP data transfer application that does not itself meddle with the data is suitable. So this kind of excludes ftp in ASCII mode, as it substitutes LF (Unix) line breaks with CR/LF (Windows) line breaks while transferring text from Unix to Windows, and vice versa. But then again, you can use FTP just fine if used in binary mode. However, even the Swiss army knife of network transmissions can easily be used for the purpose of reliably transmitting files from A to B: netcat.
nc or nc.exe is available for both Windows and Unix and is often used in the forensics world in manual combination with MD5 and/or SHA-1 hashes to transmit forensic evidence from e.g. a suspect drive to the examiner's workstation. Here the chain of evidence would be maintained by recording a hash of the data on the suspect drive, recording a hash of the data on the examiner's workstation after arrival and recording the date, time and contents of the transmission. Note that it might be vital to have a log of what has been transferred and when, so that you can prove you sent some data the other party claims never to have received.
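To see why a text-mode transfer alone is enough to make the sizes disagree, here is a rough sketch that mimics the line-ending conversion in plain Python. It does not talk FTP at all; to_crlf is a made-up helper that only imitates what ends up on a Windows box after an ASCII-mode transfer of a Unix text file.

```python
def to_crlf(data: bytes) -> bytes:
    """Imitate an ASCII-mode transfer of Unix text to Windows (hypothetical helper)."""
    # Normalise first so that existing CR/LF pairs are not doubled up.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")

unix_text = b"line one\nline two\nline three\n"
windows_text = to_crlf(unix_text)

# Every line break grows by one byte, so the sizes (and any hashes) differ
# even though nothing was lost or aborted.
print(len(unix_text), len(windows_text))
```

In binary mode no such substitution happens, so the bytes and therefore the size stay the same.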
So, recapping, e.g. netcat, ftp, SMB/CIFS shares, HTTP and any other TCP based file transfer utility could be used. HTTP and FTP could even be easily scanned for viruses/malware during transit. UDP based file transfer utilities could be used as well as long as the implementation does take care of the integrity. As most likely a short script would be used in order to generate logs containing MD5 and SHA-1 hashes on both sides, the time and date of the transfer and the filename, this script could as well easily handle data retransfers in the case of packet loss.
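The log line such a script writes could look roughly like the sketch below. The field names and the JSON format are purely my assumption; any format works as long as both sides record the same things.

```python
import hashlib
import json
from datetime import datetime, timezone

def transfer_log_entry(filename: str, data: bytes) -> str:
    """Build one log line for a transferred file (illustrative format only)."""
    entry = {
        # UTC timestamp so logs from both sides can be compared directly.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "filename": filename,
        "size": len(data),
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
    }
    return json.dumps(entry, sort_keys=True)

print(transfer_log_entry("example.bin", b"example payload"))
```

Appending one such line per transfer on the sender and on the receiver gives you two logs whose hashes should agree entry by entry.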
- Second and better: confidentiality with some bandwidth and CPU constraints
Sorry, this posting by now bores me. So, the recap:
Use SSH (SCP), cryptcat (used among others in forensics for the chain of evidence when confidentiality is an issue), HTTPS, S/MIME or any other encrypted transfer tool, really. Hell, you could even generate an encrypted PGP file or whatever with a script and pipe it through whatever data transfer application you want. (Like ftp in binary mode ;) )
So, overall, what are needed here are two small scripts that do something like this:
On the sending side:
10 compute SHA-1 / MD5 hash of a file to be transferred (and optionally compress it)
20 send file
30 receive a SHA-1 / MD5 hash of the transferred file from the receiver
40 compare the hashes
50 complete transaction including logging the date, time, filename and hash, if hashes match
60 else goto 20
On the receiving side:
10 receive file
20 generate SHA-1 / MD5 hash of the received file
30 send SHA-1 / MD5 hash back to the sender of the file
40 goto 10 if same file is resent, otherwise complete transaction recording date, time, filename and hash.
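Both sides of that loop can be sketched in a few lines of Python. This is a simulation under stated assumptions, not a finished tool: `channel` is a hypothetical stand-in for the real transport (nc, ftp in binary mode, scp, ...), make_flaky_channel exists only to provoke retries, and the max_attempts limit is my addition so the "goto 20" cannot loop forever. Only SHA-1 is used here to keep it short.

```python
import hashlib

def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()

def receiver(received: bytes) -> str:
    """Receiving side: hash whatever arrived and report the hash back."""
    return sha1_hex(received)

def send_with_verify(data: bytes, channel, max_attempts: int = 5) -> int:
    """Sending side: resend until the receiver's hash matches our own.

    `channel` is a hypothetical callable that takes the bytes sent and
    returns the bytes that actually arrived. Returns the attempts used.
    """
    expected = sha1_hex(data)               # sender step 10
    for attempt in range(1, max_attempts + 1):
        arrived = channel(data)             # sender step 20
        reported = receiver(arrived)        # sender step 30
        if reported == expected:            # sender step 40
            return attempt                  # sender step 50 (log here)
        # hashes differ: fall through and resend (sender step 60)
    raise RuntimeError("transfer failed after %d attempts" % max_attempts)

def make_flaky_channel(failures: int):
    """Test transport that truncates the first `failures` transfers."""
    state = {"left": failures}
    def channel(data: bytes) -> bytes:
        if state["left"] > 0:
            state["left"] -= 1
            return data[: len(data) // 2]   # simulate an aborted transfer
        return data
    return channel

attempts = send_with_verify(b"hello world", make_flaky_channel(2))
print("delivered after", attempts, "attempts")
```

In a real setup the receiver would of course run on the other machine and send its hash back over the wire; here it is just a function call so the whole thing runs in one process.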
Those are pretty simple scripts, but they assume that you can run a script or some custom software on the receiving end. If you can't do so, all you could do is sniff the transmission traffic, extract the file from that stream and compare the hashes of both. That should be enough to hold up even in court: Here we have the evidence that the file has been sent by the transmitting script/application (hash of application log) and there we have the evidence that it has been transmitted on the wire, including the acknowledgements of the receiver (hash of the sniffer extracted data).
Why again was that trivial question posted on /.?