(Q) Did they identify the code that was the cause of the problem?
(A) Yes: concatString += addString
(Q) Did they identify WHY that code was the cause of the problem?
(A) No. They hand-waved about += having to do a few more operations than StringBuilder (vs. the metric butt-load of copying it's actually doing for a million-character string).
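(Q) WHY was that code the cause of the problem?
(A) The in-memory version was an O(n^2) algorithm vs. an O(n) disk algorithm. Strings are immutable, so each += copies the entire string built so far into a new one, while each disk write only appends the new piece to a buffer.
And I don't believe they understood this, or they would have explicitly explained it in their paper and/or never bothered to publish it.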
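To put a number on the difference, here is a rough Java sketch. It assumes the paper's loop appended a short string a million times (the class and method names are mine, not the paper's), and it models each += as copying the whole accumulated string, which is what a naive += on an immutable string has to do.

```java
public class ConcatCost {
    // Rough model: every += builds a new string holding the old
    // contents plus the appended piece, so it copies `length` chars.
    static long charsCopiedByPlusEquals(int pieces, int pieceLength) {
        long copied = 0;
        long length = 0;
        for (int i = 0; i < pieces; i++) {
            length += pieceLength;
            copied += length;
        }
        return copied;
    }

    // The pattern under discussion: quadratic total copying.
    static String concatWithPlusEquals(String addString, int times) {
        String concatString = "";
        for (int i = 0; i < times; i++) {
            concatString += addString;
        }
        return concatString;
    }

    // Same result, but appends go into a growable buffer (amortized linear).
    static String concatWithBuilder(String addString, int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(addString);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sum of 1..1,000,000 = 500,000,500,000 characters copied,
        // to build a string that is only 1,000,000 characters long.
        System.out.println(charsCopiedByPlusEquals(1_000_000, 1));
        System.out.println(
            concatWithPlusEquals("1", 1000).equals(concatWithBuilder("1", 1000)));
    }
}
```

Both methods produce the same string; only the amount of copying differs. That is the whole point: it is not the same algorithm as appending to a file buffer.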
(Q) Is this a problem for the paper?
(A) Yes. The paper title implies that the same algorithm taking place entirely in memory (with one single large disk write at the end) could be slower than one with lots of disk writes. They are clearly not the same algorithm when you look under the hood of string concatenation and of writing to a file buffer.
It's kind of like holding a race between a tortoise and a hare from point A to point B and pointing out that the tortoise won, but neglecting to mention that the hare was facing the wrong way and ran the long way around the world. Oh, but if we switch the hare for a rabbit (which happens to be facing the right way), then the rabbit beats the tortoise - clearly the hare takes longer per stride for some reason.
(Q) Are there any other problems with the paper?
(A) Yes. Their lots-of-disk-writes version of the algorithm is writing to a buffered stream, and while it is flushing, there is no guarantee that the data has been physically written to the disk before the next iteration starts. And since the PC has way more than a million bytes free on it, there's a good chance that the OS didn't have to do a physical write until long after the program finished (the disk array controller may even have a backup battery, so it could be minutes before it actually gets written to disk).
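For anyone who wants to see the flush vs. physical write distinction, here is a small Java sketch. I'm assuming the paper's disk version looked roughly like a buffered stream with a flush per write; the class name, method name, and the syncEachWrite switch are mine, for illustration. flush() only hands the bytes to the OS, while FileDescriptor.sync() asks the OS to push them to the device (and even then a controller cache may still be holding them).

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class FlushVsSync {
    // Writes `chunk` to `file` `times` times and returns the final file size.
    static long writeChunks(Path file, String chunk, int times, boolean syncEachWrite)
            throws IOException {
        byte[] bytes = chunk.getBytes(StandardCharsets.US_ASCII);
        try (FileOutputStream fos = new FileOutputStream(file.toFile());
             BufferedOutputStream out = new BufferedOutputStream(fos)) {
            for (int i = 0; i < times; i++) {
                out.write(bytes);
                out.flush();              // bytes reach the OS, not necessarily the disk
                if (syncEachWrite) {
                    fos.getFD().sync();   // ask the OS to write through to the device
                }
            }
        }
        return Files.size(file);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("flush-vs-sync", ".txt");
        try {
            long t0 = System.nanoTime();
            writeChunks(file, "1", 1000, false);
            long t1 = System.nanoTime();
            writeChunks(file, "1", 1000, true);
            long t2 = System.nanoTime();
            System.out.println("flush only:   " + (t1 - t0) / 1_000_000 + " ms");
            System.out.println("flush + sync: " + (t2 - t1) / 1_000_000 + " ms");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The timings depend entirely on the machine and the storage behind it, so I'm not quoting any numbers. The flush-only run is the one that matches a benchmark that never waits for the disk.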
Identifying problems like this with a paper is not belittling the authors. Mocking them for publishing outside their area of expertise (re. Biology), or for being a potential expert (PhD. in Electrical Engineering(*)) and making what is clearly a junior coding mistake, may be. But I'm not mocking them. I'm identifying a fundamental problem with their paper. They're grownups in academia - they should expect challenges.
* - Lord knows that he may not have any actual training in software development beyond what he needed to get through school, and may have spent more time working on the hardware aspects of computing - that was the case with all my profs that came from an EE background. PhD. and P.Eng. does not mean infallible programming expert. It means highly specialized in one area of study.
This paper has done nothing to increase the knowledge of the world, and wasted lots of people's time. It's like they published a paper saying the world is round when analyzed at a sufficiently large scale. Be careful that you don't fly in a straight line from Washington to Japan according to a Mercator projection map, because it'll take longer and you'll burn a lot more fuel. Every good pilot in the world knows that, and good developers should know the problems of string concatenation (especially in loops).