My experience with data flow is LabVIEW. As a language designed to handle simultaneous slow hardware communication and fast dataset processing, it's a natural for multi-threading.
Parallelization is automated within the compiler based on program structure. The compiler's not all that great at it (limited to 5 explicit threads plus whatever internal tweaking is done), but... the actual writing of the code is just damn easy.
Not to excuse the LabVIEW compiler: closed architecture, tight binding to the IDE, strong typing that's really painful, memory copies everywhere. But the overall dataflow model is just superior for parallel applications. It's unfortunate that there seem to be few alternatives out there with similar support for data flow but more overall utility.
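
For anyone who hasn't used a dataflow language, here's a rough sketch of the idea in Python. It is not LabVIEW code and not how the LabVIEW compiler works internally; it only mimics the rule that a node runs once all of its inputs are available, so nodes with no dependency on each other can run at the same time. The function names (`read_sensor`, `process_dataset`, `combine`, `run_graph`) and the values are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def read_sensor():
    """Hypothetical stand-in for slow hardware communication."""
    time.sleep(0.2)  # pretend we are waiting on an instrument
    return 21.5


def process_dataset(samples):
    """Hypothetical stand-in for fast dataset processing: a plain mean."""
    return sum(samples) / len(samples)


def combine(reading, mean):
    """Downstream node: can only run once both inputs have arrived."""
    return reading - mean


def run_graph(samples):
    # read_sensor and process_dataset share no data, so they are
    # submitted together and may run in parallel. combine depends on
    # both results, so it waits for them via .result().
    with ThreadPoolExecutor(max_workers=2) as pool:
        reading = pool.submit(read_sensor)
        mean = pool.submit(process_dataset, samples)
        return combine(reading.result(), mean.result())


if __name__ == "__main__":
    print(run_graph([20.0, 21.0, 22.0]))
```

In a text language you have to spell out the futures and the thread pool yourself. In a dataflow language the wiring between nodes already carries that dependency information, which is why writing the parallel version takes no extra effort.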