Basically, the procurement process for supercomputers works like this: the buyer (e.g. a DOE lab) prepares a portfolio of applications (mostly simulation codes) with a specified target performance. Vendors then bid on how little money they can meet that target for. And of course the vendors will use the most cost- and power-efficient hardware they can get.
The reason we're no longer seeing custom-built CPUs in the supercomputing arena, but rather COTS chips or only slightly modified versions, is that chip design has become exceedingly expensive, and the supercomputer market is marginal compared to today's mainstream market.
Also, the simulation codes running on these machines generally far outlive the supercomputers themselves. The stereotypical supercomputer simulation code is a Fortran program written 20 years ago that has received constant maintenance over the years, but for which no serious rewrite is viable (the cost would exceed the price of the hardware). So vendors look for low-effort ways of tuning these codes for their proposed designs, and sticking with general-purpose CPUs is in most cases the most cost-efficient way.
From what I read, the dongle is merely the interface from the camera (USB) to the smartphone (USB). That should be trivial. (For my setup, a USB OTG cable plus an adapter to mini USB is sufficient, and there are tons of apps to control cameras.)
The article states that they had to use a beefier microcontroller etc., but I wonder: why not do all the processing on the smartphone? These days our phones have so much processing power AND so many sensors that there should be no need to do any non-trivial logic outside the phone, especially when you're just trying to launch your first product.
Of course general-purpose CPUs exist, if only because we call them that. But it is also true that each design has its own strengths, and "dark silicon" is one driver for special-purpose hardware; efficiency is another. Andrew Chien has published some interesting research on this subject. In his 10x10 approach he suggests using 10 different types of domain-specific compute units (e.g. for n-body, graphics, tree-walking...), each of which is 10x more efficient than a "general-purpose CPU" in its domain (YMMV). Those compute units, bundled together, make up one core of the 10x10 design; multiple cores can be connected via a NoC.
Let's see how software will cope with this development...
ps: can special purpose hardware exist if general purpose hardware doesn't?
One reason might be that railways are more efficient in densely populated areas. There, express trains can even compete with airplanes. Yesterday we went from Tokyo to Osaka. Flight time would have been ~1h, plus 1h for check-in and transfers to/from the airports (~45 min each). The Nozomi Shinkansen took us there in 2:30, and both stations are right in the city centers.
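The door-to-door comparison above can be tallied up; a minimal sketch, using only the rough figures quoted in the comment (these are estimates, not timetable data):

```python
# Rough door-to-door times for Tokyo -> Osaka, based on the
# estimates above (not actual schedules).
flight_hours = 1.0          # time in the air
checkin_hours = 1.0         # check-in / security buffer
transfer_hours = 2 * 0.75   # ~45 min to/from each airport

plane_total = flight_hours + checkin_hours + transfer_hours
train_total = 2.5           # Nozomi Shinkansen, city center to city center

print(f"plane: {plane_total:.1f} h, train: {train_total:.1f} h")
```

With those numbers the plane comes to about 3.5 hours door to door, a full hour slower than the train.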
Most of Japan's population is concentrated in coastal regions, so just a handful of routes can serve all major cities. Imagine how many connections you'd need in the US...
Do you suffer painful hallucination? -- Don Juan, cited by Carlos Castaneda