How much does computer time on these things cost? How is the cost calculated? Is time divided up something like how it's done on a large telescope, where the controlling organization gets proposals from scientists, then divvies up the computer's available time according to what's been accepted?
At the supercomputing centers I'm familiar with, scientists write proposals, which are evaluated by a scientific steering committee that meets regularly (say, once a month) and awards a certain number of CPU-hours depending on the application.
Do they multi-task (run more than one scientist's program at one time)?
Yes. Typically users write batch scripts requesting the resources their job needs (e.g. "512 cores with at least 2 GB RAM/core, max runtime 3 days") and then submit the batch job to a queue. Once there are enough free resources in the system, the batch scheduler launches the job. When the job finishes (or during its runtime), the usage is subtracted from the quota the user was awarded in the application process.
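To make the accounting concrete, here is a minimal Python sketch of how such a request might be charged against a quota. It is an illustration only: the names JobRequest, core_hours and charge are hypothetical, the 100,000 core-hour allocation is a made-up number, and billing "cores times wall-clock hours" is an assumption, since real centers each have their own charging rules.

```python
from dataclasses import dataclass


@dataclass
class JobRequest:
    """Resources requested in a batch script (hypothetical structure)."""
    cores: int
    mem_gb_per_core: float
    max_runtime_hours: float


def core_hours(cores: int, runtime_hours: float) -> float:
    """Assumed billing rule: cores multiplied by wall-clock hours."""
    return cores * runtime_hours


def charge(quota: float, job: JobRequest, actual_runtime_hours: float) -> float:
    """Return the quota left after a job; runtime is capped at the requested limit."""
    billed_hours = min(actual_runtime_hours, job.max_runtime_hours)
    return quota - core_hours(job.cores, billed_hours)


# The request quoted above: 512 cores, 2 GB RAM/core, max runtime 3 days.
job = JobRequest(cores=512, mem_gb_per_core=2.0, max_runtime_hours=3 * 24)

# Worst case: the job runs for the full 3 days.
worst_case = core_hours(job.cores, job.max_runtime_hours)

# Made-up allocation of 100,000 core-hours; the job actually finishes after 50 hours.
remaining = charge(100_000.0, job, actual_runtime_hours=50.0)

print(f"worst case: {worst_case:.0f} core-hours, remaining quota: {remaining:.0f}")
```

The point of the sketch is just the order of magnitude: a single 512-core job that runs for its full three days uses tens of thousands of core-hours, so a quota can disappear quickly.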
Does the computer run at top power (10pf) at all times, or does the resource usage go up and down?
Usually all functioning nodes are running and available for use, yes. Load is typically around 80-90% of the maximum because of scheduling inefficiencies (e.g. a large parallel job has to wait until enough cores are idle before it can start, and so forth).
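The waiting effect can be sketched with a toy simulation. This is not how any real batch scheduler works: it assumes a strict first-come-first-served queue with no backfilling, one-hour time steps, and a made-up 100-core machine with three made-up jobs. The function name simulate_fifo is hypothetical.

```python
def simulate_fifo(total_cores, jobs):
    """Toy strict-FIFO scheduler.

    jobs is a list of (cores, hours) tuples, all submitted at time 0 and
    started strictly in order. Returns the fraction of core-hours that were
    busy between time 0 and the end of the last job.
    """
    if any(cores > total_cores for cores, _ in jobs):
        raise ValueError("a job requests more cores than the machine has")

    queue = list(jobs)
    running = []  # (end_time, cores) for each running job
    t = 0
    busy_core_hours = 0

    while True:
        # Drop jobs that have finished by time t.
        running = [(end, cores) for end, cores in running if end > t]
        free = total_cores - sum(cores for _, cores in running)

        # Start jobs from the head of the queue while they fit.
        # If the head does not fit, everything behind it waits too.
        while queue and queue[0][0] <= free:
            cores, hours = queue.pop(0)
            running.append((t + hours, cores))
            free -= cores

        if not running:
            break  # nothing running and nothing left to start

        busy_core_hours += total_cores - free
        t += 1

    return busy_core_hours / (total_cores * t) if t else 0.0


# Made-up workload: the 80-core job has to wait for the 60-core job to finish,
# leaving 40 cores idle in the meantime.
utilization = simulate_fifo(100, [(60, 2), (80, 1), (40, 2)])
print(f"utilization: {utilization:.0%}")
```

The toy numbers come out much worse than a real system would, since real schedulers fill such gaps with smaller jobs where they can, but the cause of the lost capacity is the same: cores sit idle while a large job waits for enough of them to free up.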
And lastly, how hard is it to write programs to run on these things? Do the scientists do it themselves, and if so, do the people who run the supercomputer audit the code before it runs?
Pretty tricky. Usually the programs use MPI (the Message Passing Interface) to communicate between processes. The programs are either written by the scientists themselves, or by other scientists working in the same field. The supercomputing center typically doesn't audit code, but may require the user to submit scalability benchmarks before allowing the user to submit large jobs. For some popular applications the supercomputing center may maintain a version itself (so each user doesn't need to recompile it) and provide some more or less rudimentary support.
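As a rough idea of what a scalability benchmark reports, the Python sketch below computes speedup and parallel efficiency from wall-clock timings of the same problem at several core counts. The function name strong_scaling is hypothetical and the timings are made-up illustrative numbers, not measurements; what a given center actually asks for will differ.

```python
def strong_scaling(timings):
    """Compute speedup and parallel efficiency relative to the smallest run.

    timings maps core count to wall-clock seconds for the same problem size.
    Returns a list of (cores, speedup, efficiency) tuples sorted by core count.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    rows = []
    for cores in sorted(timings):
        speedup = base_time / timings[cores]
        # Ideal speedup equals the ratio of core counts, so
        # efficiency 1.0 means perfect scaling.
        efficiency = speedup / (cores / base_cores)
        rows.append((cores, speedup, efficiency))
    return rows


# Made-up timings (seconds) for one fixed-size problem.
rows = strong_scaling({64: 1000.0, 128: 520.0, 256: 290.0, 512: 200.0})

for cores, speedup, efficiency in rows:
    print(f"{cores:4d} cores: speedup {speedup:5.2f}, efficiency {efficiency:6.1%}")
```

If efficiency drops off sharply at the larger core counts, as it does in these made-up numbers, that is the kind of result that would suggest the code is not ready for a very large job.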