BadgerFish - Slashdot User

Comment Re:On Perl and command-line utilities (Score 1) 267

by BadgerFish on Tuesday July 22, 2003 @02:17PM (#6501844) Attached to: Getting Software Added to Unix Distributions?

To be useful, numutils should go beyond what is trivially available in perl and awk. That might involve using C, or just a sufficiently complex script. I don't think numutils is complex enough as it stands to be worth distributing. The issue is whether it fills a particular void, and how well it meets that need. The random utility is useless as written. To generate a random number, why not just read the perldoc on srand?

Suso Banderas should follow through on the goal of operating on several columns of data simultaneously.
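To give an idea of the baseline a column-aware numutils would have to beat, here is a rough sketch of per-column means in awk. It assumes whitespace-separated numeric columns, and colmeans is a name I made up for illustration; it is not part of numutils.

```shell
# colmeans (hypothetical name): print the mean of every column on one line.
# Assumes whitespace-separated, purely numeric input on stdin.
colmeans() {
    awk '
        {
            # accumulate a sum and a count for each column of this row
            for (c = 1; c <= NF; c++) { sum[c] += $c; n[c]++ }
            if (NF > ncol) ncol = NF
        }
        END {
            for (c = 1; c <= ncol; c++)
                printf "%s%s", sum[c] / n[c], (c < ncol ? " " : "\n")
        }'
}
```

Usage would be something like: colmeans < data.txt. Anything beyond this (selecting columns by name, mixed text and numbers, and so on) is where a dedicated tool could earn its keep.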

A month ago, I wrote an awk script which calculates mean, standard deviation, variance, min, max, sum, and count (see below) for a given stream of numbers.

#!/bin/nawk -f
# One pass over the first field, keeping a running mean and variance.
$1 ~ /[0-9]+/ {
    x = $1; N = N + 1
    if (N > 1) {
        if (min > x) { min = x }
        if (x > max) { max = x }
        sumx = x + sumx
        oldavgx = avgx
        avgx = avgx + (x - avgx) / N
        varx = (N-2)/(N-1)*varx + N*(avgx - oldavgx)^2
    } else {
        min = x; max = x; sumx = x; avgx = x; varx = 0
    }
}
END { print avgx, sqrt(varx), varx, min, max, sumx, N }

This took me very little time to write, and it covers half of numutils' scope of effort. The numutils package should shift focus away from calculating means and bounds.

This is my suggestion: For each utility, determine what numutils does that is a pain to accomplish in awk or perl. Focus on those areas.

Some of these scripts are excellent examples of what can be accomplished in Perl, though. And better commented than most.

Personally, I'm interested in finding something that would compute the median and percentiles for a given stream of input data. I was excited to see "numutils" but was dismayed at not finding the variance. I would like to see an open source version of something like the NAG utilities such as nag_summary_stats_1var or nag_5pt_summary_stats.
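For the record, a crude percentile is doable with sort and awk, which is part of why I'd want a real tool to do better. This is only a sketch: pctl is a name I made up, it uses the nearest-rank definition (no interpolation, so the "median" of an even count is the lower middle value), and it holds every value in memory rather than working as a true streaming estimate.

```shell
# pctl (hypothetical name): nearest-rank percentile of numbers on stdin.
# $1 is the percentile, expected to be in (0, 100]; one number per line.
pctl() {
    sort -n | awk -v p="$1" '
        { v[NR] = $1 }                # input is already sorted ascending
        END {
            if (NR == 0) exit 1       # no data, nothing to report
            k = p * NR / 100          # fractional rank
            i = int(k)
            if (i < k) i++            # round up: nearest-rank definition
            if (i < 1) i = 1
            print v[i]
        }'
}
```

For example, pctl 50 < data.txt gives the median and pctl 90 < data.txt the 90th percentile. Each call re-sorts the input, so a proper five-point summary in one pass is exactly the kind of thing I'd like a package to provide.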

I guess I'm just waiting for the Commons-Math Jakarta Mathematics Library project to get released.
