User Journal

Journal: It's surprising to me how vicious atheists on /. are....

Journal by pcwhalen

I mentioned my faith as a thing that guided me and there was a maelstrom of haters. WTF? If you don't have a religion, you don't hear me trying to convert you. I don't give a Parson's fart if you do. And I don't really care what you think about me. It's just amazing that atheists want to spend so much energy to mock other people.

Kinda fucked up, really.

User Journal

Journal: Still can't believe the NSA is as deep as it is.

Journal by pcwhalen

The width and breadth of our govt's spying abroad was never in doubt: our abilities are impressive, but not shocking. The shock comes from deciding to turn the lens inward. Exec order 12333 says CIA and NSA look "out" and not domestically at US persons.

It's a slippery slope to start down. Without 4th Amendment protections, we could all be fucked.

User Journal

Journal: Continuation on education

Journal by jd

Ok, I need to expand a bit on my excessively long post on education some time back.

The first thing I am going to clarify is streaming. This is not merely distinction by speed, which is the normal (and therefore wrong) approach. You have to distinguish by the nature of the flows. In practice, this means distinguishing by creativity (since creative people learn differently than uncreative people).

It is also not sufficient to divide by fast/medium/slow. The idea is that differences in mind create turbulence (a very useful thing to have in contexts other than the classroom). For speed, this is easy - normal +/- 0.25 standard deviations for the central band (ie: everyone essentially average), plus two additional bands on either side, making five in total.

Classes should hold around 10 students, so you have lots of different classes for average, fewer for the bands either side, and perhaps only one for the outer bands. This solves a lot of timetabling issues, as classes in the same band are going to be interchangeable as far as subject matter is concerned. (This means you can weave in and out of the creative streams as needed.)

Creativity can be ranked, but not quantified. I'd simply create three pools of students, with the most creative in one pool and the least in a second. It's about the best you can do. The size of the pools? Well, you can't obtain zero gradient, and variations in thinking style can be very useful in the classroom. 50% in the middle group, 25% in each of the outliers.

So you've 15 different streams in total (five speed bands times three creativity pools). Assume creativity and speed are normally distributed and that the outermost speed streams contain one class of 10 each. Start with speed. For simplicity, I'll forgo the calculations and guess that the upper/lower middle bands would then have nine classes of 10 each and that the central band will hold 180 classes of 10.

That means you've 2000 students, of whom the assumption is 1000 are averagely creative, 500 are exceptional and 500 are, well, not really. Ok, because creativity and speed are independent variables, we have to have more classes in the outermost band - in fact, we'd need four of them, which means we have to go to 8000 students.
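The head count so far can be checked with a few lines of arithmetic. This is only a sketch of the guesses stated above. The function name streamSizes and its variable names are made up for illustration. The scale factor assumes each 25% creativity pool in an outermost speed band needs at least one whole class.

```javascript
// Re-run the stream arithmetic using the guessed class counts from the text.
function streamSizes() {
    var classesPerSpeedBand = [1, 9, 180, 9, 1]; // guessed, not calculated
    var creativityShares = [0.25, 0.5, 0.25];    // outlier, middle, outlier
    var classSize = 10;

    var streams = classesPerSpeedBand.length * creativityShares.length;

    var classes = 0;
    for (var i = 0; i < classesPerSpeedBand.length; i++) {
        classes += classesPerSpeedBand[i];
    }
    var students = classes * classSize;

    // Assumption: the smallest creativity pool in the outermost speed band
    // must fill at least one class, so that band needs 1 / 0.25 classes.
    var smallestShare = Math.min.apply(null, creativityShares);
    var scale = Math.ceil(1 / smallestShare);

    return {
        streams: streams,
        students: students,
        scale: scale,
        scaledStudents: students * scale
    };
}

var sizes = streamSizes();
console.log(sizes.streams, sizes.students, sizes.scale, sizes.scaledStudents);
```

The four values correspond to the 15 streams, the 2000 students, the four classes needed in the outermost band, and the 8000 students per year.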

These students get placed in one of 808 possible classes per subject per year. Yes, 808 distinct classes. Assume 6 teaching hours per day x 5 days, making 30 available hours, which means you need no fewer than 27 simultaneous classes per year (808 / 30, rounded up). Across the 19 years of schooling counted below, that's 513 classrooms in total, fully occupied in every timeslot, and we're looking at just one subject. Assuming 8 subjects per year on average, that goes up to 4104. Rooms need maintenance and you also need spares in case of problems. So, triple it, giving 12312 rooms required. We're now looking at serious real estate, but there are larger schools than that today. This isn't impossible.

The 8000 students is per year, as noted earlier. And since years won't align, you're going to need to go from first year of pre/playschool to final year of an undergraduate degree. That's a whole lotta years. 19 of them, including industrial placement. 152,000 students in total. About a quarter of the total student population in the Greater Manchester area.
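For anyone who wants to check the room and head counts, the chain of multiplications looks like this. It is a sketch that takes every input (808 classes, 30 hours, 19 years, 8 subjects, the tripling) straight from the text. The function name schoolSize and its variable names are made up for illustration.

```javascript
// Re-run the classroom arithmetic with the figures quoted in the text.
function schoolSize() {
    var classesPerSubjectPerYear = 808; // figure as stated in the text
    var teachingHoursPerWeek = 6 * 5;   // 6 hours a day, 5 days
    var years = 19;                     // playschool through undergraduate
    var subjectsPerYear = 8;
    var spareFactor = 3;                // maintenance and spares
    var studentsPerYear = 8000;

    // Classes that must run at the same time to fit into the week.
    var simultaneous = Math.ceil(classesPerSubjectPerYear / teachingHoursPerWeek);
    var roomsOneSubject = simultaneous * years;
    var roomsAllSubjects = roomsOneSubject * subjectsPerYear;

    return {
        simultaneous: simultaneous,
        roomsOneSubject: roomsOneSubject,
        roomsAllSubjects: roomsAllSubjects,
        roomsWithSpares: roomsAllSubjects * spareFactor,
        totalStudents: studentsPerYear * years
    };
}

var size = schoolSize();
console.log(size.simultaneous, size.roomsOneSubject, size.roomsAllSubjects,
            size.roomsWithSpares, size.totalStudents);
```

The values come out as 27, 513, 4104, 12312 and 152000, matching the figures above.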

The design would be a nightmare with a layout from hell to minimize conflict due to intellectual peers not always being age peers, and neither necessarily being perceptual peers, and yet the layout also has to minimize the distance walked. Due to the lack of wormholes and non-simply-connected topologies, this isn't trivial. A person at one extreme corner of the two dimensional spectrum in one subject might be at the other extreme corner in another. From each class, there will be 15 vectors to the next one.

But you can't minimize per journey. Because there will be multiple interchangeable classes, each of which will produce 15 further vectors, you have to minimize per day, per student. Certain changes impact other vectors, certain vector values will be impossible, and so on. Multivariable systems with permutation constraints. That is hellish optimization, but it is possible.

It might actually be necessary to make the university a full research/teaching university of the sort found a lot in England. There is no possible way such a school could finance itself off fees, but research/development, publishing and other long-term income might help. Ideally, the productivity would pay for the school. The bigger multinationals post profits in excess of 2 billion a year, which is how much this school would cost.

Pumping all the profits into a school in the hope that the 10 uber creative geniuses you produce each year, every year, can produce enough new products and enough new patents to guarantee the system can be sustained... It would be a huge gamble, it would probably fail, but what a wild ride it would be!

User Journal

Journal: Letter frequencies in URLs

Journal by arth1

Doing some maintenance on a few squid cache servers, I decided to look into the letter frequency distributions for URLs, and how it matches normal written text.
Four caches were scanned for the URLs of currently cached content only, constituting around 1.5 million URLs.
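A tally like this takes only a few lines. The following is a minimal sketch, not the script used for the scan. The function name charFrequencies is made up. Folding to lower case is an assumption: the data further down lists only lowercase letters, which suggests it, but nothing here confirms it.

```javascript
// Count how often each character occurs across a list of URLs and return
// [character, count] pairs, highest count first.
function charFrequencies(urls) {
    var counts = {};
    for (var i = 0; i < urls.length; i++) {
        var url = urls[i].toLowerCase(); // assumption: case is folded
        for (var j = 0; j < url.length; j++) {
            var ch = url.charAt(j);
            counts[ch] = (counts[ch] || 0) + 1;
        }
    }
    return Object.keys(counts).map(function (ch) {
        return [ch, counts[ch]];
    }).sort(function (a, b) {
        // Descending by count; ties broken by character for a stable listing.
        if (b[1] !== a[1]) return b[1] - a[1];
        return a[0] < b[0] ? -1 : (a[0] > b[0] ? 1 : 0);
    });
}

// Usage with two made-up URL fragments.
console.log(charFrequencies(["ab/", "a/"]));
```

Feeding it the URL list exported from each cache, then adding up the per-cache counts, would give one row per character.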

In short, the results have some of the same characteristics as normal text, but with notable exceptions. You don't get an etaoin shrdlu; there are a lot of h, t, p, colons and slashes in URLs which skew the results. I'm also surprised that w scored so low, given all the URLs that start with www.

If anyone else finds a use for this, here is the data. Each character in the URL is followed by the number of times it was used in each cache, plus the total for all four caches.

/: 83198 130244 3028097 2929538 6171077
t: 73026 99729 2727455 2641930 5542140
e: 52801 95537 1746624 1753865 3648827
.: 35317 60175 1478231 1467006 3040729
o: 40941 86873 1423124 1448453 2999391
a: 43075 72450 1408451 1384211 2908187
c: 36078 64921 1308435 1295986 2705420
s: 41946 76684 1251987 1278493 2649110
p: 28248 44907 1214805 1190698 2478658
m: 29609 45768 1168769 1195505 2439651
h: 22543 41992 1029463 1019494 2113492
i: 37846 58586 974977 994693 2066102
n: 30006 51596 815477 795344 1692423
r: 26958 53239 801514 774606 1656317
g: 23689 57734 666533 790131 1538087
d: 23304 36637 746244 697523 1503708
:: 15442 27059 639115 649013 1330629
w: 25563 41061 622672 629215 1318511
1: 9697 12580 577523 561429 1161229
l: 21855 32824 560110 542960 1157749
2: 9890 13516 492565 514385 1030356
u: 11878 15246 440808 431176 899108
0: 10333 13106 404229 445998 873666
v: 7450 8415 328991 292590 637446
b: 9980 26743 280533 285767 603023
3: 6296 6905 299391 272352 584944
f: 9866 25830 265685 266037 567418
4: 4738 5931 273161 244104 527934
k: 4202 5641 235501 230456 475800
5: 5957 6920 212941 235172 460990
7: 6497 7333 230677 200956 445463
9: 4327 5215 206613 195295 411450
8: 5363 6697 210689 178565 401314
6: 5761 6487 209092 175203 396543
x: 3853 5755 168401 144265 322274
-: 3516 11325 124398 133481 272720
y: 4348 5272 114803 96971 221394
_: 2301 2683 87749 80901 173634
j: 4436 5058 89043 72567 171104
=: 1555 1437 37342 35214 75548
q: 1494 1538 32910 37861 73803
z: 741 907 29563 30037 61248
,: 3282 2848 21099 14688 41917
&: 493 413 12558 9222 22686
%: 220 460 9640 11420 21740
;: 2878 2254 8281 8281 21694
?: 322 294 4796 9264 14676
+: 45 35 1333 1758 3171
~: 31 7 996 735 1769
$: 0 0 425 670 1095
^: 6 0 420 228 654
*: 27 10 187 188 412
!: 0 2 282 122 406
[: 0 0 292 23 315
]: 0 0 272 23 295
|: 8 8 77 167 260
@: 10 0 113 38 161
(: 0 0 75 55 130
): 0 0 69 55 124
{: 0 0 75 0 75
\: 0 0 6 4 10
': 0 0 1 1 2

Does it have any practical use?
Perhaps. In proxy.pac files, a common method of load balancing based on URLs, known as the Sharp Superproxy script, is to sum the ASCII values of the characters in the URL, and mod the sum by the number of servers, to pick a server to use. .pac files are JavaScript, and older scripts assume JavaScript has no easy method to return the ASCII value for a character (current JavaScript offers String.prototype.charCodeAt). So what's generally used is a function like:

// Return the ASCII value of a single-character string by explicit comparison.
function atoi(charstring) {
    if (charstring=="a") return 0x61; if (charstring=="b") return 0x62;
    if (charstring=="c") return 0x63; if (charstring=="d") return 0x64;
//.....
}

This can be sped up by ordering the list in the order of frequency, starting with "/", "t", "e", ".", "o", "a". Just moving those few to the front reduces the latency of the script significantly.
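As a rough illustration of the reordering, here is a sketch of a frequency-ordered lookup feeding the sum-and-mod server pick. It is not the Sharp script itself. The names ORDERED_CODES, atoiOrdered and pickServer are made up. Only the six characters named above are listed, and treating any other character as 0 is an assumption made to keep the example short.

```javascript
// [character, ASCII value] pairs, most frequent first, following the totals
// in the table above. Only the six characters named in the text are listed.
var ORDERED_CODES = [
    ["/", 0x2f], ["t", 0x74], ["e", 0x65],
    [".", 0x2e], ["o", 0x6f], ["a", 0x61]
];

// Hypothetical frequency-ordered stand-in for atoi().
// Assumption: characters missing from the shortened list count as 0.
function atoiOrdered(ch) {
    for (var i = 0; i < ORDERED_CODES.length; i++) {
        if (ch == ORDERED_CODES[i][0]) return ORDERED_CODES[i][1];
    }
    return 0;
}

// Sum the character values of a URL and mod by the number of servers,
// giving an index from 0 to serverCount - 1.
function pickServer(url, serverCount) {
    var sum = 0;
    for (var i = 0; i < url.length; i++) {
        sum += atoiOrdered(url.charAt(i));
    }
    return sum % serverCount;
}

// Usage with a made-up path and five servers.
console.log(pickServer("/tea/", 5));
```

With this ordering a "/" is resolved on the first comparison and a "t" on the second, where an alphabetical list would walk past most of the alphabet first.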

Also, hashing in URL history handling can be sped up if the most prevalent buckets are created. This could also be useful for other URL collections, like AV software URL matching. I am unaware of any that work directly with character based lookups, but it is certainly one way to do it.

Other uses?
In pen testing, having a frequency table like this can greatly aid in URL discovery speed.

But all in all, it was a fun exercise. Note that the variations may be great, especially for the bottom half of the list. Also note that the low count for the letter 'x' in the URLs might not match your users.

Books

Journal: History books can be fun (but usually aren't and this is a Bad Thing)

Journal by jd

Most people have read "1066 and all that: a memorable history of England, comprising all the parts you can remember, including 103 good things, 5 bad kings and 2 genuine dates" (one of the longest book titles I have ever encountered) and some may have encountered "The Decline and Fall of Practically Everybody", but these are the exceptions and not the rule. What interesting - but accurateish - takes on history have other Slashdotters encountered?

User Journal

Journal: Nothing changes: different decade, same consumer fraud.

Journal by pcwhalen

I just looked back at a journal post I wrote from almost a decade ago here on /. It mentions all the same issues of arbitration and attempts by Big Biz to screw the Common Man with contract clauses.

I am not surprised that, with so much money at stake for corporate America to steal, they have kept such a tight leash on consumer contract arbitration provisions. These keep a consumer from suing in court, requiring instead a forum more favorable to the Corp. Almost all consumer contracts have been changed to preclude class actions.

In my lawsuit against Sony in August over the PSN data breach, I was immediately faced with Sony changing its contract to preclude class actions. The same for my lawsuit against Citibank in their data breach.

An Australian friend put it this way: "Different dog, same leg action."

Education

Journal: HOWTO: Run an educational system

Journal by jd

The topic on Woz inspired me to post something about the ideas I've been percolating for some time. These are based on personal teaching experience, teaching experience by siblings and father at University level and by my grandfather at secondary school, 6th form college and military academy. (There have been a lot of academics in the family.)

Anyways, I'll break this down into sections. Section 1 deals with the issues of class size and difference in ability. It is simply not possible to teach to any kind of meaningful standard a group of kids of wildly differing ability. Each subject should be streamed, such that people of similar ability are grouped together -- with one and only one exception: you cannot neglect the social aspect of education. Some people function well together, some people dysfunction well together. You really want to maintain the former of those two groups as much as possible, even if that means having a person moved up or down one stream.

Further, not everyone who learns at the same pace learns in the same way. Streams should be segmented according to student perspective, at least to some degree, to maximize the student's ability to fully process what they are learning. A different perspective will almost certainly result in a different stream. Obviously, you want students to be in the perspective that leads them to be in the fastest stream they can be in.

There should be sufficient divisions such that any given stream progresses with the least turbulence possible. Laminar flow is good. There should also be no fewer than one instructor per ten students at a secondary school level. You probably want more instructors in primary education, fewer at college/university, with 1:10 being the average across all three.

Section 2: What to teach. I argue that the absolute fundamental skills deal in how to learn, how to research, how to find data, how to question, how to evaluate, how to apply reasoning tools such as deduction, inference, lateral thinking, etc, in constructive and useful ways. Without these skills, education is just a bunch of disconnected facts and figures. These skills do not have to be taught directly from day 1, but they do have to be a part of how things are taught and must become second-nature before secondary education starts.

Since neurologists now believe that what is learned alters the wiring of the brain, the flexibility of the brain and the adult size of the brain, it makes sense that the material taught should seek to optimize things a bit. Languages seem to boost mental capacity and the brain's capacity to be fault-tolerant. It would seem to follow that teaching multiple languages of different language families would be a Good Thing in terms of architecturing a good brain. Memorization/rote-learning seems to boost other parts of the brain. It's not clear what balance should be struck, or what other brain-enhancing skills there might be, but some start is better than no start at all.

Section 3: How to test. If it's essential to have exams (which I doubt), the exam should be longer than could be completed by anyone - however good - within the allowed time, with a gradual increase in the difficulty of the questions. Multiple guess choice should be banned. The mean and median score should be 50% and follow a normal distribution. Giving the same test to an expert system given the same level of instruction as the students should result in a failing grade, which I'd put at anything under 20% on this scale. (You are not testing their ability to be a computer. Not in this system.)

Each test should produce two scores - the raw score (showing current ability) and the score after adjusting for the anticipated score based on previous test results (which show the ability to learn and therefore what should have been learned this time - you want the third-order differential and therefore the first three tests cannot be examined this way). The adjusted score should be on the range of -1 (learned nothing new, consider moving across to a different perspective in the same stream) to 0 (learned at expected rate) to +1 (learning too fast for the stream, consider moving up). Students should not be moved downstream on a test result, only ever on a neutral evaluation of some kind.
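One way to read the adjusted score, offered purely as a sketch: predict the next raw score by assuming the third finite difference of the previous three scores is zero, then scale and clamp the gap between the actual and predicted score into the -1 to +1 range. The extrapolation rule, the scale parameter and the function names expectedScore and adjustedScore are all assumptions, since only the range and its meaning are fixed above.

```javascript
// Predicted next score if the third finite difference of the series is zero:
// next - 3*s3 + 3*s2 - s1 = 0.
function expectedScore(s1, s2, s3) {
    return s1 - 3 * s2 + 3 * s3;
}

// Hypothetical adjusted score in the range -1 to +1.
// previous: earlier raw scores, oldest first; raw: the current raw score;
// scale: assumed number of points that counts as a full step either way.
function adjustedScore(previous, raw, scale) {
    if (previous.length < 3) {
        return null; // the first three tests cannot be examined this way
    }
    var n = previous.length;
    var expected = expectedScore(previous[n - 3], previous[n - 2], previous[n - 1]);
    var value = (raw - expected) / scale;
    return Math.max(-1, Math.min(1, value));
}

// Usage with made-up scores: steady 5-point gains predict 55 next time.
console.log(adjustedScore([40, 45, 50], 55, 20));
console.log(adjustedScore([40, 45, 50], 65, 20));
```

The raw score is reported as-is alongside this value. A result near 0 means the student learned at the anticipated rate.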

Section 4: Fundamentals within any given craft, study or profession should be taught as deeply and thoroughly as possible. Those change the least and will apply even as the details they are intertwined with move in and out of fashion. "Concrete" skills should be taught broadly enough that there is never a serious risk of unemployability, but also deeply enough that the skills have serious market value.

Section 5: Absolutely NO homework. It's either going to be rushed, plagiarized or paid-for. It's never going to be done well and it serves no useful purpose. Year-long projects are far more sensible as they achieve the repetitious use of a skill that homework tries to do but in a way that is immediately practical and immediately necessary.

Lab work should likewise not demonstrate trivial stuff, but through repetition and variation lead to the memorization of the theory and its association with practical problems of the appropriate class.

Section 6: James Oliver's advice on diet should be followed within reason - and the "within reason" bit has more to do with what food scientists and cookery scientists discover than with any complaints.

Section 7: Go bankrupt. This is where this whole scheme falls over -- to do what I'm proposing seriously would require multiplying the costs of maintaining and running a school by 25-30 with no additional income. If it had a few billion in starting capital and bought stocks in businesses likely to be boosted by a high-intensity K-PhD educational program, it is just possible you could reduce the bleeding to manageable proportions. What you can never do in this system is turn a profit, although all who are taught will make very substantial profits from such a system.

User Journal

Journal: I don't know which is scarier

Journal by jd

That I am old enough to remember where my current .sig came from, or that nobody else is.....! For those who are suffering from a memory lapse, here is the sig: "The world is in darkness. To erase data is to suppress truth; to halt computing is to shackle the mind."

Ok, ok, you're too lazy to google it, so here's the link: Son of Hexadecimal Kid

