User Journal

Journal Journal: Minor Jibber-Jabber

These Are Minor Journal Entries so don't expect too much from them, 'k?

2006-11-11: Yet Another Critter (Sorta)

My wife really loves me! I know this because she caught a big, nasty spider and gave it to me, even though she fears big, nasty spiders. (I mean, really and truly fears -- she isn't your stereotypical girlie female most ways, but when a spider emerges from the dark corners of her critter-room she screams and has me come in to dispose of it!)

I've named her Sil, and we set her up in a spare critter-keeper with some moss and a twig. She's happily hanging out in one corner of it in her web, occasionally feasting on the crickets and flies we put in there with her. I got into the practice of catching flies in the early morning stumbling-around still-waking-up hours when I had my first spider, and it's coming back pretty easily now. I'm using the Too Much Coffee Man method of waiting for a fly to land on a flat surface, then putting my hand edge-down over that surface, and sweeping it quickly through the air just an inch above the fly in a grasping motion. The first thing flies do when they sense danger is leap straight up and then get their wings going, so they fall into my hand quite nicely.

-- TTK

2006-11-05: Stray Political Thoughts

It seems to me that the basis of "progressive" ideology is the idea that human beings have not yet lived the best life they possibly can, and that we can progress into a better life, one that has never been lived before. Liberalism is a type of progressivism, but with a (IMO) fucked-up idea of what "better" means, and of what constitutes "progress".

Along these lines, the basis of "conservative" ideology is that the best life humanity can hope for is either already here, or was lived at some time in the past (or at least, that the life humans once lived was better than what we have today). It can express itself in the form of wanting to preserve what is good about the status quo, which is (IMO) understandable and laudable. Unfortunately there are also some really messed-up, evil elements in modern conservatism which I cannot overlook.

I think the essence of "neo-conservatism" is that the "past" which is held up as the ideal towards which humanity should strive is actually fictional, and never actually existed. So in a twisted way, neo-conservatives are actually progressives, but they appeal to conservative sensibilities to dishonestly further their agendas.

As a libertarian anarchist, I consider myself a progressive. I have some ideas of how mankind might live in a better world than any we have yet seen. Unfortunately in this hyper-polarized modern America, idealism is often overlooked. People are preoccupied instead with the struggle between liberals and conservatives to control the resources and people of the world. Both sides have taken a position of "you're either with us, or against us", and see me as just another wasted vote, helping the other side by not pitching in with their side.

That doesn't bother me, though. What bothers me is that so few people are keeping their eye on the prize. What good is it if "your guy" wins the election, if the ideals your guy upholds are antithetical to justice? I believe I have my priorities straight, by putting my ideals first.

-- TTK

2006-11-01: The Joys of Semi-Rural Living

I live in a semi-rural area, where neighbors are far away, the streets are unlit, and the wildlife runs pretty wild. This combination provides ample opportunity for hilarity.

One neighbor has a halogen light in their front yard, which is so bright it hurts my eyes from our own yard, a couple hundred meters away. Fortunately there are some trees and bushes between us, so the light only shines through in patches. I have never liked this light, cursed it constantly, and have considered asking them to change it somehow, or "change" it myself with a rifle. But all that changed last Monday.

I was taking out the trash Monday night, in a bit of a mood, when I saw a dark shape slinking across the unlit street. It was about the right shape and mass for Sam, our big doofus "gorilla-cat", and I thought maybe I'd get behind him and shout "boo!" or something, and watch him levitate up a tree for my amusement.

So I crept up on the shape, and had raised my arms and filled my lungs with air just as it was passing through one of those patches of light from the neighbor's halogen to reveal not Sam's grey hair, but black hair. Black hair marked by two white stripes. Stripes pointing directly at me. It looked something like this.

I froze in my tracks as surely as the breath froze in my lungs. I backed away sloooooowly, not daring to make a sound. Making good my escape, I reflected that should I ever meet the neighbor who installed that halogen, I just might kiss them.

-- TTK

2006-10-31: Never, Ever Do This

I like my soda cold and flat, so I opened a diet 7up and put it in the freezer. I forgot about it, and it froze solid. So I put it on the stove to thaw it back out. Soda was forgotten, and I heard it start to boil. Soda was pulled off the stove, and when it had cooled slightly I put it back in the freezer. When it was cold again, I chugged it.

I think the boiling did something to it.

Something horrible.

It tasted normal enough, a little stronger than usual, but soon afterwards I grew nauseous.

The nausea grew. Oh dear gr0d, it grew.

I'll spare you what came afterwards. Anyway, take it from me, don't ever, ever do that.

-- TTK

User Journal

Journal Journal: Trying Something New and Different

And now for something a little different..

Last Saturday I had a chat with a long-distance friend who says I don't write enough in my journal. This wasn't exactly news to me, but he said something interesting. When I observed that most days there isn't much more to my life than "woke up, worked hard, came home late, went to bed way too late", he replied that that was okay, he'd like to read it.

So I mulled that over, worried a bit that the dozens of "crap" journal entries would drown out the occasional "quality" entry, and realized I could remedy that with a new format. I'd make an "Everyday Life" journal entry, and then reply to it on the days that followed with the latest "Everyday Life" report. Then when I got around to making a quality entry, it would be a new entry on its own, so it would stand out from the "Everyday Life" entries, and then I'd start a new "Everyday Life" journal entry after it, so that all entries would logically remain in chronological order. (If that didn't make much sense, that's okay, you'll see.)

So, it's two days later and still no journal entry. WTF? Well, there's another problem -- one of timing. When do I talk about my day? Not in the morning, because my day hasn't happened yet. Not during business hours, because I work my ass off trying to get stuff done. I'm buried in enough work to keep two engineers busy -- there is no chance of my getting ahead of the workload; all I can do is try to minimize the number of projects that get dropped entirely. Not after I get home, because I prefer spending time with my wife over the computer (this journal entry you're reading now being an exception; she is currently engrossed in something on her own computer). Also, that's really the most interesting part of my day, and little of it if any would get put in the day's entry. It lasts until we're both too outrageously tired to really enjoy being awake any longer and we drag ourselves to bed. Then it's too late, and memories of the more interesting bits of the previous day are swept aside by the hectic new day.

So, where does that leave me? It leaves me with the option of putting something else into my "low-calorie" journal entries, something other than a few short words about my day. And there is plenty of material to draw from -- I have a rich internal life, and my three hours a day of commute gives me much time to ponder. I just don't usually like to put it into written words unless I'm going to do a proper job of it -- and I seem to get around to that maybe once every few months. So I'll just have to practice doing an improper, half-assed job of it, so I can post entries more often.

And that's the plan. We'll see how it goes.

-- TTK

User Journal

Journal Journal: On Wikis and Computer Racks

Gaaah! Too long since last journal entry!

Six months, and no journal entry .. how utterly pathetic. In the meantime cobalt (aka "invisiblecrazy") has been writing up a storm. Well, here's a new one.

Chronicles of The Beast, Part One

A friend of mine was getting rid of a computer cabinet, and asked if I wanted it. Naturally, I leapt at the opportunity. I've been meaning to build a server rack from wood for a while now, and I had a place set aside in our shed outside for just that.

So I woke up at 6am, borrowed my wife's truck for the day, and picked the beast up. And it was quite the beast. I'd been expecting something like the cheesy minimalist computer racks we use at The Archive, but this was anything but. The steel load-bearing structure was completely encapsulated in aluminum walls, with a locking tinted polycarbonate door. I could barely fit it into the back of cobalt's little truck! But that was really the easy part.

After work, I took the beast home, where it stayed for another day before I found the time to get it into the shed. And this is where the story really begins.

Getting the beast into the truck was made relatively easy by the caster wheels on its underside, and the nice flat pavement from my friend's carport to his driveway. The grounds around my house are not paved, but rather dirt covered by a four-inch-or-so layer of pebbles. Nor are the grounds flat; the soil is too marshy for that. It flows around and makes little slopes everywhere. Nonetheless, I figured I had a pretty clear shot from the driveway through the side yard and into the shed, and I figured our handcart's 12-inch wheels would be up to the task of transporting it across the rocks.

So I got the casters off, lowered the beast onto the push cart, strapped it to the cart for good measure, and started down the side yard. I didn't get it very far. The ground had a distinct slope to the left, away from the house and towards the fence that separated the side yard from the drainage ditch. The pebbles were mostly navigable, but when the cart got stuck anywhere I couldn't jar it or rock it out or the entire cabinet would try to topple downhill. I worked at it for about an hour before rethinking my approach. If I kept it up, the cabinet would end up crashing through the fence and getting stuck in the ditch, where I would surely never dislodge it. So I laid down some wood to give myself a flatter surface, and used different thicknesses so that the cabinet would be more level in its trek.

Did I mention it was heavy? At seven feet tall and two feet wide, this thing was really damn heavy. Keeping it from toppling over was like wrestling with a poltergeist-infested coke machine.

Anyway, I finally got it to the shed, which I had prepared carefully to make room for its passage, and only then realized that I'd committed a grave oversight: Though I had carefully measured the space in which it was going, I had not thought to measure the entrance to the shed itself, which was far, far too short to allow the cabinet's passage. In the end I had to tip it over so it was lying down flat, with my body underneath it, then lift it straight up and push it through the shed entrance. Damn it was heavy.

After that, it should have been smooth sailing, but once again I fell prey to the hazard of unlevel surfaces. The floor of our shed was made of shipping pallets, and they were not all the same height, so the cabinet listed somewhat to the left. I had carefully measured the width of the shed's central aisle, and even at its tightest points it was a few inches wider than the cabinet -- or at least, a few inches wider than a *perfectly upright* cabinet. The leftward list was sufficient to close this slim margin, and a few items got knocked over before I again used tactically-positioned pieces of wood to give the cabinet a level surface on which to ride.

After that, it was smooth sailing. Three hours of hot, back-breaking tedium was all it took to get the beast into place. And now it is mine, and I have a newfound appreciation for level paved surfaces. :-) Time to start rackin' up those computers!

Little Black Car, Revisited

Soon after my previous blog entry, I got my little Honda's tires more fully inflated and its oil changed, and now it's regularly getting 35 miles to the gallon -- one mpg more if I drive it nicely, one mpg less if I abuse it mercilessly. I have to admit to getting 34 mpg more often than 36 mpg, because half the joy of having a stick-shift transmission is abusing it mercilessly.

I also settled on a bumper sticker, but have yet to put it on my actual bumper. The sticker itself was a row of three anarchist circle-A symbols (in the old style, not the Anarcho-Punk style made popular in the 1970's), but I cut it into three pieces so now I have three square-shaped anarchist circle-A stickers. One I put on the back of my laptop. One I will put on my car, but haven't yet. I haven't decided what to do with the third.

That rear bumper is not completely unadorned, however; my beloved wife found a "duct tape is like the force" sticker and stuck it on my bumper while I wasn't looking. :-)

Falling Prey to the Wiki Meme

I have been writing a little software for importing masses of data into a MediaWiki, an online resource which generates and manages webpages which anyone (or a whitelisted userbase) can update or edit. This is in response to two problems which I think hosting the material in a Wiki might address.

First, the system I have been using to manage my MBT Resources pages is woefully inadequate. One of my pastimes is researching battletank technology, and resources which seem rare or of scientific or engineering value get saved to my home workstation's hard drive. I then have a tool which can be used to curate this content, and export it into a simple webpage format.

This system has problems. It is tedious and time-consuming to sift through all this content, the resulting web pages are rather sparse and featureless, and once the content is cast into this web format it is difficult to update. As a result I have updated my site little in the last couple of years, even though I have amassed enough new material to easily triple it in size.

Second, Tank Net, a vibrant and valuable online forum for tank enthusiasts, has problems of its own. People there like to talk about tanks, but they don't like to read old discussions. This results in a lot of newbie questions and myths being posed over and over again, and the old regulars are getting tired of repeating the same answers and rebuttals. There are members who are visibly at odds with the disorganized and transient nature of BBS content, and I get the impression these members would appreciate a more organized and longer-lasting medium to target with their online muse.

I think a Wiki would address both my problems with my website and the tanknetters' frustrations, with a little help from software I have been developing at The Archive to sort disorganized content. I can automatically organize my new content into something approximating the right layout (tanks here, other afv's there, armor on its own set of pages, munitions in their own, etc), and then use MediaWiki's powerful content curation tools to reorganize misplaced content and add descriptions in an ad-hoc fashion. The resulting documents would be nicely cross-linked (one of the features of Wikis is that when a term shows up in a document, and that term has its own document associated with it, that term is made into a hyperlink to its document).
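To give an idea of what the import side looks like, here is a minimal sketch of the sort of script I mean: wrap a directory of plain-text resource files into a MediaWiki XML import file that the importDump.php maintenance script can slurp in. The directory name, the categorization rules, and the bare-bones XML are all simplified stand-ins, not the real tool.

#!/usr/bin/perl
# Rough sketch: wrap a directory of plain-text resource files into a
# MediaWiki XML import file.  The directory name, categorization rules,
# and minimal XML here are illustrative; the real schema importDump.php
# wants may need more boilerplate (siteinfo, namespaces, etc).
use strict;
use warnings;

my $dir = shift || 'mbt_resources';
opendir(my $dh, $dir) or die "can't open $dir: $!";
my @files = grep { /\.txt$/ } readdir($dh);
closedir($dh);

print qq{<mediawiki xml:lang="en">\n};
for my $file (sort @files) {
    (my $title = $file) =~ s/\.txt$//;
    $title =~ s/_/ /g;
    open(my $fh, '<', "$dir/$file") or die "can't read $dir/$file: $!";
    my $text = do { local $/; <$fh> };
    close($fh);
    # crude first-pass categorization: tanks here, armor there, etc.
    my $cat = $title =~ /armor/i    ? 'Armor'
            : $title =~ /munition/i ? 'Munitions'
            :                         'Tanks';
    $text .= "\n[[Category:$cat]]\n";
    for ($title, $text) { s/&/&amp;/g; s/</&lt;/g; s/>/&gt;/g; }
    print "  <page>\n";
    print "    <title>$title</title>\n";
    print qq{    <revision><text xml:space="preserve">$text</text></revision>\n};
    print "  </page>\n";
}
print "</mediawiki>\n";

Pipe the output to a file, feed it to importDump.php, and then do the fine-grained reorganization through the Wiki itself.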

Furthermore, since MediaWiki allows users to discuss the content of Wiki documents (each page has a "discuss" link attached to it, which leads to a forum interface, sort of), users can pose their questions and observations about the material, and a FAQ in the content page from which the discussion is linked might cut down on the repeatedly posed newbie questions. I would whitelist a select body of users (mostly tanknetters) and allow them to contribute to the Wiki in general. Hopefully this will lead to a more generally useful and better put-together resource than my current MBT pages.

Anyway, that's The Plan and the theory behind it. Execution of that plan remains to be seen.

Postscript

Those of you who are familiar with what's going on with my life lately might be surprised that I failed to mention a few important developments. This is not because I do not consider these developments meaningful, but rather it is because I do not want to talk about them in my public journal.

-- TTK

User Journal

Journal Journal: General Rant, More to Come Soon (I hope)

General Rant

Lots has happened .. and I haven't had time to write. Where to begin?

Work! Work is familiar territory. I spend enough time there. I will start there.

My long, dark midnight in the Collections Department is breaking. We have hired a Manager of QA, who has kicked some major, major ass, and freed me from the constant firefighting which has consumed all of my time for the past few months. I'm spending about half my hours actually developing software (ooooh!). :-) It feels wonderful.

It's just in time, too -- John, the Director of Operations, came back to work after a short (but all too long!) leave of absence. He takes the problems facing the organization very seriously, and pulls resources from anywhere and everywhere to solve them. Since I'm one of the resources at his disposal, he has assigned me some solutions to work on. Wow, now that I've written that out, it doesn't seem like such a wonderful thing, but really, truly, I am elated -- he "owns" the problemset in which I am most interested, and for which I applied to work at The Archive in the first place.

For instance, I've been adding new functionality to the software I use to track the data "items" on the three data clusters, so that I can answer questions about what's where, what isn't where it's supposed to be, how quickly we are filling up the servers with different kinds of items, and (perhaps most importantly) what is supposed to be the same between the clusters but isn't. I haven't been using my ItemTracker system to do this; that system is still not in production. I've been using an ad-hoc collection of Perl scripts which operate on flat files of column-formatted data and build up different perspectives of the data, starting from raw "manifests" -- simple lists of all of the items on each host in every cluster (as generated by the dy utility) which are automatically generated every night and uploaded to a central server.
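To give a flavor of what these scripts do, here is a stripped-down sketch of one of the simpler "perspectives": given two manifests, report the items present on one cluster but missing from the other. (I'm pretending the manifests are just whitespace-separated lines whose first column is the item identifier; the real ones carry more columns.)

#!/usr/bin/perl
# Sketch of one of the flat-file "perspectives": given two manifests
# (whitespace-separated lines whose first column is the item identifier),
# print the items present in the first cluster but missing from the second.
use strict;
use warnings;

my ($manifest_a, $manifest_b) = @ARGV;
die "usage: $0 cluster_a.manifest cluster_b.manifest\n"
    unless defined $manifest_b;

my %on_b;
open(my $fh, '<', $manifest_b) or die "can't read $manifest_b: $!";
while (<$fh>) {
    my ($item) = split;
    $on_b{$item} = 1 if defined $item;
}
close($fh);

open($fh, '<', $manifest_a) or die "can't read $manifest_a: $!";
while (<$fh>) {
    my ($item) = split;
    next unless defined $item;
    print "$item\n" unless $on_b{$item};    # on cluster A, not on cluster B
}
close($fh);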

Right now I'm building these perspectives and generating reports as needed, but at John's request I am automating it, so that these reports are generated every week without human attention, and made available via a web interface. Once that's working, I am hoping to spend some time working on ItemTracker, which will not only make new kinds of information available, but also allow users to describe the kinds of reports they want, and have them generated on demand without me.

Switching gears a bit, Brewster sat on a panel at last Friday's Commonwealth Club of California meeting, where he and other relevant individuals talked about the future and present of book digitization. I didn't deliberately time it this way, but I happened to get out of work and into my car (my car! more on that later) to go home just as the meeting began, and got to listen to it on the radio on the way home.

He was a little muted, which surprised me at first, but as the meeting progressed I started to see why it might be a good idea for him to withhold some information and conceal some of his zeal. Some people are really upset about book digitization, and view it with suspicion if not outright hostility. Publishing industry professionals see it as a threat to their future profitability, authors are concerned about potential infringements on their copyrights, and the lawyers are circling like sharks smelling blood in the water. Right now those lawyers are making their passes at Google, and I don't blame Brewster a bit for not wanting to attract their attention. Furthermore, most of the books The Archive has available for download are from the Million Books Project, which was a huge learning experience in how not to scan books. The quality of most of these is horrible, and I can understand him not wanting to give people the wrong idea about the books we're scanning today. The quality we're getting out of the Scribe is nothing short of amazing.

Oh yeah, my car! I finally, finally, finally got a car, to replace the 1999 Toyota Corolla I totalled last November. The main reason it took so long is because it had to have a manual transmission. I'm addicted to the stick. But manual transmissions are getting really hard to find! Moreover, I wanted a car with a maintenance record that Consumer Reports liked, and that got good gas mileage for my commute. I finally found what I wanted in a Fremont used car lot -- a black 2001 Honda Civic.

I've been driving it for a couple of weeks now, and I'm really happy with it. It's a bit noisier than I would like (lots of engine noise, and no room between the engine and the firewall to add soundproofing -- still pondering this), but it gets 33 miles to the gallon (vs. the Corolla's 35), has more than enough zip, and has a toy I've never enjoyed in a car before -- a cd player.

So, I dug out the dusty binder that contains my entire cd collection and put it in the car .. there isn't much to it. I've always had tape players in my cars, most of the music I like is on tape, and tape players are less expensive than cd players anyway, so whenever a tape player broke, why buy a cd player rather than take advantage of my existing investment and replace it with another tape player? As a result, I haven't bought many cd's.

Here's what I do have: NIN: pretty hate machine, NIN: broken, NIN: downward spiral, Stabbing Westward, Gravity Kills, Eagles: Hotel California, Pink Floyd: The Wall, Sisters of Mercy: Greatest Hits, Ministry: Filth Pig, Aerosmith: Get a Grip, Faith No More: King for a Day, Ozzie Ozbourne: Ozzmosis, Depeche Mode: Violator, Psychedelic Furs: Midnight to Midnight, Duran Duran: Decade, Tears for Fears: Tears Roll Down, Led Zeppelin: II, Pet Shop Boys: Discography, Pet Shop Boys: Very, Garbage, Def Leppard: Vault, No Doubt: Tragic Kingdom, Alice Cooper: Hey Stoopid, Blondie: The Best of Blondie, Berlin: The Best of Berlin 1979-1988, Roxette: Look Sharp, White Zombie: Astrocreep2000, White Zombie: La Sexorcisto - Devil Music, Rage Against the Machine: Evil Empire, Marylin Manson: Coma, and Eurythmics: Greatest Hits.

I want: more Ministry, Ozzie, LedZep, Marylin Manson, .. and I want my Foetus on cd! I have been missing Foetus a lot during my commute. I've been listening to No Doubt, Garbage, and Ministry instead. I'll see what I can find on Amazon!

There's something else, too -- this is the first time in something like ten years I've actually owned my car. My last two cars, I got on finance, and totalled them just as they were getting paid off. But this one we simply bought outright. It's not the bank's car, it's my car. I own it, I can drill holes in it if I want to, and I really, really want to put a bumper sticker on it.

Preferably a bumper-sticker with an Anarcho-Capitalist message, which has proven remarkably difficult to come by. I've pored through Cafe Press and similar sites looking for one, but mostly I've turned up stickers being sold with Anarcho-Syndicalist or Anarcho-Communist messages. (Which is pretty entertaining, if you think about it!) I wouldn't mind a more generic anarchic message, and found a few which resonate with my personal convictions (like, "There's No Government Like No Government"), but none which would go over very well if seen in The Archive's parking lot -- we rely a lot on government contracts and grants, and certain important persons working there are of the conviction that our government is the only means by which certain charities may morally or pragmatically be provided to those who need them. This may sound out of character for a self-proclaimed anarchist, but I am loath to ruffle any feathers.

That's all the time I have for tonight, but I hope to get back to this journal soonish. There's much more on my mind.

-- TTK

User Journal

Journal Journal: Minor Life Notes

Minor Life Notes, 2006-01-07

My mouth is finally starting to heal up, again, after this, the third surgery in two months .. my diet is still mostly soup and eggs, and I'm still popping ibuprofen every six hours, but at least I'm not waking up early in the morning in agony anymore.

As of Friday, I was still spending ten hours a day at the office trying to do three people's jobs in the Collections Department of The Archive, and doing it somewhat poorly as a result. A new hire starts this Monday, though, dedicated to QA and technical user support, which is terrific. I'm hoping that with a little training, she'll be able to take a large load off my shoulders. Also, John Berry is back as Director of Operations, which means I get to perform some tasks for the Data Repository group again (which is really the job I signed on to do, and what I want to spend more of my hours doing). I'm really glad he's back.

One of the consequences of the new hire coming in is that I had to pack up some of my personal machines which I had stashed in the Collections office and take them home. Fungus, Dusty, Rooikat, and Dragon all got taped up and tucked into the closet. I kept hoping to have some time to work with them at the office, but aside from running Fungus for a couple of months as a secondary archival storage box, and firing up Rooikat a couple of times, they remained neglected.

-- TTK

User Journal

Journal Journal: ARRRRRGH, Take Two

ARRRRRRRRGH, take two

I woke up this morning around 5am to absolutely excruciating mouth pain. I stumbled into the main room, took what I thought was only two fioracet, and went back to bed, but when cobalt woke me up to take her to her morning MRI I was extremely stumbly. My balance was off, I was very clumsy, and my speech was slurred. It's a miracle I got us there without killing us.

Near as we can tell, I got up sometime before 5am and took two fioracet, and then at 5am took two more, because I showed all the signs of overdosing on the stuff. Cobalt ended up driving us from the MRI to my oral surgery, and from the surgery to home. While I was waiting for her MRI, I drank a cup of coffee, which turned out to be a mistake, because suddenly I was both fioracet-drunk and nauseous. Very bad day!

At the oral surgeon's, Dr. Louie explained to me that the tooth socket pain was something called "referred pain", whereby the socket adjacent to the empty socket was being served by the same nerve, so the damage from the emerging jawbone was creating the illusion of both regions being in pain. His assistant numbed up the region with lidocaine (which, they tell me, is the active ingredient in Orajel -- good to know! Anbesol uses benzocaine, which is related but I think not as powerful), and Dr. Louie explained that they were just going to grind off the top millimeter or two of protruding bone, and then suture it closed. He gave my jaw two injections of anesthetic, and then reached into my mouth with some horrible metal thing and started yanking at the jawbone, asking "does that hurt?" At my frantic affirmative, he injected me again, and started the surgery. I was crying out in pain, so he gave me one more injection, which finally did the trick.

The procedure didn't cost me a dime (which is good, because The Archive doesn't provide dental insurance), and I thanked Dr. Louie for doing this for me. He replied "Don't mention it, I enjoy the opportunity to inflict pain," which made me laugh. Love that Dr. Louie.

After the surgery, he explained that while my jawbone was much thicker than normal, my left jawbone was even thicker than my right jawbone. I was feeling neanderthal enough, but even he uttered the phrase "caveman jaw". He wrote me a prescription for antibiotics, and another for fioracet. Yay! I get to reimburse cobalt for the fioracet of hers I used the two previous days (she doesn't have many on hand).

I'm going to be eating soup for the next few days. Dr. Louie admonished me to not eat "hot foods", but since I'm a nekojita any soup I eat will not be particularly hot. After getting home it was all I could do to drag myself to bed, and here I stay.

Okay, in much pain now, took two more fioracet, and going to try sleeping again. Unconsciousness is good for pain. Poor cobalt asked me to carry something heavy outside, just now, and I had to tell her "No". She's not used to that. Hopefully I'll get better soon, so I can lift heavy things again without feeling like my jaw is going to explode in a paroxysm of blood and gore.

-- TTK

User Journal

Journal Journal: Happy Fucking HOLIDAY OF PAIN

ARRRRRRRRGH

About two months ago I had my wisdom teeth extracted. A month later, my right jawbone emerged from my gums. Apparently this happens sometimes, when the gums shrink faster than the healing jawbone. It was painful, but I got through it.

Three days ago, the same thing started happening on the left side. "Oh, I know what this is, I'll get through it," I said to myself, and took ibuprofen to control the pain, like last time. Yesterday the jawbone emerged, and the pain became too intense for ibuprofen to suppress, so I moved up to ibuprofen and Anbesol (a benzocaine product, sold as an OTC topical oral anesthetic), which was sufficient (if barely). This morning, I woke up to excruciating, unending pain. About half an inch of jawbone had emerged, and the pain became so intense as to be unbearable, despite everything I tried -- salt/Listerine washes, icing, Anbesol, ibuprofen, whatever. It wasn't just the emerging jawbone, it was also the entire socket just forward of the socket where the wisdom tooth had been removed. On a scale of one to ten, with "ten" being the worst pain I've experienced, this was a "nine". Around 1997, I had a really bad ear/nose/throat bacterial infection, and one of the medicines they gave me was codeine. I had a bad reaction to the codeine, resulting in intense waves of pain wracking my entire body, leaving me screaming and writhing on the floor for about an hour. That was "ten". This morning wasn't quite that bad, but it left me casting about in a borderline panic, wondering what I could do to fix it. (Cutting a slit over the bone? Finding the nerve and severing it? Searing the entire area with a soldering iron? Yanking the tooth with pliers?)

Normally, one fioracet is enough to leave me with no pain, suspended in a dreamy haze for about an hour, before dropping me into blissful unconsciousness. They're what cobalt takes for her migraines. Well, four fioracet later the morning's pain was reduced to tolerable levels. Cobalt chased down the chain of dentists and oral surgeons, who kept referring to each other (of all the reasons to loathe the Christmas season, all of the nation's professionals taking their vacations all at once ranks way up there) and finally got ahold of Dr. Louie, the guy who extracted my wisdom teeth in the first place. He could squeeze me in tomorrow, to remove the emerging bone (and maybe give me a topical, oh please oh please). In the meantime, he said, fioracet was a good idea (he called in a prescription for me, which is nice because cobalt doesn't have many). So I'm drugged to the gills on fioracet ("blue pills") until then.

Thanks for putting up with my bitching. I really needed a way to vent my frustration. I took another four fioracet a little while ago, and my eyes are having trouble focussing, so I'll stop here. I will say that cobalt has been remarkably understanding and caring. She's been really good to me, and I'm grateful.

Merry @#$%^* Christmas. This season sucked for many reasons, which I'll address in another journal entry, but this topped the cake. I hope the rest of y'all had a better one.

-- TTK

User Journal

Journal Journal: A New Position!

A New Position!

Cobalt and I discovered a wonderful new position today. It makes her sigh, and moan, and cry out, and compliment me on the strength of my hands. I'm writing it down so that we remember, and to share it with anyone looking for something new and wonderful.

To make the position, you start out kneeling on the ends of the bed, facing each other, about three feet apart. Then she bends forward, still kneeling, until the top of her head is on the bed, so that her forehead is almost touching her knees. Her posterior rises a little so that the back is curved from almost vertical at her neck, to about 30 degrees from horizontal at her rear.

Then he leans forward slightly, placing his hands spread-out across her shoulder blades, thumbs against either side of her spine, so that he can apply a little weight as needed.

From this position, he runs his thumbs over the knots in her back, applying steady and even pressure from the base of her neck to the small of her back. A little lotion can help keep the action smooth -- we used IcyHot (your basic menthol-and-camphor), also using the ball and heel of the hand to press down and in on the tight muscles on either side of her neck up into the back of her head. It should take about eight seconds to do a complete "cycle", starting at the base of the neck, massaging down the spine to the small of her back, and then the same speed and pressure back to the back of her head. I didn't know it was possible to coax such verbalizations from that woman. After ten or fifteen minutes her eyes were glazed over and she was positively purring. She asked if I'd been having an affair with a masseuse, and noted she wouldn't mind right now, as long as I "brought home the rewards". :-) It's nice to make her so happy.

Bigga Badda Boom

Oh yeah, by the way, about three weeks ago I was in a traffic accident on highway 101, coming home from work. Wham, bang, nobody was hurt, but my car was totalled. I am so incredibly bummed. I loved that car (a 1999 Toyota Corolla).

It's taking forever for the insurance company to get its ducks in a row, and in the meantime I've been working from home some days and taking cobalt's truck to Archive HQ other days. She's unhappy to be without her truck, I'm unhappy to be without a car, my coworkers are slightly miffed that I'm not in the office every day, and I am very much looking forward to getting the insurance money so we can buy me a new (used) car.

The Corolla was great -- plenty under the hood for my modest needs, an excellent 34 miles to the gallon, a five-speed manual transmission, and enough trunk space to carry a few computers or snake tanks or weedwhackers or whatever. It was a little noisy, which annoyed cobalt more than it annoyed me. Very reliable and low maintenance. It will be a hard act for the new car to follow.

So far I've identified a few cars which I think might make me happy -- hyundai accent, honda accord, mitsubishi lancer, mitsubishi mirage, suzuki aerio, suzuki esteem, volvo S40, and volvo V40. I also wouldn't kick a toyota mr2 out of bed if one fell in my lap (I've been lusting after one since high school), but realistically it doesn't have the cargo room to satisfy all of my automotive needs. All of these would make good to fair commuter cars, plus the capability of lugging a lot of stuff. What remains to be done is to check out what Consumer Reports says about each of them, cross out any with significant reliability dings or high maintenance needs, and then go to our local dealerships to see what they have in the way of slightly used models (I'm thinking between 2001 and 2003) with a manual transmission, and give them a test drive to see if any of them are grievously incompatible with me. My money will essentially go to the lowest bidder -- I don't have enough of a preference for any one of these cars to make me choose one over the other based on anything but financial cost.

Wish me luck.

-- TTK

User Journal

Journal Journal: Shook Hands with a Legend

I was sitting in my office just now at the Internet Archive monitoring some data transfers when Brewster walked in and announced some visitors (as he is apt to do at times -- all visitors get to poke their heads in every door so Brewster can describe the functions of the different departments). A family flowed through the doorway, and the gentleman in the lead was introduced to me as Freeman Dyson.

Freeman Dyson.

Personal childhood hero, Freeman Dyson.

"Dyson Sphere" Freeman Dyson.

The Orion Project's Freeman Dyson.

Unifier of the field of quantum electrodynamics, Freeman Dyson.

I shook his hand, and lost all of my cool immediately, blurting "THE Freeman Dyson?!?", at which the younger man still standing at the door (perhaps Dyson's son?) looked very cross. Oops.

Mr. Dyson answered "One of them anyway", and returned his attention to Brewster. I sat at my console and tried not to panic.

Brewster turned to me at one point and asked "How many non-web items do we have in the Archive?", to which I blanked and answered the first thing that came to mind, which was the number of data items I was trying to transfer, which of course was off by two orders of magnitude. Brewster looked a bit perplexed, but, being himself, he smoothed it over very well. How the hell am I supposed to perform when he springs a living legend on me like this? I just want to close and lock the door and dissolve.

I've met some great people at The Archive, but I never lost my composure like this. Ack. Still in shock, maybe I'd better take a walk.

Egads .. Freeman Dyson!

-- TTK

User Journal

Journal Journal: Appelbaum, Books, and Code

A Night With Jacob Appelbaum

Last Friday night at the Archive, co-worker Gordon Mohr brought an interesting individual up to my workspace -- Jacob Appelbaum, a blogger who had just gotten back to the United States after spending several months in Iraq and then New Orleans, shooting pictures and video in the wake of Operation Iraqi Freedom and of Hurricane Katrina.

He was in a bit of a bind -- he needed to upload dozens of gigabytes of footage to a hosting service in time to give a presentation on the material the following morning. So we used the high-bandwidth connection here at Archive HQ to pull his one-of-a-kind footage off his travel hard drive and fast-track it into the .us data cluster. It has been organized into five data "items" (basic chunks of data handled by the Archive's software):

jacob_appelbaum_Iraq_Video
jacob_appelbaum_iraq
jacob_appelbaum_turkey
jacob_appelbaum_New_Orleans
jacob_appelbaum_Houston

We stayed at the office late into the night, talking about everything from his experiences in Iraq to SSH ciphers to the impact of H.R. Giger on the fields of art, science fiction, and transhumanism. He is quite the renaissance geek, and speaks very eloquently on a wide range of topics. It was midnight by the time I got home, but we got all his footage hosted. It was a very satisfying experience -- this is exactly the sort of content that The Archive exists to archive, and why I work here.

Books Books Books!

One of the more interesting projects going on here at The Archive is the Scribe book-scanning robot, currently in beta. It is a device which allows for the rapid and efficient scanning of books into high-quality images. It associates metadata with each page it scans (page number and page type -- Contents, Title page, Text, Illustration, etc) and processes the images into a number of formats, including jpeg, djvu (which is like pdf, but better in every way), text, and xml. The plan is to deploy Scribes in libraries across the world so that rare and/or interesting books can be archived in perpetuity with a minimum of human effort.
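As a toy example of what that per-page metadata is good for, something like the following tallies the page types in a scanned book. The file name and element names here are made up for illustration; the Scribe's real metadata layout differs.

#!/usr/bin/perl
# Toy sketch: tally the page types recorded in a per-page metadata file.
# The file name and element/attribute names are made up for illustration;
# the Scribe's real metadata layout differs.
use strict;
use warnings;
use XML::Simple;

my $file = shift || 'scandata.xml';
my $data = XMLin($file, ForceArray => ['page'], KeyAttr => []);

my %count;
for my $page (@{ $data->{pageData}{page} || [] }) {
    $count{ $page->{pageType} || 'Unknown' }++;
}
print "$_: $count{$_}\n" for sort keys %count;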

I have used the Scribe prototype to scan a few books, which I keep on my laptop so I can read or reference them abroad. It has been extremely enjoyable to be able to read them when out and about, and handy to be able to search them for rapid reference (one of the books is a text on material engineering, another on automotive technology). I've been giving the Scribe developers feedback on what works well, what could work better, and what features would be handy to have. They seem to appreciate it.

Today before coming to work I wandered my bookshelf, pulling about 5000 pages' worth of books I would like to also have on my laptop for quick reference and/or pleasure reading. My wife thinks this is a great application, and gave me one of her more prized books to be scanned too.

In my spare time (what little of it there is), I've been working on software which ingests the various data and metadata the Scribe generates, and generates from it an HTML document for the book. This would be more useful to me than the djvu or text versions, but there are a number of technical problems to overcome before it works adequately; the Scribe's OCR is error-prone, and I am trying to come up with heuristics for correcting misrecognized words. Also, it cannot yet reliably pick out just the illustration from a page of mixed illustration and text for inclusion in the document.
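To show the kind of heuristic I mean (just the flavor of it, not what the tool actually does yet): run the OCR text through a wordlist, and for any word that isn't recognized, try a handful of common OCR confusions and keep a substitution only if it yields a known word.

#!/usr/bin/perl
# Sketch of one OCR-correction heuristic: for words not in a wordlist,
# try a few common OCR confusions (rn->m, 1->l, 0->o, vv->w) and accept
# a substitution only if it yields a known word.  The wordlist path and
# substitution table are illustrative, not what the tool actually uses.
use strict;
use warnings;

my %dict;
open(my $fh, '<', '/usr/share/dict/words') or die "no wordlist: $!";
while (<$fh>) { chomp; $dict{lc $_} = 1; }
close($fh);

my @subs = ( ['rn', 'm'], ['1', 'l'], ['0', 'o'], ['vv', 'w'] );

while (my $line = <STDIN>) {
    $line =~ s/\b(\w+)\b/correct($1)/ge;
    print $line;
}

sub correct {
    my $word = shift;
    return $word if $dict{lc $word};              # already a known word
    for my $pair (@subs) {
        (my $try = $word) =~ s/\Q$pair->[0]\E/$pair->[1]/g;
        return $try if $dict{lc $try};            # substitution fixed it
    }
    return $word;                                 # leave it alone
}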

The fonts and layouts will be different than what's actually in the book, which is not an issue to me. I'm pretty happy with plain ASCII text. But the audience to which the Scribe is being pitched (librarians, historians, and archivists) cares a lot about preserving not only the fonts and layout, but also the coloration of the pages, including blank paper, inked letters, and illustrations. So this software is just for me; I doubt anyone else involved with the Scribe project will be interested.

(ObCopyrightDisclaimer: The Archive only stores into its data clusters books which are either out-of-copyright, or books whose intellectual property owners have given permission for their archival. The books I have scanned for my laptop are for personal use only, and this is covered under the Fair Use clause of American copyright law.)

ItemTracker and UniversalDB

My ItemTracker project continues to evolve. The idserver became more of a liability than it was worth; as the database continued to grow, the process of serving unique ids ceased to be the performance bottleneck. I chopped out the idserver and replaced it with equivalent SQL transactions, reducing the complexity of the project considerably.
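For the curious, "equivalent SQL transactions" amounts to little more than this sketch. Table and column names are placeholders, and it assumes MySQL-style auto-increment via DBD::mysql.

#!/usr/bin/perl
# Rough sketch of handing out unique ids straight from the database
# instead of from a separate idserver.  Table and column names are
# placeholders, and it assumes MySQL-style auto-increment via DBD::mysql.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('DBI:mysql:database=tracker;host=localhost',
                       'tracker', 'secret', { RaiseError => 1 });

sub next_item_id {
    # INSERT into a table whose primary key auto-increments, then read
    # back the id the database just assigned to that row.
    $dbh->do('INSERT INTO item_ids () VALUES ()');
    return $dbh->{mysql_insertid};
}

my $id = next_item_id();
print "allocated item id $id\n";
$dbh->disconnect;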

As the range of data items being monitored by ItemTracker expanded to cover the entire .sf and .us clusters, the performance of the database diminished. I tweaked the table indexes and database parameters as well as I could, but some of my SQL queries have run into several minutes of processing time. It was clear that the system would not scale to cover all three of our existing clusters (.sf, .us, and .eu), much less when these clusters continue to expand. So I am taking the step of distributing the database.

All of the ItemTracker code is currently accessing the database through my UniversalDB module, which makes it the obvious place to implement the distribution code as an abstraction layer. My goals are modest to begin with -- each table's columns will be duplicated across all nodes in the "virtual" database, with different nodes storing different rows. Since everything in the database ultimately relates to some item, the item rows are trivially segmented by item id. UniversalDB parses each INSERT statement for its item id and sends it only to the appropriate node for storage (and INSERTs to tables without an item id column are duplicated onto all nodes). All other statements (UPDATE, DELETE, SELECT, etc) are being sent to each node in parallel, and the returned data is concatenated together on the client's side. This will not work for some joins, but works perfectly for all of the SQL statements currently used by the ItemTracker system. In the future, if/when I have time (or perhaps if someone else takes it upon themselves to do the work), more work can be done to see if non-INSERT statements can be sent to only some subset of all data nodes, but for now this should be plenty to get ItemTracker back on track (and boy am I getting impatient!).
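A stripped-down sketch of that dispatch policy follows. The item-id regex, the two-node list, the assumption that INSERTs use MySQL's "INSERT ... SET col=val" form, and the serial (rather than parallel) fan-out are all simplifications, not UniversalDB itself.

# Stripped-down sketch of the dispatch policy described above.  The node
# list, the item-id regex, and the assumption that INSERTs use MySQL's
# "INSERT ... SET col=val" form are simplifications, not UniversalDB itself.
use strict;
use warnings;
use DBI;

my @nodes = map { DBI->connect($_, 'rfester', 'secret', { RaiseError => 1 }) }
            ( 'DBI:mysql:database=rfester;host=node0',
              'DBI:mysql:database=rfester;host=node1' );

sub run_sql {
    my ($sql) = @_;
    if ($sql =~ /^\s*INSERT\b/i) {
        if (my ($item_id) = $sql =~ /\bitem_id\s*=\s*(\d+)/i) {
            # segment by item id: exactly one node stores this row
            return $nodes[ $item_id % @nodes ]->do($sql);
        }
        # no item id column: duplicate the row onto every node
        $_->do($sql) for @nodes;
        return 1;
    }
    if ($sql =~ /^\s*SELECT\b/i) {
        # fan out to all nodes and concatenate whatever rows come back
        my @rows;
        push @rows, @{ $_->selectall_arrayref($sql) } for @nodes;
        return \@rows;
    }
    # UPDATE, DELETE, etc. go to every node
    $_->do($sql) for @nodes;
    return 1;
}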

UniversalDB's distribution function is not a one-trick pony; I'm sure I'll have uses for it in other applications in the future, and someone else might find it handy too once it's published. So I'm trying to do a good job of making its configuration useful in the general case -- at least as much as I reasonably can. Its simple segmentation scheme is most useful for distributing databases whose tables all have some common key, and much less useful outside such cases.

The abstraction is important -- I don't want to touch any of the existing ItemTracker code, now that it's stable. Right now ItemTracker just opens a database by host and database name, and spits SQL statements at it. To make this "just work" with the distributed database, UniversalDB needs to know that a database name corresponds to a "virtual" database, and not a physical database. It reads its configuration file to know what nodes (servers) make up the database pool, and to know how to divide the tables between them (eg: "split fester items on id by mod 16" tells it how the table is segmented, and "chunk fester items 12-15 ia401293.archive.org rfester" tells it that when fester.items.id modulo 16 falls in the range 12 through 15, it should send the INSERT to node ia401293.archive.org, database "rfester" for storage). The object passed back by UniversalDB.pm's "new" method has a flag set which indicates that it represents an interface to a virtual database, and also keeps a list of other UniversalDB objects which each represent an interface to a real database, one for each node in the database pool. When the user passes an INSERT to the virtual database UniversalDB object, it passes the statement to the appropriate real database object. Other statements are passed to each real database object in asynchronous mode, and then the virtual database object polls each of them cyclically for data.
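Parsing those directives into a lookup is the easy part. A simplified sketch, using the "chunk" line above plus a second chunk line and node name I've invented just to complete the example:

# Sketch of turning those config directives into a lookup table:
# "split <db> <table> on <col> by mod <N>" sets the segmentation rule,
# "chunk <db> <table> <lo>-<hi> <host> <dbname>" maps id ranges to nodes.
# The 0-11 chunk and its node name below are invented to complete the example.
use strict;
use warnings;

my (%split, %chunk);

while (my $line = <DATA>) {
    if ($line =~ /^split\s+(\S+)\s+(\S+)\s+on\s+(\S+)\s+by\s+mod\s+(\d+)/) {
        $split{"$1.$2"} = { column => $3, modulus => $4 };
    }
    elsif ($line =~ /^chunk\s+(\S+)\s+(\S+)\s+(\d+)-(\d+)\s+(\S+)\s+(\S+)/) {
        my ($db, $table, $lo, $hi, $host, $dbname) = ($1, $2, $3, $4, $5, $6);
        $chunk{"$db.$table"}[$_] = { host => $host, dbname => $dbname }
            for $lo .. $hi;
    }
}

sub node_for {   # which node stores the row with this id?
    my ($db, $table, $id) = @_;
    my $rule = $split{"$db.$table"} or return undef;
    return $chunk{"$db.$table"}[ $id % $rule->{modulus} ];
}

my $node = node_for('fester', 'items', 12345);
printf "fester.items id 12345 lives on %s (database %s)\n",
       $node->{host}, $node->{dbname};

__DATA__
split fester items on id by mod 16
chunk fester items 0-11 ia401292.archive.org rfester
chunk fester items 12-15 ia401293.archive.org rfester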

Thus the details of the distribution are hidden from the "user", as long as there is an "administrator" writing the configuration files. From the user's perspective, it's just an SQL database. But in reality data is being stored and processed on a bunch of different machines. It should work well -- the distributed database system I worked with at Flying Crocodile in the late 1990's was similar, albeit much more sophisticated. Their system scaled to over 400 nodes, with an emphasis on absolute performance. I'll be happy if mine scales to eight or sixteen, with an emphasis on absolute stability and fault-tolerance. Several seconds (up to a minute or so) of latency per high-level transaction is quite acceptable for my application, but it must be robust or I will not deploy it. Silent data loss is intolerable.

UnixAdminBot, WAAG, and QQ

My unixadminbot project is looking good. Necessity is the best impetus for development, and I have been needing unixadminbot to run on some of my systems to handle basic things like reporting the local system's configuration back to a central location, and watching various daemons to make sure they stay up (rsync, ftp, mysql, and http). I have ambitious plans for expanding unixadminbot's functionality, but with great force of self-control I am limiting myself to just its minimum level of functionality until it is stable and complete up to that level. I will release that as its first beta, fork the codebase into stable and development versions, and check bugfixes into both while expanding the functionality of the development version.
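The daemon-watching piece is nothing magical; at its simplest it boils down to a loop like this. The service names, process patterns, and init-script paths are illustrative, assuming a Debian-ish layout, not what unixadminbot ships with.

#!/usr/bin/perl
# Minimal sketch of the daemon-watching piece: check that each service
# has a running process, and try to restart it if not.  Service names,
# process patterns, and init-script paths are illustrative (Debian-ish).
use strict;
use warnings;

my %services = (           # init script => process name to look for
    rsync  => 'rsync',
    mysql  => 'mysqld',
    apache => 'apache',
);

for my $svc (sort keys %services) {
    my $pattern = $services{$svc};
    my @running = grep { /^\Q$pattern\E/ } `ps -e -o comm=`;
    next if @running;
    warn "$svc appears to be down, attempting restart\n";
    system("/etc/init.d/$svc", 'restart') == 0
        or warn "restart of $svc failed: $?\n";
}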

WAAG (WAN-At-A-Glance, the project formerly known as Glance) is one of the projects which is getting folded into unixadminbot. I've rewritten much of the old codebase and written some new code as perl modules, and they will eventually be used by unixadminbot to provide the WAAG system with its periodic reports (quick rehash: the object is to monitor a nontrivial cluster in a manner similar to Nagios, but to make the user interface and underlying architecture scale better than Nagios -- a sysadmin should be able to deploy it across thousands of nodes without taxing the system, and know how healthy his cluster is by simply glancing at a webpage, without having to touch the scrollbar. Right now Nagios is straining to monitor The Archive's data cluster, and the "summarized" table of pending problems reaches several hundred rows in length.)

My "first stab" at writing WAAG has been running on a subset of our data cluster for a few months now, and even though it only has a fraction of the functionality I want it to have, it proves useful every day at exposing problems in the cluster. When a server is having issues, I know it at a glance because everything (not just the summary of problems, but all statuses) fits on the screen, even using a nice fat easy-to-read font. It's not that Nagios is incapable of detecting these same problems, but sometimes it takes a while because it falls behind in scheduling checks, and then the new problems are lost amidst the other problems in its display. With WAAG, new problems jump right out and are noticeable. This means if a server is heading for swap-death, I can catch it in time to ssh in and kill apache, or add another few gigabytes of swapfile, or whatever. This crude implementation has been useful in teaching me how the "real" implementation should be done. Since the configuration and scheduling functions of the WAAG daemon and unixadminbot are so similar, and it is in my interest to have unixadminbot running on The Archive's cluster, I am folding the "real" implementation of WAAG into unixadminbot.

The "qq" project started life as a quick-and-dirty parallel remote execution tool, and various people have come to rely on it. It really needs to be better, though. I've been writing code to make it better, but have run into some of the limitations of perl (especially the "early" versions, if you can call 5.6.1 early), which handles signals and threads ineptly and does not give me all the control I need over memory management. As a result, the "qq2" I've been working on is getting close to having the functionality I need, but is horribly unstable and apt to hog a lot of memory depending on the circumstances of its use. As much as I hate to do this to some of the people suffering under the original qq's limitations, I'm going to have to rewrite the client half of the system in C, as "rr" (Remote Run). I think the server half can remain in perl, as it does not rely on signals nor suffers from memory management issues, so I am rolling that aspect of the project as well into unixadminbot (which is perl). But writing the client part in C will allow me to manage memory directly, use signals reliably, and use standard pthreads with some expectation of good behavior.

Normally I hate gargantuan monolithic do-everything projects, preferring lots of little simple specialized tools, but rolling these projects into unixadminbot just seems like the right thing to do, from both a technical and political perspective. With appropriate and disciplined use of modules, I should be able to keep its complexity manageable and avoid the instability which complexity brings. Also, nobody who runs unixadminbot will be stuck with running all of the services unixadminbot can provide; by default unixadminbot will do nothing but sleep until it sees a configuration file which tells it what services to provide. If people just want it to act as an rr daemon, it will do that. If people just want it to act as a WAAG daemon, it will do that too. If it is only told to report the system's configuration to the mothership on system boot, it will limit itself to only that. But personally and professionally, I intend to use unixadminbot to the limits of its capability to manage, configure, and run hassle-free, high-availability computer clusters. When a cluster "just works", and adding new servers is as easy as plugging them in and turning them on, then the real work can begin. I intend to make unixadminbot handle all that crap so I won't have to. Now if I only had a mechanical robot that could swap out faulty hard drives .. :-)

The DR Codebase Documentation Project

In theory, everything developed here at The Archive is open-source, but in practice the responsibility for packaging and publishing code devolves to individual engineers who have to do it on their own time. A few of the tools I've developed have made their way out to third party users (like qq, dy, sizerate, and doublecheck), and I would like to push out more. There are about a dozen tools in my bin directory which I developed, some of them quite small, which might engender wider interest. Most of them rely on a subset of the perl modules developed by The Archive's Data Repository department (mostly by me and Brad), specifically: DR.pm, DR::IDClient.pm, DR::Poster.pm, and DR::UniversalDB.pm (I might be adding ItemTracker.pm at some point in the future, but not for now). There are several other modules in DR, but I generally do not use them, and I am leaving those to Brad to package and push out if he deems it worthy of his time.

Aside from the issue of documentation (some of these tools are already documented, but others are not at all), there is also the issue of where these perl modules should live. Right now, The Archive's legacy cluster locates them in /ia, the Petabox locates them in /petabox/sw/modules, the systems used by Hardpoint Intelligence locate them in /usr/cluster/modules, and some oddball systems at The Archive which aren't really integrated into any cluster have them stashed away in /root/bin/modules, /home/search/bin/modules, /home/bill/bin/modules, or /home/ttk/bin/modules (depending on which accounts I can access on those systems). Thus, if I want a tool or daemon to "just work" on any of these systems without tweaking the code, I put something like this at the top of the script:

#!/usr/bin/perl
use lib '/home/ttk/bin/modules';
use lib '/home/search/bin/modules';
use lib '/root/bin/modules';
use lib '/home/bill/bin/modules';
use lib '/usr/cluster/modules';
use lib '/petabox/sw/modules';
use lib '/ia';
use DR;
use DR::Poster;

Yuck! Obviously it would be very nice if these could go in some standard place.

After poking around some on CPAN, it looks like the right thing to do is to rename the packages to go into Cluster::InternetArchive::*, and then use the CPAN interactive install tool and/or Makefile.PL to install these modules into their expected place. I do not have administrative control over the Petabox, so that might be an ongoing political issue, but it shouldn't be too hard to get this done on all of the other systems. Then writing a script to work everywhere would be much cleaner:

#!/usr/bin/perl
use Cluster::InternetArchive::DR;
use Cluster::InternetArchive::DR::Poster;

Looking at the perlnewmod man page, turning a perl module into an installable package doesn't look hard at all. I just need to type it up and push it onto SourceForge. Finding the time will be the hardest part (then again, I find the time to write journal entries once a month or so, and this shouldn't be much harder).
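For my own reference, the Makefile.PL really is only a few lines of ExtUtils::MakeMaker boilerplate. The module name, version source, and prerequisites below are tentative guesses, not settled yet.

# Makefile.PL -- minimal ExtUtils::MakeMaker boilerplate; the module
# name, version source, and prerequisites below are tentative guesses.
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME         => 'Cluster::InternetArchive::DR',
    VERSION_FROM => 'lib/Cluster/InternetArchive/DR.pm',
    PREREQ_PM    => { 'DBI' => 0 },
    ABSTRACT     => 'Utility modules for the Internet Archive data repository',
);

Then it's just "perl Makefile.PL && make && make test && make install" on each system that needs the modules.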

-- TTK

User Journal

Journal Journal: Books, Books, Books, and the Law

Books, Books, Books, and the Law

Disclaimers up front: I am not a lawyer, and nothing said here should be construed as any sort of official position of The Internet Archive (of which I am an employee, and which is the Appellant in the case discussed below). These are purely my own personal, nonprofessional, unofficial opinions.

I've been staying up late tonight reading the Government Opposition's Brief and Appellant's Reply regarding the latest appeal of the federal case of Kahle vs Gonzales (was Kahle vs Ashcroft), and I have to admit that what I'm reading doesn't look very good.

If I understand the legalese correctly, at issue is whether Congress' automatic extension of the time period during which authors' works are protected under copyright law is subject to "First Amendment review". If it is, and if a review takes place, and if as a result of that review Congress' extensions are found unconstitutional, the effect would be a great many (though not all!) books and other copyrighted works more than 28 years old reverting to the public domain, specifically those works whose authors have not performed (and continue to not perform) the formal copyright renewal process described by the unrevised copyright law. These public domain works could then be made globally available at virtually no cost.

Both sides make some good arguments (mixed with some rather disingenuous arguments) on a variety of ancillary issues, but it seems to me that the Appellant's position suffers from a critical flaw. The Opposition makes a persuasive argument that authors are entitled to have their works protected under copyright for both the initial 28-year time period and the post-renewal time period (which Congress has upped to 47 years), and that the renewal process is merely a formality which makes life easier for the government, and needlessly harder for the author. If this is accepted, then it is reasonable to conclude that the elimination of the requirement to perform the renewal does not constitute a change in the "traditional contours of copyright protection", and therefore does not warrant a First Amendment review. The Appellant's attempted rebuttal of the Opposition's argument is spirited, but in my opinion does not demonstrate any fatal flaw in the Opposition's reasoning. Let's hope that the court disagrees with me :-) and finds the Appellant's arguments more persuasive. Any step towards repealing Congress' extravagant inflations of copyright protection periods (no doubt bought by "donations" from the RIAA and MPAA, who stand to gain millions of dollars each year from continued copyright extensions) would be a good thing.

You may wonder why I care so much that it's keeping me awake on a Friday night, and that is actually the subject I wanted to write about here. All of the above was just laying out the background.

I hold books in very high regard. Books are the classic means of describing knowledge in a systematic way, in depth, and organized in such a way as to facilitate learning. Brad, one of my co-workers (and the original and most senior "Data Archivist" at The Internet Archive), claims that the web is more useful than books, on the assumption that (if I may paraphrase) any worthwhile information which exists in any book is also out there on the web, somewhere, and in a searchable and hyperlinked format. But until someone invents a wonderful, magical search engine which finds all of these bits of information and organizes them in a way which facilitates learning in depth and without significant gaps of knowledge (ie, like a textbook), I cannot agree with him on the initial claim, nor do I agree with his base assumption. Someone who seeks to teach themselves a new field of engineering or science will gain a higher quality of education, and faster, by reading books on the subject than they could by googling the web (with the possible exceptions of computer science and computer engineering, which enjoy an exalted status on the world wide web for obvious reasons). I can say this with some confidence because I took it upon myself several years ago to learn materials engineering. Being a cheapskate homebody, I used online sources of information as much as I could to further my knowledge and understanding of the field, and I did find some very useful material properties references online, but I only experienced high-quality learning when I sat down with a book and read it cover to cover. Every such book revealed facts to me which I do not believe were on the world wide web (or, if they were, I was never going to find them).

Inasmuch as medicine, science, engineering, and agriculture are the key fields which preserve humankind and promote its welfare, books which teach medicine, science, engineering, and agriculture are the keys to a better future for the entire world. If more books become freely available (and accessible -- which is a central tenet of The Archive), more minds can be educated by their contents, and as a consequence more epidemics may be prevented, more bridges and dams built, more renewable crops planted and harvested, and so on.

That having been said, as an anarcho-capitalist consequentialist libertarian I am not unsympathetic to the rights of authors to protect their intellectual property. Realistically, though, 28 years is a long time for an author to gather the proceeds from the sales of their efforts. In the modern market, profits from the sale of intellectual goods tend to be at their maximum soon after the release of those goods into the market, and taper off rapidly after a few years. Any artist, author, or researcher worth their salt had better have come out with a new product within 28 years of releasing their last product, or they deserve to starve. And if it really is very important to them to hold onto that copyright protection, they can file a request with the copyright office and have it extended for another 47 years. Kahle vs Gonzales does not seek to change that.

To me, Kahle vs Gonzales is, in a sense, the most interesting effort in which The Archive is involved. This is because it is an attempt to change the way our government operates -- something that I cannot do myself. Everything else The Archive does (collect, digitize, and archive information, and make it available), I could do myself in some scaled-down sense. In theory, whenever The Archive spends $100,000 to do good things, I could personally spend $1000 and do 1% as much good (not exactly, but you get my drift). But that is not true of suing the federal government to change its practices. I cannot do that. I cannot even do 1%, or 0.1%, or 0.001% of that on my own. It takes The Archive (and its esteemed partners, including Prelinger and the EFF) to accomplish such a magnificent feat. If they succeed in making their case, then in my eyes The Archive will have justified its existence. Until then, it is merely doing, as a single entity of a few dozen people, what a few dozen people could do just as well working as independent individuals. It is Kahle vs Gonzales, and the lawsuits yet to come, which distinguish the capabilities of the greater Archive community from the capabilities of its constituents. I feel awed and privileged to be near (and, in a very indirect way, an active part of) the source of that effort.

-- TTK

User Journal

Journal Journal: Some Dental Work Today .. Ow

Teeth Out!

I just got back from having my two back lower molars pulled out. The empty sockets are starting to get sore. Ow.

I refrained from general anesthetic so that I would be aware throughout the process. I was curious about the methods and procedure the doctor would use.

It started with a swab of local anesthetic around the backs of both sides of my lower jaw, which numbed the area slightly. They followed up with a huge, long needle of lidocaine deep into my jaw. He stuck me with it twice on each side of the jaw, then we waited for it to take effect. The numbness spread across my jaw and face, but was noticeably less intense on my right side, so when the left side got "fat" and the right side didn't, he stuck me on that side again with another dose of lidocaine. When the numbness spread a bit more, he went in.

The doctor gave me a rubber thing to bite down on with the left side of my jaw, and he went in to take out the right tooth. I thought it was odd that he started there; why not start with the more-numbed left side and let the numbness spread on the right side in the meantime? But no matter. It worked out fairly well.

He scraped and pulled at the outside of the first tooth, trying to see if it would come apart. Then he reached in with a grabby-thing and pulled a little, but released it almost immediately and pulled out a router-like thing instead (like a drill, but for cutting rather than drilling) and cut into it, but not all the way through it. Then he reached in with the grabby-thing again and pulled it around some more. This time I experienced an odd, familiar kind of pain, down in my jawbone. I recognized it from the times I broke a bone (in my face, my arm, and my feet, so far), that deep-down sick kind of pain. I made a noise because it was unexpected, but when he asked me if I was okay, I said yes (or "uh-huh" as the case may be) because despite being unexpected, it was not that much pain. I get worse hitting my thumb with a hammer. So he continued to pull at it, and then got his other hand in with a pen-like instrument to get under the tooth. Gradually he worked it out, quite pleased to have gotten it all out in one piece.

As he removed the rubber thing and replaced it on my jaw's right side, I reviewed what had just happened. It was unpleasant, but not all that painful. I psyched myself up for the other extraction, anticipating more of the same deep-down-jawbone pain. Often, if I anticipate a pain being worse than it's going to be, the subjective experience of it is much less intense than it would have been had I not. So again he cleaned it and pulled at it, then reached in with the grabby-thing, and stopped and went in with the router, but when he started cutting, I felt something very unexpected -- a sharp, intense shooting pain, down near the tooth, moving in a line perpendicular to my jaw. He saw my expression, pulled out, and asked me what was up. I did my best to mime what was going on, but the closest they could guess was that I was experiencing pain when he touched me with the router, which I confirmed. So he took a different kind of needle (short and relatively obtuse) and did something to that side of my mouth. He explained what he was doing to his assistant, but I didn't quite catch it. Some other kind of local anesthetic, I suppose. Whatever it was, it did the trick, and did it quickly. He extracted the tooth fairly quickly, without further pain of any kind, not even a hint of the jawbone pain.

I'd requested to keep the teeth, and he acquiesced, much to my surprise. They don't generally do that anymore, but I figured it wouldn't hurt to ask. He had to sterilize them first, he explained, because it wasn't permitted to carry potentially infectious material out of the facility, but after a short bleach bath I got them in a little bag. Woot!

I'm icing my jaw now, and the painkiller is wearing off. I have 600mg ibuprofen tablets, but I don't anticipate they will do much. I'm going to be in considerable pain for a while -- several days at least, if not a couple of weeks. But as long as I don't get a "dry socket" (lose the blood clot in the empty socket) it should heal up just fine in the next two weeks.

At least I got a lot of hard stuff done at work before this point .. I'll probably be short of temper and somewhat distracted for a while. Well, I'll see what I can do despite it.

-- TTK

User Journal

Journal Journal: Grim Grumbles on Goods, Glance, Goons, and Goals

Whew, it's been a while since I wrote in here, eh? Well, life's been pretty full.

At The Internet Archive, I've been trying to get my latest grand project, an "Item Tracking System", off the ground and debugged for real-life deployment. It's been a difficult one. The basic idea itself is simple and fairly straightforward, but as with any project at The Archive, the quality of sheer quantity casts its own peculiar shadow over matters, demanding peculiar approaches to the design of the system.

At The Archive, we have two kinds of archives: Web, and Collections. The Web archive consists of thousands of .ARC, .DAT, and .CDX formatted files spread across several hundred nodes in the data clusters. Every two months or so, another 50 TB of (compressed) files are added to the collection, but otherwise they remain fairly static. The Web archives are Brad Tofel's specialty, and he is the de facto top authority on the technology used at The Archive to collect, analyze, and manipulate them.

The Collections archives are a bit less uniform. In fact, they're downright chaotic. They consist of about 120,000 "items" of various types, where an "item" is defined as a globally unique name, a directory referenced by that name, and a bunch of data files in that directory. The items are contributed to The Archive by our partner institutions and by the artists who created the content (and therefore have the right to make that content available via The Archive). There have been some attempts lately to define an "item" more rigorously than that, adding well-formed metadata to each item in the form of .xml files in the item directory, but all in all the item format has been highly fluid and changes rapidly. The data content also changes, since the owners of these items sometimes update them with more information, or with newly reformatted versions. Some items are music, others are "texts" (scanned or transcribed books), while others are movies, software, radio broadcasts, etc. They are spread across three data clusters, which keep each other in sync via an OAI feed which seldom works.
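
For a rough idea of what one of these looks like on disk, here's a made-up example (the item name, layout, and metadata file conventions shown are illustrative only, not a spec):

SomeLiveConcert2005/
    SomeLiveConcert2005_meta.xml     (title, creator, collection, etc.)
    SomeLiveConcert2005_files.xml    (per-file checksums and formats)
    track01.flac
    track01.mp3
    track02.flac
    track02.mp3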

As the Data Repository department's designated programmer, and the technical lead of our QA efforts, I've had to deal a lot with the Collections archive, and am close to the problems we run into trying to keep the various instances of items in sync, moving them between servers (which we should never do, but do all the time anyway), handling item failure modes, etc. I became increasingly frustrated by the lack of a central database which enumerates all items, and got tired of compiling cluster-wide manifests by hand when cluster contents needed to be compared, so I made up a list of all the problems we deal with every day which such a database would solve and used it as my justification for building an Item Tracking System.

Tracey is officially the only programmer in charge of developing "infrastructure" for the Collections archive (Brewster believes in keeping a logically unified system by keeping it all in one person's head, which is "different", but totally his call to make), so this Item Tracking System is necessarily something of a skunkworks project. It will never be considered an authoritative source of information, but it should be useful to many people within the organization anyway, and can also be used to sanity-check some of the other (very buggy) systems we use to track and manipulate Collection items.

Anyway, the Item-Tracking System is very close to being finished. I am mostly squishing bugs at this point, and finding ways to make it hammer the database less severely. It uses a schema which is exquisitely suited to a distributed design, with data columns duplicated across many servers and data rows distributed between them, and I've designed such a system before, but for now I'm keeping it very simple in the interests of expediting development and deployment, and using just one database on a central server. Implementation improvements can come later. Projects at The Archive either get deployed very quickly, or they die, and I do not want this project to die. It promises to be too useful for too many people (including myself). I have deployed it across parts of our infrastructure a few times now, leaving it on for a while to observe bugs and then shutting it down to rewrite things, but I hope to turn it on and leave it on sometime next week. The longer it runs, the more changes it can catch, which will hopefully give us more insight into some of the unsolved bugs in our data cluster infrastructure. Also, there are people in the organization who want to use it for their own purposes, but are holding off until it is fully deployed and stable.

A few interesting bits of code have come out of this project which have wider application. One is an "idserver", which is a better-performing tool for acquiring unique identifiers for globally-scoped labels in a distributed environment. One conventional way to "atomically" acquire a globally unique identifier is to perform an SQL "insert" to a table of labels, let the database system allocate an identifier to it, and then "select" the inserted row back to the remote system so it can find out what the identifier is. It is important that assigning the identifier be atomic, else race conditions could result in two "versions" of the unique label having two different identifiers. The SQL insert/select method achieves this, but the overhead cost is fairly high -- two database transactions per label. The Item Tracking System needs to manage millions of unique labels (one for each item, and one for each file in each item), and when I started it up for the first time, the database was completely deluged with unique-id transactions.

The "idserver" is a more optimized way of performing the same role. It achieves atomicity of operation by being single-threaded, running accept() on one network connection at a time, assigning a new id if none exists, and responding with a tuple of the form [isnew, id, label], where "isnew" is 0 if the label already had a unique id or 1 if the idserver assigned an id during this transaction, "id" is a numeric id, and "label" is the label identified thereby. By keeping it brutally single-threaded, race conditions are avoided and it was fairly easy to write and debug (thanks, btw, to Brad Tofel for the idea, and to Odo for optimization tips). The idserver also achieves better performance by accepting a list of labels per transaction, rather than being limited to one label-id assignment per transaction. This allowed me to batch up my labels (perhaps 200 filepaths for some item), establish the TCP connection, send them over, and get back a list showing all of their id's and which ones were new (so that the client might choose, for instance, between using an "update" or an "insert" to store newly discovered information about an item or a file). With the SQL-based solution, this would have required 400 separate transactions (though only one TCP connection).

If necessary, I can also get an N-order reduction in idserver load by running N idservers on N different machines, with each idserver being responsible for a different subset of all possible labels (perhaps taking the md5 checksum of a label modulo N, with the remainder being the index of the authoritative idserver). Each idserver could assign identifiers from different ranges of numeric values (2**64)/N apart, and thus maintain global uniqueness of id's without need for communication between idservers.
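
For the curious, here's a bare-bones sketch of the core idea (not the real idserver -- no persistence, no sharding, and the port number and in-memory storage are made up for illustration):

#!/usr/bin/perl
# A heavily simplified sketch of the idserver idea described above -- not
# the real code.  The port and the in-memory hash are stand-ins only.
use strict;
use IO::Socket::INET;

my %id_of;        # label => numeric id (the real thing persists this)
my $next_id = 1;

my $listener = IO::Socket::INET->new(
    LocalPort => 7979,    # arbitrary port for this sketch
    Listen    => 10,
    Reuse     => 1,
) or die "cannot listen: $!";

# Brutally single-threaded: handling one connection at a time makes
# label-to-id assignment naturally atomic, with no locking needed.
while (my $client = $listener->accept()) {
    while (my $line = <$client>) {
        chomp $line;
        last if $line eq '';                  # blank line ends the batch
        my $isnew = exists $id_of{$line} ? 0 : 1;
        $id_of{$line} = $next_id++ if $isnew;
        print $client "$isnew $id_of{$line} $line\n";
    }
    close $client;
}

A client connects, writes one label per line, ends the batch with a blank line, and reads back one "isnew id label" tuple per label.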

I have some other projects which could take advantage of idserver. It will be a pleasure to drop it in place and see improvements in performance. I'll be open-sourcing idserver soon; I've been organizing a bunch of my tools into a cohesive collection, categorizing them, and writing documentation for them. They'll be appearing on SourceForge when I'm ready, and idserver will be among them.

Another interesting technology is a "universal database" interface, which I developed in reaction to the annoying way perl's DBI module for interfacing with different database systems manages its system-specific components. The idea behind DBI is beautiful -- it provides a simple API for interfacing with SQL-based databases in general. If you write your code using DBI (and if you only use generic SQL), you don't need to care which database system is being used: MySQL, mSQL, PostgreSQL, Oracle, whatever. You can switch the database your servers run from MySQL to Oracle, and your perl will still "just work" by changing only the data source ("DSN") string passed to DBI's connect() function (which can trivially be made config-file-driven). DBI handles the different client protocols these databases use by allowing for many DBD modules, where each DBD is compiled with the client-side libraries of the target database system, and then provides DBI with a uniform interface. There is a DBD::mysql, a DBD::Pg, a DBD::Oracle, and so on. It's lovely.
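
For illustration, here's roughly what that looks like in practice; the host, database, table, and credentials below are made up, and only the DSN string would need to change to point the same code at a different database engine:

#!/usr/bin/perl
# Sketch of database-agnostic DBI usage; swap "dbi:mysql:..." for
# "dbi:Pg:..." (or read the DSN from a config file) and the rest of the
# code stays the same.  All names here are hypothetical.
use strict;
use DBI;

my $dsn = 'dbi:mysql:database=tracker;host=db.example.org';
my $dbh = DBI->connect($dsn, 'ttk', 'secret', { RaiseError => 1, AutoCommit => 1 })
    or die "connect failed: $DBI::errstr";

my $rows = $dbh->selectall_arrayref('SELECT name, server FROM item LIMIT 5');
print "$_->[0] lives on $_->[1]\n" for @$rows;

$dbh->disconnect();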

Unfortunately, it also falls apart when an organization has several different (and incompatible) versions of the same database systems running on different servers. The Archive currently uses at least three different versions of MySQL and two different versions of PostgreSQL for various things, so different versions of client-side libraries are installed on different nodes, and my Item Tracking System needs to be able to run everywhere. These different databases are "owned" and maintained by different people, and it would be something of a major political effort to get everyone to change to using a single version of MySQL and a single version of PostgreSQL (even if that were advisable; I'm personally against forcing people into homogeneity, preferring to see diverse solutions to similar problems, and it would be a waste of everyone's time and effort, and would possibly disrupt existing services).

So suddenly using DBI to interface with these different systems gets a bit more complex. If it were just a matter of making all of the Item Tracker's client-side daemons able to communicate with the one central database I'm currently using, it wouldn't be that bad. I'd "just" have to compile my chosen version of MySQL on all the different platforms in use, compile a special version of the DBD::mysql module, put it in its own special place, and use perl's "use lib" directive to make DBI ignore the system's usual DBD::mysql and use the special one. But it's more complex than that, because I designed the system with an eye towards distributing it into a hierarchy of databases (so that the European data cluster could write to a local database, which gets synchronized with the one in America periodically, rather than having to conduct every little transaction across the ocean), and because I want to be able to switch databases as needed without having to rebuild all of the client-side software everywhere. My initial deployment of the System ran on MySQL 4.0.13, which proved unstable. I'm currently giving MySQL 4.1.12 a whirl, and it seems stable so far, but I don't fully trust it yet. I anticipate trying MySQL 3.23.41 (which I *know* is stable, but doesn't support the full SQL feature set) or PostgreSQL 7.4.7, perhaps not in that order. I also anticipate running into this problem again.

Rather than trying to figure out a way to contort DBI to use the right DBD's depending on the database version, I wrote my own UniversalDB module which obviates the problem by using its own simple TCP-based protocol to communicate with a "universaldb" process which runs on the same server as the database system, and uses DBI with the local version of the DBD's to communicate with that system. Right now UniversalDB exports its own DBI-like API, but when I have some breathing space I want to rewrite it as a DBD so it can plug into DBI itself rather than working around it. Fortunately, I wrote the Item Tracking System from the beginning using my own do-everything-right function, dbdo(), which wraps the necessary DBI calls and adds some other features, like debug logging and reconnecting when the TCP connection unexpectedly closes. All I had to do was insert a check at the top of dbdo(): if the $UDB flag (essentially a global variable) is set, it passes the SQL query to UniversalDB's do() function; if the $UDB flag is not set, it runs the pre-existing DBI-using code.
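
Conceptually, the dispatch looks something like this (a sketch only -- the real dbdo() and the UniversalDB API are my own internal code, so the names and signatures here are simplified stand-ins):

# Conceptual sketch of the dbdo() dispatch described above.  The real
# function does much more (logging, reconnects, error handling); $udb
# and $dbh are assumed to have been set up elsewhere at program start.
our ($UDB, $udb, $dbh);

sub dbdo {
    my ($sql, @bind_values) = @_;
    if ($UDB) {
        # Route the query through the UniversalDB client, which relays it
        # over TCP to the universaldb process next to the database.
        return $udb->do($sql, @bind_values);
    }
    # Otherwise use plain DBI against whatever handle we already have.
    my $sth = $dbh->prepare($sql);
    $sth->execute(@bind_values);
    return $sth;
}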

I have great plans for expanding UniversalDB's functionality eventually, but right now it is enough that I can simply export UniversalDB.pm to the data clusters, run "universaldb" on the database servers, and my code will "just work" everywhere. One of the things I want to do is give UniversalDB some config-file-driven logic for recognizing certain database names as distributed databases, using simple rules for factoring data rows between servers and fanning SQL queries out over multiple TCP connections. Then the client-side code need not even know whether the database it is using is a conventional single-system database, or a distributed system. It will all "just work". I love systems that "just work".

I'm eager to get the Item Tracker behind me, though, not only because I want to use it to squish some persistently annoying problems, but because I want to finish off a couple other projects before they get stale. One such project is Glance, which I've talked about before. It's about 80% ready for the big time, but for the last few months I've just been running an older version of it on a fraction of The Archive's data cluster. Still, even that older version of the code (my new code is unstable) has been useful for detecting and diagnosing a variety of problems. Another project is qq, my remote-parallel-command execution tool, which has been increasingly used and relied on for petabox-related tasks, not only at The Archive but also at Capricorn Technologies, which sells the hardware that the Petabox runs on. But qq suffers from serious shortcomings, so I've been recoding it to use a UDP-based protocol and borrow a few tricks from gexec to make it better. Also, I promised the VP of Operations at The Archive to write a "datacenter control panel" a long time ago. I've been keeping notes about it, and it's always been in my head, so I'm figuring out its organization, underlying data structures, and algorithms, but I've written relatively little code so far. I intend to make good on my promise, it will just take a little time.

Hrm .. there were lots of other topics I wanted to touch on (I've been neglecting this journal considerably) but that's enough for tonight. More on "Goods, Glance, Goons, and Goals" later.

BTW, cobalt's doing much better.

-- TTK

User Journal

Journal Journal: Bad Things Happening to a Good Person

Bad Things

Bad things have come down on the Ciar household since my last journal entry. My wife's actually kept a pretty good accounting of it on her Xanga blog, starting here. The short of it is, she changed medications for treating her bipolar disorder, and the new meds had the opposite of the desired effect. Things were very hard on both of us for a few months, but she's doing a lot better now. In fact, this past weekend we built an awning together for the back porch, to keep the sun and rain off. Yay industriousness!

Sharp Plastic!

I was cutting some polycarbonate into strips with a hacksaw, to glue to the runners in my dresser drawers. I had formerly used acrylic for this sort of thing, because the strength and resilience of polycarb wasn't needed, but cutting the acrylic is a pain (tends to shatter) and I have a pile of polycarb scrap I picked up some months ago for really cheap. Polycarb cuts much more nicely than acrylic.

Well, after I cut off a strip, I ran my thumb over its edge to see if I needed to deburr it, and SCHNICK the edge cut right through the skin of my thumb! I just stood there watching it dribble blood for a while, I was so shocked. Polycarbonate holds a mean edge!

Improved Aluminum

Some bright boys working for the Navy have come up with a way to drastically improve the strength of aluminum alloy 5083. This is really cool. They start by cryogenically grinding some aluminum to a desirable granularity, and then use it to "seed" a larger volume of aluminum before it is cooled, and the presence of these grains encourages the formation of similar-sized grains throughout the metal volume. The article says that the end product is 150% as strong as unmodified aluminum 5083 (though their wording is vague, and they might mean strength is increased by 150%, but I doubt that). Assuming they mean something close to aluminum 5083-H112, one of the less strong forms of AL 5083, this would mean their aluminum had an ultimate tensile strength of about 65,250 psi -- only somewhat lower than mild steel (at 85,000 psi) and less than a third the density of mild steel. Not bad at all!
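
(For the record, the arithmetic behind that figure: taking roughly 43,500 psi as the ultimate tensile strength of 5083-H112, 43,500 psi x 1.5 = 65,250 psi.)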

Depending on how inexpensive this process can be made, the potential for this material is boundless. In particular, it might be used as a drop-in replacement for parts made of AL7000 series aluminum alloys. The article mentions using it to replace some titanium parts on the space shuttle, which is great because titanium is *very* expensive (and apparently becomes "brittle after repeated exposures to liquid hydrogen fuel burns", according to the article).

This is exciting stuff! It will result in some expensive things coming down in price, and some inexpensive things getting stronger and/or lighter. We truly live in the age of nifty toys.

I Wrote Something Nifty

I wrote a research tool which analyzes documents returned by Google and tries to pick out simple statements of fact. It is similar to Googlism, but IMO returns more interesting results. It is written in perl, and the code can be found in my codecloset, as can a variety of examples of returned results.

The parsing is extremely simple, to the point of being simplistic, but I am surprised and delighted with the quality of data it produces. I'll keep working on it to recognize a wider variety of statements of fact.
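
To give a flavor of just how simple the parsing is, the heart of it amounts to something like the following toy sketch (not the actual code in my codecloset; pipe fetched page text in on stdin):

#!/usr/bin/perl
# A toy sketch of the kind of simplistic fact extraction described above;
# not the real tool.  Feed it fetched web-page text on stdin.
use strict;

my $subject = shift @ARGV || 'polycarbonate';   # thing to collect facts about

while (my $line = <STDIN>) {
    # Grab "<subject> is/are/was/were ..." up to the next sentence-ending
    # punctuation, and call it a "statement of fact".
    while ($line =~ /(\b\Q$subject\E\s+(?:is|are|was|were)\b[^.!?]{3,120}[.!?])/gi) {
        print "$1\n";
    }
}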

Glance: Developing a Better Nagios

At The Internet Archive, we use Nagios to monitor our data clusters, and detect problems as they arise. Nagios is a superb and versatile monitoring tool, and we couldn't manage our clusters without it, but I've long gotten the distinct impression it wasn't intended for use with clusters as large as the ones we have here (1000+ hosts per cluster), especially with as many checks per host as we are running (about 8 checks per host, on average).

Right now, for instance, there are 114 "warning" and 253 "critical" trouble lights in our sf cluster, which is pretty typical. Viewing all of those trouble lights involves scanning a 367-row table, which makes it pretty hard to tell when some new trouble has come up, or if there is a pattern to a recent spate of problems. There are ways of dividing the hostlist according to category, but grouping them in ways which are useful to the different departments at The Archive (ie, so that the Crawler team can bring up a page of just the hosts they care about) is awkward.

On top of that, Nagios is a resource hog, even when deployed in a distributed configuration. It typically eats all available CPU on the Nagios-dedicated servers, and sometimes lags hours behind its schedule of remote checks to perform. Bringing up a report page can take minutes. It makes for very painful operation, sometimes, especially when the operations team is trying to work on an urgent problem.

When the director of operations asked me to develop a simple dashboard-type monitoring tool which would allow the different departments (especially operations) to tell at a glance what was wrong and where, I leapt at the chance. Its role at The Archive is to complement Nagios, not replace it, though it could be used to replace Nagios altogether. I called the project "Glance", so that I could keep the project's central theme in mind.

Not too surprisingly, it looks a lot like Nagios. I've compressed the table format a lot, so that check results appear in columns next to the hostname, instead of appearing in rows below the hostname, and I've kept the Nagios colors and semantics of those colors. This should make life easier on our sysadmins, who are already accustomed to looking at Nagios and knowing that yellow == bad, red == very bad, green == ok. There are still checks which get run periodically for each remote host, the results from which are used to determine the health status of the host, and I even made Glance compatible with Nagios plug-ins, so that existing Nagios checks could be used under Glance.

The main difference is in the way the checks are scheduled and issued. Nagios schedules things on the central Nagios server(s), and runs remote checks by making NRPE or SSH connections to the remote host, running the check there, and reading the results back into the Nagios server's database. Glance runs a "glanced" daemon on each monitored host, and it is this daemon's responsibility to schedule and run checks on the local host. The results are periodically injected to one or more central database servers (it actually currently uses Databroker, which I've talked about before in this journal). If an update fails because of a temporary network outage or because a database server is overloaded, it's no big deal because the data still exists in the "glanced"'s internal data structures, and glanced can try again with a different database server or just try again later. This daemon can read its configuration from either local files, or from the database server (for easy centralized administration), or both. The configuration format looks a lot more like cfengine's than Nagios'. The user interface (glance.cgi) just looks up database entries and makes up the html tables from them, and need not be running on the same server as the database(s).
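
The per-host logic really is about this simple in concept. Here's a bare-bones sketch of the idea (the plugin commands, thresholds, and result "injection" are hypothetical placeholders -- the real glanced handles per-check scheduling, retries, and the actual database injection):

#!/usr/bin/perl
# A stripped-down sketch of the "glanced" idea described above: run
# Nagios-style plugin checks locally and push the results to a central
# store.  Plugin paths and post_result() are stand-ins, not real glanced code.
use strict;

my %checks = (
    disk => '/usr/lib/nagios/plugins/check_disk -w 10% -c 5% -p /',
    load => '/usr/lib/nagios/plugins/check_load -w 8,8,8 -c 16,16,16',
);
my @status = qw(OK WARNING CRITICAL UNKNOWN);   # standard Nagios plugin exit codes

while (1) {
    for my $name (sort keys %checks) {
        my $output = `$checks{$name} 2>&1`;     # run the plugin, capture its output
        my $code   = $? >> 8;                   # its exit code is the health status
        $code = 3 if $code > 3;
        post_result($name, $status[$code], $output);
    }
    sleep 300;                                  # and do it all again in five minutes
}

sub post_result {
    my ($check, $state, $text) = @_;
    # The real thing injects this into a central database (retrying later if
    # the injection fails); printing keeps the sketch self-contained.
    chomp $text;
    print scalar(localtime) . " $check $state $text\n";
}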

An additional function of "glanced" is to run a couple of subprocesses ("top" and "vmstat" at the moment) from which facts about the system's state can be derived, which cannot be as easily derived by periodic spot-checks. "Glanced" periodically summarizes its analysis of these subprocess' outputs, and stores the data as if they were from ordinary checks, and injects them to the database along with the other check data. Also, "glanced" occasionally injects the dmesg log to the database, so that a host's last known state can be viewed via glance.cgi.

This all makes Glance more robust, better-performing, and less resource-hungry than Nagios, but there are some things it cannot do as well as Nagios. It lacks any sort of notification functionality, for one, so it cannot email a sysadmin's beeper when the mail server goes down (or similar). It relies entirely on a human occasionally eyeballing the web page to get its data out to the users. It also has to rely on Nagios-style centralized scheduling + checking of remote resources which are not capable of running "glanced", and this centralized logic is not as sophisticated as Nagios' (I've been putting my energies into making "glanced" work as well as possible). So if your primary concern is monitoring a thousand PBX modules, or ATM switches, Nagios is still the better choice. Glance is primarily for monitoring full-fledged computers (servers or workstations). Also, initial Glance development is being performed on Linux hosts. Even though I habitually write pretty portable code (and perl makes that easy), I am only aiming for cross-platform compatibility for Linux, FreeBSD, Solaris, and MacOSX. So if I make assumptions which break Glance for, say, OpenServer, then again you're better off using Nagios. The Linux distributions it will be tested on most are RedHat (legacy sf Archive cluster), Debian (new Petabox Archive clusters), and Slackware (my preferred platform).

I am in the middle of rewriting large sections of Glance right now, so even though it is currently in limited deployment at The Archive, I'm not going to post its source publicly just yet. The currently deployed version has major issues because I tried to make the scheduling too simple. Also, I hard-coded too many things which need to be configurable, and the plugin handling needs serious work before it's really useful. When I deploy the next iteration of Glance at The Archive, I will also make the code available under the GNU General Public License. (Ditto for Databroker, which also needs a little work -- it will be made publicly available under the GPL too.)

User Journal

Journal Journal: Insomniac Wife Falls Down

Cobalt Fall Down, Go Boom (but not crunch!)

So, I woke up around 6:45am today to the sound of my wife yelling for help in the back yard. She couldn't sleep last night, and spent the night and early morning working around the house and with her critters. She fell from our elevated back porch, twisted her left ankle very badly, and hit her head on the (thankfully soft) ground. We took her to the hospital, where she was (eventually) x-rayed, and it was determined that her ankle was not broken, merely badly sprained. After that we saw her doctor for further examination, diagnosis, and instruction.

She has a splint on her ankle now (looks like a combo of polypropylene for stiffness, and nylon to hold it all together), but she still really can't put any weight on the foot. They also gave her a pair of crutches, but our house is so cramped that she prefers to use her old wooden cane which she used right after her brain surgery. She's supposed to keep that foot elevated and iced, and of course most of our slushy-icepacks are no longer at home. One is at our friends' house in Milpitas, and another is at my work's office, and the one we have here is kind of lame. I made up another one (one part rubbing alcohol to four parts water in a ziploc freezer bag), and I'll make some more later. Two will do okay for now. Knowing her, she'll give up trying to ice it pretty soon anyway -- her entire body gets chilled from them.

She still wants to be up and do things, of course. She's quite stubborn. I love her, though, and I should be thankful that she's so stubborn -- anyone who wasn't, wouldn't have put up with me for eight years. :-)

Flip-Flopping on Electric Drive

The three or four people who read this journal ;-) may remember a previous entry where I threw in the towel and decided to go with a hybrid electric drive for my model vehicle, because I could not build a functional mechanical transmission. Well, I went through HSC's supply of electric motors, and I am thinking again.

I wrote down the specs and dimensions of the three most powerful motor models they had in stock, and they are all weak, bulky, and heavy. When I normalize their price by power, they come to $300, $300, and $570 per horsepower. It would take six, ten, and twelve of them respectively to aggregate a mere 1/4, 1/10, and 1/6 horsepower.

This is totally unacceptable.

I also ran across something else which I hadn't seen until this trip: a box of small electric clutches. They aren't much, no gears or anything, just simple solenoid-driven contact clutches, but they do open up several possibilities for me to build a simple mechanical transmission (one which might actually work, for a change).

I'm thinking right now of simply using one clutch to engage / disengage the combustion engine from the drive shaft, making it a "one gear" mechanical transmission (I need to find out how much mechanical power these clutches can conduct), and using another electric clutch to engage / disengage a single electric motor from the drive shaft too. This would enable me to "fake" having a (very slow) reverse gear: the engine's clutch would disengage, the electric motor's clutch would engage, and then the motor would drive the drive shaft backwards with 1/72 horsepower, geared way way back. With this arrangement I'll have to go with swing steering instead of skid steering (car-like, not tank-like), but that's not a really big deal. So it'll be a half-track.

All I need to do now is find the time to sketch it out and try to build it.

-- TTK
