codegen's Journal: The long saga of using Linux to teach an OS Course Part II

Journal by codegen

Fast forward to the next summer. The rest of the class has gone reasonably well, with only small problems: the students were learning C during the class, and one TA and one prof are not enough support for 80 students working on a lab.

Now it is time to try to track down the network gremlin that is trashing the virtual disk files. To recall the setup, we have 20 PCs running Windows 2000 Professional, connected by Windows file sharing to a server running Windows 2000 Server. Each account on the server has a copy of the Virtual PC hard disk with Red Hat 6.2 installed on it. Each client machine is running Virtual PC and using that file as a virtual disk. When 20 virtual machines are started off a single server at once, some of the virtual disk files (two on average) become badly corrupted. For example, some of the contents of a configuration file from /etc are found in the middle of the /bin/sh binary file.

I write a small C program that runs under Windows. It just opens a 500 MB file and does random seeks, each followed by a read. This is meant as a rough simulation of the behaviour of the virtual machine as it accesses various files on the virtual disk. The file consists of consecutive binary integers (4 bytes each), so the value read back at any position can be checked against that position, and we can tell whether the seek and read succeeded. No writes are done, so the file cannot become corrupted. The program also prints out status continuously as it operates.
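A test program of this kind might look roughly like the sketch below. It is not the original program: the function names `make_test_file` and `probe_file`, the seed parameter, and the `SEEKTEST_STANDALONE` build flag are all assumptions made for illustration. A 500 MB file of 4-byte integers would hold 500 * 1024 * 1024 / 4 = 131,072,000 values, so the sketch combines two `rand()` calls to cover that range.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Write `count` consecutive 32-bit integers (0, 1, 2, ...) to `path`,
   so the value stored at byte offset 4*i is i. Returns 0 on success. */
int make_test_file(const char *path, uint32_t count)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    for (uint32_t i = 0; i < count; i++) {
        if (fwrite(&i, sizeof i, 1, f) != 1) {
            fclose(f);
            return -1;
        }
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Pick an index in [0, count). Two rand() calls are combined because
   RAND_MAX may be as small as 32767. */
static uint32_t random_index(uint32_t count)
{
    uint32_t hi = (uint32_t)rand() & 0x7fffu;
    uint32_t lo = (uint32_t)rand() & 0x7fffu;
    return ((hi << 15) | lo) % count;
}

/* Do `probes` random seek-then-read checks against `path`. No writes
   are made. Returns the number of failed probes, or -1 if the file
   cannot be opened. Assumes the file is under 2 GB so that offsets
   fit in a long. */
long probe_file(const char *path, uint32_t count, long probes,
                unsigned seed, int verbose)
{
    assert(count > 0);
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    srand(seed);
    long errors = 0;
    for (long n = 0; n < probes; n++) {
        uint32_t index = random_index(count);
        uint32_t value = 0;
        /* A probe passes only if the seek works, the read returns one
           integer, and that integer equals its own index. */
        int ok = fseek(f, (long)index * (long)sizeof value, SEEK_SET) == 0
              && fread(&value, sizeof value, 1, f) == 1
              && value == index;
        if (!ok)
            errors++;
        if (verbose)
            printf("probe %ld index %lu: %s\n", n, (unsigned long)index,
                   ok ? "ok" : "ERROR");
    }
    fclose(f);
    return errors;
}

#ifdef SEEKTEST_STANDALONE
/* Hypothetical command line: seektest <file> <count> <probes> */
int main(int argc, char **argv)
{
    if (argc != 4) {
        fprintf(stderr, "usage: %s <file> <count> <probes>\n", argv[0]);
        return 2;
    }
    long errors = probe_file(argv[1], (uint32_t)strtoul(argv[2], NULL, 10),
                             strtol(argv[3], NULL, 10), 1u, 1);
    return errors == 0 ? 0 : 1;
}
#endif
```

Compiled with `SEEKTEST_STANDALONE` defined, this gives a command-line tool that prints one status line per probe. Because the file is opened read-only, a failed probe points at the path between client and server rather than at damage done by the test itself.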

Weird. I was expecting that when we started the C test program on 20 machines accessing one server, we would get an occasional error from several individual machines. We start up the 20 machines and they run rather slowly (20 machines doing random seeks and reads means lots of head motion on the disks). No errors so far. Then suddenly, one machine starts running at full speed, printing status lines, all of which are errors. It is as if it has lost the connection to the server!

The master image for the lab workstations is updated with the latest Windows 2000 Pro patches and the lab is reimaged. Running the test produces the same results. Several attempts are made to change settings on the server and client, but nothing works. We are running out of time before the fall term starts, so we decide to manage the problem instead: we start only 10 machines at a time (the TAs control who starts up the VMs, and when, during the lab). This works, with only a few cases of corrupted drives the next term.

To be continued.

