User Journal

Journal karniv0re's Journal: Operation Obsolescence

Some people are a waste of space. And we know that the only thing keeping them there, always in our way, is the fact that there is a need for them. We love to hate them, but goddamnit, there is a need for them.

Enter Operation Obsolescence. If we can eliminate the need, in theory, that person should simply go away.

We begin by describing the mark's responsibilities:

  1. Monitoring servers
  2. Grepping through logs to find "bad actors"
  3. Creating data sources
  4. Responding to support cases
  5. Server planning and architecture
  6. Applying patches and hot fixes
  7. Monitoring disk space and usage
  8. Recycling servers
  9. Responding to monitoring alerts

Then, we carefully pick through them and figure out a way to eliminate these responsibilities, or at the very least, make them so trivial that there
is really no longer a need to have more than one person assigned to these tasks.

Of course, this is not a trivial issue in and of itself. If it were, it would have been done by now. That is not to say that it can't be done.

Let's speak briefly to the issues:

(1). We have alerts set up, and this should all be self-monitoring. If necessary, we can set up additional alerts to let me know of impending problems.
But on a deeper level, there should be no alerts. Yes, that is a fantasy, but I think I can come close to achieving this by attacking the issue at
the core: bad code. Having less of a tolerance for bad code could make the number of alerts drop significantly. How do we do this in our current
state of "Wild West" programming? By taking alerts more seriously. Instead of just recycling the cluster right away, find out what code is
responsible for it, then find out who owns it. If necessary, move it to the penalty box and isolate it. Help the developers refine their code - of
course, there should be tools for this. A code sweeper perhaps, but more likely, a way of tracking performance and finding bottlenecks (usually,
these are at the database level).

(2). I am already hot and heavy on developing a solution for this. It's a log grepper that spans logs and servers, and pulls out only what it needs
within a given time frame. This should help with (1).
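To make the idea in (2) concrete, here is a minimal sketch of a time-windowed grep across several log files. It is not my actual tool, just the shape of it. It assumes every log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp and that the logs from each server are already reachable as local paths (mounted or copied); the names filter_lines and grep_logs are made up for illustration.

```python
import re
from datetime import datetime

# Assumption: every line starts with a timestamp like "2024-01-01 10:00:00".
TS_FORMAT = "%Y-%m-%d %H:%M:%S"
TS_LEN = 19


def filter_lines(lines, regex, start, end):
    """Yield (timestamp, text) for lines inside [start, end] that match regex."""
    for line in lines:
        try:
            stamp = datetime.strptime(line[:TS_LEN], TS_FORMAT)
        except ValueError:
            continue  # no leading timestamp (stack trace, blank line, etc.)
        if start <= stamp <= end and regex.search(line):
            yield stamp, line.rstrip("\n")


def grep_logs(paths, pattern, start, end):
    """Search every file in paths and return hits sorted by timestamp."""
    regex = re.compile(pattern)
    hits = []
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for stamp, text in filter_lines(handle, regex, start, end):
                hits.append((stamp, str(path), text))
    return sorted(hits)
```

Sorting the merged hits by timestamp is the point: events from different servers line up in one timeline, which is what you want when chasing down the code behind an alert.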

(3). This process is somewhat automated, but I think it could be improved and made smarter to reduce time spent by at least half.

(4). This will always be an issue, but the smarter I get at this game, the faster I will get. Also, I need to be put on the main notification list so
I can attack them before the mark does.

(5). As I learn the environment, I should be able to take over this area entirely.

(6). Staying up to date with the bugs out there and knowing my environment will make me a master at this. Setting up a solid testing environment will
also aid with finding and fixing bugs, both inherent to the product, and bugs in my code.

(7). Again, we can set up more monitors, but those will only tell me when it hits a certain critical point. We need trends. We need to be able to
extrapolate data. Where are we headed next week? Next month? Next year?
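A rough sketch of the kind of extrapolation I mean in (7): fit a straight line through past disk-usage samples and estimate when the disk fills. This assumes growth is roughly linear, which will not always hold, and that samples come in as (day, percent used) pairs; linear_fit and days_until_full are illustrative names, not an existing tool.

```python
def linear_fit(samples):
    """Return (slope, intercept) of the least-squares line through (x, y) pairs."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    if var_x == 0:
        raise ValueError("samples must span more than one point in time")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def days_until_full(samples, capacity=100.0):
    """Estimate days from the latest sample until usage reaches capacity.

    Returns None when usage is flat or shrinking.
    """
    slope, intercept = linear_fit(samples)
    if slope <= 0:
        return None
    last_day = max(x for x, _ in samples)
    full_day = (capacity - intercept) / slope
    return max(0.0, full_day - last_day)
```

Feeding it a few weeks of daily readings answers "where are we headed next month?" with a number instead of waiting for the critical-threshold alert to fire.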

(8). This is in the works. I have a temporary solution in place for development and testing environments. I can now recycle clusters from my
phone, which is nice, but I can make this process even slicker. I can tie this into the alerts so that when I receive one, I click a link in the alert and it
not only recycles the cluster, but also populates a database with all that information. The more data we have on these things, the better we can
track trends and prevent problems from happening again, or at the very least, anticipate them.
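Here is a minimal sketch of the recycle-and-record step from (8), using SQLite as a stand-in for whatever database ends up holding this. Everything here is an assumption for illustration: the recycle_events table, the function names, and the recycle_fn callable that represents the real recycle command, which I am not showing.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS recycle_events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    cluster TEXT NOT NULL,
    alert_id TEXT NOT NULL,
    recycled_at TEXT NOT NULL,
    succeeded INTEGER NOT NULL
)
"""


def open_store(path=":memory:"):
    """Open (or create) the event store; defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def recycle_and_record(conn, cluster, alert_id, recycle_fn):
    """Run the recycle action, then log the attempt whether or not it worked."""
    try:
        recycle_fn(cluster)  # hypothetical hook for the real recycle command
        succeeded = True
    except Exception:
        succeeded = False
    conn.execute(
        "INSERT INTO recycle_events (cluster, alert_id, recycled_at, succeeded) "
        "VALUES (?, ?, ?, ?)",
        (cluster, alert_id, datetime.now(timezone.utc).isoformat(), int(succeeded)),
    )
    conn.commit()
    return succeeded


def recycle_counts(conn):
    """Return (cluster, count) rows, most frequently recycled first."""
    query = (
        "SELECT cluster, COUNT(*) FROM recycle_events "
        "GROUP BY cluster ORDER BY COUNT(*) DESC, cluster"
    )
    return conn.execute(query).fetchall()
```

The link in the alert would call something like recycle_and_record with the cluster name and alert ID baked in. Failed attempts are logged too, since a cluster that will not recycle cleanly is exactly the kind of trend worth seeing.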

(9). This is covered in (8).

With all of these tools and procedures in place, most of the need for the second person should be eliminated and I can reign peacefully over my new
product while spending the bulk of my time coding.

Bliss.
