Squid, FreeBSD Rock the House at Caching Bake-Off 159

Posted by timothy on Tuesday February 29, 2000 @11:34PM from the baked-with-pride-seafood dept.

Blue Lang writes: "Saw on the squid mailing list today that the results of the second polygraph Web-cache benchmarks are in, and squid on FreeBSD captured a few top marks, as well as performing exceptionally well overall. Interesting reading, especially as a comparison of free and open systems versus some very well-architected proprietary solutions."

Architecture (Score:1)

by Anonymous Coward writes:

What is the architecture of squid? I read that it was "I/O driven", does this mean it uses asynchronous network access where if data comes in, a thread is chosen to service it, then when it becomes I/O bound it goes back into the pool?
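For readers wondering what "I/O driven" usually means in practice: as far as I know, Squid of this era is generally described as a single process running a select()/poll() style event loop, not a pool of threads. The sketch below shows only that general pattern, using Python's standard `selectors` module. The handler and the upper-casing "response" are made-up stand-ins, not Squid code.

```python
import selectors
import socket


def make_handler(sel):
    """Build a callback that services one readable socket without blocking."""
    def on_readable(conn):
        data = conn.recv(4096)
        if data:
            # Stand-in for "look the object up in the cache or ask the origin".
            conn.sendall(data.upper())
        else:
            # Peer closed the connection: stop watching it.
            sel.unregister(conn)
            conn.close()
    return on_readable


def serve_once(sel):
    """One pass of the event loop: run the callback of every ready socket."""
    handled = 0
    for key, _mask in sel.select(timeout=1.0):
        key.data(key.fileobj)
        handled += 1
    return handled


def demo():
    sel = selectors.DefaultSelector()
    client, server_side = socket.socketpair()
    server_side.setblocking(False)
    sel.register(server_side, selectors.EVENT_READ, make_handler(sel))
    client.sendall(b"get /index.html")
    serve_once(sel)            # the single process wakes up only when data arrives
    reply = client.recv(4096)
    client.close()
    serve_once(sel)            # sees end-of-file and drops the connection
    sel.close()
    return reply
```

The point of the design is that one process multiplexes many connections and never sleeps inside a single slow read or write, so no thread is "chosen" per request.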
Re:One Word for You (Score:1)

by Anonymous Coward writes:
Uhh, do your background research.
- reverse cache
- - Sends static content to the client using the fastest means possible to access it, while referring dynamic content to scripts/CGI.
- khttpd
- - Sends static content to the client using the fastest means possible - the fact that it is within the kernel - to access it, while referring dynamic content to scripts/CGI.
Thank you.
Re:Performance is king. (Score:1)

by Anonymous Coward writes:

The only highlight of the Squid results was the cache hit rate. Otherwise, it was pretty unremarkable. Open source didn't save any money or any hardware - the Squid guys didn't win on price, and didn't win on price/performance. What good is getting the software for free if you need so much more hardware to run it?
The open source operating systems, on the other hand, really did shine, especially FreeBSD. Then again, this sort of network-heavy workload is a perfect fit for *BSD, so that aspect isn't surprising.
They should have tested conformance (Score:1)

by Anonymous Coward writes:

It sucks that they focused only on the performance aspect of the devices. My experience has been that nearly every single cache out there has HORRIBLE RFC conformance, to the point where they blatantly violate the HTTP RFC and don't support critical features from HTTP/1.1.

Sure, Netscape and most HTTP clients can connect to them, but try it with something that really exercises HTTP/1.1 features and it just sucks. So far, I have yet to see a single 'transparent' proxy that even implements HTTP pipelining, let alone the more advanced stuff from the spec.

IMHO, this is the most critical point when choosing a vendor - how well they implement HTTP/1.1 - very little else matters!
Re:Accelerate your website -- it's awesome! (Score:1)

by Anonymous Coward writes:

(Oh My Gosh do you really want to spend $150,000 on a CACHE???)
Sure. If bandwidth costs more, and in some (non-north american) countries it's staggering.
Maybe Squid COULD use a little tweaking (Score:1)

by Anonymous Coward writes:

Rather than add to the xBSD vs Linux wars, why don't we look at it this way:
Where are squid's weaknesses, and what can be done to improve on them?
Squid uses the native filesystem of whatever OS it runs on. Could a better solution be that it doesn't use one at all? Let it write "raw" to partitions you assign it. We don't need permissions checks, don't give a damn about concurrency (one I/O thread per partition), don't care what happens when we crash (other than to know if the file is bad or good). How much could be stripped out and even optimized for this use?
There also seem to be some IP stack issues. I know you don't want to be doing some of this stuff in user mode, but could Squid benefit from jumping in lower down at the IP level and handling HTTP 1.1 and the relevant TCP bits itself?
I'm not a kernel hacker (obviously), and I don't even know if these are Squid's weak points, but it seems a good place to start tweaking.
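A minimal sketch of the "no filesystem" idea above, under loud assumptions: an ordinary preallocated file stands in for the raw partition, objects live in fixed-size slots that are reused ring-style, and a CRC kept in an in-memory index answers the "is the file bad or good" question. `RawStore`, `SLOT_SIZE` and the slot layout are all hypothetical. This is not how Squid stores objects.

```python
import os
import tempfile
import zlib

SLOT_SIZE = 4096  # hypothetical fixed slot size; larger objects are rejected


class RawStore:
    """Toy object store: fixed slots in one flat file, no directories, no permissions."""

    def __init__(self, path, slots):
        self.f = open(path, "r+b")
        self.slots = slots
        self.index = {}       # url -> (slot, length, crc32)
        self.next_slot = 0

    def put(self, url, body):
        if len(body) > SLOT_SIZE:
            raise ValueError("object larger than one slot")
        slot = self.next_slot
        self.next_slot = (self.next_slot + 1) % self.slots  # overwrite the oldest slot
        # Forget whatever object used to live in the slot being reused.
        self.index = {u: e for u, e in self.index.items() if e[0] != slot}
        self.f.seek(slot * SLOT_SIZE)
        self.f.write(body)
        self.f.flush()
        self.index[url] = (slot, len(body), zlib.crc32(body))

    def get(self, url):
        entry = self.index.get(url)
        if entry is None:
            return None
        slot, length, crc = entry
        self.f.seek(slot * SLOT_SIZE)
        body = self.f.read(length)
        if zlib.crc32(body) != crc:   # damaged object: treat it as a miss
            del self.index[url]
            return None
        return body

    def close(self):
        self.f.close()


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "cache.img")
        with open(path, "wb") as f:
            f.truncate(2 * SLOT_SIZE)         # preallocate two slots
        store = RawStore(path, slots=2)
        store.put("http://a/", b"alpha")
        store.put("http://b/", b"beta")
        store.put("http://c/", b"gamma")      # reuses slot 0, evicting http://a/
        result = (store.get("http://a/"), store.get("http://b/"), store.get("http://c/"))
        store.close()
        return result
```

A real design would also have to persist the index and cope with objects larger than one slot. The sketch only shows how much bookkeeping disappears once permissions, directories and crash-safe metadata are off the table.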
Re:One Word for You (Score:2)

by The Man ( 684 ) writes:

The kernel http daemon in Linux 2.4 will make reverse http caching obsolete and outmoded.
I hope this is a troll. Just in case it isn't, let's think good and hard about what khttpd does: it's an ultra-fast web server for static content served from the filesystem. It's not a caching product, and never will/can be. It's also marked EXPERIMENTAL and almost certainly will be for 2.4 as well. khttpd is a marginally cool idea (hmmm..whoa, look what I can do!!!) that very few people will be using any time soon. And it certainly won't be for caching.
FreeBSD & load (Score:2)

by hawk ( 1151 ) writes:

The response under load is the single biggest difference I've noticed between FreeBSD and Debian. At even a load of 2 on this box (P120/24), I notice the lack of response under X (as in: wait several seconds for the cursor to move). Under FreeBSD on a K6-200/64, there is nearly no loss at a load of 10 (from several parallel makes). Yes, that's close to apples/oranges, but the FreeBSD box used to run Debian, and I noticed a similar phenomenon there.

Maybe it's just configuration somewhere; both boxes are pretty much stock. However, it seems to me that I noticed something similar with macbsd and linux a couple of years ago, when the macbsd box had a slight memory/cpu disadvantage. (However, if you tried to run lyx with the default postscript fonts and not using xfs, both came to a screeching halt :)

hawk
Re:FreeBSD & load (Score:2)

by hawk ( 1151 ) writes:

I'm finding it rather hard to believe myself, but it's happening.

I suspect that at least a large part is the switch by Debian from slink to potato; there have been various changes in default behavior, and the whole thing seems to have gotten *much* slower since the "upgrade". If I can get three hours free at a time that students don't need the server running, I'm ripping out Debian for FreeBSD. I've reported it a couple of times in a couple of places, but have gotten no acknowledgement that anyone else has seen potato become a pig. The problem is that I can't report anything objectively as a bug without spending a couple of days on an install/reinstall/test cycle on a couple of different versions . . .

However, the speed difference was there before with older versions of debian and freebsd. The offensive box only has 24mb . . .
Re:Rob - you learn anything from this article? (Score:4)

by Jon Peterson ( 1443 ) writes: <jon@snowdr[ ].org ['ift' in gap]> on Tuesday February 29, 2000 @11:30PM (#1236612) Homepage

Err let's see:

1. /. does not use CGI - it uses a preprocessor (mod_perl) which shares memory and caches compiled Perl bytecode.

2. Last I checked /. uses a separate server for static content (images)

3. Given (2) above and the highly dynamic nature of /. content, I'm not sure that an accelerator proxy would be such a big help.

4. As I said /. uses no CGI in time critical areas, and mod_perl is in many ways superior to fast-CGI.

As for /. code being scary I'll take your word for it. I can't for the life of me see what's wrong with using DBI though.

Re:Caching vs. Akamai-type services (Score:1)

by Karl Anderson ( 1467 ) writes:

Akamai doesn't seem to be anything more than a network of caches, some load-based routing, and a private network connecting them, plus a few geegaws like special transport of certain stream formats. Not to malign what they're doing, but I wouldn't say caching *vs* Akamai.
Re:Rob - you learn anything from this article? (Score:2)

by Thomas Charron ( 1485 ) writes:

While you did address the points of the original article, I still feel the need to say the code is most *certainly* far from being optimized for performance..
Re:My boss will love this article. (Score:1)

by tzanger ( 1575 ) writes:

Just a quick question regarding the machines and the setting they're in:
A pair of identical (load balancing and transparent failover via BigIP) rackmount servers, each with a PIII 600 CPU, 256MB, 2940UW and 20Gb of disk. And let's not forget the triply-redundant T3's to three distinct Tier-1 internet providers.
Nice setup. What do P3's get you that P2's or even celerons don't? The extra cache won't help a whole hell of a lot and the SIMD or KNI instructions don't do anything for you, either...

...Now I heard that somewhere there is a patch to the Linux kernel that uses MMX to help calculate the packet checksums faster but you said you don't use Linux.
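For context on what such a patch would be speeding up: the checksum used in IP, TCP and UDP headers is a 16-bit one's-complement sum, described in RFC 1071. The plain-Python sketch below shows the arithmetic only. It says nothing about how the MMX patch mentioned above is actually written.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over big-endian 16-bit words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                 # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Sample words from the RFC 1071 worked example.
sample = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(sample)

# A receiver recomputes over data plus checksum and expects zero.
verified = internet_checksum(sample + checksum.to_bytes(2, "big")) == 0
```

Because the sum is just independent 16-bit additions with end-around carry, it is an obvious candidate for SIMD instructions that add several words at once.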
Oops (Score:2)

by Matts ( 1628 ) writes:

I'm surprised I've not heard much about Oops. I'm trying to get it working here, unfortunately the documentation isn't great. However it does seem like the architecture would make it an extremely fast little proxy, and it seems to have most of the features of Squid.

Anyone got any good stories about using Oops?
Re:It didn't win. (not flamebait!) (Score:3)

by Rendus ( 2430 ) writes: <<rendus> <at> <gmail.com>> on Tuesday February 29, 2000 @07:20PM (#1236617)

Heh.. I work for Dell, and if the stability of Dell's servers are any indication of the stability of Windows NT, you sure as hell don't want to be using them as an example.

I don't think Squid won (Score:2)

by raph ( 3148 ) writes:

I read the page too, and while it definitely shows that Squid is a viable option, I didn't see it blowing the doors off the competition either. The Microbits pizza box delivered nearly the same performance at significantly less cost, and some of the higher end offerings were able to deliver considerably greater price/performance. The IBM 3500M10, for example, was able to deliver almost double the peak throughput at only a slightly greater cost.

These results are surprising to me - I would have thought that the use of commodity hardware and no-cost software would have created a compelling price advantage. What happened?

If there's something I'm missing, could someone please spell it out for me?
DOS port (Score:2)

by VAXGeek ( 3443 ) writes:

Nothing beats my DOS based web cache for pure cacheability. I run Caldera OpenDOS (used to be available from www.calderathin.com, but I think now it's lineo.com) with packet drivers for an NE2000 and I run the squid webcache [www.squid.org, it's really cool, a webcache that SCALES] ported to a DOS TCP-IP stack. For reliability, it's number 1. I pity the fool that tries to DOS my DOS box.
nWo for life!
------------
a funny comment: 1 karma
an insightful comment: 1 karma
a good old-fashioned flame: priceless
Re:What in the fuck ... (Score:2)

by Guy Harris ( 3803 ) writes:

Where the hell is Novell's BorderManager in this little test?

Novell's BorderManager? Dunno.
Novell's Internet Caching System? See the Vendor Comments page [ircache.net] on the bakeoff site, where somebody from Dell says:
The Novell Internet Caching System - Powered By Dell (Dell ICS 130B) used in the Second IRCache Web Cache Bake-off tests, is currently available from Dell Computer.

There were other boxes running it as well, e.g. at least some, perhaps all, of the IBM boxes.
(The "Vendor Comments" section seems to be filled primarily with "Vendor Advertisements"; yes, my employer, NetApp, proudly participated in the marketoonery in question.)
Re:What in the fuck ... (Score:2)

by Guy Harris ( 3803 ) writes:

yes, my employer, NetApp, proudly participated in the marketoonery in question.

"The marketoonery in question" being the dumping of advertisements into the "comments" section, not Polygraph itself. Marketoons - can't live with them, can't send them out the airlock without a suit....
Re:Misleading title for article (Score:2)

by Guy Harris ( 3803 ) writes:

Seems like a good showing for FreeBSD.

Or, rather, for FreeBSD plus whatever caching code iMimic runs atop it.
Re:Squid and Akamai (Score:3)

by Guy Harris ( 3803 ) writes: <guy@alum.mit.edu> on Tuesday February 29, 2000 @09:03PM (#1236623)

I think one of the first developers of squid is the CTO of Akamai.

The CTO of Akamai is Daniel Lewin; his bio page at Akamai [akamai.com] says nothing about Squid.
You may, perhaps, be thinking of Peter Danzig, who is the VP of Technology at Akamai; his bio page at Akamai [akamai.com] says:
His background in Internet information systems also includes work on the federally-funded Harvest Information Discovery System, or 'Harvest Project.' His collaboration on this project at the University of Southern California resulted in one of the earliest designs for caching Internet backbone traffic. Danzig led the Harvest Web cache and helped design the Harvest indexer projects from 1992-1995.

I think the Squid project was originally derived from the Harvest cache; the NetApp NetCache software was also originally Harvest-derived, although much, perhaps most, of it was done at Internet Middleware (a company founded by Peter and bought by NetApp) and NetApp. (I suspect much of Squid might also be non-Harvest code.)

Re:Performance is king. (Score:1)

by Bwah ( 3970 ) writes:

Huh? squid was pretty darn "evolved" last I checked. It's been around a long time.

I agree comparing the two is a bad idea ... but I don't think the comparison was intended the way you took it.

performance is king ... fact is as a caching proxy squid hauls ass. period. If you take advantage of some of the redirect features you can do some pretty amazing things with it.
Remember NT? (Score:1)

by ink ( 4325 ) writes:

Hmm, an http server in the kernel.
Now, why oh why would that be a bad idea?
The folks at Microsoft thought it would be cool to move the GDI into the kernel for faster graphics under NT. Now that NT crashes, the blame falls on "unstable video drivers" instead of the system architecture.
No thanks. Our Pentium 166 can saturate the T1s already.

The wheel is turning but the hamster is dead.
Re:Accelerate your website -- it's awesome! (Score:1)

by law ( 5166 ) writes:

Um did you look at the hardware and cost of that product? silly troll or just moronic? you decide.
Re:suprised by this ... (Score:1)

by cthonious ( 5222 ) writes:

weird ... haven't experienced any memory leaks myself, been running squid for a long time on linux, uptime is 95 days and squid has been using 132MB since I can remember.
Other software based on Squid (Score:2)

by Ed Avis ( 5917 ) writes:

I'm sure I heard that Microsoft Proxy Server, along with other proprietary web-cache software, is based on Squid. Is this true?
Re:My boss will love this article. (Score:1)

by pqbon ( 7033 ) writes:

Read the PCI spec. 64bit pci cards are supposed to work at 64/66 64/33, 32/33... yes 32bit by 33mhz... Most 64bit pci cards will fit in any PCI 2.2 slot.

"... That probably would have sounded more commanding if I wasn't wearing my yummy sushi pajamas..."
-Buffy Summers
Goodbye Iowa
Performance is king. (Score:1)

by Signal 11 ( 7608 ) writes:

Apache is a robust and reliable server with support for all kinds of server-side stuff that can cost you tens of thousands of dollars to find in a commercial implementation, offers damn good performance (see the apache tuning hints on their page), and even a crummy P60 running apache can saturate a T1. Also, I don't think it's fair to compare Apache to Squid as they are run by two completely different development groups - and one is A LOT more evolved than the other (not the groups - the product *g*).
Squid, on the other hand, is a good complement if your yearly IT budget is smaller than marketing's Christmas party funding... but for serious stuff?
Sorry, but Squid didn't cut it here - I know, we all want the open source crew to win, but hey.. it just didn't happen here.
Re:It didn't win. (not flamebait!) (Score:1)

by Signal 11 ( 7608 ) writes:

For latency, yeah.. but I don't think hits-per-second wants to be as low as possible....
Re:It didn't win. (not flamebait!) (Score:1)

by Signal 11 ( 7608 ) writes:

He wasn't bored. He was bearded. There's a difference. =)
Re:Performance is king. (Score:1)

by Signal 11 ( 7608 ) writes:

performance is king ... fact is as a caching proxy squid hauls ass. period. If you take advantage of some of the redirect features you can do some pretty amazing things with it.
I'll agree Squid can do some serious work, but it didn't fare well against the other solutions presented in this benchmark. :(
Re:DOS port (Score:1)

by Signal 11 ( 7608 ) writes:

I pity the poor bastard that has to admin that box...
Re:It didn't win. (not flamebait!) (Score:1)

by Signal 11 ( 7608 ) writes:

Dude, I didn't read /a/ table, I read the footnotes from each manufacturer (squid didn't seem to have anything to add about their configuration), I read the conclusion, executive summary, and performance tips.. including the bit about the TCP_WAIT2 problem which nailed a few contestants to the wall.
Now, I'll say it again a little more succinctly: Squid got squished.
Re:It didn't win. (not flamebait!) (Score:2)

by Signal 11 ( 7608 ) writes:

Windows delivers the scalability and reliability to run real businesses-now.
Opinion.
Feature for feature, Windows 2000 is the most cost-effective business platform.
Opinion.
Microsoft wants to work with you to make your business successful on the Internet.
Fact.
Some of the biggest e-businesses and dot coms run on Windows.
Fact.
Dell, the largest e-business on the Internet, runs on Windows.
Fact.
Sun claims to be a leader in system reliability and more reliable than Windows.
Fair enough, they do claim to...
Electrolux Group, Accounting.com, Pro2Net and thousands of other companies have switched their Web sites from Sun platforms to Windows. (Source: Netcraft)
Fact.
The vast majority of Sun's Solaris shipments are on Sun's own expensive, proprietary hardware and Sun has always buried the cost of Solaris in their hardware pricing.
Opinion.
Conclusion: Windows is useful in some environments. So is everything else. I care about numbers, data, real, tangible, and reproducible things. If an NT server in X configuration crashes 35 times and has an average downtime of 5%, while a linux box in X configuration with similar performance has a downtime of 1%.. linux wins. Conversely, if the NT box can pump out 8000 hits/s, while the linux box can manage 2100 hits/s and I need raw performance, NT wins. Stop reading the marketing hype and start reading the technical specifications.
Re:My boss will love this article. (Score:2)

by Signal 11 ( 7608 ) writes:

1. Because I know I can rely on the technology
I can't tell whether you meant this as a little FUD thrown over linux, or because you believed all the other vendors there were inferior to FreeBSD. On one count you'd be wrong, unfortunately.
Yes, you can rely on FreeBSD. You can rely on NT too for certain things. That doesn't say much. I'd also like to point out that there are very serious holy wars out there over whether linux is superior to FreeBSD along with the general consensus in the linux camp that they will catch up (if they haven't already) with the BSDs in short order. The evidence is inconclusive..
Lastly.. about that "killer caching proxy"... umm, with all that bandwidth, why would you need proxying anyway? by that time you're probably a backbone provider and don't need to worry about stuff like that. Caches are used by ISPs with a T1 or two or corporations to limit bandwidth.. not by super-sized ISPs (not generally - AOL comes to mind as an exception). And why the 2940UW (I'm assuming you're thinking adaptec)? They have Ultra160 fibre now in the AIC-78xx chipsets which is register-compatible with the aic78xx module for linux... or for the *BSDs.
Re:My boss will love this article. (Score:2)

by Signal 11 ( 7608 ) writes:

I have a 32 bit version sitting in my system.. the very same one I'm typing this on. Hit up adaptec's site and search for the 29160N Ultra 160 SCSI adapter.
It didn't win. (not flamebait!) (Score:4)

by Signal 11 ( 7608 ) writes: on Tuesday February 29, 2000 @06:48PM (#1236640)

Microbits had a higher price/performance, about 25% less top-speed, but at half the price of the squid solution.
No offense, but you call that winning? It lost to its competitors categorically and across the board - hits, latency, cost/performance.. what's the good news? Anyone?

Re:Remember NT? (Score:2)

by szo ( 7842 ) writes:

You don't _have to_ use it. You can compile your kernel w/o it, and it won't change. With NT you don't have the choice...

Szo
Re:Architecture of Caching to large-scale sites (Score:1)

by Lazy Jones ( 8403 ) writes:

Unfortunately, several web servers still don't support Expires: headers (e.g. thttpd, which works very well for static data). Then again, browsers think they're smarter than the servers and proxies, since in the absence of Expires: headers they cache stuff depending on the type of data and the URL, AFAIK, so it's not absolutely necessary to set these headers.
Re:It didn't win. (not flamebait!) (Score:1)

by LWolenczak ( 10527 ) writes:

I just want to point out that the lower it is on those graphs, the better it is....
Re:It didn't win. (not flamebait!) (Score:1)

by LWolenczak ( 10527 ) writes:

yeah, but that's probably because it was bsd, or it was just checking to see if it had a copy of the page on its disk. One has to notice a lot of those solutions are turnkey solutions. This box was probably just put together by some bored guy one afternoon.
It's important to dig into the numbers.. (Score:1)

by Blue Lang ( 13117 ) writes:

Given.. there is a ton of really good info in there, especially the network configurations (each company brought their own network) and disk configurations, etc.

Cache the world!

--
blue
Re:It didn't win. (not flamebait!) (Score:2)

by Blue Lang ( 13117 ) writes:

No offense, but you call that winning? It lost to its competitors categorically and across the board - hits, latency, cost/performance.. what's the good news? Anyone?

Hi.

If you'd kindly point your browser back to the top of the screen, you might take a moment to re-read the post. Squid+FBSD did well. The ICS-based solutions cost bazillions of bux0rs and brought along 100+GB disk array, and, pound for pound, were not that much better. The microbits entry did rock, and it's about the size of a personal pan pizza.

there's a reason i posted that it's important to READ the ARTICLE, not just grab the first table you see and start wallowing about.

:)

--
blue
Re:TTchorus: site gets slashdotted, why not cache (Score:1)

by maphew ( 14702 ) writes:

> Tangenital
ick
{chuckle}

To answer your question, there are two main reasons why this shouldn't be done.
1: Copyright could be infringed on the pages being cached

Okay, that makes a certain amount of sense and I can understand being cautious, but caching makes the web go around. It's already pervasive, or so I was given to understand. As another poster mentioned, what about Google?

I guess the size of the financial club is often more relevant than the technical legality of a something. {sigh}

2: Many sites get their revenue from click throughs and banner ads. If /. mirrors the info, are they going to mirror the banner ads as well?

Wouldn't bother me personally, I never see 'em anyway, I use Junkbuster or turn autoloading images off. It's amazing how much more responsive surfing is then. :)

-matt
Re:TTchorus: site gets slashdotted, why not cache (Score:1)

by maphew ( 14702 ) writes:

thanks!

For onlookers: The link is here [slashdot.org].
And it doesn't say anything about copyright, just time and money (isn't it always?).

-matt
Re:Accelerate your website -- it's awesome! (Score:1)

by maphew ( 14702 ) writes:

Will somebody please define what "reverse caching" is?

thanks

-matt
thanks (Score:1)

by maphew ( 14702 ) writes:

thanks
TTchorus: site gets slashdotted, why not cache it? (Score:2)

by maphew ( 14702 ) writes:

Tangenital topic:
At least once per story, somebody suggests that slashdot cache or mirror sites they link to in order to avoid the dreaded /. effect.

I have yet to hear an explanation of why this might not be a good idea. Anybody out there have one?

(honest question)
Re:/.tted--copyright not issue (Score:2)

by maphew ( 14702 ) writes:

I can't see how copyright would not be the issue; it's a pretty blatant violation when you start copying other websites' content and stick it on yours.

I can see that argument if you surrounded their page in a frame, or replaced their banners with yours, or something which somehow makes it unclear who the real owner/producer of the page is.

How about a script or program which:

-caches the linked page when the story is first posted
-periodically checks the page for response time
-if $lag > $unbearable then serve cached page with an inserted headline which says "the host server http://blahblah appears to be /.ttd. You are viewing a cached page, it may not be up to date. Click here to try and fetch the original."

This way the big companies would host their own material (the assumption being they have enough money to have bigass servers and don't need to be cached) and only the little guy with the cool make-your-own-transmeta-chip page who actually _needs_ to get cached, will get cached.
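The three steps in the list above can be sketched roughly as follows. Everything here is an assumption made for illustration: `LinkedPageCache`, the five-second `UNBEARABLE_SECONDS` threshold, and the injected `fetch` callable (which would wrap a real HTTP client) are hypothetical names, and the probe runs at request time instead of on a periodic timer to keep the sketch short.

```python
import time

UNBEARABLE_SECONDS = 5.0  # hypothetical value for "$unbearable"


class LinkedPageCache:
    def __init__(self, fetch, clock=time.monotonic):
        self.fetch = fetch      # callable(url) -> str; expected to raise OSError on failure
        self.clock = clock
        self.snapshots = {}     # url -> page body saved when the story was posted

    def snapshot(self, url):
        """Step 1: cache the linked page when the story is first posted."""
        self.snapshots[url] = self.fetch(url)

    def probe(self, url):
        """Step 2: time a fetch of the original; returns (lag_seconds, body_or_None)."""
        start = self.clock()
        try:
            body = self.fetch(url)
        except OSError:
            return float("inf"), None
        return self.clock() - start, body

    def serve(self, url):
        """Step 3: hand out the live page unless the host looks swamped."""
        lag, body = self.probe(url)
        if body is not None and lag <= UNBEARABLE_SECONDS:
            return body
        notice = (
            f"<p>The host server {url} appears to be /.ttd. "
            "You are viewing a cached page; it may not be up to date.</p>\n"
        )
        return notice + self.snapshots[url]


# Usage with a fake fetcher, so the example needs no network access.
pages = {"http://example.org/chip": "<h1>make your own chip</h1>"}
state = {"down": False}


def fake_fetch(url):
    if state["down"]:
        raise OSError("timed out")
    return pages[url]


cache = LinkedPageCache(fake_fetch)
cache.snapshot("http://example.org/chip")
live = cache.serve("http://example.org/chip")
state["down"] = True
fallback = cache.serve("http://example.org/chip")
```

Note that the sketch does nothing about the copyright or banner-ad objections raised elsewhere in the thread. It only shows that the mechanics are simple.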

Is there some reason this wouldn't work?

-matt
Why the BSD vs Linux flames? (Score:5)

by johnnnyboy ( 15145 ) writes: on Tuesday February 29, 2000 @07:58PM (#1236653) Homepage

I've been introduced to Unix through Linux. I must say that the Unix environment just simply kicks ass!

Out of sheer curiosity I tried out FreeBSD. Their kernel is incredible. I know that the benchmarks aren't there to show it but their "claims" are true.

Their TCP/IP stack is better, loads can be handled with ease even on extremely low-end systems and their memory management is out of this world. I was impressed at how fast my shitty unix boxes went.

Now I know that linux heads like myself would become defensive but linux has made big improvements and a lot of issues are being addressed with the next 2.4 kernel. Their "claims" will be seriously tested soon.

I have decided to go back to linux because I prefer it. There's more software and it makes a better desktop for me. Plus it is stable enough, user friendly enough, fast enough and damn good!

However, freeBSD is a great unix OS and the only way to find out is to try any BSD yourself. Even a linux head like me can defend freeBSD.

Keep up the good work to all BSD contributors :-)

Re:Accelerate your website -- it's awesome! (Score:2)

by Anonymous Psychopath ( 18031 ) writes:

Reverse caching is used to offload your web servers. What it does is cache the static components (read: graphical images) of your web site and the caching engine then serves them up to people hitting your site, rather than the web servers having to serve them up.
It's useful in scenarios where you have a large web server farm. By implementing reverse caching and lightening the load on your web server farm, you don't have to have quite so many web servers. It also has the net effect of making your web site appear to be "faster" since users will see the images more quickly from the cache than the web server.
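As a toy illustration of the description above (an assumption-laden sketch, not any vendor's product): a reverse cache sits in front of the web servers, answers repeat requests for static objects from its own store, and passes everything else through. `ReverseCache`, `STATIC_SUFFIXES` and the `origin` callable are hypothetical names.

```python
STATIC_SUFFIXES = (".gif", ".jpg", ".png", ".css")  # assumed notion of "static"


class ReverseCache:
    def __init__(self, origin):
        self.origin = origin        # callable(path) -> bytes; stands in for the web server farm
        self.store = {}             # path -> cached body
        self.origin_requests = 0    # how often the farm actually had to do work

    def _ask_origin(self, path):
        self.origin_requests += 1
        return self.origin(path)

    def handle(self, path):
        if path.endswith(STATIC_SUFFIXES):
            if path not in self.store:
                self.store[path] = self._ask_origin(path)   # first request fills the cache
            return self.store[path]
        return self._ask_origin(path)                       # dynamic content always goes through


# Usage: three image requests and two script requests reach the farm only three times.
rc = ReverseCache(origin=lambda path: b"body of " + path.encode())
for p in ["/logo.gif", "/logo.gif", "/logo.gif", "/index.pl", "/index.pl"]:
    rc.handle(p)
```

A real accelerator would honour expiry headers and bound its memory. The sketch only shows where the offloading comes from.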
Re:DOS port (Score:1)

by mplex ( 19482 ) writes:

Do people just love to mark down sig_11 posts? Was this pointless moderation or what. I wonder if any points will get wasted on my post.
My experiences with Squid (Score:5)

by SONET ( 20808 ) writes: on Tuesday February 29, 2000 @08:12PM (#1236656) Homepage

I just want to take this opportunity to say Squid totally rocks. I put a squid server on a rescued 486/66 with 24MB of RAM. By rescued I mean that when the processor was removed from an old donated Compaq Prolinea server, it flew out of my hand and landed on concrete - then got stepped on while I was trying to find it and every pin got flattened (oops! found it!), and I had to straighten each pin with a butter knife to shove it in the Squid box! Honest! And that's only the processor story! Anyhow, you get the point - we're talking about really crappy and abused hardware I'm working with here.

We have roughly 100 machines on our network, and Internet access was coming to a standstill - especially when everyone in the computer lab was on the Internet. Imagine a 128Kb/s fractional T1 with 25 *active* users all trying to look at mega-image-rich content, plus some other users on campus accessing the Internet at the same time (can you say sub-300 baud and ping times measured in whole-second increments?). I was having to pre-load web sites before a class came into the computer lab because just loading the first page could take roughly five minutes on a good day.

Then I configured and installed a Squid server on a rejuvenated Compaq Deskpro running Linux 2.2 that was donated with the aforementioned specs. I was a little sketchy to implement it across the entire campus at first because I had always heard that proxy servers were a Bad Thing. So I silently pointed browsers to the Squid machine in a few classrooms to see if I would hear anything from anyone. I got calls from people that very day. They were asking me how I had finally coaxed our school district into buying us such a fast connection!

As it goes, the more classrooms I pointed to the proxy server, the faster things got (as the cache was growing and the hit rate was increasing), and the more happy teachers I had. In a school situation, many sites are visited multiple times by different students and classrooms. In the computer lab, every computer often visits the same site as a class. So having a caching-proxy server helps a great deal! I really believe that every school with less than a T1 should have one.

As for statistics, I have an average 'hit' rate of well over 80% because of the multiple viewings of sites. Initially I had 2GB set aside for caching purposes (on an IDE Samsung 2.1GB drive), and I found that as it reached its capacity the server just got way too slow. So first I brought it down to 1.5GB, and now I have it at 1GB (I may even take it to 750MB). It has been running pretty fast at 1GB - by far compared to not having a caching-proxy server at all, but I do see the performance start to degrade at about 750MB with my particular hardware.
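A back-of-the-envelope sketch of why an 80% hit rate feels like a new connection. The big assumption, flagged here and in the code, is that the request hit rate roughly equals the byte hit rate; in practice the byte hit rate is often lower, so treat the result as an upper bound. The function names are made up for illustration.

```python
def upstream_kbps_needed(demand_kbps: float, byte_hit_rate: float) -> float:
    """Traffic that still has to cross the Internet link after cache hits are removed."""
    return demand_kbps * (1.0 - byte_hit_rate)


def max_demand_kbps(link_kbps: float, byte_hit_rate: float) -> float:
    """Largest client demand the link can carry when only misses go upstream."""
    if not 0.0 <= byte_hit_rate < 1.0:
        raise ValueError("byte_hit_rate must be in [0, 1)")
    return link_kbps / (1.0 - byte_hit_rate)


# ASSUMPTION: 0.80 is used as the byte hit rate, which the post does not state.
effective = max_demand_kbps(128, 0.80)     # roughly five times the raw 128 Kb/s link
residual = upstream_kbps_needed(500, 0.80) # share of a 500 Kb/s demand that still goes upstream
```

Under that assumption only a fifth of the traffic leaves the building, so the 128Kb/s link behaves, at best, like one several times larger.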

Sure, faster server hardware would be *great* and is probably necessary to handle our unusually heavy load due to all of the graphics content on the visited sites, but right now that just isn't an option because we live on donations. My point is that even though we are running Squid on such a crappy box, it has worked wonders on our network. Internet access seems very fast now, whereas before it was almost unbearable. And most importantly people are happy and making use of the technology we have to its fullest extent, whereas before they may not have been able to do this. I must admit though that I am writing grants in hopes of getting a faster/newer box because ours is getting tired and I worry about what will happen when the hardware finally kicks the bucket. :)

For a school in our situation, Squid is great because it even helps when you're using it on otherwise possibly worthless hardware, and the price is just right.

Anyways, I'd like to thank all who have donated their time on the Squid project, you've done great work and you're helping people more than you realize!

--SONET
http://www.hbcsd.k12.ca.us/peterson/technology

Re:Why the BSD vs Linux flames? (Score:1)

by jstepka ( 20825 ) writes:

FreeBSD offers a Linux binary emulation which runs binaries just as fast as Linux. You can install the port from /usr/ports/emulators/linux_base/.
Re:Architecture of Caching to large-scale sites (Score:1)

by stab ( 26928 ) writes:

No, it's not absolutely necessary, since they send If-Modified-Since requests, and don't always retrieve the data, etc.

But still, it's a good matter of principle to do so, to guarantee behaviour of clients.

Also, this only extends to private caches, and public caches really like to see http/1.1 headers before holding on to them (again, depends on the exact cache obviously)
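For anyone unfamiliar with the conditional request mentioned above, here is a minimal sketch of the server-side decision, assuming the only validator in play is the modification time. `respond` is a hypothetical helper, not part of any particular server; the date handling uses Python's standard `email.utils`.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(last_modified, if_modified_since=None):
    """Return (status, headers) for a GET, given the resource's UTC modification time."""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None                      # unparseable date: ignore the header
        if since is not None:
            if since.tzinfo is None:          # treat a zone-less date as UTC
                since = since.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution, so drop microseconds first.
            if last_modified.replace(microsecond=0) <= since:
                return 304, headers           # client copy is still good: send no body
    return 200, headers


modified = datetime(2000, 2, 29, 12, 0, 0, tzinfo=timezone.utc)
unchanged = respond(modified, "Tue, 29 Feb 2000 12:00:00 GMT")[0]
changed = respond(modified, "Mon, 28 Feb 2000 12:00:00 GMT")[0]
```

A 304 still costs a round trip, which is why explicit expiry headers save more: with them the client or cache does not have to ask at all until the object goes stale.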
Architecture of Caching to large-scale sites (Score:3)

by stab ( 26928 ) writes: on Wednesday March 01, 2000 @01:28AM (#1236659) Homepage

For those of you interested in caching and how it can help large scale sites, I've helped co-author a technical report [netapp.com] with Network Appliance describing our experiences accelerating the Mars Polar Lander [marspolarlander.com] website. That site used NetCache boxes, simple HTTP/1.1 cache-control headers, and a bit of cunning to allow user-level tracking without letting the tracking requests filter through. As is traditional, the site had a couple of problems, which we fixed and then documented in the appendix to hopefully save other people the same hassles in the future.

The technical report can be found at http://www.netapp.com/tech_library/3071.html [netapp.com]

We would all save a scary amount of bandwidth if more sites were designed with public caches such as (the awesome) squid in mind, and it's a really simple use of headers that makes it possible.

For those who use Apache and are interested in making your own sites more cache-friendly, I recommend you look at mod_expires [apache.org], which is part of the default distribution of Apache, although not compiled in by default. If you have large, static images that rarely change, then go ahead and put week-, month-, or year-long expiry headers on them, and watch the hits for those redundant images drop right down on your web server. And if you suddenly need to change them, then it's no real problem, as all you have to do is change the image's URL and it will become a "new" entity for purposes of caching.
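
To make the effect of such expiry headers concrete, here is a small sketch that computes the header values a cache would see for a one-week lifetime. It is not mod_expires itself (that is configured in Apache, not shown here); the function name and the chosen timestamp are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expiry_headers(now, lifetime):
    """Build Expires and Cache-Control values for a rarely changing image."""
    seconds = int(lifetime.total_seconds())
    return {
        "Expires": format_datetime(now + lifetime, usegmt=True),
        "Cache-Control": f"public, max-age={seconds}",
    }


# Illustrative timestamp; any timezone-aware UTC datetime works.
now = datetime(2000, 3, 1, 1, 28, 0, tzinfo=timezone.utc)
headers = expiry_headers(now, timedelta(weeks=1))
```

Until the Expires time passes, a public cache may answer for that URL without contacting the origin at all, which is why renaming the file is the way to force a refresh.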

Yeah, granted, bandwidth is getting cheaper now, but for us poor Europeans, it's still a scarce commodity and we need to worry about these things :-)

-anil-

Re:Accelerate your website -- it's awesome! (Score:1)

by horace ( 29145 ) writes:

What difference will khttpd make to caching software and the need to use caches?
Re:TTchorus: site gets slashdotted, why not cache (Score:2)

by odaiwai ( 31983 ) writes:

> Tangenital
ick

To answer your question, there are two main reasons why this shouldn't be done.
1: Copyright could be infringed on the pages being cached
2: Many sites get their revenue from click throughs and banner ads. If /. mirrors the info, are they going to mirror the banner ads as well?

dave
Re:My boss will love this article. (Score:2)

by Garpenlov ( 34711 ) writes:

> bout that "killer caching proxy"... umm, with all that bandwidth, why would you need proxying anyway?
> by that time you're probably a backbone provider and don't need to worry about stuff like that. Caches are used by
> ISPs with a T1 or two or corporations to limit bandwidth.. not by super-sized ISPs

Uhm... you're kidding, right?

Did you ever think about how much of that bandwidth your high speed clients (DSL, cable modem) can eat up? And how much of it is redundant? (i.e. cacheable)
Re:Accelerate your website -- it's awesome! (Score:3)

by Garpenlov ( 34711 ) writes: on Tuesday February 29, 2000 @07:56PM (#1236663) Homepage

> Squid captured top honors in cache hit ratios, but nothing else (AFAICT), showing that those "expensive, proprietary systems" also can be very
> well-tuned operating systems that eliminate traditional OS overhead for these numbers.

True, but the operating system that Squid was running on (and that's what you were talking about, the operating systems) was FreeBSD, which also runs the iMimic, which captured the highest hits/sec and reqs/sec per $1000. By a large margin. Interestingly enough, the only Linux-based entry, the Swell-1000, didn't do very well. Which goes to show you that a good starting point doesn't guarantee success.

And, of course, the amazingly expensive Cisco products probably (I don't know, just assuming) do a lot more than just cache -- and are probably a lot more reliable (MTBF) and redundant, which is important if your cache is a vital business component. (And if cache == internet access, then, well, it probably is).

Re:Cached idea, stale opinions. (Score:1)

by MadAhab ( 40080 ) writes:

Yup, and if you can't provide better arguments, and back them up with reason, then stay anonymous, Coward. Or better yet stay quiet.
Re:TTchorus: site gets slashdotted, why not cache (Score:1)

by spencerogden ( 49254 ) writes:

Rob has a response to this in the FAQ.
Re:Accelerate your website -- it's awesome! (Score:3)

by Doc Hopper ( 59070 ) writes: on Tuesday February 29, 2000 @09:12PM (#1236667) Homepage Journal

Basically, reverse proxy caching works by you hijacking connections to your webservers from the outside world. IP Policy-based routing is the easiest example to understand, and is the method we use at Excite@Home E-Business, so I will detail it.
A connection is destined for "www.excitestores.com", and ends up at the external DS/3 (T3, T1, insert your fast link here) port on our router. The router runs a rule against the packet and says "Hey, this is www traffic bound for the servers that are to be accelerated. Therefore my next hop is (insert IP address of cache here)!". It route-maps it to the cache server as its next hop. The caching server is set up to "hijack" any incoming connections as if they are destined for itself, and makes the request to the origin web server on behalf of the requesting client. At this point, this does not differ too much from standard forward transparent proxying, except that you normally have an access control list that only permits transparent proxying of a limited set of URLs or IP addresses. You don't want to run an "open proxy" for the world to use to cache whatever they want.
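
The flow described above (accept the connection, check it against an access control list, then answer from the cache or fetch from the origin) can be sketched as a toy model. This is only an illustration of the idea, not how Squid, ICS, or NetCache are implemented; the class, the `fake_origin` stand-in, and the rejected host are hypothetical.

```python
class ReverseProxyCache:
    """Toy accelerator: answers for a fixed set of hosts, caching origin replies."""

    def __init__(self, accelerated_hosts, fetch_from_origin):
        self.accelerated_hosts = set(accelerated_hosts)  # the access control list
        self.fetch_from_origin = fetch_from_origin       # callable(host, path) -> body
        self.store = {}
        self.origin_requests = 0

    def handle(self, host, path):
        if host not in self.accelerated_hosts:
            return 403, None  # refuse to act as an open proxy
        key = (host, path)
        if key not in self.store:
            # Cache miss: ask the origin on the client's behalf and keep the reply.
            self.origin_requests += 1
            self.store[key] = self.fetch_from_origin(host, path)
        return 200, self.store[key]


def fake_origin(host, path):
    # Hypothetical stand-in for the real web server.
    return f"content of {path} from {host}"


cache = ReverseProxyCache({"www.excitestores.com"}, fake_origin)
first = cache.handle("www.excitestores.com", "/logo.gif")
second = cache.handle("www.excitestores.com", "/logo.gif")
rejected = cache.handle("example.org", "/")
```

Only the first request for `/logo.gif` reaches the origin; the second is served from the store, and the host outside the list is refused.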
Of course, note here that there are alternate methods of accelerating sites depending on the cache you choose and your infrastructure. The basic idea is to get the packets to your cache instead of the web server, however you choose to do it. Common methods include placing the cache in the natural route of the packets, making the webserver address point to the cache and having a non-public DNS that the cache consults to resolve a web site on a non-routable private network, or specifying on the cache that incoming connections on a certain IP are to accelerate a particular origin server.
Anyway, the benefits of this are enormous in our case. We have a (*&$load of modules compiled into our Apache server, tons of virtual hosts and modules to handle them all, and each daemon runs about 12 MB. Each web server has a gigabyte of RAM, therefore you do the math:
1024/12=85 and 1/3 connections run us out of physical RAM on each web server. Realize this is a rough estimate; our web servers can handle much more, but performance degrades quickly with more connections being served from virtual memory. I've also not taken into account OS overhead, other services running on the servers, and any other thing you may think of. However, modem users, particularly, saturate web server connections because it is so slow to deliver objects to them.
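
The rough estimate above, written out as a calculation. The figures (1 GB of RAM, about 12 MB per Apache daemon) are the ones quoted in the comment; the function name is made up, and like the original it ignores OS overhead and other services.

```python
def connections_before_swapping(ram_mb, process_mb):
    """Whole server processes that fit in physical RAM, ignoring all overhead."""
    return ram_mb // process_mb


per_server = connections_before_swapping(1024, 12)
```

That gives 85 whole processes per web server before the next one has to be served out of virtual memory.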
CNN.com, for instance, uses ICS caching boxes purely for connection management to handle these slower connections that could bog their servers down. Novell's ICS is rated at over 100,000 simultaneous connections on each box in reverse proxy mode. A big difference from 85 connections for one machine, no?
I'd love to discuss this in more depth, if you require a better answer. Better yet, check the FAQ at Squid's site [squid-cache.org] regarding transparent reverse proxying.
Seriously, this is what takes web sites to the next level, regardless of whether you use Squid, ICS, NetCache, or another type of reverse cache. Keep smiling!

Accelerate your website -- it's awesome! (Score:5)

by Doc Hopper ( 59070 ) writes: on Tuesday February 29, 2000 @06:55PM (#1236668) Homepage Journal

I've been overseeing a caching (really, website acceleration) project for my company, Excite@Home E-Business Services, over the last three months. I can personally say that the three I've had experience with, Novell's ICS caches (which comprised ten of the twenty entrants), Network Appliance's NetCache, and Squid (on Solaris, in our case) all rock. Squid 2.3-stable1 was a dream to compile, install, and configure. ICS has a few user interface quirks with their Java administration tool that I don't like, but except for Cisco's cache (Oh My Gosh do you really want to spend $150,000 on a CACHE???) ICS-based systems captured many of the top honors in this roundup. Network Appliance's NetCache is also a nice choice, and as the only vendor with streaming media caching/splitting support, they are receiving a lot of attention recently.
It's really important to note that IRCache has no desire to point to any "winner" in this bakeoff, but instead to have real non-partisan numbers to point to when evaluating cache performance. Squid captured top honors in cache hit ratios, but nothing else (AFAICT), showing that those "expensive, proprietary systems" also can be very well-tuned operating systems that eliminate traditional OS overhead for these numbers.
One of the frequently overlooked uses of cache is as a web site accelerator, instead of the standard forward proxy. Using a few simple access control lists and a policy on a router, reverse-proxy caches managed to reduce the instantaneous load on our web servers by up to 94%. We serve about 3.5 million hits a day. A "reverse proxy" is an EXCELLENT use of a proxy cache, and after these technology evaluations I've been involved with in past weeks I'd recommend it to anybody considering running a high-traffic website. This allows your Apache servers to function more as the "cgi engine" of your site, and lets the static images, text, banners, etc. be delivered from a box that can handle 100,000 simultaneous connections. Very cool.
While I'm not allowed to post a "review" of any one of these units, because of various agreements for the evaluation boxes we tested, I can clearly state that Squid, NetCache, and ICS-based systems can and will vastly reduce infrastructure scalability costs for businesses when deployed in a reverse-proxy configuration. Our earlier estimates guessed we'd need to expand our web farm three times to handle our estimated load by the end of the year. Now we can reliably predict that our farm can serve 10 times the amount of hits we're running now by using a cache as an accelerator. VERY cool stuff.
Be sure and check out the system configurations in the bakeoff review. It's very illustrative that the boxes tested have VERY specific audiences. Don't be fooled by the "fastest hit response time" or "most throughput" -- you can spend $6,000 or $150,000 for any setup, depending on your needs.
Noticeably absent from the review was Inktomi, for the second year in a row. I'm hearing FUD from vendors that their performance isn't up to snuff-- any truth to these rumors?

Re:Wow! Look at that ICS stuff! (Score:1)

by mrfantasy ( 63690 ) writes:

Actually, ICS's cache engine has little to do with BorderManager's, and BM's is still available as part of BorderManager.

ICS is designed for people who don't know NetWare--it's a NetWare kernel with the ICS stuff on top of it. Say what you want about the NetWare file system, it's pretty fast when tuned for stuff like this.
Re:Can anyone explain that to me? (Score:1)

by bugg ( 65930 ) writes:

Actually, google has been switching over many of its boxen from Linux to FreeBSD.
On a side note, Google is going the whole nine yards embracing BSD- they are considering setting up a BSD-specific search engine, not unlike their current linux engine. (I've talked to a guy from google about this, more at http://daily.daemonnews.org/view_story.php3?story_id=562)

So, they aren't really a linux or a FreeBSD camp. Currently, they are both.
Re:FreeBSD & load (Score:2)

by bugg ( 65930 ) writes:

As a desktop machine, the differences in speed shouldn't be too noticeable for either side- when the VM system doesn't have to swap out pages (read: most desktop work on today's computers) the kernel isn't too stressed for I/O or CPU time, hence the system load isn't that high.
FreeBSD's VM system has been tweaked, fiddled with, and rewritten for 4.0 (by Matt Dillon) for efficient swapping. It writes idle pages out to swap when there is free I/O, even if there is still physical RAM available. If a sudden demand for pages comes up, it can simply discard one of those pages from RAM, since a copy already exists on swap, and reuse the frame, so the system isn't stressed into swapping out like mad right when it needs the memory.
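
The benefit claimed for writing idle pages out ahead of time can be shown with a toy count of how many swap writes are needed when memory is suddenly in demand. This models only the idea as described in this comment; it is not a description of the actual FreeBSD VM code, and the page counts are invented.

```python
def writes_needed_to_free(pages, demand):
    """Count swap writes required to free `demand` pages right now.

    pages: list of booleans, True if the page already has a copy on swap.
    Pages with a swap copy are reclaimed first because they cost no I/O.
    """
    if demand > len(pages):
        raise ValueError("demand exceeds resident pages")
    precleaned = sum(1 for has_copy in pages if has_copy)
    return max(0, demand - precleaned)


lazy = [False] * 8                 # nothing written out ahead of time
eager = [True] * 6 + [False] * 2   # idle pages copied to swap during quiet I/O

lazy_cost = writes_needed_to_free(lazy, 5)
eager_cost = writes_needed_to_free(eager, 5)
```

In this model the lazy system must do five writes at the worst possible moment, while the eager one frees the same five pages with none.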
I don't think Linux has preemptive swapping, and if it does it is new, and I'm doubtful that it is as mature.
Try putting the boxes under serious load and try again.
Very true (Score:1)

by Peter Eckersley ( 66542 ) writes:

Yes, this is true.
Hence, making khttpd a non static web server would be a very foolish thing to do, and would mangle system stability.
Static web serving is not problem (once you debug the code).
Re:Very true (Score:1)

by Peter Eckersley ( 66542 ) writes:

>> Static web serving is not [a] problem (once you debug the code).

>Nothing is a problem once you debug the code.

Oh, right. So all programming tasks are of equal difficulty then? :)

Perhaps we *should* be using a microkernel, but until we are, anything that goes into the kernel should be very robust. There's no reason why a static http server can't be robust, and there's every reason why a dynamic one (or an entire graphics subsystem, as per NT) is going to cause huge headaches.
Re:I don't think Squid won (Score:1)

by Twid ( 67847 ) writes:

[disclaimer: Novell employee]

Most of the top winners used Novell's Internet Caching System, a customized version of Netware specifically designed for caching appliances. What really gives ICS a lot of speed is COS, the Cache Object Store, which is a specialized file system designed for caching. Much of the overhead of a traditional file system (file integrity checks, etc...) isn't required for caching.

In addition, Netware is an awesome network traffic processor. We don't use the same threading model as *nixes. So, a fast file system and a fast network response = an awesome caching appliance.

No slam on BSD, they are doing great with generic software, but because of the way Netware is architected, we're just faster.

-Todd
Re:My boss will love this article. (Score:1)

by mahone ( 69636 ) writes:

> BSD is great for Squid because of the excellent stack and reliability (and it's the platform of choice for Duane and most other lead developers), but Linux is better if you want performance. Async IO is only available under Linux and Solaris
Have a look at the man page for the FreeBSD mount command [freebsd.org] and say that (search for async while there). For those occasions when you don't mind flying by the seat of your pants, you can of course have async writes. It isn't a new option either. And there's always softupdates too.
I suppose it wouldn't be so easy to win arguments if people actually checked their assumptions...
Re:Oh wow, John Carmack talked, everyone be quiet. (Score:2)

by Inoshiro ( 71693 ) writes:

Let me clarify.

Somehow, a moderator decided that:
"Nothing is a problem once you debug the code."

A true, if somewhat tongue in cheek comment, is being a troll, but:

"Go work on your games or something."

ISN'T!

This reminds me of a chemistry class I once took.
"Nothing is a problem once you've got the balanced reaction equation."
"Go read a book or something."

Why do the moderators (or perhaps this one moderator) sanction trolling behaviour, and dump on a genuine statement?

See you all metamoderating...
---
Wow! Look at that ICS stuff! (Score:1)

by haggar ( 72771 ) writes:

Novell's ICS did pretty damn well! I still remember when ICS was part of BorderManager. It showed very good potential, and was incredibly flexible. You can't find such a configurable caching solution anywhere!
Now, I believe ICS has stripped down configurability, but upped the performance.

Good job, Novell!
Re:What in the fuck ... (Score:2)

by haggar ( 72771 ) writes:

ICS (as I said in another post) used to be an integral part of BorderManager. Novell decided to strip off the admirable manageability of BorderManager, but improved the performance, and that's ICS. Of course, BorderManager had many more features, and it's still sold separately, but ICS is sold to VARs like IBM and Compaq.

Look at the performance the Compaq box is sporting! It's purely amazing!! Obviously, Compaq's many years of cooperation with Novell, and the many NetWare drivers they have developed, helped them with the ICS appliance, too. Let me remind you that ICS actually runs on NetWare (but without NDS).
Copyright problems was one of them (Score:2)

by Duxup ( 72775 ) writes:

If I remember correctly, shortly after Andover's IPO Malda noted a few reasons why they don't do it. Granted, I don't remember all of them, but one was over copyright concerns. Copying content off of other sites can get ugly, I figure, especially when they find out you have a big company with some cash behind you, like VA, I would think.
Re:/.tted--copyright not issue (Score:2)

by Duxup ( 72775 ) writes:

Malda used it as one of his reasons for not doing it back when he was answering such questions. I can't see how copyright would not be the issue; it's a pretty blatant violation when you start copying other websites' content and stick it on yours.

Just because Google hasn't been sued doesn't mean copyright is not the issue, although it would be interesting to see someone try that one in court :-)
"Well how can I be at fault? Nobody has sued Google yet!?"
Re:FreeBSD & load (Score:1)

by gharikumar ( 87910 ) writes:

I find it hard to believe that linux can be so slow. I have run linux on even slower boxes, e.g., a 486DX100 with 48MB RAM. Even running X and KDE, the response was nowhere near as slow as what you claim to have experienced. Not that I am doubting your words, but I would seriously look to see if there is anything wrong with your hardware or configuration.

I personally have not found any difference between Linux and FreeBSD, at least as a desktop OS. They both respond with indistinguishable speeds. They are both very stable. I have not used either as a server, but have many friends who have, though not under extreme conditions.

That being the case, I think that FreeBSD's perceived superiority is a myth. I feel that, for all practical purposes, you can use whichever you prefer with no performance penalty whatsoever.

If anyone has any pointers to studies that stack up a recent linux kernel against a recent FreeBSD kernel and prove or disprove my belief, I would love to see them. Thanks very much.

Hari.
Re:PFFT, so What (Score:1)

by mr ( 88570 ) writes:

Why is it moderated Downward?

If someone posted:

Microsoft rules over Linux.
or perhaps
FreeBSD rules over Linux.

Would you consider that a 'dissenting opinion and the truth'.

If you care about shipping and buyable voice to text systems, then yes M$ rules over Linux.

If you think the GPL licence is a bad licence, and the BSD licence is better, then yes, FreeBSD is the OpenSource ruler over Linux.

In the case of these 2 examples, Linux is the loser.

Rather than spending your time whining about moderation, why don't you spend your time writing some code, or at least work on extracting your head from your ass.
[Net]BSD+LFS (Score:1)

by T-Punkt ( 90023 ) writes:

I wonder if NetBSD with its LFS instead of FFS for cache directories can boost up the results (LFS does faster writes).

I always wanted to give LFS a try on our production webcache (squid as well) since I've read some documents about it---too bad that LFS isn't mature enough yet in -current and hence probably won't be in 1.5 either:-(
Re:One Word for You (Score:1)

by T-Punkt ( 90023 ) writes:

It still doesn't make reverse cache obsolete since a reverse cache can accelerate a website by caching some stuff which is generated dynamically.
Oh, you do! (Score:1)

by T-Punkt ( 90023 ) writes:

> I do not use BSD. Ever.

Too bad that you are already using BSD without
knowing it:

1. Parts of BSD are built into nearly every other OS which supports the internet protocols: Windows, Linux, Solaris, BeOS...
The "Sockets" interface to network protocols that all those OSes offer is a BSD development

2. Many, many Routers run on BSD derived systems

3. Many Nameservers run on BSD systems, the Berkeley Internet Name Daemon aka BIND was spun off from BSD.

4. Some of the pr0n-server you've visited yesterday run on BSD

5. ...

It's absolutely impossible to use the internet without using BSD.

It's absolutely no problem to use the internet without touching Microsoft or Linux.
Incredible Funny! (Score:1)

by T-Punkt ( 90023 ) writes:

The same comment posted twice gets
a) +1 Insightful
b) -1 Troll

ROFLMAOPIMPTIME

Hey, trollking, a little tip for you how to
get a "+5, Informative:"

Next time try:
"I would just like to voice my support for Linux. It is the best OS ever, in my humble opinion."
Can anyone explain that to me? (Score:1)

by T-Punkt ( 90023 ) writes:

I've downloaded the page (www.freebsd.org) which google gave me as 4th and:

% grep -i "who\|rules\|world" index.html|wc -l
0

Then I've downloaded the cached page from google as well:

% grep -i "who\|rules\|world" cache.html | wc -l
0

OK, "who" is ignored, says google. But that page doesn't contain *any* of the search keys except "the".

I don't understand it, why does google give such bogus results?

(AFAIK Google is Linux-powered so this can't be a FreeBSD conspiracy :-)
Why? (Score:1)

by Nonesuch ( 90847 ) writes:

I need proxying to protect a large internal network from 'exposure' to the outside world, and to make efficient use of all that bandwidth.
Actually, I chose OpenBSD over FreeBSD (or any other OS), for the same reason I chose the 2940UW over another SCSI chipset-
It may not be the latest and greatest cool technology, it may not be the fastest, but I know, from personal experience, that I can rely on it.
The drives and controllers are relatively inexpensive, so I can afford to keep spares on hand, and when the current solution becomes overloaded I can easily scale it up.
More detail on this in the message 'Distributed proxies' elsewhere in the thread.
My boss will love this article. (Score:3)

by Nonesuch ( 90847 ) writes: on Tuesday February 29, 2000 @06:57PM (#1236690) Homepage Journal
I just spent a hefty chunk of company money building a pair of killer OpenBSD+Squid boxes as a load-balanced caching proxy system.
When I spec'd it out, all the techies I talked to asked me three questions, this article validates my answers to all three-
- Why BSD instead of Linux?
- Why SCSI instead of IDE?
- Why RAIDframe instead of one huge disk?
My answer to each was two parts:
1. Because I know I can rely on the technology
2. It scales well.
Semi-Off-Topic
What do I mean by a 'Killer caching proxy'?
A pair of identical (load balancing and transparent failover via BigIP) rackmount servers, each with a PIII 600 CPU, 256MB, 2940UW and 20GB of disk. And let's not forget the triply-redundant T3s to three distinct Tier-1 internet providers.
All this just so I can read slashdot.
Distributed caches with 'proxy.pac' (Score:3)

by Nonesuch ( 90847 ) writes: on Tuesday February 29, 2000 @09:01PM (#1236691) Homepage Journal

One reason I like squid is that it makes it easy and inexpensive to build a hierarchy of distributed caches. Just take any ancient PC, load a free OS, and put it where it can help alleviate congestion.
I've done a lot of work with 'proxy.pac [squid-cache.org]' files in the last year- it's amazing how much decision-making power you can put into the autoproxy script, letting the client machine take on some of the responsibilities of smart proxying.
For example, right now I have two distinct sites with their own Squid proxies, users at both sites use identical 'proxy.pac' files. The browser decides whether to go direct or via a proxy based on the host/domain of the destination, then chooses a proxy based on its own source IP address.
This means that every Netscape and IE browser in the enterprise has the same configuration, and even roaming users will always get their closest proxy server each time they connect.
If a business unit later gets their own internet firewall and proxy, it takes a line or two in the global script, and clients automagically use the new proxy.
You can also specify multiple proxies in the file- if the first one times out, all future requests (until the browser is restarted) will go to the next server in the list.
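
The decision logic described above can be modelled in a few lines. A real proxy.pac file is JavaScript (the browser calls its `FindProxyForURL` function); the Python below only mirrors the logic so it can be run and checked, and every subnet, hostname, port, and domain suffix in it is a hypothetical placeholder.

```python
import ipaddress

# Hypothetical site layout: client subnet -> nearest proxy first, then the fallback.
SITE_PROXIES = [
    (ipaddress.ip_network("10.1.0.0/16"),
     ["proxy-a.example.com:3128", "proxy-b.example.com:3128"]),
    (ipaddress.ip_network("10.2.0.0/16"),
     ["proxy-b.example.com:3128", "proxy-a.example.com:3128"]),
]
INTERNAL_SUFFIX = ".corp.example.com"  # hypothetical internal domain


def find_proxy(dest_host, client_ip):
    """Return a PAC-style answer string for one request."""
    if "." not in dest_host or dest_host.endswith(INTERNAL_SUFFIX):
        return "DIRECT"  # internal destinations bypass the proxies
    address = ipaddress.ip_address(client_ip)
    for network, proxies in SITE_PROXIES:
        if address in network:
            # Ordered list: the browser tries the next entry if the first times out.
            return "; ".join(f"PROXY {p}" for p in proxies)
    return "DIRECT"  # unknown location: no proxy assumed


site_one = find_proxy("www.slashdot.org", "10.1.5.9")
site_two = find_proxy("www.slashdot.org", "10.2.0.7")
```

Adding a proxy for a new business unit then amounts to one more entry in the table, which matches the "line or two in the global script" described above.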
Now if only Lynx would parse the (javascript) proxy.pac file...

Where exactly does Netcraft say that? (Score:2)

by Walles ( 99143 ) writes:

AFAIK the only thing Netcraft says is that Apache rules the web with 60% market share. Could you please provide a link to where they say that "thousands of ... companies have switched their Web sites from Sun platforms to Windows"?

Thank you //Johan
Re:Architecture (Score:1)

by SwellJoe ( 100612 ) writes:

I/O driven means that it relies on a select or poll loop to wait for file descriptors to be ready for processing.
The async io compile option allows what you've described. And it does wonders for performance. The Swell results from the bakeoff compared to our current results published on our page are a good example of this (we rated 77 req/sec at the bakeoff...we're testing 110 req/sec on less hardware in the lab right now). The difference was that we were experiencing problems (system lockups) with async io compiles at the bakeoff, so to simply get a solid run in we ran without async...performance was less than ideal because of it.
Those issues have been resolved, and the async io builds are significantly faster.
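
For readers unfamiliar with the select loop mentioned at the top of this comment, here is a minimal single-pass sketch. It is not Squid's code; the function name is made up, and two local socket pairs stand in for client connections.

```python
import select
import socket


def serve_ready_sockets(sockets, timeout=1.0):
    """One pass of a select loop: read from every socket that has data waiting."""
    readable, _, _ = select.select(sockets, [], [], timeout)
    return {sock.fileno(): sock.recv(4096) for sock in readable}


# Two connected pairs stand in for client connections; only one has sent data.
busy_client, busy_server = socket.socketpair()
idle_client, idle_server = socket.socketpair()
busy_client.sendall(b"GET / HTTP/1.0\r\n\r\n")

received = serve_ready_sockets([busy_server, idle_server])

for s in (busy_client, busy_server, idle_client, idle_server):
    s.close()
```

A single process handles whichever descriptors are ready and never blocks on the idle one, which is the sense in which the server is "I/O driven".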
Re:My boss will love this article. (Score:1)

by SwellJoe ( 100612 ) writes:

Not if performance counts.
BSD is great for Squid because of the excellent stack and reliability (and it's the platform of choice for Duane and most other lead developers), but Linux is better if you want performance. Async IO is only available under Linux and Solaris and it makes a HUGE difference (look at the Swell results at the bakeoff without threads and our results more recently with threads--77 reqs/sec at the bakeoff, 110 running in our labs now--from a $2139 dual IDE drive box! Performance is on the Linux side).
Recent benchmarks of a very tweaked linux/squid box are posted at the swell technology website doing 100 req/sec.
Most WERE above average... (Score:1)

by SwellJoe ( 100612 ) writes:

I was the tech at the bakeoff for Swell's entry, and after having met everyone during the first week and taking a look at the entries, I have to say the vendors at this bake-off were pretty much all above average.
To find the below average vendors you'll need to look to the folks who didn't show up this time.
While I have an obvious interest in promoting the Swell entry (which did quite well...but not as well as we expected due to some unresolved bugs), I spent a lot of time talking to the other vendors' techs. There were some very smart people pushing very good products at this bake-off. I wouldn't hesitate to recommend (for customers that need more than our boxes provide and can spend the extra dough, of course) many of the products tested.
I know it sounds rather flowery to say that "All the girls in the pageant were very pretty", but in this case, I think the bake-offs are really separating the wheat from the chaff. Look at previous bakeoff numbers and prices and compare to this time around. Keeping in mind that this bake-off's workload was MUCH harder than previous workloads (the Polymix-1 or Datacomm-1 workloads found in previous comparisons), even so the price/performance has improved markedly from all vendors. And the price/performance also-rans from previous events just didn't show up this time.
That's why the polygraph guys deserve such praise. They allow cache users to really know what they are buying. And the companies that don't show up or don't provide a good value just won't sell as many boxes. (And I strongly recommend against buying an untested cache product...there are some real stinkers out there and you don't always get what you pay for.)
Re:My boss will love this article. (Score:1)

by SwellJoe ( 100612 ) writes:

Nice try. Look into the squid source and FAQ. Kernel threads are required for an async io compile of squid. I'm not knocking FreeBSD by any means. I've had to learn to love it while doing these benchmarks (the testing environment only runs properly under BSD). It's a great OS.
But squid currently performs better under Linux, when proper tweaks are made. If you can make a BSD box do 110 reqs/sec from a K6 and 2 IDE hard disks with a stable version of squid, then I'd love to hear about it. It simply isn't going to happen currently. It's not even easy to do with Linux.
When the new squid filesystems come online, all the rules may change. But currently linux is the speed king for Squid.
Look at the Swell entry, instead... (Score:2)

by SwellJoe ( 100612 ) writes:

I think it's more fair to look at the Swell entry (also using Squid, except on Linux). Its price is very similar, though it has much beefier hardware.
You should further look a little deeper into the results. Microbits' box was only caching about 44% of web traffic and getting rather slow response times. So while they got 120 reqs/sec, no sysadmin in their right mind would push that box that hard. To compare apples to apples with the Squid entry or the Swell entry (both had nearly ideal cacheability and excellent response times) you should think of the Microbits box as being more along the lines of 95 or 100 reqs/sec.
To see Squid results in more favorable light, check out the more recent results on the Swell web page:
http://www.swelltech.com [swelltech.com]
Our test box at the bake off was having fits using async io...so we disabled it in order to get a clean run. However, performance suffers markedly without it. Those async issues have been resolved...Our boxes are running in our labs at 110 reqs/sec right now (we have a 100 reqs/sec run benchmark online...you can note that response of squid is still excellent at that load).
Anyway, given the proper tweaks, Squid can really scream on a low priced box. (Our $2139 unit is the one included in the bakeoff and our more recent benchmarks.)
Look a little closer at the numbers people! (Score:4)

by SwellJoe ( 100612 ) writes: on Wednesday March 01, 2000 @02:13AM (#1236698) Homepage

I've seen several comments about Squid not being the best, or what have you. Squid made an admirable showing at the bakeoff. If you only look at price/performance you are not seeing the whole picture, or even most of it.
Squid showed perfect cacheability (why buy a cache except to cache?), whereas some others in its price range (except the Swell box also running squid) displayed much lower cacheability. Response times from a lot of boxes were not so good either, while squid's was excellent (the other reason to cache...browsing speed). When you see a box with long response times and low cache hit rate, you are looking at a box that was being pushed WAY too hard. You would not run a cache with 30 or 40% DHR and mean response times of 2 seconds...ideally, you run it such that cacheability is near perfect and response times are very very fast. Squid did that. Microbits didn't.
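
The two measures discussed above, document hit ratio (DHR, hits divided by requests) and mean response time, are easy to compute from a request log. The sketch below uses made-up samples purely to show the calculation; none of the numbers are bake-off measurements.

```python
def summarize(requests):
    """requests: list of (was_hit, response_seconds) tuples."""
    hits = sum(1 for was_hit, _ in requests if was_hit)
    hit_ratio = hits / len(requests)
    mean_response = sum(seconds for _, seconds in requests) / len(requests)
    return hit_ratio, mean_response


# Made-up samples, not bake-off data: hits are fast, misses wait on the origin.
comfortable = [(True, 0.05)] * 55 + [(False, 2.5)] * 45
overloaded = [(True, 0.5)] * 30 + [(False, 3.0)] * 70

comfortable_dhr, comfortable_mean = summarize(comfortable)
overloaded_dhr, overloaded_mean = summarize(overloaded)
```

In this invented example the overloaded box has both a lower hit ratio and a longer mean response time, which is the pattern the comment warns about when reading raw reqs/sec figures.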
The Squid team have done a great job with Squid, and it gets better every time around. Even compared to the ICS products (many of which are very very fast these days...but you pay the price for them...ICS on low end boxes suffers a bit), Squid didn't do so bad at all.
Anyway, if you'd like to see some more Squid numbers, we've got a $2139 squid box in the lab doing 110 reqs/sec from dual IDE drives, whereas the Squid team got 160 from a $4k box with 6 SCSI 10k drives. We will be posting pretty specific specs for it sometime in the future so that others who want to roll their own can do so (it takes a lot of work). Some of our recent benchmarks (using Bake-off rules and benches) are posted on the Swell Technology web page. Currently, the posted benches are for a run at 100 reqs/sec. The 110 run will be posted sometime soon.
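
The price/performance comparison implied above can be worked out directly from the figures quoted in this comment ($2139 at 110 reqs/sec versus roughly $4k at 160 reqs/sec). The function name is made up, and the $4000 figure is the comment's rounded "$4k", so treat the second result as approximate.

```python
def reqs_per_thousand_dollars(reqs_per_sec, price_dollars):
    """Throughput bought per $1000 of hardware."""
    return reqs_per_sec / (price_dollars / 1000)


swell_lab = reqs_per_thousand_dollars(110, 2139)    # lab figure quoted above
squid_team = reqs_per_thousand_dollars(160, 4000)   # "$4k" taken as $4000
```

That works out to roughly 51 reqs/sec per $1000 for the lab box against 40 for the Squid team's entry, with the caveat that the 110 reqs/sec figure is a lab result rather than a bake-off run.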
Those interested in caching should check out the squid devel list lately. Discussion has centered on a couple of new filesystem ideas that should improve squid performance markedly. Fascinating stuff. I suspect the ICS guys will be a little more worried come next bake-off.
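For anyone who wants to check the arithmetic behind the two boxes mentioned above, here is a rough sketch in Python. The prices and request rates are the ones quoted in this post ("$4k" is taken as 4000, which is an approximation), and the function and dictionary names are made up for illustration. It only covers dollars per request/sec, which, as argued above, is not the whole picture.

```python
# Figures quoted in the post; "$4k" is treated as 4000 (an approximation).
boxes = {
    "Swell box (dual IDE)": {"price_usd": 2139, "reqs_per_sec": 110},
    "Squid team box (6 SCSI 10k drives)": {"price_usd": 4000, "reqs_per_sec": 160},
}


def cost_per_request_rate(price_usd, reqs_per_sec):
    """Dollars of hardware per sustained request per second."""
    return price_usd / reqs_per_sec


# Hypothetical helper names; nothing here comes from the bake-off tooling.
costs = {name: cost_per_request_rate(**spec) for name, spec in boxes.items()}

for name, cost in costs.items():
    print(f"{name}: ${cost:.2f} per req/sec")
```

Hit ratio and response time would need the actual bake-off logs, so they are left out here.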

Re:Very true (Score:4)

by John Carmack ( 101025 ) writes: on Tuesday February 29, 2000 @08:03PM (#1236699)

> Static web serving is not problem (once you debug the code).

Nothing is a problem once you debug the code.

John Carmack

Re:Architecture (Score:1)

by Serveert ( 102805 ) writes:

Yes it will indeed do wonders - optimally you should have N+1 threads where N = # of processors.

BTW - thanks to the Linux Scalability Project [umich.edu], the 2.3 Linux kernel will perform asynchronous I/O very efficiently. Netscape is 100% responsible for this - their IMAP server uses the same asynchronous architecture, so they patched the kernel for their IMAP server under the guise of this project.
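To make the N+1 rule of thumb concrete, here is a minimal sketch using Python's standard thread pool. It only illustrates the sizing rule; it is not a description of how Squid or the Netscape server is built, and `handle` is a hypothetical stand-in for servicing one request.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def pool_size(cpu_count=None):
    """Return N + 1, where N is the number of processors."""
    n = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return n + 1


def handle(request_id):
    # Hypothetical placeholder for real request handling.
    return f"handled {request_id}"


def serve(request_ids, cpu_count=None):
    """Service requests on a pool sized by the N + 1 rule."""
    with ThreadPoolExecutor(max_workers=pool_size(cpu_count)) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(handle, request_ids))


results = serve(range(5))
print(results)
```

The idea behind the extra thread is that one worker can be blocked on I/O while the other N keep the processors busy; whether that holds depends on the workload and the runtime.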
Rob - you learn anything from this article? (Score:2)

by Chagrin ( 128939 ) writes:

The thing that gets me is when someone builds a web server that runs on steroidal (expensive) hardware when just a little bit of intelligence put into the system will do the trick. So many times people miss the obvious elements:
* Use a preprocessor like PHP instead of basing everything on CGI
* Don't use Apache unless you really need to. Smaller servers like thttpd or Boa will often supply everything you need, and are much more lightweight and much faster
* Use a web accelerator like Squid
* If you *must* use CGI, see if you can't implement it with something like fast-cgi. Especially with Perl!
And of course, I'm sitting here posting on a web site that hasn't implemented any of the four. Slash's code is absolutely frightening -- all the scripts use the same humongous module (Slash.pm), which uses DBI and *gulp* Date::Manip. And you wonder why the site gets slow!
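As a rough illustration of what a web accelerator buys you, here is a toy in-memory cache sitting in front of a slow origin. This is a sketch of the general idea only, not Squid's implementation or configuration; `AcceleratorCache`, `slow_origin`, and the 60-second TTL are all assumptions made up for the example.

```python
import time


class AcceleratorCache:
    """Tiny in-memory stand-in for a caching reverse proxy."""

    def __init__(self, fetch_from_origin, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # url -> (expires_at, body)
        self.hits = 0
        self.misses = 0

    def get(self, url):
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and entry[0] > now:
            # Fresh copy in the cache: the origin is never touched.
            self.hits += 1
            return entry[1]
        # Missing or expired: ask the origin and remember the answer.
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = (now + self._ttl, body)
        return body

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


origin_calls = []


def slow_origin(url):
    # Hypothetical origin; imagine a heavy CGI script running here.
    origin_calls.append(url)
    return f"<html>page for {url}</html>"


cache = AcceleratorCache(slow_origin, ttl_seconds=60.0)
for _ in range(4):
    cache.get("/index.html")

print(len(origin_calls), cache.hit_ratio())
```

Four requests for the same page reach the origin once; the rest are answered from memory, which is the whole point of putting a cache in front of expensive scripts.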
squid bake off? Yum! (Score:2)

by anotherone ( 132088 ) writes:

Yummy... squid is really good fried, as long as you don't dry it out. Marinating it works OK sometimes, but here's a tip: don't use soy sauce. It's mostly salt, and remember what happened when you salted slugs as a child? As for daemons, I think they would be too stringy to really serve as a good entree, but they might make a good shish kebab; you could use its little pitchfork thingy.

Make Seven
Caching vs. Akamai-type services (Score:2)

by rambone ( 135825 ) writes:

What is your opinion on caching vs. Akamai? It would appear that, if properly implemented, Akamai-style services can make all of this caching infrastructure obsolete.
Almighty squid (Score:4)

by wholesomegrits ( 155981 ) writes: <wholesomegrits@mchs[ ]om ['i.c' in gap]> on Tuesday February 29, 2000 @06:47PM (#1236719)

While the report is concerned with performance, the greatest aspect of Squid is its ability to transform even a crappy computer into one hell of a proxy.
Way too many times open source software is dismissed as sort of a dull knife -- it gets the job done, but doesn't do it in an elegant or efficient way. Take Apache for example: how many people rag on Apache because of its focus on compatibility vs. its speed?
For Squid, I can't honestly think of a better overall proxy software. If www.proxymate.com can handle the massive amount of traffic it does running Squid on Linux, all but the most stump-headed ignoramuses would realize that businesses needn't drop a couple thousand $$ on a specialized platform.

Re:PFFT, so What (Score:2)

by z___987 ( 158296 ) writes:

The WebStone benchmark tests were originally developed by Silicon Graphics to measure the performance of Web server software and hardware products. WebStone 2.0.1 is a more portable version of the original WebStone benchmark which added support to use Windows NT systems as client test systems.
