No OS is immune to fragmentation. On a data store disk with ext3 and tons of files in the 5M range, this is what happened (sudo filefrag *):
rt-01n8vmuqn8xtls6d.w4c: 141 extents found, perfection would be 1 extent
rt-01n9q0j59s1sovam.w4c: 23 extents found, perfection would be 1 extent
rt-01nk9zgmitrsow7g.w4c: 8 extents found, perfection would be 1 extent
rt-01nlrr9aaasuk0yb.w4c: 20 extents found, perfection would be 1 extent
rt-01o3kwc33nhpgqg4.w4c: 41 extents found, perfection would be 1 extent
rt-01o3p9b4x2mfbwem.w4c: 16 extents found, perfection would be 1 extent
rt-01ohtzjkl2z2y3wl.w4c: 17 extents found, perfection would be 1 extent
rt-01orb2yYTsp1vALN.w4c: 1 extent found
rt-01orz1hkb5jzbepv.w4c: 29 extents found, perfection would be 1 extent
rt-01q9x02lltcvogr1.w4c: 62 extents found, perfection would be 1 extent
rt-01qq34rl6exztyx3.w4c: 17 extents found, perfection would be 1 extent
rt-01qrz236bvnim44i.w4c: 14 extents found, perfection would be 1 extent
Solution? None. Just add more drives. "Sequential" reads are now at 15M/sec if you balance the load over the raid1 array, it isn't too bad, but if it was an issue I'd take NTFS with its safe and secure online defragmentation API over Linux anytime.
Too bad the price difference between them and others is much too large. I'm not talking about tens of percents, but 2 cent per 1000 views vs the 50ct we currently receive for US traffic. For international traffic, divide by 5 to 10. Advertising revenue is bad enough already, unless you serve millions of pages a month, you're not going to break even. Reducing that by another 90% is plain suicide - it's probably more effective to remove them all and add a donation button if you can take a 90% pay cut.
I really would love to support them, but advertisers just do not want to advertise internationally with the same ad. Even brands like Dell have 30 different versions of their ads, one for each country, and depending on where the visitors are, they get their local version with prices in their own currency and the text in their own language - it simply works better that way. If you serve me US prices for Dell, I still have no idea what the final price is in euros after the import duties, VAT and the price difference of Dell NL vs Dell US, so that makes the ad useless for 70% of the visitors. Project Wonderful can never achieve this with their model. This internationalization is one of the main reasons editors on sites have lost control of the advertisements - there's just no way you (or anyone) can review thousands of ads each day...
I hate how the ads market works and I'd love to see a fix for it, but Project Wonderful isn't it. The market is completely in control of the advertising networks, it's hell for us independent publishers; we just get a check every month, and there's nothing we can do to influence it.
I would love to see such a feature, it would make the life of everyone hosting a advertising revenue dependent site a lot easier. The Slashdot standard answer of Noscript/Adblock/Hostfile doesn't solve anything - there will always be users who don't mind advertisement as much as you do, and it's our job to protect them from harm.
Yes, I run a site that have ad revenue. No, I don't deal directly with the scareware crowd, I sell my space to Google, Right Media, AOL etc. But if they make a mistake and a user gets served a bad ad, I'd love to know from which network it came, so I can demand they take down that ad ASAP and if this is repeated, I will take my business somewhere else.
But browsers just lack that information at the moment, so to report an ad, we ask our users to follow the procedure below:
It will contain a lot of useless info, but somewhere in between, there's the magical <script> tag plus the generated (=bad) content for verification.
On any recent linux system, free reports only really free memory, not page/disk cache. That one is reported in the cached column. (sorry, no pre html tag @
free -m
total used free shared buffers cached
Mem: 48310 48090 220 0 94 15120
-/+ buffers/cache: 32875 15435
Swap: 16383 25 16358
PostgreSQL's performance depends on the page cache, so you can't see all of cached as free - if you let the cached number drop too much, your disks die.
I apologize for this automatic reply to your email.
To control spam, I now allow incoming messages only from senders I have approved beforehand.
If you would like to be added to my list of approved senders, please fill out the short request form (see link below). Once I approve you, I will receive your original message in my inbox. You do not need to resend your message. I apologize for this one-time inconvenience.
Click the link below to fill out the request:
There's no way I'll waste my time filling in that form, so I've added big warning on the registration page now - sorry users of a overzealous ISP, please disable your spam filter if you can or just use another email address to register from.
The same goes for ActiveRecord. It's great in simple cases, but falls apart rapidly when you're developing larger web apps, especially when you're performing complex data retrieval. It gets even worse if you need to optimize that data retrieval. At this point, ActiveRecord becomes a huge pain in the ass, rather than a useful tool.
Hmm, can you explain why? I haven't worked with Rails much, but in Django, if things are too slow like in a really complex query spanning so many tables that the PostgreSQL optimizer chokes, you can hand optimize it easily by either rewriting how you specify it in python, breaking it up into multiple statements. You can also choose to retrieve the data as plain tuples or map/dicts if you need to fetch thousands of rows (100k+, I've no problem with 20k/query the normal way at all). If all fails, plain sql is just 2 statements away, with an easy way to turn the results back into objects.
A good ORM recognizes that there are situations that falls outside of the common/simple use cases and should assist you in the harder things, not work against you.
"If I do not want others to quote me, I do not speak." -- Phil Wayne