Comment Re:Give a raise to overworked programmers (Score 1) 241

Only if you specify the domain as humans. There are far too many insects to make that true if your domain is multi-cellular animal life forms. The problem here is that people seem to be forgetting about having different averages across different domains. To clarify, I think the intended statement was, "many employed *people* make more than your average employed *programmer*." I don't know if that's true. I'm a programmer, and I certainly think I make more than an average American employee. But I'm working for an international company and I'm reasonably certain that some of our programmers in other geographies make much less. Actually technically I may not be considered a programmer any more seeing as how I'm writing designs *for* the programmers. So maybe the statement is accurate.

Comment Wrong questions. More details needed. (Score 5, Informative) 219

You're not asking the right questions:

The first correct question is why on earth would someone need to access half a petabyte? In most cases the commonly accessed data is less than 1%. That's the amount of data that realistically needs to reside on disk. It never is more than 10% on such a large dataset. Everything else would be better placed on tape. Tiered storage is the answer to the first question. You have RAM, solid/flash storage (PCI based), fast disks, slow high capacity disks and tape. Choose your tiering wisely.

The second question you need to ask is how the customer needs to access that large datastore. In most cases you need serious metadata in parallel with that data. For petabytes of data you cannot in most cases just use an intelligent tree structure. You need a web-site or an app to search that data and get the required "blob". For such an app you need a large database since you have 2.5M to 5M objects with searchable metadata (at 100-200MB/blob).
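A quick back-of-the-envelope check of the numbers behind the first two questions, as a rough sketch only. It assumes decimal units (1 PB = 10^15 bytes) and the 100-200MB blob sizes from the original question; the function names are made up for illustration.

```python
# Rough sizing sketch. Decimal units are an assumption (1 PB = 10**15 bytes).
PB = 10**15
TB = 10**12
MB = 10**6

total_bytes = PB // 2  # "half a petabyte"


def hot_tier_bytes(total, hot_percent):
    """Bytes that would have to stay on disk for a given hot-data percentage."""
    return total * hot_percent // 100


def object_count(total, blob_bytes):
    """Number of fixed-size blobs that fit in the data set."""
    return total // blob_bytes


typical_hot = hot_tier_bytes(total_bytes, 1)    # the "less than 1%" case
worst_hot = hot_tier_bytes(total_bytes, 10)     # the "never more than 10%" case
objects_small = object_count(total_bytes, 100 * MB)
objects_large = object_count(total_bytes, 200 * MB)

print(f"hot tier: {typical_hot // TB} TB typical, {worst_hot // TB} TB worst case")
print(f"objects to index: {objects_large:,} to {objects_small:,}")
```

So the disk tier is a few TB to a few tens of TB, and the metadata database has to index a few million objects.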

The third question is why do you have SAN as a premise? Do you want to put a clustered filesystem with 5-10 nodes? Probably Isilon or Oracle ZS3-2/ZS4-4 are your answer.

Fourth question: what are the requirements? (How many simultaneous clients? IOPS? Bandwidth? ACL support? Auditing? AD integration? Performance tuning?)

Fifth question: what availability do you really need? There is no such thing as 100% availability. The term disaster in Disaster Recovery is correctly placed. Set reasonable SLA expectations. If you go for five-nines availability it will triple the cost of the project. Keep in mind that synchronous replication is distance limited. Typically the radius is about 150 miles if you only want to pay a small performance cost; anything beyond that hurts performance a lot.
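To put the SLA discussion in concrete terms, here is a small sketch of how much downtime per year each number of "nines" allows. It assumes a 365-day year and counts all downtime, planned or not; the function name is just for illustration.

```python
# Allowed downtime per year for a given number of "nines" of availability.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; ignoring leap years is an assumption


def downtime_minutes_per_year(nines):
    """Minutes of downtime per year permitted at `nines` nines (e.g. 5 -> 99.999%)."""
    unavailable_fraction = 10 ** -nines
    return MINUTES_PER_YEAR * unavailable_fraction


for nines in (2, 3, 4, 5):
    availability = 100 * (1 - 10 ** -nines)
    minutes = downtime_minutes_per_year(nines)
    print(f"{availability:.3f}% -> {minutes:.1f} minutes of downtime per year")
```

Five nines leaves a little over five minutes a year, which is why it costs so much more than three or four nines.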

Even if you solve the problems above, if you want to share it via NFS/CIFS or something else you're going to run into trouble. Since CIFS was not realistically designed for clustered operation regardless of the distributed FS underneath the CIFS server, you get locking issues. Windows Explorer is a good example since it creates Thumbs.db files, leaves them open and when you want to delete the folder you cannot unless you magically ask the same node that was serving you when it created the Thumbs.db file. Apparently, the POSIX lock is transferred to the other server and stops you from deleting, but when Windows Explorer asks the other node who has the lock on the file you get screwed since the other server doesn't know. POSIX locks are different from Windows locks. It affects all Likewise based products from EMC (VNX filer, Isilon, etc.) and it also affects the CIFS product from NetApp. I'm not sure about Samba CTDB though.

I would design a storage based on ZFS for the main tiers, exported via NFSv4 to the front-end nodes and have QFS on top of the whole thing in order to push rarely accessed data to tape. The front-end nodes would be accessed via WebDAV by a portal in which you can also query the metadata with a serious DB behind it.

I've installed Isilon storage for 6000 XenDesktop clients that all log on at 9AM, I've worked on an SL8500, Exadata, various NetApp and Sun storages and I can tell you that you need to do a study. Have simulations with commodity hardware on smaller datasets to figure out the performance requirements and optimal access method (NAS, Web, etc.). Extrapolate the numbers, double them and ask for POC and demos from vendors, be it IBM, EMC, Oracle, NetApp or HP. Make sure that in the future, when you need 2PB, you can expand in an affordable manner. Take care since vendors like IBM tend to use the least upgradable solution. They will do a demo with something that can hold 0.6PB in its max configuration and if you need to go larger you'll need a brand new solution from another vendor.

It's not worth doing it yourself since it will be time-consuming (at least 500 man-hours until production) and will need at least one full-time employee for the storage. But if you must, look at Nexenta and the hardware that they recommend.

And remember to test DR failover scenarios.

Good luck!

Data Storage

Ask Slashdot: How Do You Store a Half-Petabyte of Data? (And Back It Up?) 219

An anonymous reader writes: My workplace has recently had two internal groups step forward with a request for almost a half-petabyte of disk to store data. The first is a research project that will computationally analyze a quarter petabyte of data in 100-200MB blobs. The second is looking to archive an ever increasing amount of mixed media. Buying a SAN large enough for these tasks is easy, but how do you present it back to the clients? And how do you back it up? Both projects have expressed a preference for a single human-navigable directory tree. The solution should involve clustered servers providing the connectivity between storage and client so that there is no system downtime. Many SAN solutions have a maximum volume limit of only 16TB, which means some sort of volume concatenation or spanning would be required, but is that recommended? Is anyone out there managing gigantic storage needs like this? How did you do it? What worked, what failed, and what would you do differently?

Comment Re: Can anyone illustrate? (Score 1) 196

It's still not clear how an application rendering Japanese text could end up making the bad assumption. If it's using a Japanese font, why would it bother to switch to another font when the character to be rendered exists in the current font? Does the problem only occur when the current font *doesn't* contain the character, and then the application goes hunting for it and ends up picking up characters from potentially multiple inconsistent fonts? That seems like an application issue, failing to try to retain a consistent font in this defaulting process. It points again to the notion that we should not even be doing that, but rather force applications to use "Unicode fonts" if they want to support Unicode text properly. This seems like a font issue more than a Unicode issue. Does Unicode have separate code points for italic and bold characters in other languages? Why should that information be part of the character instead of the font?

Comment Re: Can anyone illustrate? (Score 1) 196

What I still don't understand is, if there's only one code point for this character, where are the multiple renderings coming from? Multiple fonts? Is the source of the problem that Japanese fonts are providing a bad glyph/rendering for this character that doesn't match the style of the rest of the font, or is it that they are unable to provide both glyphs because there's only one code point? Would there still be a problem if they just changed their glyph to the other style; could this just be considered a bug in Japanese fonts?

Comment Re:Can anyone illustrate? (Score 1) 196

So, pardon my apparent inexperience with Unicode, fonts and glyphs, but this looks like an application or framework issue wherein someone decided that we should switch fonts in the middle of a string if there's another font that contains a glyph for the character we're after in some circumstances. Is that what's happening? Why shouldn't all text drawing operations be restricted to the currently active font, and make it the responsibility of the application developer and user to pick a font that contains all the glyphs required by their application. This doesn't really seem like a fault in Unicode, but in how the application or framework outsmarted itself in trying to switch fonts. Following the K.I.S.S. principle, this never would have happened, right? The application should simply stick to a single font. Also, under what circumstances (if any) would that "wrong" character ever be desired? Is it ever correct? Does it have a similar meaning in these other circumstances?
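To make the question concrete, here is a toy sketch of the per-character font fallback being described, next to the "stick to one font" alternative. The font names and coverage tables are entirely made up, and the assumption that the thread's problem comes from this kind of fallback is mine, not something the comments confirm.

```python
# Hypothetical font coverage tables: font name -> characters it has glyphs for.
FONTS = {
    "LatinSans": set("abcdefghijklmnopqrstuvwxyz "),
    "ChineseSerif": set("直海"),
    "JapaneseGothic": set("直海の"),
}


def fallback_fonts(text, preferred, search_order):
    """Per-character fallback: use `preferred` when it has the glyph,
    otherwise the first font in `search_order` that does (None if no font does)."""
    chosen = []
    for ch in text:
        if ch in FONTS[preferred]:
            chosen.append((ch, preferred))
            continue
        font = next((name for name in search_order if ch in FONTS[name]), None)
        chosen.append((ch, font))
    return chosen


def missing_glyphs(text, font):
    """Single-font policy: report what the chosen font cannot draw."""
    return [ch for ch in text if ch not in FONTS[font]]


# With a Latin default font, each character is looked up independently,
# so one short string can end up drawn from two different fonts.
print(fallback_fonts("直の", "LatinSans", ["ChineseSerif", "JapaneseGothic"]))

# Sticking to one font that covers the text avoids the mix entirely.
print(missing_glyphs("直の", "JapaneseGothic"))
```

In this toy setup the first character comes from the Chinese font and the second from the Japanese one, purely because of the search order. That is the kind of inconsistency a single-font policy would rule out, at the price of leaving it to the user to pick a font with full coverage.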

Comment Can anyone illustrate? (Score 3, Insightful) 196

I have been reading the comments for 20 minutes because I don't understand Japanese, but I still don't understand the problem. There's a Japanese character called no, it looks very much like a lowercase English/Latin "e" rotated clockwise about 80 degrees and then flipped over the vertical axis. Is this being mixed up with something else or rendered wrongly? Can anybody provide examples of what it's getting mixed up with or how or where it's being rendered improperly?

Comment Re:Um.. we don't see it as advancing our career (Score 1) 125

Why then, at 40, do I still get weekly contacts from recruiters looking to fill local development positions? Is it possible your comment applies to a local market, possibly Silicon Valley, but not to the Midwest? Or is it possible that every one of these recruiters is just trying to fill a quota of prospects despite the fact that the employer they're hunting for couldn't afford me?

Comment Re:Jeez, sparse arrays, really? (Score 1) 128

From the user perspective, I think Wikipedia is correct. To any coder using a sparse array, it just looks and acts like an array where most of the elements are 0 or null. From the implementation perspective, when you know this is the case, there are some optimizations you can make to significantly reduce the memory usage of such a structure, which is why the term "implementation" was used to describe sparse arrays' relation to maps. Internally sparse arrays are implemented as maps so that space doesn't need to be allocated for all those zeros. Although a sparse array's implementation doesn't define it, it is a notable detail about how they are generally implemented. So if you want to split hairs on the definition of "is", Wikipedia probably has a better definition, but it's also not incorrect to say that they are implemented as maps.
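A minimal sketch of that distinction: to the caller it behaves like a fixed-length array full of zeros, while internally only the non-default entries live in a map. The class name and methods are illustrative, not taken from any particular library, and negative indices and slices are deliberately left out.

```python
class SparseArray:
    """Array-like container that stores only non-default elements in a dict."""

    def __init__(self, length, default=0):
        self._length = length
        self._default = default
        self._items = {}  # index -> value, only for non-default values

    def __len__(self):
        return self._length

    def _check(self, index):
        if not 0 <= index < self._length:
            raise IndexError(index)

    def __getitem__(self, index):
        self._check(index)
        return self._items.get(index, self._default)

    def __setitem__(self, index, value):
        self._check(index)
        if value == self._default:
            # Writing the default back frees the slot instead of storing it.
            self._items.pop(index, None)
        else:
            self._items[index] = value

    def stored_count(self):
        """Number of entries that actually occupy memory."""
        return len(self._items)


a = SparseArray(1_000_000)
a[10] = 7
print(a[10], a[11], len(a), a.stored_count())
```

The caller sees a million-element array; the map holds a single entry.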

Ask Slashdot: User-Friendly, Version-Preserving File Sharing For Linux? 212

petherfile writes: I've been a professional with Microsoft stuff for more than 10 years and I'm a bit sick of it to be honest. The one that's got me stuck is really not where I expected it to be. You can use a combination of DFS and VSS to create a file share where users can put whatever files they are working on that is both redundant and has "previous versions" of files they can recover. That is, users have a highly available network location where they can "go back" to how their file was an hour ago. How do you do that with Linux?

This is a highly desirable situation for users. I know there are nice document management things out there that make SharePoint look silly, but I just want a simple file share, not a document management utility. I've found versioning file systems for Linux that do what Microsoft does with VSS so much better (for having previous versions of files available). I've found distributed file systems for Linux that make DFS look like a bad joke. Unfortunately, they seem to be mutually exclusive. Is there something simple I have missed?

Comment Re:A dupe but can't be said enough (Score 1) 614

I thought the LCA involved in an H-1B visa is supposed to prevent paying the visa worker a wage lower than what would be paid to a native worker doing the same job. I can't find any reputable source for this, but Wikipedia states, "The LCA also contains an attestation section designed to prevent the program from being used to import foreign workers to break a strike or replace U.S. citizen workers." Is this a misconception that is not in fact backed up by any real requirement?

Comment Re:Germany should pay war reparations for WWII (Score 4, Insightful) 743

This kind of ridiculous stunt is why the Germans are sick and tired of giving Greece money. They've been model world citizens and have been subsidizing Greece for decades, and trying to use this now is the ultimate in spoiled screaming teenager tactics. Nobody bankrupted Greece except Greece - as the Nordics, who actually got their shit together, very painfully, like to point out.

If I remember correctly, it was the 3rd party auditors that made the economic recommendations that led Greece to bankruptcy. In a perfect world, the financial institutions and auditors that pushed Greece onto such a road would pay for the economic disaster that they directly contributed to. But I guess that they're busy giving bonuses to C*Os. If your financial consultant (or tax consultant) makes wrong calculations/projections/recommendations for you and puts you into default, wouldn't you seek compensation from him? You did pay him to give you realistic results. How can one country's rating go down from AAA to Junk in one day?

Germany are somewhat dour and grumpy parents, and a Grexit now is much less harmful to Eurozone than it would have been two years ago, so being kicked out of the house isn't out of the question at all. I wouldn't push it too hard.

You're claiming that it's not fair, but the IMF and ECB gave Greece loans at rates that are not sustainable. I can get a euro loan at a lower rate than Greece has. Furthermore, for Germany it's win/win. They bought out a lot of Greek companies for pennies. Think of OTE that was bought by Deutsche Telekom. I personally feel like this is looting and not helping out. Private corporations from the US, UK and Germany (financial and audit) bankrupted Greece with bad advice, while earning serious money for it (think Deloitte, S&P, etc.). When the bubble burst, the Greek government received help at ridiculously high rates from a few countries and multi-national institutions. Then came the major companies from those countries and bought everything for pennies. Afterwards, they are still complaining that the Greeks can't make the payments.

I'm not German or Greek, but have been following this for years in the Economist and Bloomberg, and I know lazy scammers trying to wheedle more money rather than earn it.

I see your problem right there: you're reading it from the Economist or Bloomberg. How about checking out the bare survival conditions of a lot of Greek citizens? Should Greece abandon them because Germany said austerity is the way? The Greek government's responsibility is to its citizens. P.S.: I'm not Greek or German either. I don't live in Greece or Germany, but I try to get my news from newspapers that aren't necessarily in New York, London, Frankfurt, Tokyo or Hong Kong.

Comment I Created an Easter Egg (Score 1) 290

I created an easter egg in a product called Fourth Shift Edition for SAP Business One maybe 5 years ago that rendered an interesting sequence of John Conway's game of Life (starting from the acorn state) while displaying names of developers in a marquee. Trying to remember how to access it... I think it was just typing "LIFE!" while looking at the about dialog. I work pretty efficiently so it was hard to keep me busy at times. The easter egg was a (self-inspired) way to do something interesting related to the software I was working on for a couple hours while waiting to see what came next... and I thought it might someday briefly amuse someone too accustomed to nothing but business all day long. (The software is for ERP.) I showed it to my boss and a few coworkers who, if I recall, all had positive reactions... or at least no negative reactions I'm aware of. I'm not sure if anyone would have expressed a negative reaction to me if they had one because I feel pretty well respected there. I'm not sure anyone who knew about it is still with the company. Maybe I should tell a couple support people about it in case they feel like using it as a diversion while researching a solution to someone's inquiry, especially since it's Easter time. :)
