Comment: Re:Interesting (Score 1) 459

by atamido (#38745542) Attached to: Microsoft Announces ReFS, a New Filesystem For Windows 8

Let's see: 32K file name and path limits (instead of 255)

NTFS doesn't have a 255 character limit on paths. The default file handling API in Windows (even 7/2008) is limited to MAX_PATH, which is 260 characters. You can force the use of the newer API (with 32K path lengths) by prefixing paths with \\?\, but I've personally found application support to be spotty, even in MS applications.

Specifics available here:
http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx
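
For anyone who wants to try the \\?\ form, here is a minimal Python sketch of how a path could be rewritten into it. The helper name to_extended_length is made up for illustration, and the sketch assumes a fully qualified path (drive letter or UNC), since the prefix turns off the usual relative-path handling. It only builds the string; whether a given application accepts the result is a separate question.

```python
import ntpath

# Prefixes for extended-length paths: \\?\ for drive paths, \\?\UNC\ for shares.
EXTENDED_PREFIX = "\\\\?\\"
UNC_PREFIX = "\\\\?\\UNC\\"


def to_extended_length(path: str) -> str:
    """Return `path` in extended-length form (hypothetical helper)."""
    # Already prefixed: leave it alone.
    if path.startswith(EXTENDED_PREFIX):
        return path
    # The prefix only makes sense on fully qualified paths.
    if not ntpath.isabs(path):
        raise ValueError("extended-length form needs an absolute path")
    # Prefixed paths skip normalization, so resolve "/" and ".." up front.
    normalized = ntpath.normpath(path)
    if normalized.startswith("\\\\"):
        # UNC share: \\server\share\... becomes \\?\UNC\server\share\...
        return UNC_PREFIX + normalized[2:]
    return EXTENDED_PREFIX + normalized


print(to_extended_length("C:/logs/../data/file.txt"))
print(to_extended_length("\\\\server\\share\\dir\\file.txt"))
```

The ntpath module is used instead of os.path so the string handling behaves the same on any platform.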

Comment: Re:Low quality plot too (Score 1) 505

by atamido (#38647060) Attached to: JRR Tolkien Denied Nobel Due To Low Quality Prose

Hobbits were resistant to the allure of power because of their live-and-let-live ethics, even Gollum showing strength against the ring. The redolent Bombadil episode probably was left in solely to make the point that the ring had no use for him, nor he for it.

Wow, I sorta feel like an idiot for missing that. I hadn't given it a lot of thought, but I'd just sort of assumed that the hobbits possessed some natural magical resistance to the ring. Now that I see you point it out, it's so obvious that it hurts. It also clears up why Bombadil is in there, which is otherwise a very odd and short subplot.

Comment: Re:Tolkien's prose (Score 1) 505

by atamido (#38646854) Attached to: JRR Tolkien Denied Nobel Due To Low Quality Prose

he didn't really change the literary or social landscape of his day (which is what the Nobel committee usually looks for), although he somewhat "crystallised" fantasy writing

Seriously? What is with the lack of recognition? Say what you will about his writing style (I never could make it through The Silmarillion), but the man basically defined characteristics and styles of fantasy species for every book afterwards. He based his work on preexisting mythos, but he consolidated a lot of opposing mythos and people have been using what he defined ever since.

Tolkien probably had a bigger impact on fantasy writing than anyone else, before or after.

Comment: Re:Well, they're a good indicator of intelligence (Score 1) 672

by atamido (#38617284) Attached to: Are Brain Teasers Good Hiring Criteria?

Deciphering decades if not hundreds of years of legal precedent and laws and then coming up with a line that will persuade the jury or judge isn't a puzzle.

That sounds like mass memorization combined with social engineering. A difficult feat, but something very different from logic puzzles.

In my extremely limited experience talking to lawyers, it seems like law firms typically use a team of people with different skill sets. People to examine past case law for relevant information, people to compile relevant information in a useful way, and people to present the information in a convincing manner.

Comment: Re:Dedup is just a marketing word.... (Score 1) 306

by atamido (#38607388) Attached to: Ask Slashdot: Free/Open Deduplication Software?

It needs an incredible amount of memory to operate effectively.
From my university notes:
5TB data, average block size 64K = 78125000 blocks
For each block the dedup needs 320 bytes, so
78125000 x 320 bytes = 25 GB dedup table

Use compression instead (e.g. ZFS compression).

I'm having a hard time imagining why you're using 320 bytes for each block. If you had 100TB of data in 4KB blocks, that's 25 billion blocks. You could use an on-disk reference table for pointing out duplicate blocks and reference it with 36 bits. If you used an excessive 256 bit hashing algorithm for each block, a b-tree with two 64-bit spaces for references, and 4 control bits, that still only gets you to roughly 420 bits, which is about 53 bytes. You only need to store location information in RAM for blocks that are actually deduped (the rest is only referenced when a dedupe happens), so two 36 bit locations on a duplicate block would bring the total for a duplicate block to 62 bytes, with an additional ~5 bytes for each additional block that is found to be a duplicate.

Let's say it's 50TB duplicated exactly, for a total of 100TB. 5*10^13/4KB*62B = 775GB (taking 4KB as 4000 bytes). That's a lot of RAM, but it's still less than 0.8% of the total. And you could get far more efficient by using a smaller hash with a reference to a larger on-disk hash table. A 32 bit hash with a 35 bit on-disk reference (containing disk block locations and longer hashes), a b-tree with two 35 bit memory offsets, plus 20 bits for whatever, and you'd be at 24 bytes, which would drop your total memory usage to 300GB.

Of course, 32 bytes would align much better, and you could decrease space to 1/16th by using 64KB block sizes, but the point is that 320 bytes is clearly absurd. All I can think of is that you should be typing bits instead of bytes.
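
If you want to check the numbers yourself, here is a rough Python sketch of the arithmetic in this thread. It is only back-of-the-envelope math, not how any real dedup implementation sizes its tables. The function name dedup_table_size is made up, and it assumes decimal units (1KB = 1000 bytes), which is what the figures above work out to.

```python
# Decimal units, matching the figures quoted in the thread (an assumption).
KB = 10**3
GB = 10**9
TB = 10**12


def dedup_table_size(data_bytes: int, block_bytes: int, entry_bytes: int) -> tuple[int, int]:
    """Return (block count, table size in bytes) for one table entry per block."""
    blocks = data_bytes // block_bytes
    return blocks, blocks * entry_bytes


# University-notes case: 5TB in 64K blocks at 320 bytes per entry.
blocks, size = dedup_table_size(5 * TB, 64 * KB, 320)
print(blocks, size / GB)

# 50TB of unique data in 4KB blocks at 62 bytes per entry.
blocks, size = dedup_table_size(50 * TB, 4 * KB, 62)
print(blocks, size / GB)

# Same data with the slimmer 24 byte entry.
blocks, size = dedup_table_size(50 * TB, 4 * KB, 24)
print(blocks, size / GB)
```

Swapping in 64 * KB for the block size in the last two cases shows the 1/16th reduction directly.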

Comment: Re:Roll your own? (Score 1) 306

by atamido (#38607218) Attached to: Ask Slashdot: Free/Open Deduplication Software?

Certainly, there are many scenarios where block-level deduplication does wonders. It just doesn't work well in all cases. For example, trying to use deduplication to store incremental database backups. Record sizes are not fixed in a SQL dump, so you're likely to waste large amounts of space for small differences. I do use ZFS deduplication to store incremental backups from a variety of machines, and generally it works well, but the SQL dumps were one of the annoying bits.

Our online backups for databases are primarily snapshots of the volumes that host the databases. Because of dedupe + snapshots, the space and time for the online backups are negligible. If I have to restore the database to yesterday, we stop the service, restore the volume to a snapshot (a few seconds), and start the service. The space used for offline backups certainly isn't efficient, but that's much cheaper storage so it doesn't matter.

Comment: Re:deduplication is just compression (Score 1) 306

by atamido (#38607204) Attached to: Ask Slashdot: Free/Open Deduplication Software?

Since most file servers have about 95% unused processor cycles and a limited amount of disk I/O, both compression and dedupe can be significant wins provided they don't create an I/O profile that is a smaller percentage more random than their effective compression (i.e. if they add 10% randomness to the I/O profile but provide 30% compression then it's probably a net win). The fact that they potentially increase cache effectiveness is just gravy since cache is a few orders of magnitude faster than spinning disk and at least an order of magnitude faster than even SSDs.

It's probably heavily dependent on the content, types of disks, and number of disks. With a few spinning disks where multiple large files are typically streamed sequentially, adding 10% randomness would be a pretty severe penalty. An SSD has essentially zero negative effect from randomness though, so they would probably benefit greatly.

It's worth pointing out that many modern SSDs actually perform compression on their storage internally to increase performance.

Comment: Re:I've wanted deduplication for a long time! (Score 1) 306

by atamido (#38607162) Attached to: Ask Slashdot: Free/Open Deduplication Software?

NTFS's file compression actually rather sucks.

It's fine as long as you use it properly. I use it for IIS logfiles. I want to keep the logfiles but rarely actually access them, and they are append only, and they are plain text. Very high compression at a very small loss of performance.

Compressing binary data in your working set is, as you point out, probably a bad idea, but as long as you don't do anything stupid you shouldn't have any problems.

Indeed. We use NetApps for our VMs, which have built-in dedupe. But log files won't dedupe, so using compression on directories that store logs is an easy way to save space. It's also an easy way to keep from having to expand the disk size, and to keep log directories that can grow rapidly under control.

Comment: Re:Not just the pay...it's the location. (Score 1) 235

by atamido (#38484852) Attached to: East Coast vs. West Coast In the Quest For Young Programming Talent

I went to a week of training recently that was hosted in my city, so I just drove instead of staying in hotels like everyone else. The office the training was located in was downtown near the top of a skyscraper, with a great view. It was a serious pain to deal with traffic and parking every day, not to mention the wireless was overloaded and the facilities were small. The company's actual main offices were located in a really nice green belt area right off a freeway, and would have been a cinch to get to. When asked, the only reason given for using the skyscraper was prestige, to impress the people taking the class.

I wasn't impressed.

Comment: Re:Firefox - Too little, too late (Score 1) 330

by atamido (#38444924) Attached to: Firefox 9 Released, JavaScript Performance Greatly Improved

This is why I love Tree Style Tabs. You get the tab bar on the left (or wherever else you like it), tabs structured hierarchically, collapsible trees and all that fancy stuff, including vertical screen real estate.

I'm also quite pleased with Tree Style Tabs. When I'm researching something, I can have dozens of tabs open and organized by how I got to each one. No other browser can have as many tabs and be even remotely useful for finding them.
