Comment Missing a link (Score 1) 1
i go clicky clicky clicky on all the things but it don't show me no airstream.
No mention of improvements to the SATA/SAS attachment or protocols to address how you're going to read or write 100TB in any reasonable amount of time. We'll be lucky to see throughput double, and IOPS have already pretty much plateaued for spinning disks. All these are gonna be good for is data lakes and backup-to-disk.
Computers, LLMs, neural networks, and anything that runs on silicon and acts or speaks on behalf of a dataset curated by web scrapers, with data provided by corporations and people with inherent biases, will _never_ have human intuition, nor be cognizant of the damage they can do to someone's livelihood, health, or safety if they provide bad info. And they provide bad info ALL. THE. TIME. There's no way around this, either. The output of any AI process needs to be vetted by a human if it's ever to be used in applications with existential impact.
He coulda spent $50K to get him 95% of the quality he was looking for.
It's that diminishing return at the end and not knowing when to stop that doomed him.
Imagine how much better their lives could've been if they'd spent that $1M on experiences and making memories.
(eccentric dads, listen up)
I used to be a senior engineer at an MLS data processing clearinghouse, handling data from over 200 MLSs. It always boggled me how difficult it was to get various MLS orgs to agree on schemas and metadata for their database entries, while at the same time having it _all_ be managed and distributed by one centralized org. I knew this would happen, and I'm kind of amazed it didn't happen earlier. You'd think that with the amount of money changing hands for licensing and access to these massive datasets, they'd figure out how to standardize on a schema and access methods that enable (and encourage) decentralized and duplicated storage and transactions. Too much is riding on this to not have some sort of redundancy in the system (for transactions and edits, not just replicating the data everywhere, like on ReMax, Redfin, Lyon, Zillow, and other real estate services vendors' networks... that's the easy part).
MLS has been used for YEARS and nobody thought to decentralize its transactional features, making a virtual monopoly on the most important functions, and that irked the hell out of me.
If you have a small budget and moderate reliability requirements, I'd suggest looking into building a couple Backblaze-style storage pods for block store (5x 180TB storage systems, approx. $9000 each), each exporting a 145TB RAID5 volume via iSCSI to a pair of front-end NAS boxes. The NAS boxes could be FreeBSD or Solaris systems offering ZFS filestores (putting multiples of 5 volumes, one from each blockstore, together in RAIDZ sets), which then export these filestores via CIFS or NFS to the clients. Total cost for storage, front-ends, 10GbE NICs and a pair of 10GbE switches: $60K, plus a few weeks to build, provision, and test.
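For anyone who wants to sanity-check those numbers, here's a rough back-of-the-envelope sketch in Python. The pod count, prices, and volume sizes are the ones quoted above; treating the RAIDZ set as single-parity with zero ZFS overhead is my simplifying assumption, and the function name is made up for illustration.

```python
# Figures quoted in the post; everything else is an illustrative assumption.
POD_COUNT = 5              # Backblaze-style block stores
POD_COST_USD = 9_000       # approximate cost per pod
POD_RAW_TB = 180           # raw capacity per pod
POD_RAID5_TB = 145         # RAID5 volume each pod exports over iSCSI
TOTAL_BUDGET_USD = 60_000  # pods + front ends + NICs + switches


def raidz1_usable_tb(volume_tb, volumes_per_set):
    """Usable space of a single-parity RAIDZ set (hypothetical helper).

    Assumes one volume's worth of space goes to parity and ignores
    ZFS metadata overhead, so real numbers will come in a bit lower.
    """
    return volume_tb * (volumes_per_set - 1)


pod_spend_usd = POD_COUNT * POD_COST_USD
raw_tb = POD_COUNT * POD_RAW_TB
usable_tb = raidz1_usable_tb(POD_RAID5_TB, POD_COUNT)
left_for_front_ends_usd = TOTAL_BUDGET_USD - pod_spend_usd

print(f"Spend on pods:            ${pod_spend_usd:,}")
print(f"Raw capacity:             {raw_tb} TB")
print(f"Usable after RAID5+RAIDZ: {usable_tb} TB")
print(f"Left for front ends/net:  ${left_for_front_ends_usd:,}")
print(f"Cost per usable TB:       ${TOTAL_BUDGET_USD / usable_tb:,.2f}")
```

Note the two layers of parity (RAID5 inside each pod, RAIDZ across pods) cost you a fair chunk of raw space; that's the price of surviving the loss of an entire pod.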
If you have a bigger budget, switch to FibreChannel SANs. I'd suggest a couple HP StoreServ 7450s, connected via 8 or 16Gb FC across two fabrics, to your front ends, which aggregate the block storage into ZFS-based NAS systems as above, implementing RAIDZ for redundancy. This would limit storage volumes to 16TB each, but if they're all exposed to the front ends as a giant pool of volumes, then ZFS can centrally manage how they're used. A 7450 filled with 96 4TB drives will provide 260TB of usable volume space (thin or thick provisioned), and cost around $200K-$250K each. Going this route would cost $500K-$550K (SANs, plus 8 or 16Gb FC switches, plus fibre interconnects, plus HBAs) but give you extremely reliable and fast block storage.
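To put the SAN figures in perspective, here's a quick sketch of the raw-versus-usable math and the cost per usable terabyte for a single array. The drive count, drive size, usable capacity, and price range are the ones quoted above; the variable names are just for illustration.

```python
# Figures quoted in the post for one fully populated array.
DRIVES = 96
DRIVE_TB = 4
USABLE_TB = 260
COST_LOW_USD = 200_000
COST_HIGH_USD = 250_000

raw_tb = DRIVES * DRIVE_TB
usable_fraction = USABLE_TB / raw_tb           # share of raw space left after array-level redundancy
cost_per_tb_low = COST_LOW_USD / USABLE_TB     # best case, before the RAIDZ layer on the front ends
cost_per_tb_high = COST_HIGH_USD / USABLE_TB

print(f"Raw capacity:       {raw_tb} TB")
print(f"Usable fraction:    {usable_fraction:.1%}")
print(f"Cost per usable TB: ${cost_per_tb_low:,.0f} to ${cost_per_tb_high:,.0f}")
```

This only counts the array itself; the FC switches, interconnects, and HBAs push the per-terabyte cost higher, and layering RAIDZ on top reduces the usable space further.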
A couple advantages of using ZFS for the file storage are its ability to migrate data between backing stores when maintenance on underlying storage is required, and its ability to compress its data. For mostly-textual datasets, you can see a 2x to 3x space reduction, at a slight cost in speed, depending on your front-ends' CPUs and memory speed. ZFS is also relatively easy to manage on the command line by someone with intermediate knowledge of SAN/NAS storage management.
Whatever you decide to use for block storage, you're going to want to ensure the front-end filers (managing filestores and exporting them as network shares) are set up as an identical active/standby pair. There's lots of free software on Linux and FreeBSD that accomplishes this. These front-ends would otherwise be your single point of failure, and could leave your data completely unusable, and possibly permanently lost, if you don't have redundancy in this department.
If I were involved with space exploration, I'd say the Voyager 1 space probe.
Launched 1977, still receiving commands and sending back data from interstellar space, 0.002 light-years away, and expected to run until 2025 with no hope of getting any upgrades or even a recharge.
It costs money to upgrade and stabilize the power grid. It costs money to stay ahead of the failure curve.
The current infrastructure sucks mainly because it's unpredictable and takes too much effort to synchronize disconnected sections of the grid before reconnecting them. You can't just "route around" a dead transmission line if there are generator stations active on both sides of the break. You must wait for the two sides to come into phase with each other before connecting them, which can take several seconds to a minute. If you don't, you'll cause even more breakers to trip.
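To show where that "several seconds to a minute" comes from, here's a toy model in Python. It assumes both islands hold a steady frequency, so the phase angle between them drifts at a constant rate set by the frequency difference (the "slip"). Real synchronizers also check voltage magnitude and limit the allowed slip before closing a breaker; none of that is modeled here, and the function name and example numbers are mine, not from any real protection relay.

```python
def seconds_until_in_phase(slip_hz, phase_lead_deg):
    """Time until two AC islands next line up in phase (toy model).

    slip_hz:        frequency of side A minus frequency of side B, in Hz.
    phase_lead_deg: how far A's voltage currently leads B's, in degrees.

    The phase angle changes by 360 * slip_hz degrees per second, so we
    just solve for the next time it reaches a multiple of 360.
    """
    lead = phase_lead_deg % 360
    if lead == 0:
        return 0.0               # already in phase
    if slip_hz == 0:
        return float("inf")      # angle never changes, never lines up
    if slip_hz > 0:
        degrees_to_go = 360 - lead   # A keeps pulling ahead until it laps B
    else:
        degrees_to_go = lead         # A falls back until B catches up
    return degrees_to_go / (360 * abs(slip_hz))


# Example: A runs 0.02 Hz fast and currently leads B by 90 degrees.
wait_s = seconds_until_in_phase(0.02, 90)
print(f"Wait about {wait_s:.1f} s before closing the tie")  # about 37.5 s
```

The smaller the slip, the longer the wait, which is why the operators can't just slam the breaker shut the moment a path is available.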
None of this would matter if we switched distribution to HVDC. We have the technology, but again, the cost to convert everything to employ DC-DC switching converters is prohibitive. The biggest upside to switching everything to DC (all the way to the end-user) is that you could add standby capacity by simply connecting batteries to your mains circuit between the main breaker and load panel. The more people in a neighborhood using batteries to buffer their power source, the more aggregate protection the neighborhood has against blackouts.
What's the difference between a computer salesman and a used car salesman? A used car salesman knows when he's lying.