Comment Re:Optane? (Score 1) 99

Which raises the question: just how much better does a new technology (hardware or software) have to be than the current offering before people will actually put effort and money into adopting it? No one is going to rewrite their software to take advantage of something like Optane for a mere 10%-15% improvement! Even if you double the speed, it can be a real challenge to get mass adoption if your solution is not a 'drop-in replacement'.

I have experienced this with a database engine I have developed. It is built on a completely new architecture I invented that has proven to be very fast and efficient. I can load large relational tables and run queries about 10x faster on average than conventional databases (without needing separate indexes), but because it doesn't yet have all the bells and whistles that Postgres, MySQL, or SQL Server have, it is like pulling teeth to get people to even try it out, let alone use it for real work.

Comment Re:incompetence (Score 1) 77

Are they sure the ransomware came from an OUTSIDE source? Let's see: the tech guys had already tested a new solution and were ready to deploy, and they were waiting for a possibly reluctant management team to give permission to go ahead. Along comes this hack to push management over the edge... hmmm.

Comment Re:what a C**T (Score 1) 419

I think this is one of the biggest problems with open source: once you pick a license, there is no going back. Many projects have a ton of time and effort devoted to them before the developer decides to open source them. If you choose the wrong license, you may have given away all rights to control what you have already done; anyone (including big corporations) can fork your project and use or subsume it at that point.

I have a hobby project that I have put thousands of hours of development time into. Lots of people tell me I should just open source it so that it gets a lot more traction, but I am hesitant. I know I could lose control of it if I do.

Comment Re:Symptomatic vs asymptomatic (Score 1) 193

There are lots of people who claim to be 'following the science' but want to completely ignore evidence that does not fit their personal narrative. We are two years into this pandemic, yet it seems to be incredibly hard to get good data that might help an individual assess their own personal risk. 'Health experts' will quote spikes in case rates, hospitalizations, or deaths without giving any details about which demographics really face the highest risks. Just like insurance companies, which have detailed charts showing the risks for hundreds of factors, we should know how much higher the risk is for a 70-year-old compared to a 50-year-old compared to a 20-year-old. We should know how obesity, diabetes, or other health factors contribute to the risk of hospitalization and/or death. The symptomatic vs. asymptomatic breakdown seems almost trivial to differentiate, but we can't even get that!

Comment Re:What's more important? CO2 targets or modernity (Score 1) 66

So if Ireland turns down the opportunity to build a data center in order to meet its 'CO2 targets', and the data center gets built in France instead, the climate activists in Ireland can pat themselves on the back all day, but how exactly does that help the 'global environment'?

Comment How long does it take to find something? (Score 2) 164

Given the cheap nature of mass storage (about $20 per TB for a typical HDD), packrats like me store everything. The more relevant question is: how long does it take to find that file you misplaced? The more files you have, the longer it takes. Say you're looking for a photo you know you took about a year ago and transferred from your phone to your laptop, but you forgot where you put it. Finding it on a really big drive full of other stuff can take a long time, especially if you can't remember its exact name or format (JPEG, PNG, GIF, etc.).

This is one of the biggest problems with antiquated file systems (most of them were first released when a 1GB drive was huge!): they don't help you find anything. Attaching metadata tags to files is a pain, and searching on them is slow. I am building a new object store that I hope will replace file systems. I have a container where I created 50M files, and I can find all the photos (over a million of them) in under a second.
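One way to get sub-second tag lookups like that is an inverted index. Here is a minimal Python sketch of the general technique (a toy illustration only, not the actual Didgets data structure):

    from collections import defaultdict

    class TagIndex:
        def __init__(self):
            self.by_tag = defaultdict(set)       # tag -> set of file IDs

        def add(self, file_id, tags):
            for tag in tags:
                self.by_tag[tag].add(file_id)

        def find(self, tag):
            # one hash lookup returns every matching file ID,
            # no scan over the whole catalog required
            return self.by_tag.get(tag, set())

    index = TagIndex()
    index.add(1, {"photo", "jpeg", "2021"})
    index.add(2, {"doc", "pdf"})
    print(index.find("photo"))                   # {1}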

Comment File Systems are another example (Score 1) 143

Take NTFS (please!). It was first released in 1993, when the biggest hard drive you could buy was about 1GB. It was designed to scale up, so the numbers tracking things like file size, blocks in use, and files per folder were made wide enough to accommodate large values. The problem was (and is) that it carries a lot of metadata per file in the MFT (4096 bytes per record), and searching for files is very hard and time consuming.

Today you can buy a 20TB hard drive, and if you put a bunch of them in a RAID you can build massive storage pools. It is entirely possible to have a few hundred million files in a single volume, and it won't be long (if it hasn't happened already) before a single volume holds over a billion files. Just 100M files take up over 400GB in the MFT. To find all your photos, for example, you have to read in that whole table (and keep it in memory if you want repeated searches to be fast). Extended attributes let you attach even more metadata, but searching on them takes forever when you have millions of files.
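The back-of-envelope math, using the 4 KB record size above and a hypothetical 250 MB/s sequential read rate (an assumed figure, not a measurement):

    MFT_RECORD_BYTES = 4096                  # per-file record size cited above
    files = 100_000_000
    table_gb = files * MFT_RECORD_BYTES / 1e9
    print(table_gb)                          # ~409.6 GB of metadata
    print(table_gb * 1e9 / 250e6 / 60)       # ~27 minutes just to scan it once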

We need a replacement for these old, antiquated file systems: one that lets you store millions of files and find anything, or everything, in just a few seconds without needing a separate indexing system (with its own database that can get out of sync).

That is what I have been working on: an object store that can hold hundreds of millions of files in a single container and load the whole file table into memory in 1/64th of the space that NTFS needs. Attach lots of metadata tags to any file, then find anything with a given tag in just a second or two. Software does not just need to scale; it needs to be super efficient at every level.
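To give a feel for what 1/64th of the space means, here is a Python sketch of a compact 64-byte catalog entry (4096 / 64 = 64). The field layout below is purely illustrative, not the actual Didgets format:

    import struct

    # 64-byte record: id, size, timestamp, type code, tag bits, padding
    ENTRY = struct.Struct("<QQQHH36x")       # 8+8+8+2+2 bytes + 36 pad = 64
    assert ENTRY.size == 64

    rec = ENTRY.pack(42, 1_048_576, 1700000000, 7, 0b1010)
    print(len(rec))                          # 64 bytes per file
    print(4096 // ENTRY.size)                # 64x denser than a 4 KB MFT record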

Comment Re:Personal Dilemma - Open source my project (or NOT) (Score 1) 142

Of course, every software design has certain trade-offs. This includes RDBMS implementations (DB-Engines.com lists over 300 of them). Some engines have been fine-tuned for very specific data sets, but many of them try to deal with general data that can be just about anything. Every engine is built upon a set of algorithms and data structures that its designers felt were optimal, and they continue to refine them over time.

What I am saying is that I have developed an engine that seems to be a great leap forward in the general case. My algorithms and data structures are unusual and very performant. I have loaded dozens of different data sets, and it consistently beats conventional DBs across a wide variety of queries. That has been my experience to date. I encourage any skeptics to download the system, try it with their own data set, and see for themselves. Please don't just dismiss it as 'unlikely' or 'impossible' before you actually try it out.

Comment Re:Personal Dilemma - Open source my project (or NOT) (Score 1) 142

Most row-oriented databases store each record (i.e., row) in one place and must create a separate index to speed up searches (to avoid the dreaded table scan). This requires two copies of the data, and it forces the database administrator to guess which columns are most likely to be searched so that indexes get built on the right ones. Updates and deletes require both the table and its indexes to be changed, which costs time as well. You are correct that my system does some pre-processing on the data as it is inserted; all the data is stored in a way that makes searches incredibly fast. You are incorrect in assuming that this method is geared toward special cases instead of the general case. If you think you are right, please try to prove me wrong by downloading it and using whatever data set you like.
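That double bookkeeping is easy to see in a toy Python model, with plain dicts standing in for the base table and a secondary index (illustrative only; real engines use pages and B-trees):

    table = {}            # row_id -> row dict (the base table)
    name_index = {}       # name -> row_id    (the separate index)

    def insert(row_id, row):
        table[row_id] = row
        name_index[row["name"]] = row_id     # second write, second copy

    def update_name(row_id, new_name):
        old = table[row_id]["name"]
        del name_index[old]                  # index maintenance on every
        table[row_id]["name"] = new_name     # update, or the two copies
        name_index[new_name] = row_id        # drift out of sync

    insert(1, {"name": "alice", "age": 30})
    update_name(1, "alicia")
    print(name_index)                        # {'alicia': 1}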

Comment Personal Dilemma - Open source my project (or NOT) (Score 2) 142

I have a hobby project that I have been working on for years. It is an entirely new kind of data management system (www.Didgets.com) that shows incredible promise so far. It can do some things thousands of times faster than traditional file systems, it can query large DB tables an order of magnitude faster than conventional databases (without needing separate indexes), and it can do a number of things no other system can do.

I have been trying to get more people to try it out (for free) by downloading it from the website. Lots of people tell me that I should just open source it. On the one hand, doing so would probably get a lot more people interested. On the other hand, it could turn into a nightmare where thousands of people demand all kinds of changes, fixes, documentation, etc. without any willingness to pay for any of it. Would it still be a 'fun project' in that case? I love to program; actively maintaining an open source project might not be something I enjoy.

Comment Re:What the hardware giveth ... (Score 1) 97

While software has improved in many respects, fast hardware can make programmers a little lazy. 'Why spend a few hours making sure this function runs as fast as possible? Just tell the customer to get a faster computer!' seems to be a common attitude. I have been developing a whole new kind of data engine that I often compare against other databases like Postgres. On my new Ryzen 5950X it can execute queries against a 7-million-row table about 10x faster than PostgreSQL v13, all because it was designed from the ground up to process everything it can in parallel on multi-core CPUs. I tried to test it on old hardware whenever I could so that I stayed constantly worried about performance. Here is the video (5 minutes): https://www.youtube.com/watch?...
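The multi-core idea itself is simple to sketch. Here is a minimal Python example that splits a column scan across worker processes (a toy illustration of the parallelism, not the engine's actual query path; the chunk count and threshold are made-up values):

    from concurrent.futures import ProcessPoolExecutor

    def count_matches(chunk, threshold=500):
        return sum(1 for v in chunk if v > threshold)

    if __name__ == "__main__":
        column = list(range(7_000_000))      # stand-in for a 7M-row column
        n = 16                               # e.g. one chunk per core
        step = len(column) // n
        chunks = [column[i*step:(i+1)*step] for i in range(n)]
        with ProcessPoolExecutor(max_workers=n) as pool:
            total = sum(pool.map(count_matches, chunks))
        print(total)                         # matches found across all workers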

Comment Re:Search Speeds (Score 1) 62

There are a couple of major problems that go along with ever-increasing HDD (and SSD) capacities.

1) Speed increases are much smaller than capacity increases. Every time the capacity of a device doubles, the read/write speed only seems to go up by something like 1.2x. This means it takes longer and longer to fill up the drive or read all the data (see the rough numbers after this list). If a drive in a RAID configuration fails, it can take days to rebuild. If a virus or ransomware hits all your data, it can likewise take days to restore it from backup (assuming you had it all backed up).

2) The file system metadata is too big. Most file systems use at least 256 bytes per file for each entry in the file table; NTFS is a hog and uses 4K per file. This means that if you have a 50 TB volume with 1 billion files on it, you need to read 4 TB of data from the drive (and store it in RAM) just to scan the file table.
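Some rough numbers for point 1, using assumed round figures (a 20 TB drive at 250 MB/s sequential; both are hypothetical, not measurements):

    capacity_tb, speed_mbs = 20, 250
    hours = capacity_tb * 1e12 / (speed_mbs * 1e6) / 3600
    print(hours)                             # ~22.2 hours to write the drive once
    # doubling capacity at only 1.2x the speed stretches fill/rebuild
    # time by 2/1.2 = ~1.67x per generation
    for gen in range(1, 4):
        print(gen, hours * (2 / 1.2) ** gen)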

We need to completely rethink how we store and manage data on our persistent storage devices. I have been building a whole new kind of data management system that can load the metadata for 200 million files into just 12 GB of RAM and can scan through it to find all your photos (or docs, or videos, or ...) in just a couple of seconds. The system is called Didgets, and it (or something like it) is needed in the age of petabyte-scale storage.
