Comment Re:"Hate Speech" you say. (Score 1) 105

Well, there is that I suppose, but to me it just seems like it is making it harder for people who are ethnically/culturally Jewish to separate that identity from the Jewish religion.

The other issue is that we end up protecting religious beliefs that should not be protected, sometimes at the expense of other people's rights.

Comment Re:Flash is costly? (Score 5, Informative) 37

Creating the training dataset is the *last* step. I have dozens of TB of raw data which I use to create training datasets that are only a few GB in size, and I'll have a large number of those sitting around at any point in time.

Take a translation task. I start with several hundred gigs of raw data. This inflates to a couple of terabytes after I preprocess it into indexed matching pair datasets (for example, if you have an article that's published in N different languages, it becomes N * (N - 1) language pairs - so multilingual document sets from, say, the UN, World Bank, EU, etc. inflate greatly). I may have a couple of different versions of this preprocessed data sitting around at any point in time. But once I have my indexed matching pair datasets, I'll weighted-sample only a relatively small subset of them - stressing higher-quality data over lower-quality data and trying to ensure a desired mix of languages.
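To make the pair inflation and the weighted sampling concrete, here is a minimal sketch. It is not my actual pipeline: the function names, the quality weights, and sampling with replacement are all assumptions made for illustration.

```python
import random
from itertools import permutations


def language_pairs(languages):
    """Return every ordered (source, target) pair: N * (N - 1) pairs for N languages."""
    return list(permutations(languages, 2))


def weighted_sample(records, weights, k, seed=0):
    """Draw k records with replacement, favoring records with higher weights.

    The weights are hypothetical quality scores; a real pipeline would
    also need deduplication and a per-language quota.
    """
    rng = random.Random(seed)
    return rng.choices(records, weights=weights, k=k)


# The six official UN languages give 6 * 5 = 30 ordered pairs per document.
un_languages = ["ar", "en", "es", "fr", "ru", "zh"]
pairs = language_pairs(un_languages)
print(len(pairs))

# Hypothetical quality weights: every pair involving English counts double.
weights = [2.0 if "en" in pair else 1.0 for pair in pairs]
subset = weighted_sample(pairs, weights, k=5)
print(subset)
```

The point is only that one N-language document turns into N * (N - 1) rows on disk, while the sampled subset that reaches training stays small.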

But what I do is nothing compared to what these companies do. They're working with Common Crawl. It grows at a rate of 200-300 TB per month. But the vast majority of that isn't going to go into their dataset. It's going to be markup. Inapplicable file types. Duplicates. Junk. On and on. You have to whittle it down to the things that are actually relevant. And in your various processing stages you'll have significant duplication. Indeed, even the raw training files... I don't know what they use, but I'm used to working with JSON files, and that adds overhead on its own. Then during training there are various duplications created for the various processing stages - tokenization, patching with flash attention, and whatnot.

You also use a lot of disk space for your models. It's not just every version of the foundation model you train (and your backups thereof) - and remember that enterprise models are hundreds of billions to trillions of FP16 parameters in their raw states - but especially the finetunes. You can make a finetune in a day or so; these can really add up.
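As a rough back-of-the-envelope sketch of what those parameter counts mean on disk: FP16 is 2 bytes per parameter, so the raw weights alone work out as below. This counts weights only (no optimizer state, no checkpoint metadata), and the function name is just for illustration.

```python
BYTES_PER_FP16_PARAM = 2  # FP16 stores each parameter in 16 bits


def checkpoint_size_tb(num_params, bytes_per_param=BYTES_PER_FP16_PARAM):
    """Approximate size in decimal terabytes of the raw weights only."""
    return num_params * bytes_per_param / 1e12


# "Hundreds of billions to trillions" of parameters, taken at the round ends.
for num_params in (100e9, 1e12):
    print(f"{num_params:.0e} params -> {checkpoint_size_tb(num_params):.1f} TB")
```

Multiply that by every saved version, backup, and finetune and it adds up quickly.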

Certainly disk space isn't as big of a cost as your GPUs and power. But it is a meaningful cost. As a hobbyist I use a RAID of six 20TB drives and another of two 4TB SSDs. But that's peanuts compared to what an enterprise environment will eat up, with people working with Common Crawl and hundreds of employees each working on their own training projects.

Comment Putting numbers into perspective (Score 3, Interesting) 125

This is all to produce a peak of 240k EVs per year. Production "starts" in 2028. It takes years for a factory to hit full production. Let's be generous and say 2030.

Honda sold 1.3 million vehicles in the US alone last year - let alone all of North America, including both Canada and Mexico. If all those EVs were just for the US it'd be about 18% of their sales, but for all of North America, significantly less.

In short, Honda thinks that in 2030 only maybe 1/7th to 1/8th of its North American sales will be EVs. This is a very pessimistic game plan.
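The arithmetic behind those fractions, as a quick sketch using only the figures above (240k EVs per year at peak, 1.3 million US sales). The function names are just for illustration, and the North American totals it prints are what the 1/7th and 1/8th estimates imply, not reported sales figures.

```python
def ev_share(ev_units, total_units):
    """Fraction of total sales that the EV output would represent."""
    return ev_units / total_units


def implied_total(ev_units, denominator):
    """Total sales implied if EVs are 1/denominator of all sales."""
    return ev_units * denominator


PEAK_EVS_PER_YEAR = 240_000
US_SALES = 1_300_000

# Share if every EV were sold in the US.
print(f"{ev_share(PEAK_EVS_PER_YEAR, US_SALES):.1%}")

# North American totals implied by the 1/7th and 1/8th estimates.
for denominator in (7, 8):
    print(denominator, implied_total(PEAK_EVS_PER_YEAR, denominator))
```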

Comment Re:They have no choice (Score 3, Informative) 125

Most Japanese brands were late to the EV game, with the exception of Nissan. I think some of them are still hoping that hybrids remain available for decades to come.

Honda's first EV, the Honda e, was really good. Okay, small battery, but everything else was great. Top notch tech, the best HMI of any car on the market, and the vehicle itself really took advantage of the EV drivetrain with a tiny turning circle and well tuned suspension.

Their second one, the confusingly named e:Ny1, is pretty pedestrian, if you will excuse the pun. It has barely any EV features. Bizarrely the regen is both weak and resets to off after a few minutes of driving. It's a nice enough car in other ways, but priced ridiculously high and already massively discounted. Why they ditched all the good work they did with the Honda e remains a mystery.

There is the upcoming Honda and Sony collaboration, but I expect it will be over-priced and not particularly great.

Toyota's bZ4X or whatever it's called is apparently decent. Some initial software issues that limited charging speed were quickly fixed. Mazda has one EV but it's not very good. Mitsubishi had one but never developed it, and now has none. Suzuki, Daihatsu, and several others don't seem to have any EVs at all. Apparently a lot of the issue is down to their suppliers in Japan not developing suitable EV drivetrain components, and the brands not wanting to rely on China like the rest of the world does. Hard times for Japan's auto industry.

Comment Re:Time to get off the pot? (Score 1) 89

You need offshore wind. Capacity factor in Europe is already over 50%, and increasing. Prototypes of very large deep sea windmills are up in the 70% range.

The US has massive amounts of offshore wind just waiting to be tapped. That can replace coal because it is consistent - the output varies within a range, but never stops. Combine with long distance transmission lines to areas where those coal plants are.

It's purely a political issue that it doesn't get done. Europe isn't immune to that either; we could do far more. Even in China, where they have more wind power than the rest of the world combined and are installing it at a fantastic rate, there is push-back from the coal industry and corrupt politicians.

Comment Re:Cool. Next step... (Score 1) 89

We already have a lot of that stuff in Europe, but need more. Some of the things you list emit soot and other non-greenhouse but still damaging pollution. Wood burning is a good example: it degrades air quality in an entire village or neighbourhood.

We do regulate emissions from home appliances, like we regulate them from cars.

Comment Re:Losing money anyway (Score 1) 209

Twitter has been losing money for years... Did they ever turn a profit? Certainly not under Musk.

Facebook lost money for many years too. As does TRUTH Social, although that might actually fit the description of propaganda.

That's just how tech start-ups work. Lose money but gain users, and eventually enshittify.

Comment Re: Obligatory... (Score 1) 209

More than that. TikTok is where a lot of younger people share political philosophy. It's one of the few mainstream places where socialism is the dominant movement, which is why they want to destroy it.

Without TikTok fewer young people would be members of unions, fewer would be taking climate change so seriously, and more would be vulnerable to bad landlords who rely on ignorance of legal rights. While there is of course a lot of crap on there, it's not true to say that there is nothing of value.

Comment Re: Wonder if he can make it funny again. (Score 2) 29

Lately The Onion has called a lot of the reporting around the situation in Gaza days or weeks before it happened.

It's funny, but it's also really biting satire that we need to help us keep perspective here. Their stuff about all the ways the media will find to avoid saying Israel killed anyone is a good example. Some of the headlines, about bullets "finding" their way into children's heads, are truly beyond parody, but we will need satire to remind us just how insane they actually are.

Comment Re:What? (Score 2) 79

I had an Amstrad PC1512 that came with DOS 3.3, but also with DOS Plus, which offered CP/M compatibility. It also came with version 2 of the GEM windowing system, the one that was hobbled by a patent dispute with Apple. As a result the desktop could only show two windows side-by-side (apps could do what they liked).

I think I spent 90% of my time in DOS, although Locomotive BASIC II in GEM was interesting.
