Comment Re:Flash is costly? (Score 5, Informative) 36

Creating the training dataset is the *last* step. I have dozens of TB of raw data, which I use to create training datasets that are only a few GB in size, and I'll have a large number of those sitting around at any point in time.

Take a translation task. I start with several hundred gigs of raw data. This inflates to a couple of terabytes after I preprocess it into indexed matching-pair datasets (for example, if you have an article that's published in N different languages, it becomes N * (N-1) language pairs - so, say, UN, World Bank, EU, etc. multilingual document sets greatly inflate). I may have a couple of different versions of this preprocessed data sitting around at any point in time. But once I have my indexed matching-pair datasets, I'll weighted-sample only a relatively small subset of them - stressing higher-quality data over lower-quality data and trying to ensure a desired mix of languages.
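To put the pair inflation and the sampling step in concrete terms, here's a minimal sketch. The function names, the `quality` field and sampling with replacement are all assumptions for illustration, not the actual pipeline.

```python
import random
from itertools import permutations


def language_pairs(languages):
    """Return every ordered (source, target) pair: N * (N - 1) of them."""
    return list(permutations(languages, 2))


def weighted_sample(records, k, seed=0):
    """Draw k records with replacement, favouring higher 'quality' weights.

    The 'quality' key is a hypothetical per-record score.
    """
    rng = random.Random(seed)
    weights = [r["quality"] for r in records]
    return rng.choices(records, weights=weights, k=k)


# The six official UN languages as an example document set.
langs = ["ar", "en", "es", "fr", "ru", "zh"]
pairs = language_pairs(langs)
print(len(pairs))  # 6 * 5 = 30 ordered pairs
```

A document published in six languages therefore yields 30 directed pairs, which is where the inflation from hundreds of gigs to terabytes comes from.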

But what I do is nothing compared to what these companies do. They're working with Common Crawl. It grows at a rate of 200-300 TB per month. But the vast majority of that isn't going to go into their dataset. It's going to be markup. Inapplicable file types. Duplicates. Junk. On and on. You have to whittle it down to the things that are actually relevant. And in your various processing stages you'll have significant duplication. Indeed, even the raw training files... I don't know about them, but I'm used to working with JSON files, and that adds overhead on its own. Then during training there are various duplications created for the various processing stages - tokenization, patching with flash attention, and whatnot.

You also use a lot of disk space for your models. It's not just every version of the foundation model you train (and your backups thereof) - and remember that enterprise models are hundreds of billions to trillions of FP16 parameters in their raw states - but especially the finetunes. You can make a finetune in like a day or so; these can really add up.
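For a rough sense of scale, FP16 is 2 bytes per parameter, so the weights alone work out as below. This is a back-of-the-envelope sketch: the function name is made up, and it ignores optimizer state, file-format overhead and anything else that rides along in a real checkpoint.

```python
BYTES_PER_FP16 = 2  # FP16 stores each parameter in 16 bits


def checkpoint_size_tb(num_params, bytes_per_param=BYTES_PER_FP16):
    """Size of the raw weights in decimal terabytes (1 TB = 1e12 bytes)."""
    return num_params * bytes_per_param / 1e12


for params in (100e9, 1e12):
    print(f"{params:.0e} params -> {checkpoint_size_tb(params):.1f} TB")
```

So every saved copy of a model in that range costs somewhere between a fraction of a terabyte and a couple of terabytes, before counting backups and finetunes.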

Certainly disk space isn't as big of a cost as your GPUs and power. But it is a meaningful cost. As a hobbyist I use a RAID of six 20TB drives and another of two 4TB SSDs. But that's peanuts compared to what people working with Common Crawl and having hundreds of employees each working on their own training projects will be eating up in an enterprise environment.

Comment Putting numbers into perspective (Score 3, Interesting) 82

This is all to produce a peak of 240k EVs per year. Production "starts" in 2028. It takes years for a factory to hit full production. Let's be generous and say 2030.

Honda sold 1.3 million vehicles in the US alone last year - let alone all of North America, including both Canada and Mexico. If all those EVs were just for the US it'd be 18% of their sales, but for all of North America, significantly less.

In short, Honda thinks that in 2030 only maybe 1/7th to 1/8th of its North American sales will be EVs. This is a very pessimistic game plan.
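The US-only share is simple division on the two figures quoted above. A quick sketch (variable names are just for illustration, and it assumes sales stay flat at last year's level):

```python
peak_ev_output = 240_000      # planned peak EVs per year
us_sales_last_year = 1_300_000  # Honda US sales, per the comment

us_only_share = peak_ev_output / us_sales_last_year
print(f"{us_only_share:.1%}")  # share if every EV were sold in the US
```

Spreading the same 240k units over all of North America means a larger denominator, so the real share lands below that figure.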

Comment Re: such yield, very profit (Score 1) 74

Succinctly put, we live in a time where there are $3T corporations with $0.021T net out of $0.06T gross.
You cannot make your revenue/profit->stock price association math work with that, period. That's because that stock value became divorced from any connection to that corporation's profit a long time ago.
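Running the ratios on those three figures shows the gap. This is only a sketch: the comment doesn't say what period the net and gross numbers cover, so the price-to-earnings figure below takes them at face value rather than annualizing anything.

```python
market_cap = 3.0e12    # $3T valuation
net_income = 0.021e12  # $0.021T net
revenue = 0.06e12      # $0.06T gross

net_margin = net_income / revenue         # profit per dollar of revenue
earnings_yield = net_income / market_cap  # profit per dollar of valuation
price_to_earnings = market_cap / net_income

print(f"net margin:        {net_margin:.0%}")
print(f"earnings yield:    {earnings_yield:.2%}")
print(f"price-to-earnings: {price_to_earnings:.0f}")
```

The business itself is very profitable per dollar of revenue; it's the valuation that is out of proportion to the profit.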

Comment Re: such yield, very profit (Score 0) 74

A company is valued by looking at its revenue, its costs (profit is revenue minus costs) and its growth potential.

Nope.

If someone invests money in a stock they expect either the value of the company to go up due to growth or a dividend from the profit or hopefully both.

Yes.

It's not rocket science.

It very much is, and that's why you're not rich.

If a company is growing because they're plowing all their profit into expansion to fuel greater future revenue then their value and price will go up even though they're not paying a dividend.

No. That's demonstrably not how it works.

Obviously a company with a high stock price that stops growing and doesn't make any profit so doesn't pay a dividend won't have a high stock price for very long so your correlation is mostly correct.

Obviously.

Saying that dividends and profit have nothing to do with share market value is completely insane though.

They don't, because you've stared contrary evidence directly in your face and ignored it.
Stock value is largely psychologically set, as someone elsewhere in this thread lamented.
We live in a world where $44 billion companies are unprofitable.

And while that's the more extreme demonstration, and we can just write it off as Musk being either a superhero, or a fucking moron, you can also just point to stocks almost anywhere in the tech sector, where even a dividend payment ratio of 100% wouldn't result in a yield ratio of 3%.
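The payout point follows from one identity: dividend yield equals payout ratio divided by price-to-earnings. A minimal sketch, with hypothetical function names:

```python
def dividend_yield(payout_ratio, pe_ratio):
    """Yield = (dividends / earnings) * (earnings / price)."""
    return payout_ratio / pe_ratio


def max_pe_for_yield(target_yield, payout_ratio=1.0):
    """Highest P/E at which the given payout still reaches the target yield."""
    return payout_ratio / target_yield


# Paying out 100% of earnings only reaches a 3% yield at a P/E of about 33.
print(round(max_pe_for_yield(0.03), 1))
```

Any stock trading above that multiple can't hit a 3% yield even by paying out every cent it earns.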

Comment Re:Nano-dividend (Score 1) 74

XOM- and all fossil energy in general- usually has decent yields, and high stock value.
The dividends stay decent there, because the growth potential of the stock isn't great against competing stocks.
The reason they can afford high dividend yields, is because their profit margin is fucking obscene- as high as 50-60%.
The reason that's the case, is because they're in the business of getting society to subsidize as much of their extraction of a resource as possible, and then selling us the fruit of our already paid-for labors.

I wouldn't trust any Realty Trust. There's a reason they have high dividends- because their stock is ultra-high risk.

Comment Re:Honorable of them to take a Stand (Score 1) 194

ByteDance is majority owned by individuals not subject to Chinese jurisdiction.
I agree that the Chinese Government has great power to compel the company to do things based on the jurisdiction that it exists in, however, the idea that what it does isn't visible to its very international operation, majority owned by international entities... is frankly absurd.

We wouldn't dare apply this logic and paranoia to anything that really mattered, like the fucking computing devices that we're having this discussion via.

Comment Re:Nothing to see here (Score 1) 194

The US is a minority of the total TikTok userbase.

What makes more sense: selling an entire app because a governing authority of a minority of your users demands it, or saying "fuck you guys"?

But no, for sure. ByteDance is definitely a CCP PsyOp- just don't sweat the megabytes of firmware running on whatever you typed your comment on that was flashed to hardware in China, while being made in China, by a computer manufacturer that is (more than likely) based in China.

Comment Re:Plainly Unconstitutional (Score 2) 194

You're about 86 billion neurons short of what you need to interpret the Constitution, period, if that's the way you think critically.

1) China is not hostile to the US. They are a geopolitical competitor, that's for sure.
2) ByteDance is 40% owned by the Chinese. 60% owned by you, me, and everyone else with a 401k. This makes them about the same as any company that has a Chinese presence. We should ban your PC laptop next. Who knows what kind of data exfiltration algorithms the CCP has squeezed into the firmware.
3) You cannot prove harm, because nobody has. Any harm here is hypothetical, and frankly, more than a little bit tinfoily.

Comment Re:Losing money anyway (Score 1) 194

The Chinese component of ByteDance's ownership is a minority of its total ownership.
I don't find your paranoia compelling.

Sure, given that it has a Chinese presence, it has at least some level of connection and cooperation/collusion with the PRC politburo, but it's not like that somehow runs in a black box hidden from the 60% ownership that exists in other countries.

Comment Re:Losing money anyway (Score 1) 194

Doesn't matter if funding capital keeps coming in.
How long did Amazon operate at a loss?
How long did Facebook operate at a loss?

Further, you don't really know if it's operating at a loss or not.
The actual quote is this:

TikTok accounts for a small share of ByteDance's total revenues and daily active users

And further, since it makes no sense for the context to be anything other than "the US", that quote makes a lot of sense, since though the US is the single largest TikTok consuming country, we're a rather small portion of the whole.
