Comment Re:I am such a dinosaur (Score 1) 83

Citation: https://www.cbre.com/insights/...

"Construction completion timelines have been extended by 24 to 72 months due to power supply delays."

If you want to measure the speed at which CPUs can multiply numbers stored in on-chip cache, then flops are the right unit of measure, because that constraint is very closely aligned with the thing that you care about.

If you want to measure the scale of a datacenter buildout today, gigawatts of utility power is a reasonable metric, because right now power is a very important constraint, and it is very closely aligned with the thing that you care about.

To be clear, nobody is saying that it's the only interesting metric. The others still matter. But if the datacenter that you want to build requires more gigawatts than the utility in that location can provide, then it doesn't matter how many flops you'd be able to achieve with the GPUs in that planned datacenter. Even if you build the datacenter and fill the racks with GPUs tomorrow, you cannot turn them on, so they're going to be doing zero flops until you solve your power problem.
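
As a rough illustration of why the power number dominates, here's a hypothetical back-of-envelope estimate (the GPU count, per-GPU wattage, and overhead factor are made-up round numbers, not figures from any real buildout):

# Hypothetical back-of-envelope estimate of utility power for a large GPU buildout.
# All inputs are illustrative round numbers, not real figures.
gpus = 500_000                 # planned accelerators
watts_per_gpu = 1_000          # accelerator plus its share of CPU/NIC power, roughly
pue = 1.3                      # datacenter overhead (cooling, power conversion losses)

it_load_mw = gpus * watts_per_gpu / 1e6
total_mw = it_load_mw * pue
print(f"IT load: {it_load_mw:.0f} MW, total draw: {total_mw:.0f} MW "
      f"({total_mw / 1000:.2f} GW)")
# If the local utility can only deliver a fraction of that, the achievable flops
# are capped by power, not by the GPUs you bought.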

Comment Re: Umm, what about theft? (Score 1) 18

Yes, and since you can't defend a patent when prior art is out there, this scenario is already a non-starter.

You can file for that patent, and you might even get it, but your first infringement lawsuit is going to fail due to the prior art, making the patent a waste of money.

This scenario seems quite well addressed by existing patent rules.

Comment Re: Where do you come up with this? (Score 1) 289

I got these ideas by writing neural networks in C++ in the 1990s, before libraries for that sort of thing were readily available, and then writing UI code to visualize what was going on inside the network while it processed information.

Perhaps an example will clarify what I'm saying here...

Suppose you're using a neural net similar to the one in ChatGPT, with, say, 300 billion weights spread across 100 layers.

You ask it a question. It gives you a wrong answer, with a citation to a non-existent document.

How do you go about finding the specific weights that need to be adjusted to make it produce the correct answer for that question next time?

How would you go about finding a layer within the network that contains the "mental state" of your network that corresponds to the upcoming wrong answer?

Spoiler: an army of PhD students has been working on these problems for years, because there is no clear mapping between "language" and anything deep inside the network.

We've got hints, and mechanistic interpretability researchers are making slow but steady progress, but we are nowhere near being able to update the weights of a neural network to fix that wrong answer without risking significant degradation in overall performance.

What's happening inside a network isn't happening in "language" in the sense that humans use language. It's a stack of vectors in high-dimensional spaces that are almost entirely opaque to us.
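
A toy example of why "find the specific weights" is so hard: even in a tiny network, the gradient of the loss for a single wrong answer is nonzero for almost every weight, so there is no small, obvious set of weights to patch. This is just an illustrative sketch with a random model and random data, nothing like a production LLM:

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# A tiny stand-in for a deep network: 4 layers instead of 100,
# a few thousand weights instead of 300 billion.
model = nn.Sequential(
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 10),
)

# One "question" the model got wrong, and the answer we wanted.
x = torch.randn(1, 64)
target = torch.tensor([3])

loss = F.cross_entropy(model(x), target)
loss.backward()

total = sum(p.numel() for p in model.parameters())
nonzero = sum((p.grad != 0).sum().item() for p in model.parameters())
print(f"{nonzero} of {total} weights have a nonzero gradient for this one error")
# Blame for a single wrong answer is smeared across essentially the whole network,
# which is why you can't just adjust "the" responsible weights.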

Comment Re:I am such a dinosaur (Score 1) 83

1) While the FLOPs-per-megawatt ratio is going to change over time, the limiting factor right now isn't compute per se, it's the ability to get power to the computers. Azure recently told my employer not to expect any additional capacity in the Virginia datacenter where our primary servers are hosted, because they can't get any additional power.

2) It isn't. But the fact that memory bandwidth is a problem doesn't make FLOPS any less obsolete. Being able to do 50 TFLOPS with an on-chip cache just doesn't matter anymore. What matters for AI inference is being able to do 50 TFLOPS across tens of gigs of RAM (or hundreds?).

We don't have a good unit of measure for that. But we do have a perfectly good unit of measure for today's limiting factor: gigawatts.
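
One way to see why "TFLOPS across tens of gigs of RAM" is the thing that matters: for LLM inference at small batch sizes, the time to stream the weights from memory usually dwarfs the time to do the math. A rough roofline-style sketch, with made-up but plausible round numbers rather than any specific accelerator's specs:

# Rough roofline-style estimate (illustrative numbers, not any specific GPU).
peak_tflops = 50            # dense compute, in TFLOP/s
mem_bandwidth_gbs = 2_000   # memory bandwidth, in GB/s

weights_gb = 80             # FP16 weights read per token (~40B parameters)
flops_per_token = 80e9      # roughly 2 FLOPs per parameter for a dense forward pass

compute_time = flops_per_token / (peak_tflops * 1e12)   # seconds, if math is the limit
memory_time = weights_gb / mem_bandwidth_gbs             # seconds, if memory is the limit

print(f"compute-bound time per token: {compute_time * 1e3:.1f} ms")
print(f"memory-bound time per token:  {memory_time * 1e3:.1f} ms")
# Whichever number is bigger is the real limit; quoting peak TFLOPS alone
# tells you very little about tokens per second.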

Comment Yes, really. (Score 1) 289

It's an interesting insight. While the input and output layers of LLMs are obviously designed to accept and generate language, there are dozens of layers in between that are not.

This is why interpretability research is keeping PhD students busy. This is why after years of research we're still unable to prevent LLMs from generating output that is completely unaligned with reality - we just have almost no idea what's going on inside. We train them on trillions of tokens and turn that into billions of numbers that we can't really make sense of.

The math that's happening inside is surprisingly simple. But in the middle of the LLM, your prompt and its knowledge and its response are all combined into vectors in spaces with tens of thousands of dimensions.

As someone else said, language is just the interface.

What's inside is an opaque soup of inscrutable numbers. It's borderline miraculous that those numbers produce coherent sentences as often as they do.
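
A tiny numpy sketch of "language is just the interface": the vector in the middle of the model is just a list of floats, and tokens only appear when the final projection maps it onto the vocabulary. The dimensions and matrices here are random placeholders, purely for illustration:

import numpy as np

rng = np.random.default_rng(0)

d_model = 12_288        # width of an internal hidden vector (illustrative)
vocab_size = 50_000     # size of the token vocabulary (illustrative)

# What lives in the middle of the network: an opaque vector of floats.
hidden_state = rng.standard_normal(d_model)

# Only the final projection ("unembedding") turns it back into token scores.
unembed = rng.standard_normal((vocab_size, d_model)) / np.sqrt(d_model)
logits = unembed @ hidden_state

print(hidden_state[:5])          # meaningless numbers to a human reader
print(int(logits.argmax()))      # a token id, which only the tokenizer maps to text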

Comment Re: Blind idiocy. (Score 4, Informative) 72

Just a small nit to pick: it's not just about the behavior of galaxies.

The anisotropy that we observe in the cosmic microwave background could also be explained by the existence of matter that is not directly visible, aka dark matter.

One of the issues with the "modified Newtonian dynamics" approach is that while it explains the galaxy observations, it doesn't explain the CMB observations. That doesn't necessarily mean that it's wrong, of course. But dark matter fits the observations of multiple discrepancies (galaxy behavior, CMB anisotropy, and more), so most cosmologists think that dark matter is the more likely explanation.

Comment Re: CO2 is a virus? (Score 3, Interesting) 49

It's a logical question; I was wondering exactly the same thing.

From TFA:

"Elevated levels of CO2 lead to reduced cognitive ability and facilitate transmission of airborne viruses, which can linger in poorly ventilated spaces for hours. The more CO2 in the air, the more virus-friendly the air becomes, making CO2 data a handy proxy for tracing pathogens."

Comment Re: That's not how computers work (Score 2) 23

Meanwhile, other companies are getting really good at stem extraction and pitch detection, so reverse engineering MIDI files from WAV files is probably going to be pretty easy pretty soon.

Or maybe it is already. I'll admit I haven't been following this stuff closely, but I've seen pieces of it here and there.
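
For the pitch-detection half of that pipeline, the building blocks are already in ordinary open-source tools. A minimal sketch using librosa's pYIN pitch tracker on a monophonic stem (the filename is a placeholder, and real music would first need source separation into stems):

import librosa
import numpy as np

# Load a (hypothetical) isolated vocal or instrument stem.
y, sr = librosa.load("vocal_stem.wav", sr=None, mono=True)

# Frame-by-frame fundamental frequency via the pYIN algorithm.
f0, voiced_flag, voiced_prob = librosa.pyin(
    y, sr=sr,
    fmin=librosa.note_to_hz("C2"),
    fmax=librosa.note_to_hz("C7"),
)

# Convert detected pitches in voiced frames to (rounded) MIDI note numbers.
midi_notes = np.round(librosa.hz_to_midi(f0[voiced_flag])).astype(int)
print(midi_notes[:20])
# Turning this into an actual .mid file still needs note segmentation and timing,
# but the raw pitch-to-note mapping is only a few lines of code.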

Comment Mass Premium (Score 1) 35

It's premium, but it's for the mass market?

Are end users going to be asked to decide whether they are Mass or Mass Premium buyers?

Did Google hire someone from Redmond?

"Mass Premium" is the sort of branding brilliance that I've come to expect from Microsoft. I wonder how long they argued over calling the top version "Premium Pro."

Comment Re: With AI (Score 1) 259

There's something to this.

A couple of years ago, a friend said something about China teaching calculus in middle school, and I thought it was a joke, but it's true. The key thing is that they teach the concepts: what integration means, what a derivative is, and so on. It's not about memorizing the rules for taking the derivative of every possible equation; it's about being able to recognize when a derivative would be a useful way to look at a problem. It's about the difference between a budget deficit and accumulated debt.

Turns out, you can teach that stuff to kids who aren't ready to calculate arbitrary integrals.

And people are going to need to grasp those concepts much more often than they're going to need to calculate integrals or derivatives.

It's an idea we should explore. Teach the concepts and applications, not the calculations.
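
The deficit/debt example is the kind of thing you can show a middle schooler in a few lines: the yearly deficit is the rate of change, and the debt is the running total (the integral) of those deficits. Made-up numbers, obviously:

from itertools import accumulate

# Yearly deficits (the rate of change of debt) -- illustrative numbers.
deficits = [300, 350, 400, 380, 420]

# Debt is the accumulated (integrated) deficit.
debt = list(accumulate(deficits))
print(debt)                      # [300, 650, 1050, 1430, 1850]

# Going the other way: year-over-year change in debt recovers the deficit,
# which is the discrete version of taking a derivative.
changes = [debt[0]] + [b - a for a, b in zip(debt, debt[1:])]
print(changes == deficits)       # True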
