It's also *smart*, which acting classy often turns out to be. What people want from the leader of a company in an industry that is having these kinds of problems is maturity, perspective, and thoughtfulness. Naked opportunism and unbridled competitiveness at any cost isn't a good look when people need reassurance.
For that reason, not twisting the knife is the most effective way to twist the knife, especially when you can pretty much count on your competitor to do the twisting for you. Also, if a quality error happened to be discovered in an Airbus product shortly after the CEO was gloating about Boeing, that would be catastrophically bad.
This *is* real science. It's just not by itself a sufficient basis for making any kind of evidence-based decision. Nor *could it possibly be*.
I had a friend in college who participated in a nutrition randomized controlled trial. For months he had to carry around a gym bag; not only did everything he ate and drank come out of that bag, all his urine and feces went into containers in that bag so they could be weighed and analyzed to ensure he was complying with the research protocol. If he snuck a candy bar or a soda, the researchers would know, and he'd lose his "job" plus the bonus for completing the study. While I'm sure that study got high-quality data given the immense care it took, it surely tracked only *markers* (like blood lipids) rather than *outcomes* (like heart disease). That's because the outcomes we're interested in usually take decades to develop. It's hard enough finding people to live out of and poop into a bag they carry everywhere for *six months*. You'll never find anyone to do that for *ten years*.
So in nutrition, even an RCT can't be treated as some kind of gold standard for evidence-based decision making. If an RCT proves that A causes B, B is still not C, the thing we're actually interested in; B will at best be *correlated* with C. So whether we're talking RCTs or cross-sectional studies, we are just making a case for some kind of correlation. You need *multiple kinds of evidence*, replicated by multiple researchers multiple times. With that volume and variety of evidence, you eventually develop a picture which connects the dots between A and C in a way that is unlikely to be false in any of its particulars. Useful results are *always* big-picture results.
So the gold standard for evidence-based decisions should be a systematic review paper published in a well-known journal by a scientist currently working in the field. That's also the *minimal* level of evidence that people outside a field should pay any attention to, at least for the purposes of guiding decision making.
You concede that the model is not the complete picture while also trying to mock me for saying it's not the complete picture. Poor show.
Um, no. I'll repeat: none of the changes has a material impact on its general functioning. Indeed, some variants are even simpler than the original Hodgkin-Huxley model. But they all yield basically equivalent functional behavior.
For example, you can model the specific chemical behavior or thermodynamics behind neuronal behavior. Instead of assuming or measuring given synaptic strengths, you can model axons and dendrites to determine them. You can create models of multiple neurons at once. But none of these things change the overall behavior. The behavior modeled in the Hodgkin-Huxley model has held up for 2/3rds of a century. You are NOT going to debunk it in the comments section of a Slashdot article. Period.
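In case it helps to see what's actually under discussion, here is a minimal sketch of the original Hodgkin-Huxley equations for a single space-clamped membrane patch, integrated with plain forward Euler. The constants are my transcription of the standard textbook squid-axon values; treat this as an illustration, not a research tool.

```python
# Minimal forward-Euler integration of the 1952 Hodgkin-Huxley equations
# for a single space-clamped membrane patch. Constants are the standard
# textbook squid-axon values; illustration only, not a research tool.
from math import exp

C_M = 1.0                                # membrane capacitance, uF/cm^2
G_NA, G_K, G_L = 120.0, 36.0, 0.3        # peak conductances, mS/cm^2
E_NA, E_K, E_L = 50.0, -77.0, -54.387    # reversal potentials, mV

# Voltage-dependent opening/closing rates for the n, m, h gates
a_n = lambda v: 0.01 * (v + 55) / (1 - exp(-(v + 55) / 10))
b_n = lambda v: 0.125 * exp(-(v + 65) / 80)
a_m = lambda v: 0.1 * (v + 40) / (1 - exp(-(v + 40) / 10))
b_m = lambda v: 4.0 * exp(-(v + 65) / 18)
a_h = lambda v: 0.07 * exp(-(v + 65) / 20)
b_h = lambda v: 1.0 / (1 + exp(-(v + 35) / 10))

dt = 0.01                                # timestep, ms
v, n, m, h = -65.0, 0.317, 0.053, 0.596  # resting-state initial conditions
for step in range(int(50.0 / dt)):       # simulate 50 ms
    i_ext = 10.0 if step * dt > 5.0 else 0.0    # injected current, uA/cm^2
    i_na = G_NA * m**3 * h * (v - E_NA)         # sodium current
    i_k = G_K * n**4 * (v - E_K)                # potassium current
    i_l = G_L * (v - E_L)                       # leak current
    v += dt * (i_ext - i_na - i_k - i_l) / C_M
    n += dt * (a_n(v) * (1 - n) - b_n(v) * n)
    m += dt * (a_m(v) * (1 - m) - b_m(v) * m)
    h += dt * (a_h(v) * (1 - h) - b_h(v) * h)
    if step % 1000 == 0:
        print(f"t = {step * dt:5.1f} ms, V = {v:8.2f} mV")
```

Three conductances, four state variables, and you get spiking behavior that has matched experiment for two-thirds of a century.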
Even with this, when you're looking at a very narrow part of it, you're oversimplifying. Yeasts produce mostly ethanol, but they actually produce a range of different alcohols.
It doesn't matter, because none of that is what you care about. If you're a brewer and you provide yeast a given feedstock and conditions, you get a given output, within given plus-or-minus margins. It's irrelevant how complex the yeast is. It doesn't matter how complex the gene transcription mechanism is, or how complex the ribosomes are that assemble the cyclin-dependent kinases controlling the transition between the G1, S, G2, and M phases. What you get is, more or less, ethanol in water. The production mechanism is insanely complex. The output is trivial. We care about outputs, not production mechanisms.
And yes, we absolutely DO have wiggle room for a "more or less", "plus or minus" some percent, when we're talking about neuronal interactions. They're damped systems. Minor excursions get cancelled out.
Trying to produce an accurate model of precisely which alcohols are produced, and when, based on what inputs, is much, much more complex.
Industrial brewers can very readily predict their yields for given yeast strains, feedstocks, temperatures, etc. Their models might be considered "complex" to, like, a child looking at them, but the complexity involved in modeling yeast alcohol production is many orders of magnitude less than the complexity of the yeast themselves.
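To give a sense of how simple the brewer-facing model can be, here's the standard homebrewing rule of thumb for expected alcohol from measured gravities (a widely used approximation, not a rigorous model):

```python
# The standard brewer's shortcut: expected alcohol by volume from the
# specific gravity measured before and after fermentation. The 131.25
# factor is the common homebrew approximation.
def abv_percent(original_gravity: float, final_gravity: float) -> float:
    return (original_gravity - final_gravity) * 131.25

print(f"OG 1.050 -> FG 1.010: ~{abv_percent(1.050, 1.010):.2f}% ABV")  # ~5.25%
```

Insanely complex machinery inside the cell; one subtraction and one multiplication to model the output a brewer cares about.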
You're simply pushing a fallacy: that if a production mechanism is complex, the output must be equally or more complex. It's simply false.
HH model is just about conduction along the main axon
Loop back to my original post. And you don't have to model synaptic strengths: you can just measure them, or, for simulation, simply fix them. That said, they can be modeled. But you don't need to.
Do you need to, if the parameter that you care about is solar output for your solar panel, plus or minus a couple percent?
It's simply a logical fallacy to assume that if the cause has complex underpinnings, the output must as well. It doesn't work that way. It doesn't matter how complex the mechanisms are that neurons use to reproduce, or to resist viruses, or to metabolize sugars. What matters is the output behavior, and that has nothing to do with the overwhelming majority of the internals of the cell. We only care about modeling very specific output parameters. And it turns out, at least for output, this mechanism is very trivially modelable.
I'll repeat that randos on a Slashdot comments section aren't going to debunk the Hodgkin-Huxley model.
What's the difference between this and a gaming GPU?
NVidia's profit
Seriously, though: from an inference perspective (up to midsized quantized models), or for training or finetuning small models (or LoRAs on small-to-midsized ones), gaming GPUs are your best bet - far more compute for your dollar. The problem comes when your model gets any larger than you can handle in VRAM (especially challenging with training, since it uses a lot more VRAM). Neural network applications are bandwidth-limited. So these "professional" AI cards first off have a lot more VRAM onboard. They're not necessarily that much faster compute-wise, but their onboard memory capacity and memory bandwidth are much higher. Secondly, they're also designed for very high bandwidth to other such GPUs. You just can't do this with consumer-grade gaming GPUs; the bandwidth over PCIe x16 is just way, way too low, especially given that you've only got a max of 24GB per card.
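To put rough numbers on the VRAM wall (my own back-of-envelope figures, not vendor specs):

```python
# Back-of-envelope VRAM needs, in GB (1 GB ~ 1e9 bytes), for a model
# with `params_b` billion parameters. Illustrative estimates only.
def inference_vram_gb(params_b: float, bytes_per_param: float) -> float:
    # Weights only; the KV cache and activations add more on top.
    return params_b * bytes_per_param

def adam_training_vram_gb(params_b: float) -> float:
    # Mixed-precision Adam: fp16 weights + grads (2 + 2 bytes) plus fp32
    # master weights and two optimizer moments (4 + 4 + 4) ~ 16 bytes/param.
    return params_b * 16

for size in (7, 13, 70):
    print(f"{size}B params: ~{inference_vram_gb(size, 0.5):>5.1f} GB at 4-bit, "
          f"~{adam_training_vram_gb(size):>6.0f} GB to train with Adam")
```

A 24GB card handles 4-bit inference on a 13B model with room to spare, but even a 7B model trained naively with Adam blows far past it - hence the appeal of all that professional-card VRAM and interconnect.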
The hope of course is that model design (potentially via more advanced MoE designs) progresses to the point where training can be well localized to individual consumer-grade GPUs. But that's not the state of the art today.
And that despite all of its horrible inefficiencies.
But letting physics do your calculations in analog rather than relying on matrix math is a BIG advantage.
It's for this reason that I don't buy into Altman's obsession with massive power generation to feed future AI. At some point, we're going to move away from the global gradients of backpropagation, which prevent us from building neuromorphic hardware, and switch to localized models that can be represented in analog. Then you're no longer doing, e.g., multiplications and sums of weighted activations, but rather just measuring currents.
Sorry, but in over seventy years, the (Nobel Prize-winning) Hodgkin-Huxley model has NOT been debunked, and it certainly isn't about to be debunked today by "Serviscope Mirror at Slashdot". There have been various enhancements made to it over the years, but none of them has a material impact on its general functioning.
Yeast are incredibly complex beings, at a sub-cellular level. Yet the alcohol they produce is incredibly simple. Having complex subcellular machinery (which all cells have - again, primarily for structure, metabolism, reproduction, defense, etc) does not mean that the net consequences of the existence of said cell are themselves incredibly complex.
You don't have to model the fusion of every ion in the sun to be able to predict that 1.4 kW/m² of light will shine on the Earth from space, day after day, for millions of years. Net results do not inherently map to the complexity of the thing that generates them. And the net result of the machinery of neurons, from an inference perspective, is readily modeled as a trivial circuit.
(And for the record, I also "have my name on biology papers" - or to be more specific, bioinformatics)
First off, re: your irrelevant talk about proteins:
Proteins are linear chains, created from (linear) mRNA templates; the fact that they fold into complex 3D shapes doesn't change the fact that the information that encodes them is linear. The average protein is about 300 amino acids in length. With roughly 20 possible amino acids, you would need at most 5 bits (2^5 = 32 states) to represent the path diversity per amino acid, or about 1,500 bits per protein for equivalent complexity. In practice less, because many amino acids can substitute for others with minimal impact, and in some places you get very long repeats.
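Spelled out, under those assumptions:

```python
# The arithmetic above, spelled out. An illustrative upper bound, not a
# claim about any particular protein.
import math

amino_acids = 20                   # standard proteinogenic amino acids
bits_per_residue = math.ceil(math.log2(amino_acids))  # 5 bits cover 32 states
avg_protein_length = 300           # residues
print(bits_per_residue * avg_protein_length, "bits per average protein")  # 1500
```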
Why irrelevant, though? Because proteins aren't the basic unit of thought. That's neurons. Proteins don't "think". Prions don't have ideas. Collective groupings of neurons do. Proteins just lay out the structural foundation of the neuron, virtually all of which is the same as in any other cell, or simply lays out structural machinery; it doesn't encode the "logic" of training or inference. Or if you prefer analogies: the overwhelming majority of the proteins are just the silicon wafer, its pins, its casing, its cooling, its power supply, etc.
"Inference" in neurons has been well understood for over 70 years, and is highly predictable. It was first studied in the squid giant axon, since it's macroscopic (up to 1,5mm diameter), so easy to work with. It's quite straightforward, to the point that it can be represented by a simple circuit diagram. Inputs raise action potentials, action potentials decay, but if they exceed a threshold, a synapse is triggered. A VAST amount of different, complex proteins all work together, to build something that a child could assemble with a hobby electronics kit. Because the vast, overwhelming majority of that structure is dedicated to things like structure, metabolism, reproduction, repair, etc - things of utter irrelevance to digital circuits.
"Training" in neurons has been understood in generalities for a long time, but the specifics have been more difficult, because while on "inference" you're dealing with easily measurable spikes, learning in neurons involves difficult to measure internal chemical changes and triggers. However, increasingly, neurons appear to be a sort of non-Gaussian equivalent of a PCN. That is to say, unlike in backpropagation algorithms used in conventional ANNs, which involve global gradients, PCNs at a local level simply try to adjust their connection strengths and activation potential to make their firings match a weighted-mean firing of the neurons they connect to. Vs. traditional backpropagating ANNs, PCNs are slower and more memory-intensive to implement on traditional hardware, but they offer a wide range of advantages, including the potential for realtime learning during inference, no need for "layered" structures, the ability to have loops, the ability to change paths in realtime, the ability to have any arbitrary neurons be fixed outputs from which knowledge can propagate back through the system, and to enable / disable / swap those at any time, etc. They perhaps most importantly also appear readily possible to implement in analog neuromorphic hardware, as all required functionality is localized (traditional backpropagation algorithms are not).
Despite the gross inefficiencies of metabolism, of working via neurotransmitters, and the vast amount of overhead our brain dedicates to supporting cells (even the white-matter "data buses" in the brain need to be living, let alone everything that serves scaffolding, repair / cleanup, immune roles, etc.), the brain remains much more efficient than GPUs. Why? For the simple reason that it's analog. Neurons don't need to compute matrix math to determine weighted averages of activations - the laws of physics do it for them. It's like the difference between trying to determine how full a bucket of water will be when it's being filled by a bunch of pipes of varying diameters by simulating every water molecule flowing through the pipes, versus just, you know, measuring the water that ends up in the bucket. If ANNs ever switch to analog, they'll gain this advantage in spades.
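A toy illustration of the "measure the bucket" point: treat inputs as voltages and weights as conductances, and the weighted sum is just the current at a summing node - Kirchhoff's current law does the "math" (numbers below are arbitrary):

```python
# Digital version of what an analog crossbar gets for free: a weighted
# sum of activations. In analog hardware, applying these voltages across
# these conductances and measuring the current at a common node yields
# the same number with no arithmetic at all (I = sum of G_i * V_i).
voltages = [0.3, 0.9, 0.1, 0.5]       # input activations, volts
conductances = [2.0, 0.5, 1.5, 1.0]   # synaptic weights, siemens

i_total = sum(g * v for g, v in zip(conductances, voltages))
print(f"summed current (weighted activation): {i_total:.2f} A")  # 1.70 A
```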
Man, you seem stressed. Why not vent to ChatGPT about your problems and ask it for some advice on how to deal with the situation?
Elon himself has made the same mistake
I dunno, I'm going to wait for some actual benchmarks.
It's a gigantic model, by far the largest true-open-source (i.e., Apache-, MIT-, or similarly licensed) model out there. But I strongly suspect it's severely undertrained relative to its size.
Would probably be a great base for any research on increasing information density in models via downscaling, though.
Hackers of the world, unite!