Comment In another dimension... (Score 2) 116

by thragnet on Friday July 10, 2026 @12:31PM (#66231678) Attached to: Nobel-Winning US Chemist Will Move to China to Lead AI Institute

Carroll O'Connor applauds you.

Comment Re:Mathematician commentary included (Score 1) 83

by thragnet on Monday June 01, 2026 @12:33PM (#66169552) Attached to: OpenAI Claims It Solved an 80-Year-Old Math Problem

My opinion is that we don't know what these models are doing

Agreed. So Gemini can answer my original question "do llms perform logical inference" in either the negative or affirmative, and I should assign equal validity to either (or any) response. Which is to say: none.

For proofs, it's fine. A proof is always independently verified before being published.

Agreed. But let's also agree that the verifier should be a human or a symbolic system (better both), at least for now.

Comment Re:Mathematician commentary included (Score 1) 83

by thragnet on Monday May 25, 2026 @01:46PM (#66159890) Attached to: OpenAI Claims It Solved an 80-Year-Old Math Problem

Okay, same prompt after Wipe Chat History & Learned Preferences (Hard Reset):

do llms perform logical inference

Large Language Models (LLMs) simulate logical inference rather than performing it using formal, rule-based reasoning. They operate via statistical pattern recognition, predicting answers based on token correlations learned from training data, which allows them to mimic deductive and inductive steps but leaves them prone to logical fallacies and hallucinations. [1, 2, 3, 4, 5]

How LLMs Perform Inference
LLMs produce outputs through the following mechanisms:
Probabilistic guessing: Rather than evaluating mathematical proofs, they determine the most mathematically probable next word based on the text that came before.
Contextual mimicry: They excel at logic puzzles because they have seen similar structures during training, not because they possess an internal "understanding" of the rules. [1, 2, 3, 4, 5]

Techniques to Improve Reasoning
To overcome their lack of inherent logical architecture, advanced LLMs use specific prompting and architectural techniques to generate logical inferences: [1]
Chain-of-Thought (CoT): Asking a model to "think step-by-step". This breaks complex problems into smaller, sequential parts, allowing the model to predict logical premises progressively.
Inference-Time Scaling: Newer reasoning-focused models spend additional computational power before answering by running hidden "thought" steps or evaluating multiple possibilities internally.
Tool Integration: Advanced systems rely on external programming or logic solvers, essentially passing the heavy-lifting of formal logic (like math and coding) to deterministic calculators. [1, 2, 3, 4, 5]

The Limits of Model Logic
When relying purely on LLMs for logical deduction, you will often encounter distinct failure modes: [1]
Lack of Soundness: They can generate valid-sounding explanations that are logically invalid (hallucinations).
Fragility: If you change the names of variables or the context of a word problem, their predictive abilities often fail.
Sensitivity to Prompts: Their reasoning accuracy fluctuates depending on how a question is framed. [1, 2, 3, 4]
For critical tasks that require absolute certainty and absolute mathematical or logical rigor, relying solely on an LLM is risky. It is recommended to use formal, deterministic systems (such as Python compilers, symbolic logic solvers, or mathematical theorem provers) backed by authoritative evaluation methods. [1, 2, 3, 4, 5]

So we have three distinct responses ! Seriously, no coaching or cheating here ! This is exactly the model we need for weapons release, as it satisfies the Madman Theory of Negotiation (https://en.wikipedia.org/wiki/Madman_theory). Yeah, I'll totally rely on this for future decisions and current education.

Kidding aside, doesn't this exchange terrify you just the least little bit ? We're seriously using this technology out-of-the-lab ? Did it ever pass its regression test suite (did it ever have one?) ? Is all the universe a beta for these tools ?

Comment Re:Mathematician commentary included (Score 1) 83

by thragnet on Sunday May 24, 2026 @12:33PM (#66158486) Attached to: OpenAI Claims It Solved an 80-Year-Old Math Problem

I took your advice and queried a modern LLM, i.e., Gemini, as follows:

do llms perform logical inference

Its response follows, and seems a bit less than a full-throated affirmation of your position (perhaps it was hallucinating):

No, standard Large Language Models (LLMs) do not perform true logical inference. Instead, they generate answers through sophisticated, probabilistic pattern matching. They excel at "simulating" reasoning by predicting what words usually follow a given premise, but they do not process, understand, or apply formal logical rules. [1, 2, 3]
The scientific consensus surrounding LLM logical capabilities highlights several key limitations:

Pattern Mimicry vs. Understanding: Models often get logical puzzles correct because they have seen the problem structures (and their common solutions) frequently in their training data. When the wording of a puzzle is slightly changed, their performance degrades significantly.

The Chain-of-Thought Illusion: Techniques like step-by-step or chain-of-thought prompting make the model generate plausible reasoning narratives. However, these are fluent simulations of logic rather than the actual internal calculation of truth.

Susceptibility to "Fluent Nonsense": Because they rely on probability rather than validity, LLMs are known to generate logically flawed conclusions that sound highly convincing, posing risks in domains like legal, medical, or financial analysis.

Emergence of Reasoning Models: Some specialized LLMs (frequently categorized as Reasoning Models or LRMs) are trained to use additional computational resources to verify and revise their own logic before finalizing a response. Even then, they simulate reasoning through iterative generation rather than engaging in absolute formal deduction. [1, 2, 3, 4]

For guaranteed logical inference, developers still often rely on conventional, rule-based programs rather than unassisted LLMs. [1]

Comment Re:Mathematician commentary included (Score 1) 83

by thragnet on Saturday May 23, 2026 @01:18PM (#66157232) Attached to: OpenAI Claims It Solved an 80-Year-Old Math Problem

You have found my pain point: I'm "hung up" on the fact that they don't verify the proof with 100% certainty. Humans have agreed on the notion of implication (in the logical, 100%/Boolean inference sense) for thousands of years, and have been able to check each other's proofs for the same length of time. I'm sure (with logical, 100%/Boolean accuracy) that the Pythagorean Theorem holds in a Euclidean space, because I can review the steps taken to verify it and understand how each step logically (not just syntactically) follows from the prior step.

Can the same be said of NNs ? Are they trained on, and do they merely provide, fuzzy (in the model inference sense, not the fuzzy logic sense) syntactical theories of axioms and theorems that were originally derived with logical inference and actual semantics ? How do they verify their claims ? Yeah, with human-built proof verifiers, which are (generally) built with human-built symbolic systems. I'm guessing there's something qualitatively different between present-day neural networks and the human brain, some kind of ability to generalize (implication as a meta-pattern, whatever the hell a meta-pattern is ? cause-and-effect ? some other emergent behavior ?), but damned if I know. Which is not to say that NNs cannot exhibit such behavior; I just haven't seen it yet. But I have recently been diagnosed with cataracts, so...

Comment Re:Mathematician commentary included (Score 1) 83

by thragnet on Friday May 22, 2026 @01:03PM (#66156002) Attached to: OpenAI Claims It Solved an 80-Year-Old Math Problem

Both you and Chris above are correct. I checked the definition of "statistical inference", and my usage was not-even-wrong. My apologies.

What I was trying to say (poorly) is that NN inferencing is based on a statistical model, and cannot, by itself, provide logical inferencing proofs, which I consider the sine qua non of reasoning. Agree or disagree ?

Comment Re:Mathematician commentary included (Score 1) 83

by thragnet on Friday May 22, 2026 @01:02PM (#66156000) Attached to: OpenAI Claims It Solved an 80-Year-Old Math Problem

Both you and Rei below are correct. I checked the definition of "statistical inference", and my usage was not-even-wrong. My apologies.

What I was trying to say (poorly) is that NN inferencing is based on a statistical model, and cannot, by itself, provide logical inferencing proofs, which I consider the sine qua non of reasoning. Agree or disagree ?

Comment Re:Mathematician commentary included (Score 2) 83

by thragnet on Thursday May 21, 2026 @08:05PM (#66155000) Attached to: OpenAI Claims It Solved an 80-Year-Old Math Problem

The words to describe the process that you have identified are statistical inference, not logical inference. I don't believe that you can square that circle; it's why NNs are said to interpolate, but not extrapolate. But my beliefs are, shall we say, flexible -- I'm open to a counter-argument.

Comment Re:Mathematician commentary included (Score 1) 83

by thragnet on Thursday May 21, 2026 @02:03PM (#66154384) Attached to: OpenAI Claims It Solved an 80-Year-Old Math Problem

My understanding is that LLMs are built on a foundation of ANNs, and that indeed the backpropagation used to train ANNs is a statistical process; the cost function that must be minimized (via vector calculus) is a least-squared-error variant, a decidedly statistical calculation. How does this not make the model statistical ?
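A minimal sketch of what "minimizing a squared-error cost with gradients" looks like, assuming a one-weight linear model and made-up data. The names `mse`, `mse_grad`, and `fit`, the learning rate, and the data points are all hypothetical. As far as I know, LLMs are commonly trained with a cross-entropy loss rather than squared error, so this only illustrates the general shape of the procedure, not any particular model's training code.

```python
def mse(w, xs, ys):
    """Mean squared error of the model y = w * x on the data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_grad(w, xs, ys):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def fit(xs, ys, lr=0.01, steps=500):
    """Plain gradient descent on the single weight w, starting from zero."""
    w = 0.0
    for _ in range(steps):
        w -= lr * mse_grad(w, xs, ys)
    return w


# Made-up, noisy observations of roughly y = 2x (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

w = fit(xs, ys)
print(w, mse(w, xs, ys))
```

The fitted weight is the one that makes the average squared error smallest on this data; nothing in the loop checks whether "y = w * x" is true in any logical sense.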

Comment Re:Did it use Lean ? (Score 2) 83

by thragnet on Thursday May 21, 2026 @01:15PM (#66154314) Attached to: OpenAI Claims It Solved an 80-Year-Old Math Problem

I am in no way dismissing hybrid systems that incorporate Lean, or similar symbolic apparatus. To the contrary, I find that there must be some "symbolic assist" (i.e., some predicate calculus engine) to pure NN systems to verify that their pattern match is a valid proof. But I'm more than willing to be proven wrong. I just don't see how to get past the fact that statistical inference is not the same as logical inference, at least in the province of proofs. But WTH, feel free to educate me.
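For a sense of what a symbolic check looks like, here is a small Lean 4 sketch, assuming nothing beyond core Lean. The propositions `p`, `q`, `r` and the hypothesis names are placeholders. The point is only that the checker accepts each line because the proof term has exactly the stated type, not because it resembles other proofs.

```lean
-- Modus ponens: from p and p → q, conclude q.
example (p q : Prop) (hp : p) (hpq : p → q) : q := hpq hp

-- Chaining two implications: from p → q and q → r, conclude p → r.
example (p q r : Prop) (hpq : p → q) (hqr : q → r) : p → r :=
  fun hp => hqr (hpq hp)
```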

Comment Did it use Lean ? (Score 1) 83

by thragnet on Thursday May 21, 2026 @12:32PM (#66154246) Attached to: OpenAI Claims It Solved an 80-Year-Old Math Problem

Or any other proof assistant / verifier ? Is this true NN reasoning, or just more LLM/NN spaghetti thrown up against a symbolic verifier ?

Comment Re:Reason (Score 1) 91

by thragnet on Thursday April 16, 2026 @05:44PM (#66097614) Attached to: Boston Dynamics' Robot Dog Can Now Read Gauges, Spot Spills, and Reason

| If every token an LLM produces is statistically inferred (a fair claim) does that mean the entire output is a statistical inference?

Yes. Is the Pythagorean theorem an approximation ? Within the limits of measurement, yes. In geometric logic, no.

| Now what if I told you that a neuron in your brain can only be accurately modeled statistically? Is it statistical inference all the way up?
| Or do we accept that systems that are inherently based on statistical inference (broad definition, not narrow) can, to a high degree of accuracy, approximate logical inference?

Can only be accurately modeled statistically ? How long will you argue that this will be true ? Obviously, many disciplines (e.g., thermodynamics) make your argument, but that is only a feature of the model - it works better than anything we have found to date. Guess I'm a hard reductionist, but you may be right.

Comment Re:Reason (Score 1) 91

by thragnet on Thursday April 16, 2026 @02:52PM (#66097246) Attached to: Boston Dynamics' Robot Dog Can Now Read Gauges, Spot Spills, and Reason

| Have you ever seen the results of a logic test described in a non-statistical way?

Yes! Multiple times in my high school geometry class, and thereafter -- many times to my sorrow. Also in sentential logic tests. The proofs/answers that I submitted were either correct or incorrect, and could be verified by deduction or induction.

| I fail to see how it isn't obvious that any kind of logical inference is merely an approximation.

You might take this up with George Boole; he is far more of an authority than I am. Snark not meant maliciously, I always learn from your commentary.
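The "either correct or incorrect" point can be sketched with a brute-force truth table, assuming plain propositional (sentential) logic. The helper names `is_valid` and `implies` are hypothetical. An argument form is valid exactly when no assignment makes every premise true and the conclusion false, so the answer is a yes or a no, not a score.

```python
from itertools import product


def implies(a, b):
    """Material implication: a -> b."""
    return (not a) or b


def is_valid(premises, conclusion, num_vars):
    """True if the conclusion holds in every row where all premises hold."""
    for values in product([False, True], repeat=num_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # found a counterexample row
    return True


# Modus ponens: p, p -> q, therefore q.
modus_ponens = is_valid(
    [lambda p, q: p, lambda p, q: implies(p, q)],
    lambda p, q: q,
    2,
)

# Affirming the consequent: q, p -> q, therefore p.
affirming_consequent = is_valid(
    [lambda p, q: q, lambda p, q: implies(p, q)],
    lambda p, q: p,
    2,
)

print(modus_ponens, affirming_consequent)
```

The second form fails on the row where p is false and q is true, which is the whole counterexample.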

Comment Re:Reason (Score 1) 91

by thragnet on Thursday April 16, 2026 @01:03PM (#66096960) Attached to: Boston Dynamics' Robot Dog Can Now Read Gauges, Spot Spills, and Reason

Do you have a counter-argument to the statement that statistical inference is not the same as logical inference ? I fail to see that an NN/LLM can perform logical inference. My understanding of NN/LLM proof systems is that they detect proof patterns in their corpus, but, having identified candidate proofs, submit them to Lean (or some other GOFAI, e.g., Prolog) for verification. Which suggests to me that NN/LLM cannot reason, which I define as predicate calculus. But I'd like to hear your view.
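The propose-then-verify arrangement described above can be sketched as a toy, assuming a random guesser stands in for the pattern matcher and an exact arithmetic check stands in for Lean. Every name here (`propose_candidates`, `verify`, `generate_and_verify`) is hypothetical, and nothing below calls Lean or any real prover.

```python
import random


def propose_candidates(n, rng, how_many=200):
    """Stand-in for the statistical proposer: guesses with no guarantee of correctness."""
    return [(rng.randint(2, n - 1), rng.randint(2, n - 1)) for _ in range(how_many)]


def verify(n, candidate):
    """Stand-in for the symbolic verifier: an exact, deterministic check."""
    a, b = candidate
    return a * b == n


def generate_and_verify(n, seed=0):
    """Keep only the guesses that the verifier accepts."""
    rng = random.Random(seed)
    return [c for c in propose_candidates(n, rng) if verify(n, c)]


# Claim to be checked: "91 factors as a * b" for some guessed pair (a, b).
accepted = generate_and_verify(91)
print(accepted)
```

Whatever ends up in `accepted` is trustworthy only because `verify` passed it; the proposer contributes candidates, not certainty.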

Comment Re: Elon Musk Announces $20 BS 'Terafab' Chip Plan (Score 2) 126

by thragnet on Sunday March 22, 2026 @06:22PM (#66054892) Attached to: Elon Musk Announces $20B 'Terafab' Chip Plant in Texas To Supply His Companies

Should read 'Terafib'.
