
Comment Re:They are going from 4.5 to 4.1? (Score 1) 13

Since R1 has good reasoning but no real breadth, and is open source, the logical thing would be to modify R1 to pre-digest inputs and create an optimised input for 4.1. The logic there is that people generally won't provide prompts ideally suited to how LLMs work, so LLM processing will always be worse than it could be.

R1 should, however, be ample for preprocessing inputs to make them more LLM-friendly.
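The two-stage idea above can be sketched in a few lines. This is a minimal illustration, not a real integration: both model calls are stand-in functions (the names `r1_preprocess` and `gpt41_answer` are hypothetical), and actual use would call each vendor's API in their place.

```python
def r1_preprocess(raw_prompt: str) -> str:
    """Stand-in for an R1 call that restructures a raw user prompt.

    A real call would ask R1 to extract the goal, constraints, and
    context, and emit a cleanly structured, LLM-friendly prompt.
    """
    return ("Goal:\n" + raw_prompt.strip() +
            "\nConstraints: answer concisely; state assumptions.")


def gpt41_answer(structured_prompt: str) -> str:
    """Stand-in for the 4.1 call that produces the final answer."""
    return f"[4.1 response to: {structured_prompt[:40]}...]"


def pipeline(raw_prompt: str) -> str:
    # The whole proposal in one line: pre-digest, then answer.
    return gpt41_answer(r1_preprocess(raw_prompt))


print(pipeline("explain how tides work, im confused about the moon"))
```

The point of the split is that the cheap, open-weights reasoning model absorbs the messiness of human prompts, so the main model always sees well-structured input.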

Comment Re:A tedious film (Score 1) 76

The removal of the scene with Jabba at the entrance to the Millennium Falcon (as a human; forget the CG version that makes Jabba a slug) makes the story that much more compelling -- we have a reference to an unknown, ongoing potential threat, and the audience's imagination back-fills a story about Han in a way that including the scene with Jabba does not. Thankfully, the original editors understood that, even if George Lucas still does not.

Think of the scene from Breaking Bad where Saul Goodman claims he's a friend of the cartel and mentions Lalo by name. Instantly, Saul's backstory fills with rich complications that define the character, without introducing Lalo (until the *fourth* season of Better Call Saul). It's exactly the same storytelling technique.

Comment Re:Hmmm. (Score 1) 37

Gemini struggles. I've repeated the runs on Gemini and can now report that it says the files were truncated two times out of three. After reading in the files, it struggles with the prompts -- sometimes erroring out, sometimes repeating a previous response, sometimes giving only part of an answer.

When it does give some sort of answer, further probing shows it failed to consider most of the information available. In fairness, it says it is under development, and handling complex data sources clearly needs work.

But precisely because "under development" is its own answer, I must conclude it is not ready yet.

I would be absolutely happy if any of these AI companies were to use the files as a stress test, but I'm sure they have better stress tests already.

Still, it would be very unfair of me to argue a need for improvement if I were to insist on not providing either evidence or a means of testing.

Comment Re:Hmmm. (Score 1) 37

I'm using the web interface, attaching the documents via the file attachment button. The prompt explicitly instructs the model to use the largest input window available and to report any file that failed to be handled correctly. It gives the OK for most of the files, then reports that several got truncated.

Yeah, if it were something simple like input exhaustion, they'd not be making the claim; that would be the first thing anyone tested. So I'm reasoning that the problem has to be deeper: the overflow in Gemini's case is in the complexity, in trying to handle the relationships.

Testing it further, after just a short list of questions, the answers cease to have anything to do with the prompt. This lends credence to the notion that it's the relating of ideas that kills Gemini, not the file size.

Comment Re:Another excuse for corruption (Score 1) 224

You're correct. It also violates WTO rules, but the Republicans are actively blocking the WTO from operating.

Sovereign immunity is a dangerous game. Although the Magna Carta has long since ceased to be a factor in law, it did actually address the specific problem of sovereign immunity, proposing a special court for the sole purpose of trying those with such immunity, preventing bogus lawsuits but also providing a way to hold such people to account.

The US has attempted to use Congress for this, but we've now seen that "lawful" bribery makes it a useless mechanism for that purpose.

Comment Re:Hmmm. (Score 1) 37

So, to answer your question, none of the AIs I've tried can cope. There may be AIs I've not tried (ideas welcome), but the ones I have just don't do what they claim in terms of data digestion, which leads me to conclude that the hidden overheads underpinning their methods are too large.

Comment Re:Hmmm. (Score 1) 37

Claude and ChatGPT manage to read the files, but the context window is too small to process them.

Grok, DeepSeek, and amazingly even Gemini choke on the files. Gemini, multi-million-token windows notwithstanding, reported that the specification files (of which I have 13, plus another 9 containing contextual information) are too big to process, even though they're only about 20k tokens each.
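For scale, the figures above can be sanity-checked with back-of-envelope arithmetic (assuming the stated ~20k tokens per file): the whole set should land well under a multi-million-token window.

```python
# Rough total from the figures quoted above: 13 spec files plus
# 9 contextual files, at roughly 20k tokens each.
spec_files = 13
context_files = 9
tokens_per_file = 20_000  # stated rough size per file

total_tokens = (spec_files + context_files) * tokens_per_file
print(total_tokens)  # 440000 -- far below a multi-million-token window
```

So if the claimed window were the real constraint, these files shouldn't come close to exhausting it, which is what makes the truncation reports suspicious.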
