
Comment Re: That's still bad. (Score 1) 33

The thing to remember is that it's a LANGUAGE model. It knows how to translate concepts between languages, which is really useful. But it is not an expert system.

This isn't correct. LLMs are language models, yes, but that description understates them, primarily, I think, by underestimating how much of the world is encoded in language. They could not generate reasonable-seeming output without also containing sophisticated models of the world, models that are almost certainly far broader and deeper than any expert system we've ever developed. And the newer LLMs aren't just LLMs, either. They have a reasoning overlay that enables them to reason about what they "know". This is actually extremely similar to how our brains work. The similarity is not accidental.

The proper way to use LLMs is as interfaces to other systems, rather than as standalone things.

Maybe, but I think your description misstates the LLM's role in such a hybrid system. Rather than the LLM being "just" an interface, I think you would ask the LLM to apply its own knowledge and reasoning to use the expert system in order to answer your question. That is, I think the LLM ends up being more like a research assistant operating the expert system than a mere interface.

However, if you're going to do that, do you really even need the expert system? Its role is to provide authoritative information, but curating and compiling that sort of authoritative data is hard and error-prone. You probably don't want the LLM to trust the expert system absolutely, but to weigh what it says against other information. And if you're going to do that, why bother building the expert system at all? Just let the LLM search whatever information you'd have used to curate the data for constructing the expert system, and compare that against the knowledge implicit in its model and perhaps elsewhere.

Many of our current-generation systems have access to the web. I've been using Claude and found it extremely good at formulating search queries, analyzing the content of large numbers of relevant pages and synthesizing a response from what it found. It annotates its output with links to the sources it used, too, enabling me to check up on its conclusions (I've yet to find a significant mistake, though it does miss important bits on occasion). It could be better, could analyze a little more, but it's already shockingly good and I'm sure it will get rapidly better.

This seems like a much more sensible way to make LLMs better than by backing them with exhaustively-curated expert systems. Yes, they will make mistakes, similar to how a human research assistant would. But this approach will ultimately be easier to build, and more flexible.
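
To make that concrete, here's a minimal sketch of the loop I'm describing, in Python. Everything here is hypothetical: client.chat, the tool-calling message shapes, and web_search are invented placeholders, not any real vendor's API.

    def web_search(query: str) -> list[dict]:
        # Placeholder search backend; a real one would return
        # [{"url": ..., "snippet": ...}, ...] from a search service.
        raise NotImplementedError

    def research(question: str, client) -> str:
        # The model drives: it decides when it needs evidence, formulates
        # the queries itself, and weighs results against its own knowledge.
        messages = [{"role": "user", "content": question}]
        while True:
            reply = client.chat(messages, tools=[web_search])
            if reply.tool_call is None:
                # Done searching: synthesize an answer, citing sources.
                return reply.content
            results = web_search(reply.tool_call.arguments["query"])
            messages.append({"role": "tool", "content": results})

The point of the sketch is the division of labor: the search backend is just raw evidence, while the model does the query formulation, weighing, and synthesis. That's the "research assistant" role rather than the "interface" role.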

As an aside, I had a very interesting experience with Claude the other day. I needed to analyze a bit of code to see whether it made a security vulnerability I had already identified unreachable, or whether some input could be crafted to produce the output the attacker needs to exploit the vuln. Claude did not immediately give me the right answer, but it pointed out important and relevant characteristics of part of the code and analyzed the result almost correctly. I pointed out errors in its conclusions; it corrected them while catching an oversight of mine; I caught further mistakes of its own; and so on. Over the course of about 10 minutes, we jointly arrived at a full and completely correct answer (bugs in the bit of code did indeed block exploitation of the vuln) and a concise but rigorous proof of why that answer was correct.

In sum: The interaction was pretty much exactly like working with a smart human colleague on a hard problem. Neither of us was right in the beginning, we both drew (different) partially incorrect conclusions, and we were fuzzy about different parts of the problem. In discussion we each pointed out flaws in the other's reasoning, and in the process both came to understand the problem better until we finally arrived at the correct conclusion. I contributed more than Claude did; I think I can safely say that, at least in the area of code analysis, I'm smarter than Claude (though Claude is faster). But this really wasn't a case of "rubber ducking": Claude also contributed significant insights.

LLMs today are not just stochastic word generators (assuming that phrase ever had any real meaning). If you think they are, you haven't used them much.

Comment Re:That's still bad. (Score 2) 33

Hallucination is not some rare-and-random bug though. It is intrinsic to the nature of large language models (based on what I have read, anyway).

I think this particular form of hallucination -- inventing an explanation for an event -- goes even deeper and may be intrinsic to intelligence, period; at minimum it's a characteristic that humans share.

There are a number of fascinating and clever psychological experiments that demonstrate this. Just one example: researchers asked people to answer a set of questions, and then a few weeks later asked them to explain why they had answered the way they did. But in random cases, the researchers changed the answers and asked people to justify answers they had never actually given. It was done subtly enough that few people realized the answers had been changed, even though they were often inverted. The really interesting part is that people were just as good at explaining the answers they didn't give as the answers they did.

Those and many other experiments seem to indicate that the primary job and ability of the reasoning layer of our mind is not to figure out what's right or wrong, or even what we do or don't want, but instead to invent explanations justifying whatever we already think at some deeper, non-verbal layer. There doesn't seem to be any evidence that the reasoning layer gets any hints from the deeper layer, either; it just finds something that makes sense. And we're extremely good at this.

The explanations we invent often don't hold up to scrutiny, but it's clear that our reasoning layer doesn't apply much scrutiny, at least not by default. We can vastly improve the accuracy of our reasoning just by making ourselves think through the process of explaining and defending it to another person, even without actually involving another person. This appears to work because while our reasoning ability is very good at inventing explanations, it's perhaps even better at identifying logical deficiencies in other people's explanations. So the mental exercise of pretending to explain to someone else engages our reasoning to poke holes. And of course it's even better to actually engage with another person.

The evolutionary advantages for a species that lives cooperatively but with internal competition are obvious. The person who is better at generating good explanations and poking holes in others' explanations will get their way more often, enabling them to reproduce and ensure the survival of their offspring. And because the rules of logic we apply actually work, in the sense that they help us come to objectively correct conclusions about the world, tribes that argue with each other will make better decisions, improving the odds of their offspring's survival.

Anyway, back to AI: it appears that's exactly what Sam did in this case. The backend system made a decision (to reject access), and while Sam didn't know the actual reason for the rejection, it invented a plausible one.

As a security engineer, it really makes me laugh that the AI chose an explanation that attributed the rejection to security. It's so common for humans to do exactly this. A huge percentage of the time, when I see security-related policy statements about systems whose security I think I understand, there is no actual security justification for the policy. Sometimes the speakers think there is, but they don't really understand security, and they're wrong. Sometimes the policy really is based on misunderstood security concerns, but I suspect that much of the time it's based on completely different concerns, and security gets invoked either as an intentional deception or, as with Sam, because the entity generating the explanation doesn't know the real reason and invents something plausible.

If anyone is interested in what I think is our best understanding of how human reasoning appears to work, I highly recommend "The Enigma of Reason", by Sperber and Mercier.

Comment Re:The trump path dont work so well anymore (Score 1) 66

"We're a country that buys more of americas stuff than they buy of ours. We've fought almost every war they've fought since WW2 , kitted our army in stupidly expensive stuff we've brought from them, in winter we send our firefighters over to help out and train american firefighters (Australian, Canadian and Californian firefighters are easily the best in the world) and followed their voting in *almost* every election. And we get treated like this."

This is an argument that la Presidenta does not understand, because he has no memory of it happening and is ineducable. And even if you did get to explain it to him, he wouldn't understand what you were saying, or care. It doesn't increase his wealth or feed his ego; he has no use for Australia.

Comment Re:Had to throw in drug & human traffickers (Score 1) 57

Or ignoring those nice Russkies stealing Ukrainian children and refusing to give them back. la Presidenta is above moral degradation like a brick is above the Sargasso Sea (to steal a phrase from Douglas Adams). His potted plant at the alleged State Dept. is now saying they won't entertain negotiations to end the Ukraine invasion for very long.

So the Great Putini now has his marching orders: wait until some other bright shiny object distracts la Presidenta and then go for the whole of Ukraine, depopulate it of Ukrainians, repopulate it with Russians, and return it to its former self of poverty and state repression.

Comment A Titanic Success (Score 4, Funny) 66

the idea of investors backing *her* to slap her branding on something and sell it is something of a head-scratcher.

Well, people pay insane sums of money for items recovered from the wreck of the Titanic, and that was something launched with great fanfare only to be taken out by an iceberg a few days later - just like Liz Truss.

Comment Re: Something fundamentally wrong (Score 1) 252

It would mean that every company that produces W-2s and 1099-* forms could add support for their versions of the forms

What? You think corporations/individuals have their own 'versions' of the form?

I don't think you understand how federal tax forms work; everyone uses the same forms, the same 'version' if you will...

I'm not talking about the forms that companies file with the IRS. I'm talking about the forms that the companies send you. And no, those aren't standardized. Not by a long shot. They theoretically have the same information, give or take. But that's not the same thing as actually being standardized in terms of the format, the layout, how the data is represented, etc.

My W-2s have all been pretty much standardized except for perhaps what the letter codes mean (not sure about that). But those are also a dozen lines in total, so even if they weren't, it wouldn't be a big deal.

However, every single company I get a 1099-DIV, 1099-INT, or 1099-B from formats it differently. E*Trade and Morgan Stanley look pretty similar, but not identical. Edward Jones is radically different. The one from my hometown bank in West Tennessee looks like somebody typed it on a typewriter, with a description of each line followed by a dollar amount in two columns. They couldn't look more different if they tried. Heck, they aren't even consistent about where they put dollar signs from one company to the next.
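
To illustrate with invented layouts (not any real issuer's format): even recovering a single box requires issuer-specific parsing once the presentation varies, which is exactly why importers struggle.

    from dataclasses import dataclass

    @dataclass
    class Int1099:
        payer: str
        interest_income: float  # Box 1 on the IRS form

    def parse_boxed(payer: str, row: dict) -> Int1099:
        # Style A: labeled boxes, dollar sign embedded in the value.
        return Int1099(payer, float(row["Box 1"].lstrip("$")))

    def parse_typewriter(payer: str, lines: list[str]) -> Int1099:
        # Style B: "description ... amount" per line, no box numbers at all.
        for line in lines:
            desc, _, amount = line.rpartition(" ")
            if desc.strip().lower() == "interest income":
                return Int1099(payer, float(amount.lstrip("$")))
        raise ValueError("no interest income line found")

Two parsers for one IRS box, and every new issuer potentially adds a third.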

1099-Bs are a particular nightmare. The 1099-B for one of my managed fund accounts had close to a thousand stock transactions on it, every one of which had a sale price and a basis price, and TurboTax imported maybe two-thirds of them correctly, leaving me to manually fix several hundred entries.

And TurboTax failed to import the basis from the 1099-B on my company stock account, which was set up for autosale, which meant I had to hand-edit dozens of lines there, too.

If you somehow miraculously haven't encountered absolute chaos when dealing with 1099-INT and 1099-DIV and 1099-B forms, count your blessings, because it is miserable.

Comment Re:Lottery Is Theft (Score 1) 66

In other words you believe Usury doesn't exist if the slave was forced to enter into the contract in order to eat.

I don't follow; the discussion was about lottery tickets, not usurious loans. Are you suggesting that people are being forced to buy lottery tickets in order to keep themselves fed? Or are you having a separate imaginary discussion in your head that is unrelated to anything I posted?

Comment Re:Starting a social media platform (Score 4, Informative) 66

Yeah, it's not technically difficult; you just do what Trump did and slap your branding on Mastodon or something like that.

I think the reason it's news is that Liz Truss has a 14% positive and 64% negative rating, so the idea of investors backing *her* to slap her branding on something and sell it is something of a head-scratcher.

Comment Careful what you wish for (Score 2) 24

This will allow those people to stay in the program, and graduate with a CS degree.

Is that what you really want though? Lots of people with CS degrees that they only managed to get because Google's AI provided the answers they needed to pass? To me it sounds more like a quick way to make a CS degree close to worthless.

Comment Re:The People Voted For This (Score 2) 153

It's not surprising that Republican support hasn't changed much, because of poll methodology. When pollsters want to sample a given number of "Republicans", they don't check voter registrations; they just call people up and ask their party affiliation. But self-reported party affiliation shifts with how people feel about the party, or in the Republicans' case, the party leader. So many "never Trumpers" who were previously lifelong self-identified "Republicans" now identify as independent.
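
A toy calculation (all numbers invented) shows the composition effect: approval measured among self-identified Republicans can hold steady or even rise while support among the original group falls, because the disapprovers leave the sample frame.

    # Invented numbers, purely illustrative.
    original_group = 1000          # lifelong self-identified Republicans
    approvers = 700                # 70% approve at time 1

    # Time 1: everyone still tells the pollster "Republican".
    print(approvers / original_group)               # 0.70

    # Time 2: 200 disapprovers now answer "independent" instead.
    defectors = 200                                 # all drawn from the 300 disapprovers
    still_identifying = original_group - defectors  # 800 left in the sample frame
    print(approvers / still_identifying)            # 0.875 -- approval appears to rise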
