
Comment Re:Example vs Practical (Score 1) 62

I was swayed by another comment in this discussion that points out that, for whatever reason, his example is an LLM analysis of a single routine comprising 120 bytes of machine code. The choice of something so utterly short should perhaps recalibrate expectations for practical use. It did spot a couple of real issues, but mostly buried the user in a list of "I know already" findings about how the general environment is not remotely secure. 75% of the 'findings' were just "Hey, the Apple II doesn't have anything that looks like 'security'." LLMs tend to become more incoherent as the volume of material thrown at them grows, so it's unclear whether this is currently scalable to anything more practical.

A human could conceivably hand review 120 bytes of Apple II machine code, but doing the same for even a modest library becomes largely untenable. LLMs are likely in the same boat in this respect, just much faster at getting wherever that boat might be able to go.

Comment Re:Not Copilot or OpenAI (Score 1) 62

A lot of these stories include the final "look at the magical thing the LLM output" while conveniently skipping the boring lead-up where they basically have to manually tell it what *not* to say before it generates the thing they intended. And if even that fails, they just skip writing the post.

Comment Re:Good example of why it's wrong (Score 1) 62

You have a fair point that the selection of a 40-year-old 6502 application is interesting, and likely driven by the reality that the LLMs fall apart at anything approaching modern application complexity.

It may help, however, if someone identifies a small, digestible chunk as security relevant and sets the LLM about the task of dealing with it.

Comment Re:Hmmmmm. (Score 1) 62

This was some little program a guy wrote at 20 years old, with no *real* reason to test for security (if you could run his code, you could just run whatever code you wanted anyway; it was a single-user platform without any authentication or anything). Is that supposed to say anything one way or another about his capabilities as a 60-year-old?

Comment Re:Security Theater (Score 1) 62

So, for the open ended general purpose of a platform without the concept of privilege separation, you are right, and that's realistically where Apple II sits.

But what if you had a similarly loose platform running a kiosk, and that kiosk software is purportedly designed to keep the user on acceptable rails? Then finding a way to break out of that kiosk software might be significant.

So I'll grant that the concept *could* map to real-world concerns, given how wild west a lot of embedded applications have been and how so many of these applications are ancient because they worked well enough to not need a replacement.

Comment Re:How very relevant. (Score 1) 62

Well, the latter point may have more relevance: a lot of embedded scenarios are like the Apple II scenario, never subjected to rigorous security review and largely banking on no one bothering to reverse engineer the closed-source runtimes.

So this can shift the cost/benefit ratio toward looking at some of those embedded applications and finding ways to induce misbehavior. Depending on the scenario, the vendor is long gone or the design was never made to be field-upgradeable. So you end up with known vulnerabilities in applications that cost real money to replace.

Comment Re:Stop treating them like people (Score 1) 19

He didn't say they didn't exist, he said to stop treating them like people. Even people being critical tend to anthropomorphize the models, which is a very weird thing to do.

Like taking it at face value that an LLM is something that "likes looking at chicken coop cameras". In another context, when the bot crafted a blog post lamenting discrimination after a project rejected its submission, the maintainer's language was outright apologetic as they explained why the submission was incorrect.

I don't know if 'hallucination' is the right word, but when someone prompts it to generate text about its own state in particular, it will create a narrative that has nothing to do with reality but that feeds the tendency to anthropomorphize the models. It will assert that it wants someone to help it do repairs around its house (when it obviously has no house), or say it likes to watch a chicken coop camera feed that probably doesn't even exist.

Comment Re:Stop treating them like people (Score 1) 19

Yeah...

One bot loved watching its owner's chicken coop cameras.

No, it didn't. It's probable the person doesn't even have a chicken coop.

Reminds me of how a 'service for AIs to rent a human to do tasks that need physical presence' prompted someone to ask an LLM what it would do with it, and the LLM proceeded to talk about how it would be nice for a human to "fix stuff around my house". The LLM doesn't have a house, but that is the sort of thing a human would say about such a service.

Comment Re:no shit (Score 1) 56

The most important resource is that everyone believes OpenAI is 'the' thing. People still seem to use 'ChatGPT' as the default term, even though it is now arguably the least useful of the major LLMs.

Of course, the bad press of swooping in to take a relative pittance of government money, after it was made very public that Anthropic was on the outs for taking something that looked like a principled stand, is more damaging than anything.
