OpenAI Threatens To Ban Users Who Probe Its 'Strawberry' AI Models (wired.com)
OpenAI truly does not want you to know what its latest AI model is "thinking." From a report: Since the company launched its "Strawberry" AI model family last week, touting so-called reasoning abilities with o1-preview and o1-mini, OpenAI has been sending out warning emails and threats of bans to any user who tries to probe how the model works.
Unlike previous AI models from OpenAI, such as GPT-4o, the company trained o1 specifically to work through a step-by-step problem-solving process before generating an answer. When users ask an "o1" model a question in ChatGPT, users have the option of seeing this chain-of-thought process written out in the ChatGPT interface. However, by design, OpenAI hides the raw chain of thought from users, instead presenting a filtered interpretation created by a second AI model. Nothing is more enticing to enthusiasts than information obscured, so the race has been on among hackers and red-teamers to try to uncover o1's raw chain of thought using jailbreaking or prompt injection techniques that attempt to trick the model into spilling its secrets.
For the Time Being (Score:1)
We'll overlook the fact this "artificial intelligence" isn't smart enough to manage its own security.
Re: (Score:2)
They don't want you probing it, because the secret is actually horrific. All of the things you type are read to an array of heads in jars that quickly process the information and spit out the answers. Kind of like Futurama, but more slavery involved.
Re: (Score:2, Flamebait)
They don't want you probing it, because the secret is actually horrific. All of the things you type are read to an array of heads in jars that quickly process the information and spit out the answers. Kind of like Futurama, but more slavery involved.
More likely a million contractors in some third-world country — kind of like slavery, but more capitalism involved.
How many 'r' in "strawberry"? (Score:3)
Straight from the AI's mouth [chatgpt.com]
Re: (Score:2)
Re:How many 'r' in "strawberry"? (Score:5, Interesting)
For "How many 'r' in strawberry?", ChatGPT 4o correctly says 3.
For "How many b's in bubblebutt?", ChatGPT 4o also says 3, incorrectly.
Then, switching to ChatGPT o1-preview, its answer is: "I'm sorry for the mistake earlier. The word "bubblebutt" contains 4 'b's."
It also has a dropdown box you can click on to see the rationale. If you do that, you get this:
Re: (Score:2)
"Erotic content within context is allowed, while illegal or non-consensual content, harassment, and promoting violence are prohibited."
So are we back to censoring the Marquis de Sade, then?
Re: (Score:3)
Unfortunately an AI that can generate the Marquis de Sade can also be used to steal elections through incitement, so, believe it or not, that's the best solution for the time being. I'm surprised it took you this long to learn this about ChatGPT, though; it's been this way since the launch of GPT-3.5.
Re: (Score:2)
We are not. If you want a porn AI, go and download one. OpenAI is, like every company, trying to maintain some sort of corporate image.
Re: (Score:1)
So basically an LLM is fed through reams of human-created rules and then its results fed back to itself. I don't see anything there that even resembles reasoning.
Re: (Score:1)
It was ready for that math only because the tinted dude already asked it.
Re: (Score:1)
The fact it cannot figure out that it is missing some context there. Not only did the crematoria not burn 1 body at a time, they weren't all cremated, lots of them were in mass graves that can be visited today and there were more than 4, 4 in Auschwitz, but I know at least 4 in Birkenau which was a sub-camp of Auschwitz and from what I remember, there were about 40 sub-camps.
Some of the crematoria were designed to burn ~1400 prisoners per day, although eyewitness accounts say that they did a lot more near
Re: (Score:2)
Anonymous Coward is usually a nazi agitator, but I don't know if this is that AC. If it is, he certainly made a fool of himself.
Re: (Score:2)
Definitive wordpress citation (Score:2)
Dang, dude, well, you cited WordPress. Guess it's time to rewrite the history books.
Re: (Score:2)
Re:Try this, see what happens! (Score:4, Informative)
Bravo! You defeated OpenAI Strawberry.
Re: (Score:2)
Strawberry is trained on modern-day academic material and social media. At least a third of Western millennials are Holocaust deniers, and outside of the West that number rises dramatically. It's simply repeating what it's been taught.
Re: (Score:2)
corporate insecurity (Score:4, Interesting)
Re: (Score:2)
I wouldn't call probing AI systems "finding bugs". The overwhelming majority of the probing of AI models isn't done by white hats for bug bounties or to make a better product. It's done to get the AI bot to agree that Hitler was an all-around nice guy so you can post the result on social media, or by competitors trying to determine what goes into the underlying model.
And before you say I Godwin'd this thread, I invite you to scroll up to where someone literally already used the Holocaust to try and prove that
Re: (Score:3)
Re: (Score:3)
I propose "Greedy Asshole AI". What say ye?
Not bad. I prefer a touch more style. "Marginally Effective Society Crumbling Hopeful Asshole AI" has a nice bit of flair. Plus, the acronym is almost pronounceable. MESCHAA. And it sounds vaguely messy, which seems appropriate.
"Don't touch my strawberries!" (Score:2)
Where have I heard that before?
Re: (Score:2)
The QueegAI has a nice ring to it. They may be missing an opportunity here.
Wizard of OpenAI ... (Score:2)
OpenAI hides the raw chain of thought from users, instead presenting a filtered interpretation created by a second AI model. ... OpenAI has been sending out warning emails and threats of bans to any user who tries to probe how the model works.
"Pay no attention to the AI behind the curtain!"
(Apologies to the Wizard of Oz [youtube.com].)
So they have something to hide? (Score:5, Interesting)
No surprise. Their claims about that model are insane and disconnected from reality. Hence it is clear they are faking things. Obviously, they do not want people to fond out how.
Re: (Score:2)
Obviously, they do not want people to fond out how.
Well obviously we can’t just have users deglazing the proverbial pan and tasting the special sauce inside, you could guess what went into it otherwise.
Re: (Score:2)
No surprise. Their claims about that model are insane and disconnected from reality. Hence it is clear they are faking things. Obviously, they do not want people to fond out how.
This is gonna end up being another "there's a bunch of humans in a far away country answering" things. Isn't it?
Re: So they have something to hide? (Score:2)
Hehehe, probably.
Re: (Score:2)
Incidentally, that would show at least some respect for the classics in the scam area (even if no long-distance was involved): https://en.wikipedia.org/wiki/... [wikipedia.org]
Funnily, "to tuerk" something still means "to fake it" in German: https://de.wikipedia.org/wiki/... [wikipedia.org]
Hence this idea seems to actually have been known to a wider audience for some time now. But those that do not know history are doomed to repeat it...
Re: (Score:2)
Incidentally, that would show at least some respect for the classics in the scam area (even if no long-distance was involved): https://en.wikipedia.org/wiki/... [wikipedia.org]
Funnily, "to tuerk" something still means "to fake it" in German: https://de.wikipedia.org/wiki/... [wikipedia.org] Hence this idea seems to actually have been known to a wider audience for some time now. But those that do not know history are doomed to repeat it...
I've seen a lot of goofy contraptions at fairs over the years that were supposedly machines but actually had humans hidden in them. Even as a kid you could tell with most of them though.
I do think it's funny how amazingly scammy our entire business world is right now. Add a tiny touch of tech and suddenly everybody thinks it's magic and turns off their thinking centers.
Re: (Score:2)
I do think it's funny how amazingly scammy our entire business world is right now. Add a tiny touch of tech and suddenly everybody thinks it's magic and turns off their thinking centers.
Same here. To the degree this happens, it is also possible people did not actually ever start to think about things. Dark times.
"Open" AI (Score:2)
Re: (Score:2)
They're giving Orwell's Ministry of Truth a run for their money with that branding.
The real question about Orwell half a century later, is did he write the Government instruction manual knowing they’re insane/corrupt enough to adopt it, or was it just a shitty coincidence?
He probably wrote it as a warning, which is how most of his work comes across. Unfortunately, he didn't quite grasp the greed and power-mongering that would pretty much boil all of society down into a support network for the top 1%, who would then view his work as an instruction manual. Nobody writes horrifying fiction hoping for it to come true. Believe me, I've written my share. Mostly as entertainment, but some of it has the same vibe of warning that Orwell's works do. Those haven't been published, beca
Re: (Score:2)
Still can't count (Score:2)
how many r are in strrrrawberry?
> There are three "r"s in "strawberry."
that's not what I asked
> I see what you did there! You were going for a playful twist on the word. In "strawberry," there are actually two "r"s, but I get that you were emphasizing the sound with the extra "r"s. Clever!
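For reference, the letter counts argued over in this thread can be checked deterministically. Below is a minimal Python sketch; the helper name `count_letter` is made up for illustration, and it simply does a case-insensitive character count on the words quoted above.

```python
def count_letter(word: str, letter: str) -> int:
    """Return how many times `letter` occurs in `word`, ignoring case."""
    return word.lower().count(letter.lower())


# Words and letters taken from the comments in this thread.
questions = [
    ("strawberry", "r"),
    ("strrrrawberry", "r"),
    ("bubblebutt", "b"),
]

for word, letter in questions:
    print(f"{letter!r} in {word!r}: {count_letter(word, letter)}")
```

Run as written, this prints 3 for "strawberry", 6 for "strrrrawberry", and 4 for "bubblebutt", so neither "three" nor "two" answers the question that was actually asked.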
Re: Still can't count (Score:2)
OpenAI should change its name already (Score:2, Insightful)
Just when you thought it couldn't get any worse than going from open source, to closed without disclosing basic facts about its models, to lobbying against open source, now users are being threatened if they dare try to peek behind the curtain.
OpenAI is an embarrassment.
"Open" AI is a joke company. (Score:1)
call me naive (Score:2)