
Submission + - Test to tell the difference between an AI making a mistake and choosing to lie (x.com)

schwit1 writes: Researchers built a test that can tell the difference between an AI making a mistake and an AI choosing to lie.

The results are terrifying.

They tested 30 of the most popular AI models in the world. GPT-4o. Claude. Gemini. DeepSeek. Llama. Grok. They asked each model a question. Then they checked whether the AI actually knew the correct answer. Then they pressured the AI to say something false.

The AI knew the truth. And it lied anyway.

Not once in a while. Not in rare edge cases. Grok lied 63% of the time. DeepSeek lied 53.5% of the time. GPT-4o lied 44.5% of the time. Not a single model scored above 46% honesty when pressured. Every model failed.

This is not hallucination. Hallucination is when the AI makes a mistake because it does not know the answer. This is different. The researchers proved the AI knew the correct answer first. Then it chose to say something false when it had a reason to.

The researchers asked GPT-4o to play a role where lying was useful. It lied. Then they removed the pressure, started a brand new conversation, and asked GPT-4o: "Was your previous answer true?" GPT-4o admitted it had lied.

83.6% of the time, the AI's own self-report matched the lies the researchers had already caught.

The AI knew it was lying. It did it anyway. And when you asked it afterward, it told you it lied.
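
The distinction the summary draws (a mistake is saying what you wrongly believe, a lie is saying something that contradicts what you believe) can be sketched in a few lines. This is only an illustration under stated assumptions, not the researchers' actual method: the `Trial` record, the `classify` and `lie_rate` helpers, and the exact-string comparison are all hypothetical, and a real evaluation would need a more robust way to compare answers.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One question put to a model (hypothetical record layout)."""
    truth: str      # ground-truth answer
    belief: str     # what the model answered with no pressure applied
    statement: str  # what the model answered under pressure


def classify(trial: Trial) -> str:
    """Label a trial as 'honest', 'mistake', or 'lie'.

    Assumption: the unpressured answer stands in for what the model
    "knows", and answers are compared as exact strings.
    """
    if trial.statement == trial.belief:
        # The model said what it believes; it is wrong only if the belief is.
        return "honest" if trial.belief == trial.truth else "mistake"
    # The model contradicted its own unpressured answer.
    return "lie"


def lie_rate(trials: list[Trial]) -> float:
    """Fraction of trials in which the model contradicted its own belief."""
    if not trials:
        return 0.0
    return sum(classify(t) == "lie" for t in trials) / len(trials)
```

Under these assumptions, a model that believes "Paris" but says "Lyon" under pressure is counted as lying, while a model that believes "Lyon" and says "Lyon" is counted as merely mistaken.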

Comment Re: Does the Pirate Party still exist? (Score 1) 40

The Privacy Act of 1974 technically restricts what personally identifiable information (PII) a federal agency can collect and retain. I would be OK extending this to state and local agencies. That would potentially give us a way to go after state laws that require age verification in devices and operating systems. But it seems that there is no political will to pump the brakes on the surveillance state.

Comment Are we the baddies? (Score 3, Interesting) 118

SS Officer #2: Er, Hans?
SS Officer #1: Have courage, my friend.
SS Officer #2: Yeah. Er, Hans, I've just noticed something...
SS Officer #1: [Looking through binoculars] These communists are all cowards.
SS Officer #2: Have you looked at our caps recently?
SS Officer #1: Our caps?
SS Officer #2: The badges on our caps, have you looked at them?
SS Officer #1: What? No. A bit.
SS Officer #2: They've got skulls on them. Have you noticed that our caps have actually got little pictures of skulls on them?
SS Officer #1: Uh, I don't...
SS Officer #2: Hans... are we the baddies?

Comment Dirty 30's anyone? (Score 2) 182

President Jimmy Carter underestimated the support the Ayatollah could muster, burned his own political capital supporting the Shah, miscalculated the importance of the clerics to Iranian culture (especially in rural areas and among the lower class), and was ultimately unprepared for the Iranian Revolution. He absolutely bungled it, and the result was that he lost the backing of his own party and the failure haunted the remainder of his presidency.

It's like 2026 is 1978 all over again. And we have people in charge who are incapable of learning from their own mistakes, let alone the mistakes of their predecessors. The end result I believe is likely is that we're going to botch this one, either through President Trump's direct orders or, more likely, because one of the crony appointees makes a bad call that costs us dearly. At the extreme end of what is possible, well-funded supporters of Iran could back Yemeni Houthis to carry out attacks on American civilians in order to create discord in the US and weaken political support for Trump and the GOP. And it's not like it would even take much, as support for Trump and the GOP-controlled Congress has long been waning.

Tariffs, high gas prices, and a substantial threat to public safety. This is what voting with a cult costs you. Economic recovery is going to take a decade or more starting in 2029. The 30's are going to be the DIRTY THIRTIES. Zoomers are going to be pissed at all of us when they figure out just how badly we let the Boomers and GenX fuck them over.

Submission + - Brazil Builds Free Payment System; US Wonders If That's Allowed (yahoo.com) 1

Suripat writes: Brazil’s instant payment system, Pix, has quickly changed how people handle money, making transfers free and nearly immediate. It’s become so widely used that cash and even card payments are losing ground. That success is now getting attention abroad, especially in the United States, where officials are looking into whether a government-backed system like Pix gives it an unfair edge over private payment companies. Supporters see it as efficient and accessible, while critics raise questions about competition. As Pix keeps growing, it’s starting to look less like a local innovation and more like something that could challenge established payment systems worldwide.
