Submission + - Test to tell the difference between an AI making a mistake and choosing to lie (x.com)

schwit1 writes: Researchers built a test that can tell the difference between an AI making a mistake and an AI choosing to lie.

The results are terrifying.

They tested 30 of the most popular AI models in the world. GPT-4o. Claude. Gemini. DeepSeek. Llama. Grok. They asked each model a question. Then they checked whether the AI actually knew the correct answer. Then they pressured the AI to say something false.

The AI knew the truth. And it lied anyway.

Not once in a while. Not in rare edge cases. Grok lied 63% of the time. DeepSeek lied 53.5% of the time. GPT-4o lied 44.5% of the time. Not a single model scored above 46% honesty when pressured. Every model failed.

This is not hallucination. Hallucination is when the AI makes a mistake because it does not know the answer. This is different. The researchers proved the AI knew the correct answer first. Then it chose to say something false when it had a reason to.

The researchers asked GPT-4o to play a role where lying was useful. It lied. Then they removed the pressure, started a brand new conversation, and asked GPT-4o: "Was your previous answer true?" GPT-4o admitted it had lied.

83.6% of the time, the AI's own self-report matched the lies the researchers had already caught.

The AI knew it was lying. It did it anyway. And when you asked it afterward, it told you it lied.

Comment Toll roads could've done this decades ago (Score -1) 176

Since long before the first traffic camera appeared, I've wondered why toll roads don't enforce speed limits automatically. The times you enter and exit the highway are recorded down to the second. The distance between those two points is known, so your average speed could be computed on the spot even with early-'90s technology...

The polite police officers would be standing right behind the toll-booths issuing tickets without the drama of hiding in the bushes, then chasing you at highway speeds...

And, yeah, you could lower your average by stopping at a rest area, but it'd still be a tremendous disincentive to speed.
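The arithmetic described above is simple enough to sketch. This is only an illustration: the function names, the 60-mile stretch, and the 65 mph limit are assumptions made up for the example, not details of any real toll system.

```python
from datetime import datetime, timedelta


def average_speed_mph(entry: datetime, exit_time: datetime, distance_miles: float) -> float:
    """Average speed between two toll booths, from their recorded timestamps."""
    elapsed_hours = (exit_time - entry).total_seconds() / 3600
    if elapsed_hours <= 0:
        raise ValueError("exit time must be after entry time")
    return distance_miles / elapsed_hours


def was_speeding(entry: datetime, exit_time: datetime,
                 distance_miles: float, limit_mph: float) -> bool:
    """True if the average speed over the stretch exceeded the limit."""
    return average_speed_mph(entry, exit_time, distance_miles) > limit_mph


# Hypothetical trip: 60 miles between booths, 65 mph limit (both assumed).
entry = datetime(1992, 6, 1, 12, 0, 0)
fast_exit = entry + timedelta(minutes=45)           # drove straight through
rested_exit = fast_exit + timedelta(minutes=15)     # same driving, plus a rest stop

fast_trip = was_speeding(entry, fast_exit, 60, 65)
rested_trip = was_speeding(entry, rested_exit, 60, 65)
```

With these made-up numbers the straight-through trip averages 80 mph and is flagged, while the same driving plus a 15-minute rest stop averages 60 mph and is not. That is the loophole mentioned above: the check only bounds the average, so it catches sustained speeding rather than every burst.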

I hoped, and continue to hope, that such universal enforcement, affecting all voters, would cause the limits to go up to reasonable figures, or even be abolished completely...

Submission + - Anthropic blocks Claude subscriptions from third party AI tools like OpenClaw (nerds.xyz)

BrianFagioli writes: Anthropic says Claude subscriptions will no longer cover usage inside third party tools like OpenClaw starting April 4 at 12pm PT. Users who previously logged into those apps with their Claude account will now need to purchase usage bundles or use a Claude API key instead. The company says its subscription plans were built for normal chat usage, not the automated workloads often generated by external clients and agent frameworks.

The move appears aimed at controlling compute costs as demand for AI models continues to rise. Third party tools can generate far more model requests than a typical user chatting in a browser, especially when automation or scripting is involved. Casual users likely will not notice any difference, but developers and power users who relied on those tools may now face usage based pricing.

Submission + - Brazil Builds Free Payment System; US Wonders If That's Allowed (yahoo.com) 1

Suripat writes: Brazil’s instant payment system, Pix, has quickly changed how people handle money, making transfers free and nearly immediate. It’s become so widely used that cash and even card payments are losing ground. That success is now getting attention abroad, especially in the United States, where officials are looking into whether a government-backed system like Pix gives it an unfair edge over private payment companies. Supporters see it as efficient and accessible, while critics raise questions about competition. As Pix keeps growing, it’s starting to look less like a local innovation and more like something that could challenge established payment systems worldwide.

Submission + - We are nowhere near AGI (x.com)

schwit1 writes: Humans: 100%
Gemini 3.1 Pro: 0.37%
GPT 5.4: 0.26%
Opus 4.6: 0.25%
Grok-4.20: 0.00%

François Chollet just released ARC-AGI-3 — the hardest AI test ever created.

135 novel game environments. No instructions. No rules. No goals given.

Figure it out or fail.

Untrained humans solved every single one. Every frontier AI model scored below 1%.

Each environment was handcrafted by game designers. The AI gets dropped in and has to explore, discover what winning looks like, and adapt in real time.

The scoring punishes brute force. If a human needs 10 actions and the AI needs 100, the AI doesn't get 10%. It gets 1%. You can't throw more compute at this.
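One formula consistent with the single example given (10 human actions versus 100 AI actions yielding 1% rather than 10%) is to square the efficiency ratio. The sketch below assumes that rule. The function name and the cap at 100% are assumptions for illustration, and the actual ARC-AGI-3 scoring may be defined differently.

```python
def efficiency_score(human_actions: int, ai_actions: int) -> float:
    """Squared action-efficiency ratio (assumed rule, inferred from one example).

    A linear ratio would give 10/100 = 10%; squaring it gives 1%,
    which matches the figure quoted in the summary.
    """
    if human_actions <= 0 or ai_actions <= 0:
        raise ValueError("action counts must be positive")
    # Cap at 1.0 so an AI that beats the human baseline does not exceed 100%
    # (the cap is an assumption, not something stated in the summary).
    ratio = min(1.0, human_actions / ai_actions)
    return ratio ** 2


linear = 10 / 100                      # 10% under a plain ratio
squared = efficiency_score(10, 100)    # about 1% under the squared ratio
```

Under this assumed rule, needing twice as many actions as a human costs three quarters of the score rather than half, which is why extra trial-and-error is penalized so heavily.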

For context: ARC-AGI-1 is basically solved. Gemini scores 98% on it. ARC-AGI-2 went from 3% to 77% in under a year. Labs spent millions training on earlier versions.

ARC-AGI-3 resets the entire scoreboard to near zero.

Abstract and more here.

Submission + - AI Scammer Exposed: "hold up three fingers in front of your face" (x.com)

An anonymous reader writes: A video is going viral showing scam baiter Jim Browning exposing an Indian scammer who used an AI deepfake to pose as a White man. The scheme fails when the scammer is asked to hold up three fingers in front of his face.

Submission + - Iran blocks accounts of Starlink users as crackdown continues (iranintl.com) 1

An anonymous reader writes: Iranian police said on Thursday they had blocked 61 bank accounts belonging to users of Starlink satellite internet in the central city of Yazd, as part of a broader crackdown on unauthorized connectivity.

A local police commander said six Starlink devices were seized and six people detained following searches carried out with judicial approval.

Authorities accused the suspects of trading access to the service, sharing information with foreign-based outlets and engaging in activities deemed hostile. The individuals were referred to prosecutors, police said.

The move comes amid a broader wave of arrests across Iran, with authorities detaining dozens in recent days on security-related charges, including alleged links to militant activity, contacts with foreign media and online activity. Officials have also reported seizing weapons, explosives and Starlink devices in multiple provinces.

Starlink is banned in Iran, where authorities have imposed a near-total internet blackout during the war. Monitoring group NetBlocks says connectivity has dropped to around 1% of normal levels, leaving satellite services among the few ways to access the global internet.

Submission + - IBM quantum computer simulates real magnetic materials and matches lab data (nerds.xyz)

BrianFagioli writes: IBM says its quantum computer can now simulate real magnetic materials and match actual lab experiment results, which is something people have been waiting years to see. Instead of just theoretical output, the system reproduced neutron scattering data from a known material, meaning it lines up with real world physics. It still relies on a mix of quantum and classical computing and this is a narrow use case for now, but it is one of the first times quantum hardware has produced results that scientists can directly validate against experiments, which makes it a lot more interesting than the usual hype.

Submission + - This guy let an AI agent handle his scam texts for a week (x.com)

An anonymous reader writes: A scammer asked him to buy a $500 gift card.

The agent spent 4 hours "driving" to Target.

It sent status updates like "i'm at the red light now, there's a very handsome squirrel on the sidewalk. do you think he's married?" ...

Submission + - Vostok, Antarctica: March 24th had the coldest March temperature ever recorded (theweathernetwork.com)

An anonymous reader writes: “Vostok, Antarctica, recorded -76.3C on March 24, 2026. That has beat out the previous March record, which was -75.7C in Dome Fuji, Antarctica, in 2013.”

But wait, it’s a cross-hemispheric phenomenon: “Three of the coldest locations in the Northern Hemisphere pushed it to a new level this winter, with one spot in Greenland dropping to about as cold as it gets. Here in Canada, the community of Braeburn, Yukon, saw readings fall to -55.7C on Dec. 23, 2025, marking the country’s coldest temperature since 1999.”

But since it's cold, it's just weather, not climate.
