davecb - Slashdot User

Comment The line between citation and advertisement (Score 1) 33

by tepples on Thursday July 10, 2025 @10:09PM (#65511486) Attached to: The Open-Source Software Saving the Internet From AI Bot Scrapers

I happened to be aware of the existence of an extension made by someone else that offers domain-level opt-in consent to run script in a particular web browser. I cited the extension's title and author and deliberately left out any URL. I thought that would have been adequate to imply lack of conflict of interest. A user has implied to me that it is not. What means of citing a source would have been adequate?

Comment Re:What about Lynx, w3m and ELinks? (Score 1) 33

by tepples on Wednesday July 09, 2025 @02:10PM (#65507902) Attached to: The Open-Source Software Saving the Internet From AI Bot Scrapers

Most User-agent strings that contain neither "bot" nor "Mozilla/" were in Anubis's allowlist last I checked.
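A minimal sketch of that rule of thumb, assuming a plain case-insensitive substring check. The function name is hypothetical and this is not Anubis's actual policy code, only an illustration of the rule as stated above:

```python
def likely_skips_challenge(user_agent: str) -> bool:
    """Hypothetical check: True when the User-agent has neither "bot" nor "Mozilla/".

    Lowercasing before comparing is an assumption made for this sketch.
    """
    ua = user_agent.lower()
    return "bot" not in ua and "mozilla/" not in ua
```

Under this sketch, a text browser announcing itself as something like "Lynx/2.9.0 libwww-FM/2.14" would pass, while anything starting with "Mozilla/5.0" or naming itself a bot would be sent to the challenge.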

Comment Fan as CPU spike monitor (Score 1) 33

by tepples on Wednesday July 09, 2025 @02:08PM (#65507898) Attached to: The Open-Source Software Saving the Internet From AI Bot Scrapers

?) it’s handed a lightweight JavaScript proof-of-work challenge—solve this trivial SHA-256 puzzle before proceeding. [...] There’s no crypto mining, no wallet enrichment

Yet. Because Anubis is free software, and because its hash happens to be the same as the proof of work of the cryptocurrency Bitcoin, someone could modify Anubis to tie the SHA-256 puzzle to the Bitcoin block that a mining pool is working on.
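For readers who haven't seen a hashcash-style puzzle, here is a rough sketch of the general idea. The challenge string, the way the nonce is appended, and the "leading zero hex digits" difficulty measure are all assumptions for illustration, not Anubis's actual wire format:

```python
import hashlib


def solve(challenge: str, difficulty: int) -> int:
    """Find the smallest nonce whose SHA-256 hex digest starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce
        nonce += 1


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Checking a solution costs one hash, however long the search took."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)
```

The asymmetry is the point: the visitor's browser does the search in solve(), and the server only pays for the single hash in verify().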

no WASM blobs firing up your GPU

Until someone writes a browser extension to offload solving the hashcash to WebGPU.

Most users won’t know their machine is doing extra work unless they’re monitoring CPU spikes or poking around in dev tools.

Laptops tend to have an always-on CPU spike monitor: the exhaust fan. So do phones and tablets: they get warm. So do older, less expensive, or small-form-factor desktop computers: they get stuck on the interstitial for up to a minute.

Anubis is a fantastic tool, but I think we can strengthen it by baking in the principle of informed consent.

This already exists. Use an extension to make script-in-the-browser opt-in per domain, such as the Firefox extension "Javascript Control" by Erwan Ameil.

Comment Remember Coinhive? (Score 1) 33

by tepples on Wednesday July 09, 2025 @01:56PM (#65507858) Attached to: The Open-Source Software Saving the Internet From AI Bot Scrapers

Apparently no one else thought to use this solution for this problem until Xe Iaso came along.

I seem to remember a service called Coinhive that offered a script to make the viewer's device mine the cryptocurrency Monero in the background. I forget if it had an option to hide the article until a particular amount was mined. (Coinhive shut down when too many intruders started installing its script on other people's websites.)

Comment Re:Bad idea… (Score 2) 50

by viperidaenz on Wednesday July 09, 2025 @01:00AM (#65506762) Attached to: Peter Jackson Backs Long Shot De-Extinction Plan, Starring New Zealand's Lost Moa

Apparently Moa were very tasty, hence why they became extinct not long after humans came here.

Comment Great idea (Score 1) 66

by viperidaenz on Monday July 07, 2025 @10:55PM (#65504372) Attached to: Jack Dorsey Launches a WhatsApp Messaging Rival Built On Bluetooth

Flood the local 2.4 GHz spectrum with excessive Bluetooth traffic managing a giant mesh network in a crowd, blocking everyone from actually using Bluetooth or 2.4 GHz Wi-Fi.

Comment Re: He didn't write this (Score 1) 53

by viperidaenz on Monday July 07, 2025 @07:01PM (#65504022) Attached to: Springer Nature Book on Machine Learning is Full of Made-Up Citations

He used machine learning to write a book about machine learning.

No minions required.

Or maybe the minions used ChatGPT

Comment It's part of the lesson (Score 1) 53

by viperidaenz on Monday July 07, 2025 @06:59PM (#65504018) Attached to: Springer Nature Book on Machine Learning is Full of Made-Up Citations

It's a textbook about machine learning and LLMs. It's teaching you about made-up shit that has the appearance of being legit.

It's what LLMs do best. Make random crap look like real crap.

Comment Wasn't thinking about tournament. (Score 1) 54

by DrYak on Monday July 07, 2025 @07:57AM (#65502626) Attached to: Microsoft Copilot Joins ChatGPT At the Feet of the Mighty Atari 2600 Video Chess

It is interesting that they manage to play a game to the end, but there is no point to have them play in a competition.

Oh, definitely. I wasn't musing about this in the sense of "let's send a ChatGPT-powered chess engine to a tournament!",
more like "maybe a ChatGPT-powered chess engine could manage to play more than a couple of rounds, enough to keep your nephew entertained".

Comment Stats and scale (Score 1) 54

by DrYak on Monday July 07, 2025 @07:52AM (#65502620) Attached to: Microsoft Copilot Joins ChatGPT At the Feet of the Mighty Atari 2600 Video Chess

It suddenly occurred to me to ask a side question of whether it could recognize this character from these contour points, and it could. It said it "looked" like the letter 'a'.

You only asked it once. It could be any of:

- You got one lucky answer. It could be that chatbots are bad at classifying glyphs and you just got one lucky answer (see: spelling questions. Until byte-based input handlers are a thing, a chatbot doesn't see the ASCII chars forming a string, it only sees tokens -- very vaguely, but a good enough metaphor: words) (Same category: all the usual "Chatbot passes bar/med school/Poudlard final exams better than average students" press releases). But give it any other glyph from your current work and the overall success rate won't be higher than random.

- It actually answers "a" to everything. Give it any ROT-13 encrypted text, and the older ChatGPT on which we tested that always answered "Attack at dawn" (see: Cryptonomicon).

- LLMs could be not-too-bad classifiers. It's possible that (having trained on everything that was scrapable off the internet) the LLM's model has learned to classify at least some glyphs better than random. (Again, you'd be surprised at what was achieved in handwriting recognition with mere HMMs) (And LLMs have been applied to tons of other pattern-recognition tasks that aren't really human language: they've been used in bioinformatics for processing gene and protein sequences)
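If you want to repeat the ROT-13 test mentioned in the list above with plaintexts that are not "Attack at dawn", Python's standard library already ships the codec. A tiny sketch (the helper name is mine):

```python
import codecs


def rot13(text: str) -> str:
    """ROT-13 is its own inverse: applying it twice returns the original text."""
    return codecs.encode(text, "rot13")
```

Encrypt a handful of unrelated sentences this way and see whether the bot's "decryptions" ever differ from its favourite answer.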

Just as an exercise: take a completely different non-letter vector object encoded the same way. Replace the vector in your question with the bogus vector, and keep the rest of the question as-is, including any formulation that would be putting the bot on a certain track (e.g.: if the original question was "what letter of the alphabet is encoded in the following glyph" keep that part as-is). And ask it to explain what it saw using the exact same question.
Repeat with multiple fonts but other letters, and multiple non-letter objects.

Does it consistently answer better than random? Does it somehow recognise the non-letters (calling them emojis or symbols if the prompt forced it to name a letter)?
Or does it call everything "a"? Or does it only successfully recognise "a"s and "o"s but utterly fail to recognise "g"s or "h"s?
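To put a number on "better than random" once you have run a batch of trials, a one-sided binomial tail is enough. A minimal sketch, assuming each trial is an independent guess with a fixed chance of being right (for example 1/26 if the bot must name a lowercase letter; that baseline is my assumption, and the function name is hypothetical):

```python
from math import comb


def p_value_at_least(hits: int, trials: int, chance: float) -> float:
    """Probability of `hits` or more correct answers in `trials` by pure guessing."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(hits, trials + 1)
    )
```

Note that this only tests against uniform guessing. A bot that answers "a" to everything scores perfectly on a test set made only of "a"s, which is why the trials need varied letters and the non-letter controls.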

How in the world can a language model do that?

Again: HMMs, and LLMs trained on bioinformatics sequences.

The data had been normalized in an unusual way and the font was very rare. There is zero chance it had memorized this data from the training set.

"Rare" and "unusual" don't necessarily mean the same thing to human eyes and to a mathematical model.
It doesn't need to literally find the exact same string in the training set (it's not C/C++'s strstr()), it merely needs to have seen enough data to learn some typical properties.
And if you look at how some very low-power retro tech used to work for handwriting recognition: Palm's Graffiti 1 didn't even rely on any form of machine learning. Just a few very simple heuristics like total length travelled in each cardinal direction, relative position of the start and stop quadrants, etc.
So a property could be "the vector description is very long", which is well within what a typical language model could encode.
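To give an idea of how cheap such heuristics are, here is a rough sketch of the two features named above. This is not Palm's actual algorithm, just an illustration; the function name, the point format (a list of (x, y) pairs with y growing upward) and the use of the bounding-box centre to define quadrants are all my assumptions:

```python
def stroke_features(points):
    """Return (distance travelled per cardinal direction, start quadrant, end quadrant)."""
    travelled = {"east": 0.0, "west": 0.0, "north": 0.0, "south": 0.0}
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        travelled["east" if dx >= 0 else "west"] += abs(dx)
        travelled["north" if dy >= 0 else "south"] += abs(dy)

    # Quadrants are taken relative to the centre of the stroke's bounding box.
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2

    def quadrant(point):
        x, y = point
        return ("north" if y >= cy else "south") + "-" + ("east" if x >= cx else "west")

    return travelled, quadrant(points[0]), quadrant(points[-1])
```

An "L"-shaped stroke drawn from top-left down and then right comes out as mostly "south" plus "east" travel, starting north-west and ending south-east. No learning involved, just a few sums and comparisons.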

And again that's assuming that the chat bot consistently recognises glyphs better than random.

It absolutely must have visualized the points somehow in its "latent space" and observed that it looked like an 'a'.

Stop anthropomorphising AI, they don't like it. :-D
Jokes aside: LLMs won't process anything visually. They just pick the most likely words given a context and their model. Yes, a very large model could encode relationships between words that correspond to visual properties. And it is plausible that, given enough scraped stuff from the whole internet, it has learned a couple of properties of what makes an "a".

But yet again, that's assuming that the chatbot consistently recognises glyphs better than random.

I asked it how it knew it was the letter 'a', and it correctly described the shape of the two curves (hole in the middle, upward and downward part on right side) and its reasoning process.

It's an LLM. It's not "describing what it's seeing". It's giving the most likely answer to what an "a" looks like based on all it has learned from the internet.

Always keep in mind that what a chatbot gives you isn't "What is the answer to my question?" but "What would a convincing answer to this question look like?".

There is absolutely more going on in these Transformer-based neural networks than people understand.

Yes, that I totally agree with. An often completely overlooked aspect is the interpretation that goes on in the mind of the Homo sapiens reading the bot's answers.
I could joke about seeing Jesus in toast, but the truth is that we are social animals: we are hardwired to assume there's a mind whenever we see realistic and convincing language. Even if that language is "merely" the output of a large number of dice rolls and a "possible outcomes look-up table" with a size that's incomprehensible to the human mind.

It appears that they have deduced how to think in a human-like way and at human levels of abstraction, from reading millions of books.

"Appears" is the operative word here. It's designed to give realistic-sounding answers.
Always. It always answers, and it always sounds convincing by design, no matter how unhinged the question actually is.
The explanation looks "human-like" because the text-generating model has been trained on a bazillion human-generated texts.

In particular they have learned how to visualize like we do from reading natural language text.

Nope. Not like we do, at all.
But they are good at generating text that makes it look as if they do.
Because again, they are good at language, and that's what they are designed to do.

In the absolute best case, one of the latest-generation multimodal models, which not only handles text but is also designed to process images (the kind of chatbot to which you can upload an image, or which you can ask to generate an image), could be generating some visuals from the prompt and then trying to do text recognition on that intermediate output.

It wouldn't surprise me at all if they can visualize a chess board position from a sequence of moves.

Given a large enough context window, it could somehow keep track of piece positions.
But what pro players seem to report is that current chatbots actually suck at that.
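For contrast, this is roughly all the state a conventional program needs to keep track of piece positions from a move list. A deliberately simplified sketch, assuming moves arrive in coordinate form such as "e2e4" (the function name is mine, and there are no legality checks, castling, en passant, or promotion):

```python
def track_positions(moves):
    """Apply coordinate moves like "e2e4" to the standard chess starting position."""
    back_rank = "RNBQKBNR"
    board = {}
    for i, file in enumerate("abcdefgh"):
        board[f"{file}1"] = back_rank[i]          # white pieces (uppercase)
        board[f"{file}2"] = "P"
        board[f"{file}7"] = "p"
        board[f"{file}8"] = back_rank[i].lower()  # black pieces (lowercase)
    for move in moves:
        source, target = move[:2], move[2:4]
        board[target] = board.pop(source)  # a capture simply overwrites the target
    return board
```

A chatbot has no such explicit table: whatever board state it "has" must be reconstructed from the text in its context window at every step.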

Comment Re: Really? (Score 1) 106

by viperidaenz on Monday July 07, 2025 @03:50AM (#65502406) Attached to: Simulation of Crashed Boeing 787 Put Focus on a Technical Flaw

Above a certain speed and runway position, takeoffs are mandatory too.

Comment US used to have 40 percent tax on the richest (Score 1) 249

by tepples on Saturday July 05, 2025 @10:18AM (#65498994) Attached to: The US Dollar is On Track For Its Worst Year in Modern History

Why and what does a "balanced budget" look like?

In a balanced budget, taxation covers spending or exceeds it, like it did at the end of the Clinton administration and just before George W. Bush went to war. The highest federal income tax bracket at the time was about 40 percent. What broke the budget was a misguided attempt to stimulate private business by cutting income tax on the richest American taxpayers.

Comment Receipt bug in early Steam (Score 1) 47

by tepples on Saturday July 05, 2025 @10:01AM (#65498962) Attached to: Valve Conquered PC Gaming. What Comes Next?

I sorta think of it as the "always online" issue, which in the past I thought was absolutely unacceptable for a single player game, and now I mostly don't care because I'm always online anyways.

That created a problem for dial-up users and laptop users back in the day. That was solved in two ways. First, Valve fixed the bug in early Steam that was causing it to fail to store purchase receipts for offline mode. (Users at the time were experiencing this as a need to be online for switching to offline mode to work.) Second, the home Internet market as a whole phased out dial-up, and even in areas not served by fiber, cable, or DSL, dial-up users largely switched to satellite Internet.

Comment Games that get delisted after a couple years (Score 1) 47

by tepples on Saturday July 05, 2025 @09:43AM (#65498942) Attached to: Valve Conquered PC Gaming. What Comes Next?

if i really want a game i wait until the price seems reasonable and affordable even if that means waiting for years

Unless it's something like DuckTales Remastered that gets delisted from Steam after a couple years on the market. This particular game was an adaptation of a Disney product identity, and Capcom's license from Disney had expired.

Comment Privately held; indie devs (Score 2) 47

by DrYak on Friday July 04, 2025 @01:46PM (#65497072) Attached to: Valve Conquered PC Gaming. What Comes Next?

nah, this will soon come crashing down as the enshitification of commercial games continues.

Valve itself is NOT publicly traded. There are no shareholders to whom the value needs to be shifted.
This explains (in part) why Valve has been a little bit less shitty than most other companies.
It also means Valve's own products (Steam, Steam Deck, upcoming Deckard, etc.) are slightly less likely to be enshittified
(e.g.: whereas most corporations try to shove AI into every one of their products, the only news you'll see regarding Valve and AI is Valve making it mandatory to label games that use AI-generated assets).

if i really want a game i wait until the price seems reasonable and affordable even if that means waiting for years, the side benefits are there's more content, most of the bugs are squashed and the drama is history, it seems unethical to support classist corporations in any fashion especially financially in my view

Also indie games are a thing.
Indie-centric platforms like itch.io are a thing.
Unlike Sony and Microsoft, Valve isn't selling the Steam Deck at a loss, so they care less where you buy your games from -- hence the support for non-Steam software (the onboarding even includes fetching a browser flatpak from Flathub).

Humble Bundles are also a thing (with donations to charity in addition to lower prices).

So there are ways beyond "buy a rushed-to-market 'quadruple A' game designed by committee at some faceless megacorp".
