Comment Lack of use... (Score 4, Insightful) 251

Skills atrophy with lack of use. They were taught how to read an analog clock when they were ~7 years old, and then never used the skill again.

It's hard to expect a child to retain skills they haven't used. I learned how to use a sewing machine in Home Economics back in 1994 or so. Damned if I could get one started today.

Comment Re:Decent look (Score 1) 23

That's why it will fail. It's a Linux guy's side project. It'll never be complete, and it will only ever solve the needs of the guy building it. Just like so many other "You know what the world REALLY needs? Yet another desktop environment!"-inspired projects.

Coulda just contributed to the wheels that already exist, instead of making another one.

Comment I'm fine with this. (Score 2) 48

Honestly, more distributions that have no reason to exist and no distinguishing traits should shut down. The people that worked on them should move to projects that aren't just rehashes of existing projects.

When your eulogy can't describe you in any meaningful way, neither your life nor your death was worthy of celebration or remembrance.

Comment Re:context (Score 1) 61

Re-upload it to where, exactly? What other service will be providing GPT-4o when "Open"AI obsoletes it again?

You can take the context and migrate it (or just select a different model for the current context) as you wish, but what these people want is for the model itself to remain available, behavior, tone, and tenor included.

They didn't lose the past context or the ability to continue the "conversation" entirely; they lost the ability to continue the "conversation" with 4o. These LLMs DO have "personality", or at least a certain color to their output.

Comment Re:Sounds like the accusations are true. (Score 1) 96

And I'll add one more point, which I meant to make in my previous comment:

Agents, AI or otherwise, aren't just pulling down a single page to present to the user. They're performing logic on the page they accessed, and probably accessing additional pages based on it. AI agents can be given instructions to "collect all the content on slashdot.org". I did exactly that, and here's what ChatGPT did: https://chatgpt.com/share/6894...

That's behavior that should respect robots.txt, but since it apparently does its fetching through Chrome... it clearly doesn't.

Comment Re:Sounds like the accusations are true. (Score 1) 96

All scrapers, crawlers, and other 'bots' SHOULD respect robots.txt. The original intent was to block what were, at the time (1994), termed "crawlers", but that has evolved along with the Internet.

Justifying crawling behavior by saying it's "just scraping, and then loading additional pages..." is... Well. Fucky logic, to say the least. Following your logic here: if I access a single page, then extract all the links in it and add them to an RSS feed, I'm free to access all those subsequent pages because now they're in an RSS feed, and I get to scrape them. I just run this in a loop, iterating over all of them, because hey, I just want the contents of all those pages, and...

Where do you draw the line? Your "amending the fulltext to your RSS feed" example is crawling. You iterated over a series of links, accessing all the linked pages to get their full text with an automated process. It's just a nonsense argument to try to say you weren't "crawling". Just because you added --max-depth=1 doesn't mean it wasn't an iterative, automated process retrieving the contents of a page.

So an AI agent, acting on a set of logic instructions given by a user, accessing multiple pages and traversing them based on findings in the preceding pages (such as executing a search, then following links to scrape for values), isn't crawling then?

Automated processes aren't just dumb "crawling and scraping" any more.

I don't believe any automated process should allow itself to access content denied by robots.txt, no matter the logic leaps made to justify it. If I wanted automated processes to access those URLs, they wouldn't be covered by a Disallow rule in robots.txt. A robots.txt file is a statement saying "I specifically do or do not allow access based on these criteria". Those criteria aren't dependent on your use case. They're dependent on mine.
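
For what it's worth, honoring robots.txt from an automated process takes about a dozen lines. Here's a minimal sketch using only the Python standard library (the user-agent string and function name are made up for illustration, not from any real agent):

import urllib.parse
import urllib.request
import urllib.robotparser

USER_AGENT = "ExampleAgent/1.0"  # hypothetical agent name, purely illustrative

def polite_fetch(url):
    # Locate robots.txt for this URL's host and parse it
    parts = urllib.parse.urlsplit(url)
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    rp.read()

    # Refuse to fetch anything the site disallows for this agent
    if not rp.can_fetch(USER_AGENT, url):
        raise PermissionError(f"robots.txt disallows {url} for {USER_AGENT}")

    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

If every scraper, crawler, and "agent" ran its requests through a gate like that, none of these justifications would even come up.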

Comment Re:Sounds like the accusations are true. (Score 1) 96

Depending on the agent, the output may never be seen by a human. "Constantly monitor eBay for a good deal on a waffle iron, and trigger a notification when found", for example. No human eyes will ever see the pages the agent loads; it's just consuming eBay's compute resources. A well-written prompt will also ignore promoted eBay listings, inline advertisements, and so on.

Even if it's an action taken on behalf of a person, it's very unlikely the ads on the page will ever be delivered to that person. Keep in mind, the entire point of agentic actions is to get an end result, not to use the agent as a web-browser proxy where you see the site's fully rendered content.

The point is something like starting an agentic task of "Access all available pages on the site slashdot.org from the main page and all comments, ignoring advertisements and clearly sponsored content, and build a sqlite database containing the site's posts and all crawled content"

Or

"Crawl everything2 looking for any references to petrified Natalie Portman and hot grits. Copy the text of any such pages, ignoring and omitting any off-topic content such as advertisements and navigation elements. Translate these pages to Pig Latin and repost them anonymously to Slashdot under random comments."
