
Comment Re:AI detectors remain garbage. (Score 1) 18

They clearly didn't even use a proper image generator - that's clearly the old crappy ChatGPT-builtin image generator. It's not like it's a useful figure with a few errors - the entire thing is sheer nonsense - the more you look at it, the worse it gets. And this is Figure 1 in a *paper in Nature*. Just insane.

This problem will decrease with time (here are two infographics from Gemini 3 I made just by pasting in an entire very long thread on Bluesky and asking for infographics, with only a few minor bits of touchup). Gemini successfully condensed a really huge amount of information into infographics, and the only sorts of "errors" were things like, I didn't like the title, a character or two was slightly misshapen, etc. It's to the point that you could paste in entire papers and datasets and get actually useful graphics out, in a nearly-finished or even completely-finished state. But no matter how good the models get, you'll always *have* to look at what you generate to see if it's (A) right, and (B) actually what you wanted.

Comment Re:This is why we use "agents" instead of "LLM's" (Score 1) 68

Yeah, I have used Semantic Kernel to code AI in .NET. I did not give it the capability to tell the current date and time, but it would be a 5-minute fix to do so, since getting the current time is trivial. The bigger problem would be ensuring the offline server the AI runs on has its clock set correctly.

Comment Nope (Score 1) 68

Any modern AI model can be provided "tools" that it can use to perform various tasks or retrieve various information. The current date and time is easy to do. I can't say why the author and/or ChatGPT seems to have trouble, but you can easily set up a tool to return the current date and time, instruct the AI "this will return the current date and time", and then if the user asks for it the AI will automatically leverage the tool. It's possible ChatGPT just has a lot of tools at its disposal and is getting confused about which one it should use (for example, searching online for the current date/time), or perhaps OpenAI wants ChatGPT to use a less specific online search tool which can also return the current date/time when asked, but sometimes ChatGPT doesn't quite search for the right thing. Speaking as someone who has leveraged AI: you can provide specific tools, but I expect OpenAI wants to provide far more scope to tool functionality in ChatGPT, so it may provide more general tools like web search, which may cause problems.
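To make the "set up a tool" part concrete, here is a minimal sketch in Python. The registry, the register_tool decorator, and dispatch are all hypothetical names made up for illustration; real frameworks (Semantic Kernel, the OpenAI API, etc.) each have their own registration mechanism, and this is not any of them. The only point is that the tool itself is a few lines, and the description string is what tells the model when to call it.

```python
from datetime import datetime, timezone

# Hypothetical tool registry: tool name -> description + callable.
# This is an illustration only, not the API of any real framework.
TOOLS = {}

def register_tool(name, description):
    """Decorator that records a function as a tool the model may call."""
    def decorator(func):
        TOOLS[name] = {"description": description, "func": func}
        return func
    return decorator

@register_tool(
    "get_current_datetime",
    "This will return the current date and time (UTC, ISO 8601).",
)
def get_current_datetime():
    # Reads the host clock, so the answer is only as good as that clock.
    return datetime.now(timezone.utc).isoformat(timespec="seconds")

def dispatch(tool_name):
    """Run the tool the model asked for and return its text result."""
    return TOOLS[tool_name]["func"]()

if __name__ == "__main__":
    print(dispatch("get_current_datetime"))
```

Whatever comes back from dispatch gets appended to the conversation as text, and the model answers from that.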

Comment Re:AI is just limited. (Score 1) 68

I find the various LLMs are helpful as a form of search engine, enabling me to drill down to potentially useful information more quickly. However, at the same time they are far worse than a search engine because they aren't able to actually give you the sources to check. When ChatGPT generates a chunk of code, if you ask it where it got it from, it will say it didn't get it from a specific site, it just knows this stuff. Which of course ends up wrong half the time. So you end up with wrong stuff confidently passed off as accurate, which is ultimately stolen from real human sources. When I was in uni it was drilled into me to list my sources. Why should LLMs be held to any different standard? Google's AI summary does show sources, at least a few, which is good. I always check them.

Even Claude AI, which is supposed to be geared towards coding, suffers from these same problems. I am trying to do some esoteric Qt 6 programming involving OpenGL, and all the AIs really struggle here because there's a limited amount of source material to steal from. They're certainly not capable of digesting the API documents and synthesizing code to do something without first seeing someone else's code. Claude AI seems to work best if you use a popular library or framework with lots of online discussion and GitHub code for it. The popular languages and frameworks of the day.

Comment AI detectors remain garbage. (Score 5, Interesting) 18

At one point last week I pasted the first ~300 words or so of the King James Bible into an AI detector. It told me that over half of it was AI generated.

And seriously, considering some of the god-awful stuff passing peer review in "respectable" journals these days, like a paper in AIP Advances that claims God is a scalar field becoming a featured article, or a paper in Nature whose Figure 1 is an unusually-crappy AI image talking about "Runctitiononal Features", "Medical Fymblal", "1 Tol Line storee", etc... at the very least, getting a second opinion from an AI before approving a paper would be wise.

Comment Re:Really? (Score 3, Insightful) 68

automated image pattern matching has been around for decades

The problem is that the LLM only does one trick. When you start integrating other software with it, the other software's input has to be fed in the same way as your other tokens. As the last paragraph of TFS says, "every clock check consumes space in the model's context window" and that's because it's just more data being fed in. But the model doesn't actually ever know what time it is, even for a second; the current time is just mixed into the stew and cooked with everything else. It doesn't have a concept of the current time because it doesn't have a concept of anything.

You could have a traditional system interpreting the time, and checking the LLM's output to determine whether what it said made sense. But now that system has to be complicated enough to determine that, and since the LLM is capable of so much complexity of output it can never really be reliable either. You can check the LLM with another LLM, and that's better than not checking its output at all, but the output checking is subject to the same kinds of failures as the initial processing.
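As a rough sketch of what the "traditional system" check might look like, here is a Python toy that pulls ISO-format dates out of a reply and compares them to the real date. The function names are made up for illustration, and it assumes the reply was only supposed to state today's date. It also shows the limitation: it only catches what it was written to recognize.

```python
import re
from datetime import date

# Matches YYYY-MM-DD only; every other way of writing a date is invisible to it.
ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")

def claimed_dates(text):
    """Return every valid ISO-format date mentioned in the text."""
    found = []
    for year, month, day in ISO_DATE.findall(text):
        try:
            found.append(date(int(year), int(month), int(day)))
        except ValueError:
            pass  # e.g. 2025-13-40: shaped like a date, but not one
    return found

def today_claim_is_plausible(text, today, tolerance_days=1):
    """True if every ISO date in the text is within tolerance of the real date.

    Text with no ISO dates at all passes trivially, which is exactly the gap.
    """
    return all(
        abs((d - today).days) <= tolerance_days for d in claimed_dates(text)
    )

if __name__ == "__main__":
    real_today = date.today()
    print(today_claim_is_plausible("Today is 2023-04-01.", real_today))
    print(today_claim_is_plausible("Today is next Tuesday.", real_today))
```

The second call passes the check even though the checker has no idea what "next Tuesday" means. Closing gaps like that is where the checker starts growing toward the complexity of the thing it checks.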

So yeah, we can do that, but it won't eliminate the [class of] problem.

Comment Re:Big, BIG companies should know better (Score 1) 62

(Shuffles off and mutters something about how does a greybeard get Vulture Capitalist funding to set up a cross-continental niche cloud for people that value stability over shiny, with Open Source ... OpenStack ... Cloudified LibreOffice, Ceph, my lawn)

Every tech company needs at least three things to start with: The business guy, the brain, and the lawyer. Ideally there should also be a marketing guy, but you can add them in later. Also, none of them have to be male, I just like saying "guy", buddy.

Comment Re:Excel is a platform. (Score 1) 62

Untrained? Excel is a spreadsheet tool within the MS Office suite with 27,000 features. It requires a tad more training than handing a moron a hammer

Yes and no, depending. If you are building an application in Excel, yes, all you said is true. If you are using one, no, none of it is. Spreadsheets can be set up such that the user just stuffs data into them where they are supposed to, then clicks a button to get results. Or maybe they don't even have to hit a button.

For the simplest useful example I can think of, I put together a spreadsheet which produces a table we use for asset valuation. This spreadsheet changes every year. If you load my spreadsheet, it will be correct for the current year. No user has to think about that at all, they just load it and get a correct table. You can extrapolate this to basically any level of complexity because Excel has VBA and you can script everything. The user just follows instructions, and they aren't even allowed to edit any cells which could break anything.

Comment Re: Alibaba (Score 1) 32

In case anyone is going this far down the hole, it turned out great. Even though the item was shipped from the US, because the seller didn't respond I got a refund without having to return it.

So far AliExpress has been responsive to 100% of my issues, and I have only needed to be a little patient and not expect everything to be solved or to arrive immediately.

Comment Re:Google? wtf (Score 4, Interesting) 62

20 million cells? That seems ridiculous. Why aren't they using a database for something that huge?

I agree that a database-backed application is the right way to go for that much data. However, Finance used Excel because they could. We all like to talk about how bad an idea it is to do that, but Excel brought financial computations on large data sets to people who can't write any code. It has enabled thousands upon thousands of businesses to do things they couldn't do before without paying a programmer to develop a solution they cannot maintain. The fact that other spreadsheets regularly crater when handed data that Excel has no trouble with is exactly why we have so much Excel.

I like to use Drupal to rapidly create database applications which can handle a lot of data without writing code. But I wouldn't expect someone in accounting to be able to do that at all, and that just shifts the problem domain. Instead of getting stuck with Excel, now I'm getting stuck with Drupal. All of the logic just winds up in a different system that you can't trivially transfer it out of, so you have the same exact maintainability problem, except more people know how to work with Excel.

Comment Re:I thought we were saving the planet? (Score 1) 181

FYI, their statement about Iceland is wrong. BEV sales were:

2019: 1000
2020: 2723
2021: 3777
2022: 5850
2023: 9260
2024 (first year of the "kílómetragjald" and the loss of VAT-free purchases): 2913
2025: 5195

Does this look like the changes had no impact to anyone here? It's a simple equation: if you increase the cost advantage of EVs, you shift more people from ICEs to EVs, and if you decrease it, the opposite happens. If you add a new mileage tax, but don't add a new tax to ICE vehicles, then you're reducing the cost advantage. And Iceland's mileage tax was quite harsh.
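For anyone who wants the year-over-year changes rather than eyeballing the raw numbers, a quick Python sketch using only the figures listed above (the function name is just for illustration):

```python
# BEV sales in Iceland by year, as listed above.
bev_sales = {
    2019: 1000,
    2020: 2723,
    2021: 3777,
    2022: 5850,
    2023: 9260,
    2024: 2913,
    2025: 5195,
}

def year_over_year(sales):
    """Percent change in sales for each year relative to the previous year."""
    years = sorted(sales)
    return {
        year: (sales[year] - sales[prev]) / sales[prev] * 100
        for prev, year in zip(years, years[1:])
    }

if __name__ == "__main__":
    for year, pct in year_over_year(bev_sales).items():
        print(f"{year}: {pct:+.1f}%")
```

Every year through 2023 is growth. 2024 comes out at about -68.5% versus 2023, and 2025 is up about 78.3% on 2024 but still well below the 2022 figure.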

The whole structure of it is nonsensical (they're working on improving it...), and the implementation was damned buggy. Among other things, it has turned the alerts in my inbox for government documents into spam: they keep sending "kílómetragjald" notices, and you can't tell from the email (without taking the time to log in) whether it's kílómetragjald spam or something that actually matters. What I mean by the structure is that it's claimed to be about road maintenance, yet passenger cars on non-studded tyres do negligible road wear. Tax vehicles by axle weight to the fourth power times mileage, make them pay for a sticker for the months they want to use studded tyres, and charge flat annual fees (scaled by vehicle cost) for non-maintenance costs. Otherwise, you're inserting severe distortion into the market - transferring money from those who aren't destroying the roads to subsidize those who are, and discouraging the people who aren't destroying the roads from driving to places they want to go (quality of life, economic stimulus, etc).
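To show how steep a fourth-power relationship is, here is a small Python sketch. The axle weights and mileage are made-up round numbers for illustration, not real vehicle data, and the function is a hypothetical wear index, not anyone's actual tax formula.

```python
def relative_road_wear(axle_weight_kg, km_driven, reference_axle_kg=1000.0):
    """Wear index proportional to (axle weight / reference)^4 times distance.

    The 1000 kg reference axle is an arbitrary normalization chosen for this
    example; only the ratios between vehicles mean anything.
    """
    return (axle_weight_kg / reference_axle_kg) ** 4 * km_driven

if __name__ == "__main__":
    # Hypothetical vehicles driving the same distance.
    car = relative_road_wear(axle_weight_kg=1000.0, km_driven=15000)
    truck = relative_road_wear(axle_weight_kg=8000.0, km_driven=15000)
    print(f"truck/car wear ratio: {truck / car:.0f}")
```

With an axle 8 times heavier, the index is 8^4 = 4096 times higher for the same distance, which is why a flat per-km charge on passenger cars has so little to do with actual road wear.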

Comment Re:Make more (Score 1) 24

Make more what? Dragon capsules? Currently SpaceX has no plans to make any more Dragons beyond the five they currently have. I believe the fifth capsule had its inaugural flight this year with the private Axiom-4 excursion. These Dragon capsules are currently rated for just five flights each, but SpaceX and NASA are working to extend their certifications to 15 flights each. Currently the contract with SpaceX does not include additional missions that would have been flown by Boeing. While it's possible NASA could try to buy some more flights, I don't think they will. SpaceX has the Falcon 9 pretty well booked, and their other resources are fixed on Starship. I'm not saying NASA "needs" the Starliner, though. Just that it's not a simple thing to substitute Dragon for Starliner.

Comment Re: Alibaba (Score 1) 32

Well, I'm about to find out if I need to do my first chargeback. I have a delayed response on a return authorization for an order where I was sent the wrong item. They advertised a different version. This might be confusing for them since the difference is small - yet critical. But there really should be no confusion because they advertised the other version both in the images and the product name/listing title.
