
Comment Re:It just shows (Score 1) 63

I chose an RC boat because some people are citing this success at a "stupid human trick" competition as intrinsic proof that LLMs can supersede humans.

In an Olympic swimming competition, an autonomous boat and a manned boat would be indistinguishable from each other; both would complete the task much better than a human. It's a useless test for measuring general utility, just as a person swimming a 1500 meter distance is not, on its own, a useful indicator of how useful they are. These Math Olympiads are similar in that they are not particularly indicative of a person's usefulness. The fact that a task stresses a human in impressive ways does not mean that a computer coming at it from a different approach should be considered to have broadly superseded humans.

Yes, a real boat is valuable, and LLMs can be valuable when utilized correctly, but in the face of exaggerated hype some pessimistic reality check is called for to balance expectations.

Comment LLMs can't think and they don't need to (Score 2) 101

LLMs have a great deal of utility and extend computing into a fair amount of scope that was formerly out of reach, but they don't "think", and the branding of the "reasoning" models is marketing, not substance.

The best evidence comes from reviewing so-called "reasoning chains" and observing how the mistakes behave.

Mistakes are certainly plausible in "true thinking", but the way they interact with the rest of the "chain" is frequently telling. If the model flubs a "step" and this were actual reasoning, that error should propagate through the rest of the chain. Instead, a mistake in the chain is often isolated: the "next step" is written as if the previous step had said the correct thing, without the model ever needing to "correct" itself or otherwise recognize the error. What has been found is that if you have the model generate more content and dispose of the designated "intermediate" content, you get a better result. That throwaway intermediate content certainly looks like what a thought process might look like, but ultimately it's just more prose, and mistakes in it continue to show this interesting behavior of staying isolated rather than contaminating an otherwise fine result.

Comment Re:No"AI" cannot think (Score 1) 101

When the model is used for inference, yes. But I assume he was speaking to the awkwardness of training. Take a machine-vision model that has never been trained on dogs and cats, and feed it a dozen labeled images of each to retrain it for dog/cat recognition. Inference on that model will still be utterly useless for recognizing dogs and cats. Or take a model trained on normal images and have it try recognition through a fisheye lens: it will fail because it has no idea what it's seeing. You might hope to retrain it to recognize fisheye distortion generically, but generally that won't work; you have to retrain it on the fisheye variant of everything you want it to catch. And after retraining, if you splice a fisheye-distorted region into a normal picture, the model is likely to be oblivious to the anomaly.

I don't know if this is an argument about 'thinking', but it is certainly a difference in 'learning': AI models need to consume far more data than a human before they start to give useful results. A model training to operate a car is still oddly dodgy even after consuming more driving hours than a human will accumulate in a lifetime. In areas where we have enough training data to feed it, this is fine, but it's certainly different from human learning, which somehow extends far less training into better results.

Comment Re:It just shows (Score 1) 63

I'm not in denial; LLMs and other forms of AI have utility, but expectations have to be tempered.

I was in a discussion with a software executive a couple of weeks back who said he fully anticipates that he can lay off every one of his software developers and testers in the next year and retain only the 'important' people: the executives and salespeople.

People see articles like this showing how LLMs enable computing to reach another tier of 'stupid human tricks', which is certainly novel, but they overextend what this means. We put too much focus on whether LLMs can navigate tests explicitly designed to be gradable and passable, with effort, by humans, and imagine that this means they can chart the less well-trodden set of problems that have an unknown solution, or maybe no solution at all.

There's a lot of stuff LLMs can do that computing couldn't do before, and that does potentially speak to a great deal of the things humans do today, but we have to temper the wild expectations people are forming.

Comment Re:It just shows (Score 1) 63

The point is that we have a myriad of tests that are hard for humans but don't necessarily translate to anything vaguely useful. In academics, a lot of tests are demanding of reasoning ability only because humans have limited memory. Computers, short on actual "reasoning", largely make up for it with mind-boggling amounts of something closer to recall than reasoning (it's actually a bit weirder than that, but as analogies go, recall is the closer one).

It's kind of like bragging that your RC boat could get a gold medal in the Olympic 1500-meter freestyle. It didn't complete the challenge in the same way, and the boat would be unable to, for example, save someone who is about to drown; the boat can just go places, it can't do what that human swimmer could do. And the 1500m swim itself is a feat of little direct use in and of itself.

Comment Re: More things wrong with the world. (Score 1) 81

The commenter clearly seemed to think the world was going to be supremely unfair to the CEO (it turns out 'exec' is ambiguous, as the man, his wife, the mistress, and the mistress's husband are all executives somewhere or other). You said the exec deserves to lose because of his actions, which seems inconsistent with that. The commenter's stance rests on his blatant assumption that the wife was not earning money and the mistress was just some gold digger, and that even if the wife wasn't earning money, it would be unfair for her to get a cut of the CEO's wealth that he earned if the split happens.

The assertions of misogyny arise because he filled in the gaps he didn't know with assumptions consistent with negative stereotypes of women in these situations: he jumped right to the fiction of the struggling man paying huge alimony to some indolent ex-wife living a life of luxury, and of the mistress being in it only for the gold digging.

Comment Re:More things wrong with the world. (Score 1) 81

You seem to have just been hit with the headlines and manufactured a scenario where he is a rich guy married to a stay-at-home wife, with a gold-digging mistress.

My spouse was interested enough to bother to dig in, and the reality is that the CEO, the wife, the mistress, and the mistress's husband all four have substantial incomes of their own, so alimony is likely not even a factor. Similarly, the asset split is unlikely to be lopsided.

From what I've seen in actual life, that all seems to be a rich-person trope, and an exaggeration. Those I've known with modest lifestyles who got divorced don't seem to have encountered a whole lot of financial duress from it (maybe child support, but not spousal support). I was at a business lunch where three people started bemoaning this as if it were true, that their former spouses were just draining them of all their cash. Yet one of them had just been talking about the brand-new BMW M5 that his 'lame' former wife would never have let him buy, and another chimed in with the same experience, albeit with a more humble Kia Stinger. Broadly, they all seemed to be doing quite well, while the ex-wives would otherwise have been left high and dry, largely because of a "man of the house" mindset that kept them from earning an income. Which is fine, but then expect them to be able to use some of your income even after the relationship falls apart.

Comment Re:haha Google Android head is wrong (Score 1) 113

Just because Bitanica gets it right, doesn't mean the broader world understands it.

His complaint is that too many people think of it as a 'degree to get into coding' when it's more than that. You seem to agree with that, though there's maybe room to quibble over the nuance of what more it is or how it should be described.

Comment Re:Everything old is new again (Score 2) 43

Yes, it does need an exclusion zone; it is laid out on their page:
https://thekitepower.com/the-f...

They also note that the flight zone can be used for multiple purposes, subject to limitations.

While agrivoltaics is a thing, discussing power output per area gets tricky. You are splitting the sun between the crops and the panels, so the ability to get 10 kW in 40 m2 turns into who-knows-how-much-more land depending on which approach is selected. It amounts to saying that sparse panels might make good shade for livestock, or that mounting them at a *very* suboptimal angle lets enough light through for the plants while tanking the panels' efficiency, so you spend far more per kW than you would deploying them optimally.

Of course, while they mention agriculture, their main scenario seems to be a medium-term rental for a project site, which they tout as being as quick to set up as a big diesel generator but without the emissions. The idea is that, compared to solar, you don't need a project just to deploy the power source before doing the actual project, and another to take it down afterward. I'm not sure, practically speaking, that the "green"-ness is enough to move people away from the status quo of big diesel for such projects, particularly since they do need the flight zone kept clear of structures, and the "potential" flight zone seems like a big risk for any construction you might do, even if you could spare the flight zone from active work.

Comment Re:My experience (Score 1) 24

Agreed, the move to call them "reasoning" models annoys me.

They basically just generate even *more* text and only provide the last bit: in effect, writing a story about what "thinking" about the question would look like. This does seem to produce marginally better final output, at the expense of an order of magnitude more tokens expended.

But then you look at the "reasoning" chain and you'll see mistakes that, if it were really a reasoning chain, would propagate to the next step of the "reasoning" process. Frequently, though, they are isolated anomalies, and the next text is generated as if the previous text had said the correct thing.

It seems they established that expending more tokens and disposing of most of them produces better results, and that the content to be ignored cosmetically resembles a reasoning chain when it's all correct and consistent, but the errors don't propagate in a way that would be consistent with that being true.

Comment Re:Everything old is new again (Score 2) 43

I think the point was that with solar, the area can have buildings, or it can be "I don't care about what's underneath", but it can't be deployed while the land is also used for farming. In the kite scenario, the land can do double duty for some things, like agriculture, so long as you land and secure the airfoil during the times when you want people in the flight zone.

So you give up 20 m2 to get 100 kW of wind power, with a large 'no people should usually be here' area, though plants are fine. To do the same with solar, you'd need about 745 m2 of non-farm land available; that can include rooftops, where the angle may be suboptimal.
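For what it's worth, the 745 m2 figure is consistent with a back-of-the-envelope calculation. A minimal sketch, assuming a usable panel power density of about 135 W/m2 (my assumption, not a figure from their page; peak-rated density for modern panels is often quoted higher, around 150-220 W/m2, before accounting for spacing):

```python
# Back-of-the-envelope check of the solar-area figure above.
# PANEL_W_PER_M2 is an assumed value, not from the vendor's page.
KITE_POWER_W = 100_000   # claimed output of the upcoming 100 kW kite unit
PANEL_W_PER_M2 = 135     # assumed usable solar power density (W/m^2)

solar_area_m2 = KITE_POWER_W / PANEL_W_PER_M2
print(f"{solar_area_m2:.0f} m^2")  # roughly 741 m^2, close to the 745 m^2 cited
```

At a higher assumed density the required area shrinks accordingly, but the order of magnitude (hundreds of square meters versus a 20 m2 ground station) holds.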

Comment Re:Everything old is new again (Score 2) 43

Looks like they claim 30 kW for the current product and 100 kW for an iteration coming soon, rather than 10 kW. Also, for 10 kW we are talking about 40 square meters of area, and the base station for these is about 20 square meters. Yes, this is still comparing just the base station of one to the total footprint of the other; if we compared total deployed area, solar *easily* wins on every factor except, for all I know, cost.

However, while the total area may be pretty large, it doesn't have to be cleared or denied sunlight to the same degree, so you might get to ignore the overall volume for some applications. In that light, it might be fair to compare the ground-station footprint to the solar footprint.

For example, you have a farm where the land is valuable for crops, but you could abide an airfoil overhead when the fields aren't being worked, or are being worked by pure automation. When you need the flight area worked, you can probably easily land the airfoil for that duration and return it to operation when that is done.

Conversely, it's useless in urban or suburban scenarios, where solar is trivial to deploy.

So if you have a bunch of effectively wasteland, I think this is unlikely to make any sense. But if you have a nuanced land area where people don't need to be, yet you want the land for other purposes, I could see this kite scenario playing out.
