Forgot your password?
typodupeerror

Comment Re:Us too (Score 1) 27

The problem being I haven't seen a good term that refers to the extended LLM scenario that is specific enough to exclude other things like machine vision.

Everyone is referring to the extended LLM scenario and despite things feeding improvements, it still cannot do everything that people promise/believe it can do. I have been inundated by project proposals that largely center around "screw everyone but my job, AI can replace everyone but me", and they are just full of bad ideas.

Basically, the good old "I have an app idea but I can't develop" crowd that actually didn't have a good app idea now think LLM based systems have come to finally realize their vision. As a result, various things are flooded with half-realized concepts that really need to deflate.

For the non-technical folks, a relatively decent analogy is looking at the likes of youtube and just how even worse the uninspired crap has gotten now that GenAI can let them low effort up a significant volume. It's not that GenAI necessarily *has* to make bad content, but bad content creators are equipped to flood the field. Similarly, people who can't deal with software designs are pitching right next to skilled professionals and the target audience doesn't know the difference until after they've already screwed over the wrong party.

Comment Re:This isn't a mirage (Score 1) 27

The same argument could be made around automated fuzzing. A new class of security misbehavior may be identified automatically, and it turns out you can use such tools to identify things to fix as well.

Of course, it could be a problem if it has a high false positive rate, where the attacker can hit false positives and barely be impacted but the false positives drive an impossible churn to keep up with on the defense side... Which frankly could be a thing based on my experience with LLM code review that can catch stuff, but also has false positives and even suggests absolutely broken "fixes" fairly commonly.

Comment Re:Hey what a coincidence... (Score 2) 27

While Anthropic is generally more credible, they have indulged in performative bullshit for the sake of the hype train.

Frankly, if they didn't, they would have been screwed over no matter how well they actually made a product.

Not crazy about the "do things to open source projects, but obfuscate the fact that it's LLM originated" Anthropic thing either way.

Comment Re:Us too (Score 3, Insightful) 27

I do suspect that OpenAI will be the 'Netscape' of this bubble pop. Early mover that in many ways sparked something significant that got left behind by others that did it better.

I am so eager for a bubble pop to recalibrate expectations to properly leverage LLM as appropriate instead of the current madness. It will be an adjustment, but without the craze it won't be nearly so obnoxious.

Comment Hey what a coincidence... (Score 5, Insightful) 27

Anthropic announces that they have a super awesome AI product that's just too awesome for anyone for anyone to see.

And then immediately OpenAI has the exact same thing.

FOMO on "my technology is too scary to exist" is a fun twist.

I know, it's not the first time, someone even linked an article where OpenAI said the same sort of thing about GPT-2 back in 2019...

Comment Re:Reliability? (Score 1) 56

I'd want:
- Trivially replaceable battery. This means no glue, and ideally means a standardized battery approach to maximize chances of buying a replacement one down the line.
- Putting ports on a separate board than the CPU and ram and such. Physical damage comes to ports, especially charging ports. Having this delegated off board minimizes risk of having to replace something expensive.
- Replacable keyboard and screen. Again, at high risk of damage and should be replaceable
- Removable storage. If your mainboard does fail, smoothest if you can move your SSD over to the replacement main board.
- Commitment to consistent form factor. If 5 years down the line it breaks, I can accept if I can't get *exactly* the same board anymore, but it would be nice if I could just get a new generation board and replace it without letting perfectly adequate screen, keyboard, case go to waste.

So mostly Framework, Lenovo recently did a think with a Thinkpad also exhibiting most of these, except no indication of generation to generation consistency in parts.

Comment Re:ThinkPad? (Score 1) 56

Note that this report might be based on perusing websites more than hands on evaluation.

That said, "Lenovo" laptops include the non-thinkpads, which tend to be *terrible* for repair-ability. For example, in many cases they don't consider the keyboard to be a part worthy of keeping replaceable without replacing half of the laptop, despite it being one of the most likely things for a user to break. You can get third-party parts that is just the keyboard, but you have to destroy a lot of plastic welds to even try, and there was never a design to put it really back together after you did that.

The Thinkpads tend to do pretty well, though increasingly the cpu and memory are "just part of the board now", but honestly that's just the direction of that industry in general. We are pushing physics, it's harder for us to do modular RAM at the speeds we want to interact with the RAM, LPCAMM is a thing, but even then you just have a single LPCAMM and it's less about 'repair' and more about being able to have different memory amounts by swapping the module out.

Comment Re:Most Thinkpads Quite Repairable (Score 5, Insightful) 56

Couldn't find actual details on *which* models they looked at.

If you look at the non-ThinkPad Lenovo laptops... They are complete shit for repairability.

The ThinkPads on the other hand tend to be very very good.

But other issues make me wonder about their competency in writing the report. Notably they give Lenovo a "lobbying penalty" for being a member of a group that fights right to repair but gives Motorola a pass for not being in those groups.... Lenovo and Motorola are the same company, and they don't seem to realize that.

Comment Re:What I find amusing is... (Score 2) 38

It's not out of date, it's a simplification.

They don't innately understand their capabilities, but information about it's own capabilities may be fed explicitly into it by other means, just like any other data you want to endeavor to put into the context.

The concept of asking if it implements a certain behavior and either it's deliberately lying or it's not actually there relies upon a false assumption that of course it has innate knowledge of it's own implementation without any "help".

The core relevant issue is that the LLMs will generate an answer based on no data. Instead of "Information on that one way or the other is not available to the model" it sees the answer most consistent with the narrative to be "Those behaviors do not exist". LLMs tend to generate output that implies confidence regardless of whether there should be confidence or not. The workaround has been to try to do everything possible to make sure there is actual data in the context window and hope it just doesn't come up that much, but this is only so possible. Some coding has the opportunity to use test cases to add "the output given failed to work" automatically to the narrative to drive iteration and maybe get further.

Comment Re:Self discipline (Score 1) 128

Here's the thing, some folks do the discipline and keep a healthy weight, but they are basically always feeling hunger. Some people don't feel it but some people are having to constantly fight sensation of hunger, with a respite of a little bit after a meal, and almost never feeling 'full'.

If we had something to tame the rather depressive experience of constantly denying one's hunger because you know in your mind that you got the nutrition and caloric intake you need, but your body wants to eat your way to obesity.

Comment Re:Sounds like the lights might be going out on PO (Score 2) 26

Problem is that the only viable market for mainframe are current mainframe customers, who are so change averse that if you even hint at breaking compatibility they will be triggered to start evaluating *all* their options if they are faced with a potential migration anyway.

IBM may love the idea of shuttering their in-house stuff in favor of massively cheap commodity stuff, but they would absolutely no longer command mainframe margins.

Comment Re: 25,000 lines of code (Score 2) 78

You assume that a standards document exists and is also sufficiently specific for all scenarios. Other than some very fundamental IETF stuff have I seen a standards document that pretty much covers the scope specifically. Even more severely, "specifications" for an internal project have been so traditionally bad, a whole methodology cropped up basically saying that getting specifications that specifically correct is a waste of time because during the coding it will turn out to not be workable.

Yes, it can write hundreds of tests, but if the same mediocre engine that can't code it right is also generating tests, the tests will be mediocre. Leading to bizarre things like a test case to make sure '1234' comes back as 'abcd' and the function just always returns the fixed string 'abcd' and passes the test because it decided to make a test and pass it instead of trying to implement the logic. I have seen people almost superstitiously add to a prompt "and test everything to make sure it's correct" and declare "that'll fix the problems". The superstitious prompting is a big problem in my mind, that people think they add a magic phrase and suddenly the LLM won't make the mistakes LLMs tend to make. I have seen people take an LLM at their word when the LLM "promises" to not make a specific mistake, and then confounded the first time they hit the LLM making the mistake anyway. "It specifically said it wouldn't do that!", it doesn't understand promises, the thing just will generate the 'consistent' followup to a demand for a promise which is text indicating making the promise.

Take the experiment where they took Opus 4.6 and made it produce a C compiler. To do so, the guy at Anthropic said point blank he had to invest a great deal of effort in a test harness, that the process needed an already working gcc to use as a reference on top of that, and specified the end game as a bootable, compiled kernel. Even then he had to intervene to fix it and it couldn't do the whole thing and when people reviewed the published result, it failed to compile other valid code and managed to compile things that shouldn't have been compilable. This is Anthropic with their best model doing a silly stunt to create a knock off of an existing open source project with full access to said project and source code and *still* it being a lot of human work for mediocre output.

Yes, it has utility, but there's a lot of people overestimating capabilities and underestimating risks and it's hard for the non-technical decision makers to tell the difference until much further down the line. Mileage varies greatly depending on the nature of the task at hand as to whether LLM is barely useful at all or it can credibly almost generate the whole thing.

Slashdot Top Deals

"Ask not what A Group of Employees can do for you. But ask what can All Employees do for A Group of Employees." -- Mike Dennison

Working...