Comment Re:Sounds like a good lawsuit (Score 1) 34

You are right, get legal advice, the cost can be passed on to them anyway.

AIUI, your costs can't (or couldn't) generally be passed on when using the small claims system. Has that changed? It's been a while since I went through the process, so it's possible that my information here is out of date.

Comment Re:Sounds like a good lawsuit (Score 1) 34

There is obviously a personal data angle here. There might also be a defamation angle if the system works as implied by TFS, since it appears that someone's reputation has been affected because someone else lied about them and this has demonstrably caused harm. If there was more than one relevant incident then there might also be a harassment angle.

Please be careful with that advice about requesting compensation in a Letter Before Action, though. There are fairly specific rules for what you can and can't claim under our system, and just going in claiming some arbitrary figure of a few thousand pounds in "compensation" for vague damages is far from guaranteed to get the result you're hoping for. If someone were serious about challenging this kind of behaviour, they might do better to consult with a real lawyer initially to understand what they might realistically achieve and what kinds of costs and risks would be involved.

Comment Thirty Fucking Years Late (Score 1, Informative) 91

Congratulations, you feckless imbeciles. You've "innovated" general software package management a mere three $(GOD)-damned decades after Red Hat and Debian did it.

While you're at it, why don't you "invent" a tiling window manager that can be driven entirely from the keyboard... Oh, wait...

Honestly... Why is anyone still voluntarily giving money to these chowderheads?

Submission + - Nvidia Accused of Media Manipulation Ahead of RTX 5060 Launch

jjslash writes: Hardware Unboxed has raised serious concerns about Nvidia's handling of the upcoming GeForce RTX 5060 launch. In a recent video, the independent tech reviewers allege that Nvidia is using tightly controlled preview programs to manipulate public perception, while actively sidelining critical voices.

The company is favoring a handful of more "friendly" outlets with early access, under strict conditions. These outlets were given preview drivers – but only under guidelines that make the products shine beyond what real-world testing would show. To cite two examples:

  • One of the restrictions is not comparing the new RTX 5060 to the RTX 4060. No need to explain that one.
  • Another restriction, or heavy-handed suggestion: run the RTX 5060 with 4x multi-frame generation turned on, inflating FPS results, while older GPUs that don't support MFG look considerably worse in charts.

The result: glowing previews published just days before the official launch, creating a first impression based almost entirely on Nvidia's marketing narrative.

Comment Re:That's because you don't understand (Score 1) 135

Some are. I work more with smaller businesses than Big Tech and I don't think we've ever had more interest in our software development services.

There is a rational concern that technical people will understand the benefits and limitations of generative AI, but management and executive leadership will fall for the hype because it was in the right Gartner quad or something, and that will lead to restructuring and job losses. Businesses that get that wrong will probably be making a very expensive mistake, and personally I'm quite looking forward to bumping our rates very significantly when they come crying to people who actually know what they're doing to clean up the mess later. It's not nice for anyone whose livelihood is being toyed with in the meantime, obviously, but I don't buy the arguments that this isn't fundamentally an economic inevitability, as the comment I replied to was implying.

Comment Re:That's because you don't understand (Score 1) 135

Historically and economically, it is far from certain that your hypothetical 20% increase in productivity would actually result in a proportionate decrease in employment. Indeed, the opposite effect is sometimes observed. Increased efficiency makes each employee more productive/valuable, which in turn makes newer and harder problems cost-effective to solve.

Personally, I question whether any AI coding experiment I have yet performed myself resulted in as much as a 20% productivity gain anyway. I have seen plenty of first-hand evidence to support the theory that seems to be shared by most of the senior+ devs I've talked with, that AI code generators are basically performing on the level of a broadly- but shallowly-experienced junior dev and not showing much qualitative improvement over time.

Whenever yet another tech CEO trots out some random stat about how AI is now writing 105% of the new code in their org, I am reminded of the observation by another former tech CEO, Bill Gates, that measuring programming progress by lines of code is like measuring aircraft building progress by weight.

Comment Re:BS (Score 1) 149

LLMs perform very well with what they've got in context.

True in general, I agree. How well any local tools pick out context to upload seems to be a big (maybe the big) factor in how good their results are with the current generation of models, and if they're relying on a RAG approach then there's definitely scope for that to work well or not.
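
To make concrete the kind of selection step I mean, here's a minimal sketch in Python of ranking source files by similarity to the prompt and only uploading the top few. The word-frequency "embedding" is a toy stand-in of my own for illustration, not how any real tool actually does it:

    # Toy sketch of RAG-style context selection; the word-frequency
    # "embedding" is a placeholder, not any real tool's approach.
    import math
    from collections import Counter
    from pathlib import Path

    def embed(text):
        # Assumed stand-in for a real embedding model.
        return Counter(text.lower().split())

    def cosine(a, b):
        dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    def pick_context(prompt, repo_dir, top_n=5):
        # Return the top_n source files most similar to the prompt.
        query = embed(prompt)
        scored = sorted(
            ((cosine(query, embed(p.read_text(errors="ignore"))), p)
             for p in Path(repo_dir).rglob("*.py")),
            key=lambda pair: pair[0],
            reverse=True,
        )
        return [p for _, p in scored[:top_n]]

Everything downstream then depends on whether a step like that happens to surface the right files for the question being asked.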

That said, the experiment I mentioned that collapsed horribly was explicit about adding those source files as context. Unless there was then a serious bug related to uploading that context, it looks like one of the newest models available really did just get a prompt marginally more complicated than "Call this named function and print the output" completely wrong on that occasion. Given that several other experiments using the same tool and model did not seem to suffer from that kind of total collapse, and the performance of that tool and model combination was quite inconsistent overall, such a bug seems very unlikely, though of course I can't be 100% certain.

It's also plausible that the model was confused by having too much context. If it hadn't known about the rest of the codebase, including underlying SQL that it didn't need to respond to the immediate prompt, maybe it would have done better and not hallucinated a bad implementation of a function that was already there.

That's an interesting angle, IMHO, because it's the opposite take to the usual assumption that LLMs perform better when they have more relevant context. In fact, being more selective about the context provided is something I've noticed a few people advocating recently, though usually on cost/performance grounds rather than because they expected it to improve the quality of the output. This could become an interesting subject as we move to models that can accept much more context: if it turns out that having too much information can be a real problem, the general premise that soon we'll provide LLMs with entire codebases to analyse becomes doubtful, but then the question is what we do instead.

Comment Re:BS (Score 1) 149

I could certainly accept the possibility that I write bad prompts if that had been an isolated case, but such absurdities have not been rare in my experiments so far, and yet in other apparently similar scenarios I've seen much better results. Sometimes the AI nails it. Sometimes it's on a different planet. What I have not seen yet is much consistency in what does or doesn't get workable results, across several tools and models, several variations of prompting style, and both my own experiments and what I've heard about in discussions with others.

The thing is, if an AI-backed coding aid can't reliably parse a simple one-sentence prompt containing a single explicit instruction, together with existing code as context that objectively defines the function call required to get started and the data format that will be returned, I contend that this necessarily means the AI is the problem. Again I can only rely on my own experience, but once you start down the path of spelling out exactly what you want in detail in the prompt and then iterating with further corrections or reinforcement to fix the problems in the earlier responses, I have found it close to certain that the session will end either unproductively, with the results being completely discarded, or with a series of prompts so long and detailed that you might as well have written the code yourself directly. Whatever effect sometimes causes these LLMs to spectacularly miss the mark also seems to be quite sticky.

In the interests of completeness, there are several differences between the scenario you tested and the one I described above that potentially explain the very different results we achieved. I haven't tried anything with Qwen3, so I can't comment on the performance of that model from my own experience. I was using local tools that were handling the communication with (in that case) Sonnet, so they might have been obscuring some problems or failing to pass through some relevant information. I wasn't providing only the SQL and the function to be called; I gave the tool access to my entire codebase, probably a few thousand lines of code scattered across tens of files in that particular scenario. Any or all of those factors might have made a difference in the cases where I saw the AI's performance collapse.

Comment Re:I for one am SHOCKED. (Score 1) 52

You don't appear to consider the cost to everyone who didn't buy the glasses, but encounters someone wearing them.

This is the thing that people saying things like "You have no reasonable expectation of privacy in public" seem unable to grasp. There is a massive and qualitative difference between casual social observations that would naturally occur but naturally be forgotten just as quickly, and the systematic, global-scale, permanently recorded, machine-analysed surveillance orchestrated by the likes of Google and Meta. Privacy norms and (if you're lucky) laws supporting them developed for the former environment and are utterly inadequate at protecting us against the risks of the latter.

And it should probably be illegal to sell or operate any device that is intended to be taken into private settings and includes both sensors and communications, so that even in a private setting the organisations behind those devices can be receiving surveillance data without others present even knowing, never mind consenting.

Perhaps a proportionate penalty would be that the entire board and executive leadership team of any such organisation, and a random selection of 20 of each of their family and friends, should be moved to an open-plan jail for a year where there are publicly accessible cameras and microphones covering literally every space. Oh, and any of the 20 potentially innocent bystanders who don't think that's OK have the option to leave, but if they do, their year gets added to the sentence of the board member or executive they're associated with instead.

Comment Re:BS (Score 1) 149

FWIW, I was indeed surprised by some of the things these tools missed. And yes, the worst offenders were the hybrid systems running some sort of local front-end assistant talking to a remote model. Personally, while small context limits get blamed a lot for some of the limitations of current systems, I suspect that limitation is a bit misleading. Even with some of the newer models that can theoretically accept much more context, it would still be extremely slow and expensive to provide all of a large codebase to an LLM as context along with every prompt, at least until we reach a point where we can run the serious LLMs locally on developer PCs instead of relying on remote services.
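
As a purely back-of-the-envelope illustration of why (every number below is an assumption for the sake of the arithmetic, not any provider's real pricing or any real measurement):

    # All figures here are assumptions for illustration only.
    lines_of_code = 500_000           # a hypothetical "large" codebase
    tokens_per_line = 10              # rough guess at tokens per source line
    price_per_million_tokens = 3.00   # assumed $ per million input tokens

    tokens_per_prompt = lines_of_code * tokens_per_line
    cost_per_prompt = tokens_per_prompt / 1_000_000 * price_per_million_tokens
    print(f"{tokens_per_prompt:,} tokens, ~${cost_per_prompt:.2f} per prompt")
    # i.e. about 5,000,000 tokens and roughly $15, for every single prompt,
    # under these assumptions.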

Even with all of those caveats, if I give a tool explicit context that includes the SQL to define a few tables, a function that runs a SQL query using those tables and returns the results in an explicitly defined type, and a simple prompt to write a function that calls the other function (specified by name) and print out the data it's retrieved in a standard format like JSON, I would not expect it to completely ignore the explicitly named function, hallucinate a different function that it thinks is returning some hacky structure containing about 80% of the relevant data fields, and then mess up the widely known text output format. And yet that is exactly what Sonnet 3.7 did in one of my experiments. That is not a prototype front-end assistant misjudging which context to pass through or a failure to provide an effective prompt. That's a model that just didn't work at all on any level when given a simple task, a clear prompt, and all the context it could possibly need.
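
For concreteness, the scenario was roughly the shape of the sketch below, though the actual schema, names and details were different; everything here is a hypothetical stand-in rather than my real code:

    # Hypothetical reconstruction of the shape of that experiment; the table,
    # type and function names are stand-ins, not the real codebase.
    import json
    import sqlite3
    from dataclasses import dataclass, asdict

    # The context included SQL along these lines:
    #   CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL);

    @dataclass
    class OrderSummary:
        id: int
        customer: str
        total: float

    def fetch_order_summaries(conn: sqlite3.Connection) -> list[OrderSummary]:
        # Existing function, named explicitly in the prompt.
        rows = conn.execute("SELECT id, customer, total FROM orders").fetchall()
        return [OrderSummary(*row) for row in rows]

    # The prompt asked, in effect, for something like this:
    def print_order_summaries(conn: sqlite3.Connection) -> None:
        summaries = fetch_order_summaries(conn)
        print(json.dumps([asdict(s) for s in summaries], indent=2))

Getting something on the level of that last trivial function wrong, with all of the above explicitly supplied, is the kind of collapse I'm describing.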

Comment Re:BS (Score 1) 149

As for their ability to infer, I couldn't agree less with that.

Dump an entire code base into their context window, and they demonstrate remarkable insight on the code.

Our mileage varies, I guess. I've done quite a few experiments like that recently, and so far it seems worse than a 50/50 shot that most of the state-of-the-art models will even pick up on project naming conventions reliably, let alone follow basic design ideas like keeping UI code and database code in separate packages or preferring the common idioms in the programming languages I was using. These were typically tests with real, existing codebases on the scale of a few thousand lines, and the tools running locally had access to all of that code to provide any context to the remote services they wanted. I've also tried several strategies involving CONVENTIONS.md files and the like to see if that helped with the coding style, again with less than convincing results.

Honestly, after so much hype over the past couple of years, I've been extremely disappointed so far by the reality in my own experiments. I understand how LLMs work and wasn't expecting miracles, but I was expecting something that would at least be quicker than me and my colleagues at doing simple, everyday programming tasks. I'm not sure I've found any actual examples of that yet, and if I have, it was faster by more like 10% than 10x. The general response among my colleagues when we discuss these things is open ridicule at this point, as it seems like most of us have given it a try and reached similar conclusions. I'm happy for you if you've managed to do much better, but I've never seen it myself yet.
