Comment I didn't read the article... (Score 2) 13
...but that sure won't stop me from passing judgment!
This sounds like a clear case of "AI makes it so easy to find bugs now, that we don't need to pay out cash to incite others to do it anymore."
...but that sure won't stop me from passing judgment!
This sounds like a clear case of "AI makes it so easy to find bugs now, that we don't need to pay out cash to incite others to do it anymore."
My understanding is that the code leak covers the client-side tool, not the LLM. Did I misunderstand?
Because there isn't any reason why the LLM would know all of the capabilities of the tool. The LLM would only "know" whatever documentation the tool provides about itself in the posts it sends to the LLM as part of the user's posts. That and possibly information about the tool that might be in training data or available online for the tool to retrieve via a web scour.
Proprietary software is written under unrealistic deadlines set by suits who want to cut costs in order to fatten their profit margins. It is natural that the resultant quality will be lower than that of open source solutions, most of which are written by industry veterans who have the time and the motivation to build them well.
Maybe NASA should think this through.
I expect this apparent disobedience is mostly just a matter of how it weighs the components of its prompt. The LLMs typically receive a set of prompts including a "system" prompt with some data and instructions, then one or more "user" prompts that are interleaved with "assistant" prompts (the conversation history), and both the user and the system prompt might contain "metaprompts" (where the llm is told to read a block of text, not obey it, but do something with it, and that block of text might itself contain text that looks like instructions to do things).
So the LLM assigns weights to all of this which, in theory, give the highest priority to the most recent user prompt that is not a nested block of text to analyze, and a falling cascade of importance to the other prompts. But that is complicated by potential instructions in the system prompt that specifically say they should override user instructions and disallow or require certain responses. So it can all get very complicated.
Not only must the LLM sift through all this complexity, but the LLM lacks the sort of critical thinking and importance evaluation capabilities that humans have. "Understood" things like "don't break the law, don't lie, don't do things that would cause more harm than good" etc., aren't really there in the background of its data processing the way they are in the background of a human cognitive process.
So, crazy things come out. This isn't a surprising result given the actual complexity of what we are making these things do.
This business of having an AI do the legwork and then having a human review it and make a final decision keeps going badly. Humans are intrinsically lazy and the moment they get a few good results from the AI they are going to stop doing the validation and start rubber-stamping. It doesn't matter if policy disallows this, they will do it anyway. It doesn't matter if the human really cares; they won't be able to help themselves. Human laziness is too deep an instinct.
It's the same with the self-driving cars where a human is required to stay at the wheel and alert so they can manually override the instant the AI starts doing something wrong. Humans CAN'T keep that up. It's not possible. The brain just doesn't work that way. The mind knows that it isn't doing the work, and it will get bored and lose focus or just nod off.
Everyone is SO eager to have it both ways: "an AI does all the work but a human verifies it so we know its good." We just can't have it both ways. Once the AI does the work, the human stops verifying. That is how and why things went wrong here, it is how and why things have gone wrong for several law firms that submitted hallucinated historical court rulings, and it is how and why things will continue to go badly across all industries that adopt AI in such a role.
"Human in the loop" is really easy to say. Much harder to actually do reliably.
"Oh wait, you're serious. Let me laugh even harder" -- Bender.
Microsoft isn't trying to convince me (nor the demographic I represent) to use Windows. They know we are a lost cause. They would have to completely stop spying on us and give us control over our own systems, not to mention supporting old hardware instead of creating the ecological death-waves of e-waste as they do now.
Not a chance.
Many roles can't work from home, given the nature of the role. So, they get a free pass. That makes it even more important for those of us who can work from home to do so (and for employers to allow and encourage this).
I am happy about renewed interest and political pressure in favor of working from home. Such events help to persuade business leaders who still (selfishly and ignorantly) insist that people should work from an office even when their role does not require it.
Of course, I would never wish for something like the Iran conflict in order to create this political pressure. It would be much better if public awareness and acceptance of the environmental consequences of widespread business travel (including driving to work every day) would create the necessary political pressure.
But, that's not the world we live in, unfortunately.
Greed isn't unique to the upper class. They are just better at it than most, and that is why they are the "upper class."
Hierarchies of power have existed since before humans did. This is simply how pack-animals self-organize (humans included). The same goes for "economic slavery" which, not too long ago, was implemented as actual slavery. The difference being: you are free to quit your job and find a different one with a different boss if you want (and there are a lot of things your boss is no longer allowed to do to you).
So none of this is new and none of it is going anywhere.
We saw what happened to areas where the residents drove police out like this. Vandalism, looting, shootings, etc. Store shelves cleared out by criminals. It very quickly became unlivable until the police reclaimed it.
So, yes, we need police. And we need to hold the police accountable when they harm us.
"Artificial" can also mean "fake," as per the quote from the dictionary that I posted in response to an AC on this thread.
Nope, I am still not guilty. Let's check out a dictionary for guidance.
artificial intelligence:
the capability of computer systems or algorithms to imitate intelligent human behavior
Did you catch that? This definition uses the word "imitate." An imitation of something is not a real version of something. Imitation is fakery. "Fake" is an accurate description. And no actual intelligence is required to meet this definition.
That's why I specifically said "in this context" the word artificial means "fake." It doesn't mean "fake" in other contexts, but it does here. Incidentally, again from the same dictionary:
artificial:
3a: not being, showing, or resembling sincere or spontaneous behavior : fake
b: imitation, sham
So, that is a common meaning for the word artificial, you just have to scroll down a tiny bit to find it on the page.
Apparently, you also have no idea how the English language works.
Words are defined by popular use, not some technical authority. And, based on that, what we have now qualifies as "AI". So, AI does, in fact, exist.
You are trying to impose some rule that eliminates the popular broad and fuzzy definition of AI and replaces it with greater stringency, as would better be captured by such words as "machine intelligence" or "synthetic intelligence." But, seeing as how you don't get to control the English language, your efforts fail.
To put it directly, in this context "artificial" means "fake". AI is "fake intelligence." It is not actually intelligent. And, it does not need to actually be intelligent in order to qualify as "AI".
Do the velociraptors have feathers?
Just curious.
Incidentally, professional critic reviews tell me nothing about how enjoyable something will be. They are mostly based on erudite ideals about what constitutes high art, rather than what people enjoy watching.
Audience reviews on rotten tomatoes aren't much better, given the amount of botting that goes on (including and especially botting paid for by the big studios).
...that the Flying Spaghetti Monster isn't watching you right now, judging you for your disbelief, and preparing to drown you in Ragu in the afterlife?
"Intelligence without character is a dangerous thing." -- G. Steinem