
Comment Re:Unfortunately people are not fungible (Score 3, Interesting) 113

People with an affinity for STEM will be fine; even if AIs are very good at making new discoveries, the most important part of any new field of knowledge is realizing which areas are worth exploring, and AI will need to be led to those areas for quite a long time. Boffins will continue to thrive in such an environment.

What worries me is what will happen to people who are not able to do science. Hospitality and healthcare may indeed thrive, but not everybody is capable of those jobs; and artistic, creative types will suffer when their work can be almost equaled at 100x the rate of production. What will happen to people without any remarkable skill, when their work can be done cheaper and faster by a skilled worker controlling an AI?

Comment Automated knowledge has never existed before now (Score 1) 113

Economics in general has the problem that you can't store work, i.e. do the work in advance and keep the result for future use, as everyone who has ever cleaned a room can tell you. Most work has to be done at the very moment the result is needed. You can't even refill your car before the gas tank has emptied enough to accept new gas.

But STEM jobs are a way to actually pre-work. Once you automate something, you will not have to do exactly that work again.

You hit the nail on the head. Throughout human history, every day we have needed to provide for food and shelter. Our brains evolved to make us better at anticipating where we would get them and at creating ways to secure those needs.

The thing is, we now have a new technology - computing - that allows all kinds of work to be done ahead of time, and we still don't know what all the implications for human society will be.

Before the printing press we already had ways to store language and ideas in physical form, but printing made the process efficient, accelerating our capacity to do science and improve our knowledge. This led to new discoveries in physics that brought us better energy management, creating automated machines for physical work. Before steam engines there were machines that used energy to automate work, but they were limited to specific locations and tasks, such as mills, ploughs and cranes.

Now we have a technology that automates the application of knowledge to virtually any kind of work, physical or mental. Whether AIs have thoughts of their own or not, the fact remains that software is capable of condensing large areas of human knowledge (including STEM knowledge) and applying it repeatedly without direct human intervention. And as you said, we will not have to redo the work of understanding those tasks in order to get their benefits. This will again accelerate how parts of our society function, and society will again adapt around the new processes created by that increased efficiency.

Comment Re:Why admit it? (Score 1) 29

I assume the intent is to have a dumber system enumerate all possible ideas en masse and have them patented automatically. That way, if someone proves one approach to be actually useful, you can claim prior registration.

Thus explaining why "UK patent law is currently wholly unsuitable for protecting inventions generated autonomously by AI machines."

Comment Re: shame (Score 1) 69

I mean, seriously, that data has proven crucial in countless federal, state, and local investigations; it would be a shame to lose it all because a private entity chooses not to save it, right?

Don't worry about them; phone tower companies have it covered.

Clever criminals leave their phones at home anyway.

Comment Re:LLMs are nothing but a good search engine (Score 1) 46

I can ask any old search engine to give me a recipe for chocolate chip cookies. I can't ask a search engine to then double the ingredients after it provides me with the recipe. That's not merely "combine the multiple found snippets".

We have a different definition of learning from the training set. The LLM is able to double numbers because it was trained on examples of doubling. If you trained it only with those examples and it learned that x2 means doubling, it would not be able to infer that x3 means adding the number three times and x4 means adding it four times, unless you also included examples of tripling and quadrupling. The 'learning' it can exhibit is limited to patterns found in the corpus of input data.

LLMs don't have the capability to deduce new facts from the facts they know, like a symbolic search engine would; they can only probabilistically generate strings or images that simulate such deduction. The process is completely different. And it's limited to probabilities over facts in the training data; the prompt is only used to compute posteriors on those probabilities, not to generate new ones.

I've seen such a capability demonstrated with my own eyes. I pasted a portion of the manual for a proprietary DSL that isn't on the Internet into my context and asked deepseek-coder-33b-instruct to write a program in a language it knew nothing about.

However, as soon as you clear the context, the model will again return to knowing nothing about the language. That's an instance of generating the content most likely to match the multiple levels of activated patterns I talked about. It can create results more elaborate than a search engine (which merely returns the original document unchanged), but it's based on the same principle. It's using the content you provided as input, but not learning it. It won't remember a thing about your DSL or the written program, much less generalize about them for other new prompts.
It has not learned in any meaningful sense of the word, which necessarily implies the information becoming part of the permanent knowledge of the model.
Even if you never cleared the context and kept the DSL spec and the conversation in it forever, that would be akin to memorising the material rather than learning it, a weaker form. You need something like a LoRA for the model to acquire new knowledge; only then does the new information become part of its training and can it be used as knowledge of its own.
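If it helps, the distinction can be sketched in a few lines of Python. This is a rough illustration only: it assumes the Hugging Face transformers and peft libraries, and the model name and DSL text are placeholders standing in for whatever you actually used.

# Transient in-context "knowledge" vs. a persistent LoRA update (sketch only).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "deepseek-ai/deepseek-coder-33b-instruct"  # placeholder; any causal LM works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

dsl_manual = "<paste the proprietary DSL manual here>"  # hypothetical content

# 1) In-context: the manual exists only inside this one prompt.
prompt = dsl_manual + "\n\nWrite a program in this language that prints 'hi'."
ids = tok(prompt, return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=200)
print(tok.decode(out[0]))
# Start a new prompt without the manual and the model "knows" nothing again.

# 2) LoRA: small trainable adapters are added to the frozen weights, so DSL
#    examples can be baked into the model itself and persist across prompts.
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
peft_model = get_peft_model(model, lora)
# ...train peft_model on a dataset of DSL examples, then save the adapter.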

Comment Re:LLMs are nothing but a good search engine (Score 1) 46

This perspective is a fundamental misunderstanding of what LLMs are.
On the contrary, it's the result of careful consideration of how LLMs operate and reflection on the observed results.

The point of the technology is generalization, the ability to apply learned concepts. It isn't about cutting and pasting snippets of text from a dataset.

I didn't say that it's merely cutting and pasting snippets. As I mentioned, the model has the capability to use learned language to combine the multiple found snippets into a single coherent discourse. But that discourse *is* essentially a regurgitation of the many items of content retrieved in response to the prompt; some as text snippets, and others as more complex patterns learned directly from the training corpus.

But if you think that the model creates knowledge beyond what's provided in the training data, you're the one with a fundamental misunderstanding. What you call "generalization" is a codified compression of the training corpus; that compression happens to capture patterns in the input documents at multiple levels - some at the level of surface syntax, others connected to more abstract concepts that humans used to create and classify the content (such as style, emotion, and the meaning of the topics themselves).

When those compressed patterns are applied to new content, such as an input prompt, the prompt activates the most relevant of them, and the model generates the content most likely to match the multiple levels of activated patterns at the current generation point. But models in their current form have no capability at all to create new patterns at runtime based on their own use, i.e. no memory and no mechanism to reason about what they see.

So you'd be mistaken if you think they have any capability to learn from content that was not part of their training data; they need to be retrained with new data in order to acquire such content. Maybe in the near future there'll be a way to have models with actual online learning that get new knowledge directly from their own interactions, like they do now offline with RLHF.
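A toy sketch of what I mean, with made-up numbers and a made-up vocabulary: during generation the forward pass only reads the frozen weights and samples from a softmax over them; nothing in that process writes new patterns back into the model.

# Toy illustration: inference reads frozen weights, it never updates them.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat"]
W = rng.normal(size=(8, len(vocab)))   # frozen output projection (stand-in for the model)
hidden = rng.normal(size=8)            # state computed from the prompt

logits = hidden @ W
probs = np.exp(logits - logits.max())
probs /= probs.sum()                   # softmax over the fixed vocabulary

next_token = vocab[int(rng.choice(len(vocab), p=probs))]
print(next_token)

# W is bit-for-bit identical before and after sampling; new patterns can only
# appear if W is changed by another round of (offline) training.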

Comment LLMs are nothing but a good search engine (Score 1) 46

The best way to understand what LLMs are doing is to treat them as a search engine that actually works as intended, retrieving multiple results from its corpus of training documents. Thanks to modern language processing techniques, they are capable of combining several results into a single, narratively coherent reply. Just like a search engine, though, the quality of the results is limited by the quality of the documents provided.

LLMs now have the advantage of not being tainted by SEO techniques and in-place advertising (for now), so we get to experience something similar to how Google worked when it was first released: an unbiased index to all the knowledge humans have shared on digital media.

For the second time we'll have a small window of opportunity to see what it's like to have all that knowledge available, until it's again made unusable by the same market forces that poison the pool of content for small personal gains. No need to blame regulation for that; done well, regulation could actually reduce that degradation.

Comment Re:What was wrong with these? (Score 4, Informative) 46

What was wrong with these? https://en.wikipedia.org/wiki/...
The best known set of laws are Isaac Asimov's "Three Laws of Robotics".

Are you joking? The *whole point* of Asimov's Three Laws of Robotics was to demonstrate, through his robot stories, that a small set of simple rules could never work to control artificial intelligence in a complex world full of ambiguities.

Comment Three predictions (Score 2) 78

First, new AI-friendly programming languages will be created that are simpler for LLMs to learn. Once developers have the assistance of a model to create and understand code, easy-to-read, concise languages won't be that essential, but unambiguous precision will be. Code snippets will become more write-only.

Second, business-level programming will become even more dependent on composing pre-built, well-tested library components. Programs will become less of a logically coherent cathedral with solid pillars that (tries to) solve all the needs of an organisation, and more of a camp of communicating, loosely connected tools, each serving a single concern and actually solving the needs of a worker or small team.

Third, thanks to the previous two, most programs won't be built and run with the current enterprise lifecycle of writing code to specs, debugging in a development environment, then releasing; programming will instead become integrated with the AI platform, running in the background without the code ever leaving the environment where it was written. Multimodal AIs like Gemini are capable of writing ad-hoc user interfaces, built on the spot to explore and consume data according to a user's current business needs. Many of the tools will be transient, single-use widgets that solve a simple task at hand following the specifications provided by the end user, just as information workers today create spreadsheets to define ad-hoc processes for keeping and modelling the data their work requires. In this scenario, exposing the code will rarely be needed, and only to the point of checking that operations on the data are using the right logic.

Will traditional programming disappear? Of course not; just as systems programming is still needed, someone will have to write the well-tested components that the AI combines, and someone will need to keep in check the architecture of the more complex combinations of tools that are used in a stable way over time. But for most small tasks, users will finally be able to create their own tools for their simple and moderately complex information processing needs.

Comment Re:Simple advice for the leaks (Score 1) 43

I guess we'll soon run into the problem that AI is learning nearly exclusively from AI. Now, let's assume that 90% of AI generated content isn't total garbage. That means the first generation of AI had 100% sensible content to learn from (i.e. 100% human content). The next generation will have about 91%. The generation after that, about 81%. And so on.

At what point will AI be reduced to learning more garbage than something that could actually be useful?

Oh, you're thinking small. It can get way worse than that, faster.

The latest fancy technique I've seen in use, to (presumably) reduce hallucinations, is using AI advisors as part of the training. Yup, not only are they using AI-generated training data, now they're using AI to evaluate the quality of the generated content as well. What could possibly go wrong? The theory behind this is that an AI advisor can evaluate thousands of content samples much faster, greatly enlarging the volume of the training.

Allegedly the advisor will be capable of detecting whether the generated content is similar enough to the original documents, in order to decide whether the generation is hallucinating or being faithful. Truth is, they'll merely be training the model to appear entirely reasonable and accurate while leaving no trace that it totally fails to understand the meaning of the content it's working with.
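For what it's worth, the decay in the quoted question is easy to sketch. A back-of-the-envelope in Python, under the (very rough) assumption that each generation trains only on the previous generation's output and that 90% of that output is usable:

# Usable fraction after n generations is 0.9**n under the quoted assumption.
usable = 1.0
generation = 0
while usable >= 0.5:
    generation += 1
    usable *= 0.9
    print(f"generation {generation}: ~{usable:.0%} usable")
# The corpus becomes majority garbage around the 7th generation (0.9**7 is about 0.48),
# and that's before the advisor feedback loop above makes things worse.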

Comment Simple advice for the leaks (Score 4, Informative) 43

Don't put any document into the training set that you don't want to eventually see exposed.

Training basically consists of having the network memorize and connect everything you throw at it, codified in a highly compressed form.
Even if you add instructions to the mix telling it to hide some of that information when it's queried from a certain angle, you'll never know what other connections have been made that will surface the same information when it's asked for from a different perspective.

Therefore, train any public-facing AI on a need-to-know basis. The most secure information is the one that is not there.

Comment Re:I have to thank UNITY (Score 1) 45

Isn't there a massive amount of various plug-ins and resources for both Unity and UE, mostly paid?

Does GODOT allow for the same? Or do they have to be "free and open source" as well to plug into it?

Godot is MIT-licensed, so in the worst case a company can fork it to create a proprietary version. However, the powerful plugin system already allows creating plugins under any licensing model.

There's the fact that people arriving at Godot tend to value open source highly, of course. But the newly gained attention could expand the community into areas where more people create proprietary, closed-source complements for the tool.

Comment Re:When You Want Your AI as Reliable as Your OS (Score 1) 14

I expect the BSOD message to be something along the lines of

"Well, I don't think there is any question about it. It can only be attributable to human error. This sort of thing has cropped up before, and it has always been due to human error."

And now that you have posted it online, after its next web-scraping cycle the AI will know what text to put there, as the answer most likely to be shown in the context of an AI-generated BSOD message.

Comment Re:They will coexist (Score 1) 129

Markets aren't always the most efficient decision-makers, but in my experience governments very rarely are efficient decision-makers. YMMV

It's not said enough, and it may shock Anglo-Saxon audiences with their Protestant ethic, but economic efficiency is not the sole axis along which to measure public action. Sometimes we want governments to regulate some topics precisely because we don't want the most efficient outcome, but to pursue other values like human life, dignity, equality, or not leaving anyone behind. Sadly, those values are typically not captured by economic indicators, so free markets are unable to optimise for them, and regulation is required to achieve them.
