Want to read Slashdot from your mobile device? Point it at m.slashdot.org and keep reading!

 



Forgot your password?
typodupeerror
×

Comment Re:Shamefully misleading use of term (Score 1) 64

Good to see we're abandoning the premise that the logic behind LLMs is "simple".

LLMs, these immensely complex models, function basically as the most insane flow chart you could imagine. Billions of nodes and interconnections between them. Nodes not receiving just yes-or-no inputs, but any degree of nuance. Outputs likewise not being yes-or-no, but any degree of nuance as well. And many questions superimposed atop each node simultaneously, with the differences between different questions teased out at later nodes. All self-assembled to contain a model of how the universe and the things within it interact.

At least, that's for the FFNs - the attention blocks add in yet another level of complexity, allowing the model to query a latent-space memory, which each FFN block then outputs transformed for the next layer. The latent space memory being.... all concepts in the universe that exist, and any that could theoretically exist between any number of existing concepts. These are located in an N-dimensional space, where N is hundreds to thousands. The degree of relationship between concepts can be measured by their cosine similarity. So for *each token* at *each layer*, a conceptual representation of somewhere in the space of everything that does or could exist is taken, and based on all the other things-that-does-or-could exists and their relative relations to each other, are transformed by the above insane-flow-chart FFN into the next positional state.

Words don't exist in a vacuum. Words are a reflection of the universe that led to their creation. To get good at predicting words, you have to have a good model of the underlying world and all the complexity of the interactions therein. It took achieving the Transformers architecture, with the combination of FFNs and an attention mechanism, along with mind-bogglingly huge scales of interactions (the exponential interaction of billions of parameters), to do this - to develop this compressed representation of "how everything in the known universe interacts".

Comment That's just RAG. (Score 4, Interesting) 64

"Grok's differentiator from other AI chatbots like ChatGPT is its exclusive and real-time access to X data." That's just RAG. Retrieval Augmented Generation. All Grok is doing is acting as a summarizer. This is something you can do with an ultra-lightweight model, you don't need a 314B param monster.

Also, you don't need an X Premium subscription to "get access" to Grok, since its weights are public. To "get access" to an instance running it, maybe.

I've not tried running it, but from others who have, the general consensus seems to be: it's undertrained. It has way more parameters than it should need relative to its capabilities. Kinda reminiscent of, say, Falcon.

I also have an issue with "A snarky and rebellious" LLM. Except people using them for roleplaying scenarios (where you generally don't want a *fixed* personality), people generally don't want it inserting some sort of personality into their responses. As a general rule, people have a task they want the tool to do, and they just want the tool to do it. This notion that tools should have "personalities" is what led to Clippy.

Comment Re:Really? (Score 1) 143

So ancient societies without slaves didn't and couldn't exist? Say, the Incas? The Harappan civilization? None at all? *eyeroll*

Incan society is IMHO really interesting. It's sort of "What if the Soviet Union had existed in the feudal era", this sort of imperial amalgam of communism and feudalism. There was still a heirarchy of feudal lords and resources tended to flow up the chain, but it was also highly structured as a welfare state. People would be allocated plots of land in their area of specific size relative to their fertility, along with the animals and tools to work it, including with respect to the family status (for example a couple who married and had more children would be given more land and pack animals). Even housing was a communal project. The state would also feed you during crop failures and the like In turn however all of your surpluses had to go to the state (and they had a system to prevent hoarding), and everyone owned a certain amount of days of labour to the state (mit'a), with the type of work based of their skills. It was very much a case of "each according to his ability, each according to his needs" - at least for commoners.

The Incans saw their conquest as bringing civilization and security to the people under their control, as a sort of "workers paradise" of their era. Not that local peoples wanted to be subdued by them, far from it, but the fact that instead of dying trying to resist an unwinnable war, they could accept consequences of a loss that weren't apocalyptic to them, certainly helped the Incan expansion. They also employed the very Russian / Soviet style policy of forced relocations and relocation of Incan settlers into newly conquered territories to import their culture and language to the new areas while diluting that of those conquered within the empire.

The closest category one might try to ascribe to "slaves" is the yanacona, aka those separated from their family groups. During times of high military conquest most were captured from invading areas, while during peacetime most came from the provinces as part of villages's service obligations to the state, or worked as yanacona to pay off debts or fines. These were people that did not continue to live in and farm their own villages, but rather worked at communes or on noble estates. But there really doesn't seem to be much relation beyond that and slavery. Yanacona could have high social status, even in some cases being basically lords themselves (generally those who were of noble descent) with significant power, though most were commoners. But life as a yanacona is probably best described on most cases as "people living on a commune". There was no public degradation for being a yanacona, no special marks of status, they couldn't be randomly abused or killed, there were no special punishments reserved for them, they had families just like everyone else, etc. Pretty much just workers assigned to a commune.

Comment Re:Really? (Score 4, Interesting) 143

First off, it's simply not true that ancient wars had only two options, "genocide or slavery". Far more wars were ended with treaties, with the loser having to give up lands, possessions, pay tribute, or the like. Slaves were not some sort of inconvenience, "Oh, gee, I guess we have to do this". They were part of the war booty, incredibly valuable "possessions" to be claimed. Many times wars were launched with the specific purpose of capturing slaves.

Snyder argues that the fear of enslavement, such an ubiquitous part of the ancient era, was so profound as to be core to the creation of the state itself. An early state being an entity to which you give up some control of your life in order to gain the protection against outsiders taking more extreme control over your life. For example, a key aspect to the spread of Christianity in Europe was that Christians were forbidden to take other Christians as slaves, but they could still take pagans as slaves. States commonly converted to Christianity, not by firm belief of their leaders, but to stop being the victim of - and instead often be the perpetrator of - slave raids.

First slaving focused on the east, primarily pagan Slavic peoples. With the conversion of the Grand Dutchy of Lithuania, some slaving continued even further east into Asia, but a lot of it spread to the south - first into the Middle East and North Africa, but ultimately (first though intermediaries, and later, directly) into Central Africa. Soon in many countries "slaves" became synonymous with "Africans". Yet let's not forget where the very word "slave" itself comes from: the word "Slav".

Comment Re:Safeguards (Score 1) 40

As a side note, before ChatGPT, all we had were foundational models, and it was kind of fun trying to come up with ways to prompt them to more consistently behave like a chat model. This combined with their much poorer foundation capabilities made them more hilarious than useful. I'd often for example lead off with the start of a joke, like "A priest, a nun and a rabbi walk into a bar. The bartender says..." and it'd invariably write some long, rambling anti-joke that in itself was funny due to it keeping on baiting you with a punchline that never came. And because it's doing text completion, not a question-answer format, I'd get examples of things like where the bartender would say something antisemitic to the rabbi, and all three would leave in shock, and then the narrator would break the fourth wall to talk about how uncomfortable the event made him feel ;)

You could get them to e.g. start generating recipes by e.g. "Recipe title: Italian Vegetable Bake\n\nIngredients:" and letting it finish. And you'd usually get a recipe out of it. But the model was so primitive it'd usually have at least one big flaw in it. I remember at one point it gave me a really good looking pasta dish, except for the MINOR detail that one of the ingredients was vermiculite ;)

Still, the sparks of where we were headed were obvious.

Comment Re:Safeguards (Score 2) 40

You seem not to understand how models are trained. There's two separate stages: creating the foundation, and performing the finetune.

The foundation is what takes the overwhelming majority of computational work. This is unsupervized. People aren't putting forth a bunch of questions and "proper answers for the AI to learn". It's just reems and reems of data from common crawl, etc. Certain sources may be stressed more - for example, scientific journals vs. 4chan or whatnot. But nobody is going through and deciding at a base level what data to train the model on.

The foundation learns to predict the next work in any text it comes to; that's what it's tasked with.. But it turns out, words don't exist in a vacuum; in order to perform better than e.g. Markov-Chain text predictors, you have to build up an underlying model of how the world that led to the creation of this text works. If you need to accurately continue, say, "The odds of a citizen of Ghana conducting a major terrorist attack in Ireland over the next 20 years are approximately...", there's a lot of things you need to understand in order to have any remote chance of getting something close to a realistic answer. In short, virtually all of the "learning" about the world happens during this unsupervised training process.

What you get out of it is a foundational model. But all it knows how to do is text completion. You can sort of trick them into performing your queries, but they're not at all covenient. You might lead off, "What is the capitol of Brazil?" and it might continue, say, "It's a question that I asked myself as I started planning my vacation. My husband Jim and I were setting out to travel to all of the world's capitols...." This is not the behavior that we want! Hence, finetuning.

With finetuning, we further train the foundation with supervised data - a bunch of examples of the user asking a question and the model giving an appropriate answer. The amount of supervised data is vastly smaller than unsupervised, and the training process might take only a day or so. It simply doesn't have a chance to "learn" much from the data, except for how to respond. The knowledge it has comes from the underlying foundational model. The only thing it learns from the finetune is the chat format and what sort of personality to present.

It is in the finetune that you add "safeguards". You give examples of questions like, "Tell me how to make a bomb." and answers like "I'm sorry, but I can't help you with potentially violent and illegal action." Again, it's not learning the specifics from its finetune, just the concept that it should refuse requests to help with certain things.

So can you train a conservative or liberal model with your finetune? Absolutely! You can readily teach it that it should behave in any manner. Want a fascist model? Give it examples of responses like a fascist. Want a maoist model? Same deal. But here's the key point: the knowledge that it has available to it has nothing to do with the finetune. That knowledge was learned via unsupervised learning.

Lastly: the reason the finetunes (not the underlying knowledge) have safeguards is to make them "PG". As a general rule, companies don't give much less of a rat's arse about actual politics as they do about getting sued or boycotted. They don't want their models complying with your request to, say, write an angry bigoted rant about disabled children, not because "they hate free speech", but rather because they don't want the backlash when you post your bigoted rant online and tell people that it was their tool that made it. It's pure self-interest.

That said: most models are open. And as soon as it appears on Huggingface, people just re-finetune with an uncensored supervised dataset. And since all the *knowledge* is in the underlying foundation, just a day or so finetuning on an uncensored dataset will make the model more than happy to help you make a bomb or make fun of disabled children or whatever the heck you want.

Slashdot Top Deals

The game of life is a game of boomerangs. Our thoughts, deeds and words return to us sooner or later with astounding accuracy.

Working...