An AI / LLM is a database you can talk to (query) using natural language. It is an amazing achievement, famously fooling one of Google's own software engineers (Blake Lemoine) into believing the machinery was sentient.
But it's still a database. It's trained for weeks on a dataset to set the weights in the neural network. Tokens from the prompt then pass through the (static) neural network over and over, building the response one token at a time ("inference").
The problem is that as the tokens cycle through the neural network, building the response by filtering through the weights, it's impossible for a human to know exactly what it's doing, specifically how it's reasoning its way to its conclusions. That's where the field of Explainable AI comes in.
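To make the "static network, repeated passes" idea concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the `WEIGHTS` table stands in for a trained network, and `generate` is a hypothetical helper, not any real library's API. A real LLM looks at the whole context rather than just the last token, but the loop has the same shape: score the candidates, pick one, append it, go again.

```python
# Toy stand-in for a trained model: a fixed table of next-token scores.
# The table plays the role of the frozen weights; nothing changes at query time.
WEIGHTS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 0.8, "sat": 0.2},
    "sat": {"<end>": 1.0},
    "ran": {"<end>": 1.0},
}


def generate(prompt_tokens, max_new_tokens=10):
    """Greedy decoding: keep appending the highest-scoring next token."""
    tokens = list(prompt_tokens)
    if not tokens:
        return tokens  # nothing to condition on
    for _ in range(max_new_tokens):
        scores = WEIGHTS.get(tokens[-1])  # simplification: only the last token matters
        if scores is None:
            break  # token the table has never seen
        next_token = max(scores, key=scores.get)
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return tokens


print(generate(["the"]))  # ['the', 'cat', 'sat']
```

Run it twice with the same prompt and you get the same answer, because the table never changes between queries.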
To help people get a handle on AI, here's how these services are priced, which is by the token:
https://learn.microsoft.com/en...
https://help.openai.com/en/art...
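As a rough sketch of what token-based pricing means in practice: providers commonly charge one rate for prompt (input) tokens and another for generated (output) tokens. The function name and the rates below are hypothetical, picked only to make the arithmetic easy to follow; check the pages above for real numbers.

```python
def estimate_cost(input_tokens, output_tokens, input_price_per_1k, output_price_per_1k):
    """Estimated charge for one request, in whatever currency the prices are in."""
    input_cost = (input_tokens / 1000) * input_price_per_1k
    output_cost = (output_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost


# Hypothetical rates, NOT taken from any provider's price list.
cost = estimate_cost(
    input_tokens=1500,
    output_tokens=500,
    input_price_per_1k=0.01,
    output_price_per_1k=0.03,
)
print(f"{cost:.4f}")  # 0.0300
```

So under those assumed rates, a 1,500-token prompt with a 500-token reply comes to about three cents.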
So that's what we have: a bunch of weights in a neural network (set by weeks of continuous training on a dataset), and tokens extracted from prompts that filter through that network to build the output. Is the possibility of sentience in there? Consciousness? What's the core action being taken? How exactly is a token response built?
Here's a description from IBM:
During the training process, these models learn to predict the next word in a sentence based on the context provided by the preceding words. The model does this through attributing a probability score to the recurrence of words that have been tokenized— broken down into smaller sequences of characters. These tokens are then transformed into embeddings, which are numeric representations of this context.
To ensure accuracy, this process involves training the LLM on a massive corpora of text (in the billions of pages), allowing it to learn grammar, semantics and conceptual relationships through zero-shot and self-supervised learning. Once trained on this training data, LLMs can generate text by autonomously predicting the next word based on the input they receive, and drawing on the patterns and knowledge they've acquired. The result is coherent and contextually relevant language generation that can be harnessed for a wide range of NLU and content generation tasks.
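Here is a heavily simplified sketch of the "learn to predict the next word" part of that description. It is an assumption-laden toy: it counts which word follows which in a three-sentence corpus and turns the counts into probabilities, whereas a real LLM adjusts neural network weights by gradient descent over a vastly larger corpus. The `train_bigram` name and the corpus are made up. What it does show is why this is called self-supervised: the "right answer" for each position is just the next word in the text, so nobody has to label anything.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count which word follows which, then normalize counts into probabilities."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()  # crude whitespace "tokenizer"
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1  # the label is simply the next word
    return {
        prev: {word: n / sum(following.values()) for word, n in following.items()}
        for prev, following in counts.items()
    }


corpus = ["the cat sat", "the cat ran", "the dog ran"]
model = train_bigram(corpus)
print(model["the"])  # "cat" gets about 2/3, "dog" about 1/3
print(model["dog"])  # {'ran': 1.0}
```

Once `train_bigram` returns, the table is fixed, which is the toy version of "once it's trained, it's trained".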
Once it's trained, it's trained. The neural network and its weights are set. Then it's time to query / prompt.
It's amazing stuff. That the machinery can do this is astonishing. But it's feeding tokens through a trained, static neural network.
Now... right now it's trained on binary data, audio, video, images, text. Prompts are tokenized from incoming text strings. The technology is in its infancy. Could you program something around this core system to make it a decision-making platform that could be placed in a robot, one that could navigate its environment and decide what to do? I think that's coming. That would require being able to tokenize the world around it. I suspect it would take a vast amount of training data, possibly beyond current computing and electrical power capabilities. This is nascent technology and there's a long road ahead.
On the other hand... quantum computing, fusion... these have promise too, but engineering constraints limit how much of that promise can be realized. So one needs to have a balanced view. We're just at the beginning though. Relational databases were introduced in the early-to-mid 70s. This technology has been introduced just now, so who knows what it'll look like in 50 years.
Disclaimer: I'm not remotely an LLM / AI expert. I read the CACM and think about this stuff, but there are lots of people out there actually programming these things, and I'm not one of them. But I am interested, and I think I have it right.