Re:Well said (Score 3)
It's "obvious" for people who doesn't understand how LLMs work
Most people do not understand what is at the heart of an LLM, and as such, doesn't understand its limitations.
I'll simplify a bit for the sake of keeping this post short, but in essence, the "magic" part of an LLM is a model that, when given input as tokens (words or parts of words), returns a probability for each possible next word.
The model itself is stateless: it will always return the same words with the same probabilities given the same input (the short sketch below shows exactly this).
It doesn't learn.
It doesn't remember.
It doesn't think.
It doesn't reason.
It doesn't understand.
It only creates a list of candidate next words, each with an attached probability.
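If you want to see how little "magic" there is, the core fits in a few lines of Python using the Hugging Face transformers library (GPT-2 and the example prompt here are just stand-ins I picked; any causal LM looks the same):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # gpt2 is just small enough to run anywhere; the principle is identical for big models.
    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    def next_token_probs(text):
        ids = tok(text, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits          # shape: (1, seq_len, vocab_size)
        # The model's entire output: a probability for every possible next token.
        return torch.softmax(logits[0, -1], dim=-1)

    # Same input, same weights -> same distribution, every time. No state, no memory.
    probs = next_token_probs("The capital of France is")
    for p, i in zip(*torch.topk(probs, 5)):
        print(repr(tok.decode(int(i))), float(p))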
Of course the "wrapper" around this Magic feeds all previous conversation with the addition of the new word, to generate the next word, and keep on doing that until an answer is complete. That is why it's still power hungry and time-consuming. Having the option to adjust temperature (not just picking the most probable word) and randomness/seed (don't generate the same list of words).
Right now, with the "reasoning" models, we are going down the same path we did with XML back in the day: "If you can't solve it with XML, you're not using enough XML." Basically, the complete output of the LLM is fed back into itself (or into a different model) for one more pass, to see if the training data can match some outlier or unusual text.
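In sketch form, that extra pass is roughly this (a toy illustration built on the generate() function above, not any vendor's actual pipeline):

    def answer_with_review(question, passes=2):
        draft = generate(question)                   # first pass: ordinary generation
        for _ in range(passes - 1):
            # "Reasoning": feed the model's own output back in and ask for another try.
            prompt = (f"Question: {question}\n"
                      f"Draft answer: {draft}\n"
                      "Review the draft and give an improved answer:")
            draft = generate(prompt)
        return draft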
Training these models takes the power of a small city. The technology is almost at its peak: it will improve, of course, but not significantly. If something other than an LLM comes along, then we can revisit the topic.
Just to be clear: I use GitHub Copilot both in a professional capacity and for my hobby coding projects. I also have a paid ChatGPT subscription that helps me look into things I want to learn more about. It's fantastic technology, and it can really help. But it doesn't do all the things people think it will do.