Re: psychiatrist for AI (Score: 1)
LLMs absolutely, without question, do not learn the way you seem to think they do. They do not learn from having conversations. They do not learn by being presented with text in a prompt, though if your experience is limited to chatbots you could be forgiven for mistakenly thinking that was the case. Neural networks are not artificial brains. They have no mechanism by which they can 'learn by experience'. They 'learn' by having an external program modify their weights in response to the difference between their output and the expected output for a given input.
This is "absolutely without question" incorrect. One of the most useful properties of LLMs is demonstrated in-context learning capabilities where a good instruction tuned model is able to learn from conversations and information provided to it without modifying model weights.
It might also interest you to know that the model itself is completely deterministic. Given an input, it will always produce the same output. The trick is that the model doesn't actually produce a next token, but a list of probabilities for the next token. The actual token is selected probabilistically, which is why you'll get different responses despite the model being completely deterministic.
Who cares? That's an oddly specific distinction without a difference, and it has no bearing on anything stated in this thread. The randomly selected token goes back into the context, so it lands in the KV cache and changes the evaluation of every subsequent token.
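If you want to see how little work "the model is deterministic" is doing, here's a toy sketch in plain NumPy (the names are mine and have nothing to do with any particular model): the forward pass always yields the same logits for the same input, but one random draw from those logits goes back into the context and changes everything computed after it.

```python
# Toy temperature sampling: deterministic logits in, probabilistic token out.
import numpy as np

rng = np.random.default_rng()

def sample_next_token(logits: np.ndarray, temperature: float = 0.8) -> int:
    """Turn a fixed logit vector into a random token choice."""
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())   # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

logits = np.array([2.0, 1.0, 0.1])   # toy logits for a 3-token vocabulary
# The chosen token is appended to the context, so the *next* forward pass
# (and its key/value entries) depends on this single random draw.
print(sample_next_token(logits))
```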
Remember that each token is produced essentially in isolation. The model doesn't work out a solution first and carefully craft a response; it produces tokens one at a time, without retaining any internal state between them.
This is pure BS; the key/value cache is maintained throughout generation.
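Here is a rough sketch of exactly the state the parent says isn't retained, using the Hugging Face transformers API (gpt2 again only because it's small): the attention keys and values computed for earlier tokens are cached and reused on every later step.

```python
# Incremental decoding with a key/value cache: state carried between tokens.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("The quick brown fox", return_tensors="pt").input_ids

with torch.no_grad():
    # First pass: run the whole prompt and keep the attention key/value cache.
    out = model(ids, use_cache=True)
    past = out.past_key_values

    # Later passes: feed only the newly chosen token; the cached keys and
    # values carry everything computed for the tokens before it.
    next_id = out.logits[:, -1].argmax(dim=-1, keepdim=True)
    out = model(next_id, past_key_values=past, use_cache=True)
```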
That's a very misleading term. The model isn't on mushrooms. (Remember that the model proper is completely deterministic.)
Again with the determinism nonsense.
A so-called 'hallucination' in an LLM's output just means that the output is factually incorrect. As LLMs do not operate on facts and concepts but on statistical relationships between tokens, there is no operational difference between a 'correct' response and a 'hallucination'. Both kinds of output are produced the same way, by the same process. A 'hallucination' isn't the model malfunctioning, but an entirely expected result of the model operating correctly.
LOL, see, the program isn't malfunctioning, it's just doing what it was programmed to do. These word games are pointless.