Comment Re:weak (Score 1) 82
It's important to add to your simplified explanation that you'd also store the frequency of each word pair. In n-grams, a primitive form of language prediction model, that's all you need to store for n=2. But for it to be useful, you'd want a value more like 4 or 5 at least. With n=3, in addition to the frequency of word pairs, you'd store the frequency of word triplets (three words in sequence); n=4 is four words in sequence, and so on. The size of the data store grows exponentially with n (the number of possible sequences is roughly the vocabulary size to the nth power), but the realism of the output starts to get decent around 3 or 4, for a very simple algorithm that isn't even AI. Anyway. That's all.
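For anyone who wants to see it concretely, here's a rough sketch of the frequency table described above, in Python. The function names (build_ngram_counts, generate) and the toy corpus are made up for illustration, and it assumes n is at least 2 and that the text is already split into words. It's a sketch, not how any real system does it.

```python
import random
from collections import Counter, defaultdict


def build_ngram_counts(words, n):
    """Map each (n-1)-word context to a Counter of the words that follow it.

    Hypothetical helper for illustration; assumes n >= 2.
    """
    counts = defaultdict(Counter)
    for i in range(len(words) - n + 1):
        context = tuple(words[i:i + n - 1])
        next_word = words[i + n - 1]
        counts[context][next_word] += 1
    return counts


def generate(counts, seed, length, rng=None):
    """Extend seed by sampling each next word in proportion to its frequency.

    The seed must contain exactly n-1 words (the context size of counts).
    Stops early if the current context was never seen in the corpus.
    """
    rng = rng or random.Random()
    out = list(seed)
    k = len(seed)
    for _ in range(length):
        followers = counts.get(tuple(out[-k:]))
        if not followers:
            break
        choices, weights = zip(*followers.items())
        out.append(rng.choices(choices, weights=weights)[0])
    return out


# Toy corpus, invented for this example.
corpus = "the cat sat on the mat and the cat ate the fish".split()

bigrams = build_ngram_counts(corpus, 2)   # n=2: word pairs
trigrams = build_ngram_counts(corpus, 3)  # n=3: word triplets

print(bigrams[("the",)])         # how often each word follows "the"
print(trigrams[("the", "cat")])  # how often each word follows "the cat"
print(" ".join(generate(bigrams, ("the",), 8)))
```

Each larger n needs its own table keyed on a longer context, which is where the storage blowup comes from. On a corpus this tiny the output is mostly the input parroted back; you need a lot more text before n=3 or n=4 produces anything interesting.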