The resulting string of symbols[2] is used to train two 4th-order Markov models (Jelinek, 1986). One of these models can predict which symbol will follow any sequence of four symbols, while the other can predict which symbol will precede any such sequence. Markov models express their predictions as a probability distribution over all known symbols, and are therefore capable of choosing likely words over unlikely ones. Models of order 4 were chosen to ensure that the prediction is based on two words; this has been found necessary to produce output resembling natural language (Hutchens, 1994).

Jason Hutchens: how megahal works
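To make the forward/backward idea concrete, here is a minimal sketch in Python. It is not MegaHAL's actual code: the names `train` and `distribution`, the plain count-based probability estimate, and the toy symbol list (words alternating with separator symbols) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

ORDER = 4  # context length in symbols, as described in the quote


def train(symbols, order=ORDER):
    """Build forward and backward count tables (hypothetical helper)."""
    forward = defaultdict(Counter)   # 4-symbol context -> counts of the next symbol
    backward = defaultdict(Counter)  # 4-symbol context -> counts of the previous symbol
    for i in range(len(symbols) - order):
        # Forward: symbols[i:i+order] is followed by symbols[i+order].
        forward[tuple(symbols[i:i + order])][symbols[i + order]] += 1
        # Backward: symbols[i+1:i+1+order] is preceded by symbols[i].
        backward[tuple(symbols[i + 1:i + 1 + order])][symbols[i]] += 1
    return forward, backward


def distribution(model, context):
    """Turn raw counts for one context into a probability distribution."""
    counts = model.get(tuple(context), Counter())
    total = sum(counts.values())
    return {sym: n / total for sym, n in counts.items()} if total else {}


# Toy input: assumed tokenization where words alternate with separators.
symbols = ["the", " ", "cat", " ", "sat", " ", "on", " ",
           "the", " ", "cat", " ", "ran", "."]
forward, backward = train(symbols)

print(distribution(forward, ["the", " ", "cat", " "]))
print(distribution(backward, [" ", "cat", " ", "sat"]))
```

With words and separators alternating like this, a context of four symbols spans two words, which matches the reasoning in the quote. An unseen context simply yields an empty distribution here; a real system would need some fallback for that case.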