This idea of using a relational probabilistic language for disambiguation was first proposed by Goldman and Charniak. They got stuck in a place where AI often gets stuck: knowledge acquisition. Where do these models come from? Well, perhaps learning can come to our rescue.

Current models operate at the syntactic level, e.g., word-word correlations. These capture a surprising amount of semantic information, but because that information is tied to surface words, its applicability is substantially limited. As a simple example, we can learn that a "teacher" often appears in the same sentence as "train", and use that correlation to disambiguate some sentences. But we cannot use it when the sentence instead mentions "Sarah", whom we happen to know, from previous sentences, is a teacher.
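To make the limitation concrete, here is a minimal sketch of a word-word co-occurrence disambiguator. The toy corpus, the two senses of "train", and the scoring function are all invented for illustration, not taken from Goldman and Charniak's model:

```python
from collections import Counter

# Toy training corpus: (sentence words, sense of "train" in that sentence).
# Invented data for illustration.
corpus = [
    (["the", "teacher", "will", "train", "the", "students"], "instruct"),
    (["a", "coach", "can", "train", "athletes"], "instruct"),
    (["the", "train", "left", "the", "station"], "vehicle"),
    (["we", "boarded", "the", "train", "at", "noon"], "vehicle"),
]

# Count how often each context word co-occurs with each sense of "train".
cooc = Counter()
sense_counts = Counter()
for words, sense in corpus:
    sense_counts[sense] += 1
    for w in set(words):
        if w != "train":
            cooc[(w, sense)] += 1

def context_scores(sentence):
    """Score each sense by summed co-occurrence counts of the context words."""
    scores = {sense: 0 for sense in sense_counts}
    for sense in scores:
        for w in set(sentence):
            if w != "train":
                scores[sense] += cooc[(w, sense)]
    return scores

# Works when the informative word appears literally in the sentence:
# "teacher" pushes the score toward the "instruct" sense.
print(context_scores(["the", "teacher", "will", "train", "them"]))

# Fails on the "Sarah" case: "Sarah" never co-occurred with either sense,
# so both scores are zero, even if earlier sentences told us she is a teacher.
print(context_scores(["Sarah", "must", "train", "them"]))
```

The second call is exactly the failure described above: the knowledge "Sarah is a teacher" lives at the semantic level, and a purely syntactic co-occurrence model has no way to apply it.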