This is a great idea. But what are the current limitations? Current models used in probabilistic perception operate only at the surface level.

This idea arises in many perception tasks. In vision, we can have high-level models of activities, and of how people interact with each other, giving us much better models for visual tracking. However, I don’t have much time left, so I’ll spend just a couple of slides talking about it in the context of natural language processing.