Let us consider the more
limited task of simply recognizing which webpage corresponds to which type of
entity. The standard approach is
to classify each webpage into one of several categories (faculty, student,
project, etc.), using the words on the page as features, for example with a
naïve Bayes model.
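
As a rough illustration of this flat, word-based approach, the following is a minimal sketch of a multinomial naïve Bayes classifier over page words. The toy pages, the function names train_naive_bayes and classify, and the add-one smoothing parameter alpha are all assumptions made for illustration; they are not taken from the text.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels, alpha=1.0):
    """Fit a multinomial naive Bayes model on tokenized pages.

    docs is a list of word lists, labels the matching category names.
    alpha is an additive (Laplace) smoothing constant, an assumption here.
    """
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in zip(docs, labels):
        word_counts[label].update(words)
        vocab.update(words)

    model = {"log_prior": {}, "log_likelihood": {}}
    n_docs = len(labels)
    for c, n_c in class_counts.items():
        # log P(c): fraction of training pages in category c.
        model["log_prior"][c] = math.log(n_c / n_docs)
        # log P(w | c) with additive smoothing over the whole vocabulary.
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        model["log_likelihood"][c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return model


def classify(model, words):
    """Return the category maximizing log P(c) + sum of log P(w | c)."""
    scores = {}
    for c, log_prior in model["log_prior"].items():
        log_lik = model["log_likelihood"][c]
        # Words never seen in training are skipped rather than scored.
        scores[c] = log_prior + sum(log_lik[w] for w in words if w in log_lik)
    return max(scores, key=scores.get)


# Hypothetical toy training pages, one string of page words per page.
pages = [
    ("professor research teaching publications students", "faculty"),
    ("professor courses publications office hours", "faculty"),
    ("student advisor resume courses homework", "student"),
    ("graduate student advisor thesis hobbies", "student"),
    ("project members funding goals software", "project"),
    ("project overview members publications software", "project"),
]
docs = [text.split() for text, _ in pages]
labels = [label for _, label in pages]
model = train_naive_bayes(docs, labels)

print(classify(model, "professor publications teaching".split()))
```

Note that each page is classified from its own words alone; nothing in this sketch uses the links between pages.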