AI Qualifying Exam Reading List Associated with CS 769 - ADVANCED NATURAL LANGUAGE PROCESSING

For Fall 2009 and Later Exams

Topics

  1. Language Modeling: Probability theory. Bayes' theorem. Maximum likelihood and MAP estimators. Bernoulli, multinomial, beta, Dirichlet distributions. n-grams, smoothing.
    [LN. Bishop06, sections 1.2, 2.1, 2.2.  MS99, chapter 6]
  2. Information Theory: Entropy, mutual information, KL divergence, cross entropy, entropy rate.
    [LN. Bishop06, section 1.6, MS99, section 2.2]
  3. Information Retrieval: ad-hoc retrieval, precision, recall, F measure, vector space model, cosine similarity, tf.idf, Hub-Authority, PageRank.
    [LN. MS99, sections 15.1, 15.2]
  4. Text Classification: decision theory, naive Bayes, logistic regression, support vector machines, the EM algorithm.
    [LN. Bishop06, sections 1.5, 4.3, 7.1, 8.1, chapter 9]
  5. Latent Topic Models
    [LN. Blei03. GS04]
  6. Spectral Clustering
    [LN. vonLuxburg07]
  7. Inference in Graphical Models: Bayesian networks, conditional independence, Markov random fields, sum-product algorithm, max-sum algorithm.
    [LN. Bishop06, chapter 8]
  8. Hidden Markov Models
    [LN. Bishop06, section 13.2. Rabiner89. MS99, chapter 9]
  9. Conditional Random Fields
    [LN. Sutton06]

References

[Bishop06]
Christopher M. Bishop, Pattern Recognition and Machine Learning. Springer Verlag, 2006.
[Blei03]
D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, January 2003.
[GS04]
T. Griffiths and M. Steyvers. Finding scientific topics. Proceedings of the National Academy of Sciences, 101(suppl. 1):5228–5235, 2004.
[LN]
Most topics above are covered by lecture notes, which should be studied.  They are available online at http://www.cs.wisc.edu/~jerryzhu/cs769.html
[MS99]
Christopher D. Manning and Hinrich Schütze. Foundations of Statistical Natural Language Processing. MIT Press, 1999.
[Rabiner89]
Lawrence R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–286, 1989. (See also the erratum by Ali Rahimi.)
[Sutton06]
Charles Sutton and Andrew McCallum. An Introduction to Conditional Random Fields for Relational Learning. In Introduction to Statistical Relational Learning. Edited by Lise Getoor and Ben Taskar. MIT Press. 2006.
[vonLuxburg07]
Ulrike von Luxburg. A tutorial on spectral clustering. Statistics and Computing, 17(4):395–416, December 2007.