Accounting for Burstiness in Topic Models (2009)

Authors

Abstract

Topic models are used in a variety of tasks to great success. However, even state-of-the-art topic models suffer from an important flaw. They do not capture the tendency of words to appear in bursts: if a word appears once in a document, it is more likely to appear again. We introduce a topic model that uses Dirichlet compound multinomials to account for this burstiness. This model outperforms the standard LDA model at modeling text and non-text data, and can be incorporated into more complex topic models as well.
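As a rough illustration of the burstiness described above, the sketch below computes the next-word probability under a single Dirichlet compound multinomial (a Polya urn), where each occurrence of a word raises the chance of seeing it again. This is not the paper's full topic model. The function name, the toy vocabulary, and the parameter values are assumptions chosen for illustration.

```python
def dcm_next_word_prob(alpha, counts, word):
    """Predictive probability of `word` under a Dirichlet compound multinomial.

    alpha  : dict mapping each vocabulary word to its Dirichlet parameter
    counts : dict mapping words to how often they have appeared so far
    """
    total_alpha = sum(alpha.values())
    total_count = sum(counts.values())
    # Polya urn rule: (alpha_w + n_w) / (sum of alphas + total words seen)
    return (alpha[word] + counts.get(word, 0)) / (total_alpha + total_count)


# Hypothetical toy vocabulary; small alphas make the distribution bursty.
alpha = {"topic": 0.1, "model": 0.1, "burst": 0.1, "word": 0.1}

p_first = dcm_next_word_prob(alpha, {}, "burst")            # before any occurrence
p_again = dcm_next_word_prob(alpha, {"burst": 1}, "burst")  # after one occurrence
print(p_first, p_again)
```

With these assumed parameters the probability of "burst" rises after its first occurrence, whereas a plain multinomial would assign it the same probability at every position.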

paper/2009/162.txt · Last modified: 2009/05/24 17:42 (external edit)
 