Text Mining with Probabilistic Topic Models

Applications in Information Retrieval and Concept Modeling

LAP Lambert Academic Publishing (14.09.2010)

€ 59,00

Statistical topic models are a class of probabilistic latent variable models for textual data that represent text documents as distributions over topics. These models have been shown to produce interpretable summaries of documents in the form of topics. In this book, we describe how the statistical topic modeling framework can be used for information retrieval tasks and for integrating background knowledge in the form of semantic concepts. We first describe special-words topic models, in which a document is represented as a combination of (i) a mixture of shared topics, (ii) a special-words distribution specific to the document, and (iii) a corpus-level background distribution, and we describe the utility of these models for information retrieval tasks. We then turn to the problem of integrating background knowledge in the form of semantic concepts into the topic modeling framework. To combine data-driven topics and semantic concepts, we describe the concept-topic model and the hierarchical concept-topic model, which represent a document as a distribution over data-driven topics and semantic concepts.
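The three-part document representation mentioned in the description can be illustrated with a small sketch. It assumes that the three components are combined through per-document mixing weights, which the description does not state explicitly. The function name, parameter names, and all numbers are hypothetical and chosen only for illustration; this is not the book's implementation.

```python
from typing import Dict, List


def word_probability(
    word: str,
    route_weights: List[float],          # assumed per-document weights: [topics, special words, background]
    topic_weights: List[float],          # the document's distribution over shared topics
    topic_word: List[Dict[str, float]],  # one word distribution per shared topic
    special_word: Dict[str, float],      # word distribution specific to this document
    background_word: Dict[str, float],   # corpus-level background word distribution
) -> float:
    """Probability of `word` under a three-component mixture (illustrative sketch)."""
    # (i) mixture of shared topics: sum over topics of p(topic | doc) * p(word | topic)
    topic_part = sum(
        weight * dist.get(word, 0.0)
        for weight, dist in zip(topic_weights, topic_word)
    )
    # (ii) document-specific special words and (iii) corpus-level background
    special_part = special_word.get(word, 0.0)
    background_part = background_word.get(word, 0.0)
    return (
        route_weights[0] * topic_part
        + route_weights[1] * special_part
        + route_weights[2] * background_part
    )


# Toy values, invented for this example only.
topic_word = [
    {"model": 0.5, "topic": 0.5},
    {"query": 0.5, "search": 0.5},
]
special_word = {"lambert": 1.0}
background_word = {"the": 0.5, "model": 0.5}
route_weights = [0.6, 0.1, 0.3]
topic_weights = [0.5, 0.5]

p_model = word_probability(
    "model", route_weights, topic_weights,
    topic_word, special_word, background_word,
)
print(f"p(model | doc) = {p_model:.3f}")
```

In this sketch a word that is rare in the corpus but frequent in one document can receive its probability from the special-words component rather than from a shared topic, which is one way to read the description's claim that the model is useful for retrieval.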

Book details:

ISBN-13:

978-3-8383-6410-0

ISBN-10:

3838364104

EAN:

9783838364100

Book language:

English

Author:

Chaitanya Chemudugunta

Number of pages:

140

Publication date:

14.09.2010

Category:

Informatics