Welcome to P K Kelkar Library, Online Public Access Catalogue (OPAC)

Mining latent entity structures /

By: Wang, Chi [author.].
Contributor(s): Han, Jiawei [author.].
Material type: Book
Series: Synthesis digital library of engineering and computer science ; Synthesis lectures on data mining and knowledge discovery, # 10
Publisher: San Rafael, California (1537 Fourth Street, San Rafael, CA 94901 USA) : Morgan & Claypool, 2015
Description: 1 PDF (xi, 147 pages) : illustrations
Content type: text
Media type: electronic
Carrier type: online resource
ISBN: 9781627056618
Subject(s): Data mining | Latent structure analysis | information networks | text mining | link analysis | topic modeling | phrase extraction | role discovery | clustering | ranking | relationship mining | probabilistic models | real-world applications | efficient and scalable algorithms
DDC classification: 006.312
Online resources: Abstract with links to resource
Also available in print.
Contents:
1. Introduction -- 1.1 Motivation -- 1.2 Data model: a text-rich heterogeneous information network model -- 1.3 Latent entity structure -- 1.4 The mining framework -- 1.4.1 Hierarchical topic and community discovery -- 1.4.2 Topical phrase mining -- 1.4.3 Entity topical role analysis -- 1.4.4 Entity relationship mining --
2. Hierarchical topic and community discovery -- 2.1 Generative model for text or homogeneous networks -- 2.2 Generative model for heterogeneous network -- 2.2.1 The basic model -- 2.2.2 Learning link-type weights -- 2.2.3 Shape of hierarchy -- 2.3 Empirical analysis -- 2.3.1 Efficacy of subtopic discovery -- 2.3.2 Topical hierarchy quality -- 2.3.3 Case study --
3. Topical phrase mining -- 3.1 Criteria of good phrases and topical phrases -- 3.2 KERT: mining phrases in short, content-representative text -- 3.2.1 Phrase quality -- 3.2.2 Topical phrase quality -- 3.3 ToPMine: mining phrases in general text -- 3.3.1 Frequent phrase mining -- 3.3.2 Segmentation and phrase filtering -- 3.3.3 Topical phrase ranking -- 3.4 Empirical analysis -- 3.4.1 The impact of the four criteria -- 3.4.2 Comparison of mining methods -- 3.4.3 Scalability --
4. Entity topical role analysis -- 4.1 Role of given entities -- 4.1.1 Entity specific phrase ranking -- 4.1.2 Distribution over subtopics -- 4.1.3 Case study -- 4.2 Entities of given roles --
5. Mining entity relations -- 5.1 Unsupervised hierarchical relation mining -- 5.1.1 Notations -- 5.1.2 Assumptions and framework -- 5.1.3 Stage 1: preprocessing -- 5.1.4 Stage 2: TPFG model -- 5.1.5 Model inference -- 5.1.6 Empirical analysis -- 5.2 Supervised hierarchical relation mining -- 5.2.1 Conditional random field for hierarchical relationship -- 5.2.2 Potential function design -- 5.2.3 Model inference and learning -- 5.2.4 Empirical analysis -- 5.3 Semi-supervised co-profiling -- 5.3.1 Observations -- 5.3.2 Model -- 5.3.3 Inference algorithm -- 5.3.4 Empirical analysis --
6. Scalable and robust topic discovery -- 6.1 Latent dirichlet allocation with topic tree -- 6.2 The STROD algorithm -- 6.2.1 Moment-based inference -- 6.2.2 Scalability improvement -- 6.2.3 Hyperparameter learning -- 6.3 Empirical analysis -- 6.3.1 Scalability -- 6.3.2 Robustness -- 6.3.3 Interpretability --
7. Application and research frontier -- 7.1 Application -- 7.1.1 Online analytical processing of information networks -- 7.1.2 Social influence and viral marketing -- 7.1.3 Relevance targeting -- 7.2 Research frontier --
Bibliography -- Authors' biographies.
Abstract: The 'big data' era is characterized by an explosion of information in the form of digital data collections, ranging from scientific knowledge, to social media, news, and everyone's daily life. Examples of such collections include scientific publications, enterprise logs, news articles, social media, and general web pages. Valuable knowledge about multi-typed entities is often hidden in the unstructured or loosely structured, interconnected data. Mining latent structures around entities uncovers hidden knowledge such as implicit topics, phrases, entity roles and relationships. In this monograph, we investigate the principles and methodologies of mining latent entity structures from massive unstructured and interconnected data. We propose a text-rich information network model for modeling data in many different domains. This leads to a series of new principles and powerful methodologies for mining latent structures, including (1) latent topical hierarchy, (2) quality topical phrases, (3) entity roles in hierarchical topical communities, and (4) entity relations. This book also introduces applications enabled by the mined structures and points out some promising research directions.
Item type: E books
Current location: PK Kelkar Library, IIT Kanpur
Status: Available
Barcode: EBKE627
Total holds: 0

Mode of access: World Wide Web.

System requirements: Adobe Acrobat Reader.

Part of: Synthesis digital library of engineering and computer science.

Includes bibliographical references (pages 141-145).

Abstract freely available; full-text restricted to subscribers or individual document purchasers.

Indexed in: Compendex; INSPEC; Google Scholar; Google Book Search.

Title from PDF title page (viewed on April 26, 2015).
