000 06362nam a2200769 i 4500
001 7084069
003 IEEE
005 20200413152917.0
006 m eo d
007 cr cn |||m|||a
008 150426s2015 caua foab 000 0 eng d
020 _a9781627056618
_qebook
020 _z9781627056601
_qprint
024 7 _a10.2200/S00625ED1V01Y201502DMK010
_2doi
035 _a(CaBNVSL)swl00404858
035 _a(OCoLC)908031780
040 _aCaBNVSL
_beng
_erda
_cCaBNVSL
_dCaBNVSL
050 4 _aQA76.9.D343
_bW255 2015
082 0 4 _a006.312
_223
100 1 _aWang, Chi,
_eauthor.
245 1 0 _aMining latent entity structures /
_cChi Wang, Jiawei Han.
264 1 _aSan Rafael, California (1537 Fourth Street, San Rafael, CA 94901 USA) :
_bMorgan & Claypool,
_c2015.
300 _a1 PDF (xi, 147 pages) :
_billustrations.
336 _atext
_2rdacontent
337 _aelectronic
_2isbdmedia
338 _aonline resource
_2rdacarrier
490 1 _aSynthesis lectures on data mining and knowledge discovery,
_x2151-0075 ;
_v# 10
538 _aMode of access: World Wide Web.
538 _aSystem requirements: Adobe Acrobat Reader.
500 _aPart of: Synthesis digital library of engineering and computer science.
504 _aIncludes bibliographical references (pages 141-145).
505 0 _a1. Introduction -- 1.1 Motivation -- 1.2 Data model: a text-rich heterogeneous information network model -- 1.3 Latent entity structure -- 1.4 The mining framework -- 1.4.1 Hierarchical topic and community discovery -- 1.4.2 Topical phrase mining -- 1.4.3 Entity topical role analysis -- 1.4.4 Entity relationship mining --
505 8 _a2. Hierarchical topic and community discovery -- 2.1 Generative model for text or homogeneous networks -- 2.2 Generative model for heterogeneous network -- 2.2.1 The basic model -- 2.2.2 Learning link-type weights -- 2.2.3 Shape of hierarchy -- 2.3 Empirical analysis -- 2.3.1 Efficacy of subtopic discovery -- 2.3.2 Topical hierarchy quality -- 2.3.3 Case study --
505 8 _a3. Topical phrase mining -- 3.1 Criteria of good phrases and topical phrases -- 3.2 KERT: mining phrases in short, content-representative text -- 3.2.1 Phrase quality -- 3.2.2 Topical phrase quality -- 3.3 ToPMine: mining phrases in general text -- 3.3.1 Frequent phrase mining -- 3.3.2 Segmentation and phrase filtering -- 3.3.3 Topical phrase ranking -- 3.4 Empirical analysis -- 3.4.1 The impact of the four criteria -- 3.4.2 Comparison of mining methods -- 3.4.3 Scalability --
505 8 _a4. Entity topical role analysis -- 4.1 Role of given entities -- 4.1.1 Entity specific phrase ranking -- 4.1.2 Distribution over subtopics -- 4.1.3 Case study -- 4.2 Entities of given roles --
505 8 _a5. Mining entity relations -- 5.1 Unsupervised hierarchical relation mining -- 5.1.1 Notations -- 5.1.2 Assumptions and framework -- 5.1.3 Stage 1: preprocessing -- 5.1.4 Stage 2: TPFG model -- 5.1.5 Model inference -- 5.1.6 Empirical analysis -- 5.2 Supervised hierarchical relation mining -- 5.2.1 Conditional random field for hierarchical relationship -- 5.2.2 Potential function design -- 5.2.3 Model inference and learning -- 5.2.4 Empirical analysis -- 5.3 Semi-supervised co-profiling -- 5.3.1 Observations -- 5.3.2 Model -- 5.3.3 Inference algorithm -- 5.3.4 Empirical analysis --
505 8 _a6. Scalable and robust topic discovery -- 6.1 Latent Dirichlet allocation with topic tree -- 6.2 The STROD algorithm -- 6.2.1 Moment-based inference -- 6.2.2 Scalability improvement -- 6.2.3 Hyperparameter learning -- 6.3 Empirical analysis -- 6.3.1 Scalability -- 6.3.2 Robustness -- 6.3.3 Interpretability --
505 8 _a7. Application and research frontier -- 7.1 Application -- 7.1.1 Online analytical processing of information networks -- 7.1.2 Social influence and viral marketing -- 7.1.3 Relevance targeting -- 7.2 Research frontier --
505 8 _aBibliography -- Authors' biographies.
506 1 _aAbstract freely available; full-text restricted to subscribers or individual document purchasers.
510 0 _aCompendex
510 0 _aINSPEC
510 0 _aGoogle scholar
510 0 _aGoogle book search
520 3 _aThe 'big data' era is characterized by an explosion of information in the form of digital data collections, ranging from scientific knowledge, to social media, news, and everyone's daily life. Examples of such collections include scientific publications, enterprise logs, news articles, social media, and general web pages. Valuable knowledge about multi-typed entities is often hidden in the unstructured or loosely structured, interconnected data. Mining latent structures around entities uncovers hidden knowledge such as implicit topics, phrases, entity roles and relationships. In this monograph, we investigate the principles and methodologies of mining latent entity structures from massive unstructured and interconnected data. We propose a text-rich information network model for modeling data in many different domains. This leads to a series of new principles and powerful methodologies for mining latent structures, including (1) latent topical hierarchy, (2) quality topical phrases, (3) entity roles in hierarchical topical communities, and (4) entity relations. This book also introduces applications enabled by the mined structures and points out some promising research directions.
530 _aAlso available in print.
588 _aTitle from PDF title page (viewed on April 26, 2015).
650 0 _aData mining.
650 0 _aLatent structure analysis.
653 _ainformation networks
653 _atext mining
653 _alink analysis
653 _atopic modeling
653 _aphrase extraction
653 _arole discovery
653 _aclustering
653 _aranking
653 _arelationship mining
653 _aprobabilistic models
653 _areal-world applications
653 _aefficient and scalable algorithms
700 1 _aHan, Jiawei,
_eauthor.
776 0 8 _iPrint version:
_z9781627056601
830 0 _aSynthesis digital library of engineering and computer science.
830 0 _aSynthesis lectures on data mining and knowledge discovery ;
_v# 10.
_x2151-0075
856 4 2 _3Abstract with links to resource
_uhttp://ieeexplore.ieee.org/servlet/opac?bknumber=7084069
999 _c562127
_d562127