000 - LEADER |
fixed length control field |
06511nam a22007091i 4500 |
001 - CONTROL NUMBER |
control field |
8673866 |
003 - CONTROL NUMBER IDENTIFIER |
control field |
IEEE |
005 - DATE AND TIME OF LATEST TRANSACTION |
control field |
20200413152931.0 |
006 - FIXED-LENGTH DATA ELEMENTS--ADDITIONAL MATERIAL CHARACTERISTICS |
fixed length control field |
m eo d |
007 - PHYSICAL DESCRIPTION FIXED FIELD--GENERAL INFORMATION |
fixed length control field |
cr cn |||m|||a |
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION |
fixed length control field |
190402s2019 caua foab 000 0 eng d |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER |
International Standard Book Number |
9781681735207 |
Qualifying information |
electronic |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER |
Canceled/invalid ISBN |
9781681735214 |
Qualifying information |
hardcover |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER |
Canceled/invalid ISBN |
9781681735191 |
Qualifying information |
paperback |
024 7# - OTHER STANDARD IDENTIFIER |
Standard number or code |
10.2200/S00903ED1V01Y201902DMK017 |
Source of number or code |
doi |
035 ## - SYSTEM CONTROL NUMBER |
System control number |
(CaBNVSL)thg00978686 |
035 ## - SYSTEM CONTROL NUMBER |
System control number |
(OCoLC)1091193939 |
040 ## - CATALOGING SOURCE |
Original cataloging agency |
CaBNVSL |
Language of cataloging |
eng |
Description conventions |
rda |
Transcribing agency |
CaBNVSL |
Modifying agency |
CaBNVSL |
050 #4 - LIBRARY OF CONGRESS CALL NUMBER |
Classification number |
QA76.9.D343 |
Item number |
Z536 2019eb |
082 04 - DEWEY DECIMAL CLASSIFICATION NUMBER |
Classification number |
006.312 |
Edition number |
23 |
100 1# - MAIN ENTRY--PERSONAL NAME |
Personal name |
Zhang, Chao |
Titles and words associated with a name |
(Computer scientist), |
Relator term |
author. |
245 10 - TITLE STATEMENT |
Title |
Multidimensional mining of massive text data / |
Statement of responsibility, etc. |
Chao Zhang, Jiawei Han. |
264 #1 - PRODUCTION, PUBLICATION, DISTRIBUTION, MANUFACTURE, AND COPYRIGHT NOTICE |
Place of production, publication, distribution, manufacture |
[San Rafael, California] : |
Name of producer, publisher, distributor, manufacturer |
Morgan & Claypool, |
Date of production, publication, distribution, manufacture, or copyright notice |
[2019] |
300 ## - PHYSICAL DESCRIPTION |
Extent |
1 PDF (xiv, pages) : |
Other physical details |
illustrations. |
336 ## - CONTENT TYPE |
Content type term |
text |
Source |
rdacontent |
337 ## - MEDIA TYPE |
Media type term |
electronic |
Source |
isbdmedia |
338 ## - CARRIER TYPE |
Carrier type term |
online resource |
Source |
rdacarrier |
490 1# - SERIES STATEMENT |
Series statement |
Synthesis lectures on data mining and knowledge discovery, |
International Standard Serial Number |
2151-0067 ; |
Volume/sequential designation |
#17 |
538 ## - SYSTEM DETAILS NOTE |
System details note |
Mode of access: World Wide Web. |
538 ## - SYSTEM DETAILS NOTE |
System details note |
System requirements: Adobe Acrobat Reader. |
500 ## - GENERAL NOTE |
General note |
Part of: Synthesis digital library of engineering and computer science. |
504 ## - BIBLIOGRAPHY, ETC. NOTE |
Bibliography, etc. note |
Includes bibliographical references (pages 169-181). |
505 0# - FORMATTED CONTENTS NOTE |
Formatted contents note |
1. Introduction -- 1.1. Overview -- 1.2. Main parts -- 1.3. Technical roadmap -- 1.4. Organization |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
part I. Cube construction algorithms. 2. Topic-level taxonomy generation -- 2.1. Overview -- 2.2. Related work -- 2.3. Preliminaries -- 2.4. Adaptive term clustering -- 2.5. Adaptive term embedding -- 2.6. Experimental evaluation -- 2.7. Summary |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
3. Term-level taxonomy generation / Jiaming Shen -- 3.1. Overview -- 3.2. Related work -- 3.3. Problem formulation -- 3.4. The HiExpan framework -- 3.5. Experiments -- 3.6. Summary |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
4. Weakly supervised text classification / Yu Meng -- 4.1. Overview -- 4.2. Related work -- 4.3. Preliminaries -- 4.4. Pseudo-document generation -- 4.5. Neural models with self-training -- 4.6. Experiments -- 4.7. Summary |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
5. Weakly supervised hierarchical text classification / Yu Meng -- 5.1. Overview -- 5.2. Related work -- 5.3. Problem formulation -- 5.4. Pseudo-document generation -- 5.5. The hierarchical classification model -- 5.6. Experiments -- 5.7. Summary |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
part II. Cube exploitation algorithms. 6. Multidimensional summarization / Fangbo Tao -- 6.1. Introduction -- 6.2. Related work -- 6.3. Preliminaries -- 6.4. The ranking measure -- 6.5. The RepPhrase method -- 6.6. Experiments -- 6.7. Summary |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
7. Cross-dimension prediction in cube space -- 7.1. Overview -- 7.2. Related work -- 7.3. Preliminaries -- 7.4. Semi-supervised multimodal embedding -- 7.5. Online updating of multimodal embedding -- 7.6. Experiments -- 7.7. Summary |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
8. Event detection in cube space -- 8.1. Overview -- 8.2. Related work -- 8.3. Preliminaries -- 8.4. Candidate generation -- 8.5. Candidate classification -- 8.6. Supporting continuous event detection -- 8.7. Complexity analysis -- 8.8. Experiments -- 8.9. Summary |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
9. Conclusions -- 9.1. Summary -- 9.2. Future work. |
506 ## - RESTRICTIONS ON ACCESS NOTE |
Terms governing access |
Abstract freely available; full-text restricted to subscribers or individual document purchasers. |
510 0# - CITATION/REFERENCES NOTE |
Name of source |
Compendex |
510 0# - CITATION/REFERENCES NOTE |
Name of source |
INSPEC |
510 0# - CITATION/REFERENCES NOTE |
Name of source |
Google scholar |
510 0# - CITATION/REFERENCES NOTE |
Name of source |
Google book search |
520 3# - SUMMARY, ETC. |
Summary, etc. |
Unstructured text, as one of the most important data forms, plays a crucial role in data-driven decision making in domains ranging from social networking and information retrieval to scientific research and healthcare informatics. In many emerging applications, people's information need from text data is becoming multidimensional--they demand useful insights along multiple aspects from a text corpus. However, acquiring such multidimensional knowledge from massive text data remains a challenging task. This book presents data mining techniques that turn unstructured text data into multidimensional knowledge. We investigate two core questions. (1) How does one identify task-relevant text data with declarative queries in multiple dimensions? (2) How does one distill knowledge from text data in a multidimensional space? To address the above questions, we develop a text cube framework. First, we develop a cube construction module that organizes unstructured data into a cube structure, by discovering latent multidimensional and multi-granular structure from the unstructured text corpus and allocating documents into the structure. Second, we develop a cube exploitation module that models multiple dimensions in the cube space, thereby distilling from user-selected data multidimensional knowledge. Together, these two modules constitute an integrated pipeline: leveraging the cube structure, users can perform multidimensional, multigranular data selection with declarative queries; and with cube exploitation algorithms, users can extract multidimensional patterns from the selected data for decision making. The proposed framework has two distinctive advantages when turning text data into multidimensional knowledge: flexibility and label-efficiency. 
First, it enables acquiring multidimensional knowledge flexibly, as the cube structure allows users to easily identify task-relevant data along multiple dimensions at varied granularities and further distill multidimensional knowledge. Second, the algorithms for cube construction and exploitation require little supervision; this makes the framework appealing for many applications where labeled data are expensive to obtain. |
530 ## - ADDITIONAL PHYSICAL FORM AVAILABLE NOTE |
Additional physical form available note |
Also available in print. |
588 ## - SOURCE OF DESCRIPTION NOTE |
Source of description note |
Title from PDF title page (viewed on April 2, 2019). |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
Data mining. |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
Text processing (Computer science) |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
text mining |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
multidimensional analysis |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
data cube |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
limited supervision |
700 1# - ADDED ENTRY--PERSONAL NAME |
Personal name |
Han, Jiawei, |
Relator term |
author. |
776 08 - ADDITIONAL PHYSICAL FORM ENTRY |
Relationship information |
Print version: |
International Standard Book Number |
9781681735214 |
-- |
9781681735191 |
830 #0 - SERIES ADDED ENTRY--UNIFORM TITLE |
Uniform title |
Synthesis digital library of engineering and computer science. |
830 #0 - SERIES ADDED ENTRY--UNIFORM TITLE |
Uniform title |
Synthesis lectures on data mining and knowledge discovery ; |
Volume/sequential designation |
#17. |
856 42 - ELECTRONIC LOCATION AND ACCESS |
Materials specified |
Abstract with links to resource |
Uniform Resource Identifier |
https://ieeexplore.ieee.org/servlet/opac?bknumber=8673866 |
856 40 - ELECTRONIC LOCATION AND ACCESS |
Materials specified |
Abstract with links to full text |
Uniform Resource Identifier |
https://doi.org/10.2200/S00903ED1V01Y201902DMK017 |