Welcome to P K Kelkar Library, Online Public Access Catalogue (OPAC)

Big data integration / (Record no. 562123)

000 -LEADER
fixed length control field 06749nam a2200697 i 4500
001 - CONTROL NUMBER
control field 7065199
003 - CONTROL NUMBER IDENTIFIER
control field IEEE
005 - DATE AND TIME OF LATEST TRANSACTION
control field 20200413152917.0
006 - FIXED-LENGTH DATA ELEMENTS--ADDITIONAL MATERIAL CHARACTERISTICS
fixed length control field m eo d
007 - PHYSICAL DESCRIPTION FIXED FIELD--GENERAL INFORMATION
fixed length control field cr cn |||m|||a
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 150320s2015 caua foab 001 0 eng d
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number 9781627052245
Qualifying information ebook
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
Canceled/invalid ISBN 9781627052238
Qualifying information print
024 7# - OTHER STANDARD IDENTIFIER
Standard number or code 10.2200/S00578ED1V01Y201404DTM040
Source of number or code doi
035 ## - SYSTEM CONTROL NUMBER
System control number (CaBNVSL)swl00404797
035 ## - SYSTEM CONTROL NUMBER
System control number (OCoLC)905421798
040 ## - CATALOGING SOURCE
Original cataloging agency CaBNVSL
Language of cataloging eng
Description conventions rda
Transcribing agency CaBNVSL
Modifying agency CaBNVSL
050 #4 - LIBRARY OF CONGRESS CALL NUMBER
Classification number QA76.9.D343
Item number D654 2015
082 04 - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number 006.312
Edition number 23
100 1# - MAIN ENTRY--PERSONAL NAME
Personal name Dong, Xin Luna,
Relator term author.
245 10 - TITLE STATEMENT
Title Big data integration /
Statement of responsibility, etc. Xin Luna Dong, Divesh Srivastava.
264 #1 - PRODUCTION, PUBLICATION, DISTRIBUTION, MANUFACTURE, AND COPYRIGHT NOTICE
Place of production, publication, distribution, manufacture San Rafael, California (1537 Fourth Street, San Rafael, CA 94901 USA) :
Name of producer, publisher, distributor, manufacturer Morgan & Claypool,
Date of production, publication, distribution, manufacture, or copyright notice 2015.
300 ## - PHYSICAL DESCRIPTION
Extent 1 PDF (xx, 178 pages) :
Other physical details illustrations.
336 ## - CONTENT TYPE
Content type term text
Source rdacontent
337 ## - MEDIA TYPE
Media type term electronic
Source isbdmedia
338 ## - CARRIER TYPE
Carrier type term online resource
Source rdacarrier
490 1# - SERIES STATEMENT
Series statement Synthesis lectures on data management,
International Standard Serial Number 2153-5426 ;
Volume/sequential designation # 40
538 ## - SYSTEM DETAILS NOTE
System details note Mode of access: World Wide Web.
538 ## - SYSTEM DETAILS NOTE
System details note System requirements: Adobe Acrobat Reader.
500 ## - GENERAL NOTE
General note Part of: Synthesis digital library of engineering and computer science.
504 ## - BIBLIOGRAPHY, ETC. NOTE
Bibliography, etc. note Includes bibliographical references (pages 165-173) and index.
505 0# - FORMATTED CONTENTS NOTE
Formatted contents note 1. Motivation: challenges and opportunities for BDI -- 1.1 Traditional data integration -- 1.1.1 The flights example: data sources -- 1.1.2 The flights example: data integration -- 1.1.3 Data integration: architecture & three major steps -- 1.2 BDI: challenges -- 1.2.1 The "V" dimensions -- 1.2.2 Case study: quantity of deep web data -- 1.2.3 Case study: extracted domain-specific data -- 1.2.4 Case study: quality of deep web data -- 1.2.5 Case study: surface web structured data -- 1.2.6 Case study: extracted knowledge triples -- 1.3 BDI: opportunities -- 1.3.1 Data redundancy -- 1.3.2 Long data -- 1.3.3 Big data platforms -- 1.4 Outline of book --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 2. Schema alignment -- 2.1 Traditional schema alignment: a quick tour -- 2.1.1 Mediated schema -- 2.1.2 Attribute matching -- 2.1.3 Schema mapping -- 2.1.4 Query answering -- 2.2 Addressing the variety and velocity challenges -- 2.2.1 Probabilistic schema alignment -- 2.2.2 Pay-as-you-go user feedback -- 2.3 Addressing the variety and volume challenges -- 2.3.1 Integrating deep web data -- 2.3.2 Integrating web tables --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 3. Record linkage -- 3.1 Traditional record linkage: a quick tour -- 3.1.1 Pairwise matching -- 3.1.2 Clustering -- 3.1.3 Blocking -- 3.2 Addressing the volume challenge -- 3.2.1 Using MapReduce to parallelize blocking -- 3.2.2 Meta-blocking: pruning pairwise matchings -- 3.3 Addressing the velocity challenge -- 3.3.1 Incremental record linkage -- 3.4 Addressing the variety challenge -- 3.4.1 Linking text snippets to structured data -- 3.5 Addressing the veracity challenge -- 3.5.1 Temporal record linkage -- 3.5.2 Record linkage with uniqueness constraints --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 4. BDI: data fusion -- 4.1 Traditional data fusion: a quick tour -- 4.2 Addressing the veracity challenge -- 4.2.1 Accuracy of a source -- 4.2.2 Probability of a value being true -- 4.2.3 Copying between sources -- 4.2.4 The end-to-end solution -- 4.2.5 Extensions and alternatives -- 4.3 Addressing the volume challenge -- 4.3.1 A MapReduce-based framework for offline fusion -- 4.3.2 Online data fusion -- 4.4 Addressing the velocity challenge -- 4.5 Addressing the variety challenge --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 5. BDI: emerging topics -- 5.1 Role of crowdsourcing -- 5.1.1 Leveraging transitive relations -- 5.1.2 Crowdsourcing the end-to-end workflow -- 5.1.3 Future work -- 5.2 Source selection -- 5.2.1 Static sources -- 5.2.2 Dynamic sources -- 5.2.3 Future work -- 5.3 Source profiling -- 5.3.1 The Bellman system -- 5.3.2 Summarizing sources -- 5.3.3 Future work --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 6. Conclusions -- Bibliography -- Authors' biographies -- Index.
506 1# - RESTRICTIONS ON ACCESS NOTE
Terms governing access Abstract freely available; full-text restricted to subscribers or individual document purchasers.
510 0# - CITATION/REFERENCES NOTE
Name of source Compendex
510 0# - CITATION/REFERENCES NOTE
Name of source INSPEC
510 0# - CITATION/REFERENCES NOTE
Name of source Google scholar
510 0# - CITATION/REFERENCES NOTE
Name of source Google book search
520 3# - SUMMARY, ETC.
Summary, etc. The big data era is upon us: data are being generated, analyzed, and used at an unprecedented scale, and data-driven decision making is sweeping through all aspects of society. Since the value of data explodes when it can be linked and fused with other data, addressing the big data integration (BDI) challenge is critical to realizing the promise of big data. BDI differs from traditional data integration along the dimensions of volume, velocity, variety, and veracity. First, not only can data sources contain a huge volume of data, but also the number of data sources is now in the millions. Second, because of the rate at which newly collected data are made available, many of the data sources are very dynamic, and the number of data sources is also rapidly exploding. Third, data sources are extremely heterogeneous in their structure and content, exhibiting considerable variety even for substantially similar entities. Fourth, the data sources are of widely differing qualities, with significant differences in the coverage, accuracy and timeliness of data provided. This book explores the progress that has been made by the data integration community on the topics of schema alignment, record linkage and data fusion in addressing these novel challenges faced by big data integration. Each of these topics is covered in a systematic way: first starting with a quick tour of the topic in the context of traditional data integration, followed by a detailed, example-driven exposition of recent innovative techniques that have been proposed to address the BDI challenges of volume, velocity, variety, and veracity. Finally, it presents emerging topics and opportunities that are specific to BDI, identifying promising directions for the data integration community.
530 ## - ADDITIONAL PHYSICAL FORM AVAILABLE NOTE
Additional physical form available note Also available in print.
588 ## - SOURCE OF DESCRIPTION NOTE
Source of description note Title from PDF title page (viewed on March 20, 2015).
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element Big data.
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element Data integration (Computer science)
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term big data integration
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term data fusion
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term record linkage
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term schema alignment
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term variety
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term velocity
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term veracity
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term volume
700 1# - ADDED ENTRY--PERSONAL NAME
Personal name Srivastava, Divesh,
Relator term author.
776 08 - ADDITIONAL PHYSICAL FORM ENTRY
Relationship information Print version:
International Standard Book Number 9781627052238
830 #0 - SERIES ADDED ENTRY--UNIFORM TITLE
Uniform title Synthesis digital library of engineering and computer science.
830 #0 - SERIES ADDED ENTRY--UNIFORM TITLE
Uniform title Synthesis lectures on data management ;
Volume/sequential designation # 40.
International Standard Serial Number 2153-5426
856 42 - ELECTRONIC LOCATION AND ACCESS
Materials specified Abstract with links to resource
Uniform Resource Identifier http://ieeexplore.ieee.org/servlet/opac?bknumber=7065199
Holdings
Withdrawn status
Lost status
Damaged status
Not for loan
Permanent Location PK Kelkar Library, IIT Kanpur
Current Location PK Kelkar Library, IIT Kanpur
Date acquired 2020-04-13
Barcode EBKE623
Date last seen 2020-04-13
Price effective from 2020-04-13
Koha item type E books
