Data profiling (Record no. 562332)

000 -LEADER
fixed length control field 05679nam a22007451i 4500
001 - CONTROL NUMBER
control field 8540360
003 - CONTROL NUMBER IDENTIFIER
control field IEEE
005 - DATE AND TIME OF LATEST TRANSACTION
control field 20200413152928.0
006 - FIXED-LENGTH DATA ELEMENTS--ADDITIONAL MATERIAL CHARACTERISTICS
fixed length control field m eo d
007 - PHYSICAL DESCRIPTION FIXED FIELD--GENERAL INFORMATION
fixed length control field cr cn |||m|||a
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 181128s2019 caua foab 000 0 eng d
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number 9781681734477
Qualifying information ebook
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
Canceled/invalid ISBN 9781681734484
Qualifying information hardcover
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
Canceled/invalid ISBN 9781681734460
Qualifying information paperback
024 7# - OTHER STANDARD IDENTIFIER
Standard number or code 10.2200/S00878ED1V01Y201810DTM052
Source of number or code doi
035 ## - SYSTEM CONTROL NUMBER
System control number (CaBNVSL)swl000408790
035 ## - SYSTEM CONTROL NUMBER
System control number (OCoLC)1076493845
040 ## - CATALOGING SOURCE
Original cataloging agency CaBNVSL
Language of cataloging eng
Description conventions rda
Transcribing agency CaBNVSL
Modifying agency CaBNVSL
050 #4 - LIBRARY OF CONGRESS CALL NUMBER
Classification number Z666.7
Item number .A233 2019
082 04 - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number 025.3
Edition number 23
100 1# - MAIN ENTRY--PERSONAL NAME
Personal name Abedjan, Ziawasch,
Relator term author.
245 10 - TITLE STATEMENT
Title Data profiling /
Statement of responsibility, etc. Ziawasch Abedjan, Lukasz Golab, Felix Naumann, Thorsten Papenbrock.
264 #1 - PRODUCTION, PUBLICATION, DISTRIBUTION, MANUFACTURE, AND COPYRIGHT NOTICE
Place of production, publication, distribution, manufacture [San Rafael, California] :
Name of producer, publisher, distributor, manufacturer Morgan & Claypool,
Date of production, publication, distribution, manufacture, or copyright notice 2019.
300 ## - PHYSICAL DESCRIPTION
Extent 1 PDF (xviii, 136 pages) :
Other physical details illustrations.
336 ## - CONTENT TYPE
Content type term text
Source rdacontent
337 ## - MEDIA TYPE
Media type term electronic
Source isbdmedia
338 ## - CARRIER TYPE
Carrier type term online resource
Source rdacarrier
490 1# - SERIES STATEMENT
Series statement Synthesis lectures on data management,
International Standard Serial Number 2153-5426 ;
Volume/sequential designation # 52
538 ## - SYSTEM DETAILS NOTE
System details note Mode of access: World Wide Web.
538 ## - SYSTEM DETAILS NOTE
System details note System requirements: Adobe Acrobat Reader.
500 ## - GENERAL NOTE
General note Part of: Synthesis digital library of engineering and computer science.
504 ## - BIBLIOGRAPHY, ETC. NOTE
Bibliography, etc. note Includes bibliographical references (pages 113-134).
505 0# - FORMATTED CONTENTS NOTE
Formatted contents note 1. Discovering metadata -- 1.1 Motivation and overview -- 1.2 Data profiling and data mining -- 1.3 Use cases -- 1.4 Organization of this book --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 2. Data profiling tasks -- 2.1 Single-column analysis -- 2.2 Dependency discovery -- 2.3 Relaxed dependencies --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 3. Single-column analysis -- 3.1 Cardinalities -- 3.2 Value distributions -- 3.3 Data types, patterns, and domains -- 3.4 Data completeness -- 3.5 Approximate statistics -- 3.6 Summary and discussion --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 4. Dependency discovery -- 4.1 Dependency definitions -- 4.2 Search space and data structures -- 4.3 Discovering unique column combinations -- 4.4 Discovering functional dependencies -- 4.5 Discovering inclusion dependencies --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 5. Relaxed and other dependencies -- 5.1 Relaxing the extent of a dependency -- 5.1.1 Partial dependencies -- 5.1.2 Conditional dependencies -- 5.2 Relaxing attribute comparisons -- 5.2.1 Metric and matching dependencies -- 5.2.2 Order and sequential dependencies -- 5.3 Approximating the dependency discovery -- 5.4 Generalizing functional dependencies -- 5.4.1 Denial constraints -- 5.4.2 Multivalued dependencies --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 6. Use cases -- 6.1 Data exploration -- 6.2 Schema engineering -- 6.3 Data cleaning -- 6.4 Query optimization -- 6.5 Data integration --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 7. Profiling non-relational data -- 7.1 XML -- 7.2 RDF -- 7.3 Time series -- 7.4 Graphs -- 7.5 Text --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 8. Data profiling tools -- 8.1 Research prototypes -- 8.2 Commercial tools --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 9. Data profiling challenges -- 9.1 Functional challenges -- 9.1.1 Profiling dynamic data -- 9.1.2 Interactive profiling -- 9.1.3 Profiling for integration -- 9.1.4 Interpreting profiling results -- 9.2 Non-functional challenges -- 9.2.1 Efficiency and scalability -- 9.2.2 Profiling on new architectures -- 9.2.3 Benchmarking profiling methods --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 10. Conclusions -- Bibliography -- Authors' biographies.
506 ## - RESTRICTIONS ON ACCESS NOTE
Terms governing access Abstract freely available; full-text restricted to subscribers or individual document purchasers.
510 0# - CITATION/REFERENCES NOTE
Name of source Compendex
510 0# - CITATION/REFERENCES NOTE
Name of source INSPEC
510 0# - CITATION/REFERENCES NOTE
Name of source Google Scholar
510 0# - CITATION/REFERENCES NOTE
Name of source Google Book Search
520 3# - SUMMARY, ETC.
Summary, etc. Data profiling refers to the activity of collecting data about data, i.e., metadata. Most IT professionals and researchers who work with data have engaged in data profiling, at least informally, to understand and explore an unfamiliar dataset or to determine whether a new dataset is appropriate for a particular task at hand. Data profiling results are also important in a variety of other situations, including query optimization, data integration, and data cleaning. Simple metadata are statistics, such as the number of rows and columns, schema and datatype information, the number of distinct values, statistical value distributions, and the number of null or empty values in each column. More complex types of metadata are statements about multiple columns and their correlation, such as candidate keys, functional dependencies, and other types of dependencies. This book provides a classification of the various types of profilable metadata, discusses popular data profiling tasks, and surveys state-of-the-art profiling algorithms. While most of the book focuses on tasks and algorithms for relational data profiling, we also briefly discuss systems and techniques for profiling non-relational data such as graphs and text. We conclude with a discussion of data profiling challenges and directions for future work in this area.
530 ## - ADDITIONAL PHYSICAL FORM AVAILABLE NOTE
Additional physical form available note Also available in print.
588 ## - SOURCE OF DESCRIPTION NOTE
Source of description note Title from PDF title page (viewed on November 28, 2018).
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element Metadata.
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element Data mining.
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term data analysis
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term data modeling
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term dependency discovery
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term data mining
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term metadata
700 1# - ADDED ENTRY--PERSONAL NAME
Personal name Golab, Lukasz,
Dates associated with a name 1978-,
Relator term author.
700 1# - ADDED ENTRY--PERSONAL NAME
Personal name Naumann, Felix,
Relator term author.
700 1# - ADDED ENTRY--PERSONAL NAME
Personal name Papenbrock, Thorsten,
Relator term author.
776 08 - ADDITIONAL PHYSICAL FORM ENTRY
Relationship information Print version:
International Standard Book Number 9781681734460
International Standard Book Number 9781681734484
830 #0 - SERIES ADDED ENTRY--UNIFORM TITLE
Uniform title Synthesis digital library of engineering and computer science.
830 #0 - SERIES ADDED ENTRY--UNIFORM TITLE
Uniform title Synthesis lectures on data management ;
Volume/sequential designation # 52.
International Standard Serial Number 2153-5426
856 42 - ELECTRONIC LOCATION AND ACCESS
Materials specified Abstract with links to resource
Uniform Resource Identifier https://ieeexplore.ieee.org/servlet/opac?bknumber=8540360
Holdings
Permanent Location PK Kelkar Library, IIT Kanpur
Current Location PK Kelkar Library, IIT Kanpur
Date acquired 2020-04-13
Barcode EBKE832
Date last seen 2020-04-13
Price effective from 2020-04-13
Koha item type E books