Query processing over incomplete databases
By: Gao, Yunjun [author].
Contributor(s): Miao, Xiaoye [author].
Material type: Book
Series: Synthesis digital library of engineering and computer science ; Synthesis lectures on data management, # 50
Publisher: [San Rafael, California] : Morgan & Claypool, 2018
Description: 1 PDF (xv, 106 pages) : illustrations
Content type: text
Media type: electronic
Carrier type: online resource
ISBN: 9781681734217
Subject(s): Querying (Computer science) | Database searching | Missing observations (Statistics) | query processing | incomplete data | missing data | similarity search | k-nearest neighbor search | skyline query | top-k dominating query | crowdsourcing
DDC classification: 005.7565
Online resources: Abstract with links to resource
Also available in print.

Item type | Current location | Call number | Status | Date due | Barcode | Item holds |
---|---|---|---|---|---|---|
E books | PK Kelkar Library, IIT Kanpur | | Available | | EBKE815 | |
Mode of access: World Wide Web.
System requirements: Adobe Acrobat Reader.
Part of: Synthesis digital library of engineering and computer science.
Includes bibliographical references (pages 87-103).
1. Introduction -- 1.1 Applications of incomplete data management -- 1.2 Overview of incomplete databases -- 1.2.1 Indexing incomplete databases -- 1.2.2 Querying incomplete databases -- 1.2.3 Incomplete database management systems -- 1.3 Challenges of querying incomplete databases -- 1.4 Organization --
2. Handling incomplete data methods -- 2.1 Method taxonomy -- 2.2 Overview of imputation methods -- 2.2.1 Statistical imputation -- 2.2.2 Machine learning-based imputation -- 2.2.3 Modern imputation methods --
3. Query semantics on incomplete data -- 3.1 K-nearest neighbor search on incomplete data -- 3.1.1 Background -- 3.1.2 Problem definition -- 3.2 Skyline queries on incomplete data -- 3.2.1 Background -- 3.2.2 Problem definition -- 3.3 Top-k dominating queries on incomplete data -- 3.3.1 Background -- 3.3.2 Problem definition --
4. Advanced techniques -- 4.1 Index structures -- 4.1.1 Lab index for k-nearest neighbor search on incomplete data -- 4.1.2 Histogram index for k-nearest neighbor search on incomplete data -- 4.1.3 Bitmap index for top-k dominating queries on incomplete data -- 4.2 Pruning heuristics -- 4.2.1 Alpha value pruning for k-nearest neighbor search on incomplete data -- 4.2.2 Histogram-based pruning for k-nearest neighbor search on incomplete data -- 4.2.3 Local skyband pruning for top-k dominating queries on incomplete data -- 4.2.4 Upper bound score pruning for top-k dominating queries on incomplete data -- 4.2.5 Bitmap pruning for top-k dominating queries on incomplete data -- 4.3 Crowdsourcing techniques -- 4.3.1 Crowdsourcing framework for skyline queries on incomplete data -- 4.3.2 C-table construction -- 4.3.3 Probability computation -- 4.3.4 Crowd task selection --
5. Conclusions -- Bibliography -- Authors' biographies.
Abstract freely available; full-text restricted to subscribers or individual document purchasers.
Compendex
INSPEC
Google scholar
Google book search
Incomplete data is a part of life and of almost all areas of scientific study. Users tend to skip certain fields when they fill out online forms; participants choose to ignore sensitive questions on surveys; sensors fail, resulting in the loss of certain readings; publicly viewable satellite map services have missing data in many mobile applications; and in privacy-preserving applications, data is deliberately left incomplete to protect sensitive attribute values. Query processing is a fundamental problem in computer science and is useful in a variety of applications. In this book, we focus mainly on query processing over incomplete databases, which involves finding a set of qualified objects from a specified incomplete dataset in order to support a wide spectrum of real-life applications. We first describe the three general kinds of methods for handling incomplete data: (i) discarding the data with missing values, (ii) imputing the missing values, and (iii) relying only on the observed data values. For the third kind of method, we introduce the semantics of k-nearest neighbor (kNN) search, skyline queries, and top-k dominating queries on incomplete data. For these three representative queries, we investigate advanced techniques for processing them over incomplete data, including indexing, pruning, and crowdsourcing techniques.
Title from PDF title page (viewed on August 29, 2018).