Welcome to P K Kelkar Library, Online Public Access Catalogue (OPAC)

On the efficient determination of most near neighbors (Record no. 562157)

000 -LEADER
fixed length control field 05427nam a2200697 i 4500
001 - CONTROL NUMBER
control field 7302713
003 - CONTROL NUMBER IDENTIFIER
control field IEEE
005 - DATE AND TIME OF LATEST TRANSACTION
control field 20200413152918.0
006 - FIXED-LENGTH DATA ELEMENTS--ADDITIONAL MATERIAL CHARACTERISTICS
fixed length control field m eo d
007 - PHYSICAL DESCRIPTION FIXED FIELD--GENERAL INFORMATION
fixed length control field cr cn |||m|||a
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 150917s2015 cau foab 000 0 eng d
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number 9781627058094
Qualifying information ebook
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
Canceled/invalid ISBN 9781627058087
Qualifying information print
024 7# - OTHER STANDARD IDENTIFIER
Standard number or code 10.2200/S00661ED1V01Y201508ICR044
Source of number or code doi
035 ## - SYSTEM CONTROL NUMBER
System control number (CaBNVSL)swl00405557
035 ## - SYSTEM CONTROL NUMBER
System control number (OCoLC)921518060
040 ## - CATALOGING SOURCE
Original cataloging agency CaBNVSL
Language of cataloging eng
Description conventions rda
Transcribing agency CaBNVSL
Modifying agency CaBNVSL
050 #4 - LIBRARY OF CONGRESS CALL NUMBER
Classification number TK5105.884
Item number .M256 2015
082 04 - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number 025.04
Edition number 23
100 1# - MAIN ENTRY--PERSONAL NAME
Personal name Manasse, Mark S.,
Relator term author.
245 10 - TITLE STATEMENT
Title On the efficient determination of most near neighbors :
Remainder of title horseshoes, hand grenades, Web search, and other situations when close is close enough /
Statement of responsibility, etc. Mark S. Manasse.
250 ## - EDITION STATEMENT
Edition statement Second edition.
264 #1 - PRODUCTION, PUBLICATION, DISTRIBUTION, MANUFACTURE, AND COPYRIGHT NOTICE
Place of production, publication, distribution, manufacture San Rafael, California (1537 Fourth Street, San Rafael, CA 94901 USA) :
Name of producer, publisher, distributor, manufacturer Morgan & Claypool,
Date of production, publication, distribution, manufacture, or copyright notice 2015.
300 ## - PHYSICAL DESCRIPTION
Extent 1 PDF (xix, 80 pages)
336 ## - CONTENT TYPE
Content type term text
Source rdacontent
337 ## - MEDIA TYPE
Media type term electronic
Source isbdmedia
338 ## - CARRIER TYPE
Carrier type term online resource
Source rdacarrier
490 1# - SERIES STATEMENT
Series statement Synthesis lectures on information concepts, retrieval, and services,
International Standard Serial Number 1947-9468 ;
Volume/sequential designation # 44
538 ## - SYSTEM DETAILS NOTE
System details note Mode of access: World Wide Web.
538 ## - SYSTEM DETAILS NOTE
System details note System requirements: Adobe Acrobat Reader.
500 ## - GENERAL NOTE
General note Part of: Synthesis digital library of engineering and computer science.
504 ## - BIBLIOGRAPHY, ETC. NOTE
Bibliography, etc. note Includes bibliographical references (pages 75-77).
505 0# - FORMATTED CONTENTS NOTE
Formatted contents note 1. Introduction -- 1.1 On similarity, resemblance, look-alikes, and entity resolution -- 1.2 You must know at least this much math to read this book -- 1.3 Cumulative distribution and probability density functions --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 2. Comparing web pages for similarity: an overview -- 2.1 Choosing the features of a web page to compare -- 2.2 Turning features into integers (Rabin hashing) -- 2.3 How should we measure the proximity of features? -- 2.4 Feature reduction -- 2.5 Putting it together with supershingling --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 3. A personal history of web search -- 3.1 Complexity issues and implementation -- 3.2 Implementing duplicate suppression -- 3.3 Rabin hashing revisited --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 4. Uniform sampling after Alta Vista -- 4.1 Using less randomness to improve sampling efficiency -- 4.2 Conjectures vs. theorems -- 4.3 Finding the first point of divergence efficiently -- 4.4 Uniform consistent sampling summarized --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 5. Why weight (and how)? -- 5.1 Constant expected-time consistent weighted sampling -- 5.2 Constant time consistent weighted sampling -- 5.3 Accelerating weighted sampling --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 6. A few applications -- 6.1 Web deduplication -- 6.2 File systems: winnowing and friends -- 6.3 Further applications --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 7. Forks in the road: Flajolet and slightly biased sampling -- 7.1 Flajolet-Martin -- 7.2 Li's rediscovery -- 7.3 Approximation by randomized rounding -- 7.4 Scaling --
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note Afterword -- Bibliography -- Author's biography.
506 1# - RESTRICTIONS ON ACCESS NOTE
Terms governing access Abstract freely available; full-text restricted to subscribers or individual document purchasers.
510 0# - CITATION/REFERENCES NOTE
Name of source Compendex
510 0# - CITATION/REFERENCES NOTE
Name of source INSPEC
510 0# - CITATION/REFERENCES NOTE
Name of source Google scholar
510 0# - CITATION/REFERENCES NOTE
Name of source Google book search
520 3# - SUMMARY, ETC.
Summary, etc. The time-worn aphorism "close only counts in horseshoes and hand grenades" is clearly inadequate. Close also counts in golf, shuffleboard, archery, darts, curling, and other games of accuracy in which hitting the precise center of the target isn't to be expected every time, or in which we can expect to be driven from the target by skilled opponents. This book is not devoted to sports discussions, but to efficient algorithms for determining pairs of closely related web pages, and a few other situations in which we have found that inexact matching is good enough - where proximity suffices. We will not, however, attempt to be comprehensive in the investigation of probabilistic algorithms, approximation algorithms, or even techniques for organizing the discovery of nearest neighbors. We are more concerned with finding nearby neighbors; if they are not particularly close by, we are not particularly interested. In thinking of when approximation is sufficient, remember the oft-told joke about two campers sitting around after dinner. They hear noises coming towards them. One of them reaches for a pair of running shoes, and starts to don them. The second then notes that even with running shoes, they cannot hope to outrun a bear, to which the first notes that most likely the bear will be satiated after catching the slower of them. We seek problems in which we don't need to be faster than the bear, just faster than the others fleeing the bear.
530 ## - ADDITIONAL PHYSICAL FORM AVAILABLE NOTE
Additional physical form available note Also available in print.
588 ## - SOURCE OF DESCRIPTION NOTE
Source of description note Title from PDF title page (viewed on September 17, 2015).
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element Internet searching.
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element Nearest neighbor analysis (Statistics)
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element Statistical matching.
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term nearest neighbor
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term search algorithms
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term information retrieval
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term IR
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term multi-dimensional
776 08 - ADDITIONAL PHYSICAL FORM ENTRY
Relationship information Print version:
International Standard Book Number 9781627058087
830 #0 - SERIES ADDED ENTRY--UNIFORM TITLE
Uniform title Synthesis digital library of engineering and computer science.
830 #0 - SERIES ADDED ENTRY--UNIFORM TITLE
Uniform title Synthesis lectures on information concepts, retrieval, and services ;
Volume/sequential designation # 44.
International Standard Serial Number 1947-9468
856 42 - ELECTRONIC LOCATION AND ACCESS
Materials specified Abstract with links to resource
Uniform Resource Identifier http://ieeexplore.ieee.org/servlet/opac?bknumber=7302713
Holdings
Withdrawn status (blank)
Lost status (blank)
Damaged status (blank)
Not for loan (blank)
Permanent Location PK Kelkar Library, IIT Kanpur
Current Location PK Kelkar Library, IIT Kanpur
Date acquired 2020-04-13
Barcode EBKE657
Date last seen 2020-04-13
Price effective from 2020-04-13
Koha item type E books
