000 - LEADER |
fixed length control field |
06856nam a2201477 i 4500 |
001 - CONTROL NUMBER |
control field |
7374857 |
003 - CONTROL NUMBER IDENTIFIER |
control field |
IEEE |
005 - DATE AND TIME OF LATEST TRANSACTION |
control field |
20200413152920.0 |
006 - FIXED-LENGTH DATA ELEMENTS--ADDITIONAL MATERIAL CHARACTERISTICS |
fixed length control field |
m eo d |
007 - PHYSICAL DESCRIPTION FIXED FIELD--GENERAL INFORMATION |
fixed length control field |
cr cn |||m|||a |
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION |
fixed length control field |
160122s2016 caua foab 000 0 eng d |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER |
International Standard Book Number |
9781627058131 |
Qualifying information |
ebook |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER |
Canceled/invalid ISBN |
9781627058124 |
Qualifying information |
print |
024 7# - OTHER STANDARD IDENTIFIER |
Standard number or code |
10.2200/S00662ED1V01Y201508ICR045 |
Source of number or code |
doi |
035 ## - SYSTEM CONTROL NUMBER |
System control number |
(CaBNVSL)swl00406108 |
035 ## - SYSTEM CONTROL NUMBER |
System control number |
(OCoLC)935806030 |
040 ## - CATALOGING SOURCE |
Original cataloging agency |
CaBNVSL |
Language of cataloging |
eng |
Description conventions |
rda |
Transcribing agency |
CaBNVSL |
Modifying agency |
CaBNVSL |
050 #4 - LIBRARY OF CONGRESS CALL NUMBER |
Classification number |
TK5105.884 |
Item number |
.C257 2016 |
082 04 - DEWEY DECIMAL CLASSIFICATION NUMBER |
Classification number |
005.758 |
Edition number |
23 |
100 1# - MAIN ENTRY--PERSONAL NAME |
Personal name |
Cambazoglu, B. Barla, |
Relator term |
author. |
245 10 - TITLE STATEMENT |
Title |
Scalability challenges in web search engines / |
Statement of responsibility, etc. |
B. Barla Cambazoglu, Ricardo Baeza-Yates. |
264 #1 - PRODUCTION, PUBLICATION, DISTRIBUTION, MANUFACTURE, AND COPYRIGHT NOTICE |
Place of production, publication, distribution, manufacture |
San Rafael, California (1537 Fourth Street, San Rafael, CA 94901 USA) : |
Name of producer, publisher, distributor, manufacturer |
Morgan & Claypool, |
Date of production, publication, distribution, manufacture, or copyright notice |
2016. |
300 ## - PHYSICAL DESCRIPTION |
Extent |
1 PDF (xv, 122 pages) : |
Other physical details |
illustrations. |
336 ## - CONTENT TYPE |
Content type term |
text |
Source |
rdacontent |
337 ## - MEDIA TYPE |
Media type term |
electronic |
Source |
isbdmedia |
338 ## - CARRIER TYPE |
Carrier type term |
online resource |
Source |
rdacarrier |
490 1# - SERIES STATEMENT |
Series statement |
Synthesis lectures on information concepts, retrieval, and services, |
International Standard Serial Number |
1947-9468 ; |
Volume/sequential designation |
# 45 |
538 ## - SYSTEM DETAILS NOTE |
System details note |
Mode of access: World Wide Web. |
538 ## - SYSTEM DETAILS NOTE |
System details note |
System requirements: Adobe Acrobat Reader. |
500 ## - GENERAL NOTE |
General note |
Part of: Synthesis digital library of engineering and computer science. |
504 ## - BIBLIOGRAPHY, ETC. NOTE |
Bibliography, etc. note |
Includes bibliographical references (pages 93-120). |
505 0# - FORMATTED CONTENTS NOTE |
Formatted contents note |
1. Introduction -- 1.1 Web search business -- 1.2 Basic search engine architecture -- 1.3 Scalability issues -- |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
2. The web crawling system -- 2.1 Basic web crawling architecture -- 2.2 Extending the web repository -- 2.3 Refreshing the web repository -- 2.4 Managing the web repository -- 2.5 Distributed web crawling -- 2.6 Factors affecting crawling performance -- 2.7 Literature on web crawling -- 2.8 Open issues in web crawling -- |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
3. The indexing system -- 3.1 Basic indexing architecture -- 3.2 Inverted index -- 3.3 Compressing an inverted index -- 3.4 Constructing an inverted index -- 3.5 Updating an inverted index -- 3.6 Partitioning an inverted index -- 3.7 Literature on indexing -- 3.8 Open issues in indexing -- |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
4. The query processing system -- 4.1 Basic query processing architecture -- 4.2 Query processing on a search node -- 4.3 Query processing in a search cluster -- 4.4 Architectural optimizations -- 4.5 Caching -- 4.6 Query processing on multiple search sites -- 4.7 Literature on query processing -- 4.8 Open issues in query processing -- |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
5. Concluding remarks -- Bibliography -- Authors' biographies. |
506 1# - RESTRICTIONS ON ACCESS NOTE |
Terms governing access |
Abstract freely available; full-text restricted to subscribers or individual document purchasers. |
510 0# - CITATION/REFERENCES NOTE |
Name of source |
Compendex |
510 0# - CITATION/REFERENCES NOTE |
Name of source |
INSPEC |
510 0# - CITATION/REFERENCES NOTE |
Name of source |
Google Scholar |
510 0# - CITATION/REFERENCES NOTE |
Name of source |
Google Book Search |
520 3# - SUMMARY, ETC. |
Summary, etc. |
In this book, we aim to provide a fairly comprehensive overview of the scalability and efficiency challenges in large-scale web search engines. More specifically, we cover the issues involved in the design of three separate systems that are commonly available in every web-scale search engine: web crawling, indexing, and query processing systems. We present the performance challenges encountered in these systems and review a wide range of design alternatives employed as solutions to these challenges, specifically focusing on algorithmic and architectural optimizations. We discuss the available optimizations at different computational granularities, ranging from a single computer node to a collection of data centers. We provide some hints to both the practitioners and theoreticians involved in the field about the way large-scale web search engines operate and the adopted design choices. Moreover, we survey the efficiency literature, providing pointers to a large number of relatively important research papers. Finally, we discuss some open research problems in the context of search engine efficiency. |
530 ## - ADDITIONAL PHYSICAL FORM AVAILABLE NOTE |
Additional physical form available note |
Also available in print. |
588 ## - SOURCE OF DESCRIPTION NOTE |
Source of description note |
Title from PDF title page (viewed on January 22, 2016). |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
Web search engines. |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
Computer networks |
General subdivision |
Scalability. |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
cache invalidation |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
central broker |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
compression |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
content spam |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
delay attacks |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
distributed crawling |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
distributed query processing |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
DNS cache |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
document id reassignment |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
download throughput |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
dynamic index pruning |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
early exit optimization |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
effectiveness |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
efficiency |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
forward index |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
index construction |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
index maintenance |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
index partitioning |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
index replication |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
indexing |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
inverted index |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
inverted list cache |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
inverted list |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
link exchange |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
link farm |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
link spam |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
machine-learned ranking |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
matching |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
multisite web search |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
near duplicate detection |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
page cache |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
performance |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
position list |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
posting list |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
query-independent feature |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
query expansion |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
query forwarding |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
query interpretation |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
query processing |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
query rewriting |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
relevance |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
query scheduling |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
response latency |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
result cache |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
result freshness |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
result preparation |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
result retrieval |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
scalability |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
search center |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
search cluster |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
search engine result page |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
search quality |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
selective search |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
shingles |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
skip pointer |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
snippet |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
soft 404 page |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
spider trap |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
static index pruning |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
text processing |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
throughput |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
tiering |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
time-to-live |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
two-phase ranking |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
URL-seen test |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
URL caching |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
web change |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
web coverage |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
web crawler |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
web frontier |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
web graph |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
web repository |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
web search engine |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
website mirror |
700 1# - ADDED ENTRY--PERSONAL NAME |
Personal name |
Baeza-Yates, R. |
Fuller form of name |
(Ricardo), |
Relator term |
author. |
776 08 - ADDITIONAL PHYSICAL FORM ENTRY |
Relationship information |
Print version: |
International Standard Book Number |
9781627058124 |
830 #0 - SERIES ADDED ENTRY--UNIFORM TITLE |
Uniform title |
Synthesis digital library of engineering and computer science. |
830 #0 - SERIES ADDED ENTRY--UNIFORM TITLE |
Uniform title |
Synthesis lectures on information concepts, retrieval, and services ; |
Volume/sequential designation |
# 45. |
International Standard Serial Number |
1947-9468 |
856 42 - ELECTRONIC LOCATION AND ACCESS |
Materials specified |
Abstract with links to resource |
Uniform Resource Identifier |
http://ieeexplore.ieee.org/servlet/opac?bknumber=7374857 |