000 - LEADER |
fixed length control field |
04748nam a2200685 i 4500 |
001 - CONTROL NUMBER |
control field |
6812606 |
003 - CONTROL NUMBER IDENTIFIER |
control field |
IEEE |
005 - DATE AND TIME OF LATEST TRANSACTION |
control field |
20200413152904.0 |
006 - FIXED-LENGTH DATA ELEMENTS--ADDITIONAL MATERIAL CHARACTERISTICS |
fixed length control field |
m eo d |
007 - PHYSICAL DESCRIPTION FIXED FIELD--GENERAL INFORMATION |
fixed length control field |
cr cn |||m|||a |
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION |
fixed length control field |
120112s2012 caua foab 000 0 eng d |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER |
International Standard Book Number |
9781608457960 (electronic bk.) |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER |
Canceled/invalid ISBN |
9781608457953 (pbk.) |
024 7# - OTHER STANDARD IDENTIFIER |
Standard number or code |
10.2200/S00396ED1V01Y201111DTM022 |
Source of number or code |
doi |
035 ## - SYSTEM CONTROL NUMBER |
System control number |
(CaBNVSL)swl00400390 |
035 ## - SYSTEM CONTROL NUMBER |
System control number |
(OCoLC)772525452 |
040 ## - CATALOGING SOURCE |
Original cataloging agency |
CaBNVSL |
Transcribing agency |
CaBNVSL |
Modifying agency |
CaBNVSL |
050 #4 - LIBRARY OF CONGRESS CALL NUMBER |
Classification number |
QA76.9.T48 |
Item number |
B274 2012 |
082 04 - DEWEY DECIMAL CLASSIFICATION NUMBER |
Classification number |
005 |
Edition number |
23 |
100 1# - MAIN ENTRY--PERSONAL NAME |
Personal name |
Barsky, Marina. |
245 10 - TITLE STATEMENT |
Title |
Full-text (substring) indexes in external memory |
Medium |
[electronic resource] / |
Statement of responsibility, etc. |
Marina Barsky, Ulrike Stege, Alex Thomo. |
260 ## - PUBLICATION, DISTRIBUTION, ETC. |
Place of publication, distribution, etc. |
San Rafael, Calif. (1537 Fourth Street, San Rafael, CA 94901 USA) : |
Name of publisher, distributor, etc. |
Morgan & Claypool, |
Date of publication, distribution, etc. |
c2012. |
300 ## - PHYSICAL DESCRIPTION |
Extent |
1 electronic text (xii, 76 p.) : |
Other physical details |
ill., digital file. |
490 1# - SERIES STATEMENT |
Series statement |
Synthesis lectures on data management, |
International Standard Serial Number |
2153-5426 ; |
Volume/sequential designation |
# 22 |
538 ## - SYSTEM DETAILS NOTE |
System details note |
Mode of access: World Wide Web. |
538 ## - SYSTEM DETAILS NOTE |
System details note |
System requirements: Adobe Acrobat Reader. |
500 ## - GENERAL NOTE |
General note |
Part of: Synthesis digital library of engineering and computer science. |
500 ## - GENERAL NOTE |
General note |
Series from website. |
504 ## - BIBLIOGRAPHY, ETC. NOTE |
Bibliography, etc. note |
Includes bibliographical references (p. 71-74). |
505 0# - FORMATTED CONTENTS NOTE |
Formatted contents note |
Preface -- Acknowledgments -- |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
1. Structures for indexing substrings -- 1.1 Indexing substrings -- 1.2 Suffix array -- 1.3 Suffix trie -- 1.4 Suffix tree -- 1.5 Representation of a suffix tree -- 1.6 Possible solutions to the memory problem -- 1.6.1 Index compression -- 1.6.2 Using disk space -- 1.7 Summary -- 1.8 Bibliographic notes -- |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
2. External construction of suffix trees -- 2.1 Transformation algorithms -- 2.2 Brute-force algorithms -- 2.3 Algorithms based on suffix-links -- 2.3.1 The Ukkonen algorithm -- 2.3.2 Distributed and paged suffix trees -- 2.4 Disk-optimized algorithms -- 2.4.1 The top down algorithm -- 2.4.2 The partition-and-merge algorithm -- 2.4.3 The merge sort algorithm -- 2.5 Summary -- 2.6 Bibliographic notes -- |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
3. Scaling up: when the input exceeds the main memory -- 3.1 The wavefront algorithm -- 3.2 The B2ST algorithm -- 3.3 Summary -- 3.4 Bibliographic notes -- |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
4. Queries for disk-based indexes -- 4.1 Index layouts -- 4.1.1 Partitioning by prefix -- 4.1.2 Partitioning by intervals -- 4.1.3 String B-tree -- 4.2 Pattern matching with disk-based indexes -- 4.2.1 Exact pattern matching -- 4.2.2 Matching all substrings of a query string -- 4.2.3 Approximate pattern matching -- 4.3 Repeating and unique substrings -- 4.3.1 Maximal repeats -- 4.3.2 Common and unique substrings -- 4.4 Summary -- 4.5 Bibliographic notes -- |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
5. Conclusions and open problems -- 5.1 Need for better construction algorithms -- 5.2 Need for better query algorithms -- |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
Bibliography -- Authors' biographies. |
506 1# - RESTRICTIONS ON ACCESS NOTE |
Terms governing access |
Abstract freely available; full-text restricted to subscribers or individual document purchasers. |
510 0# - CITATION/REFERENCES NOTE |
Name of source |
Compendex |
510 0# - CITATION/REFERENCES NOTE |
Name of source |
INSPEC |
510 0# - CITATION/REFERENCES NOTE |
Name of source |
Google scholar |
510 0# - CITATION/REFERENCES NOTE |
Name of source |
Google book search |
520 3# - SUMMARY, ETC. |
Summary, etc. |
Textual databases are among the most rapidly growing collections of data. Some of these collections contain a new type of data that differs from classical numerical or textual data: long sequences of symbols that are not divided into well-separated small tokens (words). The most prominent such collections are databases of biological sequences, which are currently growing at an unprecedented rate. The "1000 Genomes Project", launched in 2008, has the ultimate goal of collecting the sequences of an additional 1,500 human genomes, 500 each of European, African, and East Asian origin. This will produce an extensive catalog of human genetic variation. The raw sequences alone in this catalog would amount to about 5 terabytes. |
530 ## - ADDITIONAL PHYSICAL FORM AVAILABLE NOTE |
Additional physical form available note |
Also available in print. |
588 ## - SOURCE OF DESCRIPTION NOTE |
Source of description note |
Title from PDF t.p. (viewed on January 12, 2012). |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
Text processing (Computer science) |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
Data structures (Computer science) |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
Computer algorithms. |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
Magnetic memory (Computers) |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
full-text indexes |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
suffix trees |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
suffix arrays |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
external-memory algorithms |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
string pattern matching |
700 1# - ADDED ENTRY--PERSONAL NAME |
Personal name |
Stege, Ulrike. |
700 1# - ADDED ENTRY--PERSONAL NAME |
Personal name |
Thomo, Alex. |
776 08 - ADDITIONAL PHYSICAL FORM ENTRY |
Relationship information |
Print version: |
International Standard Book Number |
9781608457953 |
830 #0 - SERIES ADDED ENTRY--UNIFORM TITLE |
Uniform title |
Synthesis digital library of engineering and computer science. |
830 #0 - SERIES ADDED ENTRY--UNIFORM TITLE |
Uniform title |
Synthesis lectures on data management ; |
Volume/sequential designation |
# 22. |
International Standard Serial Number |
2153-5426 |
856 42 - ELECTRONIC LOCATION AND ACCESS |
Materials specified |
Abstract with links to resource |
Uniform Resource Identifier |
http://ieeexplore.ieee.org/servlet/opac?bknumber=6812606 |