000 - LEADER |
fixed length control field |
06362nam a2200673 i 4500 |
001 - CONTROL NUMBER |
control field |
6812612 |
003 - CONTROL NUMBER IDENTIFIER |
control field |
IEEE |
005 - DATE AND TIME OF LATEST TRANSACTION |
control field |
20200413152908.0 |
006 - FIXED-LENGTH DATA ELEMENTS--ADDITIONAL MATERIAL CHARACTERISTICS |
fixed length control field |
m eo d |
007 - PHYSICAL DESCRIPTION FIXED FIELD--GENERAL INFORMATION |
fixed length control field |
cr cn |||m|||a |
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION |
fixed length control field |
130118s2013 caua foab 000 0 eng d |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER |
International Standard Book Number |
9781608454747 (electronic bk.) |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER |
Canceled/invalid ISBN |
9781608454730 (pbk.) |
024 7# - OTHER STANDARD IDENTIFIER |
Standard number or code |
10.2200/S00462ED1V01Y201212SAP010 |
Source of number or code |
doi |
035 ## - SYSTEM CONTROL NUMBER |
System control number |
(CaBNVSL)swl00402005 |
035 ## - SYSTEM CONTROL NUMBER |
System control number |
(OCoLC)824619566 |
040 ## - CATALOGING SOURCE |
Original cataloging agency |
CaBNVSL |
Transcribing agency |
CaBNVSL |
Modifying agency |
CaBNVSL |
050 #4 - LIBRARY OF CONGRESS CALL NUMBER |
Classification number |
TK7882.S65 |
Item number |
H677 2013 |
082 04 - DEWEY DECIMAL CLASSIFICATION NUMBER |
Classification number |
006.454 |
Edition number |
23 |
100 1# - MAIN ENTRY--PERSONAL NAME |
Personal name |
Hori, Takaaki. |
245 10 - TITLE STATEMENT |
Title |
Speech recognition algorithms using weighted finite-state transducers |
Medium |
[electronic resource] / |
Statement of responsibility, etc. |
Takaaki Hori and Atsushi Nakamura. |
260 ## - PUBLICATION, DISTRIBUTION, ETC. |
Place of publication, distribution, etc. |
San Rafael, Calif. (1537 Fourth Street, San Rafael, CA 94901 USA) : |
Name of publisher, distributor, etc. |
Morgan & Claypool, |
Date of publication, distribution, etc. |
c2013. |
300 ## - PHYSICAL DESCRIPTION |
Extent |
1 electronic text (xii, 150 p.) : |
Other physical details |
ill., digital file. |
490 1# - SERIES STATEMENT |
Series statement |
Synthesis lectures on speech and audio processing, |
International Standard Serial Number |
1932-1678 ; |
Volume/sequential designation |
# 10 |
538 ## - SYSTEM DETAILS NOTE |
System details note |
Mode of access: World Wide Web. |
538 ## - SYSTEM DETAILS NOTE |
System details note |
System requirements: Adobe Acrobat Reader. |
500 ## - GENERAL NOTE |
General note |
Part of: Synthesis digital library of engineering and computer science. |
500 ## - GENERAL NOTE |
General note |
Series from website. |
504 ## - BIBLIOGRAPHY, ETC. NOTE |
Bibliography, etc. note |
Includes bibliographical references (p. 137-148). |
505 0# - FORMATTED CONTENTS NOTE |
Formatted contents note |
Preface -- 1. Introduction -- 1.1 Speech recognition and computation -- 1.2 Why WFST? -- 1.3 Purpose of this book -- 1.4 Book organization -- |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
2. Brief overview of speech recognition -- 2.1 Statistical framework of speech recognition -- 2.2 Speech analysis -- 2.3 Acoustic model -- 2.3.1 Hidden Markov model -- 2.3.2 Computation of acoustic likelihood -- 2.3.3 Output probability distribution -- 2.4 Subword models and pronunciation lexicon -- 2.5 Context-dependent phone models -- 2.6 Language model -- 2.6.1 Finite-state grammar -- 2.6.2 N-gram model -- 2.6.3 Back-off smoothing -- 2.7 Decoder -- 2.7.1 Viterbi algorithm for continuous speech recognition -- 2.7.2 Time-synchronous Viterbi beam search -- 2.7.3 Practical techniques for LVCSR -- 2.7.4 Context-dependent phone search network -- 2.7.5 Lattice generation and N-best search -- |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
3. Introduction to weighted finite-state transducers -- 3.1 Finite automata -- 3.2 Basic properties of finite automata -- 3.3 Semiring -- 3.4 Basic operations -- 3.5 Transducer composition -- 3.6 Optimization -- 3.6.1 Determinization -- 3.6.2 Weight pushing -- 3.6.3 Minimization -- 3.7 Epsilon removal -- |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
4. Speech recognition by weighted finite-state transducers -- 4.1 Overview of WFST-based speech recognition -- 4.2 Construction of component WFSTs -- 4.2.1 Acoustic models -- 4.2.2 Phone context dependency -- 4.2.3 Pronunciation lexicon -- 4.2.4 Language models -- 4.3 Composition and optimization -- 4.4 Decoding algorithm using a single WFST -- 4.5 Decoding performance -- |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
5. Dynamic decoders with on-the-fly WFST operations -- 5.1 Problems in the native WFST approach -- 5.2 On-the-fly composition and optimization -- 5.3 Known problems of on-the-fly composition approach -- 5.4 Look-ahead composition -- 5.4.1 How to obtain prospective output labels -- 5.4.2 Basic principle of look-ahead composition -- 5.4.3 Realization of look-ahead composition using a filter transducer -- 5.4.4 Look-ahead composition with weight pushing -- 5.4.5 Generalized composition -- 5.4.6 Interval representation of label sets -- 5.5 On-the-fly rescoring approach -- 5.5.1 Construction of component WFSTs for on-the-fly rescoring -- 5.5.2 Concept -- 5.5.3 Algorithm -- 5.5.4 Approximation in decoding -- 5.5.5 Comparison with look-ahead composition -- |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
6. Summary and perspective -- 6.1 Realization of advanced speech recognition techniques using WFSTs -- 6.1.1 WFSTs for extended language models -- 6.1.2 Dynamic grammars based on WFSTs -- 6.1.3 Wide-context-dependent HMMs -- 6.1.4 Extension of WFSTs for multi-modal inputs -- 6.1.5 Use of WFSTs for learning -- 6.2 Integration of speech and language processing -- 6.3 Other speech applications using WFSTs -- 6.4 Conclusion -- |
505 8# - FORMATTED CONTENTS NOTE |
Formatted contents note |
Bibliography -- Authors' biographies. |
506 1# - RESTRICTIONS ON ACCESS NOTE |
Terms governing access |
Abstract freely available; full-text restricted to subscribers or individual document purchasers. |
510 0# - CITATION/REFERENCES NOTE |
Name of source |
Compendex |
510 0# - CITATION/REFERENCES NOTE |
Name of source |
INSPEC |
510 0# - CITATION/REFERENCES NOTE |
Name of source |
Google Scholar |
510 0# - CITATION/REFERENCES NOTE |
Name of source |
Google Book Search |
520 3# - SUMMARY, ETC. |
Summary, etc. |
This book introduces the theory, algorithms, and implementation techniques for efficient decoding in speech recognition, focusing mainly on the Weighted Finite-State Transducer (WFST) approach. The decoding process for speech recognition is viewed as a search problem whose goal is to find a sequence of words that best matches an input speech signal. Since this process becomes computationally more expensive as the system vocabulary size increases, research has long been devoted to reducing the computational cost. Recently, the WFST approach has become an important state-of-the-art speech recognition technology, because it offers improved decoding speed with fewer recognition errors compared with conventional methods. However, it is not easy to understand all the algorithms used in this framework, and they remain a black box for many people. In this book, we review the WFST approach and aim to provide comprehensive interpretations of WFST operations and decoding algorithms to help anyone who wants to understand, develop, and study WFST-based speech recognizers. We also mention recent advances in this framework and its applications to spoken language processing. |
530 ## - ADDITIONAL PHYSICAL FORM AVAILABLE NOTE |
Additional physical form available note |
Also available in print. |
588 ## - SOURCE OF DESCRIPTION NOTE |
Source of description note |
Title from PDF t.p. (viewed on January 18, 2013). |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
Speech processing systems. |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
Automatic speech recognition. |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
Transducers. |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
speech recognition |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
automaton |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
weighted finite-state transducer |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
Viterbi algorithm |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
decoder |
653 ## - INDEX TERM--UNCONTROLLED |
Uncontrolled term |
optimization |
700 1# - ADDED ENTRY--PERSONAL NAME |
Personal name |
Nakamura, Atsushi. |
776 08 - ADDITIONAL PHYSICAL FORM ENTRY |
Relationship information |
Print version: |
International Standard Book Number |
9781608454730 |
830 #0 - SERIES ADDED ENTRY--UNIFORM TITLE |
Uniform title |
Synthesis digital library of engineering and computer science. |
830 #0 - SERIES ADDED ENTRY--UNIFORM TITLE |
Uniform title |
Synthesis lectures on speech and audio processing ; |
Volume/sequential designation |
# 10. |
International Standard Serial Number |
1932-1678 |
856 42 - ELECTRONIC LOCATION AND ACCESS |
Materials specified |
Abstract with links to resource |
Uniform Resource Identifier |
http://ieeexplore.ieee.org/servlet/opac?bknumber=6812612 |