000 07280nam a2200733 i 4500
001 7555397
003 IEEE
005 20200413152922.0
006 m eo d
007 cr cn |||m|||a
008 160816s2016 caua foab 001 0 eng d
020 _a9781627055024
_qebook
020 _z9781627059008
_qprint
024 7 _a10.2200/S00716ED1V04Y201604HLT033
_2doi
035 _a(CaBNVSL)swl00406781
035 _a(OCoLC)956738395
040 _aCaBNVSL
_beng
_erda
_cCaBNVSL
_dCaBNVSL
050 4 _aP308
_b.W557 2016
082 0 4 _a418.020285
_223
100 1 _aWilliams, Philip,
_eauthor.
245 1 0 _aSyntax-based statistical machine translation /
_cPhilip Williams, Rico Sennrich, Matt Post, Philipp Koehn.
264 1 _a[San Rafael, California] :
_bMorgan & Claypool,
_c2016.
300 _a1 PDF (xvii, 190 pages) :
_billustrations.
336 _atext
_2rdacontent
337 _aelectronic
_2isbdmedia
338 _aonline resource
_2rdacarrier
490 1 _aSynthesis lectures on human language technologies,
_x1947-4059 ;
_v# 33
538 _aMode of access: World Wide Web.
538 _aSystem requirements: Adobe Acrobat Reader.
500 _aPart of: Synthesis digital library of engineering and computer science.
504 _aIncludes bibliographical references (pages 159-175) and index.
505 0 _a1. Models -- 1.1 Syntactic translation units -- 1.1.1 Phrases -- 1.1.2 Phrases with gaps -- 1.1.3 Phrases with labels -- 1.1.4 Phrases with internal tree structure -- 1.2 Grammar formalisms -- 1.2.1 Context-free grammar -- 1.2.2 Synchronous context-free grammar -- 1.2.3 Synchronous tree-substitution grammar -- 1.2.4 Probabilistic and weighted grammars -- 1.3 Statistical models -- 1.3.1 Generative models -- 1.3.2 Discriminative models -- 1.4 A classification of syntax-based models -- 1.4.1 String-to-string -- 1.4.2 String-to-tree -- 1.4.3 Tree-to-string -- 1.4.4 Tree-to-tree -- 1.5 A brief history of syntax-based SMT --
505 8 _a2. Learning from parallel text -- 2.1 Preliminaries -- 2.2 Hierarchical phrase-based grammar -- 2.2.1 Rule extraction -- 2.2.2 Features -- 2.3 Syntax-augmented grammar -- 2.3.1 Rule extraction -- 2.3.2 Extraction heuristics -- 2.3.3 Features -- 2.4 GHKM -- 2.4.1 Identifying frontier nodes -- 2.4.2 Extracting minimal rules -- 2.4.3 Unaligned source words -- 2.4.4 Composed rules -- 2.4.5 Features -- 2.5 A comparison -- 2.6 Summary --
505 8 _a3. Decoding I: preliminaries -- 3.1 Hypergraphs, forests, and derivations -- 3.1.1 Basic definitions -- 3.1.2 Parse forests -- 3.1.3 Translation forests -- 3.1.4 Derivations -- 3.1.5 Weighted derivations -- 3.2 Algorithms on hypergraphs -- 3.2.1 The topological sort algorithm -- 3.2.2 The Viterbi max-derivation algorithm -- 3.2.3 The CYK max-derivation algorithm -- 3.2.4 The eager and lazy k-best algorithms -- 3.3 Historical notes and further reading --
505 8 _a4. Decoding II: tree decoding -- 4.1 Decoding with local features -- 4.1.1 A basic decoding algorithm -- 4.1.2 Hyperedge bundling -- 4.2 State splitting -- 4.2.1 Adding a bigram language model feature -- 4.2.2 The state-split hypergraph -- 4.2.3 Complexity -- 4.3 Beam search -- 4.3.1 The beam -- 4.3.2 Rest cost estimation -- 4.3.3 Monotonicity redux -- 4.3.4 Exhaustive beam filling -- 4.3.5 Cube pruning -- 4.3.6 Cube growing -- 4.3.7 State refinement -- 4.4 Efficient tree parsing -- 4.5 Tree-to-tree decoding -- 4.6 Historical notes and further reading --
505 8 _a5. Decoding III: string decoding -- 5.1 Basic beam search -- 5.1.1 Parse forest complexity -- 5.2 Faster beam search -- 5.2.1 Constrained width parsing -- 5.2.2 Per-subspan beam search -- 5.3 Handling non-binary grammars -- 5.3.1 Binarization -- 5.3.2 Alternatives to binarization -- 5.4 Interim summary -- 5.5 Parsing algorithms -- 5.5.1 The CYK+ algorithm -- 5.5.2 Trie-based grammar storage -- 5.5.3 The recursive CYK+ algorithm -- 5.6 STSG and distinct-category SCFG -- 5.6.1 STSG -- 5.6.2 Distinct-category SCFG -- 5.7 Historical notes and further reading --
505 8 _a6. Selected topics -- 6.1 Transformations on trees -- 6.1.1 Tree restructuring -- 6.1.2 Tree re-labeling -- 6.1.3 Fuzzy syntax -- 6.1.4 Forest-based approaches -- 6.1.5 Beyond context-free models -- 6.2 Dependency structure -- 6.2.1 Dependency treelet translation -- 6.2.2 String-to-dependency SMT -- 6.3 Improving grammaticality -- 6.3.1 Agreement -- 6.3.2 Subcategorization -- 6.3.3 Morphological structure in synchronous grammars -- 6.3.4 Syntactic language models -- 6.4 Evaluation metrics --
505 8 _a7. Closing remarks -- 7.1 Which approach is best? -- 7.2 What's next? --
505 8 _aA. Open-source tools -- Bibliography -- Authors' biographies -- Author index -- Index.
506 1 _aAbstract freely available; full-text restricted to subscribers or individual document purchasers.
510 0 _aCompendex
510 0 _aINSPEC
510 0 _aGoogle scholar
510 0 _aGoogle book search
520 3 _aThis unique book provides a comprehensive introduction to the most popular syntax-based statistical machine translation models, filling a gap in the current literature for researchers and developers in human language technologies. While phrase-based models have previously dominated the field, syntax-based approaches have proved a popular alternative, as they elegantly solve many of the shortcomings of phrase-based models. The heart of this book is a detailed introduction to decoding for syntax-based models. The book begins with an overview of synchronous context-free grammar (SCFG) and synchronous tree-substitution grammar (STSG) along with their associated statistical models. It also describes how three popular instantiations (Hiero, SAMT, and GHKM) are learned from parallel corpora. It introduces and details hypergraphs and associated general algorithms, as well as algorithms for decoding with both tree and string input. Special attention is given to efficiency, including search approximations such as beam search and cube pruning, data structures, and parsing algorithms. The book consistently highlights the strengths (and limitations) of syntax-based approaches, including their ability to generalize phrase-based translation units, their modeling of specific linguistic phenomena, and their function of structuring the search space.
530 _aAlso available in print.
588 _aTitle from PDF title page (viewed on August 16, 2016).
650 0 _aMachine translating.
650 0 _aTranslating and interpreting
_xData processing.
653 _astatistical machine translation
653 _asyntax
653 _asynchronous grammar formalisms
653 _anatural language processing
653 _acomputational linguistics
653 _amachine learning
653 _astatistical modeling
700 1 _aSennrich, Rico,
_eauthor.
700 1 _aPost, Matt,
_eauthor.
700 1 _aKoehn, Philipp,
_eauthor.
776 0 8 _iPrint version:
_z9781627059008
830 0 _aSynthesis digital library of engineering and computer science.
830 0 _aSynthesis lectures on human language technologies ;
_v# 33.
_x1947-4059
856 4 2 _3Abstract with links to resource
_uhttp://ieeexplore.ieee.org/servlet/opac?bknumber=7555397
999 _c562222
_d562222