Phonological Parsing in Speech Recognition by K. Church

It is well known that phonemes have different acoustic realizations depending on the context. Thus, for example, the phoneme /t/ is typically realized with a heavily aspirated strong burst at the beginning of a syllable, as in the word Tom, but without a burst at the end of a syllable in a word like cat. Variation such as this is often considered to be problematic for speech recognition:

(1) In most systems for sentence recognition, such modifications must be viewed as a kind of 'noise' that makes it more difficult to hypothesize lexical candidates given an input phonetic transcription. To see that this must be the case, we note that each phonological rule [in a certain example] results in irreversible ambiguity--the phonological rule does not have a unique inverse that could be used to recover the underlying phonemic representation for a lexical item. For example, . . . schwa vowels could be the first vowel in a word like 'about' or the surface realization of almost any English vowel appearing in a sufficiently destressed word. The tongue flap [ɾ] could have come from a /t/ or a /d/. [65, pp. 548-549]

This view of allophonic variation is representative of much of the speech recognition literature, especially during the late 1970s. One can find similar statements by Cole and Jakimik [22] and by Jelinek [50].
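
The "no unique inverse" point in the quoted passage can be made concrete with a small sketch. The rule table below is a toy assumption invented for illustration (the context names and surface labels are hypothetical, not the book's rule set): it maps underlying phonemes to surface phones, then inverts the mapping to show that a flap leads back to more than one phoneme.

```python
from collections import defaultdict

# Toy forward rules: (underlying phoneme, context, surface phone).
# Contexts and surface labels are hypothetical, chosen only to mirror
# the /t/ and /d/ examples in the text above.
FORWARD_RULES = [
    ("t", "syllable-initial", "aspirated_t"),
    ("t", "syllable-final", "unreleased_t"),
    ("t", "flapping-context", "flap"),
    ("d", "flapping-context", "flap"),
]


def invert(rules):
    """Map each surface phone to the set of phonemes that can produce it."""
    inverse = defaultdict(set)
    for phoneme, _context, surface in rules:
        inverse[surface].add(phoneme)
    return dict(inverse)


INVERSE = invert(FORWARD_RULES)


def candidates(surface):
    """Return the possible underlying phonemes for a surface phone, sorted."""
    return sorted(INVERSE.get(surface, set()))


print(candidates("flap"))         # ['d', 't'] : two sources, so no unique inverse
print(candidates("aspirated_t"))  # ['t']      : a single source
```

In this toy table the flap is ambiguous between /t/ and /d/, which is the sense in which the rule is said to be irreversible; the aspirated variant, by contrast, has only one source.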

Table of Contents

1. Introduction.- 1.1 Historical Background and Problem Statement.- 1.1.1 Simplifications to the General Speech Understanding Problem.- 1.1.2 Major Components of a CSR System.- 1.2 Allophonic Constraints are Useful.- 1.2.1 Did you hit it to Tom?.- 1.2.1.1 Phonological Rules: A Source of Noise?.- 1.2.1.2 Phonological Rules: A Source of Constraint.- 1.2.1.3 Redundant Constraints.- 1.2.2 Plant some more tulips.- 1.2.3 We make all of our children.- 1.3 Problems with Rewrite-Rules.- 1.4 Trends Toward Larger Constituents.- 1.4.1 Trends in Speech Research.- 1.4.1.1 Advantage 1: Improved Performance due to Sharing.- 1.4.1.2 Advantage 2: Lexicon is Free of Allophones.- 1.4.2 Trends in Linguistic Theory.- 1.4.2.1 The 'Classical' Position.- 1.4.2.2 Metrical Foot Structure.- 1.4.2.2.1 Metrical Foot Structure and Stress.- 1.4.2.2.2 Is Foot Structure Just a Notational Variant?.- 1.5 Parsing and Matching.- 1.5.1 Parsing.- 1.5.2 Matching.- 1.5.2.1 Canonicalization.- 1.5.2.2 The Final Result.- 1.6 Summary.- 1.7 Outline of What's To Come
2. Representation of Segments.- 2.1 Stevens' Theory of Invariant Features.- 2.2 Our Position.- 2.3 What's New.- 2.3.1 Use of Parsing Constraints.- 2.3.2 Decompositional View of Segments.- 2.4 Motivations for Representing Phonetic Distinctions.- 2.4.1 Capture Phonetic Constraints on Suprasegmental Constituents.- 2.4.2 Challenge to the Theory of Invariant Features.- 2.4.2.1 Example I: /t, d/.- 2.4.2.2 Example II: Nasals.- 2.4.2.3 Example III: /t, k/.- 2.5 Capturing Generalizations.- 2.5.1 Allophonic Proliferation.- 2.5.2 Consistency in Transcriptions.- 2.5.3 Consistency Among Dictionaries.- 2.6 Summary
3. Allophonic Rules.- 3.1 Flapping and Syllable Level Generalizations.- 3.1.1 Modification 1: Optional Consonant.- 3.1.2 Modification 2: Word Boundaries.- 3.1.3 Modification 3: Stress.- 3.2 Non-Linear Formulations of Flapping.- 3.2.1 Kahn's Ambi-Syllabicity.- 3.2.2 Metrical Foot Structure.- 3.2.3 Differences between Syllable and Foot Structure.- 3.2.3.1 Word Boundaries.- 3.2.3.2 Tri-Syllabic Feet.- 3.2.4 The Optional Consonant (Revisited).- 3.2.5 Interaction with Morphology.- 3.2.5.1 #-Prefixes.- 3.2.5.2 #-Suffixes.- 3.2.6 Summary of Argument for Non-Linear Analysis.- 3.3 Implementation Difficulties and the Lexical Expansion Solution.- 3.3.1 Generative in Nature.- 3.3.2 Restrictions on Allophonic Grammars
4. An Alternative: Phrase-Structure Rules.- 4.1 PS Trees Bear More Fruit Than You Would Have Thought.- 4.2 The Constituency Hypothesis.- 4.2.1 Foot-Internal Rules.- 4.2.2 Non-Foot-Initial Rules.- 4.2.3 Foot-Initial Rules.- 4.2.4 Re-organization of the Grammar.- 4.3 Advantages of Phrase-Structure Formulation.- 4.3.1 Robustness.- 4.3.2 Efficiency.- 4.3.3 The Chicken or Egg Paradox.- 4.4 Summary
5. Parser Implementation.- 5.1 An Introduction to Chart Parsing.- 5.2 Representation Issues.- 5.2.1 The Chart Tends to be Sparse.- 5.2.2 Taking Advantage of the Sparseness.- 5.3 A Parser Based on Matrix Operations.- 5.3.1 Concatenation and Union.- 5.3.2 Optionality.- 5.3.3 Transitive Closure.- 5.4 No Recursion.- 5.5 Order of Evaluation.- 5.6 Feature Manipulation.- 5.7 Additional Lattice Operations.- 5.7.1 Over-generate and Filter.- 5.7.2 Context Primitives.- 5.7.3 Garbage Collection.- 5.8 Debugging Capabilities.- 5.9 Summary
6. Phonotactic Constraints.- 6.1 The Affix Position.- 6.2 The Length Restriction.- 6.3 The Sonority Hierarchy.- 6.3.1 Exceptions to the Sonority Hierarchy.- 6.4 Practical Applications of Phonotactic Constraints.- 6.4.1 The Sonority Bound on Syllable Ambiguity.- 6.4.2 Removing Redundancy From the Lexicon.- 6.4.2.1 A Note on Liquids and Glides.- 6.5 Summary
7. When Phonotactic Constraints are Not Enough.- 7.1 Basic Principles.- 7.1.1 Maximize Onsets and Stress Resyllabification.- 7.1.2 Morphology.- 7.1.3 Maximize Onsets.- 7.1.4 Stress Resyllabification.- 7.2 Against Stress Resyllabification.- 7.2.1 Alternative 1: No Resyllabification.- 7.2.2 Alternative 2: Limited Stress Resyllabification.- 7.2.3 Alternative 3: Vowel Resyllabification.- 7.2.4 Summary of Alternatives to Stress Resyllabification.- 7.3 Practical Applications of Vowel Resyllabification.- 7.3.1 Alternative Points of View.- 7.3.2 Vowel Class and Allophonic Cues are Often Redundant.- 7.3.3 Reduction of Branching Factor.- 7.3.4 Schwas Enrich the Speech Signal.- 7.3.5 Stronger Form of Vowel Resyllabification.- 7.4 Automatic Syllabification of Lexicons.- 7.5 Summary
8. Robustness Issues.- 8.1 Alternatives in the Input Lattice.- 8.2 Problems for Parsing.- 8.2.1 Allophonic Constraints on an Impoverished Lattice.- 8.2.2 Phonotactic Constraints Aren't Helping Either.- 8.3 Relaxing Phonological Distinctions.- 8.4 Conservation of Distinctive Features.- 8.4.1 Deletion in Homorganic Nasal Clusters.- 8.4.2 Deletion in Fricative Clusters.- 8.4.3 Tunafish Sandwich.- 8.5 Probabilistic Methods.- 8.5.1 Problems with Relaxing Distinctions.- 8.5.2 The Similarity of Probabilistic and Categorical Approaches.- 8.6 Distinctive Features.- 8.6.1 Modeling Confusions at the Segmental Level.- 8.6.2 A Comparison of Seneff's Performance with BBN's Front End.- 8.6.3 Practical Application of Confusion Matrices.- 8.6.4 Modelling Confusions at the Distinctive Feature Level.- 8.6.5 Feature Integration.- 8.6.6 Practical Applications.- 8.7 Summary
9. Conclusion.- 9.1 Review of the Standard Position.- 9.2 Review of Nakatani's Position.- 9.3 Review of the Constituency Hypothesis.- 9.4 Review of Phonotactic Constraints.- 9.5 Comparison with Syntactic Notions of Constituency.- 9.6 Contributions
References
Appendix I. The Organization of the Lexicon.- I.1. Linear Representation and Linear Search.- I.2. Non-Recursive Discrimination Networks.- I.3. Recursive Discrimination Networks.- I.4. Hash Tables Based on Equivalence Class Abstractions.- I.5. Shipman and Zue.- I.6. Morse Code.- I.7. Selecting the Appropriate Gross Classification.- I.8. Summary
Appendix II. Don't Depend Upon Syntax and Semantics.- II.1. Higher Level vs. Lower Level Constraints.- II.2. Too Much Dependence in the Past.- II.3. How Much Can Higher Constraints Help?.- II.4. Detraction from the Important Low-Level Issues.- II.5. New Directions: Recognition without Understanding.- II.6. Lower-Level Constraints Bear More Fruit.- II.7. Summary
Appendix III. Lexical Phonology.- III.1. Difference Between + and #.- III.2. Pipeline Design.- III.3. Distinctions Between Lexical and Postlexical Rules.- III.4. Which Rules are Lexical and Which are Postlexical?.- III.5. The Implementation of Lexical and Postlexical Rules
Appendix IV. A Sample Grammar
Appendix V. Sample Lexicon
Appendix VI. Sample Output
