
Spoken Language Processing Xuedong Huang


€13.99
Condition: Very Good
Only 1 left

Summary

This title is a guide to building systems that interact with the user via speech as well as other modalities. It discusses the fundamentals of speech recognition, text-to-speech, and dialogue processing as components of a spoken language system.

Spoken Language Processing Summary

Spoken Language Processing: A Guide to Theory, Algorithm and System Development Xuedong Huang

Remarkable progress is being made in spoken language processing, but many powerful techniques have remained hidden in conference proceedings and academic papers, inaccessible to most practitioners. In this book, the leaders of the Speech Technology Group at Microsoft Research share these advances, presenting not just the latest theory but practical techniques for building commercially viable products.

KEY TOPICS: Spoken Language Processing draws upon the latest advances and techniques from multiple fields: acoustics, phonology, phonetics, linguistics, semantics, pragmatics, computer science, electrical engineering, mathematics, syntax, psychology, and beyond. The book begins by presenting essential background on speech production and perception, probability and information theory, and pattern recognition. The authors demonstrate how to extract useful information from the speech signal, then present a variety of contemporary speech recognition techniques, including hidden Markov models, acoustic and language modeling, and techniques for improving resistance to environmental noise. Coverage includes decoders, search algorithms, large vocabulary speech recognition techniques, text-to-speech, spoken language dialog management, user interfaces, and interaction with non-speech interface modalities. The authors also present detailed case studies based on Microsoft's advanced prototypes, including the Whisper speech recognizer, Whistler text-to-speech system, and MiPad handheld computer.

MARKET: For anyone involved with planning, designing, building, or purchasing spoken language technology.

About Xuedong Huang

XUEDONG HUANG is founder and head of the Speech Technology Group at Microsoft Research. He received his Ph.D. from the University of Edinburgh. He is an IEEE Fellow. ALEX ACERO and HSIAO-WUEN HON are Senior Researchers at Microsoft Research and Senior Members of IEEE. Both received doctorates from Carnegie Mellon University. Foreword by Dr. Raj Reddy, Carnegie Mellon University

Table of Contents

(NOTE: Each chapter ends with Historical Perspective and Further Reading.)

1. Introduction. Motivations. Spoken Language System Architecture. Book Organization. Target Audiences.

I. FUNDAMENTAL THEORY.
2. Spoken Language Structure. Sound and Human Speech Systems. Phonetics and Phonology. Syllables and Words. Syntax and Semantics.
3. Probability, Statistics, and Information Theory. Probability Theory. Estimation Theory. Significance Testing. Information Theory.
4. Pattern Recognition. Bayes' Decision Theory. How to Construct Classifiers. Discriminative Training. Unsupervised Estimation Methods. Classification and Regression Trees.

II. SPEECH PROCESSING.
5. Digital Signal Processing. Digital Signals and Systems. Continuous-Frequency Transforms. Discrete-Frequency Transforms. Digital Filters and Windows. Digital Processing of Analog Signals. Multirate Signal Processing. Filterbanks. Stochastic Processes.
6. Speech Signal Representations. Short-Time Fourier Analysis. Acoustical Model of Speech Production. Linear Predictive Coding. Cepstral Processing. Perceptually Motivated Representations. Formant Frequencies. The Role of Pitch.
7. Speech Coding. Speech Coders Attributes. Scalar Waveform Coders. Scalar Frequency Domain Coders. Code Excited Linear Prediction (CELP). Low-Bit Rate Speech Coders.

III. SPEECH RECOGNITION.
8. Hidden Markov Models. The Markov Chain. Definition of the Hidden Markov Model. Continuous and Semicontinuous HMMs. Practical Issues in Using HMMs. HMM Limitations.
9. Acoustic Modeling. Variability in the Speech Signal. How to Measure Speech Recognition Errors. Signal Processing: Extracting Features. Phonetic Modeling: Selecting Appropriate Units. Acoustic Modeling: Scoring Acoustic Features. Adaptive Techniques: Minimizing Mismatches. Confidence Measures: Measuring the Reliability. Other Techniques. Case Study: Whisper.
10. Environmental Robustness. The Acoustical Environment. Acoustical Transducers. Adaptive Echo Cancellation (AEC). Multimicrophone Speech Enhancement. Environment Compensation Preprocessing. Environment Model Adaptation. Modeling Nonstationary Noise.
11. Language Modeling. Formal Language Theory. Stochastic Language Models. Complexity Measure of Language Models. N-Gram Smoothing. Adaptive Language Models. Practical Issues.
12. Basic Search Algorithms. Basic Search Algorithms. Search Algorithms for Speech Recognition. Language Model States. Time-Synchronous Viterbi Beam Search. Stack Decoding (A* Search).
13. Large-Vocabulary Search Algorithms. Efficient Manipulation of a Tree Lexicon. Other Efficient Search Techniques. N-Best and Multipass Search Strategies. Search-Algorithm Evaluation. Case Study: Microsoft Whisper.

IV. TEXT-TO-SPEECH SYSTEMS.
14. Text and Phonetic Analysis. Modules and Data Flow. Lexicon. Document Structure Detection. Text Normalization. Linguistic Analysis. Homograph Disambiguation. Morphological Analysis. Letter-to-Sound Conversion. Evaluation. Case Study: Festival.
15. Prosody. The Role of Understanding. Prosody Generation Schematic. Speaking Style. Symbolic Prosody. Duration Assignment. Pitch Generation. Prosody Markup Languages. Prosody Evaluation.
16. Speech Synthesis. Attributes of Speech Synthesis. Formant Speech Synthesis. Concatenative Speech Synthesis. Prosodic Modification of Speech. Source-Filter Models for Prosody Modification. Evaluation of TTS Systems.

V. SPOKEN LANGUAGE SYSTEMS.
17. Spoken Language Understanding. Written vs. Spoken Languages. Dialog Structure. Semantic Representation. Sentence Interpretation. Discourse Analysis. Dialog Management. Response Generation and Rendition. Evaluation. Case Study: Dr. Who.
18. Applications and User Interfaces. Application Architecture. Typical Applications. Speech Interface Design. Internationalization. Case Study: MiPad.

Index.

Additional Information

GOR007716838
ISBN-13: 9780130226167
ISBN-10: 0130226165
Title: Spoken Language Processing: A Guide to Theory, Algorithm and System Development
Author: Xuedong Huang
Condition: Used - Very Good
Binding: Paperback
Publisher: Pearson Education (US)
Publication date: 2001-05-03
Pages: 1008
N/A
The image of the book is for illustration purposes only; the actual binding, cover, and edition may differ.
This is a used book. It has been read before and shows signs of wear from previous use. We expect it to be in very good condition overall. If you are not completely satisfied, however, please contact us.