Lecture 7.5: Hynek Hermansky - Auditory Perception in Speech Technology, Part 2

TL;DR
The auditory system processes speech by analyzing temporal trajectories of spectral energies, and noise reduction techniques involve using long temporal contexts, sub-band processing, and multi-stream adaptation to unknown noise.
Transcript
The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu. HYNEK HERMANSKY: So we have this wanted information and un...
Key Insights
- The auditory system processes speech by analyzing temporal trajectories of spectral energies; models of this processing draw on equal-loudness curves and critical bands.
- Linear distortions in speech can be mitigated by reducing low frequencies and integrating within critical bands.
Questions & Answers
Q: How does the auditory system process speech?
The auditory system analyzes temporal trajectories of spectral energies to extract speech information. Models of this processing draw on concepts such as equal-loudness curves and critical bands.
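As a rough illustration of what a "temporal trajectory of spectral energy" is, the sketch below sums a power spectrum inside a few frequency bands for every frame and then reads each band's log energy across time. The function name `band_trajectories`, the band edges, and the toy frames are assumptions made for this example; real critical bands follow a perceptual frequency scale that is not modeled here.

```python
import math

def band_trajectories(frames, band_edges):
    """Return one log-energy trajectory (a list over time) per band.

    frames: list of per-frame power spectra (lists of non-negative floats).
    band_edges: list of (start, stop) bin-index pairs; a hypothetical
                stand-in for critical bands.
    """
    trajectories = []
    for start, stop in band_edges:
        # Integrate energy within the band, then compress with a log.
        # The small floor avoids log(0) on silent frames.
        trajectory = [math.log(sum(frame[start:stop]) + 1e-12) for frame in frames]
        trajectories.append(trajectory)
    return trajectories

# Toy input: 2 frames, 4 frequency bins each, grouped into 2 bands.
frames = [[1.0, 1.0, 2.0, 2.0],
          [2.0, 2.0, 4.0, 4.0]]
band_edges = [(0, 2), (2, 4)]
trajectories = band_trajectories(frames, band_edges)
```

Each entry of `trajectories` is the time course of one band, which is the object the later processing steps operate on.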
Q: How does the system handle linear distortions in speech?
Linear distortions, such as differences in vocal tracts, can be addressed by reducing low frequencies and integrating within critical bands.
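One way to see why suppressing slow components helps: a fixed linear distortion scales the energy in a band by a roughly constant gain, which becomes a constant offset in the log domain. The sketch below assumes that "reducing low frequencies" refers to removing the slowest-varying part of each band's temporal trajectory, and it uses plain mean subtraction as the simplest possible version of that idea. The function name and the numbers are illustrative, not taken from the lecture.

```python
import math

def remove_trajectory_mean(log_trajectory):
    """Subtract the time average from one band's log-energy trajectory.

    Assumption: this mean removal stands in for suppressing the
    slowest-varying component of the trajectory.
    """
    mean = sum(log_trajectory) / len(log_trajectory)
    return [value - mean for value in log_trajectory]

# Band energies over four frames, with and without a fixed channel gain.
energies = [1.0, 4.0, 2.0, 8.0]
channel_gain = 0.25  # hypothetical constant gain applied by the channel

clean = [math.log(e) for e in energies]
distorted = [math.log(channel_gain * e) for e in energies]

clean_normalized = remove_trajectory_mean(clean)
distorted_normalized = remove_trajectory_mean(distorted)
```

After normalization the two trajectories coincide, because the constant offset `log(channel_gain)` is removed along with the mean.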
Q: How can noise be mitigated in speech processing?
Noise can be reduced by using sub-band processing, multi-stream adaptation, and selecting the best stream based on performance.
Q: What are some key techniques for noise reduction in speech processing?
Techniques include sub-band processing, multi-stream adaptation, and performance-based stream selection.
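Performance-based stream selection can be sketched as follows. Each stream is assumed to produce per-frame class posteriors, and the average entropy of those posteriors is used as a stand-in performance estimate (lower entropy is read as a more confident stream). The entropy measure, the stream names, and the numbers are assumptions for this example and are not the performance measure described in the lecture.

```python
import math

def mean_entropy(posteriors):
    """Average per-frame entropy of a stream's posterior vectors."""
    total = 0.0
    for frame in posteriors:
        total += -sum(p * math.log(p) for p in frame if p > 0)
    return total / len(posteriors)

def select_stream(streams):
    """Return the name of the stream with the lowest mean entropy.

    streams: dict mapping a stream name to a list of posterior vectors.
    """
    return min(streams, key=lambda name: mean_entropy(streams[name]))

# Two hypothetical sub-band streams; the second is less certain,
# as it might be if noise corrupted that band.
streams = {
    "low_band": [[0.9, 0.1], [0.8, 0.2]],
    "high_band": [[0.5, 0.5], [0.6, 0.4]],
}
best = select_stream(streams)
```

The point of the design is that the selection needs no knowledge of the noise itself, only a per-stream estimate of how well each stream is doing.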
Summary & Key Takeaways
- The auditory system analyzes temporal trajectories of spectral energies to extract speech information. Models of this processing draw on concepts such as equal-loudness curves and critical bands.
- Linear distortions in speech, such as differences in vocal tracts, can be mitigated by reducing low frequencies and integrating within critical bands.
- Dealing with noise in speech processing involves sub-band processing, multi-stream adaptation, and selecting the best stream based on performance.


