[Note] Understand the role of self attention for efficient speech recognition
- Self-attention plays two roles in the success of Transformer-based ASR
- The “attention maps” in the self-attention modules can be categorized into two groups:
- “phonetic” (vertical) and “linguistic” (diagonal); see the sketch after this list
- Phonetic: lower layers; extracts phonologically meaningful features from global context
- Linguistic: higher layers; attends to local context
- The phonetic variance is standardized in the lower SA layers so that the upper SA layers can identify local linguistic features.
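
As a rough illustration (not the paper's own metric), the sketch below labels an attention map as diagonal ("linguistic") or vertical ("phonetic") by measuring how much attention mass lies near the main diagonal. The function names, band width, and threshold are assumptions made for this example.

```python
import numpy as np

def diagonality(attn: np.ndarray, band: int = 2) -> float:
    """Fraction of attention mass lying within `band` frames of the diagonal."""
    T = attn.shape[0]
    near_diag = np.abs(np.arange(T)[:, None] - np.arange(T)[None, :]) <= band
    return float(attn[near_diag].sum() / attn.sum())

def classify_map(attn: np.ndarray, threshold: float = 0.5) -> str:
    """Label a (T, T) attention map as 'linguistic' (diagonal) or 'phonetic' (vertical)."""
    return "linguistic (diagonal)" if diagonality(attn) > threshold else "phonetic (vertical)"

# Toy maps: one concentrated on the diagonal, one on a single column (frame 3)
T = 20
diag_map = 0.8 * np.eye(T) + np.full((T, T), 0.2 / T)  # local, diagonal attention
vert_map = np.full((T, T), 0.01)
vert_map[:, 3] = 1.0
vert_map /= vert_map.sum(axis=1, keepdims=True)        # global, column-like attention

print(classify_map(diag_map))  # -> linguistic (diagonal)
print(classify_map(vert_map))  # -> phonetic (vertical)
```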