[Note] InterAug: Augmenting Noisy Intermediate Predictions for CTC-based ASR

https://arxiv.org/abs/2204.00174

  • A novel training method for CTC-based ASR using augmented intermediate representations for conditioning
    • an extension of self-conditioned CTC
  • Methods: noisy conditioning
    • feature space: mask along the time or feature dimension
    • token space: insert, delete, or substitute tokens in the “condition” (the intermediate prediction fed back into the encoder)
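
The two kinds of noise listed above can be sketched in plain Python. This is only an illustration under assumptions, not the paper's implementation: the function names `augment_tokens` and `mask_features`, the per-token probability `p`, and the fixed mask `width` are all hypothetical. The sketch also lets insert and delete change the sequence length, whereas a real conditioning path needs one token per frame; how that is handled is not covered in these notes.

```python
import random


def augment_tokens(tokens, vocab_size, p=0.1, mode="substitute", rng=None):
    """Corrupt a token sequence with one kind of noise (hypothetical helper).

    Each token is hit independently with probability p (an assumed scheme).
    """
    if rng is None:
        rng = random.Random()
    out = []
    for tok in tokens:
        if rng.random() >= p:
            out.append(tok)  # keep the token unchanged
        elif mode == "substitute":
            out.append(rng.randrange(vocab_size))  # replace with a random token
        elif mode == "insert":
            out.append(tok)
            out.append(rng.randrange(vocab_size))  # add a random token after it
        elif mode == "delete":
            continue  # drop the token
        else:
            raise ValueError(f"unknown mode: {mode}")
    return out


def mask_features(frames, width, axis="time", rng=None):
    """Zero one contiguous span of `width` along time (rows) or feature (columns)."""
    if rng is None:
        rng = random.Random()
    n_time, n_feat = len(frames), len(frames[0])
    size = n_time if axis == "time" else n_feat
    width = min(width, size)
    start = rng.randrange(size - width + 1)
    masked = [row[:] for row in frames]  # copy so the input is left untouched
    for t in range(n_time):
        for f in range(n_feat):
            idx = t if axis == "time" else f
            if start <= idx < start + width:
                masked[t][f] = 0.0
    return masked
```

For example, `augment_tokens(pred, vocab_size, p=0.1, mode="substitute")` would corrupt roughly 10% of an intermediate prediction `pred` before it is used for conditioning.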

Results

  • feature masking seems ineffective
  • token substitution performs the best
    • the substitutions contained many blank <-> non-blank swaps, which act as a combination of insertion and deletion errors
  • masking latent features causes excessive loss of information
  • the proposed method obtains the effect of augmentation stably
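
The blank <-> non-blank observation above can be checked against the standard CTC collapse rule (merge repeated tokens, then drop blanks). The toy frame sequence and the `_` blank symbol below are made up for illustration; the sketch only shows why such a substitution looks like an insertion or a deletion after collapsing.

```python
from itertools import groupby

BLANK = "_"  # assumed blank symbol for this toy example


def ctc_collapse(frames):
    """Standard CTC decoding rule: merge repeats, then remove blanks."""
    merged = [tok for tok, _ in groupby(frames)]
    return [tok for tok in merged if tok != BLANK]


frames = list("_aa_b_")
clean = ctc_collapse(frames)  # ['a', 'b']

# Substituting a blank frame with a non-blank token behaves like an insertion.
blank_to_token = frames[:]
blank_to_token[3] = "c"
inserted = ctc_collapse(blank_to_token)  # ['a', 'c', 'b']

# Substituting a non-blank frame with a blank behaves like a deletion.
token_to_blank = frames[:]
token_to_blank[4] = BLANK
deleted = ctc_collapse(token_to_blank)  # ['a']
```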