research - Hank's blog

[Note] Understand the role of self attention for efficient speech recognition

By Hank Lu in blog on 31 Mar 2022

* Self-attention plays two roles in the success of Transformer-based ASR
* The "attention maps" in the self-attention module can be categorized into two groups
* "phonetic" (vertical) and "linguistic" (diagonal)
* Phonetic: lower layers, extract phonologically meaningful global context
* Linguistic: higher layers, attend to local context
* -> the phonetic variance is standardized in lower…

[Note] PERT: Pre-training BERT with permuted language model

By Hank Lu in blog on 25 Mar 2022

Can we use a pre-training task other than MLM?
* https://arxiv.org/abs/2203.06906
* Proposed: Permuted Language Model (PerLM)
* Input: permute a proportion of the text
* Target: position of the original token
Pretraining LM tasks
* Masked LM
* Whole word masking (wwm):
* alleviates the "input information leaking" issue
* Mask consecutive N-grams
* e.g.…
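The PerLM input/target idea in the note above can be sketched in a few lines of Python. This is only a toy reading of the excerpt, not the paper's implementation: the function name `permute_tokens`, the `proportion` and `seed` parameters, and the choice to sample positions uniformly (rather than by whole words or N-grams) are all assumptions made for illustration.

```python
import random


def permute_tokens(tokens, proportion=0.15, seed=0):
    """Shuffle a proportion of the tokens and record where each one went.

    Returns (permuted, targets), where targets[i] = j means the token that
    was originally at position i now sits at position j in `permuted`.
    Hypothetical helper: position sampling here is uniform, an assumption.
    """
    rng = random.Random(seed)
    n = len(tokens)
    # Pick at least two positions so that a swap is possible at all.
    k = min(n, max(2, round(n * proportion)))
    chosen = sorted(rng.sample(range(n), k))

    # Shuffle the chosen positions among themselves.
    sources = chosen[:]
    rng.shuffle(sources)

    permuted = list(tokens)
    targets = {}
    for dst, src in zip(chosen, sources):
        permuted[dst] = tokens[src]  # the token from `src` moves to `dst`
        targets[src] = dst           # so the original position `src` points at `dst`
    return permuted, targets


if __name__ == "__main__":
    words = ["the", "order", "of", "words", "does", "not", "matter", "much"]
    shuffled, tgt = permute_tokens(words, proportion=0.5, seed=1)
    print(shuffled)
    print(tgt)
```

Unlike MLM, no `[MASK]` token is introduced here: the input is still made of real tokens, and the prediction space is a set of positions in the sequence rather than the vocabulary.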

[Note] wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations

By Hank Lu in blog on 21 Mar 2022

* https://arxiv.org/abs/2006.11477
* Self-supervised speech representation
* contrastive loss: masked continuous speech input -> quantized target
* quantization module: Gumbel softmax (latent representation codebooks)
* wav2vec 2.0 Large with 10 min of data: 5.2/8.6 LS-clean test
* Fairseq
* Well explained: https://neurosys.com/wav2vec-2-0-framework
The Feature Encoder (CNN) takes raw audio…
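As a rough illustration of the "Gumbel softmax over codebooks" step noted above, here is a minimal standard-library sketch of picking one codebook entry from a set of logits. It is an assumption-laden toy: it uses a single codebook, covers the forward pass only (no gradients or straight-through estimator), and the names `gumbel_softmax`, `quantize`, and `tau` are illustrative rather than taken from Fairseq.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=None):
    """Return softmax((logits + Gumbel noise) / tau) as a list of probabilities."""
    if rng is None:
        rng = random.Random(0)
    noisy = []
    for logit in logits:
        # Clip u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # sample from Gumbel(0, 1)
        noisy.append((logit + gumbel) / tau)
    # Numerically stable softmax.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def quantize(logits, codebook, tau=1.0, rng=None):
    """Pick the codebook entry with the highest Gumbel-softmax probability."""
    probs = gumbel_softmax(logits, tau=tau, rng=rng)
    index = max(range(len(probs)), key=probs.__getitem__)
    return index, codebook[index]


if __name__ == "__main__":
    toy_codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]  # hypothetical entries
    print(quantize([0.1, 2.0, 0.3], toy_codebook, tau=0.5, rng=random.Random(42)))
```

A lower `tau` makes the distribution sharper, so the sampled choice behaves more like a hard argmax; a higher `tau` keeps the choice closer to uniform.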

[Note] Improving CTC-based speech recognition via knowledge transferring from pre-trained language models

By Hank Lu in blog on 17 Mar 2022

https://arxiv.org/abs/2203.03582
Motivation
* CTC-based models are always weaker than AED models and require the assistance of an external LM.
* Conditional independence assumption
* Hard to utilize contextualized information
Proposed
* Transfer the knowledge of pretrained language models (BERT, GPT-2) to a CTC-based ASR model. No inference speed reduction, only use…

Facebook Hate Speech Detection

By Hank Lu in blog on 25 Jun 2021

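Wherever there are people, there will be malicious speech. Facebook, the world's largest social platform, used to hire review teams to inspect content manually, and in recent years has also begun bringing in AI systems to assist with detection; the BERT family of models, which has generated so much excitement in NLP, plays a key role. This article was written jointly by 黃偉愷 and Ke-Han Lu as the final report for the course "Business Value of Artificial Intelligence and Big Data". We surveyed recent developments in Facebook's hate speech detection along two main directions:
* Facebook Hate Speech Detection: background, plus a policy-level discussion of how FB reviews and defines hate speech, and how AI systems affect FB today
* Facebook BERT-based System: a technical introduction to what makes BERT-based models appealing and how they work
Facebook Hate Speech Detection
Background
Mark Zuckerberg, the founder of Facebook, once said: "Facebook was created with the idea of building a global community and deepening the connections between people, …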

How to Read a Paper

By Hank Lu in blog on 23 Apr 2020

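Before the regular coursework started this semester, my professor shared some articles on "how to read a paper". For a student just entering the research field, reading the literature is very important; using the right method saves a lot of effort and keeps you from becoming a grad student drowning in a sea of papers.…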