[Note] PERT: Pre-training BERT with permuted language model

Can we use a pre-training task other than MLM?
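PERT's answer is a permuted language model (PerLM): instead of masking tokens, a fraction of input positions are shuffled and the model predicts the original position of each shuffled token. The sketch below is a hypothetical illustration of that data-preparation step (function name, ratio, and the `-100` ignore-label convention are assumptions, not from the paper):

```python
import random

def make_perlm_example(tokens, permute_ratio=0.15, rng=None):
    """Sketch of a PerLM-style training example: shuffle a fraction
    of positions; at each shuffled slot the label is the original
    index of the token now occupying it."""
    rng = rng or random.Random(0)
    n = len(tokens)
    k = max(2, int(n * permute_ratio))        # need >= 2 slots to permute
    slots = sorted(rng.sample(range(n), k))   # positions chosen for shuffling
    shuffled = slots[:]
    while shuffled == slots:                  # ensure a non-identity permutation
        rng.shuffle(shuffled)
    permuted = list(tokens)
    labels = [-100] * n                       # -100 = position ignored in the loss
    for dst, src in zip(slots, shuffled):
        permuted[dst] = tokens[src]
        labels[dst] = src                     # target: original position index
    return permuted, labels

tokens = "we can use tasks other than mlm for pretraining".split()
permuted, labels = make_perlm_example(tokens)
```

Unshuffled positions keep an ignore label, so the loss is computed only over the permuted slots, analogous to how MLM scores only the masked positions.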