Masked Language Model

A pre-training objective in which the model learns to predict randomly hidden (masked) words in a sentence, building a deep understanding of language structure and semantics.

Masked language modeling is the training objective behind BERT (Bidirectional Encoder Representations from Transformers), introduced by Google in 2018. During training, 15% of tokens are randomly masked and the model predicts them from context. This bidirectional approach — reading both left and right context — gave BERT breakthrough performance on NLP benchmarks, improving state-of-the-art results on 11 tasks simultaneously. The technique remains the foundation of encoder-based models used for classification, search, and information extraction.
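As a concrete illustration, here is a minimal Python sketch of the masking step described above. The function name, example sentence, and whitespace tokenization are all illustrative; production BERT operates on WordPiece subwords, and of the selected tokens only 80% are replaced with [MASK] (10% get a random token and 10% are left unchanged), a detail this sketch skips.

```python
import random

MASK_TOKEN = "[MASK]"  # BERT's special placeholder token

def mask_tokens(tokens, mask_prob=0.15, seed=1):
    """Corrupt a token sequence for masked language modeling.

    Each position is independently selected with probability
    `mask_prob` (~15%, as in BERT); selected tokens are recorded
    as prediction targets and overwritten with [MASK]. The model
    is then trained to recover the targets from the surrounding
    bidirectional context.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    targets = {}
    for i, token in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = token          # what the model must predict
            corrupted[i] = MASK_TOKEN   # what the model actually sees
    return corrupted, targets

sentence = "the cat sat on the mat because it was tired".split()
corrupted, targets = mask_tokens(sentence)
print(corrupted)  # the sentence with some tokens replaced by [MASK]
print(targets)    # {position: original token} pairs used as labels
```

During pre-training, the model's loss is computed only on the masked positions, so it must use the unmasked left and right context to fill in each gap.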
