The Incredible Evolution of NLP Models
In the example above, the masked words were hidden during training, and the model had to predict them. To do this, BERT had to understand the context of the sentence from the unmasked words and use it to predict the masked ones. Failing to produce the expected words, such as Red and Good, would indicate that the training had not worked.
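To make the masking idea concrete, here is a minimal sketch in plain Python. It is not BERT's actual implementation: the sentence, the helper names `mask_tokens` and `masked_accuracy`, and the hard-coded predictions are all hypothetical, chosen only for illustration. The positions to mask are also fixed by hand here, whereas BERT selects them at random during pretraining.

```python
MASK = "[MASK]"  # placeholder token that hides a word from the model


def mask_tokens(tokens, positions):
    """Hide the tokens at the given positions.

    Returns the masked sentence and a dict mapping each masked
    position to the original (hidden) word.
    """
    masked = list(tokens)
    labels = {}
    for i in positions:
        labels[i] = tokens[i]
        masked[i] = MASK
    return masked, labels


def masked_accuracy(predictions, labels):
    """Fraction of masked positions where the predicted word matches the hidden one."""
    if not labels:
        return 0.0
    correct = sum(
        1
        for i, word in labels.items()
        if predictions.get(i, "").lower() == word.lower()
    )
    return correct / len(labels)


# Hypothetical sentence, not the one from the article's example.
tokens = "the apple is red and tastes good".split()
masked, labels = mask_tokens(tokens, [3, 6])
print(" ".join(masked))

# Stand-in for a model's guesses: one right, one wrong.
predictions = {3: "red", 6: "bad"}
print(masked_accuracy(predictions, labels))
```

During training, a score like this (in practice a loss computed over the masked positions) tells us how well the model is using the surrounding words. A model that keeps missing the hidden words has not yet learned the context.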