Artificial Intelligence & Machine Learning

BERT

Definition

BERT (Bidirectional Encoder Representations from Transformers) is a language representation model introduced by Google in 2018. It is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. Pre-training uses two unsupervised objectives: masked language modeling and next-sentence prediction.
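As an illustration only (not part of the original definition), the sketch below loads a pre-trained BERT checkpoint through the Hugging Face transformers library, assuming it and PyTorch are installed, and extracts contextual token representations; "bert-base-uncased" is one commonly used checkpoint.

```python
# A minimal sketch, assuming the Hugging Face `transformers` library and PyTorch
# are installed; "bert-base-uncased" is one commonly used pre-trained checkpoint.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# The same word ("bank") receives different vectors depending on its surroundings,
# because every layer attends to both the left and the right of each token.
sentences = ["The bank approved the loan.", "We sat on the bank of the river."]
inputs = tokenizer(sentences, padding=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.last_hidden_state has shape (batch, sequence_length, hidden_size);
# each row is a bidirectional, context-dependent representation of one token.
print(outputs.last_hidden_state.shape)
```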

Why It Matters

BERT marked a major breakthrough in Natural Language Processing. Because it conditions on both the left and right context of every token, it captures language context more deeply than earlier unidirectional or shallowly bidirectional models, and it set state-of-the-art results on a wide range of NLP tasks when it was released.
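As a hedged illustration of that bidirectional context (not from the original entry), the snippet below uses the Hugging Face fill-mask pipeline, assuming the transformers library is installed: BERT's masked-language-modeling head must draw on the words on both sides of the [MASK] token to rank its guesses.

```python
# A minimal sketch, assuming the Hugging Face `transformers` library is installed.
from transformers import pipeline

# BERT was pre-trained to fill in masked tokens, which forces it to use the
# words on BOTH sides of the blank, not just the left-hand context.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# The right-hand context ("to Paris for the conference") shapes the predictions here.
for prediction in fill_mask("The team [MASK] to Paris for the conference."):
    print(f"{prediction['token_str']!r}: {prediction['score']:.3f}")
```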

Contextual Example

The pre-trained BERT model can be quickly fine-tuned for tasks like sentiment analysis or question answering, typically by adding a small task-specific output layer on top of the pre-trained encoder, and achieves high accuracy with relatively little task-specific training data.
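As a rough sketch of that fine-tuning workflow (the two-example "dataset" and the hyperparameters below are placeholders, not from the original entry), the code attaches a classification head to a pre-trained BERT encoder with Hugging Face transformers and runs a single gradient step on a toy sentiment batch; a real run would iterate over many labeled batches.

```python
# A minimal fine-tuning sketch, assuming `transformers` and PyTorch are installed;
# the toy dataset and hyperparameters are placeholders for illustration only.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# A small task-specific classification head is added on top of the pre-trained encoder.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

texts = ["I loved this movie.", "This was a waste of time."]
labels = torch.tensor([1, 0])  # 1 = positive, 0 = negative
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
# One illustrative gradient step; real fine-tuning loops over the full dataset.
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
print(f"training loss: {outputs.loss.item():.4f}")
```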

Common Misunderstandings

  • BERT is not a generative model. It is an "encoder-only" Transformer, which makes it particularly strong at understanding tasks (classification, extraction, question answering) rather than free-form text generation.
  • BERT is not itself a modern Large Language Model, but it was a key milestone in the development of today's LLMs.

Related Terms

Transformer, Attention Mechanism, Masked Language Modeling, Fine-tuning, Word Embedding, GPT, Large Language Model (LLM)

Last Updated: December 17, 2025