Artificial Intelligence & Machine Learning
Long Short-Term Memory (LSTM)
Definition
Long Short-Term Memory (LSTM) is a special kind of Recurrent Neural Network (RNN) capable of learning long-term dependencies. It uses a set of learned "gates" to control what information is added to, kept in, or removed from its internal memory (the cell state).
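To make the gate mechanics concrete, here is a minimal sketch of a single LSTM step in NumPy. The parameter names (W_f, b_f, and so on) and the dimensions are illustrative assumptions, not a reference implementation; real frameworks fuse the four matrix multiplies into one.

```python
# Minimal LSTM cell step (illustrative sketch; names and shapes are assumptions).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step: gates decide what to forget, write, and emit."""
    z = np.concatenate([h_prev, x])  # previous hidden state + current input

    f = sigmoid(params["W_f"] @ z + params["b_f"])        # forget gate: what to erase from memory
    i = sigmoid(params["W_i"] @ z + params["b_i"])        # input gate: what new info to write
    o = sigmoid(params["W_o"] @ z + params["b_o"])        # output gate: what to expose
    c_tilde = np.tanh(params["W_c"] @ z + params["b_c"])  # candidate memory content

    c = f * c_prev + i * c_tilde  # update internal memory (cell state)
    h = o * np.tanh(c)            # new hidden state (the cell's output)
    return h, c

# Tiny usage example with random weights.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
params = {}
for g in ("f", "i", "o", "c"):
    params[f"W_{g}"] = rng.normal(scale=0.1, size=(n_hid, n_hid + n_in))
    params[f"b_{g}"] = np.zeros(n_hid)

h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):  # a sequence of 5 inputs
    h, c = lstm_step(x, h, c, params)
print(h.shape, c.shape)  # (4,) (4,)
```

Note that the cell state c is updated additively (f * c_prev + i * c_tilde) rather than being pushed through a squashing nonlinearity at every step; that design choice is what lets gradients survive over long sequences.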
Why It Matters
LSTMs were a major breakthrough in sequence modeling. They mitigate the vanishing gradient problem of standard RNNs, because the cell state is updated additively rather than through repeated matrix multiplication, allowing them to retain information over long spans. This is crucial for tasks like language translation and speech recognition.
Contextual Example
In a long paragraph, an LSTM can remember the subject mentioned at the beginning of the paragraph and use that information to understand a pronoun used at the very end.
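As a hedged sketch of how this looks in practice, the PyTorch snippet below runs an (untrained) LSTM over a long token sequence; the final hidden state is the vector that, after training, could carry early-sequence information such as the subject. The vocabulary size, dimensions, and random tokens are illustrative assumptions.

```python
# Running an LSTM over a long "paragraph" of tokens (illustrative sketch).
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab_size, embed_dim, hidden_dim = 100, 16, 32

embed = nn.Embedding(vocab_size, embed_dim)
lstm = nn.LSTM(input_size=embed_dim, hidden_size=hidden_dim, batch_first=True)

# A toy sequence of 200 token ids; imagine the subject appears at position 0.
tokens = torch.randint(0, vocab_size, (1, 200))

outputs, (h_n, c_n) = lstm(embed(tokens))
# h_n is the hidden state after the last token; a trained model could use it
# to resolve a pronoun at the end using information from the beginning.
print(outputs.shape)  # torch.Size([1, 200, 32])
print(h_n.shape)      # torch.Size([1, 1, 32])
```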
Common Misunderstandings
- GRUs (Gated Recurrent Units) are a similar but slightly simpler gated architecture, and the two are often conflated; see the parameter-count sketch after this list.
- While still used, LSTMs have been largely surpassed by the Transformer architecture for many state-of-the-art NLP tasks.
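One way to see that a GRU is "slightly simpler": it has three gates to the LSTM's four, so for the same layer sizes it needs roughly three-quarters of the parameters. A quick check, assuming PyTorch and arbitrary dimensions:

```python
# Comparing LSTM and GRU parameter counts (dimensions are arbitrary).
import torch.nn as nn

def n_params(module):
    return sum(p.numel() for p in module.parameters())

lstm = nn.LSTM(input_size=64, hidden_size=128)
gru = nn.GRU(input_size=64, hidden_size=128)

print("LSTM params:", n_params(lstm))  # 99328 = 4 gates' worth of weights
print("GRU params: ", n_params(gru))   # 74496 = 3 gates' worth of weights
```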