INFO
Measures how well a language model predicts a sample; lower values indicate better predictions.
How It Works
Perplexity quantifies how “surprised” a language model is by a sequence of tokens.
For a sequence of tokens $x_1, x_2, \dots, x_N$:

$$\mathrm{PPL}(X) = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\log p(x_i \mid x_{<i})\right)$$

- $p(x_i \mid x_{<i})$: Probability of token $x_i$ given its preceding context
- $N$: Number of tokens in the sequence
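The definition above can be sketched directly in code: given the per-token conditional probabilities a model assigns to a sequence, perplexity is the exponential of the average negative log-probability. The function name and example probabilities below are illustrative, not from any particular library.

```python
import math

def perplexity(token_probs):
    """Perplexity from per-token conditional probabilities p(x_i | x_<i).

    Computes exp of the average negative log-probability over the sequence.
    """
    n = len(token_probs)
    avg_nll = -sum(math.log(p) for p in token_probs) / n
    return math.exp(avg_nll)

# A model that assigns probability 0.25 to every token is, on average,
# as uncertain as a uniform choice among 4 options: perplexity ≈ 4.
print(perplexity([0.25, 0.25, 0.25]))
```

In practice the per-token probabilities come from the language model itself (e.g., the softmax output of a causal LM at each position), and the average is often computed over log-probabilities returned directly by the model rather than raw probabilities, for numerical stability.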
What to Look For
- Lower perplexity = the model assigns higher probability to the text, which generally tracks fluency and grammatical structure
- Sensitive to vocabulary and tokenization, so scores are not directly comparable across models with different tokenizers
- Depends on the evaluation model used (e.g., a GPT-style causal LM, BERT via pseudo-perplexity, or an n-gram model)