Evaluates the quality of machine translation by comparing output with reference translations.
How It Works
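BLEU measures modified n-gram precision: the fraction of n-grams in the candidate translation that also appear in the reference translation, with each n-gram credited at most as many times as it occurs in the reference.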
It includes a brevity penalty to discourage overly short outputs.
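$$\text{BLEU} = BP \cdot \exp\left( \sum_{n=1}^{N} w_n \log p_n \right)$$
- $p_n$: Precision of n-grams of size $n$
- $w_n$: Weight for each n-gram level (typically uniform, $w_n = 1/N$)
- $BP$: Brevity penalty to penalize short outputs; it is 1 when the candidate length $c$ exceeds the reference length $r$, and $e^{1 - r/c}$ otherwise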
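The terms above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes whitespace-tokenized input, a single reference, uniform weights, and no smoothing, and the function names (`ngrams`, `modified_precision`, `brevity_penalty`, `bleu`) are hypothetical helpers chosen for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision p_n for one candidate and one reference."""
    cand_counts = ngrams(candidate, n)
    ref_counts = ngrams(reference, n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Credit each candidate n-gram at most as often as it occurs in the reference.
    matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return matched / total


def brevity_penalty(cand_len, ref_len):
    """BP = 1 if the candidate is longer than the reference, else exp(1 - r/c)."""
    if cand_len == 0:
        return 0.0
    if cand_len > ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / cand_len)


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU with uniform weights w_n = 1 / max_n (assumed), no smoothing."""
    precisions = [modified_precision(candidate, reference, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # log(0) is undefined; unsmoothed BLEU is 0 in this case
    log_mean = sum(math.log(p) for p in precisions) / max_n
    return brevity_penalty(len(candidate), len(reference)) * math.exp(log_mean)


reference = "the cat is on the mat".split()
candidate = "the cat sat on the mat".split()
print(bleu(candidate, reference, max_n=2))
```

With `max_n=2`, this candidate matches 5 of 6 unigrams and 3 of 5 bigrams, so the score is the geometric mean of those two precisions (about 0.707) with no brevity penalty. For real evaluations, an established library such as sacreBLEU or NLTK is usually the safer choice, since tokenization and smoothing choices change the score.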
What to Look For
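- Higher BLEU = closer n-gram overlap with the reference (scores range from 0 to 1, often reported as 0 to 100)
- Sensitive to exact word matches, not semantics: synonyms and valid paraphrases count as misses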
- Best for machine translation, but also used in summarization and captioning