Skip to content

Chroma Features

Chroma is the representation that makes harmony analysis possible.

It folds the whole spectrum into 12 pitch-class bins: C, C♯, D, ... B. Each frame then says "how much C energy, how much C♯ energy, ..." regardless of octave.

Key detection and chord recognition both read from it. This page expands on the harmony section of MIR Overview.

Pitch class: forgetting the octave

A pitch class groups every octave of a note together. C2, C3, and C4 are different pitches but the same pitch class, "C." Western music has 12 pitch classes, one per semitone. Chroma takes the energy spread across all frequencies in an STFT frame and sums it into those 12 bins — a C played in any octave lands in the same bin.

This deliberate forgetting is the point. Harmony is about which notes, not which octave: a C-major chord is a C-major chord whether it is voiced low or high. Folding octaves together gives a clean, 12-dimensional fingerprint of the harmony at each moment.

Chromagram: chroma over time

Stack a chroma vector for every frame and you get a chromagram — a 12-row image showing how pitch-class energy moves through the song. Sustained chords appear as steady horizontal bands; a key change shifts which bins stay lit. Visualizers draw it directly; analyzers read patterns out of it.

CHROMA · PITCH CLASSIDLE
Chromagram — harmony folded into 12 bins

Every frequency is folded onto one of twelve pitch classes, so octave is forgotten and only the harmony remains. This clip walks a C–Am–F–G turnaround: watch the lit rows shift as each chord changes, then play to follow the progression.

Why chroma is right for harmony — and wrong for melody

QuestionUse chroma?Why
What key is this?YesOverall pitch-class distribution reveals the tonal center
What chord is playing now?YesA frame's chroma matches a chord template
What octave is the bass in?NoOctave is folded away
What is the lead melody note?NoUse pitch tracking; chroma cannot separate the line
What instrument is this?NoTimbre needs spectral shape; use MFCC

Matching the representation to the question is the recurring MIR skill: chroma trades octave and timbre detail away precisely to make harmony clear.

How libsonare builds chroma

libsonare derives chroma from the STFT magnitude by mapping each frequency bin onto its pitch class and accumulating energy into 12 bins, with tuning and normalization handling so off-A440 material still aligns. An NNLS (non-negative least squares) chroma variant can reduce the influence of overtones, giving cleaner profiles for chord work. Chroma feeds the Krumhansl-Schmuckler key estimator and the template-matching chord recognizer, so improving the chroma stage improves both downstream features at once.

Related: MIR Overview, Key Detection, Chord Recognition, Spectrogram and STFT