
JavaScript/TypeScript API Reference

Complete API reference for the libsonare JavaScript/TypeScript interface.

Overview

libsonare provides audio analysis, mastering, mixing, and editing DSP capabilities for web applications. The npm package is the WebAssembly build, and most analysis/effect functions work on decoded Float32Array PCM. For loading, the Audio.fromMemory* factories can decode encoded bytes in memory (a native WASM decoder for WAV/MP3, plus an optional browser fallback for AAC/OGG/FLAC).

| Category | Functions | Use Cases |
| --- | --- | --- |
| Quick Analysis | detectBpm, detectKey, detectBeats | DJ apps, music players, beat sync |
| Full Analysis | analyze, analyzeWithProgress | Music production, song metadata |
| Audio Effects | hpss, timeStretch, pitchShift, spectralEdit | Remixing, practice tools, region repair |
| Features | melSpectrogram, chroma, mfcc | ML input, visualization |
| Mastering | masterAudio, masteringChain, StreamingMasteringChain | LUFS targets, true-peak limiting, presets, streaming chains |
| Mixing | mixStereo, Mixer, mixingScenePresetNames | Stem mixing, routing, automation, meters |
| Editing DSP | pitchCorrectToMidi, noteStretch, spectralEdit, voiceChange, StreamingRetune, RealtimeVoiceChanger | Vocal tuning, note edits, pitch/formant changes |
| Audio Class | Audio.fromBuffer, Audio.fromMemory, Audio.fromMemoryWithBrowserFallback | OOP wrapper for all functions |

Terminology

New to audio analysis? See the Glossary for explanations of terms like BPM, STFT, Chroma, and more.

Most functions take decoded PCM, not a file path

Most browser functions do not take an MP3 or WAV path; they take decoded PCM samples plus sampleRate. To go from encoded bytes to samples, either decode with the Web Audio API (AudioContext.decodeAudioData) yourself, or use the Audio.fromMemory / Audio.fromMemoryWithBrowserFallback factories below — they decode encoded bytes in memory (native WASM decoder for WAV/MP3, optional browser fallback for AAC/OGG/FLAC) and hand back an Audio instance.

For a cross-binding feature map, see Feature Map. For the complete mastering processor registry and mixing scene format, see Mastering Processors and Mixing Scene JSON.

How To Read This Reference

Read this page in three passes:

  1. Start with Pick The Smallest API That Solves The Job and choose one function family.
  2. Read only the section for that family, then run one recipe from Examples.
  3. Come back to the full type definitions when you need exact return shapes, optional parameters, or runtime parity.

For browser apps, keep the core rule in mind: initialize WASM with await init(), decode files to PCM first, then pass Float32Array samples plus the original sampleRate.
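
The "decode to PCM first" step often ends with a multi-channel buffer, while the analysis functions below document mono input. A minimal sketch of a downmix, assuming a decoded buffer shaped like the browser's AudioBuffer; the DecodedAudio interface and downmixToMono are hypothetical names, not part of libsonare:

```typescript
// Minimal shape of a decoded buffer; the browser's AudioBuffer satisfies it.
interface DecodedAudio {
  numberOfChannels: number;
  length: number;
  sampleRate: number;
  getChannelData(channel: number): Float32Array;
}

// Average all channels into a single mono Float32Array.
function downmixToMono(buffer: DecodedAudio): Float32Array {
  const mono = new Float32Array(buffer.length);
  for (let ch = 0; ch < buffer.numberOfChannels; ch++) {
    const data = buffer.getChannelData(ch);
    for (let i = 0; i < buffer.length; i++) {
      mono[i] += data[i] / buffer.numberOfChannels;
    }
  }
  return mono;
}
```

The result can then be passed together with buffer.sampleRate, for example detectBpm(downmixToMono(audioBuffer), audioBuffer.sampleRate).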

Pick The Smallest API That Solves The Job

The package is broad, so start from the task rather than the function list:

| You need | Start with | Why |
| --- | --- | --- |
| One tempo/key/beat value for a track | detectBpm, detectKey, detectBeats | Fast, direct answers without building the full analysis object |
| Metadata for a whole song | analyze or the focused analyze* helpers | analyze gives the common summary; focused helpers expose more detail |
| A live visualizer or progressive BPM/key/chord UI | StreamAnalyzer | Processes blocks and drains frame buffers for UI rendering |
| Browser mastering or delivery preview | masterAudio*, masteringChain*, StreamingMasteringChain | Use presets first, then move to named processors when you need control |
| Stem balance, sends, buses, or meters | mixStereo or Mixer | One-shot mix first; persistent scene mixer when routing matters |
| Vocal/note/spectral edits | pitchCorrectToMidi, noteStretch, spectralEdit, voiceChange, StreamingRetune, RealtimeVoiceChanger | Editing DSP changes the signal rather than analyzing it |
| Room decay, clarity, equivalent-room estimates, or generated room character | analyzeImpulseResponse, detectAcoustic, estimateRoom, synthesizeRir, roomMorph | These describe or apply the recording space, not the music |

Installation

bash
# npm
npm install @libraz/libsonare
# yarn
yarn add @libraz/libsonare
# pnpm
pnpm add @libraz/libsonare

Import

typescript
import {
  init,
  Audio,
  detectBpm,
  detectKey,
  detectBeats,
  detectOnsets,
  analyze,
  analyzeWithProgress,
  version
} from '@libraz/libsonare';

Initialization

init(options?)

Initialize the WASM module. Must be called before any analysis functions.

typescript
async function init(options?: {
  locateFile?: (path: string, prefix: string) => string;
}): Promise<void>

Example:

typescript
import { init, detectBpm } from '@libraz/libsonare';

// Basic initialization
await init();

// With custom file location
await init({
  locateFile: (path, prefix) => `/custom/wasm/path/${path}`
});
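
When the WASM assets are hosted under a single directory, the locateFile callback can be produced by a small factory. This is a sketch under assumptions: makeLocateFile is a hypothetical helper, and the asset file name used in the note below is illustrative only.

```typescript
// Hypothetical helper: build a locateFile callback that resolves every
// requested asset against one base URL and ignores the default prefix.
function makeLocateFile(baseUrl: string): (path: string, prefix: string) => string {
  const base = baseUrl.endsWith('/') ? baseUrl : `${baseUrl}/`;
  return (path, _prefix) => `${base}${path}`;
}
```

Under these assumptions, await init({ locateFile: makeLocateFile('/static/wasm') }) would resolve a requested asset such as "sonare.wasm" to "/static/wasm/sonare.wasm".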

isInitialized()

Check if the module is initialized.

typescript
function isInitialized(): boolean

version()

Get the library version.

typescript
function version(): string  // e.g., "1.4.1"

projectAbiVersion()

ABI version of the project/editing POD surface used by Project serialization, bounce, and realtime-engine clip exchange.

typescript
function projectAbiVersion(): number

voiceChangerAbiVersion()

ABI version of the realtime voice-changer POD config used by native and FFI APIs. This is separate from the preset JSON schemaVersion, which is currently 1. Validate user-authored presets with validateRealtimeVoiceChangerPresetJson(...) before accepting them.

typescript
function voiceChangerAbiVersion(): number

Voice Preset Accessors

Use these when you need the canonical voice-character preset ID or the resolved flat POD config without parsing preset JSON.

typescript
function voiceCharacterPresetId(preset: VoicePresetId | number): string | null
function realtimeVoiceChangerPresetConfig(preset: VoicePresetId | number): RealtimeVoiceChangerPodConfig | null

Realtime environment helpers

These helpers describe the runtime capabilities used by RealtimeEngine. Use them before wiring AudioWorklet/SharedArrayBuffer paths, especially when the page may run under different browser isolation policies.

typescript
function engineAbiVersion(): number
function engineCapabilities(): {
  engineAbiVersion: number;
  expectedEngineAbiVersion: number;
  abiCompatible: boolean;
  sharedArrayBuffer: boolean;
  atomics: boolean;
  audioWorklet: boolean;
  mode: 'sab' | 'postMessage';
}
function hasFfmpegSupport(): boolean

hasFfmpegSupport() reports whether the loaded build can decode through FFmpeg. The browser/WASM npm package works on decoded PCM, so it normally returns false there; Python/native builds are the intended place to decode files directly.
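
One way to act on the capability report before wiring a realtime path is a small decision function. The sketch below uses only a subset of the documented engineCapabilities() fields; the policy and the names EngineCaps, RealtimePlan, and planRealtimePath are assumptions for illustration, not library behavior.

```typescript
// Subset of the engineCapabilities() shape documented above.
interface EngineCaps {
  abiCompatible: boolean;
  sharedArrayBuffer: boolean;
  atomics: boolean;
  audioWorklet: boolean;
  mode: 'sab' | 'postMessage';
}

type RealtimePlan = 'worklet-sab' | 'worklet-postMessage' | 'offline-only';

// Hypothetical policy: require a compatible ABI and AudioWorklet for any
// realtime path, and use shared memory only when every prerequisite is present.
function planRealtimePath(caps: EngineCaps): RealtimePlan {
  if (!caps.abiCompatible || !caps.audioWorklet) return 'offline-only';
  const canShare = caps.sharedArrayBuffer && caps.atomics && caps.mode === 'sab';
  return canShare ? 'worklet-sab' : 'worklet-postMessage';
}
```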

Analysis Functions

detectBpm(samples, sampleRate)

Detect BPM (tempo) from audio samples.

Use Cases

  • DJ Software: Match tempos between tracks for seamless mixing
  • Music Players: Display tempo information, auto-generate playlists by tempo
  • Fitness Apps: Match music to workout intensity
  • Beat Sync: Synchronize visualizations or animations to music
typescript
function detectBpm(samples: Float32Array, sampleRate?: number): number
| Parameter | Type | Description |
| --- | --- | --- |
| samples | Float32Array | Mono audio samples (range -1.0 to 1.0) |
| sampleRate? | number | Sample rate in Hz (default: 22050; e.g., 44100) |

Always pass the real sample rate

Although sampleRate is optional here (defaulting to 22050 Hz), decoded browser audio is almost always 44100 or 48000 Hz. Pass the buffer's actual audioBuffer.sampleRate, or the reported BPM will be wrong. Unlike detectBpm, the other analysis functions (detectKey, detectBeats, analyze) make sampleRate required for exactly this reason.

Returns: Detected BPM as a number.

typescript
const bpm = detectBpm(samples, sampleRate);
console.log(`BPM: ${bpm}`);  // "BPM: 120"
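
Tempo estimators in general can report half or double the felt tempo, so applications such as DJ tools often fold the value into a preferred range and derive the beat period from it. A minimal sketch; foldBpm, beatSeconds, and the 80 to 160 default range are assumptions, not part of libsonare:

```typescript
// Fold a tempo into [minBpm, maxBpm) by doubling or halving.
// Returns the input unchanged when the arguments cannot be folded safely.
function foldBpm(bpm: number, minBpm = 80, maxBpm = 160): number {
  if (!Number.isFinite(bpm) || bpm <= 0 || minBpm <= 0 || maxBpm < 2 * minBpm) {
    return bpm;
  }
  let folded = bpm;
  while (folded < minBpm) folded *= 2;
  while (folded >= maxBpm) folded /= 2;
  return folded;
}

// Duration of one beat in seconds.
function beatSeconds(bpm: number): number {
  return 60 / bpm;
}
```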

detectKey(samples, sampleRate)

Detect musical key from audio samples. Returns the root note (C, D, E...) and mode (major/minor).

Use Cases

  • Harmonic Mixing: DJs match keys for smooth transitions (Camelot wheel)
  • Transposition: Suggest key changes to match vocal range
  • Music Recommendation: Find songs in compatible keys
  • Practice Tools: Display key for musicians to play along
typescript
function detectKey(samples: Float32Array, sampleRate: number): Key

Returns: Key object

typescript
interface Key {
  root: PitchClass;      // 0-11 (C=0, B=11)
  mode: Mode;            // Major, Minor, or modal value; see Mode enum
  confidence: number;    // 0.0 to 1.0
  name: string;          // "C major", "A minor"
  shortName: string;     // "C", "Am"
}

const KeyProfile = {
  KrumhanslSchmuckler: 0,
  Temperley: 1,
  Shaath: 2,
  FaraldoEDMT: 3,
  FaraldoEDMA: 4,
  FaraldoEDMM: 5,
  BellmanBudge: 6,
} as const;
typescript
const key = detectKey(samples, sampleRate);
console.log(`Key: ${key.name}`);        // "C major"
console.log(`Confidence: ${(key.confidence * 100).toFixed(1)}%`);
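
For the harmonic-mixing use case, a key's root and mode can be mapped to a Camelot wheel code. This sketch assumes root is the 0-11 pitch class documented above and that the caller has already reduced the mode to 'major' or 'minor'; the numeric values of the Mode enum are not shown in this section, so that mapping is left out. camelotCode and SimpleMode are hypothetical names.

```typescript
type SimpleMode = 'major' | 'minor';

// Camelot code for a key: numbers advance by one per perfect fifth,
// "B" marks major keys and "A" marks their relative minors.
function camelotCode(root: number, mode: SimpleMode): string {
  // A minor key shares its number with the relative major, 3 semitones up.
  const majorRoot = mode === 'minor' ? (root + 3) % 12 : root % 12;
  // Multiplying by 7 (mod 12) converts semitones above C into fifths above C.
  const fifthsFromC = (majorRoot * 7) % 12;
  // C major sits at 8B on the wheel.
  const wheelNumber = ((fifthsFromC + 7) % 12) + 1;
  return `${wheelNumber}${mode === 'minor' ? 'A' : 'B'}`;
}
```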

detectBeats(samples, sampleRate)

Detect beat times from audio samples. Returns exact timestamps of each beat.

Use Cases

  • Music Visualization: Trigger effects on each beat
  • Rhythm Games: Generate note charts from audio
  • Video Editing: Auto-cut to the beat
  • Loop Creation: Find perfect loop points
typescript
function detectBeats(samples: Float32Array, sampleRate: number): Float32Array

Returns: Float32Array of beat times in seconds

typescript
const beats = detectBeats(samples, sampleRate);
console.log(`Found ${beats.length} beats`);
for (let i = 0; i < beats.length; i++) {
  console.log(`Beat ${i + 1}: ${beats[i].toFixed(3)}s`);
}
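
For use cases such as cutting video or choosing loop points on the beat, a timestamp usually needs to be snapped to the closest detected beat. A minimal sketch using binary search, assuming the returned beat times are sorted in ascending order; nearestBeatIndex is a hypothetical helper:

```typescript
// Index of the beat closest to timeSec, or -1 when there are no beats.
function nearestBeatIndex(beats: Float32Array, timeSec: number): number {
  if (beats.length === 0) return -1;
  let lo = 0;
  let hi = beats.length - 1;
  // Find the first beat at or after timeSec (or the last beat).
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (beats[mid] < timeSec) lo = mid + 1;
    else hi = mid;
  }
  // Prefer the previous beat when it is at least as close.
  if (lo > 0 && timeSec - beats[lo - 1] <= beats[lo] - timeSec) return lo - 1;
  return lo;
}
```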

detectOnsets(samples, sampleRate)

Detect onset times (note attacks) from audio samples. More granular than beats - captures every note/hit.

Use Cases

  • Drum Transcription: Detect individual drum hits
  • Audio-to-MIDI: Convert audio to note events
  • Sample Slicing: Automatically segment audio at transients
typescript
function detectOnsets(samples: Float32Array, sampleRate: number): Float32Array

Returns: Float32Array of onset times in seconds
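
For sample slicing, onset times in seconds have to become sample ranges. The sketch below converts each onset into a [start, end) pair that ends at the next onset, assuming onsets are sorted ascending; onsetsToSlices is a hypothetical helper:

```typescript
// Turn onset times (seconds) into [startSample, endSample) ranges.
function onsetsToSlices(
  onsets: Float32Array,
  sampleRate: number,
  totalSamples: number,
): Array<[number, number]> {
  const slices: Array<[number, number]> = [];
  for (let i = 0; i < onsets.length; i++) {
    const start = Math.min(totalSamples, Math.max(0, Math.round(onsets[i] * sampleRate)));
    const end = i + 1 < onsets.length
      ? Math.min(totalSamples, Math.max(start, Math.round(onsets[i + 1] * sampleRate)))
      : totalSamples;
    if (end > start) slices.push([start, end]); // skip empty ranges
  }
  return slices;
}
```

Each pair can be used with samples.subarray(start, end) to obtain a view of the slice without copying.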

analyze(samples, sampleRate) Heavy

Perform complete music analysis. Returns BPM, key, beats, chords, sections, timbre, and more.

Use Cases

  • Music Library Management: Auto-tag songs with metadata
  • Music Production: Analyze reference tracks
  • DJ Preparation: Get all track info at once
  • Music Education: Study song structure

Performance

This is the heaviest API. For long audio files (>3 minutes), consider using analyzeWithProgress to show progress, or analyze only relevant segments.

typescript
function analyze(samples: Float32Array, sampleRate: number): AnalysisResult

Returns: Complete AnalysisResult. A single analyze() call returns the full result — chords, sections, timbre, dynamics, rhythm, melody, form, and per-beat strength — on every binding, so you rarely need the focused helpers unless you only want one field.

typescript
const result = analyze(samples, sampleRate);
console.log(`BPM: ${result.bpm}`);
console.log(`Key: ${result.key.name}`);
console.log(`Chords: ${result.chords.length}`);
console.log(`Form: ${result.form}`);  // e.g., "IABABCO"

analyzeWithProgress(samples, sampleRate, onProgress) Heavy

Perform complete music analysis with progress reporting.

typescript
function analyzeWithProgress(
  samples: Float32Array,
  sampleRate: number,
  onProgress: (progress: number, stage: string) => void
): AnalysisResult

Progress Stages:

| Stage | Description | Progress |
| --- | --- | --- |
| "features" | Feature precomputation | 0.0 |
| "bpm" | BPM detection | 0.15 |
| "key" | Key detection | 0.15 |
| "beats" | Beat tracking | 0.25 |
| "chords" | Chord recognition | 0.40 |
| "sections" | Section detection | 0.55 |
| "timbre" | Timbre analysis | 0.70 |
| "dynamics" | Dynamics analysis | 0.80 |
| "rhythm" | Rhythm analysis | 0.90 |
| "melody" | Melody contour extraction | 0.95 |
| "complete" | Finished | 1.0 |
typescript
const result = analyzeWithProgress(samples, sampleRate, (progress, stage) => {
  console.log(`${stage}: ${Math.round(progress * 100)}%`);
});
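
The stage names in the table can be turned into user-facing status text. A minimal sketch; the label map copies the descriptions listed above, while describeProgress and its fallback for unknown stage names are assumptions:

```typescript
// Stage descriptions copied from the progress table.
const STAGE_LABELS: Record<string, string> = {
  features: 'Feature precomputation',
  bpm: 'BPM detection',
  key: 'Key detection',
  beats: 'Beat tracking',
  chords: 'Chord recognition',
  sections: 'Section detection',
  timbre: 'Timbre analysis',
  dynamics: 'Dynamics analysis',
  rhythm: 'Rhythm analysis',
  melody: 'Melody contour extraction',
  complete: 'Finished',
};

// Format one progress callback as "<label>: <percent>%".
function describeProgress(progress: number, stage: string): string {
  const label = STAGE_LABELS[stage] ?? stage; // fall back to the raw stage name
  const clamped = Math.min(1, Math.max(0, progress));
  return `${label}: ${Math.round(clamped * 100)}%`;
}
```

It can be passed straight through as the callback body, for example (progress, stage) => statusEl.textContent = describeProgress(progress, stage), where statusEl is whatever element the page uses for status text.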

Focused analysis helpers

One call is usually enough

analyze() already returns chords, sections, timbre, dynamics, rhythm, melody, form, and per-beat strength. Reach for a focused helper only when you want a single field or need options the high-level call hides.

Use the focused helpers when the default analyze(...) result is either too broad or not detailed enough. They share the same mono Float32Array input model but expose options that are hidden by the high-level call.

| Task | Function | Notes |
| --- | --- | --- |
| Downbeat/bar starts | detectDownbeats(samples, sampleRate) | Returns seconds for likely bar starts. Pair with detectBeats for grid displays. |
| Ranked key candidates | detectKeyCandidates(samples, sampleRate, options?) | Useful when the top key is ambiguous or when you want profile/mode filtering. |
| Detailed tempo candidates | analyzeBpm(samples, sampleRate, ...) | Returns the best BPM plus alternate candidates and tempo evidence. |
| Rhythm character | analyzeRhythm(samples, sampleRate, ...) | Reports groove, syncopation, and regularity style features. |
| Dynamics | analyzeDynamics(samples, sampleRate, ...) | Dynamic range, loudness range, crest factor, and compression flag. |
| Timbre | analyzeTimbre(samples, sampleRate, ...) | Brightness, warmth, density, roughness, and complexity. |
| Chords | detectChords(samples, sampleRate, options?) | Returns { chords } of chord segments; options include HMM smoothing, key context, inversions, and chromaMethod: 'stft' or 'nnls'. |
| Sections | analyzeSections(samples, sampleRate, ...) | Song-structure sections such as intro, verse, chorus, bridge, and outro. Long inputs keep accurate start / end times even when the internal boundary grid is pooled. |
| Melody | analyzeMelody(samples, sampleRate, ...) | Monophonic melody contour based on pitch tracking. |
typescript
const keys = detectKeyCandidates(samples, sampleRate, {
  modes: [Mode.Major, Mode.Minor],
  profile: 'krumhansl',
  genreHint: 'pop',
});

const { chords } = detectChords(samples, sampleRate, {
  useHmm: true,
  useKeyContext: true,
  keyRoot: keys[0].key.root,
  keyMode: keys[0].key.mode,
  chromaMethod: 'nnls',
});

const sections = analyzeSections(samples, sampleRate);

chordFunctionalAnalysis(samples, keyRoot, keyMode, sampleRate?, options?)

Functional (Roman-numeral) harmonic analysis of the detected chord progression, relative to the given key. It runs chord detection internally and labels each detected chord, so pass the same keyRoot/keyMode you get from detectKey(...) and the same options you would give detectChords(...).

typescript
function chordFunctionalAnalysis(
  samples: Float32Array,
  keyRoot: PitchClass,
  keyMode: Mode,
  sampleRate?: number,
  options?: ChordDetectionOptions,
): string[]   // one Roman-numeral label per detected chord, e.g. ["I", "IV", "V", "vi"]
typescript
const key = detectKey(samples, sampleRate);
const roman = chordFunctionalAnalysis(samples, key.root, key.mode, sampleRate);
console.log(roman);  // e.g. ["I", "IV", "V", "vi"]

detectKey(...) and detectKeyCandidates(...) accept the same KeyDetectionOptions, which include:

| Option group | Values |
| --- | --- |
| Controls | modes, profile, genreHint, useHpss, loudnessWeighted, highPassHz |
| Profile names | ks, krumhansl, temperley, shaath, keyfinder, faraldo-edmt / edmt, faraldo-edma / edma, faraldo-edmm / edmm, bellman-budge / bellman |
| Genre hints | auto, edm, electronic, dance, pop, classical, jazz |

Room Acoustics

These functions describe or apply the recording space rather than the song itself.

| Goal | Use |
| --- | --- |
| Measure a clean impulse response | analyzeImpulseResponse(...) |
| Estimate room decay from ordinary audio | detectAcoustic(...) |
| Fit a practical room model from audio | estimateRoom(...) |
| Create a mono room impulse response from dimensions | synthesizeRir(...) |
| Add a target-room character as an effect | roomMorph(...) |

RIR and room morphing

RIR means room impulse response: samples that describe how a room reacts to a short sound. roomMorph(...) is a creative effect, not dereverberation.

typescript
const ir = analyzeImpulseResponse(impulseResponseSamples, sampleRate, 6);
console.log(ir.rt60, ir.edt, ir.c50, ir.c80, ir.confidence);

const blind = detectAcoustic(roomRecording, sampleRate, {
  nOctaveBands: 6,
  nThirdOctaveSubbands: 24,
  minDecayDb: 30,
  noiseFloorMarginDb: 10,
});
console.log(blind.isBlind, blind.rt60Bands);

const estimate = estimateRoom(roomRecording, sampleRate, {
  referenceAbsorption: 0.15,
  nOctaveBands: 6,
});
console.log(estimate.volume, estimate.length, estimate.width, estimate.height);
console.log(estimate.drrDb, estimate.confidence, estimate.absorptionBands);

const rir = synthesizeRir({ lengthM: 7, widthM: 5, heightM: 3, absorption: 0.2 });
console.log(rir.sampleRate, rir.rir.length, rir.hasError);

const morphed = roomMorph(samples, sampleRate, { lengthM: 12, widthM: 9, heightM: 4, wet: 0.6 });

See Room Acoustics for how to interpret RT60, EDT, C50, C80, D50, band arrays, room estimates, generated RIRs, and confidence.
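
As a rough plausibility check on room dimensions before calling synthesizeRir(...), the textbook Sabine formula gives an expected RT60 for a rectangular room. This is an independent sketch, not the model libsonare uses; it assumes absorption is a single average absorption coefficient, and sabineRt60 is a hypothetical helper.

```typescript
interface RoomDims {
  lengthM: number;
  widthM: number;
  heightM: number;
  absorption: number; // assumed: average absorption coefficient, 0 < a <= 1
}

// Sabine estimate: RT60 = 0.161 * V / (S * a), in seconds (metric units).
function sabineRt60(room: RoomDims): number {
  const { lengthM: l, widthM: w, heightM: h, absorption } = room;
  const volume = l * w * h;
  const surface = 2 * (l * w + l * h + w * h);
  return (0.161 * volume) / (surface * absorption);
}
```

For the 7 x 5 x 3 m room with absorption 0.2 used above, this gives roughly 0.6 s.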

Audio Effects

hpss(samples, sampleRate, kernelHarmonic?, kernelPercussive?) Heavy

Harmonic-Percussive Source Separation. Splits audio into tonal (vocals, synths) and transient (drums) components.

Use Cases

  • Remixing: Isolate drums or remove them
  • Karaoke: Reduce drums by keeping the harmonic output; vocals are tonal and stay in the harmonic component, so HPSS alone does not remove them
  • Better Analysis: Use harmonic-only for cleaner chord detection
  • Drum Extraction: Get just the percussion for sampling
Interactive demo (Signal · Harmonics): waveform and spectrum, showing where harmonics come from. The top panel is the wave in time; the bottom is its spectrum. A sine has only its fundamental, while a saw stacks every harmonic and a square only the odd ones.

Performance

HPSS requires STFT computation and median filtering. Processing time scales with audio duration.

typescript
function hpss(
  samples: Float32Array,
  sampleRate: number,
  kernelHarmonic?: number,    // default: 31
  kernelPercussive?: number   // default: 31
): HpssResult

interface HpssResult {
  harmonic: Float32Array;
  percussive: Float32Array;
  sampleRate: number;
}
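
For remixing, the two components can be recombined with separate gains, for example to pull the drums down or push them up. A minimal sketch that assumes both arrays are aligned sample for sample; blendHpss is a hypothetical helper:

```typescript
// Recombine harmonic and percussive components with independent linear gains.
function blendHpss(
  harmonicPart: Float32Array,
  percussivePart: Float32Array,
  harmonicGain: number,
  percussiveGain: number,
): Float32Array {
  const n = Math.min(harmonicPart.length, percussivePart.length);
  const out = new Float32Array(n);
  for (let i = 0; i < n; i++) {
    out[i] = harmonicGain * harmonicPart[i] + percussiveGain * percussivePart[i];
  }
  return out;
}
```

With an HpssResult named result, blendHpss(result.harmonic, result.percussive, 1, 0.25) would keep the tonal layer and attenuate the percussion by about 12 dB.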

harmonic(samples, sampleRate) Heavy

Extract harmonic component from audio.

typescript
function harmonic(samples: Float32Array, sampleRate: number): Float32Array

percussive(samples, sampleRate) Heavy

Extract percussive component from audio.

typescript
function percussive(samples: Float32Array, sampleRate: number): Float32Array

timeStretch(samples, sampleRate, rate) Heavy

Time-stretch audio without changing pitch. Rate < 1.0 = slower, > 1.0 = faster.

Use Cases

  • Practice Tools: Slow down music to learn difficult passages
  • DJ Mixing: Match tempos between tracks
  • Podcast Editing: Speed up/slow down speech
  • Music Production: Fit samples to project tempo
Interactive demo (Param Sweep · Time Stretch): time stretching changes how long the audio lasts while leaving the pitch alone. Below a rate of 1.0 the clip slows down and grows longer; above 1.0 it speeds up and shrinks, while the spectrum barely moves.

Performance

Uses a phase vocoder algorithm. Processing time increases with audio duration.

typescript
function timeStretch(
  samples: Float32Array,
  sampleRate: number,
  rate: number   // 0.5 = half speed, 2.0 = double speed
): Float32Array
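
For tempo matching, the rate follows from the two tempos: with rate > 1.0 meaning faster, the rate is target BPM divided by source BPM. A sketch with hypothetical helper names; the output length is an approximation, since the exact frame handling of the implementation is not documented here:

```typescript
// Rate that moves audio from sourceBpm to targetBpm.
function stretchRateForTempo(sourceBpm: number, targetBpm: number): number {
  if (!(sourceBpm > 0) || !(targetBpm > 0)) {
    throw new RangeError('BPM values must be positive');
  }
  return targetBpm / sourceBpm;
}

// Approximate output length in samples for a given rate.
function stretchedLength(inputSamples: number, rate: number): number {
  if (!(rate > 0)) throw new RangeError('rate must be positive');
  return Math.round(inputSamples / rate);
}
```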

pitchShift(samples, sampleRate, semitones) Heavy

Pitch-shift audio without changing duration. Measured in semitones (+12 = one octave up).

Use Cases

  • Key Matching: Transpose songs to match for mixing
  • Vocal Tuning: Correct or adjust vocal pitch
  • Creative Effects: Create harmonies, chipmunk/deep voice effects
  • Instrument Practice: Transpose to comfortable key

Performance

Combines time stretching and resampling. Processing time increases with audio duration.

typescript
function pitchShift(
  samples: Float32Array,
  sampleRate: number,
  semitones: number   // +12 = one octave up
): Float32Array
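
The semitone argument relates to frequency through the equal-temperament ratio 2^(n/12). For key matching, the smallest shift between two pitch classes is usually the one to apply. A sketch with hypothetical helper names, assuming roots are the 0-11 pitch classes used by Key.root:

```typescript
// Frequency ratio for a shift of n semitones in equal temperament.
function semitonesToRatio(semitones: number): number {
  return Math.pow(2, semitones / 12);
}

// Smallest signed shift (-5..+6 semitones) that moves fromRoot onto toRoot.
function shortestTranspose(fromRoot: number, toRoot: number): number {
  const up = (((toRoot - fromRoot) % 12) + 12) % 12; // 0..11 semitones upward
  return up > 6 ? up - 12 : up;
}
```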

Editing DSP

These functions change the signal itself rather than only analyzing it. They are also available as Audio instance methods, where the stored sampleRate is used automatically.

typescript
function pitchCorrectToMidi(
  samples: Float32Array,
  sampleRate: number,
  currentMidi: number,
  targetMidi: number,
): Float32Array

// Retune a tracked pitch contour to a fixed target note, frame by frame.
// f0Hz is a per-frame f0 track (e.g. from pitchYin/pitchPyin), aligned to
// hopLength. Pass the matching voiced/voicedProb arrays to skip unvoiced
// frames; unvoiced or NaN frames are left untouched.
function pitchCorrectToMidiTimevarying(
  samples: Float32Array,
  f0Hz: Float32Array,
  targetMidi: number,
  sampleRate: number,
  hopLength: number,
  voiced?: Int32Array,
  voicedProb?: Float32Array,
): Float32Array

function noteStretch(
  samples: Float32Array,
  sampleRate: number,
  options?: {
    onsetSample?: number,    // note onset position in samples
    offsetSample?: number,   // note offset position in samples
    stretchRatio?: number,   // >1 lengthens the region, <1 shortens it
  },
): Float32Array

function spectralEdit(
  samples: Float32Array,
  sampleRate: number,
  ops?: Array<{
    startSample?: number;
    endSample?: number;
    lowHz?: number;
    highHz?: number;
    gainDb?: number;
    mode?: 'gain' | 'attenuate' | 'mute' | 'heal';
  }>,
  options?: {
    nFft?: number;
    hopLength?: number;
    window?: 'hann' | 'hamming' | 'blackman' | 'rectangular';
    healRadiusFrames?: number;
  },
): Float32Array

function voiceChange(
  samples: Float32Array,
  sampleRate: number,
  options?: {
    pitchSemitones?: number,  // negative shifts down; default 0
    formantFactor?: number,   // >1 brightens, <1 darkens; default 1.0
  },
): Float32Array

CLI equivalents:

bash
sonare pitch-correct vocal.wav --current-midi 68.7 --target-midi 69 -o corrected.wav
sonare note-stretch take.wav --onset 12000 --offset 24000 --ratio 1.25 -o held.wav
sonare voice-change vocal.wav --pitch-semitones 3 --formant-factor 1.05 -o voice.wav

See Spectral Editing for region examples and option notes.
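
Editors usually select regions in seconds, while the spectralEdit ops above take sample positions. The sketch below builds one op object from a time and frequency selection; the field names match the ops shape shown above, while regionFromSeconds and the clamping policy are assumptions:

```typescript
type SpectralMode = 'gain' | 'attenuate' | 'mute' | 'heal';

interface SpectralRegionOp {
  startSample: number;
  endSample: number;
  lowHz: number;
  highHz: number;
  mode: SpectralMode;
  gainDb?: number;
}

// Convert a time/frequency selection into one op, clamped to valid bounds.
function regionFromSeconds(
  startSec: number,
  endSec: number,
  sampleRate: number,
  lowHz: number,
  highHz: number,
  mode: SpectralMode,
  gainDb?: number,
): SpectralRegionOp {
  if (!(endSec > startSec) || !(highHz > lowHz)) {
    throw new RangeError('region must have positive duration and bandwidth');
  }
  const op: SpectralRegionOp = {
    startSample: Math.max(0, Math.round(startSec * sampleRate)),
    endSample: Math.round(endSec * sampleRate),
    lowHz: Math.max(0, lowHz),
    highHz: Math.min(sampleRate / 2, highHz), // never above Nyquist
    mode,
  };
  if (gainDb !== undefined) op.gainDb = gainDb;
  return op;
}
```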

normalize(samples, sampleRate, targetDb?)

Normalize audio to target peak level.

typescript
function normalize(
  samples: Float32Array,
  sampleRate: number,
  targetDb?: number   // default: 0.0 (full scale)
): Float32Array
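
Peak normalization in general scales the signal so that its largest absolute sample lands on the target level, where a level in dB maps to linear amplitude as 10^(dB/20). The following is an independent sketch of that technique, not the library's implementation; peakNormalizeSketch is a hypothetical name:

```typescript
// Scale samples so the absolute peak equals targetDb (dBFS).
function peakNormalizeSketch(samples: Float32Array, targetDb = 0): Float32Array {
  let peak = 0;
  for (let i = 0; i < samples.length; i++) {
    peak = Math.max(peak, Math.abs(samples[i]));
  }
  if (peak === 0) return samples.slice(); // silence: nothing to scale
  const gain = Math.pow(10, targetDb / 20) / peak;
  return samples.map((s) => s * gain);
}
```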

trim(samples, sampleRate, thresholdDb?)

Trim silence from beginning and end of audio.

typescript
function trim(
  samples: Float32Array,
  sampleRate: number,
  thresholdDb?: number   // default: -60.0
): Float32Array

This is the simple Audio-level threshold trim. For librosa-compatible frame/RMS silence detection that also returns the original start/end sample range, use trimSilence(...) below.

Feature Extraction

stft(samples, sampleRate, nFft?, hopLength?) Medium

Compute Short-Time Fourier Transform.

typescript
function stft(
  samples: Float32Array,
  sampleRate: number,
  nFft?: number,      // default: 2048
  hopLength?: number  // default: 512
): StftResult

interface StftResult {
  nBins: number;
  nFrames: number;
  nFft: number;
  hopLength: number;
  sampleRate: number;
  magnitude: Float32Array;
  power: Float32Array;
}
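
To read an STFT result, it helps to convert between bin indices and frequencies: bin k of an nFft-point transform is centered at k * sampleRate / nFft. A sketch with hypothetical helper names, assuming a one-sided spectrum whose last bin is nFft / 2:

```typescript
// Center frequency in Hz of STFT bin k.
function stftBinToHz(bin: number, sampleRate: number, nFft: number): number {
  return (bin * sampleRate) / nFft;
}

// Nearest bin index for a frequency, clamped to [0, nFft / 2].
function hzToStftBin(hz: number, sampleRate: number, nFft: number): number {
  const nyquistBin = Math.floor(nFft / 2);
  const bin = Math.round((hz * nFft) / sampleRate);
  return Math.min(nyquistBin, Math.max(0, bin));
}
```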

stftDb(samples, sampleRate, nFft?, hopLength?) Medium

Compute STFT and return in dB scale.

typescript
function stftDb(
  samples: Float32Array,
  sampleRate: number,
  nFft?: number,      // default: 2048
  hopLength?: number  // default: 512
): { nBins: number; nFrames: number; db: Float32Array }

melSpectrogram(samples, sampleRate, nFft?, hopLength?, nMels?) Medium

Compute Mel spectrogram. Frequency representation that matches human pitch perception.

Use Cases

  • Machine Learning: Input for genre classification, mood detection
  • Visualization: Create frequency spectrograms for audio players
  • Similarity Search: Compare songs by their spectral content
  • Voice Analysis: Analyze speech patterns and characteristics
typescript
function melSpectrogram(
  samples: Float32Array,
  sampleRate: number,
  nFft?: number,      // default: 2048
  hopLength?: number, // default: 512
  nMels?: number,     // default: 128
  fmin?: number,      // default: 0 (librosa default)
  fmax?: number,      // default: 0 = sampleRate / 2
  htk?: boolean       // default: false = Slaney formula; true = HTK
): MelSpectrogramResult

interface MelSpectrogramResult {
  nMels: number;
  nFrames: number;
  sampleRate: number;
  hopLength: number;
  power: Float32Array;
  db: Float32Array;
}

mfcc(samples, sampleRate, nFft?, hopLength?, nMels?, nMfcc?) Medium

Compute MFCC (Mel-Frequency Cepstral Coefficients). Compact representation of spectral envelope.

Use Cases

  • Speech Recognition: Standard input for speech-to-text systems
  • Speaker Identification: Identify who is speaking
  • Timbre Analysis: Characterize instrument/voice quality
  • Audio Fingerprinting: Create compact song signatures
typescript
function mfcc(
  samples: Float32Array,
  sampleRate: number,
  nFft?: number,      // default: 2048
  hopLength?: number, // default: 512
  nMels?: number,     // default: 128
  nMfcc?: number,     // default: 20
  fmin?: number,      // default: 0 (librosa default)
  fmax?: number,      // default: 0 = sampleRate / 2
  htk?: boolean       // default: false = Slaney formula; true = HTK
): MfccResult

interface MfccResult {
  nMfcc: number;
  nFrames: number;
  coefficients: Float32Array;
}

Set fmin/fmax to bound the Mel band edges, and pass htk: true to use the HTK Mel formula instead of Slaney. The inverse helpers (melToStft, melToAudio, mfccToAudio) take matching fmin/fmax/htk arguments, so a round-trip stays consistent when you keep the same values on both sides.
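
The htk flag chooses between two common Mel formulas. The sketch below writes out both as they are conventionally defined (the Slaney form is the librosa default: linear below 1 kHz, logarithmic above). These are reference formulas with hypothetical names and are not claimed to be the library's exact code:

```typescript
// HTK formula: mel = 2595 * log10(1 + hz / 700).
function hzToMelHtkSketch(hz: number): number {
  return 2595 * Math.log10(1 + hz / 700);
}

// Slaney formula: linear up to 1000 Hz, logarithmic above.
function hzToMelSlaneySketch(hz: number): number {
  const fSp = 200 / 3;                 // Hz per mel in the linear region
  const minLogHz = 1000;               // start of the log region
  const minLogMel = minLogHz / fSp;    // 15 mel at 1000 Hz
  const logStep = Math.log(6.4) / 27;  // 27 mel per factor of 6.4
  return hz < minLogHz
    ? hz / fSp
    : minLogMel + Math.log(hz / minLogHz) / logStep;
}
```

The two scales give different numbers for the same frequency, which is why a round-trip needs the same htk value on both sides.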

chroma(samples, sampleRate, nFft?, hopLength?) Medium

Compute chromagram (pitch class distribution). Maps all frequencies to 12 pitch classes (C, C#, D, ..., B).

Use Cases

  • Chord Detection: Identify chords being played
  • Key Detection: Determine song key from pitch distribution
  • Cover Song Detection: Match songs regardless of tempo/key
  • Music Similarity: Compare harmonic content between tracks
Interactive demo (Chroma · Pitch Class): harmony folded into 12 bins. Every frequency is folded onto one of twelve pitch classes, so the octave is discarded and only the harmony remains. The demo clip walks a C–Am–F–G turnaround, and the lit rows shift as each chord changes.

typescript
function chroma(
  samples: Float32Array,
  sampleRate: number,
  nFft?: number,      // default: 2048
  hopLength?: number  // default: 512
): ChromaResult

interface ChromaResult {
  nChroma: number;        // 12
  nFrames: number;
  sampleRate: number;
  hopLength: number;
  features: Float32Array;
  meanEnergy: number[];   // [12] per pitch class
}
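
The meanEnergy array gives one value per pitch class, so a quick summary is the name of the strongest class. A minimal sketch assuming index 0 is C and index 11 is B, matching the PitchClass numbering shown earlier; strongestPitchClass is a hypothetical helper:

```typescript
const PITCH_CLASS_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];

// Name of the pitch class with the highest mean energy.
function strongestPitchClass(meanEnergy: ArrayLike<number>): string {
  if (meanEnergy.length !== 12) {
    throw new RangeError('expected 12 pitch-class energies');
  }
  let best = 0;
  for (let i = 1; i < 12; i++) {
    if (meanEnergy[i] > meanEnergy[best]) best = i;
  }
  return PITCH_CLASS_NAMES[best];
}
```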

Spectral Features

typescript
// Spectral centroid (center of mass) in Hz
function spectralCentroid(
  samples: Float32Array,
  sampleRate: number,
  nFft?: number,
  hopLength?: number
): Float32Array

// Spectral bandwidth in Hz
function spectralBandwidth(
  samples: Float32Array,
  sampleRate: number,
  nFft?: number,
  hopLength?: number
): Float32Array

// Spectral rolloff frequency in Hz
function spectralRolloff(
  samples: Float32Array,
  sampleRate: number,
  nFft?: number,
  hopLength?: number,
  rollPercent?: number  // default: 0.85
): Float32Array

// Spectral flatness (0 = tonal, 1 = noise-like)
function spectralFlatness(
  samples: Float32Array,
  sampleRate: number,
  nFft?: number,
  hopLength?: number
): Float32Array

// Spectral contrast matrix, shape (nBands + 1) x nFrames
function spectralContrast(
  samples: Float32Array,
  sampleRate?: number,
  nFft?: number,
  hopLength?: number,
  nBands?: number,
  fmin?: number,
  quantile?: number
): Matrix2dResult

// Per-frame polynomial spectral coefficients, shape (order + 1) x nFrames
function polyFeatures(
  samples: Float32Array,
  sampleRate?: number,
  nFft?: number,
  hopLength?: number,
  order?: number
): Matrix2dResult

// Zero crossing rate
function zeroCrossingRate(
  samples: Float32Array,
  sampleRate: number,
  frameLength?: number,
  hopLength?: number
): Float32Array

// Sample indices where the waveform crosses zero
function zeroCrossings(
  samples: Float32Array,
  threshold?: number,
  refMagnitude?: boolean,
  pad?: boolean,
  zeroPos?: boolean
): Int32Array

// RMS energy
function rmsEnergy(
  samples: Float32Array,
  sampleRate: number,
  frameLength?: number,
  hopLength?: number
): Float32Array
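
To make "center of mass" concrete, the centroid of one spectrum frame is the magnitude-weighted mean of its bin frequencies. The sketch below computes it for a single frame of nFft / 2 + 1 magnitudes; it illustrates the definition only and is not the library's implementation, and frameCentroidHz is a hypothetical name:

```typescript
// Spectral centroid of one magnitude frame: sum(f_k * m_k) / sum(m_k).
function frameCentroidHz(magnitudes: Float32Array, sampleRate: number, nFft: number): number {
  let weighted = 0;
  let total = 0;
  for (let k = 0; k < magnitudes.length; k++) {
    const binHz = (k * sampleRate) / nFft;
    weighted += binHz * magnitudes[k];
    total += magnitudes[k];
  }
  return total > 0 ? weighted / total : 0; // 0 for an all-zero frame
}
```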

CQT, VQT, NNLS chroma, inverse features, and loudness

These functions are not just "more features"; they solve different modeling problems:

| Need | Use | Why |
| --- | --- | --- |
| Log-frequency pitch representation | cqt(...), pseudoCqt(...), hybridCqt(...) | Constant-Q bins align well with musical pitch over octaves; pseudo/hybrid variants trade accuracy and speed across bins. |
| Variable bandwidth pitch representation | vqt(...) | Like CQT, but with a bandwidth offset for low-frequency stability. |
| Chord-friendly chroma | nnlsChroma(...), chromaCens(...), bassChroma(...) | NNLS, CENS, and low-register chroma variants can be cleaner for chord or bass-register work than plain STFT chroma. |
| Spectral shape detail | spectralContrast(...), polyFeatures(...), zeroCrossings(...), onsetStrengthMulti(...) | Librosa-compatible contrast bands, polynomial coefficients, zero-crossing indices, and multi-band onset strength. |
| Pitch/tuning offset | pitchTuning(...), estimateTuning(...) | Estimate tuning in fractions of a bin from detected frequencies or directly from audio. |
| Decomposition and remixing | decompose(...), decomposeWithInit(...), nnFilter(...), remix(...), phaseVocoder(...), hpssWithResidual(...) | NMF factorization, selectable NMF initialization, nearest-neighbor filtering, interval remixing, time scaling, and HPSS residual output. |
| Reconstruct approximate audio/features | melToStft, melToAudio, mfccToMel, mfccToAudio | Griffin-Lim based inverse paths for visualization, debugging, and feature round-trips. |
| Delivery loudness measurements | lufs, lufsInterleaved, momentaryLufs, shortTermLufs, ebur128LoudnessRange | ITU-R BS.1770 / EBU R128 style loudness values, including multichannel integrated loudness and LRA. |
typescript
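// Assumes samples, sampleRate, interleavedStereo, and the spectrogram inputs
// (spectrogram, nFeatures, nFrames) already exist, along with pitch
// (a PitchResult) and mel (a MelSpectrogramResult) from earlier calls.
const cqtResult = cqt(samples, sampleRate, 512, 32.7, 84, 12);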
const pseudo = pseudoCqt(samples, sampleRate);
const hybrid = hybridCqt(samples, sampleRate);
const nnls = nnlsChroma(samples, sampleRate);
const cens = chromaCens(samples, sampleRate);
const bass = bassChroma(samples, sampleRate);
const loudness = lufs(samples, sampleRate);

const contrast = spectralContrast(samples, sampleRate);
const poly = polyFeatures(samples, sampleRate);
const crossings = zeroCrossings(samples);
const onsetBands = onsetStrengthMulti(samples, sampleRate);
const tuning = estimateTuning(samples, sampleRate);
const offset = pitchTuning(pitch.f0);
const { w, h } = decompose(spectrogram, nFeatures, nFrames, 8);
const warmStarted = decomposeWithInit(spectrogram, nFeatures, nFrames, 8, 50, 2.0, 'nndsvd');
const filtered = nnFilter(spectrogram, nFeatures, nFrames);
const remixed = remix(samples, Int32Array.from([0, sampleRate, sampleRate, 2 * sampleRate]));
const stretched = phaseVocoder(samples, 1.5, sampleRate);
const hpssResidual = hpssWithResidual(samples, sampleRate);
const multichannel = lufsInterleaved(interleavedStereo, 2, sampleRate);
const lra = ebur128LoudnessRange(samples, sampleRate);
const reconstructed = melToAudio(mel.power, mel.nMels, mel.nFrames, sampleRate);
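
Constant-Q bins are spaced geometrically: bin k sits at fmin * 2^(k / binsPerOctave). The sketch below lists those center frequencies. It assumes that the trailing arguments 32.7, 84, 12 in the cqt(...) call above are fmin, the number of bins, and bins per octave; that reading is an assumption, not something this section states. cqtBinFrequencies is a hypothetical helper.

```typescript
// Center frequency of each constant-Q bin, in Hz.
function cqtBinFrequencies(fmin: number, nBins: number, binsPerOctave: number): Float64Array {
  const freqs = new Float64Array(nBins);
  for (let k = 0; k < nBins; k++) {
    freqs[k] = fmin * Math.pow(2, k / binsPerOctave);
  }
  return freqs;
}
```

With fmin = 32.7 Hz (about C1), 84 bins at 12 per octave span seven octaves in semitone steps.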

Closest CLI equivalents from the source-built C++ CLI:

bash
sonare cqt song.wav
sonare vqt song.wav
sonare nnls-chroma song.wav
sonare lufs song.wav --json
sonare mel-to-audio song.wav -o mel-preview.wav

For reconstruction limits and parameter notes, see Inverse Features. For librosa-parity details, see librosa Compatibility.

Pitch Detection Medium

typescript
// YIN algorithm
function pitchYin(
  samples: Float32Array,
  sampleRate: number,
  frameLength?: number,  // default: 2048
  hopLength?: number,    // default: 512
  fmin?: number,         // default: 65 Hz
  fmax?: number,         // default: 2093 Hz
  threshold?: number,    // default: 0.3
  fillNa?: boolean       // default: false; true writes 0 for unvoiced f0 frames
): PitchResult

// pYIN algorithm (probabilistic YIN with HMM smoothing)
function pitchPyin(
  samples: Float32Array,
  sampleRate: number,
  frameLength?: number,
  hopLength?: number,
  fmin?: number,
  fmax?: number,
  threshold?: number,
  fillNa?: boolean       // default: false; true writes 0 for unvoiced f0 frames
): PitchResult

interface PitchResult {
  f0: Float32Array;
  voicedProb: Float32Array;
  voicedFlag: boolean[];
  nFrames: number;
  medianF0: number;
  meanF0: number;
}

By default, unvoiced f0 frames remain NaN. Set fillNa: true when a downstream numeric pipeline cannot carry NaN and should treat unvoiced frames as 0.
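
Because unvoiced frames default to NaN, a plain reduction over f0 returns NaN unless those frames are skipped. The sketch below shows one way to summarize only the voiced frames. `voicedMedianHz` is a hypothetical helper written for this example, not a libsonare export; it accepts any Float32Array shaped like PitchResult.f0.

```typescript
// Hypothetical helper (not part of libsonare): median of the voiced f0 values.
// Assumes unvoiced frames are NaN, the documented default (fillNa: false).
function voicedMedianHz(f0: Float32Array): number {
  const voiced = Array.from(f0).filter((hz) => !Number.isNaN(hz));
  if (voiced.length === 0) return Number.NaN; // no voiced frames at all
  voiced.sort((a, b) => a - b);
  const mid = Math.floor(voiced.length / 2);
  return voiced.length % 2 === 1
    ? voiced[mid]
    : (voiced[mid - 1] + voiced[mid]) / 2;
}

// Two unvoiced frames are ignored; the median of 220, 330, 440 is reported.
const exampleF0 = Float32Array.from([Number.NaN, 220, 440, Number.NaN, 330]);
console.log(voicedMedianHz(exampleF0)); // 330
```

With fillNa: true the same filter would have to test for 0 instead of NaN.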

Unit Conversion

These functions are lightweight and fast.

typescript
// Hz <-> Mel (Slaney formula)
function hzToMel(hz: number): number
function melToHz(mel: number): number

// Hz <-> MIDI note number (A4 = 440 Hz = 69)
function hzToMidi(hz: number): number
function midiToHz(midi: number): number

// Hz <-> Note name
function hzToNote(hz: number): string      // "A4", "C#5"
function noteToHz(note: string): number

// Time <-> Frames
function framesToTime(frames: number, sr: number, hopLength: number): number
function timeToFrames(time: number, sr: number, hopLength: number): number

// Frames <-> Samples (librosa.frames_to_samples / samples_to_frames)
function framesToSamples(frames: number, hopLength?: number, nFft?: number): number
function samplesToFrames(samples: number, hopLength?: number, nFft?: number): number

// dB conversions (vectorised)
function powerToDb(values: Float32Array, ref?: number, amin?: number, topDb?: number): Float32Array
function amplitudeToDb(values: Float32Array, ref?: number, amin?: number, topDb?: number): Float32Array
function dbToPower(values: Float32Array, ref?: number): Float32Array
function dbToAmplitude(values: Float32Array, ref?: number): Float32Array
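
For orientation, the MIDI and frame conversions follow standard definitions. The sketch below writes them out as standalone reference formulas; the `ref*` names are invented for this example, and it is an assumption (not confirmed by this page) that the library functions produce the same values in every edge case.

```typescript
// Reference formulas only; these are NOT the libsonare implementations.
const A4_HZ = 440;
const A4_MIDI = 69;

// Hz -> fractional MIDI note number (12-TET, A4 = 440 Hz = 69)
function refHzToMidi(hz: number): number {
  return A4_MIDI + 12 * Math.log2(hz / A4_HZ);
}

// MIDI note number -> Hz
function refMidiToHz(midi: number): number {
  return A4_HZ * Math.pow(2, (midi - A4_MIDI) / 12);
}

// Frame index -> seconds, assuming frame i starts at sample i * hopLength
function refFramesToTime(frames: number, sr: number, hopLength: number): number {
  return (frames * hopLength) / sr;
}

console.log(refHzToMidi(880)); // 81, one octave above A4
console.log(refMidiToHz(69)); // 440
console.log(refFramesToTime(100, 44100, 512)); // about 1.16 seconds
```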

Metering

Standalone meters report level, dynamics, and stereo-image statistics from a decoded buffer. They are independent of the mastering chain and the streaming engine: pass a Float32Array or a left/right pair and get back a value or report. Every function accepts optional options with a validate flag (default true); set validate: false to skip NaN/Inf input checks on hot paths.

Single-channel level meters

typescript
// Sample peak, dBFS
function meteringPeakDb(samples: Float32Array, sampleRate?: number, options?: ValidateOptions): number
// RMS level, dBFS
function meteringRmsDb(samples: Float32Array, sampleRate?: number, options?: ValidateOptions): number
// Crest factor (peak − RMS), dB
function meteringCrestFactorDb(samples: Float32Array, sampleRate?: number, options?: ValidateOptions): number
// Mean (DC) offset, linear amplitude
function meteringDcOffset(samples: Float32Array, sampleRate?: number, options?: ValidateOptions): number
// Inter-sample (true) peak, dBFS. oversampleFactor is a power of two in 1..16 (0 / omit = 4)
function meteringTruePeakDb(samples: Float32Array, sampleRate?: number, oversampleFactor?: number, options?: ValidateOptions): number
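
To make the relationship between these meters concrete, here is a textbook sketch of sample peak, RMS, and crest factor over a mono buffer. The `ref*` functions are illustrative stand-ins, not the library code; they skip input validation and true-peak oversampling.

```typescript
// Textbook level meters over a mono buffer; a sketch, not the libsonare code.
function refPeakDb(samples: Float32Array): number {
  let peak = 0;
  for (let i = 0; i < samples.length; i++) {
    peak = Math.max(peak, Math.abs(samples[i]));
  }
  return 20 * Math.log10(peak); // -Infinity for digital silence
}

function refRmsDb(samples: Float32Array): number {
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return 10 * Math.log10(sumSquares / samples.length);
}

// Crest factor as described above: peak level minus RMS level, in dB
function refCrestFactorDb(samples: Float32Array): number {
  return refPeakDb(samples) - refRmsDb(samples);
}

// A single full-scale impulse in four samples: high peak, low RMS.
const impulse = Float32Array.from([1, 0, 0, 0]);
console.log(refPeakDb(impulse), refRmsDb(impulse), refCrestFactorDb(impulse));
```

A square wave has a crest factor of 0 dB, while sparse transients like the impulse above push it up.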

Clipping and dynamic range

typescript
function meteringDetectClipping(
  samples: Float32Array,
  sampleRate?: number,
  options?: MeteringDetectClippingOptions
): ClippingReport

interface MeteringDetectClippingOptions extends ValidateOptions {
  threshold?: number;        // linear absolute threshold, default 0.999
  minRegionSamples?: number; // minimum run length to report, default 1
}

function meteringDynamicRange(
  samples: Float32Array,
  sampleRate?: number,
  options?: MeteringDynamicRangeOptions
): DynamicRangeReport

interface MeteringDynamicRangeOptions extends ValidateOptions {
  windowSec?: number;      // 0 / omit = 3 s
  hopSec?: number;         // 0 / omit = 1 s
  lowPercentile?: number;  // omit or negative = 0.10 (0 is a literal 0th percentile)
  highPercentile?: number; // omit or negative = 0.95
}

interface ClippingReport {
  clippedSamples: number;
  clippingRatio: number;
  maxClippedPeak: number;
  regions: ClippingRegion[];
}
interface ClippingRegion {
  startSample: number;
  endSample: number;
  length: number;
  peak: number;
}
interface DynamicRangeReport {
  dynamicRangeDb: number;
  lowPercentileDb: number;
  highPercentileDb: number;
  windowRmsDb: Float32Array;
}
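
The threshold and minRegionSamples options suggest a run-based scan. The sketch below is one plausible reading of that behavior, not the library's implementation: the `>=` comparison and the exclusive `endSample` are assumptions made for this example, and `findClippedRuns` is a hypothetical name.

```typescript
interface ClippedRun {
  startSample: number;
  endSample: number; // assumption in this sketch: exclusive end index
  length: number;
  peak: number;
}

// Sketch of run-based clipping detection. The >= comparison and the exclusive
// endSample are assumptions, not confirmed libsonare behavior.
function findClippedRuns(
  samples: Float32Array,
  threshold = 0.999,
  minRegionSamples = 1,
): ClippedRun[] {
  const runs: ClippedRun[] = [];
  let start = -1;
  let peak = 0;
  // Iterate one step past the end so a run touching the last sample is closed.
  for (let i = 0; i <= samples.length; i++) {
    const magnitude = i < samples.length ? Math.abs(samples[i]) : 0;
    const clipped = i < samples.length && magnitude >= threshold;
    if (clipped) {
      if (start < 0) {
        start = i;
        peak = 0;
      }
      peak = Math.max(peak, magnitude);
    } else if (start >= 0) {
      const length = i - start;
      if (length >= minRegionSamples) {
        runs.push({ startSample: start, endSample: i, length, peak });
      }
      start = -1;
    }
  }
  return runs;
}

// With minRegionSamples = 2, the isolated clipped sample at index 5 is dropped.
const clippedRuns = findClippedRuns(Float32Array.from([0, 1, 1, 0, 0.5, -1, 0]), 0.999, 2);
console.log(clippedRuns); // one run covering samples 1 and 2
```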

Stereo image

typescript
// Pearson correlation between channels, −1..1
function meteringStereoCorrelation(left: Float32Array, right: Float32Array, sampleRate?: number, options?: ValidateOptions): number
// Mid/side stereo width
function meteringStereoWidth(left: Float32Array, right: Float32Array, sampleRate?: number, options?: ValidateOptions): number
// Per-sample mid/side point series
function meteringVectorscope(left: Float32Array, right: Float32Array, sampleRate?: number, options?: ValidateOptions): VectorscopeReport
// Phase-scope point series plus summary stats
function meteringPhaseScope(left: Float32Array, right: Float32Array, sampleRate?: number, options?: ValidateOptions): PhaseScopeReport

// Display-sized mid/side vectorscope: like meteringVectorscope but the point series is
// deterministically decimated to at most maxPoints (0 / >= length = one point per sample).
function meteringVectorscopeDecimated(left: Float32Array, right: Float32Array, sampleRate?: number, maxPoints?: number, options?: ValidateOptions): VectorscopeReport
// Display-sized phase scope: like meteringPhaseScope but the point series is decimated to at
// most maxPoints; summary stats are still computed over the full-resolution signal.
function meteringPhaseScopeDecimated(left: Float32Array, right: Float32Array, sampleRate?: number, maxPoints?: number, options?: ValidateOptions): PhaseScopeReport

interface VectorscopeReport {
  mid: Float32Array;
  side: Float32Array;
}
interface PhaseScopeReport {
  mid: Float32Array;
  side: Float32Array;
  radius: Float32Array;
  angleRad: Float32Array;
  correlation: number;
  averageAbsAngleRad: number;
  maxRadius: number;
}

meteringStereoCorrelation, meteringStereoWidth, meteringVectorscope, and meteringPhaseScope require left and right to be the same length.
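
As a reference for what these meters measure, the sketch below computes a Pearson correlation and a mid/side pair by hand. It is not the library code; the `ref*` names are invented here, and the 0.5 scaling in the mid/side encoding is one common convention assumed for the example.

```typescript
// Pearson correlation between two equal-length channels (sketch, not library code)
function refStereoCorrelation(left: Float32Array, right: Float32Array): number {
  if (left.length !== right.length) {
    throw new Error('left and right must be the same length');
  }
  const n = left.length;
  let meanL = 0;
  let meanR = 0;
  for (let i = 0; i < n; i++) {
    meanL += left[i];
    meanR += right[i];
  }
  meanL /= n;
  meanR /= n;
  let cov = 0;
  let varL = 0;
  let varR = 0;
  for (let i = 0; i < n; i++) {
    const dl = left[i] - meanL;
    const dr = right[i] - meanR;
    cov += dl * dr;
    varL += dl * dl;
    varR += dr * dr;
  }
  return cov / Math.sqrt(varL * varR); // NaN when either channel is constant
}

// Mid/side encoding; the 0.5 scaling is an assumption of this sketch
function refMidSide(left: Float32Array, right: Float32Array): { mid: Float32Array; side: Float32Array } {
  if (left.length !== right.length) {
    throw new Error('left and right must be the same length');
  }
  const mid = new Float32Array(left.length);
  const side = new Float32Array(left.length);
  for (let i = 0; i < left.length; i++) {
    mid[i] = 0.5 * (left[i] + right[i]);
    side[i] = 0.5 * (left[i] - right[i]);
  }
  return { mid, side };
}

const channelL = Float32Array.from([0.1, -0.2, 0.3, -0.4]);
const channelInverted = channelL.map((v) => -v);
console.log(refStereoCorrelation(channelL, channelL)); // close to 1 (dual mono)
console.log(refStereoCorrelation(channelL, channelInverted)); // close to -1 (out of phase)
```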

Spectrum snapshot

meteringSpectrum is Welch-averaged over the whole signal (split into 50%-overlapping Hann frames whose power spectra are averaged). For a true single-frame snapshot that is not time-averaged, use meteringSpectrumFrame, whose frameOffset positional argument selects where the analysis frame starts.

typescript
function meteringSpectrum(
  samples: Float32Array,
  sampleRate?: number,
  options?: SpectrumOptions & ValidateOptions
): SpectrumReport

// True single-frame snapshot (one Hann-windowed nFft FFT), NOT time-averaged like meteringSpectrum.
// The analysis frame spans [frameOffset, frameOffset + nFft); samples past the end are zero-padded.
function meteringSpectrumFrame(
  samples: Float32Array,
  sampleRate?: number,
  frameOffset?: number,
  options?: SpectrumOptions & ValidateOptions
): SpectrumReport

interface SpectrumOptions {
  nFft?: number;                 // 0 / omit = 2048
  applyOctaveSmoothing?: boolean;
  octaveFraction?: number;       // e.g. 3 = 1/3-octave; 0 / omit = 3
  dbRef?: number;                // 0 / omit = 1.0
  dbAmin?: number;               // 0 / omit = library floor
}
interface SpectrumReport {
  frequencies: Float32Array;
  magnitude: Float32Array;
  power: Float32Array;
  db: Float32Array;
  nFft: number;
  sampleRate: number;
}
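
A SpectrumReport pairs each bin frequency with its level, so simple queries are plain array scans. The sketch below shows a hypothetical `dominantFrequencyHz` helper that reads only the `frequencies` and `db` fields, plus a reference bin-frequency formula. It assumes nFft / 2 + 1 bins spaced sampleRate / nFft apart, which is the usual real-FFT layout but is not stated on this page.

```typescript
// Minimal structural type: only the SpectrumReport fields this sketch reads
interface SpectrumLike {
  frequencies: Float32Array;
  db: Float32Array;
}

// Hypothetical helper: frequency (Hz) of the strongest bin in a report
function dominantFrequencyHz(report: SpectrumLike): number {
  let best = 0;
  for (let k = 1; k < report.db.length; k++) {
    if (report.db[k] > report.db[best]) best = k;
  }
  return report.frequencies[best];
}

// Bin centre frequencies for a real FFT, assuming nFft / 2 + 1 bins
// spaced sampleRate / nFft apart (an assumption of this sketch)
function refBinFrequencies(nFft: number, sampleRate: number): Float32Array {
  const out = new Float32Array(Math.floor(nFft / 2) + 1);
  for (let k = 0; k < out.length; k++) {
    out[k] = (k * sampleRate) / nFft;
  }
  return out;
}

// Synthetic report standing in for a meteringSpectrum(...) result
const fakeReport: SpectrumLike = {
  frequencies: refBinFrequencies(8, 8000),
  db: Float32Array.from([-60, -20, -3, -40, -50]),
};
console.log(dominantFrequencyHz(fakeReport)); // 2000
```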

Scale Quantization

12-TET scale helpers for building pitch-correction targets. modeMask is a 12-bit mask where bit i enables the i-th pitch class relative to root (a PitchClass, C = 0); natural major is 0b101010110101. referenceMidi is the tuning anchor (pass 0 for A4 = 69).

typescript
// Snap a (possibly fractional) MIDI number to the nearest enabled pitch class
function scaleQuantizeMidi(root: number, modeMask: number, midi: number, referenceMidi?: number): number
// Correction (quantized − input), in semitones
function scaleCorrectionSemitones(root: number, modeMask: number, midi: number, referenceMidi?: number): number
// Is pitchClass (0..11) enabled by modeMask relative to root?
function scalePitchClassEnabled(root: number, modeMask: number, pitchClass: number): boolean

Pair scaleQuantizeMidi(...) with pitchCorrectToMidi(...) to retune a detected note to the nearest scale degree.
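
The mask encoding is easy to get wrong by hand, so the sketch below builds one from semitone offsets and re-implements the enabled check and a nearest-note snap as reference logic. These `build*` and `ref*` functions are illustrative only: they ignore referenceMidi, and the tie-breaking rule (the lower note wins) is an assumption, not documented library behavior.

```typescript
// Build a 12-bit mode mask from semitone offsets relative to the root
function buildModeMask(offsets: number[]): number {
  return offsets.reduce((mask, o) => mask | (1 << (((o % 12) + 12) % 12)), 0);
}

// Reference check: is pitchClass (0..11) enabled relative to root?
function refPitchClassEnabled(root: number, modeMask: number, pitchClass: number): boolean {
  const relative = (((pitchClass - root) % 12) + 12) % 12;
  return (modeMask & (1 << relative)) !== 0;
}

// Reference snap to the nearest enabled integer MIDI note.
// Assumptions of this sketch: ties go to the lower note, referenceMidi is ignored.
function refQuantizeMidi(root: number, modeMask: number, midi: number): number {
  let best = Math.round(midi); // fallback when the mask enables nothing
  let bestDistance = Infinity;
  for (let candidate = Math.floor(midi) - 6; candidate <= Math.ceil(midi) + 6; candidate++) {
    const pitchClass = ((candidate % 12) + 12) % 12;
    if (!refPitchClassEnabled(root, modeMask, pitchClass)) continue;
    const distance = Math.abs(candidate - midi);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}

const majorMask = buildModeMask([0, 2, 4, 5, 7, 9, 11]);
console.log(majorMask === 0b101010110101); // true
console.log(refQuantizeMidi(0, majorMask, 61.2)); // 62: a sharp C#4 snaps up to D4 in C major
```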

librosa-Compatible Helpers

These librosa-parity helpers mirror the behavior of the corresponding librosa functions and are exposed across the WASM, Node, and Python bindings. See librosa Compatibility for the librosa function each helper matches.

What each helper is for

  • Pre / De-emphasis — Classic one-tap IIR pre-processing that boosts (or undoes) high frequencies before analysis.
  • Silence Trim / Split — Practical helpers that cut leading/trailing silence or split a recording on silent gaps.
  • Frame / Pad / Length — Utilities to slice a waveform into fixed-length frames, or align array sizes before feeding fixed-frame DSP.
  • Peak Picking / Vector Normalize — Post-processing on 1-D signals (e.g. onset envelopes) to extract peak indices or normalize vectors under a chosen norm.
  • PCEN — Dynamic range compression for mel spectrograms; produces features that are more robust to background noise and gain changes.
  • Tonnetz — Projects a chromagram into a 6-D harmonic space — useful for chord-relation and modulation analysis.
  • Tempogram / PLP — A time-varying tempo representation built from the onset envelope, and the predominant local pulse extracted from it.

Pre-emphasis / De-emphasis

typescript
function preemphasis(samples: Float32Array, coef?: number, zi?: number): Float32Array
function deemphasis(samples: Float32Array, coef?: number, zi?: number): Float32Array

coef defaults to 0.97. Pass zi to provide an initial condition (the value from a previous frame's tail) when streaming.
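
The filter pair itself is short enough to write out. The sketch below implements the standard first-order recurrences with explicit carry-over state for chunked processing. The `previousSample` and `previousOutput` parameters are this example's own state variables; they are not claimed to match the exact semantics of the library's zi argument.

```typescript
// First-order pre-emphasis: y[n] = x[n] - coef * x[n - 1].
// previousSample is this sketch's streaming state (last input of the prior chunk).
function refPreemphasis(x: Float32Array, coef = 0.97, previousSample = 0): Float32Array {
  const y = new Float32Array(x.length);
  let prev = previousSample;
  for (let n = 0; n < x.length; n++) {
    y[n] = x[n] - coef * prev;
    prev = x[n];
  }
  return y;
}

// Inverse filter: y[n] = x[n] + coef * y[n - 1].
// previousOutput is the last output of the prior chunk.
function refDeemphasis(x: Float32Array, coef = 0.97, previousOutput = 0): Float32Array {
  const y = new Float32Array(x.length);
  let prev = previousOutput;
  for (let n = 0; n < x.length; n++) {
    y[n] = x[n] + coef * prev;
    prev = y[n];
  }
  return y;
}

// Round trip: de-emphasis undoes pre-emphasis up to float32 rounding.
const signal = Float32Array.from([0.1, 0.5, -0.3, 0.8, 0.2, -0.6]);
const restored = refDeemphasis(refPreemphasis(signal));
console.log(Array.from(restored));
```

Carrying the last input sample across chunk boundaries makes chunked output identical to processing the whole buffer at once.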

Silence Trim / Split

typescript
function trimSilence(
  samples: Float32Array,
  topDb?: number,        // default 60
  frameLength?: number,  // default 2048
  hopLength?: number,    // default 512
): { audio: Float32Array; startSample: number; endSample: number }

function splitSilence(
  samples: Float32Array,
  topDb?: number,
  frameLength?: number,
  hopLength?: number,
): Int32Array  // flat [start0, end0, start1, end1, ...]

trimSilence matches librosa.effects.trim. It uses frame RMS and a topDb distance below the peak RMS, then returns both the trimmed audio and the original [startSample, endSample) range.

This is distinct from trim(samples, sampleRate, thresholdDb), which is a simpler threshold trim.

splitSilence matches librosa.effects.split and returns non-silent intervals as sample-index pairs.
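
Since splitSilence returns a flat [start0, end0, start1, end1, ...] array, most callers will want to regroup it into pairs first. The helpers below are hypothetical conveniences, not library exports, and they assume each interval is half-open (end index exclusive), as with the trimSilence range.

```typescript
interface SampleInterval {
  start: number;
  end: number; // assumed exclusive, like the [startSample, endSample) range above
}

// Regroup the flat [start0, end0, start1, end1, ...] layout into pairs
function toIntervals(flat: Int32Array): SampleInterval[] {
  if (flat.length % 2 !== 0) {
    throw new Error('expected an even number of entries');
  }
  const out: SampleInterval[] = [];
  for (let i = 0; i < flat.length; i += 2) {
    out.push({ start: flat[i], end: flat[i + 1] });
  }
  return out;
}

// Cut the non-silent segments out of the original buffer (views, no copies)
function sliceIntervals(samples: Float32Array, intervals: SampleInterval[]): Float32Array[] {
  return intervals.map(({ start, end }) => samples.subarray(start, end));
}

// Stand-in for a splitSilence(...) result
const flatIntervals = Int32Array.from([0, 100, 250, 400]);
console.log(toIntervals(flatIntervals));
```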

Frame / Pad / Length Helpers

typescript
function frameSignal(
  samples: Float32Array,
  frameLength: number,
  hopLength: number,
): { nFrames: number; frames: Float32Array }  // row-major

function padCenter(values: Float32Array, targetSize: number, padValue?: number): Float32Array
function fixLength(values: Float32Array, targetSize: number, padValue?: number): Float32Array
function fixFrames(frames: Int32Array, xMin?: number, xMax?: number, pad?: boolean): Int32Array

frameSignal is librosa.util.frame. padCenter, fixLength, and fixFrames mirror the librosa.util helpers of the same names.

Peak Picking / Vector Normalize

typescript
function peakPick(
  values: Float32Array,
  preMax: number,
  postMax: number,
  preAvg: number,
  postAvg: number,
  delta: number,
  wait: number,
): Int32Array  // peak indices

function vectorNormalize(
  values: Float32Array,
  normType?: number,  // 0 = inf, 1 = L1, 2 = L2, 3 = power (default 0)
  threshold?: number, // default 1e-12
): Float32Array

peakPick is librosa.util.peak_pick. vectorNormalize is librosa.util.normalize.

peakPick parameters
  • preMax / postMax — local-maximum window (in samples) on each side of a candidate.
  • preAvg / postAvg — averaging window on each side; a candidate must exceed the local mean + delta.
  • delta — required prominence above the local mean. Increase to reject smaller peaks.
  • wait — minimum spacing between successive peaks. Suppresses double-trigger.

peakPick is typically used as a post-processing step on 1-D signals such as onset envelopes.

vectorNormalize normType
  • 0 (inf, default) — divide by max absolute value, mapping into [-1, 1] (peak-style normalization).
  • 1 (L1) — divide by sum of absolute values (probability-distribution style).
  • 2 (L2) — divide by sqrt of sum of squares (common feature-vector pre-processing).
  • 3 (power) — divide by sum of squares (energy normalization).

threshold skips normalization when the chosen norm is below it — guards against amplifying near-silent frames.
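
The four norm types and the threshold guard can be summarized in a few lines. The following is a reference sketch written from the descriptions above, not the library source; `refVectorNormalize` is a hypothetical name, and returning an unchanged copy below the threshold is this sketch's reading of "skips normalization".

```typescript
type NormType = 0 | 1 | 2 | 3; // 0 = inf, 1 = L1, 2 = L2, 3 = power

function refNorm(values: Float32Array, normType: NormType): number {
  let acc = 0;
  for (let i = 0; i < values.length; i++) {
    const a = Math.abs(values[i]);
    if (normType === 0) acc = Math.max(acc, a); // max absolute value
    else if (normType === 1) acc += a; // sum of absolute values
    else acc += a * a; // sum of squares (L2 takes the root below)
  }
  return normType === 2 ? Math.sqrt(acc) : acc;
}

function refVectorNormalize(
  values: Float32Array,
  normType: NormType = 0,
  threshold = 1e-12,
): Float32Array {
  const norm = refNorm(values, normType);
  // Below the threshold, return an unchanged copy instead of amplifying noise
  if (norm < threshold) return Float32Array.from(values);
  return values.map((v) => v / norm);
}

const vec = Float32Array.from([3, -4]);
console.log(Array.from(refVectorNormalize(vec, 0))); // [0.75, -1]
console.log(Array.from(refVectorNormalize(vec, 2))); // approximately [0.6, -0.8]
```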

PCEN (Per-Channel Energy Normalization)

typescript
function pcen(
  values: Float32Array,
  nBins: number,
  nFrames: number,
  options?: {
    sampleRate?: number;
    hopLength?: number;
    timeConstant?: number;  // default 0.4
    gain?: number;          // default 0.98
    bias?: number;          // default 2.0
    power?: number;         // default 0.5
    eps?: number;           // default 1e-6
  },
): Float32Array

pcen matches librosa.pcen. Input is a row-major [nBins x nFrames] mel spectrogram; output uses the same layout.
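
In a row-major [nBins x nFrames] buffer, all frames of one bin are contiguous, so a single time frame has to be gathered with a stride. The helper below is a hypothetical utility showing that indexing; it applies to any matrix on this page that uses the same layout.

```typescript
// Extract one time frame (a column) from a row-major [nBins x nFrames] matrix.
// Element (bin, frame) lives at index bin * nFrames + frame.
function frameColumn(
  values: Float32Array,
  nBins: number,
  nFrames: number,
  frame: number,
): Float32Array {
  if (values.length !== nBins * nFrames) {
    throw new Error('values.length must equal nBins * nFrames');
  }
  if (frame < 0 || frame >= nFrames) {
    throw new Error('frame index out of range');
  }
  const column = new Float32Array(nBins);
  for (let bin = 0; bin < nBins; bin++) {
    column[bin] = values[bin * nFrames + frame];
  }
  return column;
}

// 2 bins x 3 frames: bin 0 = [1, 2, 3], bin 1 = [4, 5, 6]
const matrix = Float32Array.from([1, 2, 3, 4, 5, 6]);
console.log(Array.from(frameColumn(matrix, 2, 3, 1))); // [2, 5]
```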

Tonnetz / Tempogram / PLP

typescript
function tonnetz(
  chromagram: Float32Array,   // row-major [nChroma x nFrames]
  nChroma: number,
  nFrames: number,
): Float32Array               // [6 x nFrames]

function tempogram(
  onsetEnvelope: Float32Array,
  sampleRate: number,
  hopLength?: number,         // default 512
  winLength?: number,         // default 384
  mode?: 'autocorrelation' | 'auto' | 'ac' | 'cosine' | 0 | 1,  // default 'autocorrelation'
): { nFrames: number; winLength: number; data: Float32Array }

function fourierTempogram(
  onsetEnvelope: Float32Array,
  sampleRate?: number,
  hopLength?: number,
  winLength?: number,
): { nBins: number; nFrames: number; data: Float32Array }

function cyclicTempogram(
  onsetEnvelope: Float32Array,
  sampleRate: number,
  hopLength?: number,
  winLength?: number,
  bpmMin?: number,            // default 60
  nBins?: number,             // default 60
): { nFrames: number; nBins: number; data: Float32Array }

function tempogramRatio(
  tempogramData: Float32Array,
  winLength?: number,
  sampleRate?: number,
  hopLength?: number,
): Float32Array

function plp(
  onsetEnvelope: Float32Array,
  sampleRate: number,
  hopLength?: number,
  tempoMin?: number,          // default 30
  tempoMax?: number,          // default 300
  winLength?: number,
): Float32Array

These helpers mirror familiar librosa rhythm and harmony features:

| Helper | Meaning |
| --- | --- |
| tonnetz | Corresponds to librosa.feature.tonnetz |
| tempogram | Corresponds to librosa.feature.tempogram; autocorrelation by default |
| fourierTempogram | FFT-based tempogram |
| cyclicTempogram | Tempo classes folded by octave |
| plp | librosa.beat.plp (predominant local pulse) |

For tempogram, pass mode: 'cosine' to use the window-local cosine-similarity variant. The wrapper also accepts 'auto', 'ac', 0, and 1 aliases for parity with lower-level bindings.

See Realtime and Streaming for when to use each.

Resampling

resample(samples, srcSr, targetSr) Medium

High-quality resampling using the r8brain algorithm.

typescript
function resample(
  samples: Float32Array,
  srcSr: number,
  targetSr: number
): Float32Array

Audio Class

The Audio class provides an object-oriented wrapper around the common one-shot functions. It stores the samples and sample rate internally, so you don't need to pass them to every call. Focused helpers such as section/melody/timbre/dynamics analysis and room-acoustic estimation remain standalone in the WASM wrapper.

Audio.fromBuffer(samples, sampleRate)

Create an Audio instance from raw sample data.

typescript
const audio = Audio.fromBuffer(samples, 44100);

sampleRate is optional and defaults to 48000. Always pass the buffer's actual sample rate, since the stored value feeds every instance method.

Audio.fromMemory(bytes)

Decode encoded audio bytes (Uint8Array) such as WAV or MP3 with the native WASM decoder and return an Audio instance. Throws a SonareError when the format is not supported by the bundled decoder.

typescript
const audio = Audio.fromMemory(new Uint8Array(await file.arrayBuffer()));

Audio.fromMemoryWithBrowserFallback(bytes, options?)

This factory is async and returns Promise<Audio>. It tries Audio.fromMemory first, then falls back to the browser codec stack (AudioContext.decodeAudioData) for formats the native decoder lacks, e.g. AAC, OGG, and FLAC. Browser-decoded multi-channel audio is mixed down to mono to match the Audio wrapper contract. It accepts an optional BrowserAudioDecodeOptions (audioContext / createAudioContext / targetSampleRate); a context this helper creates itself is closed afterward.

typescript
const audio = await Audio.fromMemoryWithBrowserFallback(
  new Uint8Array(await file.arrayBuffer()),
);

Properties

| Property | Type | Description |
| --- | --- | --- |
| audio.data | Float32Array | Raw audio samples |
| audio.length | number | Number of samples |
| audio.sampleRate | number | Sample rate (Hz) |
| audio.duration | number | Duration (seconds) |

Instance Methods

Common one-shot helpers are available as instance methods — samples and sampleRate are provided automatically. Focused helpers such as analyzeSections(...), analyzeMelody(...), analyzeDynamics(...), analyzeTimbre(...), and the room-acoustic functions remain standalone calls in the WASM wrapper.

typescript
import {
  init,
  Audio,
  analyzeSections,
  analyzeMelody,
  analyzeDynamics,
  analyzeTimbre,
  detectAcoustic,
} from '@libraz/libsonare';

await init();

const audio = Audio.fromBuffer(samples, 44100);

// Analysis
const bpm = audio.detectBpm();
const key = audio.detectKey();
const keyCandidates = audio.detectKeyCandidates();
const beats = audio.detectBeats();
const downbeats = audio.detectDownbeats();
const onsets = audio.detectOnsets();
const result = audio.analyze();
const chords = audio.detectChords({ useHmm: true });
const sections = analyzeSections(audio.data, audio.sampleRate);
const melody = analyzeMelody(audio.data, audio.sampleRate);
const dynamics = analyzeDynamics(audio.data, audio.sampleRate);
const timbre = analyzeTimbre(audio.data, audio.sampleRate);
const acoustic = detectAcoustic(audio.data, audio.sampleRate);

// Effects
const { harmonic, percussive } = audio.hpss();
const corrected = audio.pitchCorrectToMidi(68.7, 69);
const held = audio.noteStretch({ onsetSample: 12000, offsetSample: 24000, stretchRatio: 1.25 });
const voice = audio.voiceChange({ pitchSemitones: 3, formantFactor: 1.05 });
const stretched = audio.timeStretch(1.5);
const shifted = audio.pitchShift(2);
const normalized = audio.normalize(-3.0);
const trimmed = audio.trim(-60.0);

// Feature extraction
const stftResult = audio.stft();
const mel = audio.melSpectrogram();
const mfcc = audio.mfcc();
const chroma = audio.chroma();
const nnls = audio.nnlsChroma();
const env = audio.onsetEnvelope();
const loudness = audio.lufs();
const centroid = audio.spectralCentroid();
const bandwidth = audio.spectralBandwidth();
const rolloff = audio.spectralRolloff();
const flatness = audio.spectralFlatness();
const zcr = audio.zeroCrossingRate();
const rms = audio.rmsEnergy();
const pitch = audio.pitchPyin();

// Resampling
const resampled = audio.resample(22050);

All parameters (e.g., nFft, hopLength, nMels) have the same defaults as the standalone functions.

Streaming API

The Streaming API enables real-time audio analysis for visualizations and live monitoring. Unlike batch analysis, streaming processes audio chunk by chunk with minimal latency.

When to Use

  • Batch API: Pre-recorded files, full analysis (BPM, key, chords, sections)
  • Streaming API: Live audio, visualizations, real-time feedback

StreamConfig

Configuration options for StreamAnalyzer.

typescript
interface StreamConfig {
  sampleRate?: number;         // default: 44100 (stream default, not 22050)
  nFft?: number;               // default: 2048
  hopLength?: number;          // default: 512
  nMels?: number;              // default: 128
  fmin?: number;               // default: 0
  fmax?: number;               // default: 0 (= sr/2)
  tuningRefHz?: number;        // default: 440
  computeMel?: boolean;        // default: true
  computeChroma?: boolean;     // default: true
  computeOnset?: boolean;      // default: true
  computeSpectral?: boolean;   // default: true
  emitEveryNFrames?: number;   // default: 1 (no throttling)
  magnitudeDownsample?: number;// default: 1
  keyUpdateIntervalSec?: number;  // default: 5
  bpmUpdateIntervalSec?: number;  // default: 10
  window?: number;             // 0=Hann (default), 1=Hamming, 2=Blackman, 3=Rectangular
  outputFormat?: number;       // 0=Float32 (default), 1=Int16, 2=Uint8
}

outputFormat controls how readFramesU8/readFramesI16 quantize on the way out (the analysis itself always runs in float). See Realtime and Streaming.

The legacy computeMagnitude flag is no longer supported; passing it makes the constructor throw. The flag was removed because magnitude frames are not exposed by the StreamAnalyzer read paths; use stft/stftDb offline or the spectrum metering helpers for magnitude data.

streamAnalyzerConfigDefaults() returns a fully-populated StreamConfigDefaults object (a Required<StreamConfig>) holding the library's default values for every field above. Use it to seed a settings UI or to compute a diff against a user-supplied config; StreamAnalyzer itself applies these same defaults for any field you omit.
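
One way to compute the diff mentioned above is sketched below. To stay self-contained it uses a small hand-copied subset of the documented defaults (`documentedDefaults`) rather than calling the library; in an application, the object returned by streamAnalyzerConfigDefaults() would presumably take its place. `diffFromDefaults` is a hypothetical helper.

```typescript
// Hand-copied subset of the defaults listed in the StreamConfig comments above.
// In real code, the result of streamAnalyzerConfigDefaults() would be used instead.
const documentedDefaults = {
  sampleRate: 44100,
  nFft: 2048,
  hopLength: 512,
  nMels: 128,
  emitEveryNFrames: 1,
};
type StreamSettings = typeof documentedDefaults;

// Keep only the fields whose value differs from the default
function diffFromDefaults(
  user: Partial<StreamSettings>,
  defaults: StreamSettings,
): Partial<StreamSettings> {
  const changed: Partial<StreamSettings> = {};
  for (const key of Object.keys(defaults) as (keyof StreamSettings)[]) {
    const value = user[key];
    if (value !== undefined && value !== defaults[key]) {
      changed[key] = value;
    }
  }
  return changed;
}

// nFft matches the default, so only hopLength is reported
console.log(diffFromDefaults({ hopLength: 256, nFft: 2048 }, documentedDefaults));
```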

StreamAnalyzer Class

typescript
class StreamAnalyzer {
  constructor(config: StreamConfig);

  // Process audio chunk (internal offset tracking)
  process(samples: Float32Array): void;

  // Process with external synchronization
  processWithOffset(samples: Float32Array, sampleOffset: number): void;

  // Number of frames ready to read
  availableFrames(): number;

  // Read processed frames (full float precision)
  readFrames(maxFrames: number): FrameBuffer;

  // Quantized reads for bandwidth-reduced transfer / visualization
  // (optional quantizeConfig widens quantization ranges for unusually loud/quiet streams;
  // see Realtime and Streaming → custom quantization ranges)
  readFramesU8(maxFrames: number, quantizeConfig?: StreamQuantizeConfig): StreamFramesU8;   // Uint8 feature arrays
  readFramesI16(maxFrames: number, quantizeConfig?: StreamQuantizeConfig): StreamFramesI16; // Int16 feature arrays

  // Reset state for new stream
  reset(baseSampleOffset?: number): void;

  // Get statistics and progressive estimates
  stats(): AnalyzerStats;

  // Total frames processed
  frameCount(): number;

  // Current time position (seconds)
  currentTime(): number;

  // Get the sample rate
  sampleRate(): number;

  // Set expected total duration for pattern lock timing
  setExpectedDuration(durationSeconds: number): void;

  // Set normalization gain for loud/compressed audio
  setNormalizationGain(gain: number): void;

  // Set tuning reference frequency (default: 440 Hz)
  setTuningRefHz(refHz: number): void;

  // Release resources (call when done). `delete()` is canonical; `dispose()` is an alias.
  delete(): void;
  dispose(): void;
}

FrameBuffer

Structure-of-Arrays format for efficient transfer via postMessage.

typescript
interface FrameBuffer {
  nFrames: number;
  nMels: number;
  timestamps: Float32Array;      // [nFrames]
  mel: Float32Array;             // [nFrames * nMels]
  chroma: Float32Array;          // [nFrames * 12]
  onsetStrength: Float32Array;   // [nFrames]
  rmsEnergy: Float32Array;       // [nFrames]
  spectralCentroid: Float32Array;// [nFrames]
  spectralFlatness: Float32Array;// [nFrames]
  chordRoot: Int32Array;         // [nFrames] per-frame chord root
  chordQuality: Int32Array;      // [nFrames] per-frame chord quality
  chordConfidence: Float32Array; // [nFrames] per-frame chord confidence
}
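
Reading one frame out of this structure-of-arrays layout is a matter of slicing. The sketch below assumes the [nFrames * nMels] and [nFrames * 12] arrays store frames back to back (frame 0 first), which the size comments suggest but do not state outright; `melFrame` and `chromaFrame` are hypothetical helpers.

```typescript
// Only the FrameBuffer fields this sketch reads
interface FrameBufferLike {
  nFrames: number;
  nMels: number;
  mel: Float32Array;
  chroma: Float32Array;
}

// View of the mel bins for one frame, assuming frames are stored back to back
function melFrame(buf: FrameBufferLike, frame: number): Float32Array {
  return buf.mel.subarray(frame * buf.nMels, (frame + 1) * buf.nMels);
}

// View of the 12 chroma bins for one frame, under the same assumption
function chromaFrame(buf: FrameBufferLike, frame: number): Float32Array {
  return buf.chroma.subarray(frame * 12, (frame + 1) * 12);
}

// Synthetic buffer standing in for a readFrames(...) result
const fakeBuffer: FrameBufferLike = {
  nFrames: 2,
  nMels: 3,
  mel: Float32Array.from([1, 2, 3, 4, 5, 6]),
  chroma: new Float32Array(24),
};
console.log(Array.from(melFrame(fakeBuffer, 1))); // [4, 5, 6]
```

Because subarray returns a view, copy the slice before transferring the underlying buffer through postMessage.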

ChordChange

A detected chord change in the progression.

typescript
interface ChordChange {
  root: PitchClass;
  quality: ChordQuality;
  startTime: number;
  confidence: number;
}

BarChord

A chord detected at a bar boundary (beat-synchronized).

typescript
interface BarChord {
  barIndex: number;
  root: PitchClass;
  quality: ChordQuality;
  startTime: number;
  confidence: number;
}

PatternScore

Match score for a known chord progression pattern.

typescript
interface PatternScore {
  name: string;   // pattern name (e.g., "royalRoad", "pop")
  score: number;  // match score (0-1)
}

AnalyzerStats

typescript
interface AnalyzerStats {
  totalFrames: number;
  totalSamples: number;
  durationSeconds: number;
  estimate: ProgressiveEstimate;
}

ProgressiveEstimate

BPM, key, and chord estimates that improve over time as more audio is processed.

typescript
interface ProgressiveEstimate {
  // BPM estimation
  bpm: number;              // 0 if not yet estimated
  bpmConfidence: number;    // 0-1, increases over time
  bpmCandidateCount: number;

  // Key estimation
  key: PitchClass;          // 0-11 (C-B)
  keyMinor: boolean;
  keyConfidence: number;    // 0-1, increases over time

  // Chord estimation (current)
  chordRoot: PitchClass;
  chordQuality: ChordQuality;
  chordConfidence: number;
  chordStartTime: number;
  chordProgression: ChordChange[];     // detected chord changes
  barChordProgression: BarChord[];     // bar-synchronized chords
  currentBar: number;                  // current bar index
  barDuration: number;                 // bar duration in seconds

  // Pattern detection
  votedPattern: BarChord[];            // voted chord for each pattern position
  patternLength: number;              // length of repeating pattern (default: 4 bars)
  detectedPatternName: string;        // best matching pattern name (e.g., "royalRoad")
  detectedPatternScore: number;       // match score (0-1)
  allPatternScores: PatternScore[];   // all known pattern scores

  // Statistics
  accumulatedSeconds: number;
  usedFrames: number;
  updated: boolean;         // true if estimate changed this frame
}
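
The key field is a numeric pitch class, so a display layer usually maps it to a name. A minimal sketch, assuming the 0 to 11 (C to B) ordering given in the comment above; `formatKey` and the sharp-only spelling are choices made for this example, not part of the library.

```typescript
// Pitch-class names in the documented 0-11 (C-B) order; sharps only, by choice
const PITCH_CLASS_NAMES = [
  'C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B',
] as const;

// Hypothetical formatter for ProgressiveEstimate.key / keyMinor
function formatKey(key: number, keyMinor: boolean): string {
  if (!Number.isInteger(key) || key < 0 || key > 11) {
    throw new RangeError(`pitch class out of range: ${key}`);
  }
  return `${PITCH_CLASS_NAMES[key]} ${keyMinor ? 'minor' : 'major'}`;
}

console.log(formatKey(9, true)); // "A minor"
console.log(formatKey(0, false)); // "C major"
```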

Basic Streaming Example

typescript
import { init, StreamAnalyzer } from '@libraz/libsonare';

await init();

// Create analyzer with config object
const analyzer = new StreamAnalyzer({
  sampleRate: 44100,
  nFft: 2048,
  hopLength: 512,
  nMels: 128,
  computeMel: true,
  computeChroma: true,
  computeOnset: true,
  emitEveryNFrames: 1
});

// Process audio chunks (e.g., from AudioWorklet)
function processChunk(samples: Float32Array) {
  analyzer.process(samples);

  // Read available frames
  const available = analyzer.availableFrames();
  if (available > 0) {
    const frames = analyzer.readFrames(available);

    // Use for visualization
    updateVisualization(frames);

    // Check progressive estimates
    const stats = analyzer.stats();
    if (stats.estimate.bpm > 0) {
      console.log(`BPM: ${stats.estimate.bpm.toFixed(1)}`);
      console.log(`Key: ${stats.estimate.key} ${stats.estimate.keyMinor ? 'minor' : 'major'}`);
      console.log(`Current bar: ${stats.estimate.currentBar}`);
      console.log(`Chord progression:`, stats.estimate.chordProgression);
      console.log(`Bar chords:`, stats.estimate.barChordProgression);
    }
  }
}

// Clean up when done (delete() is canonical; dispose() is an alias)
analyzer.delete();

Why call dispose() / delete()? (embind handles)

Classes like StreamAnalyzer, Mixer, and StreamingMasteringChain are C++ objects exposed to JavaScript through embind (Emscripten's C++↔JS bridge).

Each object owns a block of WASM heap memory. The JavaScript garbage collector cannot see or reclaim that memory, so you must release the object yourself.

| Class | Cleanup method |
| --- | --- |
| StreamAnalyzer | delete() |
| Mixer | delete() |
| StreamingMasteringChain | delete() |

StreamAnalyzer also keeps dispose() as a backward-compatibility alias for delete(). Some WASM classes also expose destroy() as an alias. Skipping cleanup slowly leaks WASM memory in long-running pages.

Plain functions like analyze() return ordinary JS values and need no cleanup. Node native cleanup differs; see Native Bindings.
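
A try/finally wrapper is one way to make sure delete() runs even when processing throws. The sketch below is generic and self-contained: `Deletable` and `withHandle` are hypothetical names, and a fake handle stands in for a real embind object such as StreamAnalyzer.

```typescript
// Anything with a delete() method, e.g. an embind-backed object
interface Deletable {
  delete(): void;
}

// Create a handle, run synchronous work with it, and always release it afterwards
function withHandle<T extends Deletable, R>(create: () => T, use: (handle: T) => R): R {
  const handle = create();
  try {
    return use(handle);
  } finally {
    handle.delete(); // runs on both success and thrown errors
  }
}

// Fake handle for demonstration; a real caller would construct a library object here
let released = false;
const answer = withHandle(
  () => ({ value: 21, delete: () => { released = true; } }),
  (handle) => handle.value * 2,
);
console.log(answer, released); // 42 true
```

This pattern suits short, synchronous jobs. A long-lived analyzer fed from an audio callback needs an explicit delete() at teardown instead, and an async `use` callback would see its handle released before the returned promise settles.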

AudioWorklet Integration

worklet-processor.ts:

typescript
import { init, StreamAnalyzer } from '@libraz/libsonare';

class AnalyzerProcessor extends AudioWorkletProcessor {
  private analyzer?: StreamAnalyzer;

  constructor() {
    super();
    void init().then(() => {
      this.analyzer = new StreamAnalyzer({
        sampleRate, // AudioWorkletGlobalScope global: the context's sample rate
        nFft: 2048,
        hopLength: 512,
        nMels: 128,
        computeMel: true,
        computeChroma: true,
        computeOnset: true,
        emitEveryNFrames: 4
      });
    });
  }

  process(inputs: Float32Array[][]): boolean {
    const input = inputs[0]?.[0];
    if (!input || !this.analyzer) return true;

    this.analyzer.process(input);

    const available = this.analyzer.availableFrames();
    if (available >= 4) {
      const frames = this.analyzer.readFrames(available);
      this.port.postMessage(frames, [
        frames.mel.buffer,
        frames.chroma.buffer
      ]);
    }

    return true;
  }
}

registerProcessor('analyzer-processor', AnalyzerProcessor);

Timestamp Synchronization

Stream Time vs AudioContext Time

FrameBuffer.timestamps represents stream time (cumulative input samples), not AudioContext.currentTime. For synchronization:

typescript
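// Capture the AudioContext clock when streaming starts
const startTime = audioContext.currentTime;
const startOffset = 0;

// In the visualization, shift each stream timestamp onto the AudioContext clock
for (let i = 0; i < frames.nFrames; i++) {
  const audioTime = startTime + frames.timestamps[i];
  // draw or schedule frame i at audioTime
}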

Performance Tips

  1. Throttle with emitEveryNFrames: Set to 4 for 60fps visualizations
  2. Process in AudioWorklet: Avoid main thread blocking
  3. Batch reads: Read multiple frames at once when available
  4. Call delete(): Release resources when done to prevent memory leaks

Types

AnalysisResult

typescript
interface AnalysisResult {
  bpm: number;
  bpmConfidence: number;
  key: Key;
  timeSignature: TimeSignature;
  beatTimes: Float32Array;  // Convenience copy of beats[].time, useful for librosa-style code
  beats: Beat[];            // Beat objects with per-beat strength
  chords: Chord[];
  sections: Section[];
  timbre: Timbre;
  dynamics: Dynamics;
  rhythm: RhythmFeatures;
  melody: MelodyContour;
  form: string;  // e.g., "IABABCO"
}

Beat

typescript
interface Beat {
  time: number;      // seconds
  strength: number;  // 0.0 to 1.0
}

Chord

typescript
interface Chord {
  root: PitchClass;
  bass: PitchClass;     // bass note for inversions
  quality: ChordQuality;
  start: number;       // seconds
  end: number;         // seconds
  confidence: number;
  name: string;        // "C", "Am", "G7"
}
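
Because each Chord carries start and end times in seconds, looking up the chord under a playhead is a simple interval search. A minimal sketch with a hypothetical `chordAt` helper; it treats each chord as covering the half-open range [start, end), which is an assumption of the example.

```typescript
// Only the Chord fields this sketch reads
interface ChordSpan {
  start: number; // seconds
  end: number; // seconds
  name: string;
}

// Return the chord whose [start, end) range contains timeSec, if any
function chordAt(chords: ChordSpan[], timeSec: number): ChordSpan | undefined {
  return chords.find((chord) => timeSec >= chord.start && timeSec < chord.end);
}

// Synthetic progression standing in for AnalysisResult.chords
const progression: ChordSpan[] = [
  { start: 0, end: 2, name: 'C' },
  { start: 2, end: 4, name: 'Am' },
  { start: 4, end: 6, name: 'G7' },
];
console.log(chordAt(progression, 2.5)?.name); // "Am"
```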

Section

typescript
interface Section {
  type: SectionType;
  start: number;
  end: number;
  energyLevel: number;
  confidence: number;
  name: string;  // "Intro", "Verse 1", "Chorus"
}

TimeSignature

typescript
interface TimeSignature {
  numerator: number;    // e.g., 4
  denominator: number;  // e.g., 4
  confidence: number;
}

Timbre

typescript
interface Timbre {
  brightness: number;   // 0.0 to 1.0
  warmth: number;
  density: number;
  roughness: number;
  complexity: number;
}

interface TimbreFrame {
  brightness: number;
  warmth: number;
  density: number;
  roughness: number;
  complexity: number;
}

interface TimbreAnalysisResult extends TimbreFrame {
  spectralCentroid: Float32Array;
  spectralFlatness: Float32Array;
  spectralRolloff: Float32Array;
  timbreOverTime: TimbreFrame[];
}

Dynamics

typescript
interface Dynamics {
  dynamicRangeDb: number;
  peakDb: number;
  rmsDb: number;
  loudnessRangeDb: number;
  crestFactor: number;
  isCompressed: boolean;
}

RhythmFeatures

typescript
interface RhythmFeatures {
  syncopation: number;
  grooveType: string;  // "straight", "shuffle", "swing"
  patternRegularity: number;
  tempoStability: number;
  timeSignature: TimeSignature;
}

MelodyContour

typescript
interface MelodyContour {
  pitchRangeOctaves: number;
  pitchStability: number;
  meanFrequency: number;
  vibratoRate: number;     // Hz
  pitches: MelodyPoint[];  // per-frame pitch trajectory
}

MelodyPoint

typescript
interface MelodyPoint {
  time: number;        // frame time in seconds
  frequency: number;   // estimated f0 in Hz (0 when unvoiced)
  confidence: number;  // voicing confidence, 0.0 to 1.0
}
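
Since `frequency` is 0 for unvoiced frames, consumers usually filter those out before converting to musical units. The sketch below is a hypothetical helper that converts voiced `MelodyPoint` frames to fractional MIDI note numbers using the standard relation `midi = 69 + 12 * log2(f / 440)`. The `minConfidence` default of 0.5 is an arbitrary illustration, not a library default.

```typescript
// Local copy of the MelodyPoint shape documented above.
interface MelodyPoint {
  time: number;        // frame time in seconds
  frequency: number;   // estimated f0 in Hz (0 when unvoiced)
  confidence: number;  // voicing confidence, 0.0 to 1.0
}

interface MidiPoint {
  time: number;
  midi: number; // fractional MIDI note number (69 = A4 = 440 Hz)
}

// Hypothetical helper (not part of libsonare).
// The 0.5 confidence threshold is an assumption chosen for the example.
function melodyToMidi(points: MelodyPoint[], minConfidence = 0.5): MidiPoint[] {
  return points
    .filter((p) => p.frequency > 0 && p.confidence >= minConfidence)
    .map((p) => ({ time: p.time, midi: 69 + 12 * Math.log2(p.frequency / 440) }));
}

const midiTrack = melodyToMidi([
  { time: 0.00, frequency: 440, confidence: 0.9 },
  { time: 0.01, frequency: 0, confidence: 0.0 }, // unvoiced frame, dropped
  { time: 0.02, frequency: 880, confidence: 0.8 },
]);
```

Rounding `midi` to the nearest integer gives the closest equal-tempered note; the fractional part is the deviation in semitones.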

Enumerations

PitchClass

typescript
const PitchClass = {
  C: 0, Cs: 1, D: 2, Ds: 3, E: 4, F: 5,
  Fs: 6, G: 7, Gs: 8, A: 9, As: 10, B: 11
} as const;
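
Because `PitchClass` maps names to the integers 0 to 11, transposition is modular arithmetic and display names need a reverse lookup. The sketch below redeclares the table locally so it is self-contained; `transposePitchClass` and `pitchClassName` are hypothetical helpers, not package exports.

```typescript
// Local copy of the PitchClass table documented above.
const PitchClass = {
  C: 0, Cs: 1, D: 2, Ds: 3, E: 4, F: 5,
  Fs: 6, G: 7, Gs: 8, A: 9, As: 10, B: 11
} as const;

type PitchClassName = keyof typeof PitchClass;

// Hypothetical helper: wrap any integer semitone offset into 0..11.
function transposePitchClass(value: number, semitones: number): number {
  return (((value + semitones) % 12) + 12) % 12;
}

// Hypothetical helper: reverse lookup from numeric value to key name.
function pitchClassName(value: number): PitchClassName {
  const names = Object.keys(PitchClass) as PitchClassName[];
  const found = names.find((n) => PitchClass[n] === transposePitchClass(value, 0));
  if (found === undefined) throw new RangeError(`not an integer pitch class: ${value}`);
  return found;
}

const upMinorThird = pitchClassName(transposePitchClass(PitchClass.A, 3));  // "C"
const downSemitone = pitchClassName(transposePitchClass(PitchClass.C, -1)); // "B"
```

The double modulo keeps negative offsets in range, which a single `%` would not do in JavaScript.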

Mode

typescript
const Mode = {
  Major: 0,
  Minor: 1,
  Dorian: 2,
  Phrygian: 3,
  Lydian: 4,
  Mixolydian: 5,
  Locrian: 6
} as const;

ChordQuality

typescript
const ChordQuality = {
  Major: 0, Minor: 1, Diminished: 2, Augmented: 3,
  Dominant7: 4, Major7: 5, Minor7: 6, Sus2: 7, Sus4: 8,
  Unknown: 9, Add9: 10, MinorAdd9: 11, Dim7: 12,
  HalfDim7: 13, Major9: 14, Dominant9: 15, Sus2Add4: 16
} as const;

SectionType

typescript
const SectionType = {
  Intro: 0, Verse: 1, PreChorus: 2, Chorus: 3,
  Bridge: 4, Instrumental: 5, Outro: 6, Unknown: 7
} as const;
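
As an illustration of consuming `SectionType` together with `Section` spans, the sketch below totals the seconds spent in each section type. It redeclares the enum locally and uses only a structural subset of `Section` (`type`, `start`, `end`); both helper names are hypothetical.

```typescript
// Local copy of the SectionType table documented above.
const SectionType = {
  Intro: 0, Verse: 1, PreChorus: 2, Chorus: 3,
  Bridge: 4, Instrumental: 5, Outro: 6, Unknown: 7
} as const;

type SectionTypeName = keyof typeof SectionType;

// Structural subset of the Section interface.
interface SectionSpan {
  type: number;
  start: number; // seconds
  end: number;   // seconds
}

// Hypothetical helper: numeric value -> key name, falling back to "Unknown".
function sectionTypeName(value: number): SectionTypeName {
  const names = Object.keys(SectionType) as SectionTypeName[];
  return names.find((n) => SectionType[n] === value) ?? "Unknown";
}

// Hypothetical helper: total duration per section type.
function secondsBySectionType(sections: SectionSpan[]): Partial<Record<SectionTypeName, number>> {
  const totals: Partial<Record<SectionTypeName, number>> = {};
  for (const s of sections) {
    const key = sectionTypeName(s.type);
    totals[key] = (totals[key] ?? 0) + Math.max(0, s.end - s.start);
  }
  return totals;
}

const sectionTotals = secondsBySectionType([
  { type: SectionType.Intro, start: 0, end: 8 },
  { type: SectionType.Chorus, start: 8, end: 24 },
  { type: SectionType.Chorus, start: 40, end: 56 },
]);
```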

Error Handling

All functions throw if the module is not initialized — call await init() first.

Native (C++) failures throw a structured SonareError: an Error subclass carrying a numeric code and its canonical codeName, mirroring the C ABI error enum. The same failure reports the same numeric code on every binding (WASM, Node native, Python, C ABI), so you can branch on the cause instead of matching message text. The package exports the ErrorCode enum, the SonareError class, and an isSonareError(value) type guard.

typescript
import { ErrorCode, isSonareError, Mixer } from '@libraz/libsonare';

try {
  const mixer = Mixer.fromSceneJson(sceneJson, 48000, 512);
} catch (error) {
  if (isSonareError(error) && error.code === ErrorCode.InvalidParameter) {
    // e.g. 'send timing must be a string ("pre" or "post")'
    console.error(`scene rejected: ${error.codeName}: ${error.message}`);
  } else {
    throw error;
  }
}

| ErrorCode | Value |
| --- | --- |
| Ok | 0 |
| FileNotFound | 1 |
| InvalidFormat | 2 |
| DecodeFailed | 3 |
| InvalidParameter | 4 |
| OutOfMemory | 5 |
| NotSupported | 6 |
| InvalidState | 7 |
| Unknown | 99 |

The codes match Python's SonareError.code and the C ABI SonareError enum, and the Python CLI maps them onto its exit codes.
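To show how the numeric codes can drive logging without the package installed, here is a self-contained sketch. `ErrorCodeTable` is a local mirror of the table above (the real package exports `ErrorCode`), and `numericCodeOf`, `codeNameOf`, and `describeFailure` are hypothetical helpers that duck-type on a numeric `code` property instead of using `isSonareError`.

```typescript
// Local mirror of the documented code table; in an application, import ErrorCode instead.
const ErrorCodeTable = {
  Ok: 0, FileNotFound: 1, InvalidFormat: 2, DecodeFailed: 3,
  InvalidParameter: 4, OutOfMemory: 5, NotSupported: 6, InvalidState: 7,
  Unknown: 99
} as const;

type ErrorCodeName = keyof typeof ErrorCodeTable;

// Hypothetical helper: numeric code -> canonical name, "Unknown" for anything unlisted.
function codeNameOf(code: number): ErrorCodeName {
  const names = Object.keys(ErrorCodeTable) as ErrorCodeName[];
  return names.find((n) => ErrorCodeTable[n] === code) ?? "Unknown";
}

// Hypothetical duck-typed check: returns the numeric `code` if the value carries one.
function numericCodeOf(error: unknown): number | undefined {
  if (typeof error !== "object" || error === null) return undefined;
  const code = (error as Record<string, unknown>)["code"];
  return typeof code === "number" ? code : undefined;
}

// Hypothetical helper: one log line per failure, keyed by code rather than message text.
function describeFailure(error: unknown): string {
  const code = numericCodeOf(error);
  const message = error instanceof Error ? error.message : String(error);
  return code === undefined
    ? `unclassified: ${message}`
    : `${codeNameOf(code)} (${code}): ${message}`;
}

// Simulated native failure carrying code 4 (InvalidParameter).
const simulated = Object.assign(new Error("bad scene"), { code: 4 });
const logLine = describeFailure(simulated);
```

In real code, `isSonareError(error)` and `error.codeName` from the package make the reverse lookup unnecessary; the sketch only demonstrates that the numeric code is enough to classify a failure.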

Mastering API

The browser package includes the same named mastering processors used by the /mastering demo. Decode audio with Web Audio API, pass Float32Array channel buffers to libsonare, then export the returned samples as WAV in your application.

typescript
import {
  init,
  masterAudioStereo,
  masteringChainStereo,
  masteringChainStereoWithProgress,
  masteringAssistantSuggest,
  masteringAudioProfile,
  masteringPresetNames,
  masteringProcessorNames,
  masteringProcess,
  masteringStreamingPreview,
  masteringStereoAnalyze,
} from '@libraz/libsonare'

await init()

console.log(masteringProcessorNames())
console.log(masteringPresetNames())

const result = masteringChainStereo(left, right, sampleRate, {
  spectral: {
    airBand: { amount: 0.35, shelfFrequencyHz: 14000 },
  },
  maximizer: {
    truePeakLimiter: {
      ceilingDb: -1,
      lookaheadMs: 5,
      releaseMs: 50,
      oversampleFactor: 4,
      applyGainAtInputRate: false,
    },
  },
  loudness: {
    targetLufs: -14,
    ceilingDb: -1,
    truePeakOversample: 4,
  },
})

console.log(result.outputLufs, result.appliedGainDb, result.stages)

const presetResult = masterAudioStereo(left, right, sampleRate, 'pop', {
  'loudness.targetLufs': -14,
  'maximizer.truePeakLimiter.releaseMs': 50,
})
console.log(presetResult.outputLufs, presetResult.stages)

const progressResult = masteringChainStereoWithProgress(left, right, sampleRate, {
  loudness: { targetLufs: -14, ceilingDb: -1, truePeakOversample: 4 },
}, (progress, stage) => {
  console.log(`mastering ${(progress * 100).toFixed(0)}%: ${stage}`)
})
console.log(progressResult.outputLufs)

const mono = masteringProcess('spectral.airBand', samples, sampleRate, {
  amount: 0.4,
  shelfFrequencyHz: 14000,
})

const stereoReport = masteringStereoAnalyze('stereo.monoCompatCheck', left, right, sampleRate)
console.log(JSON.parse(stereoReport))

const profile = JSON.parse(masteringAudioProfile(samples, sampleRate, {
  nFft: 2048,
  hopLength: 512,
  truePeakOversample: 4,
}))
const suggestions = JSON.parse(masteringAssistantSuggest(samples, sampleRate, {
  targetLufs: -14,
  ceilingDb: -1,
  preferStreamingSafe: true,
}))
const deliveryPreview = JSON.parse(masteringStreamingPreview(samples, sampleRate, [
  { name: 'YouTube', targetLufs: -14, ceilingDb: -1 },
  { name: 'Podcast', targetLufs: -16, ceilingDb: -1 },
]))
console.log(profile, suggestions, deliveryPreview)

masteringAudioProfile() accepts optional numeric profile settings: nFft, hopLength, and truePeakOversample. masteringAssistantSuggest() accepts targetLufs, ceilingDb, enableRepair, preferStreamingSafe, and speechMonoAmount; snake_case aliases are also accepted by the native bindings.

Use masteringPairProcess() and masteringPairAnalyze() for reference-track workflows such as match analysis or A/B reporting; masteringPairProcessorNames() and masteringPairAnalysisNames() list the available names. Pair inputs should use the same sample rate and comparable duration.
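This section leaves WAV export of the returned samples to the application. As one possible approach, the sketch below packs a stereo pair of `Float32Array` channels into a 16-bit PCM WAV byte array with the canonical 44-byte RIFF header. `encodeWav16` is a hypothetical helper, not a libsonare export, and it applies no dither, which a mastering-grade exporter would normally add.

```typescript
// Hypothetical helper (not part of libsonare): stereo float PCM -> 16-bit PCM WAV bytes.
function encodeWav16(left: Float32Array, right: Float32Array, sampleRate: number): Uint8Array {
  if (left.length !== right.length) throw new RangeError("channel length mismatch");
  const numChannels = 2;
  const bytesPerSample = 2;
  const dataBytes = left.length * numChannels * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataBytes);
  const view = new DataView(buffer);
  const writeTag = (offset: number, tag: string): void => {
    for (let i = 0; i < tag.length; i++) view.setUint8(offset + i, tag.charCodeAt(i));
  };

  writeTag(0, "RIFF");
  view.setUint32(4, 36 + dataBytes, true);                              // RIFF chunk size
  writeTag(8, "WAVE");
  writeTag(12, "fmt ");
  view.setUint32(16, 16, true);                                         // fmt chunk size
  view.setUint16(20, 1, true);                                          // format tag 1 = integer PCM
  view.setUint16(22, numChannels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * numChannels * bytesPerSample, true);  // byte rate
  view.setUint16(32, numChannels * bytesPerSample, true);               // block align
  view.setUint16(34, 16, true);                                         // bits per sample
  writeTag(36, "data");
  view.setUint32(40, dataBytes, true);

  // Clamp to [-1, 1] and scale to the signed 16-bit range; frames are interleaved L, R.
  const toInt16 = (s: number): number => Math.round(Math.max(-1, Math.min(1, s)) * 32767);
  for (let i = 0; i < left.length; i++) {
    view.setInt16(44 + i * 4, toInt16(left[i]), true);
    view.setInt16(44 + i * 4 + 2, toInt16(right[i]), true);
  }
  return new Uint8Array(buffer);
}

// Two stereo frames at 48 kHz: 44 header bytes + 8 data bytes.
const wavBytes = encodeWav16(new Float32Array([1, 0]), new Float32Array([-1, 0.5]), 48000);
```

In a browser, the resulting bytes can be wrapped in a `Blob` with type `audio/wav` for download.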

StreamingEqualizer

StreamingEqualizer is the block-by-block EQ wrapper used for realtime-safe processing: up to 24 bands, zero-latency/natural/linear phase modes, dynamic EQ, mid/side processing, external sidechain input, spectrum snapshots, and offline reference matching. In the WASM wrapper, call init() first and delete() when done.

typescript
import { init, StreamingEqualizer } from '@libraz/libsonare';
await init();

const eq = new StreamingEqualizer({ sampleRate: 48000, maxBlockSize: 512 });
try {
  eq.setBand(0, {
    type: 'HighShelf',
    frequencyHz: 8000,
    gainDb: 4,
    q: 0.7,
    enabled: true,
  });
  eq.setPhaseMode(1); // 1 = zero-latency, 2 = natural, 3 = linear
  eq.setAutoGain(true);

  const { left, right } = eq.processStereo(leftBlock, rightBlock);
  console.log(eq.spectrum(), eq.latencySamples(), left, right);
} finally {
  eq.delete();
}

Source-built C++ CLI equivalents for file-based EQ and filtering:

bash
sonare eq track.wav --type 2 --frequency-hz 8000 --gain-db 4 --q 0.7 -o eq.wav
sonare filter track.wav --type hp --cutoff 80 -o filtered.wav

StreamingRetune

StreamingRetune is the block-by-block mono pitch retune wrapper. It maintains grain and delay state across calls, so use prepare() before the first block and delete() when done.

typescript
import { init, StreamingRetune } from '@libraz/libsonare';
await init();

const retune = new StreamingRetune({ semitones: 3, mix: 1, grainSize: 0 });
retune.prepare(48000, 512);

try {
  const out = retune.processMono(inputBlock);
  retune.setConfig({ semitones: -2, mix: 0.75 });
  console.log(out, retune.config(), retune.grainSize());
} finally {
  retune.delete();
}

Closest CLI equivalents for offline files from the source-built C++ CLI:

bash
sonare pitch-shift vocal.wav --semitones 3 -o shifted.wav
sonare voice-change vocal.wav --pitch-semitones 3 --formant-factor 1.0 -o voice.wav

RealtimeVoiceChanger

RealtimeVoiceChanger is the preset-driven live voice chain. It combines retune, formant, EQ, gate, compressor, de-esser, reverb, and limiter stages, and keeps state across audio blocks. Use it for monitoring, AudioWorklet-style processing, or chunked voice rendering where voiceChange(...) is too simple.

Factory preset IDs are available at runtime with realtimeVoiceChangerPresetNames(). Preset JSON can be fetched and validated with realtimeVoiceChangerPresetJson(...) and validateRealtimeVoiceChangerPresetJson(...). The current schema version is 1.

typescript
import {
  init,
  RealtimeVoiceChanger,
  realtimeVoiceChangerPresetJson,
  realtimeVoiceChangerPresetConfig,
  realtimeVoiceChangerPresetNames,
  validateRealtimeVoiceChangerPresetJson,
  voiceCharacterPresetId,
} from '@libraz/libsonare';

await init();

const preset = realtimeVoiceChangerPresetNames()[1]; // e.g. "bright-idol"
const presetJson = realtimeVoiceChangerPresetJson(preset);
const presetConfig = realtimeVoiceChangerPresetConfig(preset);
console.log(voiceCharacterPresetId(1), validateRealtimeVoiceChangerPresetJson(presetJson).ok, presetConfig);

const changer = new RealtimeVoiceChanger(preset);
changer.prepare(48000, 128, 1);

try {
  const out = changer.processMono(inputBlock);
  const realtime = changer.createRealtimeMonoBuffer(128);
  realtime.input.set(inputBlock.subarray(0, 128));
  realtime.process();
  console.log(out, realtime.output, changer.latencySamples());
} finally {
  changer.delete();
}

The zero-copy buffer helpers (createRealtimeMonoBuffer, createRealtimeInterleavedBuffer, and createRealtimePlanarBuffer) return WASM heap views owned by the changer. Reuse them inside a realtime loop, and discard them after delete().

voiceChangeRealtime(samples, options?)

voiceChangeRealtime(...) is the offline whole-buffer convenience wrapper around RealtimeVoiceChanger. It internally constructs and prepares a changer, runs the per-block render loop for you, then disposes it — matching the Python voice_change_realtime and Node wrappers — so callers do not manage the stateful object themselves.

typescript
function voiceChangeRealtime(
  samples: Float32Array,
  options?: {
    sampleRate?: number;
    preset?: VoicePresetId | number | RealtimeVoiceChangerConfigInput;
    channels?: 1 | 2;   // default 1 (mono); 2 = interleaved stereo (L0,R0,L1,R1,...)
    blockSize?: number; // default 512
  },
): Float32Array  // same layout/length as the input

typescript
import { init, voiceChangeRealtime, realtimeVoiceChangerPresetNames } from '@libraz/libsonare';
await init();

const preset = realtimeVoiceChangerPresetNames()[1]; // e.g. "bright-idol"
const out = voiceChangeRealtime(vocal, { sampleRate: 48000, preset });

channels defaults to 1 (a plain mono buffer); pass channels: 2 for interleaved stereo input. The output has the same layout and length as the input.
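If your decoded audio is held as separate left/right channel buffers, it needs to be packed into the interleaved layout (L0,R0,L1,R1,...) before passing `channels: 2`, and unpacked afterwards. The helpers below are a hypothetical sketch of that conversion; they are not libsonare exports.

```typescript
// Hypothetical helper: planar L/R -> interleaved (L0,R0,L1,R1,...).
function interleaveStereo(left: Float32Array, right: Float32Array): Float32Array {
  if (left.length !== right.length) throw new RangeError("channel length mismatch");
  const out = new Float32Array(left.length * 2);
  for (let i = 0; i < left.length; i++) {
    out[2 * i] = left[i];
    out[2 * i + 1] = right[i];
  }
  return out;
}

// Hypothetical helper: interleaved -> planar L/R.
function deinterleaveStereo(interleaved: Float32Array): { left: Float32Array; right: Float32Array } {
  if (interleaved.length % 2 !== 0) throw new RangeError("interleaved stereo needs an even length");
  const frames = interleaved.length / 2;
  const left = new Float32Array(frames);
  const right = new Float32Array(frames);
  for (let i = 0; i < frames; i++) {
    left[i] = interleaved[2 * i];
    right[i] = interleaved[2 * i + 1];
  }
  return { left, right };
}

const packed = interleaveStereo(new Float32Array([1, 2, 3]), new Float32Array([10, 20, 30]));
const unpacked = deinterleaveStereo(packed);
```

Because the output of voiceChangeRealtime(...) has the same layout and length as its input, the same `deinterleaveStereo` call applies to the processed buffer.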

Use this when you already have the full buffer. Use RealtimeVoiceChanger directly for manual block-by-block live use, and voiceChange(...) when you only need a one-shot pitch/formant change without the full preset chain. See Realtime Voice Changer for the preset list and chain stages.

StreamingMasteringChain

For real-time or memory-constrained use cases, such as processing audio block-by-block from AudioWorklet or a stream, the WASM module exposes StreamingMasteringChain. It accepts a StreamingMasteringChainConfig, which extends masteringChain()'s MasteringChainConfig with two optional streaming-only fields:

  • loudnessStaticGainDb — a precomputed static loudness gain in dB (e.g. targetLufs - measuredIntegratedLufs), applied per block so a preset's streaming preview matches its offline render with a loudness stage enabled.
  • loudnessStaticGainPeakDb — the offline-measured source true-peak in dBFS. When set, the static gain is clamped to loudness.ceilingDb - loudnessStaticGainPeakDb so the streaming limiter is not driven harder than the offline chain.

It otherwise prepares processor state for a fixed block size and applies the chain incrementally.
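The arithmetic behind the two streaming-only fields can be sketched as follows. The function names are hypothetical, the LUFS and true-peak inputs are assumed to come from an earlier offline measurement, and the clamp is assumed to act as an upper bound on the gain (the text above says "clamped to" without stating the direction).

```typescript
// The two streaming-only fields described above.
interface StreamingLoudnessFields {
  loudnessStaticGainDb: number;
  loudnessStaticGainPeakDb: number;
}

// Hypothetical helper: derive the fields from offline measurements.
function streamingLoudnessFields(
  targetLufs: number,
  measuredIntegratedLufs: number,
  sourceTruePeakDb: number,
): StreamingLoudnessFields {
  return {
    loudnessStaticGainDb: targetLufs - measuredIntegratedLufs,
    loudnessStaticGainPeakDb: sourceTruePeakDb,
  };
}

// Hypothetical helper: predict the gain after the documented clamp,
// assuming the clamp is an upper bound of ceilingDb - loudnessStaticGainPeakDb.
function expectedAppliedGainDb(fields: StreamingLoudnessFields, ceilingDb: number): number {
  const headroomDb = ceilingDb - fields.loudnessStaticGainPeakDb;
  return Math.min(fields.loudnessStaticGainDb, headroomDb);
}

// Source measured at -20 LUFS with a -3 dBFS true peak; target -14 LUFS, ceiling -1 dB.
const loudnessFields = streamingLoudnessFields(-14, -20, -3); // static gain: +6 dB
const predictedGainDb = expectedAppliedGainDb(loudnessFields, -1); // headroom: 2 dB
```

In this example the requested +6 dB exceeds the 2 dB of headroom, so the clamp limits the static gain and the limiter is not asked to absorb the remaining 4 dB.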

typescript
import { init, StreamingMasteringChain } from '@libraz/libsonare';
await init();

const chain = new StreamingMasteringChain({
  eq: { tiltDb: 0.5 },
  dynamics: { compressor: { thresholdDb: -20 } },
  maximizer: { truePeakLimiter: { ceilingDb: -1, oversampleFactor: 4 } },
});

chain.prepare(48000, /*maxBlockSize=*/512, /*numChannels=*/2);

const monoOut = chain.processMono(monoBlock);                // 1ch
const { left, right } = chain.processStereo(leftBlock, rightBlock); // 2ch

console.log(chain.stageNames());      // ['eq.tilt', 'dynamics.compressor', ...]
console.log(chain.latencySamples());  // total latency reported by active stages

chain.reset();   // clear processor state without re-preparing
chain.delete();  // release the WASM handle (call when done)

Stereo-only stages are skipped when numChannels === 1.

Repair stages exposed by the chain config are offline-only and throw if enabled on the streaming constructor:

  • repair.declick
  • repair.dereverb
  • repair.denoise

These are the only repair stages the chain config surfaces (per the shipped MasteringChainConfig.repair type). The other repair processors (declip, decrackle, dehum) are not part of the chain config at all and run only through the one-shot helpers masteringRepairDeclip / masteringRepairDecrackle / masteringRepairDehum.

Use masteringChain* or masterAudio* when you need the chain-config repair stages.

The loudness stage is a special case. The streaming chain cannot measure whole-signal integrated LUFS, so an enabled loudness stage throws at construction unless you supply loudnessStaticGainDb (optionally with loudnessStaticGainPeakDb). With those fields set, the chain applies the precomputed static gain plus the loudness stage's true-peak limiter per block instead of throwing.

Use reset() between independent songs that share the same chain. Use delete() to free the underlying handle.
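When the source is a whole buffer rather than a live stream, it has to be fed to a block processor in slices no larger than the prepared maxBlockSize. The sketch below shows that loop with a hypothetical `processInBlocks` helper. The `processBlock` callback stands in for a call such as `(block) => chain.processMono(block)`, and the sketch assumes the processor accepts a shorter final block and returns as many samples as it receives.

```typescript
// Stand-in for a stateful block processor such as chain.processMono.
type BlockProcessor = (block: Float32Array) => Float32Array;

// Hypothetical helper (not part of libsonare): run a buffer through a block processor.
// Assumptions: blocks shorter than maxBlockSize are accepted, and output length equals input length.
function processInBlocks(
  samples: Float32Array,
  maxBlockSize: number,
  processBlock: BlockProcessor,
): Float32Array {
  if (!Number.isInteger(maxBlockSize) || maxBlockSize <= 0) {
    throw new RangeError("maxBlockSize must be a positive integer");
  }
  const out = new Float32Array(samples.length);
  for (let start = 0; start < samples.length; start += maxBlockSize) {
    const end = Math.min(start + maxBlockSize, samples.length);
    const processed = processBlock(samples.subarray(start, end)); // view, no copy
    out.set(processed.subarray(0, end - start), start);
  }
  return out;
}

// Demo with a stand-in processor that halves the signal and records block sizes.
const seenBlockSizes: number[] = [];
const halved = processInBlocks(new Float32Array(1000).fill(1), 512, (block) => {
  seenBlockSizes.push(block.length);
  return block.map((x) => x * 0.5);
});
```

The helper does not compensate for processing delay: if latencySamples() reports a non-zero value, the output is shifted by that many samples relative to the input.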

The named mastering API families are:

| Purpose | Function |
| --- | --- |
| Apply simple loudness mastering | mastering() |
| List built-in mastering presets | masteringPresetNames() |
| Apply a preset to mono audio | masterAudio() |
| Apply a preset to stereo audio | masterAudioStereo() |
| Apply a preset to mono audio with progress | masterAudioWithProgress() |
| Apply a preset to stereo audio with progress | masterAudioStereoWithProgress() |
| Run a full mono chain | masteringChain() |
| Run a full stereo chain | masteringChainStereo() |
| Run a full mono chain with progress | masteringChainWithProgress() |
| Run a full stereo chain with progress | masteringChainStereoWithProgress() |
| Run block-by-block EQ | StreamingEqualizer |
| Run a streaming chain (block-by-block) | StreamingMasteringChain |
| Summarize source audio for mastering decisions | masteringAudioProfile() |
| Suggest mastering moves from source analysis | masteringAssistantSuggest() |
| Preview loudness targets for delivery platforms | masteringStreamingPreview() |
| List mono/stereo processors | masteringProcessorNames() |
| Get machine-readable processor classifications | masteringProcessorCatalog() |
| List chain insert processors | masteringInsertNames() |
| List the parameter keys an insert accepts | masteringInsertParamNames(name) |
| List realtime-automatable insert parameters | masteringInsertParamInfo(name) |
| Process mono audio | masteringProcess() |
| Process stereo audio | masteringProcessStereo() |
| List pair processors | masteringPairProcessorNames() |
| Process source/reference pair | masteringPairProcess() |
| List pair analyses | masteringPairAnalysisNames() |
| Analyze source/reference pair | masteringPairAnalyze() |
| List stereo analyses | masteringStereoAnalysisNames() |
| Analyze stereo channels | masteringStereoAnalyze() |

Related mastering guides: Processing chain, Tone and air, Dynamics, Stereo, limiter, and loudness, Reference match.

Standalone dynamics and repair processors

Every named stage is also a one-shot function, so you can run a single processor without assembling a chain. The dynamics processors return a DynamicsResult (the processed samples plus latencySamples, the processor's look-ahead latency in samples); the repair processors return a Float32Array.

typescript
// Offline dynamics
function masteringDynamicsCompressor(samples: Float32Array, sampleRate: number, options?: CompressorOptions): DynamicsResult
function masteringDynamicsGate(samples: Float32Array, sampleRate: number, options?: GateOptions): DynamicsResult
function masteringDynamicsTransientShaper(samples: Float32Array, sampleRate: number, options?: TransientShaperOptions): DynamicsResult

// Offline repair
function masteringRepairDeclick(samples: Float32Array, sampleRate: number, options?: DeclickOptions): Float32Array
function masteringRepairDeclip(samples: Float32Array, sampleRate: number, options?: DeclipOptions): Float32Array
function masteringRepairDecrackle(samples: Float32Array, sampleRate: number, options?: DecrackleOptions): Float32Array
function masteringRepairDehum(samples: Float32Array, sampleRate: number, options?: DehumOptions): Float32Array
function masteringRepairDenoiseClassical(samples: Float32Array, sampleRate: number, options?: DenoiseClassicalOptions): Float32Array
function masteringRepairDereverbClassical(samples: Float32Array, sampleRate: number, options?: DereverbClassicalOptions): Float32Array
function masteringRepairTrimSilence(samples: Float32Array, sampleRate: number, options?: TrimSilenceOptions): Float32Array

The repair stages are offline-only and are rejected by StreamingMasteringChain. Run them with these one-shot helpers or, for the stages the chain config exposes (declick, dereverb, denoise), inside masteringChain*/masterAudio*. See Dynamics and Repair.

MasteringChainConfig

masteringChain* and StreamingMasteringChain use the nested config schema below. Every key is optional. Only the stages you set are activated.

The chain runs in this order: repair → eq → dynamics → saturation → spectral → stereo → maximizer → loudness.

masterAudio* starts from a preset and accepts overrides using the same key names in flat dot-notation form, such as "dynamics.compressor.thresholdDb".
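To illustrate how the nested schema relates to the flat override form, the sketch below flattens a nested config object into dot-notation keys. `flattenConfig` is a hypothetical helper, not a package export, and it assumes that every leaf path of the nested schema is a valid flat override key, which the text above implies but does not guarantee for every stage.

```typescript
// Minimal recursive config shape for the sketch (leaves are numbers, booleans, or strings).
type ConfigLeaf = number | boolean | string;
interface ConfigObject {
  [key: string]: ConfigLeaf | ConfigObject | undefined;
}

// Hypothetical helper (not part of libsonare): nested config -> flat dot-notation overrides.
function flattenConfig(config: ConfigObject, prefix = ""): Record<string, ConfigLeaf> {
  const flat: Record<string, ConfigLeaf> = {};
  for (const [key, value] of Object.entries(config)) {
    if (value === undefined) continue; // unset stages stay unset
    const path = prefix ? `${prefix}.${key}` : key;
    if (typeof value === "object") {
      Object.assign(flat, flattenConfig(value, path));
    } else {
      flat[path] = value;
    }
  }
  return flat;
}

const flatOverrides = flattenConfig({
  loudness: { targetLufs: -14 },
  maximizer: { truePeakLimiter: { releaseMs: 50 } },
});
// flatOverrides has the keys "loudness.targetLufs" and "maximizer.truePeakLimiter.releaseMs".
```

This lets an application keep a single nested settings object and derive the override map from it, rather than maintaining both forms by hand.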

maximizer.truePeakLimiter.releaseMs controls the post-limiter release time. Omit it to keep the preset/config default of 50 ms; if you provide a flat override, the value is applied directly. maximizer.truePeakLimiter.applyGainAtInputRate applies static loudness gain before oversampling when set, which is useful when you need that gain staged at the source rate for host parity.

Full interface
typescript
interface MasteringChainConfig {
  repair?: {
    denoise?: boolean;
    nFft?: number; hopLength?: number; ddAlpha?: number; gainFloor?: number;
    declick?: { threshold?: number; neighborRatio?: number; maxClickSamples?: number;
                lpcOrder?: number; residualRatio?: number; };
    dereverb?: { threshold?: number; attenuation?: number; nFft?: number;
                 hopLength?: number; t60Sec?: number; lateDelayMs?: number;
                 overSubtraction?: number; spectralFloor?: number;
                 wpeEnabled?: boolean; wpeIterations?: number; wpeTaps?: number;
                 wpeStrength?: number; };
  };
  eq?: { tiltDb?: number; pivotHz?: number };
  dynamics?: {
    compressor?: { thresholdDb?: number; ratio?: number; attackMs?: number;
                   releaseMs?: number; kneeDb?: number; makeupGainDb?: number;
                   autoMakeup?: boolean; };
    deesser?: { frequencyHz?: number; thresholdDb?: number; ratio?: number;
                attackMs?: number; releaseMs?: number; rangeDb?: number;
                bandpassQ?: number; };
    transientShaper?: { attackGainDb?: number; sustainGainDb?: number;
                        fastAttackMs?: number; fastReleaseMs?: number;
                        slowAttackMs?: number; slowReleaseMs?: number;
                        sensitivity?: number; maxGainDb?: number;
                        gainSmoothingMs?: number; lookaheadMs?: number; };
    multibandComp?: { lowCutoffHz?: number; highCutoffHz?: number;
                      lowThresholdDb?: number;  lowRatio?: number;
                      lowAttackMs?: number;     lowReleaseMs?: number;
                      midThresholdDb?: number;  midRatio?: number;
                      midAttackMs?: number;     midReleaseMs?: number;
                      highThresholdDb?: number; highRatio?: number;
                      highAttackMs?: number;    highReleaseMs?: number; };
  };
  saturation?: {
    tape?: { driveDb?: number; saturation?: number; hysteresis?: number;
             outputGainDb?: number; speedIps?: number; headBumpDb?: number;
             bias?: number; gapLoss?: number; };
    exciter?: { frequencyHz?: number; driveDb?: number; amount?: number;
                q?: number; evenOddMix?: number; };
  };
  spectral?: {
    airBand?: { amount?: number; shelfFrequencyHz?: number;
                dynamicThresholdDb?: number; dynamicRangeDb?: number; };
  };
  stereo?: {
    imager?: { width?: number; outputGainDb?: number;
               decorrelationAmount?: number; preserveEnergy?: boolean; };
    monoMaker?: { amount?: number };
  };
  maximizer?: {
    truePeakLimiter?: { ceilingDb?: number; lookaheadMs?: number;
                        releaseMs?: number; oversampleFactor?: number;
                        applyGainAtInputRate?: boolean; };
  };
  loudness?: { targetLufs?: number; ceilingDb?: number;
               truePeakOversample?: number; };
}

interface MasteringResult {
  samples: Float32Array;
  sampleRate: number;
  inputLufs: number;
  outputLufs: number;
  appliedGainDb: number;
  latencySamples?: number;
}
interface MasteringChainResult extends MasteringResult { stages: string[] }
interface MasteringStereoResult {
  left: Float32Array;
  right: Float32Array;
  sampleRate: number;
  inputLufs: number;
  outputLufs: number;
  appliedGainDb: number;
  latencySamples: number;
}
// Returned by masteringChainStereo / masterAudioStereo (and their
// WithProgress variants); MasteringStereoResult is the return type of
// masteringProcessStereo.
interface MasteringStereoChainResult {
  left: Float32Array;
  right: Float32Array;
  sampleRate: number;
  inputLufs: number;
  outputLufs: number;
  appliedGainDb: number;
  stages: string[];
  latencySamples?: number;
}

The glossary mastering guides explain when to reach for each section: Repair, Tone and Air, Dynamics, Stereo, Limiter, Loudness.

Mixing API

The WASM package exposes the libsonare mixing engine. mixStereo(...) is a compact one-shot renderer for stem arrays. Mixer is a persistent scene-based mixer with channel strips, buses, sends, VCA groups, automation, strip meters, and goniometer buffers.

typescript
import {
  Mixer,
  mixStereo,
  mixingScenePresetJson,
  mixingScenePresetNames,
} from '@libraz/libsonare';

mixingScenePresetNames(); // ['vocalReverbSend', ...]

const offline = mixStereo([vocalL, musicL], [vocalR, musicR], sampleRate, {
  inputTrimDb: [3, 0],
  faderDb: [-3, -12],
  pan: [0, -0.2],
  width: [1, 0.9],
  muted: [false, false],
});

const mixer = Mixer.fromSceneJson(mixingScenePresetJson('vocalReverbSend'), sampleRate, 512);
mixer.sceneWarnings(); // non-fatal scene-load warnings: insert params no processor reads (typos)
const block = mixer.processStereo([vocalBlockL, musicBlockL], [vocalBlockR, musicBlockR]);
const meter = mixer.stripMeter(0, 'postFader');

mixer.scheduleFaderAutomation(0, sampleRate * 8, -6, 's-curve');
mixer.schedulePanAutomation(0, sampleRate * 8, -0.25, 'linear');
mixer.scheduleSendAutomation(0, 0, sampleRate * 12, -12, 'hold');

const goniometer = mixer.readGoniometerLatest(0, 256);
const sceneJson = mixer.toSceneJson();
mixer.delete();

Mixer.createRealtimeBuffer() and processStereoInto(...) are intended for AudioWorklet-style render loops where avoiding per-block allocation matters. See Mixing Engine for scene and routing details.

Projects, instruments & live MIDI

The package also exposes the project, synthesis, and live-input surface used to turn MIDI/clip arrangements into audio. These are summarized here; each topic has a dedicated guide.

| Goal | Use | Guide |
| --- | --- | --- |
| Build/load a clip + MIDI arrangement and edit it | Project (Project.fromJson, toSceneJson, MIDI event helpers) | Project Editing |
| Render a project to audio | project.bounceWithSynthInstrument(s) | Project Bounce |
| Pick a built-in synth voice | synthPresetNames(), synthPresetPatch(name), engine.setSynthInstrument(...) | Native Synth |
| Play through a SoundFont | project.loadSoundFont(bytes) / engine.loadSoundFont(bytes) | SoundFont Player |
| Schedule MIDI clips into the live engine, sample-accurately | engine.setMidiClips(...), engine.sampleAtPpq(ppq) | Realtime and Streaming |
| Mix the engine's tracks live with lanes, buses, sends, and strips | engine.setTrackLanes(...), engine.setTrackBuses(...), strip JSON setters | Realtime and Streaming |
| Drive the engine from a hardware/Web MIDI device | bindWebMidi(engine, ...) (browser only) | MIDI Input |
| Feed a live microphone into the engine | bindMicrophoneInput(context, engine, ...) (browser only) | Recording and Takes |

typescript
import { Project, synthPresetNames } from '@libraz/libsonare';

const project = Project.fromJson(projectJson);
const audio = project.bounceWithSynthInstrument(synthPresetNames()[0]);

bounceWithSynthInstrument(...) accepts either one instrument or an array of instruments, one per destination. Each entry may be a preset name (a "va:" routing prefix is allowed), an explicit SynthPatch, or null for the init patch.

bindWebMidi(...) and bindMicrophoneInput(...) are browser-only helpers that wire Web MIDI / a MediaStream into a live RealtimeEngine. See Realtime and Streaming for the engine itself.

Type Export Index

The WASM package exports TypeScript helper types in addition to functions and classes. Use these when typing options, realtime buffers, and callback payloads.

| Area | Exported types/constants |
| --- | --- |
| Environment and engine | EXPECTED_ENGINE_ABI_VERSION, EXPECTED_PROJECT_ABI_VERSION, EngineCapabilities, ProgressCallback |
| Engine lane mixer, markers, and MIDI clips | EngineTrackLane, EngineTrackSend, EngineBus, EngineMarker, EngineMidiClipSchedule, EngineMidiEvent, MarkerKind, ProjectMarker, SurroundPan |
| Key/chord/rhythm/timbre analysis | ChordDetectionOptions, KeyProfileName, RhythmAnalysisResult, TimbreAnalysisResult, TimbreFrame, DynamicsAnalysisResult |
| Spectral and feature transforms | MelPowerResult, StftPowerResult, SpectralRegionOp, SpectralEditOptions, TempogramMode |
| Mastering | MasteringProcessorParams, MasteringProcessorCatalogEntry, MasteringInsertParamInfo, MasteringChannelPolicy, MasteringStereoChainResult |
| Streaming retune | StreamingRetuneConfig |
| Streaming EQ | StreamingEqualizerConfig, EqBandType, EqBandPhase, EqCoeffMode, EqMatchOptions, EqStereoPlacement |
| Realtime voice | VoicePresetId, RealtimeVoiceChangerConfigInput, RealtimeVoiceChangerPodConfig, RealtimeVoiceChangerMonoBuffer, RealtimeVoiceChangerInterleavedBuffer, RealtimeVoiceChangerPlanarBuffer |
| Mixing and Worklet realtime buffers | MixerRealtimeBuffer, SonareScopeRingBuffer, SonareScopeRingReadResult, SonareWorkletScopeSnapshot |

Performance Summary

| API | Load | Notes |
| --- | --- | --- |
| StreamAnalyzer | Real-time | Per-chunk processing, ~2ms/frame, progressive BPM/key/chord estimation |
| Mixer | Real-time | Scene-based block processing with automation and meters |
| analyze / analyzeWithProgress | Heavy | Full analysis pipeline |
| hpss / harmonic / percussive | Heavy | STFT + median filtering |
| timeStretch | Heavy | Phase vocoder |
| pitchShift | Heavy | Time stretch + resample |
| stft / stftDb | Medium | Multiple FFT operations |
| melSpectrogram / mfcc | Medium | STFT + filterbank |
| chroma | Medium | STFT + chroma filterbank |
| pitchYin / pitchPyin | Medium | Per-frame pitch detection |
| resample | Medium | High-quality resampling |
| detectBpm / detectKey | Light | Single result |
| detectBeats / detectOnsets | Light | Frame-based detection |
| Unit conversion functions | Light | Pure computation |
| normalize / trim | Light | Simple processing |

Bundle Size

| File | Size | Gzipped |
| --- | --- | --- |
| sonare.js | ~57 KB | ~14 KB |
| index.js | ~166 KB | ~35 KB |
| sonare.wasm | ~2,986 KB | ~1,070 KB |
| Total | ~3,210 KB | ~1,121 KB |

Browser Support

| Browser | Minimum Version |
| --- | --- |
| Chrome | 57+ |
| Firefox | 52+ |
| Safari | 11+ |
| Edge | 16+ |

Requirements: WebAssembly, ES2017+ (async/await), Web Audio API