Native Bindings

libsonare ships three bindings: browser WASM, Python, and a Node.js native addon. This page compares all three so you can pick one, and documents the Node native addon in detail. See the individual API pages for each language:

  • Python API — ctypes-based bindings with wheels on PyPI
  • Node.js (N-API) — Native addon for direct C++ performance (documented below)

For beginners, the choice is usually simple: use Python for scripts and notebooks, use WASM for browser apps, and use Node native only when you specifically need native file decoding or native runtime performance from Node.js.

You are building | Use | Package
A browser app | WASM | @libraz/libsonare
A Python script or notebook | Python | pip install libsonare
A Node.js app needing native decode or performance | Node native | @libraz/libsonare-native

What You Will Learn

By the end of this page you should be able to:

  • choose between browser WASM, Python, and Node native without treating them as interchangeable packages;
  • build and import the Node N-API addon when native decoding or performance is required;
  • understand which examples use @libraz/libsonare and which use @libraz/libsonare-native;
  • map native addon functions back to the broader JavaScript, Python, mastering, and mixing docs.

Comparison

Feature | WebAssembly | Python | Node.js (N-API)
Platform | Browser | Desktop | Desktop
Distribution | npm (@libraz/libsonare) | PyPI (pip install libsonare) | Source (bindings/node)
Build | Emscripten | Pre-built wheels (or CMake + pip) | CMake + cmake-js
Performance | Near-native | Native | Native
Streaming | Yes | Yes | Yes
File I/O | Sample-based APIs; Audio.fromMemory(...) decodes WAV/MP3 bytes and browser fallback can decode supported formats | WAV/MP3 by default; FFmpeg formats in FFmpeg builds | WAV/MP3 by default; FFmpeg formats in FFmpeg builds
Effects | Yes | Yes | Yes
Feature Extraction | Yes | Yes | Yes
Inverse reconstruction | Yes | Yes | Yes
Unit Conversions | Yes | Yes | Yes
Mastering | Yes | Yes | Yes
Mixing | Yes | Yes | Yes

Node.js (N-API)

The Node.js binding is a native addon built with N-API, providing direct C++ performance without WebAssembly overhead.

What are N-API and a "native addon"?

A native addon is a compiled C/C++ module that Node loads like a normal package. It runs real machine code instead of JavaScript or WebAssembly.

N-API (Node-API) is the stable interface Node provides for building these addons. It shields the addon from V8 engine internals, so one compiled binary can keep working across Node versions without recompiling.

The practical upside is native speed and direct file decoding from Node. The cost is that the addon must be built or installed for your platform, instead of running everywhere like the WASM package.

Choosing A Node Package

Package | Initialization | Use when
@libraz/libsonare | Call await init() before use | You want the browser-compatible WASM package or the exact browser-demo API
@libraz/libsonare-native | No WASM init; import and call functions directly | You need native file decoding, native runtime performance, or source-tree addon development

Examples in JavaScript API use the WASM package. Examples below use the native addon unless the import path is explicitly @libraz/libsonare.

Requirements

  • Node.js 22+
  • CMake 3.16+
  • C++17 compiler
  • Yarn 4+

Installation

bash
git clone https://github.com/libraz/libsonare.git
cd libsonare/bindings/node
yarn install
yarn build

Mastering API

Node users can choose between the WASM npm package and the native addon:

Package | Use when
@libraz/libsonare | You want the same API as the browser demo or need Web-compatible WASM.
@libraz/libsonare-native | You need native file decoding or native runtime performance in Node.js.

typescript
import {
  masterAudioStereo,
  masteringChainStereo,
  masteringAssistantSuggest,
  masteringAudioProfile,
  masteringPresetNames,
  masteringPairAnalyze,
  masteringProcessorNames,
  masteringStreamingPreview,
} from '@libraz/libsonare-native'

console.log(masteringProcessorNames())
console.log(masteringPresetNames())

const mastered = masteringChainStereo(left, right, sampleRate, {
  dynamics: {
    compressor: {
      thresholdDb: -18,
      ratio: 2.2,
      autoMakeup: true,
    },
  },
  loudness: {
    targetLufs: -14,
    ceilingDb: -1,
    truePeakOversample: 4,
  },
})
console.log(mastered.outputLufs, mastered.stages)

const presetMaster = masterAudioStereo(left, right, sampleRate, 'pop', {
  'loudness.targetLufs': -14,
})
console.log(presetMaster.outputLufs, presetMaster.stages)

const matchReport = JSON.parse(
  masteringPairAnalyze('match.referenceLoudness', source, reference, sampleRate),
)

const masteredWithProgress = masteringChainStereo(left, right, sampleRate, {
  loudness: { targetLufs: -14, ceilingDb: -1, truePeakOversample: 4 },
}, (progress, stage) => {
  console.log(`render ${(progress * 100).toFixed(0)}%: ${stage}`)
})
console.log(masteredWithProgress.outputLufs)

const profile = JSON.parse(masteringAudioProfile(samples, sampleRate, {
  nFft: 2048,
  hopLength: 512,
  truePeakOversample: 4,
}))
const suggestions = JSON.parse(masteringAssistantSuggest(samples, sampleRate, {
  targetLufs: -14,
  ceilingDb: -1,
  preferStreamingSafe: true,
}))
const deliveryPreview = JSON.parse(masteringStreamingPreview(samples, sampleRate, [
  { name: 'Streaming', targetLufs: -14, ceilingDb: -1 },
]))
console.log(profile, suggestions, deliveryPreview)

The assistant/profile helpers accept the same option names as the WASM entry points. Profile params are nFft, hopLength, and truePeakOversample; assistant params are targetLufs, ceilingDb, enableRepair, preferStreamingSafe, and speechMonoAmount. The native parser also accepts snake_case aliases.

For long offline renders, pass the optional progress callback to masteringChain(...), masteringChainStereo(...), masterAudio(...), or masterAudioStereo(...) and update your Node UI from that callback.
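Long renders can call the progress callback very often, so a UI usually wants to thin the updates out. The sketch below is one way to do that. It assumes only the (progress, stage) callback shape shown in the example above; makeThrottledProgress is a hypothetical helper, not part of libsonare, and the driver loop is a stand-in so the code runs without the addon.

```typescript
// Signature taken from the callback shown above: (progress in 0..1, stage name).
type ProgressCallback = (progress: number, stage: string) => void;

// Hypothetical helper (not part of libsonare): reports only when the rounded
// percentage advances by at least `stepPercent`, or when the stage changes.
function makeThrottledProgress(
  report: (line: string) => void,
  stepPercent = 10,
): ProgressCallback {
  let lastPercent = -Infinity;
  let lastStage = "";
  return (progress, stage) => {
    const percent = Math.round(Math.min(1, Math.max(0, progress)) * 100);
    if (stage !== lastStage || percent - lastPercent >= stepPercent) {
      lastPercent = percent;
      lastStage = stage;
      report(`render ${percent}%: ${stage}`);
    }
  };
}

// Stand-in driver so the sketch runs without the addon installed.
const progressLines: string[] = [];
const onProgress = makeThrottledProgress((line) => progressLines.push(line));
for (const p of [0, 0.02, 0.05, 0.5, 0.52, 1]) {
  onProgress(p, p < 0.5 ? "dynamics" : "loudness");
}
console.log(progressLines); // three lines: 0% dynamics, 50% loudness, 100% loudness
```

With the real addon, the returned callback would go in the final argument position, for example masteringChainStereo(left, right, sampleRate, config, onProgress).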

The WASM package exposes the same camelCase mastering API names as the browser demo. The main groups are:

Group | API names
Presets and quick entry points | mastering(), masteringPresetNames(), masterAudio(), masterAudioStereo(), masterAudioWithProgress(), masterAudioStereoWithProgress()
Full chains | masteringChain(), masteringChainStereo(), masteringChainWithProgress(), masteringChainStereoWithProgress()
Offline dynamics (one-shot) | masteringDynamicsCompressor(), masteringDynamicsGate(), masteringDynamicsTransientShaper()
Offline repair (one-shot) | masteringRepairDeclick(), masteringRepairDeclip(), masteringRepairDecrackle(), masteringRepairDehum(), masteringRepairDenoiseClassical(), masteringRepairDereverbClassical(), masteringRepairTrimSilence()
Assistant and profiling | masteringAudioProfile(), masteringAssistantSuggest(), masteringStreamingPreview()
Named processors | masteringProcessorNames(), masteringProcessorCatalog(), masteringInsertNames(), masteringInsertParamNames(name), masteringInsertParamInfo(name), masteringProcess(), masteringProcessStereo()
Pair and stereo analysis | masteringPairProcessorNames(), masteringPairProcess(), masteringPairAnalysisNames(), masteringPairAnalyze(), masteringStereoAnalysisNames(), masteringStereoAnalyze()
Streaming render | StreamingMasteringChain

Node native uses the same base names but folds progress into an optional final callback argument instead of exporting separate *WithProgress wrapper functions.

Mixing API

The native addon and the WASM package both expose the mixing surface: mixStereo(...), mixingScenePresetNames(), mixingScenePresetJson(), and the persistent Mixer class.

Use it for channel-strip processing, scene presets, sends, buses, automation, meters, and offline stem rendering. See Mixing Engine for the cross-runtime guide.

For persistent mixers, Node native accepts a StripRef (number | string) for most strip control methods; WASM methods use numeric strip indexes and expose stripById(id) for lookup. Node stripMeter(strip) reads the post-fader meter; use meterTap(strip, 'preFader' | 'postFader') when you need an explicit tap. After loading scene JSON, mixer.sceneWarnings() lists insert params no processor consumed (typically typos) as non-fatal warnings.

Projects, Instruments & Live MIDI

The Node native addon exposes the same headless-DAW surface as WASM and Python:

  • the Project class (tracks, clips, tempo, undo/redo, SMF/MIDI 2.0 interchange);
  • instrument-bound bounces (bounceWithSynthInstrument(s), SoundFont loading);
  • the NativeSynth preset catalog (synthPresetNames() / synthPresetPatch() / SynthPatch);
  • chordFunctionalAnalysis(...);
  • the RealtimeEngine with live MIDI input.

The engine carries the same lane mixer and MIDI clip schedule as the other bindings, with the same camelCase names as WASM (see Realtime and Streaming): setTrackLanes / setTrackBuses, the per-track, master, and bus strip JSON setters, queueable setSoloMute, track pan/law/mode/dual-pan/channel-delay setters, insert-param-by-name setters, wide/scope telemetry, setMidiClips, and sampleAtPpq.

The browser-only glue (bindWebMidi, bindMicrophoneInput) is WASM-specific and not part of the native addon.

The guides carry the depth: Project Editing, Bouncing Projects, Built-in Synthesizer, SoundFont Player, and MIDI Input.

Error Handling

Like the WASM package, the native addon throws a structured SonareError on every native failure: an Error subclass with a numeric code and its canonical codeName, mirroring the C ABI error enum. Both packages export ErrorCode, SonareError, and the isSonareError(value) type guard, and the same failure reports the same numeric code on every binding. See Error Handling for the code table and a usage example.
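One way to consume a structured error of this shape is to convert it into a tagged result at the call boundary. The sketch below uses local stand-ins (SonareErrorStandIn, isSonareErrorStandIn) that only mirror the described code / codeName fields so that it runs without the addon; attempt and Outcome are hypothetical names, and the code value 1 with the name "EXAMPLE_FAILURE" is a placeholder, not an entry from the real code table.

```typescript
// Local stand-ins that mirror the shape described above so the sketch runs
// without the addon. In real code, import SonareError and isSonareError from
// the package instead of defining them.
class SonareErrorStandIn extends Error {
  constructor(
    message: string,
    public readonly code: number,
    public readonly codeName: string,
  ) {
    super(message);
    this.name = "SonareError";
    // Keeps instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

function isSonareErrorStandIn(value: unknown): value is SonareErrorStandIn {
  return value instanceof SonareErrorStandIn;
}

// Hypothetical wrapper: turn a throwing call into a tagged result so callers
// can branch on the numeric code without a try/catch at every call site.
type Outcome<T> =
  | { ok: true; value: T }
  | { ok: false; code: number; codeName: string; message: string };

function attempt<T>(fn: () => T): Outcome<T> {
  try {
    return { ok: true, value: fn() };
  } catch (err) {
    if (isSonareErrorStandIn(err)) {
      return { ok: false, code: err.code, codeName: err.codeName, message: err.message };
    }
    throw err; // not a library failure: let it propagate
  }
}

// The code 1 and name "EXAMPLE_FAILURE" are placeholders, not real table entries.
const failed = attempt<number>(() => {
  throw new SonareErrorStandIn("decode failed", 1, "EXAMPLE_FAILURE");
});
const succeeded = attempt(() => 42);
console.log(failed, succeeded);
```

Errors that are not library failures are rethrown unchanged, so programming mistakes are not silently folded into the result type.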

Audio Wrapper Differences

The WASM Audio class is a convenience wrapper for common one-shot helpers. Focused helpers remain standalone when they are less common or need a different calling shape.

Available as Audio methods | Still standalone in WASM
Core BPM/key/beat/chord analysis | analyzeSections(...)
HPSS and editing helpers | analyzeMelody(...)
Mastering helpers | analyzeDynamics(...)
Feature extraction | analyzeTimbre(...)
Loudness and resampling | Room-acoustic helpers, section/melody/dynamics/timbre helpers

Node native's Audio wrapper is broader because it can call into the native addon directly.

Capability | Node native | WASM
Extra Audio methods | More focused analysis and room-acoustic methods are available as instance methods | Use standalone focused helpers where available
File construction | Audio.fromFile(...), Audio.fromMemory(...) | Audio.fromBuffer(...), Audio.fromMemory(...), Audio.fromMemoryWithBrowserFallback(...)

Node native adds analyzeBpm(...), analyzeImpulseResponse(...), detectAcoustic(...), analyzeRhythm(...), analyzeDynamics(...), analyzeTimbre(...), and detectChords(...) to Audio. The room helpers estimateRoom(...), synthesizeRir(...), and roomMorph(...) remain standalone functions.

StreamingMasteringChain

The native addon (and the WASM package) exposes a StreamingMasteringChain class for incremental rendering — for example when bridging an Electron app, worker, or audio capture pipeline. It accepts the same nested config as masteringChain() and renders one block at a time.

typescript
import { StreamingMasteringChain } from '@libraz/libsonare-native';

const chain = new StreamingMasteringChain({
  eq: { tilt: { tiltDb: 0.5 } },
  dynamics: { compressor: { thresholdDb: -20 } },
  maximizer: { truePeakLimiter: { ceilingDb: -1, oversampleFactor: 4 } },
});

chain.prepare(48000, /*maxBlockSize=*/512, /*numChannels=*/2);

const monoOut = chain.processMono(monoBlock);
const { left, right } = chain.processStereo(leftBlock, rightBlock);

console.log(chain.stageNames(), chain.latencySamples());
chain.reset();   // clear state without rebuilding

Stereo-only stages are skipped when numChannels === 1. The streaming chain rejects offline-only repair stages and the whole-file loudness stage; use masteringChain* or masterAudio* for those. The WASM build also exposes chain.delete() to release the underlying handle; the native addon releases its handle on GC.
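Because the chain is prepared with a maximum block size, a longer buffer has to be fed through in slices. The sketch below shows one way to do the slicing. processInBlocks is a hypothetical helper, not a libsonare API, and it assumes the per-block processor returns as many samples as it receives; the half-gain processor is a stand-in so the code runs without the addon.

```typescript
// Hypothetical helper (not a libsonare API): feed a long mono buffer through a
// per-block processor in chunks no larger than maxBlockSize.
type BlockProcessor = (block: Float32Array) => Float32Array;

function processInBlocks(
  input: Float32Array,
  maxBlockSize: number,
  processBlock: BlockProcessor,
): Float32Array {
  if (!Number.isInteger(maxBlockSize) || maxBlockSize <= 0) {
    throw new RangeError("maxBlockSize must be a positive integer");
  }
  const output = new Float32Array(input.length);
  for (let start = 0; start < input.length; start += maxBlockSize) {
    const end = Math.min(start + maxBlockSize, input.length);
    // subarray() is a view, so no samples are copied on the way in.
    const out = processBlock(input.subarray(start, end));
    // Assumption: the processor returns exactly (end - start) samples.
    output.set(out, start);
  }
  return output;
}

// Stand-in processor (halves the level) so the sketch runs without the addon.
let blockCount = 0;
const halfGain: BlockProcessor = (block) => {
  blockCount += 1;
  return block.map((s) => s * 0.5);
};

const longInput = new Float32Array(1200).fill(1);
const rendered = processInBlocks(longInput, 512, halfGain);
console.log(blockCount, rendered.length); // 3 blocks (512 + 512 + 176), 1200 samples
```

With the real chain, the stand-in would be replaced by something like (block) => chain.processMono(block) after chain.prepare(48000, 512, 1).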

Related mastering guides: Browser local processing, Reference match, Quality checklist.

StreamingEqualizer

StreamingEqualizer is available in Node native, Python, and WASM.

It is useful when you need an EQ that keeps state across processMono / processStereo calls. It can also publish a spectrum snapshot and configure bands from a reference match.

Node native and WASM accept the same phase mode values; Python additionally supports context-manager usage:

Runtime | Phase mode values
Node native | 'zero', 'natural', 'linear', or 1 / 2 / 3
WASM | 'zero', 'natural', 'linear', or 1 / 2 / 3 (same as Node native)
Python | String or numeric modes; also supports with StreamingEqualizer(...) as eq: / eq.close()

typescript
import { StreamingEqualizer } from '@libraz/libsonare-native';

const eq = new StreamingEqualizer({ sampleRate: 48000, maxBlockSize: 512 });
eq.setBand(0, { type: 'HighShelf', frequencyHz: 8000, gainDb: 4, enabled: true });
eq.setPhaseMode('natural');
eq.setAutoGain(true);

const { left, right } = eq.processStereo(leftBlock, rightBlock);
console.log(eq.spectrum(), eq.latencySamples(), left, right);

@libraz/libsonare-native is currently intended to be built from bindings/node in the source tree. To use it from another project, reference the built local package through your workspace or a file: dependency.

The native build auto-detects FFmpeg development libraries via pkg-config. Without FFmpeg it decodes WAV and MP3. To require or disable FFmpeg explicitly:

bash
SONARE_FFMPEG=1 yarn build  # require FFmpeg-backed decoding
SONARE_FFMPEG=0 yarn build  # force WAV/MP3-only decoding

Usage

typescript
import {
  Audio, analyze, detectBpm, detectKey, detectBeats, version
} from '@libraz/libsonare-native';

// Load audio
const audio = Audio.fromFile('music.mp3');
const samples = audio.getData();
const sampleRate = audio.getSampleRate();

// Individual analysis
const bpm = detectBpm(samples, sampleRate);
const key = detectKey(samples, sampleRate);
const beats = detectBeats(samples, sampleRate);

// Full analysis
const result = analyze(samples, sampleRate);
console.log(`BPM: ${result.bpm}`);
console.log(`Key: ${result.key.name}`);     // "C major"
console.log(`Beats: ${result.beatTimes.length}`);

Audio Effects

typescript
import { Audio } from '@libraz/libsonare-native';

const audio = Audio.fromFile('music.mp3');

// Harmonic-Percussive Source Separation
const hpssResult = audio.hpss();
const harmonic = audio.harmonic();
const percussive = audio.percussive();

// Time stretch / pitch shift
const stretched = audio.timeStretch(1.5);      // 1.5x speed
const shifted = audio.pitchShift(2.0);         // Up 2 semitones

// Normalize and trim silence
const normalized = audio.normalize(0.0);        // 0 dB
const trimmed = audio.trim(-60.0);

Feature Extraction

typescript
import { Audio } from '@libraz/libsonare-native';

const audio = Audio.fromFile('music.mp3');

// Spectrogram features
const stftResult = audio.stft(2048, 512);
const mel = audio.melSpectrogram(2048, 512, 128);
const mfcc = audio.mfcc(2048, 512, 128, 13);
const chroma = audio.chroma(2048, 512);

// Spectral features
const centroid = audio.spectralCentroid();
const bandwidth = audio.spectralBandwidth();
const rolloff = audio.spectralRolloff();
const flatness = audio.spectralFlatness();
const zcr = audio.zeroCrossingRate();
const rms = audio.rmsEnergy();

// Pitch detection
const pitchYin = audio.pitchYin();
const pitchPyin = audio.pitchPyin();
console.log(`Median F0: ${pitchPyin.medianF0.toFixed(1)} Hz`);

Unit Conversions

typescript
import {
  hzToMel, melToHz, hzToMidi, midiToHz,
  hzToNote, noteToHz, framesToTime, timeToFrames
} from '@libraz/libsonare-native';

hzToMel(440);        // → Mel scale value
melToHz(549.64);     // → Hz
hzToMidi(440);       // → 69
midiToHz(69);        // → 440
hzToNote(440);       // → "A4"
noteToHz('A4');      // → 440

framesToTime(100, 22050, 512);  // → seconds
timeToFrames(2.32, 22050, 512); // → frame index
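For orientation, the conversions above correspond to well-known formulas. The sketch below writes them out in plain TypeScript; it is an assumption about what the functions compute, not the library's code, and all *Sketch / *Htk names are hypothetical. The HTK mel formula is used because it yields about 549.64 mel for 440 Hz, which is consistent with the value in the example above.

```typescript
// Standard textbook definitions, written out as an assumption about what the
// conversions compute. The library's own implementation may differ in rounding
// or in which mel formula it uses.
const hzToMidiSketch = (hz: number): number => 69 + 12 * Math.log2(hz / 440);
const midiToHzSketch = (midi: number): number => 440 * 2 ** ((midi - 69) / 12);

// HTK mel formula.
const hzToMelHtk = (hz: number): number => 2595 * Math.log10(1 + hz / 700);
const melToHzHtk = (mel: number): number => 700 * (10 ** (mel / 2595) - 1);

// Frame i starts at sample i * hopLength (librosa convention).
const framesToTimeSketch = (frames: number, sr = 22050, hopLength = 512): number =>
  (frames * hopLength) / sr;
const timeToFramesSketch = (time: number, sr = 22050, hopLength = 512): number =>
  Math.floor((time * sr) / hopLength);

console.log(hzToMidiSketch(440));                 // 69
console.log(midiToHzSketch(69));                  // 440
console.log(hzToMelHtk(440).toFixed(2));          // "549.64"
console.log(framesToTimeSketch(100).toFixed(3));  // "2.322"
console.log(timeToFramesSketch(2.32));            // 99
```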

API Reference

Audio

Method | Description
Audio.fromFile(path) | Load WAV/MP3 from disk; also FFmpeg-supported formats when built with FFmpeg
Audio.fromBuffer(samples, sampleRate?) | Create from Float32Array; sampleRate defaults to 48000
Audio.fromMemory(data) | Decode encoded audio bytes with the same format support as fromFile
audio.getData() | Float32Array of samples
audio.getSampleRate() | Sample rate (Hz)
audio.getDuration() | Duration (seconds)
audio.getLength() | Number of samples
audio.destroy() | Release the native handle. Optional: the addon also cleans up on GC, but call this for deterministic cleanup in long-lived processes

The Audio instance also exposes the common analysis, effects, feature, loudness, and mastering helpers as methods. For example, use audio.detectBpm() or audio.masteringChain(config) when you already have an Audio object.

A few focused helpers remain standalone functions, including analyzeSections(...), analyzeMelody(...), cqt(...), and vqt(...). For those, pass audio.getData() and audio.getSampleRate() explicitly.

Cleanup with using (Node 22+)

Every native handle class (Audio, RealtimeEngine, Project, Mixer, and ClipPageProvider) implements [Symbol.dispose], so on Node 22+ you can use the using keyword for automatic, throw-safe cleanup at scope exit. In TypeScript, using declarations require version 5.2 or later:

typescript
import { RealtimeEngine } from '@libraz/libsonare-native';

function render() {
  using engine = new RealtimeEngine(48000, 128);
  engine.setTempo(120);
  // ... the handle is released when this scope ends, even on an exception.
}

On Node versions below 22, keep the explicit-release pattern in a try/finally. destroy() is the canonical native release method on every handle class; Project and Mixer also expose delete() as a WASM-compatible alias. GC also reclaims handles eventually, but using/explicit release gives deterministic cleanup that long-lived processes should prefer.
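The explicit-release pattern can be wrapped once and reused. The sketch below is a minimal version of that idea: withHandle is a hypothetical helper, and FakeHandle is a stand-in so the code runs without the addon. Only the destroy() method name is taken from the text above.

```typescript
// Stand-in for a native handle class; only the destroy() method name comes
// from the text above. Everything else here is illustrative.
class FakeHandle {
  destroyed = false;
  destroy(): void {
    this.destroyed = true;
  }
}

// Hypothetical helper: run `body` with a handle and always release it,
// whether `body` returns or throws.
function withHandle<H extends { destroy(): void }, R>(
  handle: H,
  body: (handle: H) => R,
): R {
  try {
    return body(handle);
  } finally {
    handle.destroy();
  }
}

const okHandle = new FakeHandle();
const bodyResult = withHandle(okHandle, () => "done");

const failingHandle = new FakeHandle();
let caughtMessage = "";
try {
  withHandle(failingHandle, () => {
    throw new Error("render failed");
  });
} catch (err) {
  caughtMessage = err instanceof Error ? err.message : String(err);
}
console.log(bodyResult, okHandle.destroyed, failingHandle.destroyed, caughtMessage);
```

The error still reaches the caller; the finally block only guarantees that the handle is released first.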

Analysis Functions

Function | Return Type | Description
detectBpm(samples, sampleRate?) | number | Tempo in BPM
detectKey(samples, sampleRate?) | Key | Root, mode, confidence
detectBeats(samples, sampleRate?) | Float32Array | Beat timestamps
detectOnsets(samples, sampleRate?) | Float32Array | Onset timestamps
detectChords(samples, sampleRate?, minDuration?, smoothingWindow?, threshold?, useTriadsOnly?, nFft?, hopLength?, useBeatSync?, useHmm?, hmmBeamWidth?, useKeyContext?, keyRoot?, keyMode?, detectInversions?, chromaMethod?) | ChordAnalysisResult | Chord progression with timings. Trailing options enable HMM smoothing, key context, inversions, and the chroma method ('stft' default)
detectDownbeats(samples, sampleRate?) | Float32Array | Downbeat (bar-start) timestamps
detectKeyCandidates(samples, sampleRate?, options?) | KeyCandidate[] | Ranked key candidates with correlation scores
analyze(samples, sampleRate?) | AnalysisResult | Full analysis in one call: bpm, bpmConfidence, key, timeSignature, beatTimes, beats, plus chords, sections, timbre, dynamics, rhythm, melody, and form. The dedicated detect*/analyze* functions below remain available for targeted or parameterized analysis
analyzeWithProgress(samples, sampleRate?, onProgress?) | AnalysisResult | Same as analyze with a (progress, stage) callback for long inputs
analyzeBpm(samples, sampleRate?, options?) | BpmAnalysisResult | Tempo with confidence and alternate candidates. options: bpmMin, bpmMax, startBpm, nFft, hopLength, maxCandidates
analyzeRhythm(samples, sampleRate?, options?) | RhythmResult | Time signature, groove, syncopation. options: bpmMin, bpmMax, startBpm, nFft, hopLength
analyzeDynamics(samples, sampleRate?, options?) | DynamicsResult | Dynamic range, loudness range, crest factor. options: windowSec, hopLength, compressionThreshold
analyzeTimbre(samples, sampleRate?, options?) | TimbreResult | Brightness, warmth, density, roughness, complexity, plus per-window timbreOverTime. options: nFft, hopLength, nMels, nMfcc, windowSec
analyzeSections(samples, sampleRate?, options?) | Section[] | Structural sections (intro/verse/chorus…) with timings. options: nFft, hopLength, minSectionSec. Long inputs may use a pooled boundary grid; use each section's start / end for placement
analyzeMelody(samples, sampleRate?, options?) | MelodyResult | Lead-melody contour (F0 per frame). options: fmin, fmax, frameLength, hopLength, threshold, usePyin, center
detectAcoustic(samples, sampleRate?, options?) | AcousticResult | Room acoustics from a recording (RT60, etc.). options: nOctaveBands, nThirdOctaveSubbands, minDecayDb, noiseFloorMarginDb
analyzeImpulseResponse(samples, sampleRate?, nOctaveBands?) | AcousticResult | Room acoustics from a measured impulse response
estimateRoom(samples, sampleRate?, options?) | RoomEstimateResult | Equivalent-room estimate with volume, dimensions, DRR, absorption bands, RT60 bands, and confidence
synthesizeRir(options?) | RirResult | Mono room impulse response from shoebox geometry
roomMorph(samples, sampleRate, options?) | Float32Array | Offline creative morph toward a target room
lufs(samples, sampleRate?) | LufsResult | Integrated, momentary, short-term loudness and loudness range (ITU-R BS.1770)
lufsInterleaved(samples, channels, sampleRate?) | LufsResult | Channel-weighted multichannel loudness from interleaved samples
ebur128LoudnessRange(samples, sampleRate?) | number | Standards-compliant EBU R128 loudness range (LRA) in LU
momentaryLufs(samples, sampleRate?) | Float32Array | Momentary loudness (400 ms) per step
shortTermLufs(samples, sampleRate?) | Float32Array | Short-term loudness (3 s) per step
version() | string | Library version
voiceChangerAbiVersion() | number | ABI version of the realtime voice-changer POD config; separate from preset JSON schemaVersion
voiceCharacterPresetId(preset) | VoicePresetId or null | Canonical voice-character preset ID for an ordinal or ID
realtimeVoiceChangerPresetConfig(preset) | RealtimeVoiceChangerConfig | Resolved flat POD config for a built-in voice preset, without JSON parsing. Throws on an unknown preset name or out-of-range ordinal
hasFfmpegSupport() | boolean | Whether the loaded native addon can decode via FFmpeg
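lufsInterleaved(...) in the table above takes interleaved samples, while most other helpers on this page work on separate per-channel arrays. The sketch below shows one way to pack channel buffers into an interleaved buffer. interleaveChannels is a hypothetical helper, not a libsonare API, and the frame-major layout [L0, R0, L1, R1, ...] is the usual meaning of "interleaved" rather than something this page states explicitly.

```typescript
// Hypothetical helper (not a libsonare API): pack equal-length channel buffers
// into one interleaved buffer laid out as [L0, R0, L1, R1, ...].
function interleaveChannels(channelData: Float32Array[]): Float32Array {
  const channels = channelData.length;
  if (channels === 0) return new Float32Array(0);
  const frames = channelData[0].length;
  if (channelData.some((c) => c.length !== frames)) {
    throw new RangeError("all channels must have the same length");
  }
  const interleaved = new Float32Array(frames * channels);
  for (let ch = 0; ch < channels; ch++) {
    const source = channelData[ch];
    for (let i = 0; i < frames; i++) {
      // Sample i of channel ch lands at frame i, slot ch.
      interleaved[i * channels + ch] = source[i];
    }
  }
  return interleaved;
}

// Synthetic stereo data so the sketch runs without the addon.
const leftDemo = Float32Array.from([1, 2, 3]);
const rightDemo = Float32Array.from([10, 20, 30]);
const packed = interleaveChannels([leftDemo, rightDemo]);
console.log(Array.from(packed)); // [1, 10, 2, 20, 3, 30]
```

The result would then be passed together with the channel count, for example lufsInterleaved(packed, 2, sampleRate).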

Default sample rates differ by helper family:

Helper family | Default sampleRate
Music analysis, effects, feature, and loudness helpers | 22050
analyzeImpulseResponse, detectAcoustic, estimateRoom, and synthesizeRir in the native wrapper | 48000

Common helpers are also available as Audio instance methods, as noted in the Audio section.

The function tables on this page document the Node native wrapper. The WASM package uses the same camelCase names, but when a WASM function takes a required argument after sampleRate, sampleRate must be passed explicitly and cannot be omitted. See JavaScript API for the browser signatures.

Asynchronous variants (Node only)

The Node addon also exposes Promise-returning variants. They run the DSP pipeline on a libuv worker thread, so the JS event loop is not blocked.

These functions resolve with the same shape as their synchronous counterparts. They are Node-native-only; the WASM build has no worker-thread equivalent.

Progress callbacks are not available on the async path. If you need progress updates, use the synchronous call with onProgress. If you only need concurrency, run several async calls in parallel.

Function | Return Type | Description
analyzeAsync(samples, sampleRate?) | Promise<AnalysisResult> | Async variant of analyze(...)
masterAudioAsync(samples, sampleRate?, presetName?, overrides?) | Promise<MasteringChainResult> | Async variant of masterAudio(...)
masterAudioStereoAsync(left, right, sampleRate?, presetName?, overrides?) | Promise<MasteringChainStereoResult> | Async variant of masterAudioStereo(...)
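Running several async calls in parallel, as suggested above, can be sketched with Promise.allSettled so that one failing input does not reject the whole batch. analyzeBatch is a hypothetical helper, and fakeAnalyzeAsync is a stand-in with the same (samples, sampleRate) shape as analyzeAsync so the code runs without the addon.

```typescript
// Stand-in for a Promise-returning analysis call so the sketch runs without
// the addon; swap in analyzeAsync(samples, sampleRate) in real code.
type AsyncAnalyzer<T> = (samples: Float32Array, sampleRate: number) => Promise<T>;

const fakeAnalyzeAsync: AsyncAnalyzer<{ sampleCount: number }> = async (samples) => ({
  sampleCount: samples.length,
});

// Hypothetical helper: analyse several buffers concurrently and keep failures
// per item instead of rejecting the whole batch.
async function analyzeBatch<T>(
  analyzer: AsyncAnalyzer<T>,
  buffers: Float32Array[],
  sampleRate: number,
): Promise<PromiseSettledResult<T>[]> {
  return Promise.allSettled(buffers.map((b) => analyzer(b, sampleRate)));
}

const batchPromise = analyzeBatch(
  fakeAnalyzeAsync,
  [new Float32Array(10), new Float32Array(20), new Float32Array(30)],
  22050,
);
batchPromise.then((results) => {
  console.log(results.map((r) => r.status)); // all "fulfilled" for the stand-in
});
```

Results come back in input order, so each entry can be matched to its source buffer by index.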

Effects Functions

Function | Return Type | Description
hpss(samples, sr?, kernelHarmonic?, kernelPercussive?) | HpssResult | Harmonic-Percussive Source Separation
hpssWithResidual(samples, sr?, kernelHarmonic?, kernelPercussive?) | HpssWithResidualResult | HPSS with harmonic, percussive, and residual outputs
harmonic(samples, sr?) | Float32Array | Extract harmonic component
percussive(samples, sr?) | Float32Array | Extract percussive component
timeStretch(samples, rate, sr?) | Float32Array | Time-stretch without pitch change
phaseVocoder(samples, rate, sr?, nFft?, hopLength?) | Float32Array | Direct phase-vocoder time scaling
pitchShift(samples, semitones, sr?) | Float32Array | Pitch-shift without tempo change
remix(samples, intervals, sr?, alignZeros?) | Float32Array | Reorder or concatenate sample intervals
normalize(samples, sr?, targetDb?) | Float32Array | Normalize to target dB (default: 0.0)
trim(samples, sr?, thresholdDb?) | Float32Array | Trim silence (default: -60.0 dB)
resample(samples, srcSr, targetSr) | Float32Array | Resample to target sample rate
pitchCorrectToMidi(samples, sr, currentMidi, targetMidi) | Float32Array | Retune a held note from one MIDI pitch to another
noteStretch(samples, sr?, options?) | Float32Array | Time-stretch a single note span in place; options is { onsetSample, offsetSample, stretchRatio }
voiceChange(samples, sr?, options?) | Float32Array | Pitch + formant shift for voice transformation; options is { pitchSemitones, formantFactor }

trim(...) is the simple threshold edit helper. trimSilence(...) below is the librosa-compatible frame/RMS helper that returns the original sample range.

Feature Extraction Functions

Function | Return Type | Description
stft(samples, sr?, nFft?, hopLength?) | StftResult | Short-Time Fourier Transform
stftDb(samples, sr?, nFft?, hopLength?) | StftDbResult | STFT in decibels
melSpectrogram(samples, sr?, nFft?, hopLength?, nMels?) | MelSpectrogramResult | Mel spectrogram
mfcc(samples, sr?, nFft?, hopLength?, nMels?, nMfcc?) | MfccResult | Mel-Frequency Cepstral Coefficients
chroma(samples, sr?, nFft?, hopLength?) | ChromaResult | Chroma features
spectralCentroid(samples, sr?, nFft?, hopLength?) | Float32Array | Spectral centroid per frame
spectralBandwidth(samples, sr?, nFft?, hopLength?) | Float32Array | Spectral bandwidth per frame
spectralRolloff(samples, sr?, nFft?, hopLength?, rollPercent?) | Float32Array | Spectral rolloff per frame
spectralFlatness(samples, sr?, nFft?, hopLength?) | Float32Array | Spectral flatness per frame
spectralContrast(samples, sr?, nFft?, hopLength?, nBands?, fmin?, quantile?) | Matrix2dResult | Spectral contrast, shape (nBands + 1) x nFrames
spectralEdit(samples, sr, ops?, options?) | Float32Array | Region-based STFT edit with gain, attenuate, mute, or heal ops
polyFeatures(samples, sr?, nFft?, hopLength?, order?) | Matrix2dResult | Per-frame polynomial spectral coefficients
zeroCrossingRate(samples, sr?, frameLength?, hopLength?) | Float32Array | Zero-crossing rate per frame
zeroCrossings(samples, threshold?, refMagnitude?, pad?, zeroPos?) | Int32Array | Zero-crossing sample indices
rmsEnergy(samples, sr?, frameLength?, hopLength?) | Float32Array | RMS energy per frame
pitchYin(samples, sr?, frameLength?, hopLength?, fmin?, fmax?, threshold?, fillNa?) | PitchResult | YIN pitch estimation; unvoiced f0 stays NaN unless fillNa is true
pitchPyin(samples, sr?, frameLength?, hopLength?, fmin?, fmax?, threshold?, fillNa?) | PitchResult | pYIN pitch estimation; unvoiced f0 stays NaN unless fillNa is true
pitchTuning(frequencies, resolution?, binsPerOctave?) | number | Tuning offset from frequencies
estimateTuning(samples, sr?, nFft?, hopLength?, resolution?, binsPerOctave?) | number | Tuning offset from audio
cqt(samples, sr?, hopLength?, fmin?, nBins?, binsPerOctave?) | CqtResult | Constant-Q transform magnitude
vqt(samples, sr?, hopLength?, fmin?, nBins?, binsPerOctave?, gamma?) | CqtResult | Variable-Q transform magnitude (gamma controls Q)
nnlsChroma(samples, sr?) | { nChroma, nFrames, data } | NNLS chromagram (note-activation chroma)
decompose(s, nFeatures, nFrames, nComponents, nIter?, beta?) | DecomposeResult | NMF factor matrices from a row-major spectrogram
hybridCqt(samples, sr?, hopLength?, fmin?, nBins?, binsPerOctave?) | CqtResult | Hybrid CQT magnitude (true CQT in low bins, pseudo-CQT in high bins)
pseudoCqt(samples, sr?, hopLength?, fmin?, nBins?, binsPerOctave?) | CqtResult | Approximate (pseudo) CQT magnitude (single FFT)
bassChroma(samples, sr?, hopLength?, nChroma?) | ChromaResult | Bass-focused chroma (low-register pitch-class distribution)
chromaCens(samples, sr?, hopLength?, nChroma?) | ChromaResult | CENS energy-normalized/smoothed chroma
onsetStrengthMulti(samples, sr?, nFft?, hopLength?, nMels?, nBands?) | { nBands, nFrames, data } | Multi-band onset strength (nBands default 3; data row-major [nBands x nFrames])
decomposeWithInit(s, nFeatures, nFrames, nComponents, nIter?, beta?, init?) | DecomposeResult | NMF factor matrices with selectable init ('random' default, 'nndsvd')
nnFilter(s, nFeatures, nFrames, aggregate?, k?, width?) | Matrix2dResult | Nearest-neighbor filtering
onsetEnvelope(samples, sr?, nFft?, hopLength?, nMels?) | Float32Array | Onset strength envelope (the input to the tempogram family)

Default parameters: nFft=2048, hopLength=512, nMels=128, nMfcc=20, pitch fmin=65.0, fmax=2093.0, threshold=0.3, rollPercent=0.85. CQT/VQT use fmin=32.70319566 Hz (C1), nBins=84, and binsPerOctave=12. bassChroma/chromaCens default nChroma=12; onsetStrengthMulti defaults nBands=3; decomposeWithInit defaults nIter=50, beta=2, init='random'.
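Several results in the table above, such as onsetStrengthMulti(...), return a flat data array in row-major order together with its dimensions. The sketch below shows how one row or one column can be read from that layout. bandRow and frameColumn are hypothetical helpers, not libsonare APIs, and the 2 x 3 matrix is synthetic so the code runs without the addon.

```typescript
// Shape taken from the onsetStrengthMulti row above: { nBands, nFrames, data },
// with data stored row-major as [nBands x nFrames].
interface BandMatrix {
  nBands: number;
  nFrames: number;
  data: Float32Array;
}

// Hypothetical helpers (not libsonare APIs).
function bandRow(m: BandMatrix, band: number): Float32Array {
  if (band < 0 || band >= m.nBands) throw new RangeError("band out of range");
  // Row-major: row `band` occupies one contiguous run of nFrames values.
  return m.data.subarray(band * m.nFrames, (band + 1) * m.nFrames);
}

function frameColumn(m: BandMatrix, frame: number): Float32Array {
  if (frame < 0 || frame >= m.nFrames) throw new RangeError("frame out of range");
  const column = new Float32Array(m.nBands);
  for (let b = 0; b < m.nBands; b++) {
    // Element (b, frame) sits at offset b * nFrames + frame.
    column[b] = m.data[b * m.nFrames + frame];
  }
  return column;
}

// Synthetic 2 x 3 matrix so the sketch runs without the addon.
const demoMatrix: BandMatrix = {
  nBands: 2,
  nFrames: 3,
  data: Float32Array.from([1, 2, 3, 4, 5, 6]),
};
console.log(Array.from(bandRow(demoMatrix, 1)));     // [4, 5, 6]
console.log(Array.from(frameColumn(demoMatrix, 2))); // [3, 6]
```

Reading a row is a zero-copy view, while reading a column has to gather strided values into a new array.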

Inverse Reconstruction Functions

Reconstruct a spectrum or audio from a mel spectrogram or MFCC matrix. Phase is estimated with Griffin-Lim, so the round-trip is lossy — see Inverse Features.

Function | Return Type | Description
melToStft(mel, nMels, nFrames, sampleRate?, nFft?, fmin?, fmax?, htk?) | InverseStftResult | Linear STFT power from a mel spectrogram
melToAudio(mel, nMels, nFrames, sr?, nFft?, hopLength?, nIter?, fmin?, fmax?) | Float32Array | Audio from a mel spectrogram (Griffin-Lim)
mfccToMel(mfcc, nMfcc, nFrames, nMels?) | InverseMelResult | Mel spectrogram from MFCC coefficients
mfccToAudio(mfcc, nMfcc, nFrames, nMels?, sampleRate?, nFft?, hopLength?, fmin?, fmax?, nIter?, htk?) | Float32Array | Audio from MFCC coefficients

librosa-Compatible Helpers

These mirror the corresponding librosa functions — see librosa Compatibility for the full mapping.

What each helper is for

  • preemphasis / deemphasis — classic one-tap IIR pre-processing on the waveform.
  • trimSilence / splitSilence — trim leading/trailing silence or split on silent gaps.
  • frameSignal / padCenter / fixLength / fixFrames — framing and size-alignment utilities for fixed-frame DSP.
  • peakPick / vectorNormalize — peak detection on 1-D signals and vector-norm normalization.
  • pcen — dynamic range compression for mel spectrograms.
  • tonnetz — projects chroma into a 6-D harmonic space.
  • tempogram / plp — time-varying tempo representation and dominant local pulse.
| Function | Return Type | Description |
| --- | --- | --- |
| preemphasis(samples, coef?, zi?) | Float32Array | Pre-emphasis filter |
| deemphasis(samples, coef?, zi?) | Float32Array | Inverse pre-emphasis |
| trimSilence(samples, topDb?, frameLength?, hopLength?) | { audio: Float32Array; startSample: number; endSample: number } | librosa.effects.trim, distinct from threshold trim(...) |
| splitSilence(samples, topDb?, frameLength?, hopLength?) | Int32Array | librosa.effects.split: flat [start0, end0, start1, end1, ...] |
| frameSignal(samples, frameLength, hopLength) | { nFrames: number; frames: Float32Array } | librosa.util.frame (row-major) |
| padCenter(values, targetSize, padValue?) | Float32Array | librosa.util.pad_center |
| fixLength(values, targetSize, padValue?) | Float32Array | librosa.util.fix_length |
| fixFrames(frames, xMin?, xMax?, pad?) | Int32Array | librosa.util.fix_frames |
| peakPick(values, preMax, postMax, preAvg, postAvg, delta, wait) | Int32Array | librosa.util.peak_pick |
| vectorNormalize(values, normType?, threshold?) | Float32Array | librosa.util.normalize. normType: 0=inf, 1=L1, 2=L2, 3=power. The Node wrapper defaults threshold to 0.0; WASM defaults it to 1e-12 |
| pcen(values, nBins, nFrames, options?) | Float32Array | librosa.pcen (row-major mel input) |
| tonnetz(chromagram, nChroma, nFrames) | Float32Array | librosa.feature.tonnetz ([6 x nFrames]) |
| tempogram(onsetEnvelope, sr?, hopLength?, winLength?, mode?) | { nFrames: number; winLength: number; data: Float32Array } | librosa.feature.tempogram; mode is 'autocorrelation' (default) or 'cosine' |
| fourierTempogram(onsetEnvelope, sr?, hopLength?, winLength?) | { nBins: number; nFrames: number; data: Float32Array } | librosa.feature.fourier_tempogram |
| cyclicTempogram(onsetEnvelope, sr, hopLength?, winLength?, bpmMin?, nBins?) | { nFrames: number; nBins: number; data: Float32Array } | Cyclic (tempo-octave-invariant) tempogram |
| tempogramRatio(tempogramData, winLength?, sr?, hopLength?) | Float32Array | librosa.feature.tempogram_ratio |
| plp(onsetEnvelope, sr?, hopLength?, tempoMin?, tempoMax?, winLength?) | Float32Array | librosa.beat.plp |
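
The flat [start0, end0, start1, end1, ...] layout that splitSilence(...) is documented to return is compact but awkward to iterate. The sketch below regroups such an array into start/end pairs. It does not call the addon: toIntervals and Interval are hypothetical names introduced here, and the input is stand-in data shaped like a splitSilence(...) result.

```typescript
// Hypothetical helper (not part of @libraz/libsonare-native): regroup a flat
// [start0, end0, start1, end1, ...] array into { start, end } pairs.
interface Interval {
  start: number;
  end: number;
}

function toIntervals(flat: Int32Array): Interval[] {
  if (flat.length % 2 !== 0) {
    throw new RangeError('expected an even number of entries');
  }
  const intervals: Interval[] = [];
  for (let i = 0; i < flat.length; i += 2) {
    intervals.push({ start: flat[i], end: flat[i + 1] });
  }
  return intervals;
}

// Stand-in data shaped like a splitSilence(...) result (sample indices).
const flat = Int32Array.from([0, 2048, 4096, 8192]);
const intervals = toIntervals(flat);
console.log(intervals.length, intervals[1].start, intervals[1].end);
```

With real output, each pair could then be used to slice the original samples, for example with samples.subarray(start, end), assuming the end index is exclusive as in librosa.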

Conversion Functions

| Function | Description |
| --- | --- |
| hzToMel(hz) | Hertz → Mel scale |
| melToHz(mel) | Mel scale → Hertz |
| hzToMidi(hz) | Hertz → MIDI note number |
| midiToHz(midi) | MIDI note number → Hertz |
| hzToNote(hz) | Hertz → note name (e.g., "A4") |
| noteToHz(note) | Note name → Hertz |
| framesToTime(frames, sr?, hopLength?) | Frame index → seconds (sr default 22050, hopLength default 512) |
| timeToFrames(time, sr?, hopLength?) | Seconds → frame index (sr default 22050, hopLength default 512) |
| framesToSamples(frames, hopLength?, nFft?) | Frame index → sample index (librosa.frames_to_samples) |
| samplesToFrames(samples, hopLength?, nFft?) | Sample index → frame index (librosa.samples_to_frames) |
| powerToDb(values, ref?, amin?, topDb?) | Power → dB (librosa.power_to_db) |
| amplitudeToDb(values, ref?, amin?, topDb?) | Amplitude → dB (librosa.amplitude_to_db) |
| dbToPower(values, ref?) | dB → power |
| dbToAmplitude(values, ref?) | dB → amplitude |
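
For orientation, the MIDI and frame/time conversions can be written out from their textbook definitions. This is a reference sketch, not the addon's code: the *Ref function names are hypothetical, A4 = 440 Hz = MIDI 69 is assumed as the tuning, and the floor in timeToFramesRef follows librosa's convention, which the addon is assumed (not confirmed) to share.

```typescript
// Textbook definitions, written out for illustration only. All names ending
// in "Ref" are hypothetical and do not exist in @libraz/libsonare-native.
const A4_HZ = 440;   // assumed tuning reference
const A4_MIDI = 69;

function hzToMidiRef(hz: number): number {
  return A4_MIDI + 12 * Math.log2(hz / A4_HZ);
}

function midiToHzRef(midi: number): number {
  return A4_HZ * Math.pow(2, (midi - A4_MIDI) / 12);
}

// Defaults mirror the table above: sr 22050, hopLength 512.
function framesToTimeRef(frames: number, sr = 22050, hopLength = 512): number {
  return (frames * hopLength) / sr;
}

// Rounds down to the frame that contains the given time (assumption).
function timeToFramesRef(time: number, sr = 22050, hopLength = 512): number {
  return Math.floor((time * sr) / hopLength);
}

console.log(hzToMidiRef(880), midiToHzRef(57), framesToTimeRef(43), timeToFramesRef(1));
```

Comparing these against the addon's hzToMidi(...), framesToTime(...), and timeToFrames(...) on a few values is a quick way to confirm which conventions a given build uses.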

Metering Functions

Standalone level, dynamics, and stereo-image meters. Each accepts an optional options object with a validate flag (default true); pass { validate: false } to skip NaN/Inf input checks on hot paths. The stereo meters require left and right to be equal length.

| Function | Return Type | Description |
| --- | --- | --- |
| meteringPeakDb(samples, sr?, options?) | number | Sample peak (dBFS) |
| meteringRmsDb(samples, sr?, options?) | number | RMS level (dBFS) |
| meteringCrestFactorDb(samples, sr?, options?) | number | Crest factor, peak − RMS (dB) |
| meteringDcOffset(samples, sr?, options?) | number | Mean (DC) offset, linear amplitude |
| meteringTruePeakDb(samples, sr?, oversampleFactor?, options?) | number | Inter-sample (true) peak (dBFS); oversampleFactor is a power of two in 1..16 (default 4) |
| meteringDetectClipping(samples, sr?, options?) | ClippingReport | Clipped-sample runs; options adds threshold (default 0.999) and minRegionSamples (default 1) |
| meteringDynamicRange(samples, sr?, options?) | DynamicRangeReport | Sliding-window dynamic range; options adds windowSec, hopSec, lowPercentile, highPercentile (omit for defaults: window 3 s, hop 1 s, low 0.10, high 0.95) |
| meteringStereoCorrelation(left, right, sr?, options?) | number | Pearson correlation, −1..1 |
| meteringStereoWidth(left, right, sr?, options?) | number | Mid/side stereo width |
| meteringVectorscope(left, right, sr?, options?) | VectorscopeReport | Per-sample mid/side point series |
| meteringPhaseScope(left, right, sr?, options?) | PhaseScopeReport | Phase-scope point series plus summary stats |
| meteringSpectrum(samples, sr?, options?) | SpectrumReport | Single-frame magnitude/power/dB spectrum; options adds nFft, applyOctaveSmoothing, octaveFraction, dbRef, dbAmin |
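
To make the relationship between the first three meters concrete, here is a plain TypeScript sketch of sample peak, RMS, and crest factor using their usual definitions relative to full scale (1.0). The *Ref functions are hypothetical and are not the addon's implementation; in particular, how the addon reports silent input is not covered here.

```typescript
// Reference definitions for illustration; names ending in "Ref" are
// hypothetical and not exported by @libraz/libsonare-native.
function peakDbRef(samples: Float32Array): number {
  let peak = 0;
  for (let i = 0; i < samples.length; i++) {
    peak = Math.max(peak, Math.abs(samples[i]));
  }
  return 20 * Math.log10(peak); // -Infinity for all-zero input
}

function rmsDbRef(samples: Float32Array): number {
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return 20 * Math.log10(Math.sqrt(sumSquares / samples.length));
}

// Crest factor is the gap between peak and RMS level, in dB.
function crestFactorDbRef(samples: Float32Array): number {
  return peakDbRef(samples) - rmsDbRef(samples);
}

// A square wave at half scale: peak and RMS coincide, so the crest factor is 0 dB.
const square = Float32Array.from([0.5, -0.5, 0.5, -0.5]);
console.log(peakDbRef(square), rmsDbRef(square), crestFactorDbRef(square));
```

A sine wave run through the same functions would show a crest factor of roughly 3 dB, which is a convenient sanity check when comparing against meteringCrestFactorDb(...).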

Scale Quantization

12-TET scale helpers for building pitch-correction targets. modeMask is a 12-bit mask in which bit i enables the pitch class i semitones above root (root is a PitchClass, C = 0); natural major is 0b101010110101, which sets bits 0, 2, 4, 5, 7, 9, and 11. referenceMidi is the tuning anchor (pass 0 for A4 = 69). Pair these helpers with pitchCorrectToMidi(...) to retune to the nearest scale degree.

| Function | Return Type | Description |
| --- | --- | --- |
| scaleQuantizeMidi(root, modeMask, midi, referenceMidi?) | number | Snap a (fractional) MIDI number to the nearest enabled pitch class |
| scaleCorrectionSemitones(root, modeMask, midi, referenceMidi?) | number | Correction (quantized − input), in semitones |
| scalePitchClassEnabled(root, modeMask, pitchClass) | boolean | Whether pitchClass (0..11) is enabled relative to root |
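
The modeMask encoding is easier to see in code. The sketch below builds a mask from scale degrees and tests membership relative to a root, following the description above (bit i corresponds to the pitch class i semitones above root). buildModeMask and pitchClassEnabledRef are hypothetical helpers written for illustration; they mirror the documented behaviour of scalePitchClassEnabled(...) but are not the addon's code.

```typescript
// Hypothetical helpers, not exported by @libraz/libsonare-native.

// Build a 12-bit mask from semitone offsets above the root.
function buildModeMask(degrees: number[]): number {
  let mask = 0;
  for (const degree of degrees) {
    const wrapped = ((degree % 12) + 12) % 12; // keep offsets in 0..11
    mask |= 1 << wrapped;
  }
  return mask;
}

// Natural major: whole, whole, half, whole, whole, whole, half.
const NATURAL_MAJOR = buildModeMask([0, 2, 4, 5, 7, 9, 11]); // 0b101010110101

// Assumed reading of the docs: shift the pitch class so the root sits at bit 0.
function pitchClassEnabledRef(root: number, modeMask: number, pitchClass: number): boolean {
  const relative = (((pitchClass - root) % 12) + 12) % 12;
  return ((modeMask >> relative) & 1) === 1;
}

// D major (root 2) contains F# (6) but not F (5).
console.log(pitchClassEnabledRef(2, NATURAL_MAJOR, 6), pitchClassEnabledRef(2, NATURAL_MAJOR, 5));
```

Other modes follow the same pattern; for example, natural minor would be built from the offsets [0, 2, 3, 5, 7, 8, 10].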

Streaming and Realtime Classes

Beyond the one-shot functions, the native addon exposes the same streaming and realtime classes as the WASM build:

| Class | Purpose |
| --- | --- |
| StreamAnalyzer | Block-by-block analysis with progressive BPM/key estimates and readFramesSoa/readFramesI16/readFramesU8. See Realtime Streaming. |
| StreamingEqualizer | Real-time-safe block EQ. |
| StreamingMasteringChain | Incremental mastering render (documented above). |
| RealtimeVoiceChanger | Preset-driven live voice chain for block processing. |
| Mixer | Persistent multi-strip mixer from a JSON scene. See Mixing Engine. |
| RealtimeEngine | Transport/clip/automation engine for DAW-style hosting. |
typescript
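import { StreamAnalyzer } from '@libraz/libsonare-native';

// Placeholder input: one block of silence. Replace with real audio samples.
const block = new Float32Array(512);

const analyzer = new StreamAnalyzer({ sampleRate: 48000, computeMel: true, computeOnset: true });
analyzer.process(block);                 // feed a Float32Array block
// Read every frame that is ready, in float Structure-of-Arrays layout.
const frames = analyzer.readFramesSoa(analyzer.availableFrames());
const stats = analyzer.stats();          // stats.estimate.bpm / .key (PitchClass int)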

In Node native, the float Structure-of-Arrays read is named readFramesSoa(...). The WASM wrapper exposes the same operation as readFrames(...), which is the name used in the browser examples.

RealtimeVoiceChanger in Node native is constructed with { sampleRate, maxBlockSize, channels, preset }, then used with processMono(...), processMonoInto(...), processInterleaved(...), or processPlanarStereo(...). For offline convenience, voiceChangeRealtime(...) runs a whole mono buffer through the same preset chain in 512-sample blocks.

typescript
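import {
  RealtimeVoiceChanger,
  realtimeVoiceChangerPresetConfig,
  realtimeVoiceChangerPresetNames,
  voiceCharacterPresetId,
  voiceChangeRealtime,
} from '@libraz/libsonare-native';

// Placeholder inputs (silence). Replace with real mono audio.
const inputBlock = new Float32Array(128);   // one block, at most maxBlockSize samples
const vocal = new Float32Array(48000);      // one second at 48 kHz

const changer = new RealtimeVoiceChanger({
  sampleRate: 48000,
  maxBlockSize: 128,
  channels: 1,
  preset: 'bright-idol',
});

// Block-by-block path for live use.
const blockOut = changer.processMono(inputBlock);
// Offline convenience: run the whole buffer through a preset chain.
const rendered = voiceChangeRealtime(vocal, 48000, 'soft-whisper');
const presetConfig = realtimeVoiceChangerPresetConfig('bright-idol');
console.log(
  voiceCharacterPresetId(1),
  realtimeVoiceChangerPresetNames(),
  presetConfig,
  changer.latencySamples(),
  blockOut,
  rendered,
);
changer.destroy();                          // release the native instance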
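
The idea of running a whole buffer through a block processor in fixed-size chunks, as voiceChangeRealtime(...) is described as doing with 512-sample blocks, can be sketched independently of the addon. processInBlocks is a hypothetical helper; it assumes the callback returns exactly as many samples as it receives and that a shorter final block is passed through as-is, neither of which is confirmed for the addon.

```typescript
// Hypothetical helper, not part of @libraz/libsonare-native: feed a long
// buffer to a block processor in fixed-size chunks and collect the output.
function processInBlocks(
  input: Float32Array,
  blockSize: number,
  processBlock: (block: Float32Array) => Float32Array,
): Float32Array {
  if (!Number.isInteger(blockSize) || blockSize <= 0) {
    throw new RangeError('blockSize must be a positive integer');
  }
  const output = new Float32Array(input.length);
  for (let offset = 0; offset < input.length; offset += blockSize) {
    const end = Math.min(offset + blockSize, input.length);
    // subarray creates a view, so no samples are copied on the way in.
    const block = input.subarray(offset, end);
    output.set(processBlock(block), offset);
  }
  return output;
}

// Stand-in processor: a fixed gain of 0.5 in place of a real voice chain.
let calls = 0;
const halved = processInBlocks(new Float32Array(1200).fill(1), 512, (block) => {
  calls += 1;
  return block.map((sample) => sample * 0.5);
});
console.log(calls, halved.length);
```

In an application, the callback could wrap changer.processMono(...), provided blockSize does not exceed the maxBlockSize the changer was constructed with.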

RealtimeEngine is shared at the class level, but a few wrapper details differ.

| Detail | WASM | Node native |
| --- | --- | --- |
| Capability check | Adds engineCapabilities() and checks ABI compatibility before construction | Exposes engineAbiVersion() but not the browser capability helper |
| Capture buffer setup | setCaptureBuffer(numChannels, capacityFrames) | setCaptureBuffer(channels) with preallocated channel buffers |

Types

typescript
interface Key {
  root: string;        // Pitch-class name, e.g. "C", "C#", "A"
  mode: string;        // Mode name, e.g. "major", "minor"
  confidence: number;
  name: string;        // "C major", "A minor"
  shortName: string;   // "C", "Am"
}

interface TimeSignature {
  numerator: number;
  denominator: number;
  confidence: number;
}

interface AnalysisResult {
  bpm: number;
  bpmConfidence: number;
  key: Key;
  timeSignature: TimeSignature;
  beatTimes: Float32Array;                       // Derived from beats[].time
  beats: Array<{ time: number; strength: number }>;
  chords: AnalysisChord[];                       // Detected chord progression
  sections: AnalysisSection[];                   // Song-structure sections
  timbre: AnalysisTimbre;                        // Aggregate timbre summary
  dynamics: AnalysisDynamics;                    // Aggregate dynamics summary
  rhythm: AnalysisRhythm;                        // Aggregate rhythm summary
  melody: AnalysisMelody;                        // Melody-contour summary
  form: string;                                  // Musical form label, e.g. "AABA"
}
// analyze() returns the full result above. The dedicated detect*/analyze*
// functions remain available for targeted or parameterized analysis.

interface HpssResult {
  harmonic: Float32Array;
  percussive: Float32Array;
  sampleRate: number;
}

interface StftResult {
  nBins: number;
  nFrames: number;
  nFft: number;
  hopLength: number;
  sampleRate: number;
  magnitude: Float32Array;  // nBins × nFrames, row-major
  power: Float32Array;      // nBins × nFrames, row-major
}

interface StftDbResult {
  nBins: number;
  nFrames: number;
  db: Float32Array;         // Power in decibels
}

interface MelSpectrogramResult {
  nMels: number;
  nFrames: number;
  sampleRate: number;
  hopLength: number;
  power: Float32Array;      // nMels × nFrames, row-major
  db: Float32Array;         // nMels × nFrames, row-major
}

interface MfccResult {
  nMfcc: number;
  nFrames: number;
  coefficients: Float32Array;  // nMfcc × nFrames, row-major
}

interface ChromaResult {
  nChroma: number;
  nFrames: number;
  sampleRate: number;
  hopLength: number;
  features: Float32Array;   // nChroma × nFrames, row-major
  meanEnergy: number[];     // nChroma values
}

interface PitchResult {
  f0: Float32Array;         // Fundamental frequency per frame (Hz)
  voicedProb: Float32Array; // Voicing probability per frame (0–1)
  voicedFlag: boolean[];    // Voiced/unvoiced decision per frame
  nFrames: number;
  medianF0: number;
  meanF0: number;
}
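
Several of the result types above store a matrix as one flat, row-major Float32Array (for example nBins × nFrames). Under the usual row-major convention, the value for row r and column c sits at index r * nCols + c. The sketch below shows that indexing with two hypothetical helpers, valueAt and frameColumn; it assumes rows correspond to bins (or mel bands, coefficients, chroma) and columns to frames, as the "nBins × nFrames" comments suggest.

```typescript
// Hypothetical helpers, not exported by @libraz/libsonare-native.

// Read one value from a flat row-major nRows × nCols matrix.
function valueAt(
  data: Float32Array,
  nRows: number,
  nCols: number,
  row: number,
  col: number,
): number {
  if (row < 0 || row >= nRows || col < 0 || col >= nCols) {
    throw new RangeError('index out of bounds');
  }
  return data[row * nCols + col];
}

// Gather one column, e.g. every bin of a single frame.
function frameColumn(
  data: Float32Array,
  nRows: number,
  nCols: number,
  col: number,
): Float32Array {
  const out = new Float32Array(nRows);
  for (let row = 0; row < nRows; row++) {
    out[row] = valueAt(data, nRows, nCols, row, col);
  }
  return out;
}

// Stand-in 2 × 3 matrix: row 0 is [0, 1, 2], row 1 is [10, 11, 12].
const demo = Float32Array.from([0, 1, 2, 10, 11, 12]);
console.log(valueAt(demo, 2, 3, 1, 2), frameColumn(demo, 2, 3, 1));
```

For an StftResult this would be called as valueAt(result.magnitude, result.nBins, result.nFrames, bin, frame). Because rows are contiguous, a whole row (one bin across all frames) can be taken without copying via subarray(row * nCols, (row + 1) * nCols).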

The native package also exports TypeScript helper types for option objects, callbacks, streaming snapshots, and realtime engine messages. Use these names when annotating application code instead of re-declaring the shapes locally.

| Area | Exported types |
| --- | --- |
| Analysis options/results | AnalysisProgressCallback, BpmCandidate, ChordChromaMethod, KeyMode, KeyProfile, MelodyPoint, SectionTypeOrdinal, TempogramMode, TrimSilenceMode |
| Streaming analysis | StreamAnalyzerConfig, StreamAnalyzerStats, StreamFramesSoa, StreamProgressiveEstimate, StreamChordChange, StreamBarChord, StreamPatternScore |
| Mastering and metering | MasteringPreset, SoloProcessor, StreamingPlatform, DynamicsProcessorResult, CompressorDetector, DecrackleMode, DenoiseClassicalMode, DenoiseClassicalNoiseEstimator, EqBandInput, EqPhaseMode, EqSpectrumSnapshot |
| Mixing | AutomationCurve, GoniometerPoint, MeterTap, MixMeterSnapshot, MixResult, MixerProcessResult, PanLaw, PanMode, SendTiming |
| Realtime voice | VoicePresetId, RealtimeVoiceChangerConfigInput, RealtimeVoiceChangerConfig, RealtimeVoiceChangerOptions |
| Realtime engine graph | EngineGraphSpec, EngineGraphNode, EngineGraphNodeType, EngineGraphConnection, EngineGraphMix, EngineGraphParameterBinding, EngineParameterInfo |
| Realtime engine transport | EngineTransportState, EngineMarker, EngineClip, EngineAutomationPoint, EngineAutomationPointCurve, EngineMetronomeConfig |
| Realtime engine jobs/telemetry | EngineBounceOptions, EngineBounceResult, EngineFreezeOptions, EngineFreezeResult, EngineCaptureStatus, EngineTelemetry, EngineTelemetryType, EngineTelemetryError, EngineMeterTelemetry |