Architecture

This document describes the internal architecture of libsonare.

Module Overview

Directory Structure

src/
├── util/               # Level 0: Basic utilities
│   ├── types.h         # MatrixView, ErrorCode, enums
│   ├── exception.h     # SonareException
│   └── math_utils.h    # mean, variance, argmax, etc.

├── core/               # Level 1-3: Core DSP
│   ├── convert.h       # Hz/Mel/MIDI conversion
│   ├── window.h        # Hann, Hamming, Blackman
│   ├── fft.h           # KissFFT wrapper
│   ├── spectrum.h      # STFT/iSTFT
│   ├── audio.h         # Audio buffer
│   ├── audio_io.h      # WAV/MP3 loading
│   └── resample.h      # r8brain resampling

├── filters/            # Level 4: Filterbanks
│   ├── mel.h           # Mel filterbank
│   ├── chroma.h        # Chroma filterbank
│   ├── dct.h           # DCT for MFCC
│   └── iir.h           # IIR filters

├── feature/            # Level 4: Feature extraction
│   ├── mel_spectrogram.h
│   ├── chroma.h
│   ├── spectral.h
│   ├── onset.h
│   └── pitch.h

├── effects/            # Level 5: Audio effects
│   ├── hpss.h
│   ├── time_stretch.h
│   ├── pitch_shift.h
│   └── normalize.h

├── analysis/           # Level 6: Music analysis
│   ├── music_analyzer.h
│   ├── bpm_analyzer.h
│   ├── key_analyzer.h
│   ├── beat_analyzer.h
│   ├── chord_analyzer.h
│   ├── section_analyzer.h
│   └── ...

├── streaming/          # Level 6: Real-time streaming
│   ├── stream_analyzer.h   # Main streaming analyzer
│   ├── stream_config.h     # Configuration options
│   └── stream_frame.h      # Frame and buffer types

├── quick.h             # Simple function API
├── sonare.h            # Unified include header
├── sonare_c.h          # C API header
└── wasm/
    └── bindings.cpp    # Embind bindings

Data Flow

Audio Analysis Pipeline

Audio Effects Pipeline

Streaming Pipeline

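The streaming pipeline processes audio in real time, chunk by chunk. Because analysis frames overlap, it keeps the samples at the end of each chunk as overlap state so that the next chunk continues the frame sequence without a gap.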
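As an illustration of carrying overlap state between chunks, here is a minimal sketch. `FrameBuffer` and its methods are hypothetical names invented for this example and are not libsonare's `StreamAnalyzer` API; it assumes `0 < hop_length <= n_fft`.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of libsonare): buffers incoming samples and
// emits fixed-size frames that overlap by n_fft - hop_length samples.
class FrameBuffer {
public:
    FrameBuffer(std::size_t n_fft, std::size_t hop_length)
        : n_fft_(n_fft), hop_(hop_length) {
        if (hop_ == 0 || hop_ > n_fft_) {
            throw std::invalid_argument("require 0 < hop_length <= n_fft");
        }
    }

    // Append a chunk of any length and return every complete frame it yields.
    std::vector<std::vector<float>> push(const float* samples, std::size_t count) {
        pending_.insert(pending_.end(), samples, samples + count);

        std::vector<std::vector<float>> frames;
        std::size_t start = 0;
        while (start + n_fft_ <= pending_.size()) {
            auto first = pending_.begin() + static_cast<std::ptrdiff_t>(start);
            frames.emplace_back(first, first + static_cast<std::ptrdiff_t>(n_fft_));
            start += hop_;
        }
        // Drop consumed samples; the remaining tail is the overlap state
        // that the next chunk will be appended to.
        pending_.erase(pending_.begin(),
                       pending_.begin() + static_cast<std::ptrdiff_t>(start));
        return frames;
    }

    std::size_t pending() const { return pending_.size(); }

private:
    std::size_t n_fft_;
    std::size_t hop_;
    std::vector<float> pending_;
};
```

With this shape, the caller can push chunks of arbitrary size and the frames come out identical to those of a single offline pass over the concatenated audio.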

Progressive Estimation

The streaming pipeline also accumulates chroma and onset data for progressive BPM and key estimation. Estimates are refreshed periodically (by default, BPM every 10 s and key every 5 s), and their confidence increases as more audio accumulates.
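The periodic refresh can be pictured with a small counter that tracks how much audio has arrived since the last estimate. This is only a sketch: `EstimateScheduler` is a hypothetical name, not a libsonare type, and it assumes a positive sample rate and interval.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical scheduler (not part of libsonare): reports when enough new
// audio has been processed to refresh an estimate.
class EstimateScheduler {
public:
    EstimateScheduler(int sample_rate, double interval_sec)
        : interval_samples_(std::max<std::size_t>(
              1, static_cast<std::size_t>(sample_rate * interval_sec))) {}

    // Record `count` newly processed samples.
    // Returns true when at least one full interval has elapsed.
    bool advance(std::size_t count) {
        since_update_ += count;
        if (since_update_ < interval_samples_) {
            return false;
        }
        since_update_ %= interval_samples_;  // carry the remainder forward
        return true;
    }

private:
    std::size_t interval_samples_;
    std::size_t since_update_ = 0;
};
```

Feeding one-second chunks at 22050 Hz into two schedulers configured with the default intervals would trigger the key update on every fifth chunk and the BPM update on every tenth.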

Key Design Decisions

Lazy Initialization

MusicAnalyzer uses lazy initialization for individual analyzers:

```cpp
// Only BPM is computed
float bpm = analyzer.bpm();

// Key detection triggers chroma computation
Key key = analyzer.key();

// Full analysis computes everything
AnalysisResult result = analyzer.analyze();
```
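One common way to get this behavior is to cache each result in a `std::optional` and compute it on first access. The sketch below shows that pattern only; `LazyAnalyzer`, its placeholder return values, and the run counters are assumptions made for illustration and do not reflect how `MusicAnalyzer` is actually implemented.

```cpp
#include <optional>

// Hypothetical stand-in for MusicAnalyzer: each result is computed on first
// request and cached, so analyses that are never requested never run.
class LazyAnalyzer {
public:
    float bpm() {
        if (!bpm_) {
            bpm_ = compute_bpm();  // runs at most once
        }
        return *bpm_;
    }

    int key() {
        if (!key_) {
            key_ = compute_key();  // runs at most once
        }
        return *key_;
    }

    // Counters exist only so the laziness can be observed.
    int bpm_runs() const { return bpm_runs_; }
    int key_runs() const { return key_runs_; }

private:
    // Placeholder computations; a real analyzer would do DSP work here.
    float compute_bpm() { ++bpm_runs_; return 120.0f; }
    int compute_key() { ++key_runs_; return 0; }

    std::optional<float> bpm_;
    std::optional<int> key_;
    int bpm_runs_ = 0;
    int key_runs_ = 0;
};
```

Calling `bpm()` twice runs the BPM computation once and leaves the key computation untouched until `key()` is called.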

Zero-Copy Audio Slicing

Audio holds its samples through a shared_ptr together with an offset and a size, so a slice is a new view onto the same buffer rather than a copy:

```cpp
auto full = Audio::from_file("song.mp3");

// Both share same underlying buffer
auto intro = full.slice(0, 30);     // 0-30 sec
auto chorus = full.slice(60, 90);   // 60-90 sec
```
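The shared-buffer idea can be sketched as follows. `AudioView` is a hypothetical class written for this example; libsonare's `Audio` may differ in names, signatures, and bounds handling. Out-of-range times are clamped here as an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical view type (not libsonare's Audio): samples live in one shared
// buffer, and each view only stores an offset and a size into it.
class AudioView {
public:
    AudioView(std::vector<float> samples, int sample_rate)
        : buffer_(std::make_shared<const std::vector<float>>(std::move(samples))),
          offset_(0),
          size_(buffer_->size()),
          sample_rate_(sample_rate) {}

    // Return a view of [start_sec, end_sec), clamped to this view's range.
    AudioView slice(double start_sec, double end_sec) const {
        const std::size_t begin = std::min(size_, to_samples(start_sec));
        const std::size_t end = std::max(begin, std::min(size_, to_samples(end_sec)));
        AudioView view(*this);  // copies the shared_ptr, not the samples
        view.offset_ = offset_ + begin;
        view.size_ = end - begin;
        return view;
    }

    const float* data() const { return buffer_->data() + offset_; }
    std::size_t size() const { return size_; }
    long use_count() const { return buffer_.use_count(); }

private:
    std::size_t to_samples(double sec) const {
        return sec <= 0.0 ? 0 : static_cast<std::size_t>(sec * sample_rate_);
    }

    std::shared_ptr<const std::vector<float>> buffer_;
    std::size_t offset_;
    std::size_t size_;
    int sample_rate_;
};
```

Because every view holds a reference to the buffer, a slice stays valid even after the original object goes out of scope, and the samples are freed only when the last view is destroyed.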

WASM Compatibility

Core modules avoid:

  • File I/O (handled by Audio I/O layer)
  • Threading (single-threaded execution)
  • Dynamic loading
  • System-specific APIs

librosa Compatibility

Default parameters match librosa:

| Parameter   | Default |
|-------------|---------|
| sample_rate | 22050   |
| n_fft       | 2048    |
| hop_length  | 512     |
| n_mels      | 128     |
| fmin        | 0       |
| fmax        | sr/2    |
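To make the defaults concrete, the sketch below collects them in a plain struct and derives two quantities that follow from them. `SpectrogramParams` and the negative-`fmax` sentinel are hypothetical and are not libsonare's configuration type; only the default values come from the table.

```cpp
// Hypothetical parameter bundle (not a libsonare type) holding the
// librosa-compatible defaults listed above.
struct SpectrogramParams {
    int sample_rate = 22050;
    int n_fft = 2048;
    int hop_length = 512;
    int n_mels = 128;
    float fmin = 0.0f;
    float fmax = -1.0f;  // assumed sentinel meaning "use sample_rate / 2"

    // Upper frequency bound actually used: the Nyquist frequency by default.
    float effective_fmax() const {
        return fmax > 0.0f ? fmax : static_cast<float>(sample_rate) / 2.0f;
    }

    // Number of bins in a one-sided spectrum of length n_fft.
    int n_bins() const { return n_fft / 2 + 1; }
};
```

With the defaults, the effective `fmax` is 11025 Hz and each STFT frame has 1025 frequency bins; one hop of 512 samples covers roughly 23 ms of audio.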

Third-Party Libraries

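| Library | Purpose    | License       |
|---------|------------|---------------|
| KissFFT | FFT        | BSD-3-Clause  |
| Eigen3  | Matrix ops | MPL-2.0       |
| dr_libs | WAV decode | Public Domain |
| minimp3 | MP3 decode | CC0-1.0       |
| r8brain | Resampling | MIT           |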

WASM Compilation

Output: ~228KB WASM + ~34KB JS
Build: Emscripten with Embind
Flags: -sWASM=1 -sMODULARIZE=1 -sEXPORT_ES6=1

Released under the Apache-2.0 License.