Utilities - Python Audio DSP

Audio I/O

Audio Loading & Saving

utils.audio_io (no librosa needed)

Lightweight audio I/O with built-in WAV support. Inputs can be given as file paths or NumPy arrays.

load_audio()

Parameter	Type	Default	Description
audio_input	str | np.array	required	File path or numpy array
sr	int	None	Sample rate (required if input is array)
mono	bool	True	Convert to mono

Returns: (audio_array, sample_rate)

save_audio()

Parameter	Type	Default	Description
file_path	str	required	Output file path
audio	np.array	required	Audio data (normalized to [-1, 1])
sr	int	required	Sample rate
bit_depth	int	16	Bit depth (16 or 24)
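
To make the "normalized to [-1, 1]" and bit-depth columns concrete, here is a minimal sketch of how float samples can be stored as 16-bit PCM using only Python's standard `wave` and `struct` modules. The helper names `write_wav_16bit` and `read_wav_16bit` are hypothetical and are not part of `audio_dsp`; how `save_audio()` does this internally is not documented here.

```python
import os
import struct
import tempfile
import wave

def write_wav_16bit(file_path, samples, sr):
    """Write mono float samples in [-1, 1] as 16-bit PCM (hypothetical helper)."""
    frames = bytearray()
    for x in samples:
        x = max(-1.0, min(1.0, x))                          # clip out-of-range values
        frames += struct.pack("<h", int(round(x * 32767)))  # little-endian int16
    with wave.open(file_path, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 2 bytes per sample = 16 bits
        wf.setframerate(sr)
        wf.writeframes(bytes(frames))

def read_wav_16bit(file_path):
    """Read a mono 16-bit WAV back into floats in [-1, 1] (hypothetical helper)."""
    with wave.open(file_path, "rb") as wf:
        sr = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    ints = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return [v / 32767 for v in ints], sr

# Round trip through a temporary file
path = os.path.join(tempfile.gettempdir(), "sketch_16bit.wav")
original = [0.0, 0.5, -0.5, 1.0, -1.0]
write_wav_16bit(path, original, 8000)
decoded, sr_read = read_wav_16bit(path)
```

The round trip is only accurate to the 16-bit quantization step, which is why values outside [-1, 1] are clipped before scaling.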

normalize_audio()

Parameter	Type	Default	Description
audio	np.array	required	Audio data
peak	float	1.0	Target peak amplitude

Returns: Normalized audio array
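
Peak normalization usually amounts to scaling the signal by one gain factor so that its largest absolute sample equals the target peak. The sketch below shows that idea on a plain Python list. `normalize_peak` is a hypothetical name, and the silent-input handling is an assumption rather than documented behavior of `normalize_audio()`.

```python
def normalize_peak(samples, peak=1.0):
    """Scale samples so the largest absolute value equals `peak` (hypothetical helper)."""
    current = max((abs(x) for x in samples), default=0.0)
    if current == 0.0:
        return list(samples)        # silent input: nothing to scale (assumption)
    gain = peak / current           # one gain for the whole signal
    return [x * gain for x in samples]

quiet = [0.1, -0.25, 0.2]
loud = normalize_peak(quiet, peak=0.9)
```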

resample_audio()

Parameter	Type	Description
audio	np.array	Audio data
orig_sr	int	Original sample rate
target_sr	int	Target sample rate

Returns: Resampled audio array
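
The page does not say which resampling method `resample_audio()` uses. As an illustration of the length change only, the sketch below resamples by linear interpolation. This is a simple stand-in: a production resampler would normally apply a band-limited filter to avoid aliasing. `resample_linear` is a hypothetical name.

```python
def resample_linear(samples, orig_sr, target_sr):
    """Resample by linear interpolation (illustrative stand-in, not band-limited)."""
    if not samples:
        return []
    n_out = int(round(len(samples) * target_sr / orig_sr))
    out = []
    for i in range(n_out):
        pos = i * orig_sr / target_sr            # position in the input, in samples
        lo = min(int(pos), len(samples) - 1)     # sample at or before pos
        hi = min(lo + 1, len(samples) - 1)       # next sample, clamped at the end
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out

ramp = [0.0, 1.0, 2.0, 3.0]
doubled = resample_linear(ramp, 2, 4)            # twice the sample rate
```

The output length scales with `target_sr / orig_sr`, so 441 samples at 44100 Hz become 480 samples at 48000 Hz.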

Example Usage

from audio_dsp.utils import load_audio, save_audio, normalize_audio, resample_audio
from audio_dsp.effects import vocoder

# Load from file (load_audio returns (audio_array, sample_rate))
carrier, sr = load_audio("carrier.wav")
modulator, sr = load_audio("modulator.wav")

# Process with vocoder (accepts files OR arrays)
output, sr = vocoder(carrier, modulator, sr=sr, n_filters=16)

# Normalize and save
output = normalize_audio(output, peak=0.9)
save_audio("output.wav", output, sr)

# Resample to different rate
output_48k = resample_audio(output, sr, 48000)
save_audio("output_48k.wav", output_48k, 48000)

Scales & Intervals

Scale Generation

utils.scales_and_melody

Generate equal-tempered scales and categorize intervals by cents.

generate_scale()

Parameter	Type	Description
root_freq	float	Base frequency in Hz (e.g., 440 for A4)
divisions	int	Number of equal divisions per octave (12 = standard, 19, 31 for microtonal)

Returns: (frequencies, cents): a list of frequencies in Hz and a matching list of cent offsets from the root (0-1200)
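
In an equal-tempered scale with N divisions per octave, step k lies at root_freq * 2^(k/N) and at 1200 * k / N cents above the root. The sketch below applies those two formulas directly. `equal_tempered_scale` is a hypothetical name, and including the octave as the last entry is an assumption based on the 0-1200 cents range given above.

```python
def equal_tempered_scale(root_freq, divisions):
    """Frequencies and cent offsets for an N-division equal temperament (sketch)."""
    steps = range(divisions + 1)   # include the octave itself (assumption)
    freqs = [root_freq * 2 ** (k / divisions) for k in steps]
    cents = [1200 * k / divisions for k in steps]
    return freqs, cents

freqs_12, cents_12 = equal_tempered_scale(440.0, 12)
freqs_19, cents_19 = equal_tempered_scale(440.0, 19)
```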

categorise_interval()

Parameter	Type	Default	Description
cents	float	required	Interval size in cents
threshold	float	20	Max cents error for matching

Returns: (interval_name, error_cents), or ("Novel interval", error) when no recognized interval lies within threshold

Recognized Intervals

Cents	Name
0	Unison
100	Minor 2nd
200	Major 2nd
300	Minor 3rd
400	Major 3rd
500	Perfect 4th
600	Tritone
700	Perfect 5th
800	Minor 6th
900	Major 6th
1000	Minor 7th
1100	Major 7th
1200	Octave
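
One way to match an interval against the table above is a nearest-neighbour lookup with a tolerance. The sketch below assumes the error is signed (positive means sharper than the reference); the library may report it differently. `nearest_interval` and `INTERVAL_NAMES` are hypothetical names.

```python
INTERVAL_NAMES = {
    0: "Unison", 100: "Minor 2nd", 200: "Major 2nd", 300: "Minor 3rd",
    400: "Major 3rd", 500: "Perfect 4th", 600: "Tritone", 700: "Perfect 5th",
    800: "Minor 6th", 900: "Major 6th", 1000: "Minor 7th", 1100: "Major 7th",
    1200: "Octave",
}

def nearest_interval(cents, threshold=20):
    """Name the closest 12-TET interval, or report a novel one (sketch)."""
    ref = min(INTERVAL_NAMES, key=lambda c: abs(cents - c))  # closest table entry
    error = cents - ref            # signed: positive = sharp (assumption)
    if abs(error) <= threshold:
        return INTERVAL_NAMES[ref], error
    return "Novel interval", error

fifth_19tet = 1200 * 11 / 19       # 11 steps of 19-TET, about 694.7 cents
name_19, error_19 = nearest_interval(fifth_19tet)
```

With the default threshold, a neutral third at 350 cents falls outside every entry and is reported as novel.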

Example Usage

from audio_dsp.utils import generate_scale, categorise_interval

# Standard 12-TET scale from A4
freqs, cents = generate_scale(root_freq=440, divisions=12)
print(freqs)  # [440.0, 466.16, 493.88, 523.25, ...]
print(cents)  # [0, 100, 200, 300, 400, ...]

# 19-TET microtonal scale
freqs_19, cents_19 = generate_scale(440, divisions=19)
print(f"Step size: {cents_19[1]:.1f} cents")  # ~63.2 cents

# Analyze intervals
for c in cents_19[:8]:
    name, error = categorise_interval(c)
    print(f"{c:.1f} cents: {name} (error: {error:.1f})")

# Check a just-intonation fifth (~702 cents) against the 700-cent Perfect 5th
name, error = categorise_interval(702)
print(f"{name}: off by {error} cents")

Arabic Maqamat

generate_maqam_frequencies

utils.maqamat

Generate frequencies for Arabic maqam scales with quarter-tone microtonal intervals.

Parameters

Parameter	Type	Default	Description
maqam_name	str	required	Name of the maqam
num_of_notes	int	required	Number of notes to generate
root_freq	float	required	Base frequency in Hz
step_size	int	1	Step through extended maqam

Supported Maqamat

Maqam	Character	Notes
Rast	Bright, fundamental	C D E-half-flat F G A B-half-flat C
Bayati	Emotional, melancholic	D E-half-flat F G A Bb C D
Saba	Sad, yearning	D E-half-flat F Gb A Bb C D
Hijaz	Exotic, dramatic	D Eb F# G A Bb C D
Nahawand	Minor-like, gentle	C D Eb F G Ab B C
Kurd	Phrygian-like	D Eb F G A Bb C D
Nahawand Murassah	Decorated minor	Nahawand with ornaments
Sikah	Mystical, floating	E-half-flat F G A B-half-flat C D
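
Half-flat notes can be approximated by dividing the octave into 24 equal quarter tones, so a whole tone is 4 steps and a three-quarter tone is 3. The sketch below builds Rast from the note names in the table using that approximation. `RAST_STEPS` and `maqam_from_quarter_tones` are hypothetical names, and the library's actual tuning tables may differ, since maqam intonation varies in practice.

```python
import math

# Rast as steps in quarter tones: C D E-half-flat F G A B-half-flat C
RAST_STEPS = [4, 3, 3, 4, 4, 3, 3]   # sums to 24, i.e. one octave

def maqam_from_quarter_tones(root_freq, steps):
    """Frequencies for a scale given as quarter-tone steps (24-TET sketch)."""
    freqs = [root_freq]
    position = 0
    for step in steps:
        position += step                               # quarter tones above the root
        freqs.append(root_freq * 2 ** (position / 24))
    return freqs

rast = maqam_from_quarter_tones(261.63, RAST_STEPS)
third_cents = 1200 * math.log2(rast[2] / rast[0])      # the neutral third
```

The third degree lands at 350 cents, halfway between the minor and major thirds of 12-TET.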

Example Usage

from audio_dsp.utils import generate_maqam_frequencies
from audio_dsp.utils import sine_wave, apply_envelope
import numpy as np
import soundfile as sf

# Generate Maqam Rast from D (293.66 Hz)
rast_freqs = generate_maqam_frequencies(
    maqam_name="Rast",
    num_of_notes=8,
    root_freq=293.66
)
print(rast_freqs)

# Generate Maqam Hijaz from A
hijaz_freqs = generate_maqam_frequencies(
    maqam_name="Hijaz",
    num_of_notes=15,
    root_freq=220,
    step_size=1
)

# Play the scale
audio = []
for freq in rast_freqs:
    note = sine_wave(freq, duration=0.5)
    note = apply_envelope(note, 44100)
    audio.append(note)

sf.write("maqam_rast.wav", np.concatenate(audio), 44100)

# Emotional Bayati scale
bayati_freqs = generate_maqam_frequencies("Bayati", 8, 293.66)

Basic Functions

Audio Helpers

utils

Basic waveform generation and envelope functions.

sine_wave()

Parameter	Type	Default	Description
freq	float	required	Frequency in Hz
duration	float	required	Duration in seconds
sample_rate	int	44100	Sample rate in Hz
amplitude	float	0.5	Peak amplitude (0-1)

Returns: numpy array of audio samples

apply_envelope()

Parameter	Type	Default	Description
signal	np.array	required	Input audio signal
sample_rate	int	required	Sample rate in Hz
fade_duration	float	0.05	Fade in/out time in seconds

Returns: Signal with linear fade-in and fade-out applied (prevents clicks)
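
The two helpers can be pictured as follows: a sine wave is amplitude * sin(2 * pi * freq * t) sampled at the sample rate, and a linear envelope ramps the gain from 0 to 1 over the fade duration at each end. This is a dependency-free sketch with hypothetical names (`make_sine`, `linear_fade`), not the library's implementation.

```python
import math

def make_sine(freq, duration, sample_rate=44100, amplitude=0.5):
    """Sample a sine wave at the given frequency (sketch)."""
    n = int(round(duration * sample_rate))
    return [amplitude * math.sin(2 * math.pi * freq * i / sample_rate)
            for i in range(n)]

def linear_fade(signal, sample_rate, fade_duration=0.05):
    """Apply a linear fade-in and fade-out of fade_duration seconds (sketch)."""
    n_fade = min(int(round(fade_duration * sample_rate)), len(signal) // 2)
    out = list(signal)
    for i in range(n_fade):
        gain = i / n_fade          # ramps from 0 towards 1
        out[i] *= gain             # fade-in
        out[-1 - i] *= gain        # fade-out, mirrored from the end
    return out

tone = make_sine(440, 0.1, sample_rate=8000)
faded = linear_fade(tone, 8000, fade_duration=0.01)
```

Because the first and last samples are scaled to zero, concatenated notes start and end without a discontinuity.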

Example Usage

from audio_dsp.utils import sine_wave, apply_envelope
import numpy as np
import soundfile as sf

# Generate a simple tone
tone = sine_wave(freq=440, duration=1.0, amplitude=0.8)

# Apply envelope to prevent clicks
tone = apply_envelope(tone, sample_rate=44100, fade_duration=0.02)

sf.write("tone.wav", tone, 44100)

# Create a melody
melody_freqs = [262, 294, 330, 349, 392]  # C D E F G
melody = []
for f in melody_freqs:
    note = sine_wave(f, 0.4)
    note = apply_envelope(note, 44100)
    melody.append(note)

sf.write("melody.wav", np.concatenate(melody), 44100)

Noise Algorithms

Noise Generation

utils.noise_algorithms

Perlin noise, simplex noise, and fractal noise generation for modulation and textures.

perlin_noise()

Classic Perlin noise for smooth, natural-sounding modulation.

simplex_noise()

Improved noise algorithm with fewer directional artifacts.

fractal_noise()

Multi-octave noise for complex, evolving textures.
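
"Multi-octave" typically means summing several copies of a base noise, each at double the frequency and a fraction of the amplitude of the previous one. The sketch below does this with a 1D gradient noise in the style of Perlin noise. All names and the parameters `octaves`, `persistence`, and `seed` are hypothetical; only `length` and `scale` echo the usage shown for `perlin_noise()`, and the library's own algorithm may differ.

```python
import math
import random

def perlin_1d(x, gradients):
    """1D gradient noise: blend the ramps of the two neighbouring lattice points."""
    i = math.floor(x)
    t = x - i
    g0 = gradients[i % len(gradients)]
    g1 = gradients[(i + 1) % len(gradients)]
    fade = t * t * t * (t * (t * 6 - 15) + 10)   # smooth interpolation weight
    return (1 - fade) * g0 * t + fade * g1 * (t - 1)

def fractal_1d(length, scale=0.01, octaves=4, persistence=0.5, seed=0):
    """Sum several octaves of 1D gradient noise, normalized by total amplitude."""
    rng = random.Random(seed)
    gradients = [rng.uniform(-1.0, 1.0) for _ in range(256)]
    out = []
    for n in range(length):
        value, total, amplitude, frequency = 0.0, 0.0, 1.0, 1.0
        for _ in range(octaves):
            value += amplitude * perlin_1d(n * scale * frequency, gradients)
            total += amplitude
            amplitude *= persistence   # each octave is quieter
            frequency *= 2.0           # and twice as fast
        out.append(value / total)
    return out

curve = fractal_1d(1000, scale=0.01, octaves=4, seed=1)
```

Smaller `scale` values stretch the curve out and make it smoother. More octaves add finer detail on top of the slow movement.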

Example Usage

from audio_dsp.utils.noise_algorithms import perlin_noise
import numpy as np
import soundfile as sf

# Load the audio to modulate (this example assumes a mono file)
audio, sr = sf.read("input.wav")

# Generate a smooth modulation curve, one value per audio sample
modulation = np.asarray(perlin_noise(
    length=len(audio),
    scale=0.01     # Controls smoothness
))

# Use as a slow, LFO-style amplitude modulation
# (assumes perlin_noise returns values in [-1, 1])
modulated = audio * (0.5 + 0.5 * modulation)

Spectral Analysis

Spectral Analyzer

utils.spectral_analyzer (no librosa needed)

FFT analysis, spectral peak detection, and frequency analysis tools.

analyze_spectrum()

Compute FFT and return frequency/magnitude data.

find_peaks()

Detect spectral peaks for pitch detection or analysis.

spectral_centroid()

Calculate the "center of mass" of the spectrum (brightness measure).
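
The spectral centroid is the magnitude-weighted mean frequency: sum(f_k * |X_k|) / sum(|X_k|). The sketch below computes it from a naive DFT so that it has no dependencies. In practice an FFT routine such as `numpy.fft.rfft` would replace the slow loop. The function names are hypothetical, and the signatures of the library's `analyze_spectrum()` and `spectral_centroid()` are not documented here.

```python
import cmath
import math

def magnitude_spectrum(samples, sample_rate):
    """Naive DFT for bins 0..N/2; returns (frequencies, magnitudes). O(N^2) sketch."""
    n = len(samples)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        freqs.append(k * sample_rate / n)   # bin centre frequency in Hz
        mags.append(abs(acc))
    return freqs, mags

def centroid(freqs, mags):
    """Magnitude-weighted mean frequency (0.0 for silence)."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(f * m for f, m in zip(freqs, mags)) / total

# An 8 Hz sine sampled at 64 Hz for one second sits exactly on bin 8
sr_demo = 64
sine = [math.sin(2 * math.pi * 8 * i / sr_demo) for i in range(sr_demo)]
freqs_demo, mags_demo = magnitude_spectrum(sine, sr_demo)
peak_freq = freqs_demo[mags_demo.index(max(mags_demo))]
centre = centroid(freqs_demo, mags_demo)
```

For a single pure tone the centroid coincides with the tone's frequency. Brighter sounds with more high-frequency energy pull it upward.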

Transient Extraction

Transient Extractor

utils.transient_extractor

Extract attack transients from audio for physical modeling synthesis.
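
A simple way to isolate an attack is to find the first sample whose magnitude crosses a threshold and keep a fixed window after it. The sketch below illustrates that idea. `attack_segment` is a hypothetical name, and the actual detection method used by the library is an assumption; the parameter names `threshold` and `attack_time` are borrowed from the library's usage example.

```python
def attack_segment(samples, sample_rate, threshold=0.1, attack_time=0.05):
    """Return the window of attack_time seconds after the first threshold crossing."""
    onset = next((i for i, x in enumerate(samples) if abs(x) >= threshold), None)
    if onset is None:
        return []                                  # nothing crossed the threshold
    n_attack = int(round(attack_time * sample_rate))
    return samples[onset:onset + n_attack]

# 100 samples of silence followed by a decaying burst, at 1 kHz
hit = [0.0] * 100 + [0.9 * 0.99 ** i for i in range(900)]
segment = attack_segment(hit, 1000, threshold=0.1, attack_time=0.05)
```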

Example Usage

from audio_dsp.utils.transient_extractor import extract_transient

# Extract the attack portion of a drum hit
transient = extract_transient(
    input_file="snare.wav",
    output_file="snare_transient.json",
    threshold=0.1,
    attack_time=0.05  # 50ms
)

# Use with PhysicalModelingSynth
from audio_dsp.synth.physical_modeling_synth import PhysicalModelingSynth

synth = PhysicalModelingSynth()
synth.synthesize(
    frequency=440,
    length=2.0,
    transient_file="snare_transient.json",
    spectral_file="guitar_body.spectral"
)

Additional Utilities

Other Tools

utils

utils.arpeggio

Arpeggiation pattern generators (up, down, up-down, random).

utils.blend_modes

Digital blend modes for audio mixing (multiply, screen, overlay, etc.).

utils.image_to_audio

Convert images to audio using spectral representation. Requires PIL/cv2.