{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n# Additive Synthesis\n\n**Author**: [Moto Hira](moto@meta.com)\n\nThis tutorial is a continuation of\n[Oscillator and ADSR Envelope](./oscillator_tutorial.html).\n\nThis tutorial shows how to perform additive synthesis and subtractive\nsynthesis using TorchAudio's DSP functions.\n\nAdditive synthesis creates timbre by combining multiple waveforms.\nSubtractive synthesis creates timbre by applying filters.\n\n
**Warning:** This tutorial requires prototype DSP features, which are\navailable in nightly builds.\n\nPlease refer to https://pytorch.org/get-started/locally\nfor instructions on installing a nightly build.
\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import torch\nimport torchaudio\n\nprint(torch.__version__)\nprint(torchaudio.__version__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n\n\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "try:\n    from torchaudio.prototype.functional import adsr_envelope, extend_pitch, oscillator_bank\nexcept ModuleNotFoundError:\n    print(\n        \"Failed to import prototype DSP features. \"\n        \"Please install torchaudio nightly builds. \"\n        \"Please refer to https://pytorch.org/get-started/locally \"\n        \"for instructions to install a nightly build.\"\n    )\n    raise\n\nimport matplotlib.pyplot as plt\nfrom IPython.display import Audio" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating multiple frequency pitches\n\nThe core of additive synthesis is the oscillator. We create a timbre by\nsumming the multiple waveforms generated by oscillators.\n\nIn [the oscillator tutorial](./oscillator_tutorial.html), we used\n`torchaudio.prototype.functional.oscillator_bank` and\n`torchaudio.prototype.functional.adsr_envelope` to generate\nvarious waveforms.\n\nIn this tutorial, we use\n`torchaudio.prototype.functional.extend_pitch` to create\na timbre from a base frequency.\n\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we define some constants and a helper function that we use\nthroughout the tutorial.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "PI = torch.pi\nPI2 = 2 * torch.pi\n\nF0 = 344.0  # fundamental frequency [Hz]\nDURATION = 1.1  # [seconds]\nSAMPLE_RATE = 16_000  # [Hz]\n\nNUM_FRAMES = int(DURATION * SAMPLE_RATE)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def plot(freq, amp, waveform, sample_rate, zoom=None, vol=0.1):\n    # Time axis in seconds, one entry per sample\n    t = (torch.arange(waveform.size(0)) / sample_rate).numpy()\n\n    # Four stacked panels: frequency, amplitude, waveform, spectrogram\n    fig, axes = plt.subplots(4, 1, sharex=True)\n    axes[0].plot(t, freq.numpy())\n    axes[0].set(title=f\"Oscillator bank (bank size: {amp.size(-1)})\", ylabel=\"Frequency [Hz]\", ylim=[-0.03, None])\n    axes[1].plot(t, amp.numpy())\n    axes[1].set(ylabel=\"Amplitude\", ylim=[-0.03 if torch.all(amp >= 0.0) else None, None])\n    axes[2].plot(t, waveform)\n    axes[2].set(ylabel=\"Waveform\")\n    axes[3].specgram(waveform, Fs=sample_rate)\n    axes[3].set(ylabel=\"Spectrogram\", xlabel=\"Time [s]\", xlim=[-0.01, t[-1] + 0.01])\n\n    for i in range(4):\n        axes[i].grid(True)\n    pos = axes[2].get_position()\n    fig.tight_layout()\n\n    # Optional inset that zooms into the waveform over the `zoom` time range\n    if zoom is not None:\n        ax = fig.add_axes([pos.x0 + 0.02, pos.y0 + 0.03, pos.width / 2.5, pos.height / 2.0])\n        ax.plot(t, waveform)\n        ax.set(xlim=zoom, xticks=[], yticks=[])\n\n    # Normalize to unit peak (in place), then scale by `vol` for playback\n    waveform /= waveform.abs().max()\n    return Audio(vol * waveform, rate=sample_rate, normalize=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Harmonic Overtones\n\nHarmonic overtones are frequency components that are integer\nmultiples of the fundamental frequency.\n\nWe look at how to generate the common waveforms that are used in\nsynthesizers, namely:\n\n - Sawtooth wave\n - Square wave\n - Triangle wave\n\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sawtooth wave\n\nA [sawtooth wave](https://en.wikipedia.org/wiki/Sawtooth_wave) can be\nexpressed as follows. 
It contains all the integer harmonics, so\nit is commonly used in subtractive synthesis as well.\n\n\\begin{align*}\n    y_t &= \\sum_{k=1}^{K} A_k \\sin ( 2 \\pi f_k t ) \\\\\n    \\text{where} \\\\\n    f_k &= k f_0 \\\\\n    A_k &= -\\frac{ (-1) ^k }{k \\pi}\n    \\end{align*}\n\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following function takes fundamental frequencies and amplitudes,\nand adds the extended pitches (harmonics) in accordance with the formula above.\n\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def sawtooth_wave(freq0, amp0, num_pitches, sample_rate):\n    # Frequencies: integer multiples 1..num_pitches of the fundamental\n    freq = extend_pitch(freq0, num_pitches)\n\n    # Amplitudes: A_k = -(-1)^k / (k * pi)\n    mults = [-((-1) ** i) / (PI * i) for i in range(1, 1 + num_pitches)]\n    amp = extend_pitch(amp0, mults)\n    waveform = oscillator_bank(freq, amp, sample_rate=sample_rate)\n    return freq, amp, waveform" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we synthesize a waveform.\n\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "freq0 = torch.full((NUM_FRAMES, 1), F0)\namp0 = torch.ones((NUM_FRAMES, 1))\nfreq, amp, waveform = sawtooth_wave(freq0, amp0, int(SAMPLE_RATE / F0), SAMPLE_RATE)\nplot(freq, amp, waveform, SAMPLE_RATE, zoom=(1 / F0, 3 / F0))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is possible to oscillate the base frequency to create a\ntime-varying tone based on the sawtooth wave.\n\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fm = 10  # rate at which the frequency oscillates [Hz]\nf_dev = 0.1 * F0  # the degree of frequency oscillation [Hz]\n\nphase = torch.linspace(0, fm * PI2 * DURATION, NUM_FRAMES)\nfreq0 = F0 + f_dev * torch.sin(phase).unsqueeze(-1)\n\nfreq, amp, waveform = sawtooth_wave(freq0, amp0, int(SAMPLE_RATE / F0), SAMPLE_RATE)\nplot(freq, amp, waveform, SAMPLE_RATE, 
zoom=(1 / F0, 3 / F0))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Square wave\n\nA [square wave](https://en.wikipedia.org/wiki/Square_wave) contains\nonly odd-integer harmonics.\n\n\\begin{align*}\n    y_t &= \\sum_{k=0}^{K-1} A_k \\sin ( 2 \\pi f_k t ) \\\\\n    \\text{where} \\\\\n    f_k &= n f_0 \\\\\n    A_k &= \\frac{ 4 }{n \\pi} \\\\\n    n &= 2k + 1\n    \\end{align*}\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def square_wave(freq0, amp0, num_pitches, sample_rate):\n    # Frequencies: odd multiples n = 2k + 1 of the fundamental\n    mults = [2.0 * i + 1.0 for i in range(num_pitches)]\n    freq = extend_pitch(freq0, mults)\n\n    # Amplitudes: A_k = 4 / (n * pi)\n    mults = [4 / (PI * (2.0 * i + 1.0)) for i in range(num_pitches)]\n    amp = extend_pitch(amp0, mults)\n\n    waveform = oscillator_bank(freq, amp, sample_rate=sample_rate)\n    return freq, amp, waveform" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "freq0 = torch.full((NUM_FRAMES, 1), F0)\namp0 = torch.ones((NUM_FRAMES, 1))\nfreq, amp, waveform = square_wave(freq0, amp0, int(SAMPLE_RATE / F0 / 2), SAMPLE_RATE)\nplot(freq, amp, waveform, SAMPLE_RATE, zoom=(1 / F0, 3 / F0))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Triangle wave\n\nA [triangle wave](https://en.wikipedia.org/wiki/Triangle_wave)\nalso contains only odd-integer harmonics.\n\n\\begin{align*}\n    y_t &= \\sum_{k=0}^{K-1} A_k \\sin ( 2 \\pi f_k t ) \\\\\n    \\text{where} \\\\\n    f_k &= n f_0 \\\\\n    A_k &= (-1) ^ k \\frac{8}{(n\\pi) ^ 2} \\\\\n    n &= 2k + 1\n    \\end{align*}\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def triangle_wave(freq0, amp0, num_pitches, sample_rate):\n    # Frequencies: odd multiples n = 2k + 1 of the fundamental\n    mults = [2.0 * i + 1.0 for i in range(num_pitches)]\n    freq = extend_pitch(freq0, mults)\n\n    # Amplitudes: A_k = (-1)^k * 8 / (n * pi)^2\n    c = 8 / (PI**2)\n    mults = [c * ((-1) ** i) / ((2.0 * i + 1.0) ** 2) for i in range(num_pitches)]\n    amp = extend_pitch(amp0, mults)\n\n    waveform = oscillator_bank(freq, amp, sample_rate=sample_rate)\n    return freq, amp, waveform" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "freq, amp, waveform = triangle_wave(freq0, amp0, int(SAMPLE_RATE / F0 / 2), SAMPLE_RATE)\nplot(freq, amp, waveform, SAMPLE_RATE, zoom=(1 / F0, 3 / F0))" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Inharmonic Partials\n\nInharmonic partials are frequencies that are not integer multiples\nof the fundamental frequency.\n\nThey are essential in re-creating realistic sounds and in\nmaking the result of synthesis more interesting.\n\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bell sound\n\nThe following example is based on\nhttps://computermusicresource.com/Simple.bell.tutorial.html\n\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "num_tones = 9\nduration = 2.0\nnum_frames = int(SAMPLE_RATE * duration)\n\n# Partials at non-integer multiples of the base frequency\nfreq0 = torch.full((num_frames, 1), F0)\nmults = [0.56, 0.92, 1.19, 1.71, 2, 2.74, 3.0, 3.76, 4.07]\nfreq = extend_pitch(freq0, mults)\n\n# Short attack followed by a long decay to silence\namp = adsr_envelope(\n    num_frames=num_frames,\n    attack=0.002,\n    decay=0.998,\n    sustain=0.0,\n    release=0.0,\n    n_decay=2,\n)\n# Each successive partial has half the amplitude of the previous one\namp = torch.stack([amp * (0.5**i) for i in range(num_tones)], dim=-1)\n\nwaveform = oscillator_bank(freq, amp, sample_rate=SAMPLE_RATE)\n\nplot(freq, amp, waveform, SAMPLE_RATE, vol=0.4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For comparison, the following is the harmonic version of the above.\nOnly the frequency values differ.\nThe number of overtones and their amplitudes are the same.\n\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "freq = extend_pitch(freq0, num_tones)\nwaveform = oscillator_bank(freq, amp, sample_rate=SAMPLE_RATE)\n\nplot(freq, amp, waveform, SAMPLE_RATE)" ] 
}, { "cell_type": "markdown", "metadata": {}, "source": [ "## References\n\n- https://en.wikipedia.org/wiki/Additive_synthesis\n- https://computermusicresource.com/Simple.bell.tutorial.html\n- https://computermusicresource.com/Definitions/additive.synthesis.html\n\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.14" } }, "nbformat": 4, "nbformat_minor": 0 }