{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# Additive Synthesis\n\n**Author**: [Moto Hira](moto@meta.com)_\n\nThis tutorial is the continuation of\n[Oscillator and ADSR Envelope](./oscillator_tutorial.html)_.\n\nThis tutorial shows how to perform additive synthesis and subtractive\nsynthesis using TorchAudio's DSP functions.\n\nAdditive synthesis creates timbre by combining multiple waveform.\nSubtractive synthesis creates timbre by applying filters.\n\n<div class=\"alert alert-danger\"><h4>Warning</h4><p>This tutorial requires prototype DSP features, which are\n   available in nightly builds.\n\n   Please refer to https://pytorch.org/get-started/locally\n   for instructions for installing a nightly build.</p></div>\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import torch\nimport torchaudio\n\nprint(torch.__version__)\nprint(torchaudio.__version__)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Overview\n\n\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "try:\n    from torchaudio.prototype.functional import adsr_envelope, extend_pitch, oscillator_bank\nexcept ModuleNotFoundError:\n    print(\n        \"Failed to import prototype DSP features. \"\n        \"Please install torchaudio nightly builds. \"\n        \"Please refer to https://pytorch.org/get-started/locally \"\n        \"for instructions to install a nightly build.\"\n    )\n    raise\n\nimport matplotlib.pyplot as plt\nfrom IPython.display import Audio"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Creating multiple frequency pitches\n\nThe core of additive synthesis is oscillator. We create a timbre by\nsumming up the multiple waveforms generated by oscillator.\n\nIn [the oscillator tutorial](./oscillator_tutorial.html)_, we used\n:py:func:`~torchaudio.prototype.functional.oscillator_bank` and\n:py:func:`~torchaudio.prototype.functional.adsr_envelope` to generate\nvarious waveforms.\n\nIn this tutorial, we use\n:py:func:`~torchaudio.prototype.functional.extend_pitch` to create\na timbre from base frequency.\n\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "First, we define some constants and helper function that we use\nthroughout the tutorial.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "PI = torch.pi\nPI2 = 2 * torch.pi\n\nF0 = 344.0  # fundamental frequency\nDURATION = 1.1  # [seconds]\nSAMPLE_RATE = 16_000  # [Hz]\n\nNUM_FRAMES = int(DURATION * SAMPLE_RATE)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "def plot(freq, amp, waveform, sample_rate, zoom=None, vol=0.1):\n    t = (torch.arange(waveform.size(0)) / sample_rate).numpy()\n\n    fig, axes = plt.subplots(4, 1, sharex=True)\n    axes[0].plot(t, freq.numpy())\n    axes[0].set(title=f\"Oscillator bank (bank size: {amp.size(-1)})\", ylabel=\"Frequency [Hz]\", ylim=[-0.03, None])\n    axes[1].plot(t, amp.numpy())\n    axes[1].set(ylabel=\"Amplitude\", ylim=[-0.03 if torch.all(amp >= 0.0) else None, None])\n    axes[2].plot(t, waveform)\n    axes[2].set(ylabel=\"Waveform\")\n    axes[3].specgram(waveform, Fs=sample_rate)\n    axes[3].set(ylabel=\"Spectrogram\", xlabel=\"Time [s]\", xlim=[-0.01, t[-1] + 0.01])\n\n    for i in range(4):\n        axes[i].grid(True)\n    pos = axes[2].get_position()\n    fig.tight_layout()\n\n    if zoom is not None:\n        ax = fig.add_axes([pos.x0 + 0.02, pos.y0 + 0.03, pos.width / 2.5, pos.height / 2.0])\n        ax.plot(t, waveform)\n        ax.set(xlim=zoom, xticks=[], yticks=[])\n\n    waveform /= waveform.abs().max()\n    return Audio(vol * waveform, rate=sample_rate, normalize=False)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Harmonic Overtones\n\nHarmonic overtones are frequency components that are an integer\nmultiple of the fundamental frequency.\n\nWe look at how to generate the common waveforms that are used in\nsynthesizers. That is,\n\n - Sawtooth wave\n - Square wave\n - Triangle wave\n\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Sawtooth wave\n\n[Sawtooth wave](https://en.wikipedia.org/wiki/Sawtooth_wave) can be\nexpressed as the following. It contains all the integer harmonics, so\nit is commonly used in subtractive synthesis as well.\n\n\\begin{align}\\begin{align*}\n   y_t &= \\sum_{k=1}^{K} A_k \\sin ( 2 \\pi f_k t ) \\\\\n   \\text{where} \\\\\n   f_k &= k f_0 \\\\\n   A_k &= -\\frac{ (-1) ^k }{k \\pi}\n   \\end{align*}\\end{align}\n\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The following function takes fundamental frequencies and amplitudes,\nand adds extend pitch in accordance with the formula above.\n\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "def sawtooth_wave(freq0, amp0, num_pitches, sample_rate):\n    freq = extend_pitch(freq0, num_pitches)\n\n    mults = [-((-1) ** i) / (PI * i) for i in range(1, 1 + num_pitches)]\n    amp = extend_pitch(amp0, mults)\n    waveform = oscillator_bank(freq, amp, sample_rate=sample_rate)\n    return freq, amp, waveform"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Now synthesize a waveform\n\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "freq0 = torch.full((NUM_FRAMES, 1), F0)\namp0 = torch.ones((NUM_FRAMES, 1))\nfreq, amp, waveform = sawtooth_wave(freq0, amp0, int(SAMPLE_RATE / F0), SAMPLE_RATE)\nplot(freq, amp, waveform, SAMPLE_RATE, zoom=(1 / F0, 3 / F0))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "It is possible to oscillate the base frequency to create a\ntime-varying tone based on sawtooth wave.\n\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "fm = 10  # rate at which the frequency oscillates [Hz]\nf_dev = 0.1 * F0  # the degree of frequency oscillation [Hz]\n\nphase = torch.linspace(0, fm * PI2 * DURATION, NUM_FRAMES)\nfreq0 = F0 + f_dev * torch.sin(phase).unsqueeze(-1)\n\nfreq, amp, waveform = sawtooth_wave(freq0, amp0, int(SAMPLE_RATE / F0), SAMPLE_RATE)\nplot(freq, amp, waveform, SAMPLE_RATE, zoom=(1 / F0, 3 / F0))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Square wave\n\n[Square wave](https://en.wikipedia.org/wiki/Square_wave) contains\nonly odd-integer harmonics.\n\n\\begin{align}\\begin{align*}\n   y_t &= \\sum_{k=0}^{K-1} A_k \\sin ( 2 \\pi f_k t ) \\\\\n   \\text{where} \\\\\n   f_k &= n f_0 \\\\\n   A_k &= \\frac{ 4 }{n \\pi} \\\\\n   n   &= 2k + 1\n   \\end{align*}\\end{align}\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "def square_wave(freq0, amp0, num_pitches, sample_rate):\n    mults = [2.0 * i + 1.0 for i in range(num_pitches)]\n    freq = extend_pitch(freq0, mults)\n\n    mults = [4 / (PI * (2.0 * i + 1.0)) for i in range(num_pitches)]\n    amp = extend_pitch(amp0, mults)\n\n    waveform = oscillator_bank(freq, amp, sample_rate=sample_rate)\n    return freq, amp, waveform"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "freq0 = torch.full((NUM_FRAMES, 1), F0)\namp0 = torch.ones((NUM_FRAMES, 1))\nfreq, amp, waveform = square_wave(freq0, amp0, int(SAMPLE_RATE / F0 / 2), SAMPLE_RATE)\nplot(freq, amp, waveform, SAMPLE_RATE, zoom=(1 / F0, 3 / F0))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Triangle wave\n\n[Triangle wave](https://en.wikipedia.org/wiki/Triangle_wave)\nalso only contains odd-integer harmonics.\n\n\\begin{align}\\begin{align*}\n   y_t &= \\sum_{k=0}^{K-1} A_k \\sin ( 2 \\pi f_k t ) \\\\\n   \\text{where} \\\\\n   f_k &= n f_0 \\\\\n   A_k &= (-1) ^ k \\frac{8}{(n\\pi) ^ 2} \\\\\n   n   &= 2k + 1\n   \\end{align*}\\end{align}\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "def triangle_wave(freq0, amp0, num_pitches, sample_rate):\n    mults = [2.0 * i + 1.0 for i in range(num_pitches)]\n    freq = extend_pitch(freq0, mults)\n\n    c = 8 / (PI**2)\n    mults = [c * ((-1) ** i) / ((2.0 * i + 1.0) ** 2) for i in range(num_pitches)]\n    amp = extend_pitch(amp0, mults)\n\n    waveform = oscillator_bank(freq, amp, sample_rate=sample_rate)\n    return freq, amp, waveform"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "freq, amp, waveform = triangle_wave(freq0, amp0, int(SAMPLE_RATE / F0 / 2), SAMPLE_RATE)\nplot(freq, amp, waveform, SAMPLE_RATE, zoom=(1 / F0, 3 / F0))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Inharmonic Paritials\n\nInharmonic partials refer to freqencies that are not integer multiple\nof fundamental frequency.\n\nThey are essential in re-creating realistic sound or\nmaking the result of synthesis more interesting.\n\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Bell sound\n\nhttps://computermusicresource.com/Simple.bell.tutorial.html\n\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "num_tones = 9\nduration = 2.0\nnum_frames = int(SAMPLE_RATE * duration)\n\nfreq0 = torch.full((num_frames, 1), F0)\nmults = [0.56, 0.92, 1.19, 1.71, 2, 2.74, 3.0, 3.76, 4.07]\nfreq = extend_pitch(freq0, mults)\n\namp = adsr_envelope(\n    num_frames=num_frames,\n    attack=0.002,\n    decay=0.998,\n    sustain=0.0,\n    release=0.0,\n    n_decay=2,\n)\namp = torch.stack([amp * (0.5**i) for i in range(num_tones)], dim=-1)\n\nwaveform = oscillator_bank(freq, amp, sample_rate=SAMPLE_RATE)\n\nplot(freq, amp, waveform, SAMPLE_RATE, vol=0.4)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a comparison, the following is the harmonic version of the above.\nOnly frequency values are different.\nThe number of overtones and its amplitudes are same.\n\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "freq = extend_pitch(freq0, num_tones)\nwaveform = oscillator_bank(freq, amp, sample_rate=SAMPLE_RATE)\n\nplot(freq, amp, waveform, SAMPLE_RATE)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## References\n\n- https://en.wikipedia.org/wiki/Additive_synthesis\n- https://computermusicresource.com/Simple.bell.tutorial.html\n- https://computermusicresource.com/Definitions/additive.synthesis.html\n\n"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.10.14"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}