{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n# StreamWriter Advanced Usage\n\n**Author**: [Moto Hira](mailto:moto@meta.com)\n\nThis tutorial shows how to use :py:class:`torchaudio.io.StreamWriter` to\nplay audio and video.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
<div class=\"alert alert-info\"><h4>Note</h4><p>This tutorial uses hardware devices, thus it is not portable across\n   different operating systems.\n\n   The tutorial was written and tested on MacBook Pro (M1, 2020).</p></div>
\n\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
<div class=\"alert alert-info\"><h4>Note</h4><p>This tutorial requires FFmpeg libraries.\n   Please refer to `FFmpeg dependency` for\n   the details.</p></div>
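\n\nAs a quick sanity check (a sketch assuming your torchaudio build exposes the ``torchaudio.utils.ffmpeg_utils`` helper module), you can query the FFmpeg libraries that TorchAudio actually loaded:\n\n```\nfrom torchaudio.utils import ffmpeg_utils\n\n# Print the versions of the dynamically loaded FFmpeg libraries\nprint(ffmpeg_utils.get_versions())\n```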
\n\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
<div class=\"alert alert-danger\"><h4>Warning</h4><p>TorchAudio dynamically loads compatible FFmpeg libraries\n   installed on the system.\n   The types of supported formats (media format, encoder, encoder\n   options etc) depend on the libraries.\n\n   To check the available devices, muxers and encoders, you can use the\n   following commands</p></div>\n\n```console\nffmpeg -muxers\nffmpeg -encoders\nffmpeg -devices\nffmpeg -protocols
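\n\n# For example (assuming a Unix-like shell with grep), to check for the\n# specific devices and encoders used later in this tutorial:\nffmpeg -devices | grep -E \"audiotoolbox|sdl\"\nffmpeg -encoders | grep aac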
\n```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Preparation\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import torch\nimport torchaudio\n\nprint(torch.__version__)\nprint(torchaudio.__version__)\n\nfrom torchaudio.io import StreamWriter" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from torchaudio.utils import download_asset\n\nAUDIO_PATH = download_asset(\"tutorial-assets/Lab41-SRI-VOiCES-src-sp0307-ch127535-sg0042.wav\")\nVIDEO_PATH = download_asset(\n    \"tutorial-assets/stream-api/NASAs_Most_Scientifically_Complex_Space_Observatory_Requires_Precision-MP4_small.mp4\"\n)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Device Availability\n\n``StreamWriter`` takes advantage of FFmpeg's IO abstraction and\nwrites the data to media devices such as speakers and GUI windows.\n\nTo write to devices, provide the ``format`` option to the constructor\nof ``StreamWriter``.\n\nDifferent OSes have different device options, and their availability\ndepends on the actual installation of FFmpeg.\n\nTo check which devices are available, you can use the `ffmpeg -devices`\ncommand.\n\nIn the environment used for this tutorial (macOS), \"audiotoolbox\" (speaker) and \"sdl\" (video GUI)\nare available.\n\n```console\n$ ffmpeg -devices\n...\nDevices:\n D. = Demuxing supported\n .E = Muxing supported\n --\n  E audiotoolbox    AudioToolbox output device\n D  avfoundation    AVFoundation input device\n D  lavfi           Libavfilter virtual input device\n  E opengl          OpenGL output\n  E sdl,sdl2        SDL2 output device\n```\nFor details about which devices are available on which OS, please check\nthe official FFmpeg documentation: https://ffmpeg.org/ffmpeg-devices.html\n\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Playing audio\n\nBy providing the ``format=\"audiotoolbox\"`` option, the StreamWriter writes\ndata to the speaker device.\n\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Prepare sample audio\nwaveform, sample_rate = torchaudio.load(AUDIO_PATH, channels_first=False, normalize=False)\nnum_frames, num_channels = waveform.shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Configure StreamWriter to write to speaker device\ns = StreamWriter(dst=\"-\", format=\"audiotoolbox\")\ns.add_audio_stream(sample_rate, num_channels, format=\"s16\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Write audio to the device\nwith s.open():\n    for i in range(0, num_frames, 256):\n        s.write_audio_chunk(0, waveform[i : i + 256])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
<div class=\"alert alert-info\"><h4>Note</h4><p>Writing to \"audiotoolbox\" is a blocking operation, but it will not\n   wait for the audio playback. The device must be kept open while\n   audio is being played.\n\n   The following code will close the device as soon as the audio is\n   written and before the playback is completed.\n   Adding :py:func:`time.sleep` will help keep the device open until\n   the playback is completed.</p></div>\n\n```\nwith s.open():\n    s.write_audio_chunk(0, waveform)
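\n\n# A possible remedy (sketch, not from the original tutorial): keep the\n# device open for the duration of the clip so playback can finish.\n# Assumes num_frames and sample_rate from the cells above.\nimport time\n\nwith s.open():\n    s.write_audio_chunk(0, waveform)\n    time.sleep(num_frames / sample_rate)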
\n```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Playing Video\n\nTo play video, you can use ``format=\"sdl\"`` or ``format=\"opengl\"``.\nAgain, you need a version of FFmpeg with the corresponding integration\nenabled. The available devices can be checked with ``ffmpeg -devices``.\n\nHere, we use the SDL device (https://ffmpeg.org/ffmpeg-devices.html#sdl).\n\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# note:\n# The SDL device does not support specifying the frame rate; it has to\n# match the refresh rate of the display.\nframe_rate = 120\nwidth, height = 640, 360" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we define a helper function that delegates the video loading to\na background thread and yields chunks.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "running = True\n\n\ndef video_streamer(path, frames_per_chunk):\n    import queue\n    import threading\n\n    from torchaudio.io import StreamReader\n\n    q = queue.Queue()\n\n    # Streaming process that runs in a background thread\n    def _streamer():\n        streamer = StreamReader(path)\n        streamer.add_basic_video_stream(\n            frames_per_chunk, format=\"rgb24\", frame_rate=frame_rate, width=width, height=height\n        )\n        for (chunk_,) in streamer.stream():\n            q.put(chunk_)\n            if not running:\n                break\n\n    # Start the background thread and fetch chunks\n    t = threading.Thread(target=_streamer)\n    t.start()\n    while running:\n        try:\n            # Use a timeout so the loop can exit once the stream has ended\n            yield q.get(timeout=3)\n        except queue.Empty:\n            break\n    t.join()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we start streaming. Pressing \"Q\" will stop the video.\n\n
<div class=\"alert alert-info\"><h4>Note</h4><p>The `write_video_chunk` call against the SDL device blocks until SDL finishes\n   playing the video.</p></div>
\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Set output device to SDL\ns = StreamWriter(\"-\", format=\"sdl\")\n\n# Configure video stream (RGB24)\ns.add_video_stream(frame_rate, width, height, format=\"rgb24\", encoder_format=\"rgb24\")\n\n# Play the video\nwith s.open():\n    for chunk in video_streamer(VIDEO_PATH, frames_per_chunk=256):\n        try:\n            s.write_video_chunk(0, chunk)\n        except RuntimeError:\n            running = False\n            break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[code](https://download.pytorch.org/torchaudio/tutorial-assets/sdl.py)]\n\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Streaming Video\n\nSo far, we have looked at how to write to hardware devices. There are some\nalternative methods for video streaming.\n\n\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## RTMP (Real-Time Messaging Protocol)\n\nUsing RTMP, you can stream media (video and/or audio) to a single client.\nThis does not require a hardware device, but it requires a separate player\n(for example, ``ffplay``).\n\nTo use RTMP, specify the protocol and route in the ``dst`` argument of the\nStreamWriter constructor, then pass the ``{\"listen\": \"1\"}`` option when opening\nthe destination.\n\nStreamWriter will listen to the port and wait for a client to request the video.\nThe call to ``open`` blocks until a request is received.\n\n```\ns = StreamWriter(dst=\"rtmp://localhost:1935/live/app\", format=\"flv\")\ns.add_audio_stream(sample_rate=sample_rate, num_channels=num_channels, encoder=\"aac\")\ns.add_video_stream(frame_rate=frame_rate, width=width, height=height)\n\nwith s.open(option={\"listen\": \"1\"}):\n    for video_chunk, audio_chunk in generator():\n        s.write_audio_chunk(0, audio_chunk)\n        s.write_video_chunk(1, video_chunk)\n```\n[[code](https://download.pytorch.org/torchaudio/tutorial-assets/rtmp.py)]\n\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## UDP (User Datagram Protocol)\n\nUsing UDP, you can stream media (video and/or audio) to a socket.\nThis does not require a hardware device, but it requires a separate player\n(for example, ``ffplay``).\n\nUnlike RTMP, the streaming and client processes are disconnected.\nThe streaming process is not aware of the client process.\n\n```\ns = StreamWriter(dst=\"udp://localhost:48550\", format=\"mpegts\")\ns.add_audio_stream(sample_rate=sample_rate, num_channels=num_channels, encoder=\"aac\")\ns.add_video_stream(frame_rate=frame_rate, width=width, height=height)\n\nwith s.open():\n    for video_chunk, audio_chunk in generator():\n        s.write_audio_chunk(0, audio_chunk)\n        s.write_video_chunk(1, video_chunk)\n```\n[[code](https://download.pytorch.org/torchaudio/tutorial-assets/udp.py)]\n\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Tag: :obj:`torchaudio.io`\n\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.14" } }, "nbformat": 4, "nbformat_minor": 0 }