[av v10.0] The output container does not put audio stream under audio but under other instead
Created by: YosuaMichael
Overview
In av==10.0, when we add an audio stream to an output container with container.add_stream('aac', rate=44100), container.streams.audio remains an empty tuple and the audio stream ends up in container.streams.other instead (see the reproduction code below).
This bug does not occur in av==9.2.
Expected behavior
When we call container.add_stream('aac', rate=44100), we expect container.streams.audio to be non-empty.
Actual behavior
When we call container.add_stream('aac', rate=44100), container.streams.audio is empty and container.streams.other contains the audio stream instead.
Reproduction
# Simplified from torchvision code: https://github.com/pytorch/vision/blob/main/torchvision/io/video.py#L99
import av
import torch
import numpy as np
audio_fps = 44100
audio_codec = 'aac'
audio_layout = "stereo"
audio_array = torch.rand((2, 44100))
container = av.open("test_write.mp4", mode="w")
a_stream = container.add_stream(audio_codec, rate=audio_fps)
audio_format_dtypes = {
    "dbl": "<f8",
    "dblp": "<f8",
    "flt": "<f4",
    "fltp": "<f4",
    "s16": "<i2",
    "s16p": "<i2",
    "s32": "<i4",
    "s32p": "<i4",
    "u8": "u1",
    "u8p": "u1",
}
audio_sample_fmt = container.streams[0].format.name
format_dtype = np.dtype(audio_format_dtypes[audio_sample_fmt])
audio_array = torch.as_tensor(audio_array).numpy().astype(format_dtype)
frame = av.AudioFrame.from_ndarray(audio_array, format=audio_sample_fmt, layout=audio_layout)
frame.sample_rate = audio_fps
# Encode the frame and write the resulting packets
for packet in a_stream.encode(frame):
    container.mux(packet)
# Flush the encoder
for packet in a_stream.encode():
    container.mux(packet)
container.close()
print(f"container.streams.audio: {container.streams.audio}")
print(f"container.streams.other: {container.streams.other}")
# Using av==10.0
# container.streams.audio: ()
# container.streams.other: (<av.Stream #0 audio/aac at 0x12551fe20>,)
# Using av==9.2
# container.streams.audio: (<av.AudioStream #0 aac at 44100Hz, stereo, fltp at 0x1275920e0>,)
# container.streams.other: ()
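As a possible stopgap while the typed buckets are unreliable, streams can be selected by their type attribute rather than through container.streams.audio. The sketch below is an assumption on my part, not a confirmed fix: streams_of_type is a hypothetical helper, and it relies on the stream still reporting type == "audio" under av==10.0, which the "audio/aac" in the output above suggests. Stand-in objects are used so it runs without PyAV installed.

```python
from types import SimpleNamespace


def streams_of_type(streams, kind):
    """Return the streams whose .type equals kind, ignoring the typed buckets.

    Hypothetical helper: it only assumes each stream exposes a .type string.
    """
    return [s for s in streams if getattr(s, "type", None) == kind]


# Stand-in objects (not real PyAV streams) so the sketch is self-contained.
fake_streams = [
    SimpleNamespace(index=0, type="audio"),
    SimpleNamespace(index=1, type="video"),
]

audio = streams_of_type(fake_streams, "audio")
print([s.index for s in audio])
```

With a real container, the call would presumably be streams_of_type(container.streams, "audio"), which iterates over all streams regardless of whether they were filed under audio or other.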
Versions
- OS: macOS 12.6
- PyAV runtime:
PyAV v10.0.0
library configuration: --disable-static --enable-shared --libdir=/tmp/vendor/lib --prefix=/tmp/vendor --arch=arm64 --enable-cross-compile --disable-alsa --disable-doc --disable-mediafoundation --enable-fontconfig --enable-gmp --disable-gnutls --enable-gpl --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --disable-libxcb --enable-libxml2 --enable-libxvid --enable-lzma --enable-version3 --enable-zlib
library license: GPL version 3 or later
libavcodec 59. 37.100
libavdevice 59. 7.100
libavfilter 8. 44.100
libavformat 59. 27.100
libavutil 57. 28.100
libswresample 4. 7.100
libswscale 6. 7.100
- PyAV build:
Installed with `pip install av==10.0`
- FFmpeg:
ffmpeg version 4.2.2 Copyright (c) 2000-2019 the FFmpeg developers
built with clang version 12.0.0
configuration: --prefix=/Users/yosuamichael/opt/miniconda3/envs/tv --cc=arm64-apple-darwin20.0.0-clang --disable-doc --enable-avresample --enable-gmp --enable-hardcoded-tables --enable-libfreetype --enable-libvpx --enable-pthreads --enable-libopus --enable-postproc --enable-pic --enable-pthreads --enable-shared --enable-static --enable-version3 --enable-zlib --enable-libmp3lame --disable-nonfree --enable-gpl --enable-gnutls --disable-openssl --enable-libopenh264 --enable-libx264
libavutil 56. 31.100 / 56. 31.100
libavcodec 58. 54.100 / 58. 54.100
libavformat 58. 29.100 / 58. 29.100
libavdevice 58. 8.100 / 58. 8.100
libavfilter 7. 57.100 / 7. 57.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 5.100 / 5. 5.100
libswresample 3. 5.100 / 3. 5.100
libpostproc 55. 5.100 / 55. 5.100
Research
I have done the following:
- Checked the PyAV documentation
- Searched on Google
- Searched on Stack Overflow
- Looked through old GitHub issues
- Asked on PyAV Gitter
- ... and waited 72 hours for a response.
Additional context
This behaviour breaks torchvision when it tries to write a video with av==10.0. Related issue: https://github.com/pytorch/vision/issues/6814.