[av v10.0] Output container lists the audio stream under streams.other instead of streams.audio

Created by: YosuaMichael

Overview

In av==10.0, when we add an audio stream to an output container with container.add_stream('aac', rate=44100), container.streams.audio remains an empty tuple, and the audio stream shows up in container.streams.other instead (see the reproduction code below).

Note that this bug does not occur in av==9.2.

Expected behavior

After calling container.add_stream('aac', rate=44100), we expect container.streams.audio to contain the new audio stream.

Actual behavior

After calling container.add_stream('aac', rate=44100), container.streams.audio is empty and the stream appears in container.streams.other instead.

Reproduction

# Simplified from torchvision code: https://github.com/pytorch/vision/blob/main/torchvision/io/video.py#L99
import av
import torch
import numpy as np

audio_fps = 44100
audio_codec = 'aac'
audio_layout = "stereo"
audio_array = torch.rand((2, 44100))

container = av.open("test_write.mp4", mode="w")
a_stream = container.add_stream(audio_codec, rate=audio_fps)
audio_format_dtypes = {
    "dbl": "<f8",
    "dblp": "<f8",
    "flt": "<f4",
    "fltp": "<f4",
    "s16": "<i2",
    "s16p": "<i2",
    "s32": "<i4",
    "s32p": "<i4",
    "u8": "u1",
    "u8p": "u1",
}
audio_sample_fmt = container.streams[0].format.name
format_dtype = np.dtype(audio_format_dtypes[audio_sample_fmt])
audio_array = torch.as_tensor(audio_array).numpy().astype(format_dtype)
frame = av.AudioFrame.from_ndarray(audio_array, format=audio_sample_fmt, layout=audio_layout)

frame.sample_rate = audio_fps

for packet in a_stream.encode(frame):
    container.mux(packet)

for packet in a_stream.encode():
    container.mux(packet)
    
container.close()

print(f"container.streams.audio: {container.streams.audio}")
print(f"container.streams.other: {container.streams.other}")

# Using av==10.0
# container.streams.audio: ()
# container.streams.other: (<av.Stream #0 audio/aac at 0x12551fe20>,)


# Using av==9.2
# container.streams.audio: (<av.AudioStream #0 aac at 44100Hz, stereo, fltp at 0x1275920e0>,)
# container.streams.other: ()

Versions

OS: macOS 12.6
PyAV runtime:

PyAV v10.0.0
library configuration: --disable-static --enable-shared --libdir=/tmp/vendor/lib --prefix=/tmp/vendor --arch=arm64 --enable-cross-compile --disable-alsa --disable-doc --disable-mediafoundation --enable-fontconfig --enable-gmp --disable-gnutls --enable-gpl --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --disable-libxcb --enable-libxml2 --enable-libxvid --enable-lzma --enable-version3 --enable-zlib
library license: GPL version 3 or later
libavcodec     59. 37.100
libavdevice    59.  7.100
libavfilter     8. 44.100
libavformat    59. 27.100
libavutil      57. 28.100
libswresample   4.  7.100
libswscale      6.  7.100

PyAV build:

Installed with `pip install av==10.0`

FFmpeg:

ffmpeg version 4.2.2 Copyright (c) 2000-2019 the FFmpeg developers
built with clang version 12.0.0
configuration: --prefix=/Users/yosuamichael/opt/miniconda3/envs/tv --cc=arm64-apple-darwin20.0.0-clang --disable-doc --enable-avresample --enable-gmp --enable-hardcoded-tables --enable-libfreetype --enable-libvpx --enable-pthreads --enable-libopus --enable-postproc --enable-pic --enable-pthreads --enable-shared --enable-static --enable-version3 --enable-zlib --enable-libmp3lame --disable-nonfree --enable-gpl --enable-gnutls --disable-openssl --enable-libopenh264 --enable-libx264
libavutil      56. 31.100 / 56. 31.100
libavcodec     58. 54.100 / 58. 54.100
libavformat    58. 29.100 / 58. 29.100
libavdevice    58.  8.100 / 58.  8.100
libavfilter     7. 57.100 /  7. 57.100
libavresample   4.  0.  0 /  4.  0.  0
libswscale      5.  5.100 /  5.  5.100
libswresample   3.  5.100 /  3.  5.100
libpostproc    55.  5.100 / 55.  5.100

Research

I have done the following:

Checked the PyAV documentation
Searched on Google
Searched on Stack Overflow
Looked through old GitHub issues
Asked on PyAV Gitter
... and waited 72 hours for a response.

Additional context

This behaviour breaks torchvision when it tries to write a video with PyAV 10.0. Here is the related issue: https://github.com/pytorch/vision/issues/6814.
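
Until this is fixed, a possible workaround for downstream code is to avoid container.streams.audio and filter all streams by their reported type. The av==10.0 output above shows the stream as `audio/aac`, so its `type` appears to still be "audio" even though it lands in `other`. The following is only a minimal sketch under that assumption: `streams_of_type` is a hypothetical helper, not part of PyAV, and the stand-in objects only mimic the two attributes it needs so the snippet runs without writing a file.

```python
from types import SimpleNamespace


def streams_of_type(container, kind):
    """Return a tuple of the container's streams whose `type` equals `kind`.

    Hypothetical helper: it iterates over every stream instead of relying on
    container.streams.audio, which is empty for output containers on av==10.0.
    """
    return tuple(s for s in container.streams if getattr(s, "type", None) == kind)


# Stand-ins that mimic an av==10.0 output container holding one AAC stream.
# With a real container, pass the object returned by av.open(..., mode="w").
fake_stream = SimpleNamespace(type="audio", name="aac")
fake_container = SimpleNamespace(streams=[fake_stream])

audio_streams = streams_of_type(fake_container, "audio")
print(f"audio streams found: {len(audio_streams)}")
```

This only helps code that needs to locate the stream. It does not restore AudioStream-specific behaviour on the returned object, so it is not a substitute for a fix in PyAV itself.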