Essentially, I’m trying to achieve this:
The only problem is that I'm driving ffmpeg programmatically, so there are no predefined sounds at predefined times. Instead, my code generates them and executes ffmpeg with the following arguments:
-f concat -r 60 -safe 0 -i {VIDEO_OUTPUT_PATH}/{fileName}.txt
{audioArgs}
-c:v libx264 -preset fast -crf 23 -vf "scale=1080:1920,format=yuv420p"
-c:a aac -b:a 192k
-af "afade=t=out:st={totalSeconds - 3}:d=3"
{mapArgs}
-async 1 -movflags +faststart -t {totalSeconds} {outputPath}
Presumably, if there are 3 sounds (as in the image), {mapArgs} will be replaced with:
-map 0 -map 1 -map 2 -map 3
and {audioArgs} will be replaced with:
-itsoffset 0 -i bg.mp3 -itsoffset 13 -i sound1.mp3 -itsoffset 37 -i sound2.mp3
However, the output plays only the video and the background music; none of the other sounds are audible.