The idea is to create a new Media Foundation Transform (MFT) for audio compression/decompression with the Media Foundation MP4 sink writer/source reader. I plan to wrap Facebook’s Encodec in an MFT.
My problem is the sample descriptor.
CComPtr<IMFMediaType> ina = ...,myformat = ...;
ina->SetGUID(MF_MT_SUBTYPE, MFAudioFormat_PCM);
ina->SetUINT32(MF_MT_AUDIO_NUM_CHANNELS, nch);
ina->SetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, sr);
int BA = (int)((br / 8) * nch);
ina->SetUINT32(MF_MT_AUDIO_AVG_BYTES_PER_SECOND, (UINT32)(sr * BA));
ina->SetUINT32(MF_MT_AUDIO_BLOCK_ALIGNMENT, BA);
ina->SetUINT32(MF_MT_AUDIO_BITS_PER_SAMPLE, br);
ina->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Audio);
myformat->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Audio);
myformat->SetGUID(MF_MT_SUBTYPE, MFAudioFormat_myformat);
myformat->SetUINT32(MF_MT_AUDIO_NUM_CHANNELS, nch);
myformat->SetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, sr);
hr = myformat->SetBlob(MF_MT_MPEG4_SAMPLE_DESCRIPTION, (UINT8*)cz, sizeof(cz));
hr = myformat->SetUINT32(MF_MT_MPEG4_CURRENT_SAMPLE_ENTRY,0);
hr = wr->AddStream(myformat, &sidx);
LogMediaType(myformat);
hr = wr->SetInputMediaType(sidx, ina, 0);
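For reference, the PCM attributes set above follow from three numbers: channel count, sample rate, and bits per sample. A minimal sketch of that arithmetic, assuming the bit depth is a whole number of bytes; PcmLayout and ComputePcmLayout are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration type: the two derived PCM attributes.
struct PcmLayout {
    uint32_t blockAlign;      // bytes per frame (one sample for every channel)
    uint32_t avgBytesPerSec;  // bytes per second of audio
};

// Hypothetical helper mirroring the BA / (sr * BA) computation in the question.
inline PcmLayout ComputePcmLayout(uint32_t channels, uint32_t sampleRate,
                                  uint32_t bitsPerSample) {
    // Assumption: bit depth is a multiple of 8, otherwise br / 8 truncates.
    assert(bitsPerSample % 8 == 0);
    const uint32_t blockAlign = (bitsPerSample / 8) * channels;
    return PcmLayout{blockAlign, sampleRate * blockAlign};
}
```

With 2 channels at 48000 Hz and 16 bits, this gives a block alignment of 4 bytes and 192000 bytes per second.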
These calls succeed in the sink writer. After calling BeginWriting, I reset the sample description as I described in this related question:
CComPtr<IMFMediaSink> pSink;
CComPtr<IMFStreamSink> s2;
CComPtr<IMFMediaTypeHandler> mth;
CComPtr<IMFMediaType> mt2;
wr->GetServiceForStream(MF_SINK_WRITER_MEDIASINK, GUID_NULL, IID_PPV_ARGS(&pSink));
pSink->GetStreamSinkByIndex(0, &s2);
s2->GetMediaTypeHandler(&mth);
mth->GetCurrentMediaType(&mt2);
LogMediaType(mt2);
mt2->SetBlob(MF_MT_MPEG4_SAMPLE_DESCRIPTION, (UINT8*)cz, sizeof(cz));
mt2->SetUINT32(MF_MT_MPEG4_CURRENT_SAMPLE_ENTRY, 0);
mth->SetCurrentMediaType(mt2);
mt2 = 0;
mth->GetCurrentMediaType(&mt2);
LogMediaType(mt2);
Finalize() succeeds and I have an MP4. However, this MP4 cannot be read back with a source reader (not supported), and ffmpeg -i shows:
Stream #0:0(und): Audio: none (ecdc / 0x63646365), 48000 Hz, 0 channels, 288 kb/s (default)
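The hexadecimal value in the ffmpeg line is the four-character code packed as a little-endian 32-bit integer, so the ecdc tag itself is being read correctly. A minimal sketch of that packing; PackFourCC is a hypothetical name for illustration, not an ffmpeg or Media Foundation function:

```cpp
#include <cstdint>

// Pack four characters into a 32-bit value with the first character in the
// lowest byte (little-endian tag order).
constexpr uint32_t PackFourCC(char a, char b, char c, char d) {
    return  static_cast<uint32_t>(static_cast<uint8_t>(a))
         | (static_cast<uint32_t>(static_cast<uint8_t>(b)) << 8)
         | (static_cast<uint32_t>(static_cast<uint8_t>(c)) << 16)
         | (static_cast<uint32_t>(static_cast<uint8_t>(d)) << 24);
}

// 'e' = 0x65, 'c' = 0x63, 'd' = 0x64, 'c' = 0x63
static_assert(PackFourCC('e', 'c', 'd', 'c') == 0x63646365u,
              "matches the value printed next to ecdc");
```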
The ffmpeg output suggests that my problem is the sample descriptor, which currently is this:
UINT8 cz[100] = {
0x0,0x0,0x0,0x64, // len
0x73,0x74,0x73,0x64, // stsd
0x0,0x0,0x0,0x0, // verflag
0x0,0x0,0x0,0x1, // num
0x0,0x0,0x0,0x54, // len
'e','c','d','c', // code
// the remaining 76 bytes are zero-initialized
};
So obviously I must put something after the ‘ecdc’ string. What would that be?
Is this a form of, say, WAVEFORMATEX?
When I copy the data of an existing descriptor such as the one for AAC, the MP4 can be read and mediainfo/ffmpeg show AAC-related info. The file cannot be played back, of course, because the stream is not AAC, but it is parsed.
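For comparison, ISO/IEC 14496-12 defines a generic AudioSampleEntry whose fields follow the four-character code: 6 reserved bytes, a 16-bit data_reference_index, 8 reserved bytes, a 16-bit channelcount, a 16-bit samplesize, 4 more reserved bytes, and the sample rate as a 16.16 fixed-point value. All multi-byte fields are big-endian. The sketch below builds an stsd box with one such entry. It assumes ecdc needs no codec-specific child box and that Media Foundation accepts a generic entry for a custom subtype; neither assumption is confirmed here. BuildStsd, PutBE, and PutTag are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append `bytes` bytes of `value` in big-endian order.
static void PutBE(std::vector<uint8_t>& out, uint32_t value, int bytes) {
    for (int shift = (bytes - 1) * 8; shift >= 0; shift -= 8)
        out.push_back(static_cast<uint8_t>(value >> shift));
}

// Append a four-character code as-is.
static void PutTag(std::vector<uint8_t>& out, const char* tag) {
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<uint8_t>(tag[i]));
}

// Hypothetical helper: an 'stsd' box holding one generic AudioSampleEntry.
std::vector<uint8_t> BuildStsd(const char* fourcc, uint16_t channels,
                               uint16_t bitsPerSample, uint32_t sampleRate) {
    assert(sampleRate <= 0xFFFF);      // must fit the 16.16 integer part
    const uint32_t entrySize = 36;     // 8 header + 8 SampleEntry + 20 audio fields
    const uint32_t boxSize = 16 + entrySize;

    std::vector<uint8_t> b;
    PutBE(b, boxSize, 4);              // stsd box size
    PutTag(b, "stsd");
    PutBE(b, 0, 4);                    // version and flags
    PutBE(b, 1, 4);                    // entry_count
    PutBE(b, entrySize, 4);            // sample entry size
    PutTag(b, fourcc);                 // e.g. "ecdc"
    PutBE(b, 0, 4); PutBE(b, 0, 2);    // reserved[6]
    PutBE(b, 1, 2);                    // data_reference_index
    PutBE(b, 0, 4); PutBE(b, 0, 4);    // reserved[2] (32-bit each)
    PutBE(b, channels, 2);             // channelcount
    PutBE(b, bitsPerSample, 2);        // samplesize
    PutBE(b, 0, 2);                    // pre_defined
    PutBE(b, 0, 2);                    // reserved
    PutBE(b, sampleRate << 16, 4);     // samplerate, 16.16 fixed point
    return b;
}
```

A call such as BuildStsd("ecdc", 2, 16, 48000) yields a 52-byte blob that could be passed to SetBlob in place of cz. A zero channelcount field in the current descriptor would be consistent with ffmpeg reporting 0 channels.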