I have a remote camera that captures H.264-encoded video and AAC-encoded audio, writes the data into a custom ring buffer, and sends it to a Node.js socket server, where each packet is identified as audio or video and handled accordingly. That data should become a live stream; the protocol doesn’t matter, but the delay has to be around ~4 seconds, and it must be playable on iOS and Android devices.
After reading hundreds of pages of documentation, questions, and answers on the internet, I can’t seem to find anything about muxing two separate elementary streams of AAC and H.264 data into a live stream.
Despite attempting this many different ways, and even having a working HLS implementation, I want to revisit ALL options for live streaming, and I’m hoping someone can give me advice or point me to documentation on how to achieve this goal.
To be specific, this is our goal:
- Stream AAC and H264 data from remote cellular camera to a server which will do some work on that data to live stream to one user (possibly more users in the future) on a mobile iOS or Android device
- Delay of the live stream should be a maximum of ~4 seconds; if the user has a bad signal, a longer delay is acceptable, since we obviously cannot do anything about that.
- We should not have to re-encode our data. We’ve explored WebRTC, but it mandates Opus (or G.711) audio rather than AAC, so we would have to re-encode the audio, which would be expensive for our server to run.
Any and all help, ranging from re-visiting an old approach we took to exploring new ones, is appreciated.
I can provide code snippets for our current LL-HLS implementation if it helps, but I figured this post is already long enough.
I’ve tried FFmpeg with named pipes. I expected it to just work, but FFmpeg kept blocking on the first named-pipe input. I thought of writing the data out to two files and then running FFmpeg on those, but it’s continuous data, and I don’t know FFmpeg well enough to turn that kind of setup into a single live stream.
I’ve tried implementing our own RTSP server on the camera using GStreamer (the camera’s RTSP server had been stripped out; not my call), but the camera’s flash storage cannot fit GStreamer, so that wasn’t an option.
My latest attempt was using a derivation of hls-parser to create an HLS manifest and mux.js to build the MP4 containers for the .m4s fragmented-MP4 segments of an HLS live stream. This was my most successful attempt: we had a working live stream, but the delay was up to 16 seconds, which is roughly what you’d expect from standard HLS, since players typically start about three target durations behind the live edge. We could drop the target duration to 2 seconds and get a delay of about 6–8 seconds, but this could be unreliable: these cameras can have poor signal, and sending IDR frames that frequently is relatively expensive at such low bandwidth.
With delay being the only remaining problem, I attempted to upgrade the implementation to support Apple’s Low-Latency HLS. It seems to work, in that the right partial segments are being requested and everything that makes it LL-HLS behaves as intended, but the delay doesn’t go down when played in iOS’s native AVPlayer; if anything, it got worse.
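For what it’s worth, AVPlayer only drops into low-latency behavior when the playlist advertises the full set of LL-HLS server-control tags and the server supports blocking playlist reload (the `_HLS_msn`/`_HLS_part` query parameters) over HTTP/2. A sketch of the tags it looks for (durations and URIs are illustrative; per the spec, PART-HOLD-BACK must be at least twice PART-TARGET, with three times recommended):

```
#EXTM3U
#EXT-X-VERSION:9
#EXT-X-TARGETDURATION:2
#EXT-X-SERVER-CONTROL:CAN-BLOCK-RELOAD=YES,PART-HOLD-BACK=1.0
#EXT-X-PART-INF:PART-TARGET=0.333
#EXT-X-MEDIA-SEQUENCE:100
#EXT-X-MAP:URI="init.mp4"
#EXTINF:2.000,
segment100.m4s
#EXT-X-PART:DURATION=0.333,URI="segment101.part0.m4s",INDEPENDENT=YES
#EXT-X-PART:DURATION=0.333,URI="segment101.part1.m4s"
#EXT-X-PRELOAD-HINT:TYPE=PART,URI="segment101.part2.m4s"
```

If any of these are missing (CAN-BLOCK-RELOAD in particular), AVPlayer silently falls back to ordinary HLS timing, which would match the symptom of partial segments being served yet latency not improving.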