Discussion:
RTP DTS/PTS result in varying mp4 frame durations
Jim Morris
2017-03-16 06:10:20 UTC
When rtspsrc streams are piped into mp4mux, the resulting mp4 files have
widely varying sample durations and many negative composition time
offsets. Stepping frame-by-frame through these mp4s is inconsistent in
video players like QuickTime -- a step often fails to advance to the next
frame, or skips frames. Also, in some cases the PTS values in the mp4 go
backwards (momentarily decreasing instead of monotonically increasing).

The root of the issue appears to be that DTS is set to the local clock time
at which rtspsrc receives the segment. mp4mux uses the intervals between
successive DTS values to determine sample duration. Since the time between
received segments can vary considerably, the mp4 sample durations vary in
parallel. (This
variation is exaggerated by mp4mux because it pulls up the DTS of the last
received segment for each frame).
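
To make the effect concrete, here is a minimal sketch of how arrival-time jitter in DTS turns into uneven sample durations when duration is taken as the difference between successive DTS values, as described above. The timestamps are invented for illustration (a hypothetical 25 fps stream), and the helper name sample_durations is not part of GStreamer.

```python
def sample_durations(dts_ms):
    """Duration of sample n taken as DTS[n+1] - DTS[n] (last sample omitted)."""
    return [b - a for a, b in zip(dts_ms, dts_ms[1:])]


# Hypothetical 25 fps stream: camera-side timestamps are 40 ms apart.
pts_ms = [0, 40, 80, 120, 160, 200]
# Hypothetical local arrival times used as DTS: same average rate, bursty delivery.
dts_ms = [5, 62, 81, 139, 163, 204]

print(sample_durations(pts_ms))  # [40, 40, 40, 40, 40]
print(sample_durations(dts_ms))  # [57, 19, 58, 24, 41]
```

The average duration is nearly the same in both cases, but the per-sample values derived from arrival times swing between 19 ms and 58 ms, which matches the kind of variation seen in the muxed files.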

The PTS is generated from the RTP timestamp, which tracks the remote
camera's clock. In our pipeline, DTS < PTS regularly. Consequently, the
mp4 sample composition time offsets are regularly negative and vary widely
to counteract the variation in sample durations. Also, as the clocks drift,
the sample composition time offsets grow in magnitude.

My temporary solution is to set DTS=PTS in mp4mux (minor modifications
to gst_qt_pad_adjust_buffer_dts() in gstqtmux.c to adjust DTS for every
frame). This fix produces a much "cleaner" mp4 that steps frame-by-frame
consistently. This fix does not handle B-frames, which is ok for our
streams.
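
The effect of this workaround, and the reason it breaks with B-frames, can be sketched outside of GStreamer. This is only a simulation under assumed timestamps, not the actual change to gstqtmux.c; the Frame type and the helper names are hypothetical. The composition offset is taken as PTS minus DTS.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    pts_ms: int  # presentation time, derived from the RTP timestamp
    dts_ms: int  # decode time, here the local arrival time


def force_dts_to_pts(frames):
    """Return copies of the frames with DTS overwritten by PTS."""
    return [replace(f, dts_ms=f.pts_ms) for f in frames]


def durations(frames):
    """Sample durations taken as successive DTS differences."""
    return [b.dts_ms - a.dts_ms for a, b in zip(frames, frames[1:])]


def dts_is_monotonic(frames):
    """True when DTS never decreases from one frame to the next."""
    return all(d >= 0 for d in durations(frames))


# Hypothetical 25 fps stream without B-frames, DTS jittered by the network.
no_b = [Frame(0, 5), Frame(40, 62), Frame(80, 81), Frame(120, 139)]
fixed = force_dts_to_pts(no_b)
print(durations(no_b))                       # [57, 19, 58]
print(durations(fixed))                      # [40, 40, 40]
print([f.pts_ms - f.dts_ms for f in fixed])  # [0, 0, 0, 0]

# Hypothetical stream with B-frames: PTS is out of order in decode order.
with_b = [Frame(0, 5), Frame(120, 62), Frame(40, 81), Frame(80, 139)]
print(dts_is_monotonic(force_dts_to_pts(with_b)))  # False
```

Without B-frames the durations become uniform and every composition offset is zero. With reordered frames, copying PTS into DTS makes DTS go backwards, so decode order is no longer expressed by the timestamps.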

First question: has anyone had problems with this DTS/PTS behavior when
piping RTP -> mp4?

Second question: is there a more general solution that can be applied in
rtspsrc so there's more correspondence between DTS and PTS?

Third question: are there ramifications to the temporary solution of
setting DTS=PTS in mp4mux that I've missed, besides ruining B-frame
support? Relatedly, is it possible to extend the solution to support
B-frames by recognizing gaps in PTS progressions and shifting DTS to
preserve decode order?
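
One possible shape for such an extension, offered only as a sketch: take the PTS values in decode order, sort them to get a monotonic sequence, and shift that sequence back by the smallest delay that keeps DTS <= PTS for every frame. This assumes the PTS values are jitter-free, no frames are lost, and the whole sequence is available up front; a streaming version would need a bounded look-ahead window instead. The function derive_dts is hypothetical and not an existing GStreamer API.

```python
def derive_dts(pts_decode_order):
    """Derive monotonic DTS values from PTS values given in decode order.

    DTS[i] is the i-th smallest PTS minus a constant delay, where the delay
    is the smallest shift that guarantees DTS[i] <= PTS[i] for all frames.
    """
    if not pts_decode_order:
        return []
    ordered = sorted(pts_decode_order)
    delay = max(s - p for s, p in zip(ordered, pts_decode_order))
    return [s - delay for s in ordered]


# Hypothetical 25 fps stream in decode order: I P B B P B B
pts = [0, 120, 40, 80, 240, 160, 200]
dts = derive_dts(pts)
print(dts)                                # [-40, 0, 40, 80, 120, 160, 200]
print([p - d for p, d in zip(pts, dts)])  # [40, 120, 0, 0, 120, 0, 0]

# Without reordering the delay is zero and the result is simply DTS = PTS.
print(derive_dts([0, 40, 80, 120]))       # [0, 40, 80, 120]
```

The derived DTS values are evenly spaced and never exceed PTS, so all composition offsets are non-negative. The first DTS is negative in this example, so a real implementation would still have to shift or otherwise account for the initial delay, which is not shown here.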

Simplified pipeline:
rtspsrc location=URL latency=500 do-retransmission=true ! rtph264depay !
h264parse ! mp4mux fragment-duration=2000 ! filesink name=filexxx
location=diryyy

The issue occurs with both regular and fragmented mp4 files.
Nicolas Dufresne
2017-03-16 15:11:36 UTC
Post by Jim Morris
> My temporary solution is to set DTS=PTS in mp4mux (minor
> modifications to gst_qt_pad_adjust_buffer_dts() in gstqtmux.c to
> adjust DTS for every frame). This fix produces a much "cleaner" mp4
> that steps frame-by-frame consistently. This fix does not handle
> B-frames, which is ok for our streams.

This is a known issue. Someone needs to look at it and find a
solution. I think the h264 depayloader could at least detect the case
without B-frames and set the DTS to the "jitter-free" PTS value.

The other plausible solution is to leave DTS as none and let h264parse
fix it for us. That would need work on baseparse, which annoyingly tries
to create a DTS if there is none. I personally have no idea why baseparse
does timestamping; most of the time it gets it wrong.

Nicolas
