NUT

From MultimediaWiki
Jump to navigation Jump to search

NUT is a container format under construction by MPlayer, FFmpeg and Libav developers. Its main goals are

  • Simplicity: the format is simple to parse in code and easy to implement
  • Flexibility: able to encapsulate many types of audio, video, and subtitle codecs
  • Resilience: error-resistant while maintaining the smallest possible overhead

NUT seeking algo in libnut

Step 1 - Binary search and linear interpolation

Startup

Go to beginning of file if first syncpoint has not been found, and then go to EOF and search backward to find last syncpoint. Pick the closest possible syncpoints for requested pts from syncpoint cache, one higher and one lower. If requested pts is lower than first syncpoint or higher than last syncpoint, then seek to either of those syncpoints and end seek.

Binary search is ended when the entire region between the 2 syncpoints picked has been scanned. Which could even be before it has begun thanks to syncpoint cache.

Guess position

Best results have been achieved by combining both linear interpolation and binary search, giving most of the weight to interpolation.

hi, hi_pd
file position and timestamp of the top edge of the binary search.
lo, lo_pd
file position and timestamp of the bottom edge of the binary search.
time_pos
Requested pts.
guess
beginning of linear search for syncpoint.
#define INTERPOLATE_WEIGHT (19./20)
if (hi - lo < nut->max_distance*2) guess = lo + 16;
else { // linear interpolation
    double a = (double)(hi - lo) / (hi_pd - lo_pd);
    guess = lo + a * (time_pos - lo_pd);
    guess = guess * INTERPOLATE_WEIGHT + (lo+hi)/2 * (1 - INTERPOLATE_WEIGHT);
    if (hi - guess < nut->max_distance*2) guess = hi - nut->max_distance*2; //(lo + hi)/2;
}
if (guess < lo + 16) guess = lo + 16;

The first conditional prevents too much recursing when the distance is already low. The last conditional prevents an infinite loop of syncpoint search constantly finding lo and achieving nothing.

Binary search

Now, find a syncpoint, bounded between guess and hi. If it has a timestamp higher than requested pts, then replace hi and hi_pd, otherwise replace lo and lo_pd. If a syncpoint was not found between guess and hi, then, if guess is smaller or equal to lo + 16, then you have scanned the entire area between lo and hi, and your binary search is done. Otherwise, replace hi with guess and repeat the binary search.

After finding the 2 closest syncpoints bounding your requested pts, the bounds of linear are search are:

*start = lo_s.pos - (lo_s.back_ptr * 16 + 15);
*end = hi_s.pos;
*stopper = hi_s;

stopper is used later to end linear search prematurely. Note that it is the top edge of binary search, meaning, it has a timestamp higher than requested pts.

Step 2 - Linear search

Bounds of linear search

Linear search of course ends at end, however, it can end earlier:

  • When you find a frame of any stream, with a dts higher than requested pts.
  • When you find a frame for all active streams with a pts higher than requested pts.
  • When you find a syncpoint immediately following the syncpoint pointed to by stopper's back_ptr, (from here on this syncpoint will be called stopper_syncpoint) and all streams are active.
  • When you reach stopper_syncpoint, and you have not found a keyframe for any non-active between stopper's back_ptr and stopper_syncpoint, with a pts lower than stopper.timestamp. The reason for this is that stopper did not point because to stopper_syncpoint can only be because of active stream keyframes, and not non-active ones.
  • When you find a keyframe for all non-active streams after stopper_syncpoint with a pts lower than stopper.timestamp, which had a keyframe in the region in the previous condition. The logic is:
    • stopper.timestamp is higher than requested pts.
    • stopper_syncpoint is not the syncpoint pointed to by stopper.back_ptr, but the one immediately following.
    • If there was a keyframe for every non active stream after stopper_syncpoint, with a pts lower than stopper.timestamp, then the reason stopper does not point to stopper_syncpoint must be that there are no keyframes for active streams with a pts lower than stopper.timestamp after stopper_syncpoint.
    • There might be more keyframes in the region with a pts higher than stopper.timestamp, but those are irrelevant because they are higher than requested pts.

The last 3 conditions can be phrased differently:

  • When saying keyframes, only keyframes with a pts lower than stopper.timestamp are intended. Any other keyframes are irrelevant.
  • The region stopper.back_ptr to stopper_syncpoint has a set of keyframes between it. These keyframes are the ones responsible for stopper pointing to stopper.back_ptr and not stopper_syncpoint.
  • If all streams are active, then it's impossible for there to be better keyframes for all active streams between stopper_syncpoint to the end of the linear search, because if there was, stopper.back_ptr would point there.
  • If some streams are inactive, but none of them have a keyframe in the region specified above, then the keyframes responsible for stopper pointing to stopper.back_ptr are only from active streams, so again it is impossible there is a better position later in the linear search.
  • If some streams are inactive, and some of them do have a keyframe in the region specified above, then those keyframes are possibly responsible for stopper pointing to stopper.back_ptr. However, if we find more keyframes of those same inactive streams after the region, then they had nothing to do with stopper.back_ptr, and again the condition is met.

Linear search implementation

  1. Start at start.
  2. Search for syncpoint. If it is further than start + 15, then the file is errored, and you can start playing immediately from this syncpoint.
  3. Perform full demuxing, buffering everything read. If you encounter any NUT error during demuxing, seek back to the syncpoint after start and end seek.
  4. On every syncpoint, flush the buffer and cache the syncpoint position.
    • If this syncpoint is stopper_syncpoint, then do not flush the buffer, as it would be possible the linear search would end immediatelty afterwards and a second underlying seek would not be necessary.
  5. When finding keyframe of active stream with pts lower than requested pts, store position of keyframe. Discard previous value.
  6. End linear search as necessary by bounds given above.
  7. Pick the lowest value from positions of keyframes for active streams. This is the final position.
  8. Seek to closest syncpoint before the final position, preform a minimal demuxing until final position (to get proper last_pts context across all streams).
    • With luck this seek will not need an underlying seek, if you have not flushed the buffer since the requested position.
    • In most common case, stopper.back_ptr points to same syncpoint as the start of the linear search, as both syncpoints point to the same rare video keyframe.
    • If only video stream is active and the only non active stream is audio, you will most likely end your linear search immediately after stopper_syncpoint by finding an audio keyframe. Afterwards you will rewind right back to start, and, since buffer hasn't been flushed, this will not need an underlying seek.

Index

With index, binary search step is skipped, and a much more limited linear search used, between two immediate syncpoints. The syncpoint picked is the highest possible one that has a keyframe for each active stream past the syncpoint with a pts lower than requested pts. The end of the linear search is the syncpoint immediately following.

Dynamic Index

Create the index in the same manner as creating the index in the muxer, remembering keyframes and EOR at syncpoint positions. With one big difference - dynamic index can have gaps, thanks to seeking. Every syncpiont entry in the dynamic index has a flag indicating if it has a "unresearched region" gap from last syncpoint.

In seeking, dynamic index is used in the same manner as complete index. However, when deciding a syncpoint, the entire range of the syncpoint picked until the first syncpoint with a pts higher than requested is checked for having been already researched. If a "hole" is found anywhere in this range, dynamic index is disregarded and full binary search is performed.

End of Relevance (EOR)

Behavior with linear search:

  • Treat EOR frames same as keyframes. Seeing an EOR frame for an active stream causes the stream to be ignored until the next keyframe of same stream.
  • If all active streams are EOR by end of linear search, just use start of linear search as final seeking position.

For index:

  • When picking the correct syncpoint to start linear search, ignore streams which have EOR with pts lower than requested pts and no keyframe with pts lower than requested pts.
  • If all active streams are EOR, pick the syncpoint immediately before the syncpoint with the last EOR across all streams with pts lower than requested pts.