Difference between revisions of "NUT"

From MultimediaWiki
Jump to navigation Jump to search
(Binary search: clarify)
 
(29 intermediate revisions by 9 users not shown)
Line 1: Line 1:
* Extensions: nut
+
* Extension: nut
* Website: http://www.nut.hu/
+
* Specification: http://ffmpeg.org/~michael/nut.txt
 +
* git repository: git://git.ffmpeg.org/nut
  
NUT is a container format under construction by [[MPlayer]] and [[FFmpeg]] developers.
+
NUT is a container format under construction by [[MPlayer]], [[FFmpeg]] and [[Libav]] developers. Its main goals are
Its main goals are to be simple, flexible and error-resistant while maintaining the smallest possible overhead.
+
* Simplicity: the format is simple to parse in code and easy to implement
 +
* Flexibility: able to encapsulate many types of audio, video, and subtitle codecs
 +
* Resilience: error-resistant while maintaining the smallest possible overhead
  
 
== NUT seeking algo in libnut ==
 
== NUT seeking algo in libnut ==
Line 11: Line 14:
 
==== Startup ====
 
==== Startup ====
 
Go to beginning of file if first syncpoint has not been found, and then go to EOF and search backward to find last syncpoint.
 
Go to beginning of file if first syncpoint has not been found, and then go to EOF and search backward to find last syncpoint.
Pick the closest possible syncpoints for requested pts, one higher and one lower. If requested pts is lower than first syncpoint or higher than last syncpoint, then seek to either of those syncpoints and end seek.
+
Pick the closest possible syncpoints for requested pts from syncpoint cache, one higher and one lower. If requested pts is lower than first syncpoint or higher than last syncpoint, then seek to either of those syncpoints and end seek.
  
 
Binary search is ended when the entire region between the 2 syncpoints picked has been scanned. Which could even be before it has begun thanks to syncpoint cache.
 
Binary search is ended when the entire region between the 2 syncpoints picked has been scanned. Which could even be before it has begun thanks to syncpoint cache.
Line 21: Line 24:
 
;lo, lo_pd: file position and timestamp of the bottom edge of the binary search.
 
;lo, lo_pd: file position and timestamp of the bottom edge of the binary search.
 
;time_pos: Requested pts.
 
;time_pos: Requested pts.
;guess: begginning of linear search for syncpoint.
+
;guess: beginning of linear search for syncpoint.
  
  #define INTERPOLATE_WEIGHT (7./8)
+
  #define INTERPOLATE_WEIGHT (19./20)
  if (hi - lo < nut->max_distance) guess = lo + 8;
+
  if (hi - lo &lt; nut-&gt;max_distance*2) guess = lo + 16;
 
  else { // linear interpolation
 
  else { // linear interpolation
 
     double a = (double)(hi - lo) / (hi_pd - lo_pd);
 
     double a = (double)(hi - lo) / (hi_pd - lo_pd);
 
     guess = lo + a * (time_pos - lo_pd);
 
     guess = lo + a * (time_pos - lo_pd);
 
     guess = guess * INTERPOLATE_WEIGHT + (lo+hi)/2 * (1 - INTERPOLATE_WEIGHT);
 
     guess = guess * INTERPOLATE_WEIGHT + (lo+hi)/2 * (1 - INTERPOLATE_WEIGHT);
     if (hi - guess < nut->max_distance*2) guess = hi - nut->max_distance*2; //(lo + hi)/2;
+
     if (hi - guess &lt; nut-&gt;max_distance*2) guess = hi - nut-&gt;max_distance*2; //(lo + hi)/2;
 
  }
 
  }
  if (guess < lo + 8) guess = lo + 8;
+
  if (guess &lt; lo + 16) guess = lo + 16;
  
The first conditional prevents too much recursing when the distance is already low. The last conditional prevents infinite loop of syncpoint search constantly finding ''lo'' and achieving nothing.
+
The first conditional prevents too much recursing when the distance is already low. The last conditional prevents an infinite loop of syncpoint search constantly finding ''lo'' and achieving nothing.
  
 
==== Binary search ====
 
==== Binary search ====
Now, find a syncpoint, bounded between ''guess'' and ''hi''. If it has a pts higher than requested pts, then replace ''hi'' and ''hi_pd'', otherwise replace ''lo'' and ''lo_pd''.
+
Now, find a syncpoint, bounded between ''guess'' and ''hi''. If it has a timestamp higher than requested pts, then replace ''hi'' and ''hi_pd'', otherwise replace ''lo'' and ''lo_pd''.
If a syncpoint was not found between ''guess'' and ''hi'', then, if ''guess'' is smaller or equal to ''lo + 11'', then you have scanned the entire area between ''lo'' and ''hi'', and your binary search is done. Otherwise, replace ''hi'' with ''guess'' and repeat the binary search.
+
If a syncpoint was not found between ''guess'' and ''hi'', then, if ''guess'' is smaller or equal to ''lo + 16'', then you have scanned the entire area between ''lo'' and ''hi'', and your binary search is done. Otherwise, replace ''hi'' with ''guess'' and repeat the binary search.
  
 
After finding the 2 closest syncpoints bounding your requested pts, the bounds of linear are search are:
 
After finding the 2 closest syncpoints bounding your requested pts, the bounds of linear are search are:
  
  *start = lo_s.pos - (lo_s.back_ptr * 8 + 7);
+
  *start = lo_s.pos - (lo_s.back_ptr * 16 + 15);
  *end = lo_s.pos;
+
  *end = hi_s.pos;
 
  *stopper = hi_s;
 
  *stopper = hi_s;
  
''stopper'' is used later to end linear search prematurely. Note that it is the '''top''' edge of binary search, meaning, it has a pts higher than requested pts.
+
''stopper'' is used later to end linear search prematurely. Note that it is the '''top''' edge of binary search, meaning, it has a timestamp higher than requested pts.
  
 
=== Step 2 - Linear search ===
 
=== Step 2 - Linear search ===
Line 51: Line 54:
 
==== Bounds of linear search ====
 
==== Bounds of linear search ====
 
Linear search of course ends at ''end'', however, it can end earlier:
 
Linear search of course ends at ''end'', however, it can end earlier:
* When you find any frame of an active stream, with a pts higher than requested pts.
+
* When you find a frame of any stream, with a dts higher than requested pts.
 +
* When you find a frame for '''all''' active streams with a pts higher than requested pts.
 
* When you find a syncpoint immediately following the syncpoint pointed to by ''stopper'''s back_ptr, (from here on this syncpoint will be called ''stopper_syncpoint'') '''and''' all streams are active.
 
* When you find a syncpoint immediately following the syncpoint pointed to by ''stopper'''s back_ptr, (from here on this syncpoint will be called ''stopper_syncpoint'') '''and''' all streams are active.
* When you find a keyframe for all '''non'''-active streams after after ''stopper_syncpoint'' with a pts lower than ''stopper.pts''. The logic is:
+
* When you reach ''stopper_syncpoint'', and you have not found a keyframe for any '''non'''-active between ''stopper''<nowiki>'</nowiki>s back_ptr and ''stopper_syncpoint'', with a pts lower than ''stopper.timestamp''. The reason for this is that ''stopper'' did not point because to ''stopper_syncpoint'' can only be because of active stream keyframes, and not non-active ones.
** ''stopper.pts'' is higher than requested pts.
+
* When you find a keyframe for all '''non'''-active streams after ''stopper_syncpoint'' with a pts lower than ''stopper.timestamp'', which had a keyframe in the region in the previous condition. The logic is:
 +
** ''stopper.timestamp'' is higher than requested pts.
 
** ''stopper_syncpoint'' is '''not''' the syncpoint pointed to by ''stopper.back_ptr'', but the one immediately following.
 
** ''stopper_syncpoint'' is '''not''' the syncpoint pointed to by ''stopper.back_ptr'', but the one immediately following.
** If there was a keyframe for every non active stream after ''stopper_syncpoint'', with a pts lower than ''stopper.pts'', then the reason ''stopper'' does not point to ''stopper_syncpoint'' must be that there are no keyframes for '''active''' streams with a pts lower than ''stopper.pts'' after ''stopper_syncpoint''.
+
** If there was a keyframe for every non active stream after ''stopper_syncpoint'', with a pts lower than ''stopper.timestamp'', then the reason ''stopper'' does not point to ''stopper_syncpoint'' must be that there are no keyframes for '''active''' streams with a pts lower than ''stopper.timestamp'' after ''stopper_syncpoint''.
** There might be more keyframes in the region with a pts higher than ''stopper.pts'', but those are irrelevant because they are higher than requested pts.
+
** There might be more keyframes in the region with a pts higher than ''stopper.timestamp'', but those are irrelevant because they are higher than requested pts.
 +
 
 +
The last 3 conditions can be phrased differently:
 +
* When saying keyframes, only keyframes with a pts lower than ''stopper.timestamp'' are intended. Any other keyframes are irrelevant.
 +
* The region ''stopper.back_ptr'' to ''stopper_syncpoint'' has a set of keyframes between it. These keyframes are the ones responsible for ''stopper'' pointing to ''stopper.back_ptr'' and not ''stopper_syncpoint''.
 +
* If all streams are active, then it's impossible for there to be better keyframes for '''all''' active streams between ''stopper_syncpoint'' to the end of the linear search, because if there was, ''stopper.back_ptr'' would point there.
 +
* If some streams are inactive, but none of them have a keyframe in the region specified above, then the keyframes responsible for ''stopper'' pointing to ''stopper.back_ptr'' are only from active streams, so again it is impossible there is a better position later in the linear search.
 +
* If some streams are inactive, and some of them '''do''' have a keyframe in the region specified above, then those keyframes are ''possibly'' responsible for ''stopper'' pointing to ''stopper.back_ptr''. However, if we find more keyframes of those same inactive streams '''after''' the region, then they had nothing to do with ''stopper.back_ptr'', and again the condition is met.
  
 
==== Linear search implementation ====
 
==== Linear search implementation ====
  
 
# Start at ''start''.
 
# Start at ''start''.
# Search for syncpoint. If it is further than ''start + 7'', then the file is errored, and you can start playing immediately from this syncpoint.
+
# Search for syncpoint. If it is further than ''start + 15'', then the file is errored, and you can start playing immediately from this syncpoint.
# Perform full demuxing, buffering everything read.
+
# Perform full demuxing, buffering everything read. If you encounter any NUT error during demuxing, seek back to the syncpoint after ''start'' and end seek.
 
# On every syncpoint, flush the buffer and cache the syncpoint position.
 
# On every syncpoint, flush the buffer and cache the syncpoint position.
 +
#* If this syncpoint is ''stopper_syncpoint'', then do not flush the buffer, as it would be possible the linear search would end immediatelty afterwards and a second underlying seek would not be necessary.
 
# When finding keyframe of active stream with pts lower than requested pts, store position of keyframe. Discard previous value.
 
# When finding keyframe of active stream with pts lower than requested pts, store position of keyframe. Discard previous value.
 
# End linear search as necessary by bounds given above.
 
# End linear search as necessary by bounds given above.
 
# Pick the lowest value from positions of keyframes for active streams. This is the final position.
 
# Pick the lowest value from positions of keyframes for active streams. This is the final position.
 
# Seek to closest syncpoint before the final position, preform a minimal demuxing until final position (to get proper last_pts context across all streams).
 
# Seek to closest syncpoint before the final position, preform a minimal demuxing until final position (to get proper last_pts context across all streams).
 +
#* With luck this seek will not need an underlying seek, if you have not flushed the buffer since the requested position.
 +
#* In most common case, ''stopper.back_ptr'' points to same syncpoint as the start of the linear search, as both syncpoints point to the same rare video keyframe.
 +
#* If only video stream is active and the only non active stream is audio, you will most likely end your linear search immediately after ''stopper_syncpoint'' by finding an audio keyframe. Afterwards you will rewind right back to ''start'', and, since buffer hasn't been flushed, this will not need an underlying seek.
  
 
=== Index ===
 
=== Index ===
 
With index, binary search step is skipped, and a much more limited linear search used, between two immediate syncpoints. The syncpoint picked is the highest possible one that has a keyframe for each active stream past the syncpoint with a pts lower than requested pts. The end of the linear search is the syncpoint immediately following.
 
With index, binary search step is skipped, and a much more limited linear search used, between two immediate syncpoints. The syncpoint picked is the highest possible one that has a keyframe for each active stream past the syncpoint with a pts lower than requested pts. The end of the linear search is the syncpoint immediately following.
  
Dynamic index is used in the same manner as complete index. However, when deciding a syncpoint, the entire range of the syncpoint picked until the first syncpoint with a pts higher than requested is checked for having been already researched. If a "hole" is found anywhere in this range, dynamic index is disregarded and full binary search is performed.
+
=== Dynamic Index ===
 +
Create the index in the same manner as creating the index in the muxer, remembering keyframes and EOR at syncpoint positions.
 +
With one big difference - dynamic index can have gaps, thanks to seeking. Every syncpiont entry in the dynamic index has a flag indicating if it has a &quot;unresearched region&quot; gap from last syncpoint.
 +
 
 +
In seeking, dynamic index is used in the same manner as complete index. However, when deciding a syncpoint, the entire range of the syncpoint picked until the first syncpoint with a pts higher than requested is checked for having been already researched. If a &quot;hole&quot; is found anywhere in this range, dynamic index is disregarded and full binary search is performed.
 +
 
 +
=== End of Relevance (EOR) ===
 +
Behavior with linear search:
 +
* Treat EOR frames same as keyframes. Seeing an EOR frame for an active stream causes the stream to be ignored until the next keyframe of same stream.
 +
* If all active streams are EOR by end of linear search, just use start of linear search as final seeking position.
 +
 
 +
For index:
 +
* When picking the correct syncpoint to start linear search, ignore streams which have EOR with pts lower than requested pts and no keyframe with pts lower than requested pts.
 +
* If all active streams are EOR, pick the syncpoint immediately before the syncpoint with the last EOR across all streams with pts lower than requested pts.
  
 
[[Category:Container Formats]]
 
[[Category:Container Formats]]

Latest revision as of 07:01, 27 October 2012

NUT is a container format under construction by MPlayer, FFmpeg and Libav developers. Its main goals are

  • Simplicity: the format is simple to parse in code and easy to implement
  • Flexibility: able to encapsulate many types of audio, video, and subtitle codecs
  • Resilience: error-resistant while maintaining the smallest possible overhead

NUT seeking algo in libnut

Step 1 - Binary search and linear interpolation

Startup

Go to beginning of file if first syncpoint has not been found, and then go to EOF and search backward to find last syncpoint. Pick the closest possible syncpoints for requested pts from syncpoint cache, one higher and one lower. If requested pts is lower than first syncpoint or higher than last syncpoint, then seek to either of those syncpoints and end seek.

Binary search is ended when the entire region between the 2 syncpoints picked has been scanned. Which could even be before it has begun thanks to syncpoint cache.

Guess position

Best results have been achieved by combining both linear interpolation and binary search, giving most of the weight to interpolation.

hi, hi_pd
file position and timestamp of the top edge of the binary search.
lo, lo_pd
file position and timestamp of the bottom edge of the binary search.
time_pos
Requested pts.
guess
beginning of linear search for syncpoint.
#define INTERPOLATE_WEIGHT (19./20)
if (hi - lo < nut->max_distance*2) guess = lo + 16;
else { // linear interpolation
    double a = (double)(hi - lo) / (hi_pd - lo_pd);
    guess = lo + a * (time_pos - lo_pd);
    guess = guess * INTERPOLATE_WEIGHT + (lo+hi)/2 * (1 - INTERPOLATE_WEIGHT);
    if (hi - guess < nut->max_distance*2) guess = hi - nut->max_distance*2; //(lo + hi)/2;
}
if (guess < lo + 16) guess = lo + 16;

The first conditional prevents too much recursing when the distance is already low. The last conditional prevents an infinite loop of syncpoint search constantly finding lo and achieving nothing.

Binary search

Now, find a syncpoint, bounded between guess and hi. If it has a timestamp higher than requested pts, then replace hi and hi_pd, otherwise replace lo and lo_pd. If a syncpoint was not found between guess and hi, then, if guess is smaller or equal to lo + 16, then you have scanned the entire area between lo and hi, and your binary search is done. Otherwise, replace hi with guess and repeat the binary search.

After finding the 2 closest syncpoints bounding your requested pts, the bounds of linear are search are:

*start = lo_s.pos - (lo_s.back_ptr * 16 + 15);
*end = hi_s.pos;
*stopper = hi_s;

stopper is used later to end linear search prematurely. Note that it is the top edge of binary search, meaning, it has a timestamp higher than requested pts.

Step 2 - Linear search

Bounds of linear search

Linear search of course ends at end, however, it can end earlier:

  • When you find a frame of any stream, with a dts higher than requested pts.
  • When you find a frame for all active streams with a pts higher than requested pts.
  • When you find a syncpoint immediately following the syncpoint pointed to by stopper's back_ptr, (from here on this syncpoint will be called stopper_syncpoint) and all streams are active.
  • When you reach stopper_syncpoint, and you have not found a keyframe for any non-active between stopper's back_ptr and stopper_syncpoint, with a pts lower than stopper.timestamp. The reason for this is that stopper did not point because to stopper_syncpoint can only be because of active stream keyframes, and not non-active ones.
  • When you find a keyframe for all non-active streams after stopper_syncpoint with a pts lower than stopper.timestamp, which had a keyframe in the region in the previous condition. The logic is:
    • stopper.timestamp is higher than requested pts.
    • stopper_syncpoint is not the syncpoint pointed to by stopper.back_ptr, but the one immediately following.
    • If there was a keyframe for every non active stream after stopper_syncpoint, with a pts lower than stopper.timestamp, then the reason stopper does not point to stopper_syncpoint must be that there are no keyframes for active streams with a pts lower than stopper.timestamp after stopper_syncpoint.
    • There might be more keyframes in the region with a pts higher than stopper.timestamp, but those are irrelevant because they are higher than requested pts.

The last 3 conditions can be phrased differently:

  • When saying keyframes, only keyframes with a pts lower than stopper.timestamp are intended. Any other keyframes are irrelevant.
  • The region stopper.back_ptr to stopper_syncpoint has a set of keyframes between it. These keyframes are the ones responsible for stopper pointing to stopper.back_ptr and not stopper_syncpoint.
  • If all streams are active, then it's impossible for there to be better keyframes for all active streams between stopper_syncpoint to the end of the linear search, because if there was, stopper.back_ptr would point there.
  • If some streams are inactive, but none of them have a keyframe in the region specified above, then the keyframes responsible for stopper pointing to stopper.back_ptr are only from active streams, so again it is impossible there is a better position later in the linear search.
  • If some streams are inactive, and some of them do have a keyframe in the region specified above, then those keyframes are possibly responsible for stopper pointing to stopper.back_ptr. However, if we find more keyframes of those same inactive streams after the region, then they had nothing to do with stopper.back_ptr, and again the condition is met.

Linear search implementation

  1. Start at start.
  2. Search for syncpoint. If it is further than start + 15, then the file is errored, and you can start playing immediately from this syncpoint.
  3. Perform full demuxing, buffering everything read. If you encounter any NUT error during demuxing, seek back to the syncpoint after start and end seek.
  4. On every syncpoint, flush the buffer and cache the syncpoint position.
    • If this syncpoint is stopper_syncpoint, then do not flush the buffer, as it would be possible the linear search would end immediatelty afterwards and a second underlying seek would not be necessary.
  5. When finding keyframe of active stream with pts lower than requested pts, store position of keyframe. Discard previous value.
  6. End linear search as necessary by bounds given above.
  7. Pick the lowest value from positions of keyframes for active streams. This is the final position.
  8. Seek to closest syncpoint before the final position, preform a minimal demuxing until final position (to get proper last_pts context across all streams).
    • With luck this seek will not need an underlying seek, if you have not flushed the buffer since the requested position.
    • In most common case, stopper.back_ptr points to same syncpoint as the start of the linear search, as both syncpoints point to the same rare video keyframe.
    • If only video stream is active and the only non active stream is audio, you will most likely end your linear search immediately after stopper_syncpoint by finding an audio keyframe. Afterwards you will rewind right back to start, and, since buffer hasn't been flushed, this will not need an underlying seek.

Index

With index, binary search step is skipped, and a much more limited linear search used, between two immediate syncpoints. The syncpoint picked is the highest possible one that has a keyframe for each active stream past the syncpoint with a pts lower than requested pts. The end of the linear search is the syncpoint immediately following.

Dynamic Index

Create the index in the same manner as creating the index in the muxer, remembering keyframes and EOR at syncpoint positions. With one big difference - dynamic index can have gaps, thanks to seeking. Every syncpiont entry in the dynamic index has a flag indicating if it has a "unresearched region" gap from last syncpoint.

In seeking, dynamic index is used in the same manner as complete index. However, when deciding a syncpoint, the entire range of the syncpoint picked until the first syncpoint with a pts higher than requested is checked for having been already researched. If a "hole" is found anywhere in this range, dynamic index is disregarded and full binary search is performed.

End of Relevance (EOR)

Behavior with linear search:

  • Treat EOR frames same as keyframes. Seeing an EOR frame for an active stream causes the stream to be ignored until the next keyframe of same stream.
  • If all active streams are EOR by end of linear search, just use start of linear search as final seeking position.

For index:

  • When picking the correct syncpoint to start linear search, ignore streams which have EOR with pts lower than requested pts and no keyframe with pts lower than requested pts.
  • If all active streams are EOR, pick the syncpoint immediately before the syncpoint with the last EOR across all streams with pts lower than requested pts.