The frame header consists of many parts and may include quantization tables and 2048 bits of user data. Each frame also carries two GUIDs and a timestamp. The frame header is packed into big-endian dwords.
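As a rough illustration of the big-endian dword packing, the sketch below unpacks 32-bit words from a byte buffer. The helper name `read_be_dwords` and the sample bytes are made up for the example; the actual field layout of the header is not described here and is not assumed.

```python
import struct

def read_be_dwords(buf, count, offset=0):
    """Unpack `count` unsigned 32-bit big-endian words starting at `offset`.

    Hypothetical helper: it only shows the byte order, not the real
    header layout, which this text does not specify.
    """
    # ">" selects big-endian, "I" is an unsigned 32-bit integer.
    return list(struct.unpack_from(">%dI" % count, buf, offset))

# Made-up sample bytes, not taken from a real stream.
sample = bytes.fromhex("0000028001020304")
words = read_be_dwords(sample, 2)
print([hex(w) for w in words])
```

Individual fields would then be extracted from each dword with shifts and masks once their positions are known.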
The actual frame data consists of packed macroblocks coded with a technique almost identical to JPEG's: DC prediction, plus variable-length codes with run-length encoding for the other 63 coefficients.
The DC coefficient is not quantized.
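The JPEG-like scheme above can be sketched as follows, assuming the variable-length codes have already been parsed into a DC difference and a list of (run, level) pairs. Everything named here (`decode_block`, `ac_scale`, the sample values) is hypothetical, and the plain multiplication used for the AC coefficients is only a placeholder, not the codec's actual dequantization formula. Coefficients are left in scan order; no scan table is assumed.

```python
def decode_block(dc_diff, ac_pairs, prev_dc, ac_scale):
    """Rebuild 64 coefficients (in scan order) from already-parsed symbols.

    dc_diff  -- decoded DC difference for this block
    ac_pairs -- list of (run, level): `run` zeros, then one nonzero level
    prev_dc  -- DC value of the previous block (the predictor)
    ac_scale -- 64 hypothetical quantizer weights (index 0 is unused)
    """
    coeffs = [0] * 64
    # DC prediction: add the difference to the previous DC.
    # The DC coefficient is not quantized, so no scaling is applied.
    coeffs[0] = prev_dc + dc_diff
    pos = 1
    for run, level in ac_pairs:
        pos += run  # skip `run` zero coefficients
        if pos > 63:
            raise ValueError("run overflows the block")
        # Placeholder dequantization for the AC coefficient.
        coeffs[pos] = level * ac_scale[pos]
        pos += 1
    # Return the new DC so the caller can use it as the next predictor.
    return coeffs, coeffs[0]

# Made-up symbols for two consecutive blocks.
scale = [2] * 64
block_a, dc = decode_block(5, [(0, 3), (2, -1)], 100, scale)
block_b, dc = decode_block(-5, [], dc, scale)
```

Returning the updated DC alongside the block keeps the predictor state explicit, so the caller decides when to reset it.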
SMPTE standardized it as VC-3.