Interplay MVE

From MultimediaWiki

Interplay MVE is a full motion video format used in a number of PC games published by Interplay. It combines a custom video codec and either PCM or a custom DPCM coding scheme for audio.


The technical description on this page is originally based on an anonymous and thorough description of the format published when Interplay was proactive about pursuing individuals who tried to understand their data format. "BG" refers to the PC game Baldur's Gate, the apparent focus of the author's analysis.

Interplay MVE Format

Throughout this description, a "word" is a 16-bit value. All values are little-endian, unless otherwise specified.

The high-level format of an Interplay MVE file is a small header followed by variable-sized stream chunks. Each stream chunk consists of a word giving the length of the chunk and another giving the type, followed by a stream of one or more stream opcodes. Each stream opcode consists of a word giving the length of the stream opcode, a single byte for the type, a single byte (which I believe to be a "version" field, to allow backwards compatibility), and then variable data depending on the type of opcode.

So, just to make sure that's clear, we've got the header, followed by a 2-level hierarchical structure:

        ||           CHUNK1         ||           CHUNK2         ||
header  || op1 || op2 || op3 || op4 || op1 || op2 || op3 || op4 || ...


The Header of an Interplay MVE file must start with the sequence of bytes:

"Interplay MVE File\x1A\0"

where \x1A represents ASCII 0x1a (^Z), the old DOS end-of-file character, and \0 represents ASCII 0x00 (NUL). The reason for this is that, under DOS, if you do:

C:\>TYPE foobar.MVE

you'll see

Interplay MVE File

After these 20 bytes, there are 6 more bytes, which I believe are either a file format version, a "magic" number, or were, once upon a time, parameters. In modern Interplay games, these parameters appear to need to be hard-coded. They take the form of 3 words:

001a 0100 1133

Immediately following this are the chunks.
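
As a rough sketch of the header check described above (this is not taken from the BG player; the function name parse_mve_header is made up for illustration), the 20-byte signature and the three words after it could be read like this:

```python
import struct

# The 20-byte signature described above: text, ^Z (0x1a), then NUL.
SIGNATURE = b"Interplay MVE File\x1a\x00"

def parse_mve_header(data: bytes) -> tuple[int, int, int]:
    """Validate the signature and return the three little-endian words after it."""
    if len(data) < len(SIGNATURE) + 6:
        raise ValueError("too short to hold an MVE header")
    if data[:len(SIGNATURE)] != SIGNATURE:
        raise ValueError("missing Interplay MVE signature")
    return struct.unpack_from("<3H", data, len(SIGNATURE))

# The words 001a 0100 1133 are stored little-endian, so the bytes are swapped.
header = SIGNATURE + bytes.fromhex("1a00 0001 3311")
print([f"{w:04x}" for w in parse_mve_header(header)])
```

Whether a player should reject files whose three words differ from 001a 0100 1133 is left open here, since their meaning is not known.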


Each chunk consists of a word giving the total length of the data contained in the chunk, and another word which represents the type of the chunk. After these four bytes (which are NOT included in the chunk length), comes the chunk data. The types of chunks I know about (i.e. which are used in BG/BG2 movies that I've examined; the chunk types are not used at all in the movie playback code in BG/BG2) are:

0000: initialize audio
0001: audio only chunk (or maybe only used for audio pre-buffering)
0002: initialize video
0003: video chunk (usually includes audio.  possibly always includes audio)
0004: shutdown chunk
0005: end chunk

I don't know why an "end chunk" is needed, since the "shutdown chunk" seems to do that job nicely. The "end chunk" appears to contain no opcodes.
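
To make the two-level structure concrete, here is a minimal sketch of walking chunks and their opcodes. The helper names iter_chunks and iter_opcodes are hypothetical. It assumes the opcode length is a single 16-bit word that, like the chunk length, does not count its own 4 header bytes; the text above only states that explicitly for chunks.

```python
import struct

def iter_chunks(body: bytes):
    """Yield (chunk_type, chunk_data) from the bytes that follow the 26-byte header."""
    pos = 0
    while pos + 4 <= len(body):
        length, chunk_type = struct.unpack_from("<HH", body, pos)
        pos += 4  # these 4 bytes are NOT included in the chunk length
        yield chunk_type, body[pos:pos + length]
        pos += length

def iter_opcodes(chunk_data: bytes):
    """Yield (opcode, version, payload) for each opcode in one chunk.

    ASSUMPTION: the length word excludes the 4-byte opcode header.
    """
    pos = 0
    while pos + 4 <= len(chunk_data):
        length, opcode, version = struct.unpack_from("<HBB", chunk_data, pos)
        pos += 4
        yield opcode, version, chunk_data[pos:pos + length]
        pos += length

# Synthetic data, not from a real movie: one chunk of type 0x0003 holding
# opcode 0x04 (no data) followed by opcode 0x01 (end of chunk).
ops = struct.pack("<HBB", 0, 0x04, 0) + struct.pack("<HBB", 0, 0x01, 0)
body = struct.pack("<HH", len(ops), 0x0003) + ops
for chunk_type, data in iter_chunks(body):
    print(hex(chunk_type), [hex(op) for op, _ver, _payload in iter_opcodes(data)])
```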


The opcodes I've observed range from 0x00 to 0x15. Of these, the current code used in BG/BG2 uses only 0x00 through 0x11, so any guesses as to the function of 0x12 through 0x15 would be merely speculation. I have no idea what opcodes 0x12 through 0x15 are used for. But, again, since they are unused in the BG/BG2 movie player code, they are unnecessary for playback.

Opcode 0x00: End Of Stream

No data associated with this. When this opcode is seen, the playback of the movie stops immediately.

Opcode 0x01: End Of Chunk

All this opcode does in theory is to terminate a chunk. In practice, it signals the code to fetch and decode the next chunk.

Opcode 0x02: Create Timer

DWORD   timer rate
WORD    timer subdivision

This sets up the timer that drives the animation. Basically, every time the timer expires, it should be starting to pump out the next frame in order to keep up with the desired frame rate.

The normal values I've seen here are 0x2095 for the timer rate (8341), and 8 for the timer subdivision. What this means in practice is that every 8*8341 (=66728) microseconds, it should be ready to send out the next frame. So... 1000000/66728 == 14.99 frames per second typically. The exact purpose of the timer subdivision is unclear to me, but it may be an artifact of earlier methods of timer handling, since some of the code here appears to possibly even date back to the DOS days.
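
The arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation, with a made-up function name, and it assumes the product of the two fields really is a frame period in microseconds:

```python
import struct

def frames_per_second(payload: bytes) -> float:
    """Frame rate implied by an opcode 0x02 payload (DWORD rate, WORD subdivision)."""
    timer_rate, subdivision = struct.unpack_from("<IH", payload)
    # ASSUMPTION: rate * subdivision is the frame period in microseconds.
    return 1_000_000 / (timer_rate * subdivision)

payload = struct.pack("<IH", 0x2095, 8)  # the typical values quoted above
print(f"{frames_per_second(payload):.3f} fps")
```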

Opcode 0x03: Initialize Audio Buffers

version 0:

   WORD    (unknown)
   WORD    flags
   WORD    sample rate
   WORD    min buffer length

version 1:

   WORD    (unknown)
   WORD    flags
   WORD    sample rate
   DWORD   min buffer length

The flags recognized, as of version 0 are:

   bit 0: 0=mono,  1=stereo
   bit 1: 0=8-bit, 1=16-bit

The flags recognized, as of version 1 are:

   bit 0: 0=mono,  1=stereo
   bit 1: 0=8-bit, 1=16-bit
   bit 2: 0=uncompressed, 1=compressed

Only uncompressed audio is supported in the version 0 opcode. I think the other 13 bits (14 bits for ver. 0) here may be garbage. Whatever they are, they are apparently not used by the playback engine inside BG/BG2.

The sample rate is the standard sampling rate in Hz; typically 22050 in the BG movies. Buffer length is the size (in bytes) of the buffer that needs to be allocated for the audio. (I don't remember if this is the total audio buffer size needed, or if this number is a "per-channel" number that needs to be doubled for "stereo" audio streams.) They use 1.5 times the original buffer size in order to have a "safety zone".
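
A minimal sketch of decoding this opcode's payload follows. The function name and the dictionary keys are invented for illustration; the field layout and flag bits are the ones listed above.

```python
import struct

def parse_init_audio(version: int, payload: bytes) -> dict:
    """Decode an opcode 0x03 payload using the two layouts listed above."""
    # Version 1 widens the buffer length from a WORD to a DWORD.
    fmt = "<HHHI" if version >= 1 else "<HHHH"
    _unknown, flags, sample_rate, min_buf_len = struct.unpack_from(fmt, payload)
    return {
        "stereo": bool(flags & 0x1),
        "bits": 16 if flags & 0x2 else 8,
        # Bit 2 is only meaningful from version 1 onward.
        "compressed": version >= 1 and bool(flags & 0x4),
        "sample_rate": sample_rate,
        "min_buffer_length": min_buf_len,
    }

# Synthetic example: version 1, stereo, 16-bit, compressed, 22050 Hz.
example = struct.pack("<HHHI", 0, 0x7, 22050, 0x10000)
print(parse_init_audio(1, example))
```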

I will cover the format of the compressed audio data in the audio data opcode section.

Opcode 0x04: Start/Stop Audio

This seems to start and/or stop the audio playback. This opcode contains no data.

Opcode 0x05: Initialize Video Buffer(s)

version 0:

   WORD    width
   WORD    height

version 1:

   WORD    width
   WORD    height
   WORD    ?count?

version 2:

   WORD    width
   WORD    height
   WORD    ?count?
   WORD    true-color

(I think width and height are actually expressed as 8x8 pixel blocks. Need to verify. --Multimedia Mike 13:52, 5 February 2006 (EST))

Width is the width of the buffer to allocate, and height is the height. Both are given in terms of pixels. Now, the count appears to be used to over-allocate the video buffer. To compute the size to allocate for the video buffer, they take 2 bytes per pixel, and multiply by the height and the width, and then multiply by the count. If scan-line doubling is enabled (which it is not in BG/BG2), it then divides this value by two, on the assumption that there is only enough data for half the resolution.

Anyway, the over-allocation may be used to create a larger movie area and pan smoothly or something. I haven't seen a way to use the overallocation with the format details that I've discerned, but the video coding is particularly hairy, as the decoder relies on self-modifying x86 code to function. Yick. Anyway, I'm still in the process of looking for a file that uses over-allocation so that I can figure out exactly why it is used and what it is used for. (Again, this feature doesn't appear to be widely used in the sampling of BG/BG2 movies that I've examined.)

Note that an alternate possibility for the usage of the over-allocated space is as scratch space. This possibility will be addressed in the (voluminous!) documentation for opcode 0x11.
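
The size computation described above can be sketched as follows. The function names are hypothetical, and treating a missing count (version 0) as 1 is my assumption, not something confirmed from the player code. The dimensions are passed through unchanged, so the question of pixels versus 8x8 blocks is left to the caller.

```python
import struct

def parse_init_video(version: int, payload: bytes) -> tuple[int, int, int, int]:
    """Return (width, height, count, true_color) from an opcode 0x05 payload."""
    width, height = struct.unpack_from("<HH", payload, 0)
    # ASSUMPTION: versions without a count field behave as if count were 1.
    count = struct.unpack_from("<H", payload, 4)[0] if version >= 1 else 1
    true_color = struct.unpack_from("<H", payload, 6)[0] if version >= 2 else 0
    return width, height, count, true_color

def video_buffer_size(width: int, height: int, count: int = 1,
                      line_doubling: bool = False) -> int:
    """Bytes to allocate: 2 bytes per pixel, times the count, halved if doubling."""
    size = 2 * width * height * count
    if line_doubling:
        size //= 2
    return size

w, h, count, _tc = parse_init_video(2, struct.pack("<4H", 640, 480, 1, 0))
print(video_buffer_size(w, h, count))
```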

Opcode 0x06: unknown

   4 bytes apparently unused?
   WORD    unknown
   WORD    unknown
   WORD    unknown
   WORD    flip back buffer? (0=no, 1=yes)
   bytes   unknown

I haven't seen this opcode used in any BG/BG2 movies; however, this may be used for the panning or some clever usage of the over-allocation mentioned in opcode 0x05. If "flip back buffer?" has bit 0 set, it will flip the two allocated buffers before it does whatever it is that it does.

The "whatever it does" appears to be characterized by bulk memory moves, which makes it possible that it _is_ used in conjunction with the over-allocated video buffers.

No "version" check is made for this opcode, which makes me suspect that there is only 1 supported version of this opcode. (version 0, presumably)

Opcode 0x07: Send Buffer to Display

version 0:

   WORD    palette start
   WORD    palette count

version 1:

   WORD    palette start
   WORD    palette count
   WORD    ???

palette start is the index of the first palette entry to be installed before copying from the current back buffer to the display. palette count is the number of palette entries to be installed. As for the mysterious other flag... I am still unclear on its usage. Again, I've seen no example of its usage yet.

Opcode 0x08: Audio Frame (data)/Opcode 0x09: Audio Frame (silence)

   WORD    seq-index
   WORD    stream-mask
   WORD    stream-len
   data    audio data (only for Opcode 0x08)

seq-index is the sequential index of this audio chunk, numbered from 0000 (0000 being the first chunk in the audio file). stream-mask works as follows:

A given mve file can contain up to 16 parallel audio streams. Presumably this is for alternate languages. The stream-mask determines which stream(s) a given audio chunk belongs to. So, if bit 0 is set in the stream-mask, it belongs to stream 0. Typically, in the English language version of BG, I've seen the sole audio frame (opcode 8) having bit 0 set, and the next silent frame having all 15 of the other bits set.

So, just to make this clear, what we see is:

   opcode 8: idx=0 mask=0x0001 len=0x16d8 data=...
   opcode 9: idx=0 mask=0xfffe len=0x16d8

These audio chunks appear to always come in pairs.

stream-len is the total number of samples in the chunk.
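
As an illustration of the stream-mask test, here is a small sketch that splits an audio frame opcode into its fields and checks whether it belongs to a given stream. The function names are made up, and the payloads are synthetic ones modelled on the pair shown above.

```python
import struct

def parse_audio_frame(opcode: int, payload: bytes):
    """Return (seq_index, stream_mask, stream_len, data) for opcode 0x08 or 0x09."""
    seq_index, stream_mask, stream_len = struct.unpack_from("<HHH", payload)
    # Only opcode 0x08 carries audio data after the three words.
    data = payload[6:] if opcode == 0x08 else b""
    return seq_index, stream_mask, stream_len, data

def belongs_to_stream(stream_mask: int, stream: int) -> bool:
    """True if bit number 'stream' (0..15) is set in the mask."""
    return bool((stream_mask >> stream) & 1)

data_frame = struct.pack("<HHH", 0, 0x0001, 0x16D8) + b"\x00" * 8  # dummy audio bytes
silence_frame = struct.pack("<HHH", 0, 0xFFFE, 0x16D8)

for opcode, payload in ((0x08, data_frame), (0x09, silence_frame)):
    _idx, mask, _length, _data = parse_audio_frame(opcode, payload)
    print(hex(opcode), [s for s in range(16) if belongs_to_stream(mask, s)])
```

A player that only wants stream 0 would use the opcode 0x08 frame here and ignore the silent one.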

For more information on the DPCM format used in Interplay MVE files, see Interplay DPCM.

Opcode 0xa: Initialize Video Mode

   WORD    X-resolution
   WORD    Y-resolution
   WORD    flags

The usage of the flags field appears to be largely historical. Perhaps with the introduction of DirectX as the underlying medium, rather than the direct graphics hardware manipulation that was, apparently, used in an earlier version, this field is unnecessary. (In fact, in BG, this entire opcode turns into a no-op.) (Note, for the curious: BG actually contains assembly code to do register level manipulation of VGA hardware. Not enough to actually really do much, but it's there, anyway.)

Opcode 0xb: Create Gradient

   BYTE    baseRB
   BYTE    numR_RB
   BYTE    numB_RB
   BYTE    baseRG
   BYTE    numR_RG
   BYTE    numG_RG

I haven't seen this particular opcode used, but it is clear that it generates a gradient palette. It appears that it will generate two gradient palettes, if both count0 and count1 are non-zero. The first gradient is a pure red-blue gradient, and the second a pure red-green gradient. It appears to be designed for EGA/VGA hardware, since it uses 0-63 as the maximum range for a component within a color. The red component of each gradient moves linearly from 0 to 63 within numR_RB (resp. numR_RG) rows, and the blue or green component moves linearly from 0 to 39 within numB_RB (resp. numG_RG) columns for the blue or green gradient respectively.

The colors are ordered in row-major ordering, starting at the 'base'th entry. So, if you had:

   baseRB = 12, numR_RB = 5, numB_RB = 4

You'd get 20 colors starting at index #12, with a row-major gradient. Specifically you'd see:

   ( 0,0,0) ( 0,0,13) ( 0,0,26) ( 0,0,39)  ; 12...15
   (15,0,0) (15,0,13) (15,0,26) (15,0,39)  ; 16...19
   (31,0,0) (31,0,13) (31,0,26) (31,0,39)  ; 20...23
   (47,0,0) (47,0,13) (47,0,26) (47,0,39)  ; 24...27
   (63,0,0) (63,0,13) (63,0,26) (63,0,39)  ; 28...31
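
The table above can be reproduced with a short sketch. The interpolation used here (integer division of step * maximum by the number of steps minus one) is my guess at the rounding, chosen because it matches the listed values for a base of 12 with 5 rows and 4 columns; it is not taken from the player code, and the function name is made up.

```python
def gradient(base: int, num_rows: int, num_cols: int, second: str = "blue") -> dict:
    """Return {palette_index: (r, g, b)} for one gradient, in row-major order."""
    entries = {}
    for row in range(num_rows):
        # Red runs 0..63 down the rows (ASSUMED rounding: floor division).
        red = row * 63 // (num_rows - 1) if num_rows > 1 else 0
        for col in range(num_cols):
            # Blue or green runs 0..39 across the columns.
            other = col * 39 // (num_cols - 1) if num_cols > 1 else 0
            rgb = (red, 0, other) if second == "blue" else (red, other, 0)
            entries[base + row * num_cols + col] = rgb
    return entries

table = gradient(12, 5, 4)
for index in sorted(table):
    print(index, table[index])
```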

Opcode 0xc: Set Palette

   WORD    pal-start
   WORD    pal-count
   data    pal-data
  • pal-start indicates the first palette entry to fill
  • pal-count indicates the number of palette entries to fill
  • pal-data is the palette data, 3 bytes per palette entry, packed as RGBRGBRGB

Note that the palette components are 6-bit VGA palette values and only range from 0..63. The values must be shifted to display properly in higher resolution color formats where RGB components are generally 8 bits.
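
A sketch of applying this opcode to a 256-entry palette with 8-bit components is shown below. The names are hypothetical. The text only calls for a shift; copying the top two bits into the low bits, so that 63 maps to 255 rather than 252, is a common refinement that I am assuming here, not something the player is known to do.

```python
import struct

def scale_6_to_8(value: int) -> int:
    """Expand a 6-bit VGA component (0..63) to 8 bits (0..255)."""
    value &= 0x3F
    # Shift left by 2, then reuse the top bits to fill the low 2 bits (ASSUMED).
    return (value << 2) | (value >> 4)

def apply_set_palette(palette: list, payload: bytes) -> None:
    """Install the RGB triplets of an opcode 0x0c payload into 'palette' in place."""
    start, count = struct.unpack_from("<HH", payload)
    for i in range(count):
        r, g, b = payload[4 + 3 * i: 7 + 3 * i]
        palette[start + i] = (scale_6_to_8(r), scale_6_to_8(g), scale_6_to_8(b))

palette = [(0, 0, 0)] * 256
# Synthetic payload: two entries starting at index 1.
apply_set_palette(palette, struct.pack("<HH", 1, 2) + bytes([63, 0, 32, 0, 63, 0]))
print(palette[1], palette[2])
```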

Opcode 0xd: Set Palette Entries Compressed

   data    compressed palette data

This doesn't appear to have been used in the BG movies. This is a series of 32 entries of the following form:

   <byte> <RGB> <RGB> ... <RGB>

Where there are between 0 and 8 <RGB> values, taking 3 bytes apiece.

Each bit in the preceding byte determines which of the 8 palette entries have an RGB value stored for them, with the least significant bit corresponding to the first entry in the group of 8. So, in order to set only palette entry 0xe0 (224), the data would be:

   00              ;; 00-07
   00              ;; 08-0f
   00              ;; 10-17
   00              ;; 18-1f
   00              ;; 20-27
   00              ;; 28-2f
   00              ;; 30-37
   00              ;; 38-3f
   00              ;; 40-47
   00              ;; 48-4f
   00              ;; 50-57
   00              ;; 58-5f
   00              ;; 60-67
   00              ;; 68-6f
   00              ;; 70-77
   00              ;; 78-7f
   00              ;; 80-87
   00              ;; 88-8f
   00              ;; 90-97
   00              ;; 98-9f
   00              ;; a0-a7
   00              ;; a8-af
   00              ;; b0-b7
   00              ;; b8-bf
   00              ;; c0-c7
   00              ;; c8-cf
   00              ;; d0-d7
   00              ;; d8-df
   01 rr gg bb     ;; e0-e7
   00              ;; e8-ef
   00              ;; f0-f7
   00              ;; f8-ff
Or, written out as a flat byte stream:

       00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       00 00 00 00 00 00 00 00 00 00 00 00 01 rr gg bb
       00 00 00

That is 35 bytes of data instead of 768 to store the whole palette.
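
The decoding loop implied by this layout is short. The sketch below uses a made-up function name and a plain dictionary for the palette; it simply follows the description above (32 mask bytes, least significant bit first, each set bit followed by 3 bytes of RGB), and has not been tested against a real movie since none using this opcode has turned up.

```python
def decode_compressed_palette(data: bytes, palette: dict) -> int:
    """Apply an opcode 0x0d payload to 'palette'; return the bytes consumed."""
    pos = 0
    for group in range(32):  # 32 groups of 8 entries cover all 256 colors
        mask = data[pos]
        pos += 1
        for bit in range(8):  # least significant bit is the first entry
            if mask & (1 << bit):
                palette[group * 8 + bit] = tuple(data[pos:pos + 3])
                pos += 3
    return pos

# The example above, with placeholder values standing in for rr gg bb.
stream = bytes(28) + bytes([0x01, 10, 20, 30]) + bytes(3)
changed = {}
print(decode_compressed_palette(stream, changed), changed)
```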

Opcode 0xe: ???

   data    unknown length

I haven't encountered this value before. What it does is set a pointer to an array of words used during decoding of data using the 0x10 opcode, which I have also not encountered.

I'm still working on figuring out the use of this opcode and the 0x10 opcode, but they don't appear to be used in the BG movies, again. See my comments at opcode 0x10 for more details.

Opcode 0xf: Set Decoding Map

   data    decoding map

The decoding map is a particular data block used in the decoding of video frames, as encoded via opcode 0x11. I'll cover it in detail when I get to opcode 0x11.

Opcode 0x10: ???

This is another means of storing video data. I haven't seen it used yet, and am still sorting through the details. This seems to be tied in with the issue of multiple pages of video memory, as with opcode 6 and the "count" field of opcode 5.

Note that this opcode makes use of 3 (!) data streams, as opposed to 2 for 0x11. Even so, it appears to be a much simpler encoding. The data streams used for this are the most recent 0xe opcode data stream, the most recent 0xf opcode data stream, and this opcode's data stream.

There appears to be verbatim pixel data encoded in the 0x10 stream, but which pixels have been stored, among other things, is determined by the other streams. It also appears that in this stream all pixel manipulation is done in 8-pixel wide and 8-pixel tall units. This is set up to loop first over each column, then over each row, then finally over each page:

       foreach page
           foreach row
               foreach col
                   decode opcode data

If I can find an example of one of these files to mess around with, I will complete my analysis of this opcode.

Opcode 0x11: Video Data

Ok, this is the big killer opcode. See Interplay Video for a detailed description of the video coding format.

Opcode 0x12

Not observed in BG movies, and not used by the player.

Opcode 0x13

Unknown. Used in the BG movies, but not used by the player. Appears to always(?) have 0x84 bytes of data. This is a recurrent opcode, appearing in most, if not all video chunks.

Opcode 0x14

Not observed in BG movies, and not used by the player.

Opcode 0x15

Unknown. Used in the BG movies, but not used by the player. Appears to always(?) have 4 bytes of data. This one appears in the "video initialization chunk".

Typical Chunk Formation

Audio chunks:

   opcode 0x8
   opcode 0x9

Video chunks:

   opcode 0x2
   opcode 0xf
   opcode 0x8
   opcode 0x9
   opcode 0x11
   opcode 0x13
   opcode 0x4
   opcode 0x7

video init chunk (type 2):

   opcode 0xa
   opcode 0x5
   opcode 0xc
   opcode 0x15

audio init chunk (type 0):

   opcode 0x3

PC Games Using Interplay MVE Files