Tomb Raider is driven by various sets of files — level files, script files, FMVs, audio tracks and sound files. In TR4 and TR5, there is also specific file type which contains cutscene data — cutseq pack.
The script file structure differs from version to version.
In TR1, all script info was embedded into executable file (TOMB.EXE
), and thus is hardcoded. TR2 and TR3 had unified TOMBPC.DAT
file, which contains all the text strings describing the various elements in the game (e.g. the game engine knows about “Key 1”; it looks in TOMBPC.DAT
to determine the name to be displayed in Lara’s inventory, such as “Rusty Key” or “Taste rostige” or “Clé Rouillée”), the level and cut-scene filenames (e.g. WALL.TR2
, CUT3.TR2
), the order in which they are to be played, and various per-level and per-game configuration options (e.g. what weapons and objects Lara starts the level with, whether or not the “cheat” codes work, etc.).
TR4 and TR5 introduced a new script format, where the actual script defining the gameflow was separated from text strings used in game — hence, both TR4 and TR5 have two .DAT
files — SCRIPT.DAT
and LANGUAGE.DAT
, where LANGUAGE
differs depending on regional origin of the game — US.DAT
, FRENCH.DAT
, JAPANESE.DAT
, and so on.
The level files, {level-name}.PHD/TUB/TR2/TR4/TRC
, contain everything about the level, including the geographical geometry, the geometry (meshes) of all animate and inanimate objects in the level, all the textures and colour data, all animation data, index information (and, in TR1, TR4 and TR5 — also the actual sound sample data) for all sounds, accessibility maps — everything necessary to run the game. For whatever reason, Core has included everything in one file instead of breaking it up into logical groupings; this means that every level contains all the meshes, textures, sound information, and animation data for Lara and all of her weapons. There are a fair number of other redundancies, too.
Since TR4, the level file is divided into several chunks, each of them being compressed with zlib. Usually, each chunk of compressed data is preceded by two 32-bit unsigned integers defining the uncompressed size of the chunk and the compressed size of the chunk. Therefore, the engine allocates an empty buffer equal to the uncompressed size of a specific chunk, and another buffer equal to the compressed size. The compressed data is loaded directly within it based on the compressed size. The compressed data is then decompressed into the result buffer and the buffer containing the compressed data is destroyed. In TR5, those chunks aren’t compressed anymore.
.PHD
is actually the initials of the Lead Programmer for Tomb Raider 1: Paul Howard Douglas. Looks like this programmer contributed a lot of the code during early development stages of Tomb Raider. This is suggested because the phd initials also became a prefix for several helper functions in the original source code, for instance: phd_sin
, phd_cos
etc. Most likely, he was also responsible for developing the level file structure for Tomb Raider.
TR1-3 shared the same proprietary Eidos codec for videos, called Escape. The extension for such files is .RPL
, that’s why they occasionally (and mistakingly) called Replay codec. Signature feature of RPL videos is that they are always interlaced with black stripes; most likely, this was used to conserve disk space (however, PlayStation videos were in .STR
format, which is basic MPEG compression, and they had no interlacing — but suffered from blocking issues). In TR1 and TR2, framerate was limited to 15 FPS, while in TR3 it was doubled to 30 FPS.
For a long time, Escape codec was largely unexplored and barely reverse-engineered; there was only an abandoned open source Mplayer implementation for some Escape codec versions, but recent ffmpeg revisions feature fully functional decoder for Escape videos.
Since TR4, all FMVs are in Bink Video format, which is much more common and easy to rip, convert and explore.
These are long sound files which occasionally play either on some in-game events (e.g. approaching certain important checkpoint in game, like big hall with ladder and two wolves in “Caves” — it triggers danger music theme) or in looped manner as background ambience. Audio tracks are stored differently across TR game versions — CD-Audio in TR1-TR2, single merged file CDAUDIO.WAD
in TR3, and separate audio files in TR4 and TR5.
TR2 and TR3 also featured external sound sample files, which allowed to share samples between all level files. This sound file is called MAIN.SFX
, and usually placed in DATA
subfolder. Hence, engine loads sound samples not from level files (as it’s done in TR1, TR4 and TR5 — see above), but rather from this MAIN.SFX
file.
TR4 and TR5 featured special data type containing all the necessary information to play in-game cutscenes. While in earlier games such info was embedded into the level file itself, and generally, cutscenes themselves were separate level files (easily distinguished by their filenames, e.g. CUT1.TR2
etc.), TR4 changed this approach, and cutscenes could be loaded and played right inside level files at runtime.
The data for such cutscene setup was packed into single file titled CUTSEQ.PAK
in TR4 or CUTSEQ.BIN
in TR5. There will be a special section describing whole cutseq file format.
For the purposes of further discussion, the following are assumed:
int8_t | specifies an 8-bit signed integer (range -128..127) |
uint8_t | specifies an 8-bit unsigned integer (range 0..255) |
int16_t | specifies a 16-bit signed integer (range -32768..32767) |
uint16_t | specifies a 16-bit unsigned integer (range 0..65535) |
int32_t | specifies a 32-bit signed integer (range -2147483648..2147483647) |
uint32_t | specifies a 32-bit unsigned integer (range 0..4294967295) |
float | specifies a 32-bit IEEE-754 floating-point number |
fixed | specifies a 32-bit non-trivial 16.16 fixed point value — see further |
ufixed16 | specifies a 16-bit non-trivial 8.8 fixed point value — see further |
All multi-byte integers ({u}int16_t
, {u}int32_t
) are stored in little-endian (Intel-x86, etc.) format, with the least significant byte stored first and the most significant byte stored last. When using this data in platforms with big-endian (PowerPC, etc.) number format, be sure to reverse the order of bytes.
These very specific data types mimic floating-point behaviour, while remaining integer. It is done by splitting floating-point value into whole and fractional parts, and keeping each part as int16_t
and uint16_t
correspondingly for `fixed` type and as uint8_t
and uint8_t
for ufixed16
type. Whole part is kept as it is, while fractional part is multiplied by 65536 (for fixed
) or by 255 (for ufixed16
), and then kept as unsigned integer. So, the formula to calculate floating-point from fixed
is:
$F_{real} = P_{whole} + ( P_{frac} \div 65536 )$
Formula to calculate floating-point from ufixed16
is:
$F_{real} = P_{whole} + ( P_{frac} \div 255 )$
…where $P_{whole}$ is whole part of mixed float (signed for fixed
, unsigned for ufixed16
), and $P_{frac}$ is fractional part (unsigned).
However, some internal variables and constants (like drawing distance, fog distance constants and some light properties) are PC-specific and stored in floating point numbers. Also, last game in series, TR5, extensively used floating-point numbers for certain data types – like colours, vertices and coordinates.
Data alignment is something one has to be careful about. When some entity gets an address that is a multiple of $n$, it is said to be $n$-byte aligned. The reason it is important here is that some systems prefer multibyte alignment for multibyte quantities, and compilers for such systems may pad the data to get the “correct” alignments, thus making the in-memory structures out of sync with their file counterparts. However, a compiler may be commanded to use a lower level of alignment, one that will not cause padding. And for TR’s data structures, 2-byte alignment should be successful in nearly all cases, with exceptions noted below.
To set single-byte alignment in any recent compiler, use the following compiler directive:
#pragma pack(push, 1)
To return to the project’s default alignment, use the following directive:
#pragma pack(pop)
The world coordinate system is oriented with the $X-Z$ plane horizontal and $Y$ vertical, with $-Y$ being “up” (e.g. decreasing $Y$ values indicate increasing altitude). The world coordinate system is specified using int32_t
values; however, the geography is limited to the $+X$/$+Z$ quadrant for reasons that are explained below. Mesh coordinates are relative and are specified using int16_t
.
There are some additional coordinate values used, such as “the number of 1024-unit blocks between points A and B”; these are simply scaled versions of more conventional coordinates.
All colours in TR are specified either explicitly (using either the [tr_colour] structure, described below, 16-bit structures or 32-bit structures) or implicitly, by indexing one of the palettes. However, it is only applicable to TR1-3 — there is no palette in TR4 and TR5.
In TR1-3, mesh surfaces could be either coloured or textured. Coloured surfaces are “painted” with a single colour that is either specified explicitly or using an index into the palette.
Beginning from TR4, coloured faces feature was removed, so each face must have a texture attached to it.
Textured surfaces map textures from the texture atlases (textiles) to each point on the mesh surface. This is done using conventional UV mapping, which is specified in “Object Textures” below; each object texture specifies a mapping from a set of vertices to locations in an atlas, and these texture vertices are associated with position vertices specified here. Each atlas has a size of 256×256 pixels.
The 16-bit atlas array, which contains [tr_image16] structures, specifies colours using 16-bit 1-5-5-5 ARGB encoding.
If, for some reason, 16-bit textures are turned off, all colours and textures use an 8-bit palette that is stored in the level file. This palette consists of a 256-element array of [tr_colour] structures, each designating some colour; textures and other elements that need to reference a colour specify an index (0..255) into the Palette[]
array. There is also a 16-bit palette, which is used for identifying colours of solid polygons. The 16-bit palette contains up to 256 four-byte entries; the first three bytes are a [tr_colour], while the last byte is ignored (set to 0).
The 32-bit texture atlas array, which contains [tr4_image32] structures, specifies colours using 32-bit ARGB, where the alpha channel is unused. The 16-bit and 32-bit texture atlas arrays depict the same graphics data, but of course the 32-bit array has a better colour resolution. It’s the one used if you select a 32-bit A8R8G8B8 texture format in the setup menu from TR4 and TR5.
There are two basic types of “visible objects” in TR2 — meshes and sprites.
Meshes are collections of textured or coloured polygons that are assembled to form a three-dimensional object (such as a tree, a tiger, or Lara herself). The “rooms” themselves are also composed of meshes. Mesh objects may contain more than one mesh; though these meshes are moved relative to each other, each mesh is rigid.
Sprites are two-dimensional images that are inserted into three-dimensional space, such as the “secret” dragons, ammunition, medi-packs, etc. There are also animated sprite sequences, such as the fire at the end of “The Great Wall.” Core had presumably used this method to reduce CPU utilization on the PlayStation and/or the earlier PCs. Sprites become less and less abundant; TR2 has very few scenery sprites, and TR3’s pickups are models instead of sprites.
Each Tomb Raider game has an internal hardcoded set of entity types, each of them linked to specific model (hence, entity type and model can be considered equal). Entity is an individual object with its own specific function and purpose. Almost every “moving” or “acting” thing you see is an entity — like enemies, doors, pick-up items, and even Lara herself.
A level can contain numerous instances of the same entity type, e.g. ten crocodiles, five similar doors and switches, and so on.
Entities are referenced in one of two ways — as an offset into an array (e.g. Entities[i]
) or internally, using an unique index . In the latter case, the related array (Entities[]
) is searched until a matching index is found. Each entity also refers to its entity type
by TypeID
to select behaviour and model to draw. In this case, Models[]
array is searched for matching TypeID
until one found.
There are three basic types of animations in TR, two corresponding with textures — sprite animations and animated textures — and one corresponding directly with meshes.
Sprite animation (sprite sequences) consists simply of a series of sprites that are to be displayed one after another, e.g. grenade explosions. Sprite animations were quite common in earlier games (TR1 and TR2), while in TR3 onwards there are almost no sprite animations — only notable example is fire particle sprites and water splash effect.
These are either a list of textures cycled through in endless loop, or (in TR4-5) a single texture with shifting coordinates, creating an illusion of “rolling” image.
Mesh animations are much more complex than sprite and texture animations, and done by what is essentially a skeletal-modeling scheme. These involve some arrays (Frames[] and MeshTree[]) of offsets and rotations for each element of a composite mesh. Frames are then grouped into an array (Animations[]) that describes discrete “movements”, e.g. Lara taking a step or a tiger striking with its paw. The animations are “sewn together” by a state change array and an animation dispatch array, which, together with state information about the character, ensure that the animation is fluid (e.g. if Lara is running and the player releases the RUN key, she will stop; depending upon which of her feet was down at the time, either her left or right foot will strike the floor as part of the “stop” animation. The correct animation (left foot stop vs. right foot stop) is selected using these structures and the state information).
There are two main types of lighting in Tomb Raider, constant and vertex. Constant lighting means that all parts of an object have the same illumination, while in vertex lighting, each polygon vertex has its own light value, and the illumination of the polygon interiors is interpolated from the vertex values.
Furthermore, lighting can be either internal or external. Internal lighting is specified in an object’s data, external lighting is calculated using the room’s light sources (ambient light, point light sources, spotlights (TR4-5), dynamic lights).
When available, external lighting also uses the vertex normals to calculate the incoming light at each vertex. Light intensities are described either with a single value or with a 16 bits color value (you can see it more like a “color filter”), depending mainly on the TR version.
Light intensities are described with a single value in TR1 and a pair of values in TR2 and TR3; the paired values are almost always equal, and the pairing may reflect some feature that was only imperfectly implemented, such as off/on or minimum/maximum values. In TR1 and TR2, the light values go from 0 (maximum light) to 8192 (minimum light), while in TR3, the light values go from 0 (minimum light) to 32767 (maximum light).
There are two ways for sound samples to play.
First one is basically sound emitter sitting at a static global position in level, and continuously emitting specified sound (such as waterfalls — these are in SoundSources[]
). Second one is triggered sounds — these are sounds played when some event happens, such as at certain animation frames (footsteps and other Lara sounds), when doors open and close, and when weapons are fired.
Either way, each played sound is referred to using a three-layer indexing scheme, to provide a maximum amount of abstraction. An internal sound index references SoundMap[]
, which points to a SoundDetails[]
record, which in turn points to a SampleIndices[]
entry, which in turn points to a sound sample. SoundDetails[]
, contains such features as sound intensity, how many sound samples to choose from, among others. The sound samples themselves are in Microsoft WAVE format, and, as already mentioned, they are embedded either in the data files (TR1, TR4 and TR5) or in a separate file (MAIN.SFX
) in TR2 and TR3.
Much of the .TR2 file is comprised of structures based on a few fundamental data structures, described below.
This is how most colours are specified.
struct tr_colour // 3 bytes { uint8_t Red; // Red component (0 -- darkest, 255 -- brightest) uint8_t Green; // Green component (0 -- darkest, 255 -- brightest) uint8_t Blue; // Blue component (0 -- darkest, 255 -- brightest) };
(Some compilers will pad this structure to make 4 bytes; one must either read and write 3 bytes explicitly, or else use a simple array of bytes instead of this structure.)
And as mentioned earlier, the 16-bit palette uses a similar structure:
struct tr_colour4 // 4 bytes { uint8_t Red; uint8_t Green; uint8_t Blue; uint8_t Unused; };
In TR5, there is new additional colour type composed of floating-point numbers. This type is primarily used in light structures.
struct tr5_colour // 16 bytes { float Red; float Green; float Blue; float Unused; // Usually filler value = 0xCDCDCDCD };
This is how vertices are specified, using relative coordinates. They are generally formed into lists, such that other entities (such as quads or triangles) can refer to them by simply using their index in the list.
struct tr_vertex // 6 bytes { int16_t x; int16_t y; int16_t z; };
As with colours, TR5 introduced additional vertex type comprised of floating-point numbers:
struct tr5_vertex // 12 bytes { float x; float y; float z; };
Four vertices (the values are indices into the appropriate vertex list) and a texture (an index into the object-texture list) or colour (index into 8-bit palette or 16-bit palette). If the rectangle is a coloured polygon (not textured), the .Texture element contains two indices: the low byte (Texture & 0xFF
) is an index into the 256-colour palette, while the high byte (Texture >> 8
) is in index into the 16-bit palette, when present. A textured rectangle will have its vertices mapped onto all 4 vertices of an object texture, in appropriate correspondence.
struct tr_face4 // 12 bytes { uint16_t Vertices[4]; uint16_t Texture; };
Texture
field can have the bit 15 set: when it is, the face is double-sided (i.e. visible from both sides).
If the rectangle is a coloured polygon (not textured), the .Texture element contains two indices: the low byte (Texture & 0xFF
) is an index into the 256-colour palette, while the high byte (Texture >> 8
) is in index into the 16-bit palette, when present.
TR4 and later introduced an extended version only used for meshes, not for triangles and quads making rooms:
struct tr4_mesh_face4 // 12 bytes { uint16_t Vertices[4]; uint16_t Texture; uint16_t Effects; };
The only difference is the extra field Effects
. It has this layout:
Attribute
field of [tr_object_texture] is 2, but this flag overrides it).Bit 0 set, blending enabled | Bit 0 not set, blending disabled |
Shiny effect at max | No shiny effect |
These structures has the same layout than the quad face definitions, except a textured triangle will have its vertices mapped onto the first 3 vertices of an object texture, in appropriate correspondence. Moreover, a triangle has only 3 vertices, not 4.
struct tr_face3 // 8 bytes { uint16_t Vertices[3]; uint16_t Texture; };
struct tr4_mesh_face3 // 10 bytes { uint16_t Vertices[3]; uint16_t Texture; uint16_t Effects; // TR4-5 ONLY: alpha blending and environment mapping strength };
All the info about Texture
and Effects
fields is also similar to same info from [tr_face4] and [tr4_mesh_face4] respectively.
Each uint8_t
represents a pixel whose colour is in the 8-bit palette.
struct tr_image8 // 65536 bytes { uint8_t pixels[256 * 256]; };
Each uint16_t
represents a pixel, encoded as 1-5-5-5 ARGB.
struct tr_image16 // 131072 bytes { uint16_t pixels[256 * 256]; };