This is an old revision of the document!
Table of Contents
Music and Sound
As it was described in the beginning, all sound in TR engines can be separated into two distinctive sections — audio tracks and sounds.
Audio tracks are long separate files which are loaded by streaming, and usually they contain background ambience, music or voiceovers for cutscenes.
Sounds are short audio samples, which are frequently played on eventual basis. While engine can usually play one audio track at a time, it can play lots of sounds in the same time, and also play numerous copies of the same sound (for example, when two similar enemies are roaming around).
Audio Tracks
Audio tracks can be looped or one-shot. Looped tracks are usually contain background ambience (these creepy sounds heard in the beginning of “Caves” and so on), but occasionally they can use music (e. g., “Jeep theme” from TR4). One-shot tracks are used for musical pieces which are usually triggered on certain event, and also for “voice chatting” (e. g. approaching the monk from “Diving Area” in TR2).
As both looped and one-shot tracks use the same audio playing routine, there’s no chance both looped and one-shot tracks could be played simultaneously. This is the reason why background ambience stops and restarts every time another (one-shot) track is triggered. However, this limitation was lifted in TREP and TRNG.
The audio tracks are stored in different fashions in the various versions of the TR series:
TR1 and TR2
TR1 and TR2 used CD-Audio tracks for in-game music, and therefore, they needed auxiliary CD-audio fed into the soundcard. That’s the reason why most contemporary PCs have issues with audiotrack playback in these games — such setup is no longer supported in modern CD/DVD/BD drives, and digital pipeline is not always giving the same result. Currently, various modernized game repacks (such as Steam or GOG releases) officially features no-CD cracks with embedded MP3 player, which takes place of deprecated CD-Audio player.
TR3
In TR3, we have somewhat special audiotrack setup. Audio format was changed to simple WAV (MS-ADPCM codec), but all tracks were embedded into CDAUDIO.WAD
file, which also contained a header with a list of all tracks and their durations. So, when game requests an audiotrack to play, it takes info on needed track from CDAUDIO.WAD
header, and then goes straight to an offset for this track into it. The format of CDAUDIO.WAD
header entry is:
struct tr3_cdaudio_entry // 0x10C bytes { char Name[260]; // C string with track name uint32_t WavLength; // Wave file size uint32_t WavOffset; // Absolute offset in CDAUDIO.WAD };
The number of header entries is always 130 (meaning the whole size of header should be 0x10C * 130
). Header is immediately followed by embedded audio files in WAV format.
CDAudio.db
, which contains the names of all the track files as 32-byte zero-padded C strings with no extra contents.
TR4 and TR5
In TR4-5, track format remained the same (MS-ADPCM), but tracks were no longer embedded into CDAUDIO.WAD
. Instead, each track was saved as simple .WAV
file, and file names themselves were embedded into executable. Hence, when TR4-5 plays an audiotracks, it refers to internal filename table, and then loads an audiotrack with corresponding name.
Sounds
In TR engines, sounds appear in a variety of contexts.
They can be either continuous or triggered. Continuous ones are usually produced by sound source object, which make sound in a range around some specific point (range appears to be hardcoded, and is equal to around 8 sectors). Likewise, triggered ones can be triggered by a variety of events. The triggering can be hardcoded in the engine (for example, gunshots) or by reaching some animation frame (footsteps, Lara’s somewhat unladylike sounds). Flipeffects can also be used to play certain sound sample in specific circumstances, and either called by triggers or from animations.
Sounds may be looped or one shot, with looped ones playing until specifically untriggered by some in-game event. For example, Uzi and M16 sounds are looped and will be played until Lara stops firing these weapons — in such case, engine sends a command to stop this particular sound.
Sounds may be global or local. Global sounds have no 3D positioning in world space, they always have constant volume and usually not linked to any object in level. These are primarily menu sounds. Local sounds are usually produced by entities or sound source objects. They have certain coordinates in space, and could be affected by their position and environment. Local sounds, when emitted by entities, may travel along with these entities.
Sounds are referred to by an internal sound index; this is translated into which sound sample with the help of three layers of indexing, to allow for a suitable degree of abstraction. Internal sound indices for various sounds are consistent across all the level files in a game; a gunshot or a passport opening in one level file will have the same internal sound index as in all the others. The highest level of these is the SoundMap[]
array, which translates the internal sound index into an index into SoundDetails[]
. Each SoundDetails
record contains such details as the sound volume, pitch and volume randomization, looping configuration, how many samples to select from, and an index into SampleIndices[]
. This allows for selecting among multiple samples to produce variety; that index is the index to the SampleIndices[]
value of first of these, with the rest of them being having the next indices in series of that array. Thus, if the number of samples is 4, then the TR engine looks in SampleIndices[]
locations Index, Index+1, Index+2, and Index+3. Finally, the SampleIndices[]
array references some arrays of sound samples.
Additionally, TR4 and TR5 introduced usage of MS-ADPCM codec to store sample data, which was compressed and loaded into buffers on level loading. However, TR4 engine version bundled with TRLE used uncompressed wave data, as in TR1-TR3.
In TR1, TR4 and TR5 samples themselves are embedded in the level files. In TR1, SampleIndices[]
array contains the displacements of each sample in bytes from the beginning of that embedded block. In TR4 and TR5, way to access sample data was changed — each sample data is preceded by uncompressed size and compressed size uint32_t
values, which are used to extract given sample and load it into DirectSound buffer.
struct tr4_sample // (variable length) { uint32_t UncompSize; uint32_t CompSize; uint8_t SoundData[CompSize]; // zlib-compressed sound data (CompSize bytes) };
.WAV
file, uncompressed size defines the size of raw PCM data size in 16-bit, 22050 kHz, mono format. However, uncompressed size does not necessarily equal to a value derived from compressed size by wave type conversion, because MS-ADPCM codec tends to leave a bit of silence in the end of the file (which produces audible interruption in looped samples).
In TR2 and TR3, these samples are concatenated in the file MAIN.SFX
with no additional information; SampleIndices[]
contains sequence numbers (0, 1, 2, 3, …) in MAIN.SFX
. Finally, the samples themselves are all in Microsoft WAVE format.
Sound Data Structures
Sound Sources
This structure contains the details of continuous-sound sources. Although a SoundSource object has a position, it has no room membership; the sound seems to propagate omnidirectionally for about 8 horizontal-grid sizes without regard for the presence of walls.
struct tr_sound_source // 16 bytes { int32_t x; // absolute X position of sound source (world coordinates) int32_t y; // absolute Y position of sound source (world coordinates) int32_t z; // absolute Z position of sound source (world coordinates) uint16_t SoundID; // internal sound index uint16_t Flags; // 0x40, 0x80, or 0xC0 };
Flags
field defines sound source behaviour, if it was placed in room which can be flipped to alternate room:
0x40
: Play sound only when room is in alternate state.0x80
: Play sound only when room is in original state.0xC0
: Always play sound (in both original and alternate rooms).
Sound Map
SoundMap
is used for mapping from internal-sound index to SoundDetails
index; it is 256 int16_t
in TR1, 370 int16_t
in TR2, TR3 and TR4, and 450 int16_t
in TR5. A value of -1 (0xFFFF)
indicates “none”, meaning the sample is not used in current level.
Each SoundDetails
entry can be described as such:
struct tr_sound_details // 8 bytes { uint16_t Sample; // (index into SampleIndices) uint16_t Volume; uint16_t Chance; // If !=0 and ((rand()&0x7fff) > Chance), this sound is not played uint16_t Characteristics; };
Characteristics
is a packed field containing various options for this particular sound detail:
Bit | Hex | Flag | Description |
---|---|---|---|
0 | 0x0003 | Looping behaviour | either normal playback (value 00 ), one-shot rewound (01 in TR1/TR2, 10 in other games), meaning the sound will be rewound if triggered again, or looped (value 10 in TR1, 11 in other games), meaning the sound will be looped until strictly stopped by an engine event. Since TR3, one-shot wait mode is introduced (value 10 ), meaning the same sound will be ignored until current one stops |
1 | |||
2 | 0x00FC | Number of sound samples in this group | If there are more than one samples, then engine will select one to play based on randomizer (for example, listen to Lara footstep sounds) |
3 | |||
4 | |||
5 | |||
6 | |||
7 | |||
8 | |||
9 | |||
10 | |||
11 | |||
12 | 0x1000 | No panoramic | Disables panoramic calculation of a given sample (always plays at center). |
13 | 0x2000 | Randomize pitch | When this flag is set, sound pitch will be slightly varied with each playback event |
14 | 0x4000 | Randomize gain | When this flag is set, sound volume (gain) will be slightly varied with each playback event |
15 |
In TR3 onwards, [tr_sound_details] structure was rearranged:
struct tr3_sound_details // 8 bytes { uint16_t Sample; // (index into SampleIndices) uint8_t Volume; uint8_t Range; uint8_t Chance; uint8_t Pitch; int16_t Characteristics; };
Range
now defines radius (in sectors), on which this sound can be heard. Previously (in TR1 and TR2), each sound had a predefined range about 8 sectors.
Pitch
specifies absolute pitch volume for this sound (may be also varied by bit 13 of Characteristics). Mainly, this value was used to “speed-up” certain samples, thus allowing to keep high-quality samples with lower sample rate, or on contrary, “slow down” sample, making it sound longer than its native sample rate, thus conserving some memory.
Sample Indices
In TR1, this is an offset value array, each offset of which points into the embedded sound-samples block, which follows this array in the level file. In TR2 and TR3, this is a list of indices into the file ‘MAIN.SFX` file; the indices are the index numbers of that file’s embedded sound samples, rather than the samples’ starting locations. That file itself is a set of concatenated sound files with no catalogue info present.