h'okay, I think I have cracked the decompression algorithm!!
the rest of this post will only be of interest to hardcore bytehackers who want to follow in my filthy footsteps.
so, this compression scheme applies to any file with the .MGL extension, and here is some (Lua-ish) pseudocode to decompress it:
    repeat
        local b = read_byte()
        if (b == 0) or (b == nil) then
            -- zero byte marks the end of the stream
            break
        elseif b >= 0xE0 then
            -- 0xE0..0xFF: three-byte token, long back-reference
            local b2 = read_byte()
            local b3 = read_byte()
            local offset = ((b - 0xE0) * 256) + b2 + 3
            local length = b3 + 5
            reuse_bytes(offset, length)
        elseif b >= 0xC0 then
            -- 0xC0..0xDF: two-byte token; the low 2 bits of b extend the offset,
            -- the next 3 bits give the length
            local b2 = read_byte()
            local offset = ((b % 4) * 256) + b2 + 3
            local length = 4 + math.floor((b - 0xC0) / 4)
            reuse_bytes(offset, length)
        elseif b >= 0x80 then
            -- 0x80..0xBF: one-byte token, 3-byte back-reference
            local offset = (b - 0x80) + 3
            reuse_bytes(offset, 3)
        elseif b >= 0x70 then
            -- 0x70..0x7F: repeat the last two bytes
            local reps = (b - 0x70) + 2
            reuse_bytes(2, 2, reps)
        elseif b >= 0x60 then
            -- 0x60..0x6F: repeat the last byte
            local reps = (b - 0x60) + 3
            reuse_bytes(1, 1, reps)
        elseif b >= 0x50 then
            -- 0x50..0x5F: continue a sequence of 16-bit values
            local length = (b - 0x50) + 2
            uint16LE_pattern_sequence(length)
        elseif b >= 0x40 then
            -- 0x40..0x4F: continue a sequence of bytes
            local length = (b - 0x40) + 3
            byte_pattern_sequence(length)
        else
            -- 0x01..0x3F: run of literal bytes copied from the input
            local length = b
            copy_input_bytes(length)
        end
    until (b == 0) or (b == nil)
this is the core loop. it requires the following functions to be defined first:
- read_byte(): Read 1 byte from the input stream, and return its value as a number. If the end of the stream is reached, return nil.
- copy_input_bytes(length): Take a chunk of length bytes from the input stream, and write it directly to the output stream.
- reuse_bytes(offset, length[, repeats]): Take a chunk of bytes that has previously been written to the output stream, and re-append it to the end. The chunk begins offset bytes back from the end of the stream. If length is greater than offset, repeat the chunk like a pattern. For example, if reuse_bytes(5, 9) is called and the last five bytes written spell "Hello" in ASCII, "HelloHell" should be written to the output stream. If the optional parameter repeats is given, repeat the chunk that number of times.
- byte_pattern_sequence(length): Use the last two bytes written to the output stream to continue with a pattern, writing length more bytes. For example, if the last two byte values were 02 03, the sequence would continue 04 05 06 07...
- uint16LE_pattern_sequence(length): Same as byte_pattern_sequence() but with 16-bit short integers (Little-Endian encoded) instead of bytes. Note that the total number of bytes written will be twice the value of length.
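
For anyone who wants something to paste in, here is a minimal sketch of those five helpers in plain Lua. It works on in-memory buffers rather than real file streams, and several things in it are assumptions rather than facts about the .MGL format: the names input, pos, out, reset and output_string are made up for the example, the "pattern" is treated as an arithmetic sequence whose step is the difference between the last two values (wrapping at 256 or 65536), repeats is taken to mean "write the same chunk that many times", and truncated input is simply ignored. Treat it as a starting point, not a reference implementation.

```lua
-- Sketch only. Hypothetical names: input, pos, out, reset, output_string.
local input, pos = "", 1   -- compressed data (a Lua string) and read cursor
local out = {}             -- decompressed bytes so far, as numbers 0..255

function read_byte()
    local b = input:byte(pos)        -- nil once pos is past the end
    if b then pos = pos + 1 end
    return b
end

function copy_input_bytes(length)
    for _ = 1, length do
        local b = read_byte()
        if b == nil then break end   -- assumption: stop quietly on truncated input
        out[#out + 1] = b
    end
end

function reuse_bytes(offset, length, repeats)
    assert(offset >= 1 and offset <= #out, "offset reaches before start of output")
    local start = #out - offset      -- the chunk begins right after this index
    for _ = 1, repeats or 1 do
        for i = 1, length do
            -- wrap inside the chunk so that length > offset repeats it
            out[#out + 1] = out[start + 1 + (i - 1) % offset]
        end
    end
end

function byte_pattern_sequence(length)
    local last = out[#out]
    local step = last - out[#out - 1]    -- assumption: arithmetic step
    for _ = 1, length do
        last = (last + step) % 256       -- assumption: wrap at one byte
        out[#out + 1] = last
    end
end

function uint16LE_pattern_sequence(length)
    local n = #out
    local prev = out[n - 3] + out[n - 2] * 256   -- low byte first
    local last = out[n - 1] + out[n] * 256
    local step = last - prev                     -- assumption: arithmetic step
    for _ = 1, length do
        last = (last + step) % 65536             -- assumption: wrap at 16 bits
        out[#out + 1] = last % 256               -- low byte
        out[#out + 1] = math.floor(last / 256)   -- high byte
    end
end

-- Start over with new compressed data and an empty output buffer.
function reset(data)
    input, pos, out = data or "", 1, {}
end

-- Return everything written so far as a Lua string.
function output_string()
    local chars = {}
    for i, b in ipairs(out) do chars[i] = string.char(b) end
    return table.concat(chars)
end

-- The "Hello" example from the list above:
reset("Hello")
copy_input_bytes(5)
reuse_bytes(5, 9)
print(output_string())   --> HelloHelloHell
```

With these defined before the core loop, calling reset(data) with the raw file contents, running the loop, and then calling output_string() should give the decompressed data back as a string.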