Sunday, July 18, 2021

Smarty and the Nasty Gluttons - part 2 (disk system and file loader)

Preface

Continuing from where we left off in post #1 on Smarty and the Nasty Gluttons...


Smarty and the Nasty Gluttons disk system

Smarty and the Nasty Gluttons has a special disk system and a floppy disk layout that is neither the most efficient nor entirely standard. It is like that because we can ;) The file system is essentially an Amiga FFS that is placed on the floppy disk in such a way that the disk also boots on Amigas with Kickstart 1.2 or earlier (those have no support in the Kickstart ROM for booting from an FFS formatted floppy disk).

The floppy disk bootblock DOS type ("DOS\0") tells Kickstart that the floppy is formatted in OFS (old file system). However, there is a minimal FFS file loader placed in the bootblock that is able to load the second stage FFS loader, which is then used throughout the game. This simple trick allows us to use an FFS formatted disk with any Kickstart.
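
To make the DOS type and bootblock checksum concrete, here is a minimal Python sketch of the kind of "checksum calculation script" one could use when mastering a bootblock by hand. It is not the script used for the game; the function names are made up for illustration. It relies only on the standard Amiga bootblock layout: 1024 bytes, the "DOS" id plus a flags byte at offset 0 (bit 0 set means FFS) and the checksum longword at offset 4.

```python
import struct

BOOTBLOCK_SIZE = 1024  # two 512-byte sectors


def dos_type(block: bytes) -> str:
    """Classify a bootblock by its first four bytes ("DOS" + flags byte)."""
    if block[:3] != b"DOS":
        return "not a DOS bootblock"
    return "FFS" if block[3] & 1 else "OFS"


def bootblock_checksum(block: bytes) -> int:
    """Checksum of a bootblock, with the stored checksum field treated as 0."""
    assert len(block) == BOOTBLOCK_SIZE
    longs = list(struct.unpack(">256L", block))
    longs[1] = 0  # the checksum longword lives at offset 4
    total = 0
    for value in longs:
        total += value
        if total > 0xFFFFFFFF:  # add the carry back in (end-around carry)
            total = (total & 0xFFFFFFFF) + 1
    return ~total & 0xFFFFFFFF


# An otherwise empty bootblock that claims to be OFS, like the game disk does.
block = b"DOS\0" + bytes(BOOTBLOCK_SIZE - 4)
print(dos_type(block), hex(bootblock_checksum(block)))
```

The result would be written back to offset 4 of the bootblock before the disk image is saved.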

The OFS vs. FFS trick has some side effects. AmigaDOS gets really confused when one tries to access files on the floppy disk, and you will get checksum errors for all files (well, the game disk has one hand crafted licence text file that is actually loadable; how this is done is left to the reader as an exercise ;) OFS and FFS are similar enough that reading the floppy disk directory structures etc. still works out of the box.
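
The checksum errors follow from how the two file systems store file data. An OFS data block carries a 24-byte header (including a checksum) followed by 488 bytes of payload, while an FFS data block is 512 bytes of raw payload. The Python sketch below, with hypothetical helper names, builds a well-formed OFS data block and shows that a raw FFS-style block fails the same check. It only illustrates the standard block checksum rule (all 128 longwords sum to zero); it says nothing about how the game disk itself was crafted.

```python
import struct

BLOCK_SIZE = 512
OFS_PAYLOAD = 488  # 512 bytes minus the 24-byte OFS data block header
T_DATA = 8         # block type id of an OFS data block


def block_checksum_ok(block: bytes) -> bool:
    """Standard AmigaDOS rule: the 128 longwords of a block sum to zero."""
    return sum(struct.unpack(">128L", block)) & 0xFFFFFFFF == 0


def make_ofs_data_block(header_key, seq_num, payload, next_data=0):
    """Build an OFS data block: header fields, checksum at offset 20, data at 24."""
    assert len(payload) <= OFS_PAYLOAD
    block = bytearray(BLOCK_SIZE)
    struct.pack_into(">5L", block, 0,
                     T_DATA, header_key, seq_num, len(payload), next_data)
    block[24:24 + len(payload)] = payload
    total = sum(struct.unpack(">128L", block)) & 0xFFFFFFFF
    struct.pack_into(">L", block, 20, -total & 0xFFFFFFFF)
    return bytes(block)


ofs_block = make_ofs_data_block(header_key=882, seq_num=1, payload=b"hello")
ffs_block = b"\x01" * BLOCK_SIZE  # raw file data, no header at all
print(block_checksum_ok(ofs_block), block_checksum_ok(ffs_block))
```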

Using Kickstart 3.1 CLI tools one cannot create a floppy disk that uses every possible sector on the disk. At least one sector has to remain free. Why? I have no idea. In the end the game ADF was mastered using ADFTools, a good old unregistered version of DiskMonTools v3.13 and some checksum calculation scripts. I am very bad at automating steps like this, so the master ADF creation, which was also my task, was very tiresome. This is partly because I do development on REAL AMIGA HARDWARE, and anything involving Python or other fancy stuff found in e.g. FS-UAE (got no Windows machine to run WinUAE) needs shuffling files around between systems. See the internals of my main development machine, which travels with me wherever I go ;-) Unfortunately this A600 has no networking, unlike my other Amigas at home...


The Amiga Fast File System (FFS) file loader

The Smarty and the Nasty Gluttons file loader is able to load both plain data and compressed S405 files from any path on a floppy disk. Compressed files can be decompressed on the fly, which is very memory efficient since there is no need to hold the whole compressed file in RAM. The loader uses a hardware-banging trackloader under the hood, which is capable of reading/writing standard Amiga MFM tracks including the OS header area (16 additional bytes of data per sector). The FFS file loader dates back to the early 1990s.. could be around 1991 or 1993 or so.. However, I did quite a bit of rewriting for Smarty and the Nasty Gluttons during 2020. Honestly, the old and new code bases probably have nothing in common ;)

There are simple optimisations included in the loader, such as caching the last MFM track and the root block sector (at FFS level). Unlike the original Smarty and the Nasty Gluttons trackloader from the 1990s, the new FFS file loader does not support interleaving, i.e. decompressing the previous track on the fly while disk DMA is reading new data. This turned out to be slightly cumbersome to implement as part of the file loader, so I got lazy and gave up ;)

The loader has the following API for loading files:

; Inputs:
;  D0 = drive (0 = DF0, 1 = DF1 etc)
;       (the boxed version of the game reads the active drive
;       number from a specific memory location)
;  D1 = filetype (0=data, 1=S405)
;  A0 = ptr to NUL terminated filename (+ path if other than root)
;  A1 = destination memory address (16 bits aligned)
;
; Returns:
;  D0 = < 0 if error
;       > 0 loaded bytes
;  A0 = first unused destination address after loading

Normal data file loading is uninteresting. However, the on-the-fly decompression has some interesting pieces in it. It is done on a per-sector basis, i.e. a sector of compressed data is loaded and then decompressed. Read post #1 for background on the compressor used.

The kernel of the combined data and compressed file loader is the sector loading loop. In the case of data file loading, the "sector loader function" address points at a plain sector loader function. In the case of S405 compressed file loading, the "sector loader function" address points at the decompression routine: initially at the stc5loader() function and later at the outofdataloader() function.

loaddata:
        ; Get the next sector number to read
        move.w  (a2)+,d0
        ; If zero load more sectors from extension block if any..
        beq.b   extloadloop
        movem.l (fileaddr).w,a1/a3
        ; A1=destination memory
        ; A3=sector loader function
        jsr     (a3)
        tst.l   d0
        bmi.w   diskerror
        move.l  a1,(fileaddr).w
        ; D5=number of bytes still to load..
        tst.l   d5
        beq.b   extloadloop
        bra.b   loaddata
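
For readers who do not speak 68k, the control flow of this loop can be sketched in Python. This is only a model under assumptions: `LoaderState`, `plain_sector_loader` and `checking_first_loader` are hypothetical names, the handler attribute plays the role of the function address in A3, `remaining` plays the role of D5, and the second handler merely checks the "S405" id and swaps itself out (it does not decompress anything).

```python
class LoaderState:
    def __init__(self, handler, remaining):
        self.dest = bytearray()     # destination memory (A1 in the listing)
        self.handler = handler      # "sector loader function" (A3)
        self.remaining = remaining  # bytes still to load (D5)


def plain_sector_loader(state, sector):
    """Copy at most the remaining byte count from one sector."""
    chunk = sector[:state.remaining]
    state.dest += chunk
    state.remaining -= len(chunk)
    return 0


def checking_first_loader(state, sector):
    """First-sector handler: check the file id, then switch handlers."""
    if sector[:4] != b"S405":
        return -1
    state.handler = plain_sector_loader  # stand-in for outofdataloader
    return plain_sector_loader(state, sector[4:])


def load_data(sector_numbers, read_sector, state):
    """Walk the sector list; 0 ends it (the real loader then checks for
    an extension block). Returns loaded bytes or -1 on error."""
    for number in sector_numbers:
        if number == 0:
            break
        if state.handler(state, read_sector(number)) < 0:
            return -1
        if state.remaining == 0:
            break
    return len(state.dest)


disk = {5: b"\x05" * 512, 6: b"\x06" * 512}
state = LoaderState(plain_sector_loader, remaining=700)
print(load_data([5, 6, 0], disk.__getitem__, state))
```

The point of the indirection is that the loop itself never needs to know whether it is loading plain data or feeding a decompressor.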

To enable per-sector decompression of an S405 compressed file, some tweaks had to be made to the decompressor. There is only one function, getB(), that is used to read more bits from the compressed data stream, so we place the check for "new sector needed" in that function.

; Parameters:
;  D1 = num_bits_to_extract
;  D6 = bits left in the bit buffer
;  D7 = bit buffer
;  A0 = ptr to compressed data buffer
;  A6 = end of compressed data buffer
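getB:   cmp.w   d1,d6           ; enough bits left in the buffer?
        bge.b   .getb
        lsl.l   d6,d7           ; no: shift out the remaining bits first
        ; Check if we reached the "end of compressed data" buffer.
        ; A6 is the "end of compressed data" pointer.
        cmp.l   a0,a6
        bhi.b   .ok
        ; Trigger "more data" exception..
        bsr.w   outofdata
.ok:    move.w  (a0)+,d7        ; refill the low word with 16 new bits
        sub.w   d6,d1           ; bits still needed from the new word
        moveq   #16,d6
.getb:  sub.w   d1,d6
        lsl.l   d1,d7
        rts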
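
The bit buffer handling is easier to follow in a high-level model. The Python sketch below mirrors getB() register by register, with a hypothetical `BitReader` class. Two things are assumptions and not taken from the listing: that the caller clears the upper word of D7 before the call and reads the result from it afterwards, and that running out of data simply raises an exception here (the real code calls outofdata() instead).

```python
class BitReader:
    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0        # A0: read position in the compressed data
        self.bits_left = 0  # D6: valid bits left in the low word
        self.buffer = 0     # D7: 32-bit bit buffer

    def get_bits(self, count: int) -> None:
        """Mirror of getB: shift `count` bits from the low word upwards."""
        if self.bits_left < count:
            # Shift out what is left, then refill the low word.
            self.buffer = (self.buffer << self.bits_left) & 0xFFFFFFFF
            if self.pos >= len(self.data):
                raise EOFError("out of compressed data")
            word = int.from_bytes(self.data[self.pos:self.pos + 2], "big")
            self.pos += 2
            self.buffer = (self.buffer & 0xFFFF0000) | word
            count -= self.bits_left
            self.bits_left = 16
        self.bits_left -= count
        self.buffer = (self.buffer << count) & 0xFFFFFFFF

    def read(self, count: int) -> int:
        """Assumed calling convention: clear the upper word, call getB,
        take the extracted bits from the upper word."""
        self.buffer &= 0x0000FFFF
        self.get_bits(count)
        return self.buffer >> 16


reader = BitReader(bytes([0b10110011, 0b01010101, 0xF0, 0x0F]))
print(reader.read(3), reader.read(5), reader.read(12))
```

Note how the third read spans two 16-bit words, which is exactly the path where the "new sector needed" check sits.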

The outofdata() function does the magic so that the decompressor can exit back to the file loader to load a new sector of compressed data and eventually resume the decompressor after the "bsr.w". In the released game the call stack could be 5 subroutines deep.

outofdata:
        ; Store decompressor context
        movem.l d0-d7/a1-a6,(stcctxsave).w ; 14 registers
        ; Calculate used call stack size..
        move.l  (stcstackptr).w,d0
        lea     (stccallstack).w,a2
        sub.l   a7,d0
        add.l   d0,a2
        lsr.w   #2,d0
        subq.w  #1,d0
        move.l  d0,(stcstacksize).w ; size in longs - 1
        ; Save the call stack and at the same time unwind
        ; it to a point that RTS returns to the DOS file loader..
.movestack:
        move.l  (a7)+,-(a2)
        dbf     d0,.movestack
        ; Restore the file loader context and return A2 as the
        ; current ptr to dest memory (the loader expects
        ; to receive this).
        movem.l (loaderregs).w,d5-d7/a2/a4-a6
        moveq   #0,d0
        rts

Later in the sector loading loop the outofdataloader() decompressor gets called to load the actual sector and resume the decompression from the spot it left off last time (see the "sector loader function" address placed in A3 register).

outofdataloader:
        move.l  a7,(stcstackptr).w ; save return stack
        movem.l d5-d7/a2/a4-a6,(loaderregs).w
        ; Load 1 sector of compressed data into the sector buffer
        move.l  (secbuffer).w,a1
        bsr.w   load512
        bmi.b   .diskerror
        ; Restore call stack.. and also
        ; point A0 to loaded sector buffer
        move.l  a1,a0
        move.l  (stcstacksize).w,d0
        lea     (stccallstack).w,a1
.movestack:
        move.l  (a1)+,-(a7)
        dbf     d0,.movestack
        ; Return to the decruncher
        movem.l (stcctxsave).w,d0-d7/a1-a6 ; 14 registers
.diskerror:
        rts

Initially the stc5loader() function is called to load the first sector of the compressed file (see the "sector loader function" address placed in A3 register). 512 bytes is always enough to contain the file ID information and the PRE tree. 

stc5loader:
        movem.l d5-d7/a2/a4-a5,(loaderregs).w
        move.l  a1,-(sp)
        move.l  (secbuffer).w,a1
        bsr.w   load512
        bmi.b   .diskerror
        cmp.l   #"S405",(a1)+
        beq.b   .filetypeok
        moveq   #-1,d0
        move.l  d0,(error).w ; error - not an STC5 file
.diskerror:
        move.l  (sp)+,a1
        rts
.filetypeok:
        lea     outofdataloader(pc),a2
        move.l  a2,(funcaddr).w
        move.l  a1,a0
        move.l  (sp)+,a1
        move.l  a7,(stcstackptr).w
        move.l  (secbuffer).w,a6
        adda.w  #SECBUFFER_SIZE,a6
        move.l  (workstack).w,a2

Most likely all this stack mangling and call flow context switching could be done in a much more elegant way. However, since we were low on available memory, it is handy that the same decompressor can also be used for RAM to RAM decompression! The trick in that case is to load $ffffffff into A6 ("end of compressed data") so that the outofdata() part of the getB() function never triggers.
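
What the stack copying achieves is essentially a hand-made coroutine: the decompressor suspends several subroutine calls deep, the loader fetches a sector, and the decompressor resumes where it stopped. As a rough analogy only, the same flow can be written with Python generators, where `yield from` does the job of saving and restoring the nested call stack. The decoder below is a toy run-length format (count, value pairs, count 0 ends the stream), not S405, and all names are hypothetical.

```python
class Input:
    """Current chunk of compressed data and the read position in it."""
    def __init__(self):
        self.chunk = b""
        self.pos = 0


def next_byte(inp):
    """Return one byte; suspend first if the current chunk is used up."""
    while inp.pos >= len(inp.chunk):
        inp.chunk = yield  # the "outofdata" moment: back to the loader
        inp.pos = 0
    value = inp.chunk[inp.pos]
    inp.pos += 1
    return value


def rle_decoder(inp, out):
    """Toy decoder, one call level above next_byte()."""
    while True:
        count = yield from next_byte(inp)
        if count == 0:
            return
        value = yield from next_byte(inp)
        out.extend([value] * count)


def load_compressed(sectors):
    """Loader side: feed one chunk at a time, resuming the decoder each time."""
    inp, out = Input(), bytearray()
    decoder = rle_decoder(inp, out)
    next(decoder)  # run until the decoder first asks for data
    for sector in sectors:
        try:
            decoder.send(sector)
        except StopIteration:
            break  # decoder saw the end marker
    return bytes(out)


stream = bytes([3, 65, 2, 66, 0])
print(load_compressed([stream[:3], stream[3:]]))  # split in the middle of a pair
print(load_compressed([stream]))                  # "RAM to RAM": never runs dry
```

Handing over the whole stream in one chunk corresponds to the $ffffffff trick: the decoder never reaches the point where it has to suspend.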


Hiscores 

Smarty and the Nasty Gluttons saves hiscores to the disk. However, in the end we did not have a single free sector left on the disk. The solution was to save the hiscores into the sector OS header area (16 bytes per sector) on track 25. This is also an intentional annoyance to emulation folks, as the basic ADF format does not support sector OS headers, and back then no emulator could save the hiscores ;-)
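
Some quick arithmetic shows why a plain ADF has no room for this. The Python sketch below uses the standard double density geometry (80 cylinders, 2 heads, 11 sectors of 512 bytes) and hypothetical helper names. It assumes that "track" means the linear track number (cylinder * 2 + head), which is how a plain ADF orders its data; if cylinder 25 is meant instead, the offset changes accordingly.

```python
SECTOR_SIZE = 512
SECTORS_PER_TRACK = 11
HEADS = 2
CYLINDERS = 80
LABEL_SIZE = 16  # OS header bytes per sector on the MFM track


def adf_size() -> int:
    """A plain ADF stores only the 512 data bytes of every sector."""
    return CYLINDERS * HEADS * SECTORS_PER_TRACK * SECTOR_SIZE


def adf_offset(track: int, sector: int) -> int:
    """Byte offset of a sector's data, track = cylinder * 2 + head."""
    return (track * SECTORS_PER_TRACK + sector) * SECTOR_SIZE


def label_bytes_per_track() -> int:
    """Extra bytes a track carries in its OS header areas."""
    return SECTORS_PER_TRACK * LABEL_SIZE


print(adf_size())               # 901120
print(adf_offset(25, 0))        # 140800
print(label_bytes_per_track())  # 176
```

The image size is fully accounted for by sector data, so the 176 label bytes of a track simply have nowhere to live in the file.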

Btw, track 25 is a hidden geek joke referring to Twin Peaks. Actually, there are quite a few Twin Peaks references/messages hidden in the ADF release of the game disk (hint: bootblock, hiscore, each file on the disk). I wonder if anyone ever found or figured those out ;) I could even give a free boxed copy of the game to the first person who reports all the Twin Peaks references/messages to me.
