Completed DAT Offset Finder 1.00

Punkline

Dr. Frankenstack
Joined
May 15, 2015
Messages
423
This DOL module is a research utility designed for MCM.

It can derive a file data offset from a given RAM address, provided that the game has allocated that address to store a loaded DAT file.

The module may be used directly with calls to its included functions, or through an included default global interface:

https://gfycat.com/RareFormalAfricanjacana

The global interface polls the input once per frame using an MCM function that may be assigned to additional execution points via injection mods. I’ll link to any extension codes below.

Note: offsets are displaced by -0x20 to conform with the game’s normal relocation syntax. To find the true file data offset, add 0x20 to the relocation offset.
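
As a worked example of the note above, here is a minimal Python sketch of the conversion. The function names are hypothetical and only illustrate the arithmetic; the 0x20 displacement is the one described in the note.

```python
HEADER_SIZE = 0x20  # displacement between file start and relocation base


def reloc_to_file_offset(reloc_offset: int) -> int:
    """Convert a relocation offset (as output by the module) to a file offset."""
    return reloc_offset + HEADER_SIZE


def file_to_reloc_offset(file_offset: int) -> int:
    """Convert a file offset to a relocation offset."""
    return file_offset - HEADER_SIZE


print(hex(reloc_to_file_offset(0x59828)))
```

For instance, a relocation offset of 0x59828 corresponds to file offset 0x59848.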

---

Update - 1/9/2018
- path string buffer now properly cleans garbage characters on new inputs
- added check for null relocation base address, prevents false positives
- added check for null input, prevents false positives
- <datInquire> function can now be assigned an I/O interface besides the default global one



Code - DAT Offset Finder 1.00
uses mytoc block 339

Code:
-==-

DAT File Offset Finder 1.0
When <datInquireGlobal> is executed, any address written to -0x1F84(rtoc) (804DDA5C) will be tested against each active dat file allocation.
Default update hook is once per frame; may be inserted in other functions.

The following information is output in a table accessible from -0x1F88(rtoc) (804DDA58)
0x00 - (previous input)
0x04 - DAT relocation info table pointer (0x44)
0x08 - DAT allocation index element pointer (0x1C)
0x0C - Relocation base address (+0x20 from file start)
0x10 - Relocation offset (of equivalent input address)
0x14 - Size of path string
0x18 - 0x40 byte path string buffer (ascii)
[Punkline]
1.02 ----- 0x802F56C0 --- C842E078 -> C8428000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x802F5590 --- C842E078 -> C8428000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x802F5478 --- C822E078 -> C8228000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x802F5348 --- C822E078 -> C8228000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x802F52A4 --- C822E078 -> C8228000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x802F5200 --- C822E078 -> C8228000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x802F515C --- C822E078 -> C8228000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x802F50B8 --- C822E078 -> C8228000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x802F4CF0 --- C822E078 -> C8228000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x804DDA58 --- 4330000000000000 -> 0000000000000000  # <- replace 0's to create default value for -0x1F88(rtoc)

1.02 ----- 0x8037e29c --- d0037218 -> Branch
D0037218
bl <datInquireGlobal>
00000000

<get_DAT_Inquiry_data> All
4E800021 00000000
00000000 00000000
00000000 00000000
00000000 00000000
00000000 00000000
00000000 00000000
00000000 00000000
00000000 00000000
00000000 00000000
00000000 00000000
00000000 00000000
00000000

<datInquireGlobal> 1.02
9001FF60 7C0802A6
90010004 9421FF50
BC61001C 7C0902A6
90010014 7C000026
90010018
bl <get_DAT_Inquiry_data>
8062E07C 7C8802A6
9082E078
bl <datInquire>
80010018 7C0FF120
80010014 7C0903A6
B861001C 382100B0
80010004 7C0803A6
8001FF60 4E800020

<datInquire> All
7C0802A6 90010004
9421FFD0 BF810010
7C7F1B78 7C9E2378
3FA08043 63BD2124
839E0000 7C1CF800
41820068 93FE0000
7FE3FB78
bl <datIdentify>
2C040000 41820048
387E0008 7C8365AA
80640014 80630004
907E0004 A8640006
808DBBE0
bl <findNthString>
387E0018 90BE0014
bl 0x800031f4
8083FFFC 7C632214
38A00040 7C842850
bl 0x8000c160
48000010 387E0004
38800054
bl 0x8000c160
BB810010 38210030
80010004 7C0803A6
4E800020

<datIdentify> All
7C0802A6 90010004
9421FFE0 BFC10010
7C7E1B79 41820078
3FE08043 63FF2124
38000050 7C0903A6
A87F0008 2C03270F
40A20054 A87F0006
2C03FFFF 40810048
809F0014 2C040000
4182003C 80840004
80640040 2C030000
4182002C 7C1E1800
41A00024 80A40000
7C651A14 7C1E1800
41A10014 80A40020
7FE4FB78 7CC5F050
48000010 3BFF001C
4200FFA0 38800000
7FC3F378 BBC10010
38210020 80010004
7C0803A6 4E800020


<findNthString> All
7C862378 38E6FFFF
39000000 38870001
38A00000 8C070001
38A50001 2C000000
4082FFF4 39080001
7C081800 4180FFE0
4E800020
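
To double-check the comments on the static overwrites above, the patched words can be decoded offline. This is a rough Python sketch, not part of the module: it only handles the few D-form instructions listed in its table, and the helper name is made up for illustration. Note that rtoc is r2.

```python
# Opcode numbers follow the PowerPC D-form encoding; only the
# instructions needed for the patches above are listed here.
D_FORM = {32: ("lwz", "r"), 36: ("stw", "r"), 50: ("lfd", "f"), 52: ("stfs", "f")}


def disasm_d_form(word: int) -> str:
    """Decode one 32-bit D-form instruction word into readable text."""
    opcode = word >> 26
    if opcode not in D_FORM:
        raise ValueError(f"unsupported opcode {opcode}")
    mnemonic, reg_prefix = D_FORM[opcode]
    rt = (word >> 21) & 0x1F          # target/source register
    ra = (word >> 16) & 0x1F          # base register
    disp = word & 0xFFFF              # 16-bit signed displacement
    if disp & 0x8000:
        disp -= 0x10000
    sign = "-" if disp < 0 else ""
    return f"{mnemonic} {reg_prefix}{rt}, {sign}0x{abs(disp):X}(r{ra})"


for w in (0xC842E078, 0xC8428000, 0xD0037218, 0x9062E07C):
    print(f"{w:08X}  {disasm_d_form(w)}")
```

The same arithmetic gives the rtoc value implied by the header: if -0x1F88(rtoc) is 804DDA58, then rtoc is 804DDA58 + 0x1F88 = 804DF9E0.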

ASM + Notes:
edit: 1/2/2018
- reformatted bad paste of offset notes

ASM
Code:
-==-

ASM - DAT File Offset Finder 1.0
When <datInquireGlobal> is executed, any address written to -0x1F84(rtoc) (804DDA5C) will be tested against each active dat file allocation.
Default update hook is once per frame; may be inserted in other functions.

The following information is output in a table accessible from -0x1F88(rtoc) (804DDA58)
0x00 - (previous input)
0x04 - DAT relocation info table pointer (0x44)
0x08 - DAT allocation index element pointer (0x1C)
0x0C - Relocation base address (+0x20 from file start)
0x10 - Relocation offset (of equivalent input address)
0x14 - Size of path string
0x18 - 0x40 byte path string buffer (ascii)
[Punkline]
1.02 ----- 0x802F56C0 --- C842E078 -> C8428000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x802F5590 --- C842E078 -> C8428000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x802F5478 --- C822E078 -> C8228000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x802F5348 --- C822E078 -> C8228000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x802F52A4 --- C822E078 -> C8228000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x802F5200 --- C822E078 -> C8228000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x802F515C --- C822E078 -> C8228000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x802F50B8 --- C822E078 -> C8228000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x802F4CF0 --- C822E078 -> C8228000  # lfd -0x1F88(rtoc) becomes lfd -0x8000(rtoc) -- both use 4330000000000000.
1.02 ----- 0x804DDA58 --- 4330000000000000 -> 0000000000000000  # <- replace 0's to create default value for -0x1F88(rtoc)

1.02 ----- 0x8037e29c --- d0037218 -> Branch
# stfs    f0, 0x7218 (r3)
# returning from frame timer update (1 per frame)
stfs    f0, 0x7218 (r3) # original instruction

bl <datInquireGlobal>
.long 0



<get_DAT_Inquiry_data> All
blrl
.zero 0x58  # 0x58 byte buffer for output data

<datInquireGlobal> 1.02
# this version of DAT inquire uses global params instead
# of direct arguments.

# the stack frame is used to safely store context so that
# it may be used from virtually any bl-safe location.

stw   r0, -0xA0(sp) # save r0 below the new frame; this is 0x10(sp) after stwu
mflr  r0
stw   r0, 0x4(sp)   # save lr
stwu  sp, -0xB0(sp) # activation record
stmw  r3, 0x1C(sp)  # 0x1C(sp) = start of r3-r31
mfctr r0
stw   r0, 0x14(sp)  # 0x14(sp) = ctr
mfcr  r0
stw   r0, 0x18(sp)  # 0x18(sp) = cr

bl <get_DAT_Inquiry_data>
lwz   r3, -0x1F84(rtoc) # r3 = input to check
mflr  r4                # r4 = output buffer start
stw   r4, -0x1F88(rtoc) # store global pointer
bl <datInquire>

lwz   r0, 0x18(sp)  # restore cr, ctr, GPRs, lr, and r0
mtcr  r0
lwz   r0, 0x14(sp)
mtctr r0
lmw   r3, 0x1C(sp)
addi  sp, sp, 0xB0
lwz   r0, 0x4(sp)
mtlr  r0
lwz   r0, -0xA0(sp)
blr


<datInquire> All
# r3 = input to read
# r4 = output buffer start address
# routine checks input for a matching dat alloc element
# if found, pathname string buffer is refreshed using entrynum
.set allocIndexAddr, 0x80432124
.set xPrev, 0       # records previous input address
.set xReloc, 4      # points to relocation info table
.set xAlloc, 8      # points to allocation index element
.set xBase, 0xC     # records reloc base
.set xOffset, 0x10  # records reloc offset
.set xStrSize, 0x14 # records string size
.set xStr, 0x18     # copies string
.set xStrMax, 0x40  # maximum string buffer size
.set rIn, 31
.set rOut, 30

.set xEntrynum, 6
.set xStatus, 8
.set xRelocLink, 0x14
.set xLinkToThisAlloc, 4
.set xFileSize, 0
.set xRelocBase, 0x20
.set xFileStart, 0x40

mflr r0
stw  r0, 0x4(sp)
stwu sp, -0x30(sp)
stmw r28, 0x10(sp)

mr rIn, r3
mr rOut, r4

lis r29, allocIndexAddr @h
ori r29, r29, allocIndexAddr @l  # r29 = Dat Alloc index start

lwz r28, xPrev(rOut)  # check prev input address
cmpw r28, rIn        # if it hasn't changed,
beq _return          # then don't run code

stw rIn, xPrev(rOut)

mr r3, rIn
bl <datIdentify>     # else, identify input
cmpwi r4, 0          # does it match a loaded dat file?
beq- _bad_match
_good_match:         # if match
# r4 = alloc element
# r5 = reloc base
# r6 = reloc offset
addi r3, rOut, xAlloc
stswi r4, r3, 0xC       # store alloc table, reloc base, reloc offset
lwz   r3, xRelocLink(r4)
lwz   r3, xLinkToThisAlloc(r3)
stw   r3, xReloc(rOut)  # store reloc table

lha   r3, xEntrynum(r4) # get entrynum ID
lwz   r4, -0x4420(r13)  # get entrynum index
bl <findNthString>

addi r3, rOut, xStr
stw  r5, xStrSize(rOut)
# r3 = string buffer
# r4 = string address
# r5 = string length in bytes
bl 0x800031f4  # memcpy
lwz r4, -0x4(r3)  # load string size again
add r3, r3, r4    # r3 = point to end of string
li  r5, xStrMax
sub r4, r5, r4    # r4 = remaining buffer length
bl 0x8000c160  # zero out output region
b _return

_bad_match:    # if no match
addi r3, rOut, 4
li   r4, 0x54
bl 0x8000c160  # zero out output region

_return:
lmw  r28, 0x10(sp)
addi sp, sp, 0x30
lwz  r0, 0x4(sp)
mtlr r0
blr



<datIdentify> All
# r3 = any address
# checks if input address is in range of any active dat file allocations
# returns:
# r3 = unchanged
# r4 = DAT alloc element (or null if no match)
# r5 = DAT file reloc base
# r6 = DAT file reloc offset
.set allocIndexAddr, 0x80432124
.set xEntrynum, 6
.set xStatus, 8
.set xRelocLink, 0x14
.set xLinkToThisAlloc, 4
.set xFileSize, 0
.set xRelocBase, 0x20
.set xFileStart, 0x40
mflr r0
stw  r0, 0x4(sp)
stwu sp, -0x20(sp)
stmw r30, 0x10(sp)
mr. r30, r3                   # r30 = input query
beq- _no_match
lis r31, allocIndexAddr @h
ori r31, r31, allocIndexAddr @l  # r31 = loop index address
li r0, 80      # for each of 80 allocation slots...
mtctr r0
_loop:
lha r3, xStatus(r31)
cmpwi r3, 9999          # if alloc status == 9999
bne+ _incr_loop
lha r3, xEntrynum(r31)
cmpwi r3, -1
ble- _incr_loop         # and entrynum is valid
_check_in_range:
lwz r4, xRelocLink(r31)
cmpwi r4, 0
beq- _incr_loop
lwz r4, xLinkToThisAlloc(r4)
lwz r3, xFileStart(r4)
cmpwi r3, 0
beq- _incr_loop         # and file start pointer isn't null
cmpw r30, r3            # and query >= file start
blt+ _incr_loop
lwz r5, xFileSize(r4)
add r3, r5, r3
cmpw r30, r3            # and query <= file start + file size
bgt+ _incr_loop
_match:
lwz r5, xRelocBase(r4)  # r5 = relocation base address
mr  r4, r31             # r4 = alloc index element
sub r6, r30, r5         # r6 = relocation offset
b _return
_incr_loop:
addi r31, r31, 0x1C
bdnz+ _loop
_no_match:
li r4,0
_return:
mr r3, r30              # r3 = (unchanged query)
lmw  r30, 0x10(sp)
addi sp, sp, 0x20
lwz r0, 0x4(sp)
mtlr r0
blr


<findNthString> All
# r3 = string ID
# r4 = start of string array
# strings must be delimited by single null bytes 00

# returns
# r3 = String ID
# r4 = Address of string
# r5 = String Length
# r6 = start of string array
mr    r6, r4
addi  r7, r6, -1  # r7 = update loop index addr
li    r8, 0       # r8 = null counter

_reset:
addi  r4, r7, 1   # set start of string to next in array
li    r5, 0       # reset str length counter

_loop:
lbzu  r0, 1(r7)   # load next byte
addi  r5, r5, 1   # incr length by 1
cmpwi r0, 0       # is byte null?
bne+ _loop        # if not, iterate loop

_null:
addi  r8, r8, 1   # if byte is null, increment null counter
cmpw  r8, r3      # is null counter == ID?
blt+ _reset       # if not, reset length and iterate loop

_return:
blr
---
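
For anyone who wants to reason about <findNthString> outside the game, here is a small Python model of the same loop. It is only a sketch: the function name and the `bytes` input are stand-ins for the RAM string array. It mirrors two details of the ASM: the returned length counts the null terminator, and an ID of 0 behaves like an ID of 1 because the null counter is compared only after the first string has been scanned.

```python
def find_nth_string(data: bytes, string_id: int, start: int = 0):
    """Return (offset, length) of the string selected by string_id.

    Strings are delimited by single null bytes. The length includes
    the terminator, just like r5 in the ASM routine.
    """
    pos = start
    nulls = 0
    while True:
        str_start = pos
        end = data.index(0, pos)      # ValueError if no terminator is left
        length = end - str_start + 1  # count the null byte as well
        pos = end + 1
        nulls += 1
        if nulls >= string_id:        # ASM: blt+ _reset while counter < ID
            return str_start, length


names = b"GrGr.dat\x00ItCo.dat\x00PlCl.dat\x00"
print(find_nth_string(names, 2))
```

Unlike the ASM, which keeps reading memory until it finds enough nulls, this model raises an error when the array runs out.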

Notes

The game stores information about where a DAT file is located in an index located at 804318B8.

These elements are small links that create allocation chains representing groups of files loaded together:

DAT allocation chains
Code:
DAT allocation chains

P    0x0    pointer to the next allocation link in this chain
              null (0) at the end of a group allocation
P    0x4    pointer to start of this allocation
              consists largely of loaded DAT files and their corresponding
              relocation info tables
W    0x8    length of allocation for loaded file
              this is padded to account for a 0x20 byte alignment,
              and possibly some header info
---
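
Following a chain like this from a RAM dump could look roughly like the Python sketch below. It assumes only the three fields listed above (12 bytes per link) and a hypothetical `read_mem(address, size)` callback that returns big-endian bytes from your dump; the true size of a link is not known here.

```python
import struct


def walk_chain(read_mem, link_addr: int, limit: int = 256):
    """Follow allocation links until a null 'next' pointer is reached.

    read_mem(address, size) is a caller-supplied function (hypothetical)
    that returns raw bytes from a memory dump.
    Returns a list of (allocation_start, allocation_length) tuples.
    """
    found = []
    while link_addr != 0 and len(found) < limit:  # limit guards against loops
        nxt, start, length = struct.unpack(">III", read_mem(link_addr, 12))
        found.append((start, length))
        link_addr = nxt
    return found
```

The `limit` argument is only a safety net in case a dump contains a circular or corrupted chain.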

Another structure starting at 80432078 creates an index of 80 allocation slots that seem to be related to DVD hardware interrupts:

DAT Allocation Index
Code:
DAT Allocation Index

80432078    0x000    unk, start of global index container
8043207C    0x004    unk index of "Old" interrupt elements
                       this index of data is overwritten with the "New"
                       version of the index.
  ...
804320C0    0x048    unk index of "New" interrupt elements
                       this data is copied over to the "Old" index.
  ...
80432104    0x08C    unk data
  ...
80432124    0x0AC    HSD Allocation Index Element slot 1
                       game functions use (base of DAT Alloc index) + 0xAC
                       to access each slot, incrementing the index by 0x1C.
80432140    0x0C8    HSD Allocation Index Element slot 2
8043215C    0x0E4    HSD Allocation Index Element slot 3
  ...
804329C8    0x950    HSD Allocation Index Element slot 80
                       there is a strict limit of 80 slots.
---
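
The slot addresses in the table above follow directly from the base, the 0xAC offset, and the 0x1C stride. A quick Python sketch of that arithmetic (the function name is just for illustration, and slots are numbered from 1 as in the table):

```python
INDEX_BASE = 0x80432078   # start of global index container
SLOTS_OFFSET = 0xAC       # offset of slot 1 from the base
SLOT_SIZE = 0x1C          # size of one allocation index element
SLOT_COUNT = 80           # strict limit of 80 slots


def slot_address(slot: int) -> int:
    """Return the RAM address of allocation slot 1..80."""
    if not 1 <= slot <= SLOT_COUNT:
        raise ValueError("slot must be between 1 and 80")
    return INDEX_BASE + SLOTS_OFFSET + (slot - 1) * SLOT_SIZE


print(f"{slot_address(80):08X}")
```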

The 80 index elements, or "allocation slots", are structured like so:

DAT Allocation Index Element
Code:
DAT Allocation Index Element

b    0x00    unk -- written to conditionally by using an immediate value
b    0x01    unk -- passed as argument r3 in calls to 80017740
b    0x02    unk -- passed as argument r5 (?)
b    0x03    unk
b    0x04    unk -- passed as argument r10 (?)
b    0x05    unk
s    0x06    DVD Entrynum of pathname string for file
s    0x08    status -- 9999 (0x270F) for ready
               used to represent some kind of a status for loading things
               from DVD.
s    0x0A    unk
W    0x0C    file size
P    0x10    to HSD Allocation Link for File
               follow link pointer to go directly to file allocation.
P    0x14    to HSD Allocation Link for Relocation Info
               or use the relocation info table to use pointers to
               various sections of the file.
?    0x18    unk
---
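
If you are reading these slots out of a RAM dump, the layout above unpacks cleanly as big-endian data. The sketch below is an assumption-laden Python illustration: the field names are made up, the unknown fields are kept as raw values, and `is_active` simply repeats the checks that <datIdentify> makes before it follows the relocation link.

```python
import struct
from typing import NamedTuple

ELEMENT = struct.Struct(">6B3h4I")  # 6 bytes + 3 shorts + 4 words = 0x1C


class AllocElement(NamedTuple):
    unk_bytes: tuple   # 0x00-0x05, unknown
    entrynum: int      # 0x06, DVD entrynum (signed short)
    status: int        # 0x08, 9999 when ready
    unk_0a: int        # 0x0A, unknown
    file_size: int     # 0x0C
    file_link: int     # 0x10, allocation link for the file
    reloc_link: int    # 0x14, allocation link for relocation info
    unk_18: int        # 0x18, unknown


def parse_element(raw: bytes) -> AllocElement:
    """Unpack one 0x1C byte allocation index element."""
    fields = ELEMENT.unpack(raw)
    return AllocElement(fields[:6], *fields[6:])


def is_active(element: AllocElement) -> bool:
    """Same conditions datIdentify tests before reading the reloc table."""
    return (element.status == 9999
            and element.entrynum > -1
            and element.reloc_link != 0)
```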

Each DAT file is given an additional 0x44 byte allocation. These allocations each create a table that unpacks the DAT file header in a way that helps resolve symbol names when loading data from a file. These may be accessed from an allocation slot in the allocation index, or from the allocation chain link belonging to the associated file:

DAT Relocation Info Table
Code:
DAT Relocation Info Table
these are allocated alongside files dynamically,
but are accessed from the above static structures.

# first 0x20 bytes = string copied directly from the header of a DAT file
W    0x00    File Size
W    0x04    data block size
W    0x08    relocation table count
W    0x0C    root count
W    0x10    root count 2
W    0x14    unk
W    0x18    unk
W    0x1C    unk

P    0x20    to Reloc Base Address
               file start + 0x20
P    0x24    to Reloc
               pointer table in file relocation table
P    0x28    to root node index 1
P    0x2C    to root node index 2
P    0x30    to Symbol Strings
               used by functions to find data sections within data body.
?    0x34    to unk1
?    0x38    to unk2
?    0x3C    to unk3

P    0x40    to File Data Start
               file header start (as opposed to relocation base address)

---
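
To show how <datIdentify> uses this table, here is a rough Python equivalent of its range check for a single file. It treats the 0x44 byte table as 17 big-endian words and reads only the three fields the ASM reads (0x00, 0x20 and 0x40). The function name is hypothetical.

```python
import struct

RELOC_INFO = struct.Struct(">17I")  # 0x44 bytes, viewed as 17 words


def reloc_offset_for(query: int, table: bytes):
    """Return the relocation offset of query within this file, or None.

    Mirrors datIdentify: the query must lie between the file start and
    file start + file size (both ends inclusive), and the result is
    measured from the relocation base (file start + 0x20).
    """
    words = RELOC_INFO.unpack(table)
    file_size = words[0]     # 0x00
    reloc_base = words[8]    # 0x20
    file_start = words[16]   # 0x40
    if query == 0 or file_start == 0:
        return None
    if not file_start <= query <= file_start + file_size:
        return None
    return query - reloc_base
```

One consequence of measuring from the relocation base is that an address inside the 0x20 byte file header yields a negative offset.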

Extensions

Projectile Hitbox JObj Finder - Use projectile hitbox assignments to track down bone descriptions in file data

---

Suggested usage

- Install code with @DRGN's latest version of Melee Code Manager

- Download @aldelaro5's latest version of Memory Engine
- Download this watchlist to read the output buffer from Memory Engine

- Alternatively, use the following navigation information in an injection code:
Code:
global 32-bit Input Field = -0x1F84(rtoc) (804DDA5C)

global Output Table pointer = -0x1F88(rtoc) (804DDA58)

Output Table Structure:
0x00 - (previous input)
0x04 - DAT relocation info table pointer (0x44)
0x08 - DAT allocation index element pointer (0x1C)
0x0C - Relocation base address (+0x20 from file start)
0x10 - Relocation offset (of equivalent input address)
0x14 - Size of path string
0x18 - 0x40 byte path string buffer (ascii)
---
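
If you dump the 0x58 byte output table to a file instead of using a watchlist, it can be unpacked with a few lines of Python. This is only a sketch under the layout listed above; the class and function names are invented for the example.

```python
import struct
from typing import NamedTuple

OUTPUT_TABLE = struct.Struct(">6I64s")  # 0x18 bytes of words + 0x40 byte string


class DatInquiry(NamedTuple):
    prev_input: int      # 0x00
    reloc_table: int     # 0x04
    alloc_element: int   # 0x08
    reloc_base: int      # 0x0C
    reloc_offset: int    # 0x10
    path_size: int       # 0x14
    path: str            # 0x18, ascii, cut at the first null byte


def parse_output_table(raw: bytes) -> DatInquiry:
    """Unpack the 0x58 byte output buffer written by datInquire."""
    *words, buf = OUTPUT_TABLE.unpack(raw)
    path = buf.split(b"\x00", 1)[0].decode("ascii", errors="replace")
    return DatInquiry(*words, path)
```

Remember that `reloc_offset` is relative to the relocation base, so add 0x20 to get the file offset.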

The following functions are included with the module:

<datInquireGlobal>
# Execute this to read the global input field, and update the output buffer accordingly
# no arguments or return values


<datInquire>
# r3 = input to read
# r4 = output buffer start address
# no return values


<datIdentify>
# r3 = any address
# checks if input address is in range of any active dat file allocations
# returns:
# r3 = unchanged
# r4 = DAT alloc element (or null if no match)
# r5 = DAT file reloc base
# r6 = DAT file reloc offset
 

Punkline

Dr. Frankenstack
Joined
May 15, 2015
Messages
423
Here's an example extension code that targets projectile JObj locations in a file.

Install this code alongside the main code in the first post:

Code - Projectile JObj Finder
Code:
-==-
Projectile Hitbox JObj Finder
Whenever a hitbox is assigned to a projectile or stage-owned JObj,
its file data offset will be recorded in the DAT Inquiry output buffer.
[Punkline]
1.02 ----- 802791A8 ---    901D0048 -> Branch
901D0048 807D0048
80630084 9062E07C
bl <datInquireGlobal>
00000000



With this extension, any projectile that specifies a valid “bone ID” in its hitbox will have its JObjDesc address automatically written to the global input field, and the resulting file data offset will appear in the output buffer. This lets us locate the associated joints in the skeleton simply by spawning hitboxes in-game:

https://gfycat.com/ReliableKeyGrub

From the above gif, we can monitor the following joints being output as relocation offsets:

Code:
Offset       File           Description
0x00059828 - GrGr.dat     : Apple hitbox joint
0x0008A8C8 - ItCo.dat     : Ray Gun laser hitbox joint
0x00046048 - ItCo.dat     : Ray Gun thrown item hitbox joint
0x00017948 - PlCl.dat     : Young Link’s Boomerang hitbox joint
0x0001D108 - PlCl.dat     : Young Link’s Fire Arrow hitbox joint
0x000038E8 - PlKbCpCl.dat : Kirby’s Fire Arrow hitbox joint
For more information about the JObjDesc format, see the Melee Dat Format thread
 