I also found my old note describing how the 'engine performance patch' works:
>The memory handling does sound very stupid, though - when Drakan came out, 128MB was a pretty normal spec for most machines.
As you know, Drakan's minimum hardware requirement is 16 MB of RAM... ask Surreal or Psygnosis why that is so...
Look at this subroutine. It is the part of the render code that manages memory:
...
.text:0043C7D0 ; --------------- S U B R O U T I N E ---------------------------------------
.text:0043C7D0
.text:0043C7D0
.text:0043C7D0 sub_43C7D0 proc near ; CODE XREF: sub_446A30+258p
.text:0043C7D0 ; sub_446E70+6B0p
.text:0043C7D0
.text:0043C7D0 arg_0 = dword ptr 10h
.text:0043C7D0 arg_4 = dword ptr 14h
.text:0043C7D0 arg_8 = dword ptr 18h
.text:0043C7D0
.text:0043C7D0 push ebx
.text:0043C7D1 push ebp
.text:0043C7D2 push esi
.text:0043C7D3 mov esi, ecx
.text:0043C7D5 push edi
.text:0043C7D6 mov eax, [esi+1B8h]
.text:0043C7DC mov ecx, [esi+1BCh]
.text:0043C7E2 cmp ecx, eax ; test if allocated memory still enough
.text:0043C7E4 jl short loc_43C848 ; jump if everything still OK for render
.text:0043C7E6 mov ecx, [esi+1B4h]
.text:0043C7EC add eax, 100h ; prepare to allocate more memory
.text:0043C7F1 test ecx, ecx ; test if ReAllocating required
.text:0043C7F3 mov [esi+1B8h], eax
.text:0043C7F9 jz short loc_43C82C ; jump to Allocating
.text:0043C7FB mov edi, ds:GlobalHandle
.text:0043C801 push ecx ; pMem
.text:0043C802 call edi ; GlobalHandle
.text:0043C804 push eax ; hMem
.text:0043C805 call ds:GlobalUnlock
.text:0043C80B mov eax, [esi+1B8h]
.text:0043C811 mov ecx, [esi+1B4h]
.text:0043C817 push 2 ; uFlags
.text:0043C819 lea eax, [eax+eax*4]
.text:0043C81C shl eax, 2
.text:0043C81F push eax ; dwBytes
.text:0043C820 push ecx ; pMem
.text:0043C821 call edi ; GlobalHandle
.text:0043C823 push eax ; hMem
.text:0043C824 call ds:GlobalReAlloc ; DAMN SLOW function :(
.text:0043C82A jmp short loc_43C83B
.text:0043C82C ; ---------------------------------------------------------------------------
.text:0043C82C
.text:0043C82C loc_43C82C: ; CODE XREF: sub_43C7D0+29j
.text:0043C82C lea edx, [eax+eax*4]
.text:0043C82F shl edx, 2
.text:0043C832 push edx ; dwBytes
.text:0043C833 push 0 ; uFlags
.text:0043C835 call ds:GlobalAlloc
.text:0043C83B
.text:0043C83B loc_43C83B: ; CODE XREF: sub_43C7D0+5Aj
.text:0043C83B push eax ; hMem
.text:0043C83C call ds:GlobalLock
.text:0043C842 mov [esi+1B4h], eax
.text:0043C848
.text:0043C848 loc_43C848: ; CODE XREF: sub_43C7D0+14j
.text:0043C848 mov eax, [esi+1BCh]
.text:0043C84E xor ecx, ecx
.text:0043C850 mov edi, [esp+4+arg_4]
.text:0043C854 lea edx, [eax+eax*4]
.text:0043C857 mov eax, [esi+1B4h]
.text:0043C85D lea edx, [eax+edx*4]
.text:0043C860 mov [edx], ecx
.text:0043C862 mov [edx+4], ecx
.text:0043C865 mov [edx+8], ecx
.text:0043C868 mov [edx+0Ch], ecx
.text:0043C86B mov [edx+10h], ecx
.text:0043C86E mov eax, [esi+1BCh]
.text:0043C874 mov edx, [esp+4+arg_8]
.text:0043C878 inc eax
.text:0043C879 mov [esi+1BCh], eax
.text:0043C87F mov ecx, [esi+1B4h]
.text:0043C885 lea eax, [eax+eax*4]
.text:0043C888 lea ebp, [ecx+eax*4-14h]
.text:0043C88C mov [ebp+0], edx
.text:0043C88F mov [ebp+8], edi
.text:0043C892 mov eax, [esi+1C8h]
.text:0043C898 mov [ebp+0Ch], eax
.text:0043C89B mov ecx, [esi+164h]
.text:0043C8A1 mov [ebp+4], ecx
.text:0043C8A4 mov ebx, [esi+1C8h]
.text:0043C8AA mov ecx, [esi+1C4h]
.text:0043C8B0 mov eax, ebx
.text:0043C8B2 lea edx, [eax+edi]
.text:0043C8B5 cmp edx, ecx ; <== test if enough memory
.text:0043C8B7 jle short loc_43C92A ; <== jump if everything still OK for render
.text:0043C8B9 sub eax, ecx
.text:0043C8BB lea eax, [eax+edi-1]
.text:0043C8BF cdq
.text:0043C8C0 and edx, 1FFh ; >
.text:0043C8C6 add eax, edx ; >
.text:0043C8C8 sar eax, 9 ; > < == prepare to allocate more memory
.text:0043C8CB inc eax ; >
.text:0043C8CC shl eax, 9 ; >
.text:0043C8CF add eax, ecx
.text:0043C8D1 mov ecx, [esi+1C0h]
.text:0043C8D7 test ecx, ecx ; < == test if ReAllocating required
.text:0043C8D9 mov [esi+1C4h], eax
.text:0043C8DF jz short loc_43C911 ; < == jump for Allocating
.text:0043C8E1 push ecx ; pMem
.text:0043C8E2 call ds:GlobalHandle
.text:0043C8E8 push eax ; hMem
.text:0043C8E9 call ds:GlobalUnlock
.text:0043C8EF mov eax, [esi+1C4h]
.text:0043C8F5 mov ecx, [esi+1C0h]
.text:0043C8FB shl eax, 5
.text:0043C8FE push 2 ; uFlags
.text:0043C900 push eax ; dwBytes
.text:0043C901 push ecx ; pMem
.text:0043C902 call ds:GlobalHandle
.text:0043C908 push eax ; hMem
.text:0043C909 call ds:GlobalReAlloc ; DAMN SLOW function :(
.text:0043C90F jmp short loc_43C91D
...
I.e. you can patch the byte 09H at offset 3C8CEH in Drakan.exe (the shift count of the shl eax, 9 at 0043C8CC) to 0FH to get about +40% performance on the Paradise level and about +190% on the Atlantis level... but it makes almost no difference on the retail levels, which are low-poly enough... the more POP/POR you have in a frame, the more you gain...
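If you prefer a script to a hex editor, the one-byte change could be applied roughly like this. It is only a sketch: the function name `patch_shift_immediate` is made up for illustration, and the check of the two preceding bytes assumes the usual `C1 E0 ib` encoding of `shl eax, imm8`. The offset and the byte values are the ones given above.

```python
from pathlib import Path

IMM_OFFSET = 0x3C8CE      # file offset of the shift count, as given in the post
OLD_SHIFT = 0x09          # stock value: shl eax, 9
NEW_SHIFT = 0x0F          # patched value: shl eax, 0FH
SHL_EAX_IMM8 = b"\xC1\xE0"  # assumed opcode bytes in front of the shift count


def patch_shift_immediate(path):
    """Change the shift count in place. Returns True if the file was modified,
    False if it already contains the patched value."""
    exe = Path(path)
    data = bytearray(exe.read_bytes())

    # Refuse to touch a file that does not look like the expected build.
    if bytes(data[IMM_OFFSET - 2:IMM_OFFSET]) != SHL_EAX_IMM8:
        raise ValueError("no 'shl eax, imm8' at the expected offset")

    current = data[IMM_OFFSET]
    if current == NEW_SHIFT:
        return False
    if current != OLD_SHIFT:
        raise ValueError(f"unexpected shift count {current:#04x}")

    data[IMM_OFFSET] = NEW_SHIFT
    exe.write_bytes(data)
    return True
```

Call it as `patch_shift_immediate("Drakan.exe")` on a copy of the executable, so the original stays around if the build turns out to be a different one.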
I also recommend patching the memory allocation routine at 3C7ECH. There is no room there for extra instructions, but we can replace the existing instruction with a jump to free space at the end of the .text section; I put it at 78500H.
So the changes in the tweaked version are:
.text:0043C7EC jmp 478500H
.text:0043C8CC shl eax, 0FH
.text:00478500 or eax,100H
.text:00478505 shl eax,05H
.text:00478508 jmp 43C7F1H
You can add it to your patch too.
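For anyone who wants the raw bytes instead of the mnemonics, here is a rough sketch of how the two jumps and the small code block at 478500 could be encoded. The addresses come from the listing above. The helper name `jmp_rel32` and the choice of the short `0D id` form for `or eax, imm32` are my assumptions (that form is 5 bytes, which matches the 478500 to 478505 step in the listing).

```python
import struct

HOOK_ADDR = 0x43C7EC    # where the original 5-byte 'add eax, 100h' sits
CAVE_ADDR = 0x478500    # free space at the end of .text
RESUME_ADDR = 0x43C7F1  # instruction right after the overwritten one


def jmp_rel32(src, dst):
    """Encode a 5-byte near jump (opcode E9) placed at src that lands on dst.
    The displacement is relative to the end of the jump instruction."""
    return b"\xE9" + struct.pack("<i", dst - (src + 5))


# Bytes that replace 'add eax, 100h' at the hook address.
hook_bytes = jmp_rel32(HOOK_ADDR, CAVE_ADDR)

# Bytes for the code cave: the new size computation, then jump back.
cave_bytes = (
    b"\x0D\x00\x01\x00\x00"  # or  eax, 100h
    + b"\xC1\xE0\x05"        # shl eax, 5
)
cave_bytes += jmp_rel32(CAVE_ADDR + len(cave_bytes), RESUME_ADDR)
```

Going by the offsets quoted in the post, `hook_bytes` would go at file offset 3C7ECH and `cave_bytes` at 78500H. The jump over the original instruction is exactly 5 bytes, the same length as the `add eax, 100h` it replaces, so nothing after it shifts.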
I think this is part of the 'dynamic arrays' that the Riot Engine uses to fit on machines with very little RAM. For some design reason it cannot reallocate the exact new amount of memory in one OS call; instead it makes a number of sequential calls, which takes much more time. The point of the patch was to increase the memory allocation step and so reduce the number of calls to the OS reallocation routine. Yes, this results in suboptimal memory usage, but users' PCs now usually have significantly more memory.
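To get a feel for why a bigger step helps, here is a small model of the rounding done at 0043C8B9..0043C8CF. It is a sketch of my reading of the listing, not engine code: the function names and the workload numbers are invented, and it only counts how often the buffer has to grow.

```python
BLOCK_SHIFT = 9  # sar eax, 9 : the shortfall is counted in 512-unit blocks


def new_capacity(capacity, used, count, out_shift):
    """Capacity after a request for `count` more units.
    out_shift is 9 in the stock exe and 15 (0FH) in the patched one."""
    shortfall = used + count - capacity
    if shortfall <= 0:  # the 'jle' case: the buffer is still big enough
        return capacity
    blocks = ((shortfall - 1) >> BLOCK_SHIFT) + 1  # round up to whole blocks
    return capacity + (blocks << out_shift)


def count_grow_calls(total_units, chunk, out_shift):
    """Number of (re)allocations needed when a frame requests `chunk` units
    at a time until `total_units` have been handed out."""
    capacity = used = calls = 0
    while used < total_units:
        grown = new_capacity(capacity, used, chunk, out_shift)
        if grown != capacity:
            calls += 1
            capacity = grown
        used += chunk
    return calls
```

With an invented workload of 65536 units requested 64 at a time, `count_grow_calls(65536, 64, 9)` gives 128 grow calls and `count_grow_calls(65536, 64, 15)` gives 2. Real frames will look different, but the count drops by roughly the same factor of 64 that the step was increased by.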
I also found a backup of my IDA databases, with my comments from the years when I was digging through the Drakan:OOTF binaries:
ftp://fauser:365445@files.drakan.ru/pub ... n_idbs.zip
Maybe it will be useful too. I do not remember which IDA version I used, though.