This one was not easy to find.
The instructions are below. For those who can wait, look at
http://m0001.gamecopyworld.com/games/pc_sw_rogue_squadron.shtml
in a few days ...
Problem:
--------
A lot of people seem to have this problem on Windows XP with modern video cards.
After every mission, whether completed successfully or cancelled, and after the medal panel,
the game crashes to the desktop. To progress, one therefore has to restart RS after every
mission. Everything else runs fine, which is what makes it puzzling.
Compatibility mode, the 1.2 patch ... none of it helps.
The game is reported to work with some old display drivers, but most recent cards
(GeForce 5, 7, 8, ...) are not supported by these old drivers, which will not even install.
The facts:
----------
After the medal panel, the 1.2 patched game crashes at instruction 0x004EA2B0 (game
patched to 1.2 with ROGUE12.EXE of 1 316 556 bytes).
It is trying to access a memory region which doesn't exist, i.e. which is not physically
allocated and mapped in the process space.
Short analysis:
---------------
Going back a little in the code and in the stack, it can be seen that the code is at
that moment copying video data from one place to another, as 480 lines of 640 pixels
of 4 bytes (= 32-bit color ?).
The destination data seems to be twice the size of the source data, i.e. the copy converts
from 2-byte pixels (16-bit color ?) to 4-byte pixels (32-bit color ?).
This can be seen from the increments applied after each line copy: 0x500 at the source
(= 640 * 2 = 0x280 * 2) and 0xA00 at the destination (= 640 * 4 = 0x280 * 4).
At the moment the process breaks, 0x46 or 0x45 lines are yet to be copied, to a
buffer which appears to have a size of 0x100400 bytes = a little more than 1 MB.
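The figures above can be cross-checked with a little arithmetic. The sketch below is
only a sanity check: it assumes the copy starts at the very beginning of the
0x100400-byte region, and the variable names are mine, not the game's.

```python
# Values quoted in the analysis above.
SRC_STRIDE = 0x500       # bytes added to the source pointer after each line
DST_STRIDE = 0xA00       # bytes added to the destination pointer after each line
WIDTH, HEIGHT = 640, 480
BUFFER_SIZE = 0x100400   # apparent size of the destination region

src_bytes_per_pixel = SRC_STRIDE // WIDTH   # 2 bytes per pixel at the source
dst_bytes_per_pixel = DST_STRIDE // WIDTH   # 4 bytes per pixel at the destination

# Assumption: the copy begins at the first byte of the region.
lines_that_fit = BUFFER_SIZE // DST_STRIDE  # whole destination lines inside the region
lines_left_over = HEIGHT - lines_that_fit   # lines that no longer fit

print(src_bytes_per_pixel, dst_bytes_per_pixel)
print(lines_that_fit, hex(lines_left_over))
```

Under that assumption, exactly 410 destination lines fit and 0x46 lines remain, which
agrees with the number of lines still to be copied when the crash occurs.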
Ways to cure the problem:
-------------------------
Given that this product is no longer supported, there is no other choice than
to dig into it and try to patch it by hand, directly in the ASM.
There are at first 2 possible ways to go about that, plus whatever else turns up:
1) Based on the fact that this could be due to the color expanding, avoid the problem
by having increments of 0x500 inside the destination buffer instead of 0xA00.
2) Find the place where the buffer is allocated, and patch the allocation to include
more memory in order to avoid the buffer overrun.
3) Something else, not obvious now, to be determined while digging into the code.
Method 1:
---------
This starts by trying to find where the 0xA00 value is coming from.
After some tracing back and analysis, using execution and data write breakpoints as needed,
the location which holds this value is: .data:0x708594
More digging reveals that the value is written to this variable from information traced
back to a call to ddraw.dll at: .text:0x57C3DD
.text:0057C3C7 push 0
.text:0057C3C9 lea edx, [esp+80h+var_74]
.text:0057C3CD push 1
.text:0057C3CF mov [esp+84h+var_74], 7Ch
.text:0057C3D7 mov ecx, [eax]
.text:0057C3D9 push edx
.text:0057C3DA push 0
.text:0057C3DC push eax
-> .text:0057C3DD call dword ptr [ecx+64h]
.text:0057C3E0 test eax, eax
.text:0057C3E2 jnz loc_57C4A8
This call fills in some data on the stack, which is then interpreted by the code
following it to decide whether to store what is returned by this call to ddraw.dll (1),
or the result of a default calculation which gives back ... 0x500 (2)
The test is done at: .text:0x57C45C
.text:0057C45C loc_57C45C: ; CODE XREF: sub_56EF80+D4D2j
-> .text:0057C45C test [esp+90h+var_84], 1FF9EEh
0) .text:0057C464 jz short loc_57C46F
1) .text:0057C466 mov ecx, [esp+90h+var_78]
.text:0057C46A mov [esi+8], ecx
.text:0057C46D jmp short loc_57C489
.text:0057C46F ; ---------------------------------------------------------------------------
.text:0057C46F
.text:0057C46F loc_57C46F: ; CODE XREF: sub_56EF80+D4E4j
2) .text:0057C46F xor eax, eax
.text:0057C471 xor edx, edx
.text:0057C473 mov ax, [esi+4]
.text:0057C477 mov dl, [esi+10h]
.text:0057C47A imul eax, edx
.text:0057C47D cdq
.text:0057C47E and edx, 7
.text:0057C481 add eax, edx
.text:0057C483 sar eax, 3
.text:0057C486 mov [esi+8], eax
.text:0057C489
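The default calculation at 2) can be mirrored in a few lines of Python. This is a
sketch only: reading [esi+4] as a width in pixels and [esi+10h] as a bit depth is my
assumption (the listing does not name these fields), and the function name is invented.
The cdq / and edx,7 / add / sar eax,3 sequence itself is the usual signed division by 8.

```python
def default_stride(width: int, bits_per_pixel: int) -> int:
    """Mirror of path 2): (width * bits_per_pixel) / 8, rounded toward zero.

    width          -> the 16-bit value read from [esi+4]   (assumed meaning)
    bits_per_pixel -> the 8-bit value read from [esi+10h]  (assumed meaning)
    """
    product = (width & 0xFFFF) * (bits_per_pixel & 0xFF)  # mov ax / mov dl / imul eax, edx
    bias = 7 if product < 0 else 0                        # cdq ; and edx, 7
    return (product + bias) >> 3                          # add eax, edx ; sar eax, 3


print(hex(default_stride(640, 16)))
print(hex(default_stride(640, 32)))
```

With 640 and 16 this gives 0x500, the value the default path is observed to return;
with 640 and 32 it would give 0xA00, the value that comes back from ddraw.dll.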
At this stage, the idea is to always force the path through 2),
by changing the line at (0) to an unconditional jump:
0) .text:0057C464 jmp short loc_57C46F
So let's do that and see what happens ... that works !! No more crash !
However, there is a drawback ! All the interface screens used to set
the options, list the pilots ... etc ... are halved in height :-(
Not so surprising, by the way: this 0xA00 and 0x500 thing, with expansion,
had to be there for a purpose.
All the rest is ok, i.e. the mission screens, the in-action sequences when
flying the craft, the score and about screens ... etc ... are untouched.
These halved screens are usable, and the mouse acts correctly over them since
the .data:0x708594 variable is used in a lot of places by the code, but
this is not perfect: some of the text on screen, like the pilot names or labels,
is cut off in height ... It is then necessary to use the original version, for example,
to go through the parameter screens for changing the joystick buttons and other
settings.
Method 1bis:
------------
Ok, so what if we leave .data:0x708594 intact, and instead patch the line increments
used when copying from one buffer to another, so that the same increment is used at source
and destination ?
This is done here:
.text:004E816E loc_4E816E: ; CODE XREF: sub_4E8043+127j
.text:004E816E mov eax, [ebp+arg_0_source_desc]
.text:004E8171 mov ecx, [ebp+var_408]
.text:004E8177 add ecx, [eax+14h]
.text:004E817A mov [ebp+var_408], ecx
.text:004E8180 mov edx, [ebp+var_404]
-> .text:004E8186 add edx, dword_708594
.text:004E818C mov [ebp+var_404], edx
.text:004E8192 jmp short loc_4E8126
Let's change the line to:
-> .text:004E8186 add edx, [eax+14h]
followed by the necessary 3 nop's (the new instruction is 3 bytes shorter)
The result is ... not very good: the interface screens are ok in height, but the contents seem
somewhat scrambled, change as the mouse pointer moves, don't correspond to the
mouse pointer position ... in short, not usable.
Method 2:
---------
This starts by trying to find where the video buffer address is coming from.
Simple, it is stored in: .data:0x657B60
The allocation of the buffer itself is much harder to find. It leads back to
the same procedure and call to ddraw.dll as in Method 1. However, it is highly
probable that this is just a video context coming from somewhere else in the code,
which will be hard to find without being a DirectX expert, which I am not. Or worse,
it could be allocated by ddraw.dll itself ... and there, no chance to patch anything.
Also, it is not obvious from the parameters given to the call what is driving that,
again without knowing the DirectX API.
Dead end then for this method :-(
But wait ....! Something is strange at the end of this procedure ...
Method 3:
---------
The end of this procedure takes the buffer address returned by the ddraw.dll call,
and adds to it an offset in lines multiplied by the line length obtained a little
higher in the code (cf. Method 1).
.text:0057C489 loc_57C489: ; CODE XREF: sub_56EF80+D4EDj
-> .text:0057C489 movsx eax, word_73A50C
-> .text:0057C490 imul eax, [esi+8]
-> .text:0057C494 add eax, [esp+90h+var_64]
.text:0057C498 mov [esi], eax
.text:0057C49A mov byte ptr dword_6D26E8+1, bl
.text:0057C4A0 pop esi
.text:0057C4A1 mov al, 1
.text:0057C4A3 pop ebx
.text:0057C4A4 add esp, 7Ch
.text:0057C4A7 retn
This offset is in another word variable: .data:0x73A50C (2 bytes)
This word variable is written at: .text:0x57BC1B
.text:0057BBF0 sub_57BBF0 proc near
.text:0057BBF0
.text:0057BBF0 var_E0= dword ptr -0E0h
.text:0057BBF0 var_DC= dword ptr -0DCh
.text:0057BBF0 var_D8= dword ptr -0D8h
.text:0057BBF0 var_D4= dword ptr -0D4h
.text:0057BBF0 var_C0= dword ptr -0C0h
.text:0057BBF0 var_BC= dword ptr -0BCh
.text:0057BBF0 var_B8= dword ptr -0B8h
.text:0057BBF0 var_B4= dword ptr -0B4h
.text:0057BBF0 var_94= dword ptr -94h
.text:0057BBF0 var_90= dword ptr -90h
.text:0057BBF0 var_88= dword ptr -88h
.text:0057BBF0 var_80= dword ptr -80h
.text:0057BBF0 var_78= dword ptr -78h
.text:0057BBF0 var_2C= dword ptr -2Ch
.text:0057BBF0 arg_0= dword ptr 4
.text:0057BBF0 arg_4= dword ptr 8
.text:0057BBF0 arg_8= dword ptr 0Ch
.text:0057BBF0 arg_C= dword ptr 10h
.text:0057BBF0
.text:0057BBF0 mov eax, [esp+arg_C]
.text:0057BBF4 sub esp, 8Ch
.text:0057BBFA and eax, 1
.text:0057BBFD push ebx
.text:0057BBFE mov ebx, [esp+90h+arg_0]
.text:0057BC05 push ebp
.text:0057BC06 mov ebp, [esp+94h+arg_4]
.text:0057BC0D push esi
.text:0057BC0E mov si, cx
.text:0057BC11 mov ecx, [esp+98h+arg_8]
.text:0057BC18 push edi
.text:0057BC19 mov edi, edx
-> .text:0057BC1B mov word_73A50C, bp
.text:0057BC22 shl eax, 1
.text:0057BC24 cmp si, word_73A500
.text:0057BC2B mov word_73A50E, cx
.text:0057BC32 mov word_73A502, di
.text:0057BC39 mov dword_73A724, eax
.text:0057BC3E jnz short loc_
The purpose of this is not obvious. Tracing the values it takes during the process
shows that it is 0 most of the time, and that at the moment the process
breaks, it is 0x46 ... Aha ! So let's patch the code to always read 0:
-> .text:0057C489 mov eax, 0
followed by the necessary 2 nop's (the new instruction is 2 bytes shorter)
And let's try ... the result is ... perfect so far ! No glitch, no image distortion
anywhere.
Ok, I haven't played the whole game yet, so there might be an effect somewhere which
cannot be seen yet, but at least it seems that most of the time, if not all, the
game will be fine. So that will be the one.
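Why a line offset of 0x46 would produce exactly this crash can be sketched with the
numbers already collected. The sketch assumes the video surface is exactly 480 lines
of 0xA00 bytes, which I have not verified; the names are mine.

```python
PITCH = 0xA00          # line length stored in [esi+8] on the failing path
SURFACE_LINES = 480    # assumption: the surface holds exactly 480 lines
COPY_LINES = 480       # the copy loop always writes 480 lines


def bytes_after_start(line_offset: int) -> int:
    """Bytes left in the surface once the start pointer is moved down by line_offset lines."""
    return (SURFACE_LINES - line_offset) * PITCH


def overrun_lines(line_offset: int) -> int:
    """Lines written past the end of the surface for a given start offset."""
    return max(0, line_offset + COPY_LINES - SURFACE_LINES)


print(hex(bytes_after_start(0x46)), hex(overrun_lines(0x46)))
print(hex(bytes_after_start(0)), overrun_lines(0))
```

With the offset at 0x46, only 0x100400 bytes remain after the start pointer, which is
the apparent buffer size seen in the short analysis, and the last 0x46 lines fall
outside the surface. With the offset forced to 0, nothing overruns.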
Enjoy.
!---------------------------!
! Summary of modifications: !
!---------------------------!
Method 1 (workable, some screens halved in height):
---------------------------------------------------
.0057C464: 7409 je .00057C46F
becomes
.0057C464: EB09 jmps .00057C46F
With a hex editor, change in "Rogue Squadron.EXE":
offset value changes
____________________________________
17B864h 116 (74h) to 235 (EBh)
Method 1bis (interface screens not really usable):
--------------------------------------------------
.004E8186: 031594857000 add edx,[00708594]
becomes
.004E8186: 035014 add edx,[eax][14]
.004E8189: 90 nop
.004E818A: 90 nop
.004E818B: 90 nop
With a hex editor, change in "Rogue Squadron.EXE":
offset value changes
____________________________________
E7587h 21 (15h) to 80 (50h)
E7588h 148 (94h) to 20 (14h)
E7589h 133 (85h) to 144 (90h)
E758Ah 112 (70h) to 144 (90h)
E758Bh 0 (00h) to 144 (90h)
Method 3 (perfect so far):
--------------------------
.0057C489: 0FBF050CA57300 movsx eax,w,[0073A50C]
becomes
.0057C489: B800000000 mov eax,000000000
.0057C48E: 90 nop
.0057C48F: 90 nop
With a hex editor, change in "Rogue Squadron.EXE":
offset value changes
____________________________________
17B889h 15 (0Fh) to 184 (B8h)
17B88Ah 191 (BFh) to 0 (00h)
17B88Bh 5 (05h) to 0 (00h)
17B88Ch 12 (0Ch) to 0 (00h)
17B88Dh 165 (A5h) to 0 (00h)
17B88Eh 115 (73h) to 144 (90h)
17B88Fh 0 (00h) to 144 (90h)
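For those who prefer a script to a hex editor, the three patches can be expressed as
data and applied with a check on the original bytes. This is a minimal sketch, not a
tested patcher: the 0x400C00 difference between code address and file offset is
simply inferred from the address/offset pairs listed above (not read from the EXE
headers), and the function and table names are invented.

```python
# Code address minus file offset, inferred from the tables above (assumption).
IMAGE_DELTA = 0x400C00

# name -> (code address, original bytes, patched bytes), taken from the summary above.
PATCHES = {
    "method1":    (0x57C464, bytes.fromhex("7409"),           bytes.fromhex("EB09")),
    "method1bis": (0x4E8186, bytes.fromhex("031594857000"),   bytes.fromhex("035014909090")),
    "method3":    (0x57C489, bytes.fromhex("0FBF050CA57300"), bytes.fromhex("B8000000009090")),
}


def file_offset(address: int) -> int:
    """Convert a code address from the listings to an offset in the EXE file."""
    return address - IMAGE_DELTA


def apply_patch(image: bytes, name: str) -> bytes:
    """Return a patched copy of image, refusing to touch unexpected bytes."""
    address, old, new = PATCHES[name]
    if len(old) != len(new):
        raise ValueError("patch must not change the file size")
    start = file_offset(address)
    if image[start:start + len(old)] != old:
        raise ValueError(f"unexpected bytes at {start:#x}; wrong EXE version?")
    return image[:start] + new + image[start + len(new):]
```

A possible use is to read "Rogue Squadron.EXE" into memory with open(..., "rb"),
call apply_patch(data, "method3"), and write the result to a new file, keeping a
backup of the original. The byte check makes the script stop instead of corrupting
an EXE that is not the 1.2 version described here.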