Unpackme I am Famous


Because it is simply obvious that I have not been posting on this blog for a while, here is a post about Safedisc v3.
Last week I was studying this protection in deep, each component under IDA, but I accidentally broke my external hard drive by giving a shot in. I lost a lot of .idb from different games, softwares or malware, my personal toolz, unpackers, ...
So to smile again I decided to write about how to unpack this protection.
For those familiar with safedisc, the only interesting part will be Nanomites, restoring Imports or emulated opcodes is a joke when you know how older versions work.

Extra data

During introduction I talked about different components, they are placed at the end of the file.
The size of the target game is 1 830 912 bytes, but if we look IMAGE_SECTION_HEADER closely :

 Name      VirtSize   VirtAddr   SizeRaw    PtrRaw     Flags      Pointing Directories
 .text     00131148h  00401000h  00132000h  00001000h  60000020h  
 .rdata    0002F497h  00533000h  00030000h  00133000h  40000040h  Debug Data
 .data     012CDCE8h  00563000h  0000D000h  00163000h  C0000040h  
 .rsrc     0003289Eh  01831000h  00033000h  00170000h  40000040h  Resource Table
 stxt774   00002059h  01864000h  00003000h  001A4000h  E0000020h  
 stxt371   00003358h  01867000h  00004000h  001A7000h  E0000020h  Import Table
								  Import Address Table

If we sum the last Real Offset and Real Size of stxt371 section :

>>> 0x1A7000 + 0x4000
>>> hex(0x1A7000 + 0x4000)

1 748 992 bytes != 1 830 912 bytes.
Clearly there is some extra data at the end of the file.
By looking the main executable under IDA, I was able to find an interesting sub that retrieves and extracts those datas.
First, here is the structure used for extra data :

struct extra_data
	DWORD sig_1;
	DWORD sig_2;
	DWORD num_file;
	DWORD offset_1;
	DWORD offset_2;
	DWORD unknow_1;
	DWORD unknow_2;
	BYTE  name[0xD];

sig_1 must always be set to 0xA8726B03 and sig_2 to 0xEF01996C
And after deleting all there (weak?) obfuscation, we can retrieve the following "pseudo code" to extract additionnal data.

	SetFilePointer(hFile, actual_pos, NULL, FILE_BEGIN);
	ReadFile(hFile, buff, 0x121, &bread, 0);
	key = actual_pos;

	for (i = 0; i < bread; i++)
		key = key * 0x13C6A5;
		key += 0x0D8430DED;
		buff[i] ^= (((((key >> 0x10) ^ (key >> 0x8)) ^ (key >> 0x18)) ^ (key & 0xFF)) & 0xFF);
	memcpy(&data, buff, sizeof(struct extra_data));

	actual_pos += data.offset_1 + data.offset_2;

} while (data.sig_1 == 0xA8726B03 && data.sig_2 == 0xEF01996C);

Result :

Name : ~def549.tmp
Num : 1
Name : clcd32.dll
Num : 1100
Name : clcd16.dll
Num : 1100
Name : mcp.dll
Num : 1101
Num : 2
Name : DrvMgt.dll
Num : 2
Name : SecDrv04.VxD
Num : 11
Name : ~e5.0001
Num : 0
Name : PfdRun.pfd
Num : 0
Name : ~df394b.tmp
Num : 0

As you can see we can extract a lot of files, and here is the algorithm to decypher it :

ptr = (BYTE*)GlobalAlloc(GPTR, data.offset_1);
SetFilePointer(hFile, actual_pos - data.offset_1, NULL, FILE_BEGIN);
ReadFile(hFile, ptr, data.offset_1, &bread, NULL);
if (bread != data.offset_1)
	printf("[-] ReadFile() failed\n");
key = 0x8142FEA1;
int init_key;
init_key = 0x8142FEA1;
for (i = 0; i < bread; i++)
	key = init_key ^ 0x7F6D09ED;
	ptr[i] = (((((key >> 0x18) ^ (key >> 0x10)) ^ (key >> 0x8))) & 0xFF) ^ ptr[i];
	ptr[i] ^= key & 0xFF;
	init_key = init_key << 0x8;
	init_key += ptr[i];

Each component will be extracted into %temp% path, they got their own goal, we will not study all of them there is no interest.

I will not discuss more about all this stuff, by loosing all my idb I am bored to reverse (rename sub) again and again with all this shitty C++ stuff, you can find some fun crypto when they decypher pfd file or code section, rijndael modified, different xor operation, anyway let's continue !

Find OEP

This is the easiest part :

stxt371:018670A2                 mov     ebx, offset start
stxt371:018670A7                 xor     ecx, ecx
stxt371:018670A9                 mov     cl, ds:byte_186703D
stxt371:018670AF                 test    ecx, ecx
stxt371:018670B1                 jz      short loc_18670BF
stxt371:018670B3                 mov     eax, offset loc_1867113
stxt371:018670B8                 sub     eax, ebx
stxt371:018670BA                 sub     eax, 5
stxt371:018670BD                 jmp     short loc_18670CD
stxt371:018670BF ; ---------------------------------------------------------------------------
stxt371:018670BF loc_18670BF:                            ; CODE XREF: start+13j
stxt371:018670BF                 push    ecx
stxt371:018670C0                 mov     ecx, offset loc_1867159
stxt371:018670C5                 mov     eax, ecx
stxt371:018670C7                 sub     eax, ebx
stxt371:018670C9                 add     eax, [ecx+1]
stxt371:018670CC                 pop     ecx
stxt371:018670CD loc_18670CD:                            ; CODE XREF: start+1Fj
stxt371:018670CD                 mov     byte ptr [ebx], 0E9h
stxt371:018670D0                 mov     [ebx+1], eax

This code will replace Module Entrypoint by a jump to Real OEP, so if you like using OllyDbg execute first instructions and put a breakpoint on that jump.
But you will encounter a "dead lock" problem, before jumping to real OEP, it decyphers sections, loads dll AND CreateProcess "~e5.0001" giving the pid of the game process as argument.
This process will load ~df394b.tmp aka SecServ.dll, all strings inside this dll are encrypted, we can decrypt all of them :

int decrypt_func_01(char *mem_alloc, char *addr_to_decrypt)
    DWORD count;
    DWORD key;
    char  actual;

    if (mem_alloc && addr_to_decrypt)
        count = 0;
        key = 0x522CFDD0;
        while (1)
            actual = *addr_to_decrypt++;
            actual = actual ^ (char)key;
            *mem_alloc++ = actual;
            key = 0xA065432A - 0x22BC897F * key;
            if (!actual)
            if (count != 127)
            return 0;
        return 1;
        return 0;

Here is the result of all strings decyphered :

Addr = 667A9240 : drvmgt.dll
Addr = 667A9264 : secdrv.sys
Addr = 667A9298 : SecDrv04.VxD
Addr = 667A92BC : ALT_
Addr = 667A9C78 : Kernel32
Addr = 667AA71C : \\.\NTICE
Addr = 667AA73C : \\.\SICE
Addr = 667AA75C : \\.\SIWVID
Addr = 667AAB80 : .text
Addr = 667A9928 : Ntdll.dll
Addr = 667A9948 : Kernel32
Addr = 667AA3F0 : GetVersionExA
Addr = 667AA6BC : ZwQuerySystemInformation
Addr = 667AA6EC : NtQueryInformationProcess
Addr = 667AA780 : IsDebuggerPresent
Addr = 667AAB50 : ZwQuerySystemInformation
Addr = 667AADBC : ExitProcess
Addr = 667A99F8 : DeviceIoControl
Addr = 667A9A40 : CreateFileA
Addr = 667A9A64 : ReadProcessMemory
Addr = 667A9A8C : WriteProcessMemory
Addr = 667A9AB8 : VirtualProtect
Addr = 667A9AE0 : CreateProcessA
Addr = 667A9B08 : CreateProcessW
Addr = 667A9B30 : GetStartupInfoA
Addr = 667A9B58 : GetStartupInfoW
Addr = 667A9B80 : GetSystemTime
Addr = 667A9BA4 : GetSystemTimeAsFileTime
Addr = 667A9BD4 : TerminateProcess
Addr = 667A9BFC : Sleep
Addr = 667AB8C0 : WriteProcessMemory
Addr = 667AB8EC : FlushInstructionCache
Addr = 667AB918 : VirtualProtect
Addr = 667ABB90 : SetThreadContext
Addr = 667ABBB8 : GetThreadContext
Addr = 667ABBE0 : SuspendThread
Addr = 667ABB64 : FlushInstructionCache
Addr = 667ABB38 : WriteProcessMemory
Addr = 667ABC84 : ContinueDebugEvent
Addr = 667ABB0C : DebugActiveProcess
Addr = 667ABAE4 : WaitForDebugEvent
Addr = 667A99F8 : DeviceIoControl
Addr = 667ACF00 : System\CurrentControlSet\Services\VxD
Addr = 667ACF5C : cmapieng.vxd
Addr = 667ACF3C : StaticVxD

The most interesting things are DebugActiveProcess, ContinueDebugEvent, WriteProcessMemory, FlushInstructionCache, SetThreadContext.
As I said earlier this dll will be in charge of debugging the game process, it prevents debugging it with Olly or any Ring3 debugger.
The game process after calling CreateProcess will wait (WaitForSingleObject) signal that temp executable will attach to it and give it signal and continue to debug it, but if you are already debugging game process, WaitForSingleObject will never catch this signal.
All the code below can be found inside ~df394.tmp aka SecServ.dll :

.text:667250C1 loc_667250C1:                           ; CODE XREF: sub_66724FDE+D5j
.text:667250C1                 push    0FFFFFFFFh      ; dwMilliseconds
.text:667250C3                 push    edi             ; hHandle
.text:667250C4                 call    ds:WaitForSingleObject
.text:667250CA                 push    [ebp+hObject]   ; hObject
.text:667250CD                 mov     [ebp+return_value], eax
.text:667250D0                 call    esi ; CloseHandle
.text:667250D2                 push    edi             ; hObject
.text:667250D3                 call    esi ; CloseHandle
.text:667250D5                 cmp     [ebp+return_value], 0 ; WAIT_OBJECT_0
.text:667250D9                 pop     edi
.text:667250DA                 pop     esi
.text:667250DB                 jz      short exit_func
.text:667250DD                 call    ebx ; GetLastError
.text:667250DF                 call    Exit_Process
.text:667250E4 exit_func:                              ; CODE XREF: sub_66724FDE+11j
.text:667250E4                                         ; sub_66724FDE+20j ...
.text:667250E4                 pop     ebx
.text:667250E5                 leave
.text:667250E6                 retn
.text:667250E6 sub_66724FDE    endp

If you want to use OllyDBG, put a breakpoint on WaitForSingleObject call, and modify argument TIMEOUT to something different than INFINITE, and change ZF flag during the test of the return value.


Now the fun stuff can start, if you followed what I said, you can continue to debug game process, but at a moment you will encounter problem like follow :

.text:004519FC                 call    ds:dword_5331B0 ; kernel32.IsBadReadPtr
.text:004519FC ; ---------------------------------------------------------------------------
.text:00451A02                 db 0CCh
.text:00451A03                 db 0CCh ;
.text:00451A04                 db 0CCh ;
.text:00451A05                 db 0CCh ;

What are doing these 0xCC (int 3) aka Trap to Debugger or software breakpoint after a call to a kernel32 API ?
It's a well known technique, instructions are replaced by this opcode and informations about the removed opcode is stored in a table. (Remember pfd file ?)
Then, by using self-debugging, when one of these breakpoints is hit, the debugging process will handle the debug exception, and will look up certain information about the debugging break.

But the problem is, if Nanomites are called several times, it can impact a little the performance, right ? (Not anymore today), but Safedisc decided to count how much time a Nanomite is executed, and if this Nanomite is executed too much time, it will restore the replaced opcodes by writting it inside the debugged process.
So if we want to fix theses Nanomites, we just have to patch a branch instruction that say : "This nanomites has been executed too much time, restore opcode !", and scan txt section of game process to find all the nanomites, call them, and the debugger process will restore all the removed opcode :).

How To

When unpacking (real?) protection you need to write cool toolz, here are all the steps that I did :

Need a diagram ?

I encountered a little problem during those operation, if we create a thread at an addr containing 0xCC followed by nop operation (0x90), Safedisc debugger crashes or emulates shit...
Visual Studio uses 0xCC, 0x90 and 0x00 opcode for padding, don't ask me why they don't just use only 0x00, I don't know.
Just so you know, if you don't provide the full path of these dll while you are injecting it, the first dll must be placed in the folder of the game process, and the second one in %temp% path, because debugger process is extracted and executed here.

You can find the branch instruction inside ~def394.tmp (SecServ.dll) at addr 0x6678F562 :

.txt5:6678F562                 cmp     ax, 1
.txt5:6678F566                 jnz     not_write_process_memory


Just some debug information :

Process id : 894
EventCode : 1
Exception Code : 80000003
Exception Addr : 40170F
[+] GetThreadContext(0xB8, 0x635080); return_addr = 66733C55
lpContext->EIP = 7C91120F
[+] WriteProcessMemory(0x5C, 0x40170F, 0x61F58C, 0x2, 0x61F0F0); return_addr = 6672BA45
85 C0 
[+] SetThreadContext(0xB8, 0x635080); return_addr = 66733C23
lpContext->EIP = 40170F

As you can see at address 0x40170F, an event occured 0x1 -> EXCEPTION_DEBUG_EVENT and his code 0x80000003 (EXCEPTION_BREAKPOINT), so the debugger process replaces the 0xCC 0xCC by 0x85 0xC0 -> "test eax, eax", and try to SetThreadContext but we hooked it to terminate the thread.

Restoring Imports

Like the previous version import points to some virtual address where the code calls routine to find the correct import.
By using algo against itself we can resolve all correct address of imports.
Inside txt section we can find different type of call to imports :

The idea is simple, scan .txt section look for call dword ptr or jmp dword ptr or jmp section Stxt774, hook the function that resolve the api and get the result and save into into a linked list.
This function in question is in ~df394b.tmp :

.txt:6678D644                 call    resolve_api
.txt:6678D649                 pop     ecx
.txt:6678D64A                 pop     ecx

Just replace the pop ecx, by "add esp, X; ret" and get the result into register eax.

BUT ! Sometimes by calling the same virtual_addr but from other location it don't resolve the same API address.

API (0x7E3AC17E) has rdata.0x53327C (txt.0x51A656)  rdata.0x53327C (txt.0x454509)  rdata.0x53327C (txt.0x454149)  rdata.0x533260 (txt.0x453773)  rdata.0x53327C (txt.0x4535BD)
API (0x7E39869D) has rdata.0x53329C (txt.0x51A686)  rdata.0x53329C (txt.0x50B64E)  rdata.0x53329C (txt.0x50B1E4)  rdata.0x53327C (txt.0x4FDC5E)  rdata.0x53329C (txt.0x4FD7CA)  rdata.0x533284 (txt.0x4FD718)  

As you can see the address in rdata 0x53327C, can resolve different API when it is called from different locations (txt address).
To fix it, it's very simple we reorder the linked list according to the api address, and choose one rdata for each call, and we will change value of the call or jmp dword ptr at txt address for each entry of an api.

After reorder

Output after reordering :

API (0x7E3AC17E) has rdata.0x53327C (txt.0x51A656)  rdata.0x53327C (txt.0x454509)  rdata.0x53327C (txt.0x454149)  rdata.0x53327C (txt.0x453773)  rdata.0x53327C (txt.0x4535BD)  
API (0x7E39869D) has rdata.0x53329C (txt.0x51A686)  rdata.0x53329C (txt.0x50B64E)  rdata.0x53329C (txt.0x50B1E4)  rdata.0x53327C (txt.0x4FDC5E)  rdata.0x53329C (txt.0x4FD7CA)  rdata.0x533284 (txt.0x4FD718)

We can now write back into rdata addr the real adress of the api and fix the call or jmp at adress in txt section, to point to the good rdata address.
Now you can look with ImportRec and see that all imports are restored correctly :)

To fix jmp section Stxt774, we just have to replace the jmp by a call dword ptr[rdata], but wait jmp stxt774 is 5 bytes and we need 6 bytes to change it to call dword ptr, don't worry, after resolving the api and ret to it, the api will return at jmp stxt774 + 6, so there is enough place.

And Import Reconstructor is happy (Invalid imports 0) :

Emulated opcodes

After fixing Nanomites and restoring imports, I encounter a last problem.

.text:00404909                 push    ecx
.text:0040490A                 push    eax
.text:0040490B                 call    sub_4013F3
.text:0040490B sub_404909      endp ; sp-analysis failed
.text:004013F3                 mov     eax, 1E1Bh
.text:004013F8                 pop     ecx
.text:004013F9                 lea     eax, [eax+ecx]
.text:004013FC                 mov     eax, [eax]
.text:004013FE                 jmp     eax

This code will just compute an address in txt section, get the value pointed by this address and jump to it. The jump destination is an address from ~df394b.tmp.

The goal of sub 0x6673E090 is simply to check from where it has been called, lookup in a table of emulated opcodes and restore it.
Here only one emulation is performed then it will write original opcode back.
Like for restoring imports, we find each reference to the sub 0x00404909, setup an hook at the end of the sub 0x6673E09, call each reference, and emulated opcodes will be restored automatically :)


Safedisc v3 is really not difficult, you can find the source of all my codes at the end of this post.
I will go back to school project, hopefully graduating this year :)



First dll

Second dll