Unpackme I am Famous
Introduction
Because it is simply obvious that I have not been posting on this blog for a while, here is a post about Safedisc v3.
Last week I was studying this protection in depth, each component under IDA, when I accidentally broke my external hard drive by hitting it. I lost a lot of .idb files from different games, software and malware, along with my personal toolz, unpackers, ...
So to smile again I decided to write about how to unpack this protection.
For those familiar with Safedisc, the only interesting part will be the Nanomites. Restoring imports or emulated opcodes is a joke when you know how older versions work.
Extra data
In the introduction I mentioned different components; they are stored at the end of the file.
The size of the target game is 1 830 912 bytes, but look closely at the IMAGE_SECTION_HEADER entries:
Name     VirtSize   VirtAddr   SizeRaw    PtrRaw     Flags      Pointing Directories
------------------------------------------------------------------------------------------------
.text    00131148h  00401000h  00132000h  00001000h  60000020h
.rdata   0002F497h  00533000h  00030000h  00133000h  40000040h  Debug Data
.data    012CDCE8h  00563000h  0000D000h  00163000h  C0000040h
.rsrc    0003289Eh  01831000h  00033000h  00170000h  40000040h  Resource Table
stxt774  00002059h  01864000h  00003000h  001A4000h  E0000020h
stxt371  00003358h  01867000h  00004000h  001A7000h  E0000020h  Import Table, Import Address Table
If we add the raw offset (PtrRaw) and the raw size (SizeRaw) of the last section, stxt371:
>>> 0x1A7000 + 0x4000
1748992
>>> hex(0x1A7000 + 0x4000)
'0x1ab000'
1 748 992 bytes != 1 830 912 bytes.
Clearly there are 81 920 bytes (0x14000) of extra data at the end of the file.
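The same check can be written as a few lines of Python. This is only a sketch: the section values are copied from the table above, and the function name overlay_range is mine.

```python
# Section table values copied from the listing above: (name, SizeRaw, PtrRaw).
SECTIONS = [
    (".text",   0x132000, 0x001000),
    (".rdata",  0x030000, 0x133000),
    (".data",   0x00D000, 0x163000),
    (".rsrc",   0x033000, 0x170000),
    ("stxt774", 0x003000, 0x1A4000),
    ("stxt371", 0x004000, 0x1A7000),
]
FILE_SIZE = 1830912  # size of the game executable on disk


def overlay_range(sections, file_size):
    """Return (start, size) of the data stored after the last raw section."""
    end_of_image = max(ptr_raw + size_raw for _, size_raw, ptr_raw in sections)
    return end_of_image, file_size - end_of_image


start, size = overlay_range(SECTIONS, FILE_SIZE)
print(hex(start), size)
```

Taking the maximum of PtrRaw + SizeRaw instead of trusting the last entry keeps the check valid even if the sections are not sorted by raw offset.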
Looking at the main executable under IDA, I found an interesting sub that retrieves and extracts this data.
First, here is the structure used for extra data :
struct extra_data
{
DWORD sig_1;
DWORD sig_2;
DWORD num_file;
DWORD offset_1;
DWORD offset_2;
DWORD unknow_1;
DWORD unknow_2;
BYTE name[0xD];
};
sig_1 must always be set to 0xA8726B03 and sig_2 to 0xEF01996C.
After removing all their (weak?) obfuscation, we get the following "pseudo code" to walk through the additional data:
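/* actual_pos starts at the first byte of the extra data */
for (;;)
{
    SetFilePointer(hFile, actual_pos, NULL, FILE_BEGIN);
    if (!ReadFile(hFile, buff, 0x121, &bread, NULL) || bread == 0)
        break;
    /* The keystream is seeded with the file offset of the header */
    key = actual_pos;
    for (i = 0; i < bread; i++)
    {
        key = key * 0x13C6A5;
        key += 0xD8430DED;
        /* XOR each byte with the four bytes of the key folded together */
        buff[i] ^= (((((key >> 0x10) ^ (key >> 0x8)) ^ (key >> 0x18)) ^ (key & 0xFF)) & 0xFF);
    }
    memcpy(&data, buff, sizeof(struct extra_data));
    /* Stop on the first header whose signatures do not match */
    if (data.sig_1 != 0xA8726B03 || data.sig_2 != 0xEF01996C)
        break;
    print_data_info(&data);
    /* Skip to the next header */
    actual_pos += data.offset_1 + data.offset_2;
}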
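For those who prefer to play with it outside of a Win32 build, here is a minimal Python sketch of the same header keystream. The helper names are mine, I assume the struct is read without padding (seven DWORDs then name[0xD]), and the header used in the round trip is made up for the demo, not taken from the game.

```python
import struct

SIG_1 = 0xA8726B03
SIG_2 = 0xEF01996C
HEADER = struct.Struct("<7I13s")  # seven DWORDs followed by name[0xD]


def xor_header(buf, file_pos):
    """XOR buf with the keystream seeded by its file offset (symmetric)."""
    key = file_pos & 0xFFFFFFFF
    out = bytearray(buf)
    for i in range(len(out)):
        key = (key * 0x13C6A5 + 0xD8430DED) & 0xFFFFFFFF
        out[i] ^= ((key >> 16) ^ (key >> 8) ^ (key >> 24) ^ key) & 0xFF
    return bytes(out)


def parse_header(raw, file_pos):
    """Decrypt one header; return its fields, or None on a bad signature."""
    fields = HEADER.unpack_from(xor_header(raw, file_pos))
    if fields[0] != SIG_1 or fields[1] != SIG_2:
        return None
    _, _, num, off_1, off_2, _, _, name = fields
    return {"name": name.split(b"\0")[0].decode("ascii"), "num": num,
            "offset_1": off_1, "offset_2": off_2}


# Round trip on a made-up header (every value below is invented for the demo).
pos = 0x1AB000
plain = HEADER.pack(SIG_1, SIG_2, 1100, 0x2000, 0x121, 0, 0, b"clcd32.dll")
info = parse_header(xor_header(plain, pos), pos)
print(info)
```

Because the transform is a plain XOR with a keystream that depends only on the file offset, the same function encrypts and decrypts, which is what makes the round trip above possible.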
Result :
Name : ~def549.tmp
Num : 1
Name : clcd32.dll
Num : 1100
Name : clcd16.dll
Num : 1100
Name : mcp.dll
Num : 1101
Name : SECDRV.SYS
Num : 2
Name : DrvMgt.dll
Num : 2
Name : SecDrv04.VxD
Num : 11
Name : ~e5.0001
Num : 0
Name : PfdRun.pfd
Num : 0
Name : ~df394b.tmp
Num : 0
As you can see, a lot of files can be extracted. Here is the algorithm to decipher them:
/* Assumes actual_pos has already been advanced past this entry, */
/* so the file data starts offset_1 bytes earlier */
ptr = (BYTE*)GlobalAlloc(GPTR, data.offset_1);
SetFilePointer(hFile, actual_pos - data.offset_1, NULL, FILE_BEGIN);
ReadFile(hFile, ptr, data.offset_1, &bread, NULL);
if (bread != data.offset_1)
{
    printf("[-] ReadFile() failed\n");
    exit(1);
}
DWORD key;
DWORD init_key = 0x8142FEA1;
for (i = 0; i < bread; i++)
{
    key = init_key ^ 0x7F6D09ED;
    /* XOR the byte with the four key bytes folded together */
    ptr[i] = (((((key >> 0x18) ^ (key >> 0x10)) ^ (key >> 0x8))) & 0xFF) ^ ptr[i];
    ptr[i] ^= key & 0xFF;
    /* Feed the decrypted byte back into the state */
    init_key = init_key << 0x8;
    init_key += ptr[i];
}
Each component is extracted into the %temp% path. Each has its own purpose, but we will not study all of them because most are of no interest.
- ~def549.tmp, a DLL, whose goal is to run different anti-debug techniques (not interesting), check files on the CD-ROM, ...
- ~e5.0001, an executable; this process debugs the main executable in order to manage the Nanomites.
- PfdRun.pfd, no type; this file is deciphered to compute the instruction table used for emulated opcodes.
- ~df394b.tmp, another DLL; it loads and deciphers sections from other DLLs, and manages debug events for the ~e5.0001 process.
I will not discuss all this stuff any further: having lost all my idb, I am bored of reversing (renaming subs) again and again with all this shitty C++ stuff. You can find some fun crypto when they decipher the pfd file or the code sections (modified Rijndael, different xor operations). Anyway, let's continue!
Find OEP
This is the easiest part :
stxt371:018670A2 mov ebx, offset start
stxt371:018670A7 xor ecx, ecx
stxt371:018670A9 mov cl, ds:byte_186703D
stxt371:018670AF test ecx, ecx
stxt371:018670B1 jz short loc_18670BF
stxt371:018670B3 mov eax, offset loc_1867113
stxt371:018670B8 sub eax, ebx
stxt371:018670BA sub eax, 5
stxt371:018670BD jmp short loc_18670CD
stxt371:018670BF ; ---------------------------------------------------------------------------
stxt371:018670BF
stxt371:018670BF loc_18670BF: ; CODE XREF: start+13j
stxt371:018670BF push ecx
stxt371:018670C0 mov ecx, offset loc_1867159
stxt371:018670C5 mov eax, ecx
stxt371:018670C7 sub eax, ebx
stxt371:018670C9 add eax, [ecx+1]
stxt371:018670CC pop ecx
stxt371:018670CD
stxt371:018670CD loc_18670CD: ; CODE XREF: start+1Fj
stxt371:018670CD mov byte ptr [ebx], 0E9h
stxt371:018670D0 mov [ebx+1], eax
This code replaces the module entry point with a jump to the real OEP, so if you like using OllyDbg, execute the first instructions and put a breakpoint on that jump.
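The arithmetic in the listing is the usual E9 rel32 encoding: rel32 = target - (source + 5). A small sketch, with helper names of my own. The address of start (0x0186709E) is not printed in the listing; I derived it from the "start+13j" and "start+1Fj" cross references.

```python
import struct


def encode_jmp(src, dst):
    """Encode `jmp dst` (E9 rel32) as it would sit at address src."""
    rel = (dst - (src + 5)) & 0xFFFFFFFF
    return b"\xE9" + struct.pack("<I", rel)


def jmp_target(src, code):
    """Return the destination of an E9 rel32 jump located at src."""
    if code[0] != 0xE9:
        raise ValueError("not a near jmp")
    (rel,) = struct.unpack_from("<i", code, 1)
    return (src + 5 + rel) & 0xFFFFFFFF


def rebase_jmp(old_src, old_code, new_src):
    """Re-encode an existing E9 jump so it keeps its target from new_src."""
    return encode_jmp(new_src, jmp_target(old_src, old_code))


START = 0x0186709E        # `start`, derived from the xrefs in the listing
LOC_1867113 = 0x01867113  # target used when byte_186703D is non-zero

patch = encode_jmp(START, LOC_1867113)
print(patch.hex())
```

encode_jmp matches the first branch (sub eax, ebx / sub eax, 5). rebase_jmp matches the second one: "add eax, [ecx+1]" reuses the rel32 of the jump already sitting at loc_1867159 and only corrects it for the new source address.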
But you will run into a "dead lock" problem: before jumping to the real OEP, the code deciphers sections, loads DLLs AND calls CreateProcess on "~e5.0001", passing the pid of the game process as an argument.
This process loads ~df394b.tmp aka SecServ.dll. All strings inside this DLL are encrypted, but we can decrypt all of them:
int decrypt_func_01(char *mem_alloc, char *addr_to_decrypt)
{
    DWORD count;
    DWORD key;
    char actual;

    if (mem_alloc && addr_to_decrypt)
    {
        count = 0;
        key = 0x522CFDD0;
        while (1)
        {
            /* XOR each byte with the low byte of the key, then step the key */
            actual = *addr_to_decrypt++;
            actual = actual ^ (char)key;
            *mem_alloc++ = actual;
            key = 0xA065432A - 0x22BC897F * key;
            if (!actual)    /* decrypted NUL terminator: done */
                break;
            if (count != 127)
            {
                count++;
                continue;
            }
            return 0;       /* no terminator within 128 bytes */
        }
        return 1;
    }
    else
        return 0;
}
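The routine above translates directly to Python. In this sketch decrypt_string follows the C code, while encrypt_string is my own inverse, written only to build test input (I did not look for an encryption routine in the DLL).

```python
SEED = 0x522CFDD0
MAX_LEN = 128  # the C routine gives up after 128 bytes without a terminator


def _next_key(key):
    """Step the key exactly like the C code, wrapped to 32 bits."""
    return (0xA065432A - 0x22BC897F * key) & 0xFFFFFFFF


def decrypt_string(blob):
    """Return the decrypted bytes (without the NUL), or None on failure."""
    key = SEED
    out = bytearray()
    for enc in blob[:MAX_LEN]:
        plain = enc ^ (key & 0xFF)
        key = _next_key(key)
        if plain == 0:
            return bytes(out)
        out.append(plain)
    return None


def encrypt_string(text):
    """Inverse transform, used here only to build test input."""
    key = SEED
    out = bytearray()
    for plain in text + b"\0":
        out.append(plain ^ (key & 0xFF))
        key = _next_key(key)
    return bytes(out)


print(decrypt_string(encrypt_string(b"drvmgt.dll")))
```

Every string restarts from the same seed, so identical plaintexts always produce identical ciphertexts in the DLL.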
Here are all the decrypted strings:
Addr = 667A9240 : drvmgt.dll
Addr = 667A9264 : secdrv.sys
Addr = 667A9298 : SecDrv04.VxD
Addr = 667A92BC : ALT_
Addr = 667A9C78 : Kernel32
Addr = 667AA71C : \\.\NTICE
Addr = 667AA73C : \\.\SICE
Addr = 667AA75C : \\.\SIWVID
Addr = 667AAB80 : .text
Addr = 667A9928 : Ntdll.dll
Addr = 667A9948 : Kernel32
Addr = 667AA3F0 : GetVersionExA
Addr = 667AA6BC : ZwQuerySystemInformation
Addr = 667AA6EC : NtQueryInformationProcess
Addr = 667AA780 : IsDebuggerPresent
Addr = 667AAB50 : ZwQuerySystemInformation
Addr = 667AADBC : ExitProcess
Addr = 667A99F8 : DeviceIoControl
Addr = 667A9A40 : CreateFileA
Addr = 667A9A64 : ReadProcessMemory
Addr = 667A9A8C : WriteProcessMemory
Addr = 667A9AB8 : VirtualProtect
Addr = 667A9AE0 : CreateProcessA
Addr = 667A9B08 : CreateProcessW
Addr = 667A9B30 : GetStartupInfoA
Addr = 667A9B58 : GetStartupInfoW
Addr = 667A9B80 : GetSystemTime
Addr = 667A9BA4 : GetSystemTimeAsFileTime
Addr = 667A9BD4 : TerminateProcess
Addr = 667A9BFC : Sleep
Addr = 667AB8C0 : WriteProcessMemory
Addr = 667AB8EC : FlushInstructionCache
Addr = 667AB918 : VirtualProtect
Addr = 667ABB90 : SetThreadContext
Addr = 667ABBB8 : GetThreadContext
Addr = 667ABBE0 : SuspendThread
Addr = 667ABB64 : FlushInstructionCache
Addr = 667ABB38 : WriteProcessMemory
Addr = 667ABC84 : ContinueDebugEvent
Addr = 667ABB0C : DebugActiveProcess
Addr = 667ABAE4 : WaitForDebugEvent
Addr = 667A99F8 : DeviceIoControl
Addr = 667ACF00 : System\CurrentControlSet\Services\VxD
Addr = 667ACF5C : cmapieng.vxd
Addr = 667ACF3C : StaticVxD
The most interesting ones are DebugActiveProcess, ContinueDebugEvent, WriteProcessMemory, FlushInstructionCache and SetThreadContext.
As I said earlier, this DLL is in charge of debugging the game process, which prevents you from debugging it with Olly or any other Ring3 debugger.
After calling CreateProcess, the game process waits (WaitForSingleObject) for a signal from the temp executable, which attaches to it, sends the signal and keeps debugging it. But if you are already debugging the game process, the temp executable cannot attach, so WaitForSingleObject never receives this signal.
All the code below can be found inside ~df394b.tmp aka SecServ.dll:
.text:667250C1
.text:667250C1 loc_667250C1: ; CODE XREF: sub_66724FDE+D5j
.text:667250C1 push 0FFFFFFFFh ; dwMilliseconds
.text:667250C3 push edi ; hHandle
.text:667250C4 call ds:WaitForSingleObject
.text:667250CA push [ebp+hObject] ; hObject
.text:667250CD mov [ebp+return_value], eax
.text:667250D0 call esi ; CloseHandle
.text:667250D2 push edi ; hObject
.text:667250D3 call esi ; CloseHandle
.text:667250D5 cmp [ebp+return_value], 0 ; WAIT_OBJECT_0
.text:667250D9 pop edi
.text:667250DA pop esi
.text:667250DB jz short exit_func
.text:667250DD call ebx ; GetLastError
.text:667250DF call Exit_Process
.text:667250E4
.text:667250E4 exit_func: ; CODE XREF: sub_66724FDE+11j
.text:667250E4 ; sub_66724FDE+20j ...
.text:667250E4 pop ebx
.text:667250E5 leave
.text:667250E6 retn
.text:667250E6 sub_66724FDE endp
.text:667250E6
If you want to use OllyDbg, put a breakpoint on the WaitForSingleObject call, change the dwMilliseconds argument to something other than INFINITE, and flip the ZF flag when the return value is tested.
Nanomites
Now the fun stuff can start. If you followed what I said, you can continue to debug the game process, but at some point you will run into something like the following:
.text:004519FC call ds:dword_5331B0 ; kernel32.IsBadReadPtr
.text:004519FC ; ---------------------------------------------------------------------------
.text:00451A02 db 0CCh
.text:00451A03 db 0CCh ;
.text:00451A04 db 0CCh ;
.text:00451A05 db 0CCh ;
What are these 0xCC (int 3, aka Trap to Debugger or software breakpoint) doing after a call to a kernel32 API?
It's a well known technique: instructions are replaced by this opcode, and information about the removed opcodes is stored in a table. (Remember the pfd file?)
Then, through self-debugging, when one of these breakpoints is hit, the debugger process handles the debug exception and looks up information about the breakpoint.
- Is it a Nanomite ?
- Yes ! So I have to emulate the removed opcode
- And restore the context of the thread correctly
But if a Nanomite is executed many times, it can hurt performance a little, right? (Not so much today.) So Safedisc counts how many times each Nanomite is executed, and if one is executed too often, it restores the replaced opcodes by writing them back into the debugged process.
So to fix these Nanomites, we just have to patch the branch instruction that says "This Nanomite has been executed too many times, restore the opcode!", scan the .text section of the game process to find all the Nanomites and call them; the debugger process will then restore all the removed opcodes :).
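To make the idea concrete, here is a toy model of the debugger side in Python. It is not Safedisc's real data layout: the class, the threshold value and the two filler bytes after the 0xCC pair are all invented, and only the address 0x40170F and the restored bytes 85 C0 come from my own debug logs.

```python
class NanomiteTable:
    """Toy stand-in for the debugger side; not Safedisc's real data layout."""

    def __init__(self, entries, threshold, always_restore=False):
        self.entries = dict(entries)          # address -> original opcode bytes
        self.hits = {addr: 0 for addr in self.entries}
        self.threshold = threshold            # invented value, for illustration
        self.always_restore = always_restore  # models the patched branch

    def on_breakpoint(self, memory, base, addr):
        """Handle an int 3 at addr; memory is a bytearray mapped at base."""
        original = self.entries.get(addr)
        if original is None:
            return "not a nanomite"
        self.hits[addr] += 1
        if self.always_restore or self.hits[addr] > self.threshold:
            start = addr - base
            memory[start:start + len(original)] = original
            return "restored"
        return "emulated"


BASE = 0x40170F
memory = bytearray(b"\xCC\xCC\x74\x05")  # the last two bytes are invented filler
stock = NanomiteTable({0x40170F: b"\x85\xC0"}, threshold=3)
first = stock.on_breakpoint(bytearray(memory), BASE, 0x40170F)
patched = NanomiteTable({0x40170F: b"\x85\xC0"}, threshold=3, always_restore=True)
second = patched.on_breakpoint(memory, BASE, 0x40170F)
print(first, second, memory[:2].hex())
```

With the stock behaviour the first hit is only emulated; with the branch patched, the very first hit writes the original bytes back, which is all the unpacker needs.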
How To
When unpacking a (real?) protection you need to write cool toolz. Here are all the steps I followed:
- Create the game process in a suspended state
- Inject a first (malicious?) DLL into it and resume execution
- This first DLL sets up a hook on CreateProcessA. When the debugger process ( ~e5.0001 ) is about to be created, the hook changes dwCreationFlags to CREATE_SUSPENDED and injects a second DLL into it.
- The first DLL also hooks GetVersionExA to gain execution just after the jump to the real OEP.
- Once GetVersionExA is called, we scan the .text section for 0xCC, and for each one we create a thread at the address of the Nanomite.
- The second DLL patches the branch condition so that the emulated opcode is always written back with WriteProcessMemory, and hooks SetThreadContext to terminate the thread in question instead of letting it continue.
Need a diagram ?
I ran into a little problem during these operations: if we create a thread at an address containing 0xCC followed by a nop (0x90), the Safedisc debugger crashes or emulates shit...
Visual Studio uses the 0xCC, 0x90 and 0x00 opcodes for padding; don't ask me why it doesn't just use 0x00, I don't know.
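A scan that avoids this trap could look like the sketch below. It is a rough heuristic, not my exact tool: it reports each run of 0xCC and skips the runs directly followed by 0x90. The function name is mine, the sample bytes are modelled on the IsBadReadPtr listing, and the trailing "CC 90" pair is invented. A raw byte scan will also match 0xCC bytes that are really operands, so expect false positives.

```python
def find_nanomite_candidates(code, base):
    """Return (address, run_length) for each 0xCC run not followed by 0x90."""
    found = []
    i = 0
    while i < len(code):
        if code[i] != 0xCC:
            i += 1
            continue
        j = i
        while j < len(code) and code[j] == 0xCC:
            j += 1
        followed_by_nop = j < len(code) and code[j] == 0x90
        if not followed_by_nop:
            found.append((base + i, j - i))
        i = j
    return found


# Bytes modelled on the listing above, plus an invented "CC 90" padding pair.
sample = bytes.fromhex("ff15b0315300 cccccccc 8bc0 cc90 c3")
hits = find_nanomite_candidates(sample, 0x4519FC)
print([(hex(addr), length) for addr, length in hits])
```

A run can hide more than one Nanomite, so it is safer to rescan after the debugger has restored the first one than to guess the instruction boundaries.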
Just so you know, if you don't provide the full path of these DLLs when injecting them, the first DLL must be placed in the folder of the game process and the second one in the %temp% path, because the debugger process is extracted and executed there.
You can find the branch instruction inside ~df394b.tmp (SecServ.dll) at address 0x6678F562:
.txt5:6678F562 cmp ax, 1
.txt5:6678F566 jnz not_write_process_memory
Result
Just some debug information :
---
Process id : 894
EventCode : 1
Exception Code : 80000003
Exception Addr : 40170F
---
[+] GetThreadContext(0xB8, 0x635080); return_addr = 66733C55
lpContext->EIP = 7C91120F
[+] WriteProcessMemory(0x5C, 0x40170F, 0x61F58C, 0x2, 0x61F0F0); return_addr = 6672BA45
85 C0
[+] SetThreadContext(0xB8, 0x635080); return_addr = 66733C23
lpContext->EIP = 40170F
---
As you can see, at address 0x40170F an event occurred (0x1 -> EXCEPTION_DEBUG_EVENT) with code 0x80000003 (EXCEPTION_BREAKPOINT), so the debugger process replaces the 0xCC 0xCC with 0x85 0xC0 -> "test eax, eax" and tries to call SetThreadContext, but we hooked it to terminate the thread.
Restoring Imports
As in previous versions, imports point to some virtual address where the code calls a routine to find the correct import.
By using the algo against itself, we can resolve the correct addresses of all imports.
Inside the .text section we can find different types of calls to imports:
- call dword ptr[virtual_addr]
- jmp dword ptr[virtual_addr]
- jmp section Stxt774
The idea is simple: scan the .text section for call dword ptr, jmp dword ptr or jmp section Stxt774, hook the function that resolves the API, get the result and save it into a linked list.
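The scanning half of that idea can be sketched in Python as below. The section ranges come from the section table at the top of the post. The function name and the tuple layout are mine, and a naive byte scan like this one can report false positives when an operand happens to look like an opcode.

```python
import struct

RDATA = range(0x00533000, 0x00533000 + 0x2F497)    # .rdata, from the section table
STXT774 = range(0x01864000, 0x01864000 + 0x2059)   # stxt774, from the section table


def find_import_sites(code, base):
    """Return (address, kind, operand) for each candidate import reference."""
    sites = []
    for i in range(len(code) - 4):
        opcode = code[i]
        if opcode == 0xFF and code[i + 1] in (0x15, 0x25) and i + 6 <= len(code):
            # FF 15 = call dword ptr [imm32], FF 25 = jmp dword ptr [imm32]
            (slot,) = struct.unpack_from("<I", code, i + 2)
            if slot in RDATA:
                kind = "call" if code[i + 1] == 0x15 else "jmp"
                sites.append((base + i, kind, slot))
        elif opcode == 0xE9:
            # E9 rel32 = near jmp; keep it only if it lands in stxt774
            (rel,) = struct.unpack_from("<i", code, i + 1)
            target = (base + i + 5 + rel) & 0xFFFFFFFF
            if target in STXT774:
                sites.append((base + i, "jmp_stxt774", target))
    return sites


# First 6 bytes modelled on the IsBadReadPtr call; the jmp after it is invented.
base = 0x4519FC
jmp = b"\xE9" + struct.pack("<i", 0x01864010 - (base + 6 + 5))
sample = bytes.fromhex("ff15b0315300") + jmp
sites = find_import_sites(sample, base)
print([(hex(a), k, hex(o)) for a, k, o in sites])
```

Filtering on the operand range is what keeps the noise down: a random FF 15 or E9 byte pair rarely points inside .rdata or stxt774.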
This function in question is in ~df394b.tmp :
.txt:6678D644 call resolve_api
.txt:6678D649 pop ecx
.txt:6678D64A pop ecx
Just replace the pop ecx with "add esp, X; ret" and get the result in register eax.
BUT! Sometimes calling the same virtual_addr from another location doesn't resolve to the same API address.
API (0x7E3AC17E) has rdata.0x53327C (txt.0x51A656) rdata.0x53327C (txt.0x454509) rdata.0x53327C (txt.0x454149) rdata.0x533260 (txt.0x453773) rdata.0x53327C (txt.0x4535BD)
API (0x7E39869D) has rdata.0x53329C (txt.0x51A686) rdata.0x53329C (txt.0x50B64E) rdata.0x53329C (txt.0x50B1E4) rdata.0x53327C (txt.0x4FDC5E) rdata.0x53329C (txt.0x4FD7CA) rdata.0x533284 (txt.0x4FD718)
As you can see, the rdata address 0x53327C can resolve to different APIs when it is called from different locations (txt addresses).
The fix is very simple: we reorder the linked list by API address, choose one rdata slot for each API, and change the operand of the call or jmp dword ptr at each txt address of that API.
After reorder
Output after reordering :
API (0x7E3AC17E) has rdata.0x53327C (txt.0x51A656) rdata.0x53327C (txt.0x454509) rdata.0x53327C (txt.0x454149) rdata.0x53327C (txt.0x453773) rdata.0x53327C (txt.0x4535BD)
API (0x7E39869D) has rdata.0x53329C (txt.0x51A686) rdata.0x53329C (txt.0x50B64E) rdata.0x53329C (txt.0x50B1E4) rdata.0x53327C (txt.0x4FDC5E) rdata.0x53329C (txt.0x4FD7CA) rdata.0x533284 (txt.0x4FD718)
We can now write the real address of each API back into its rdata address and fix the call or jmp at each address in the .text section so that it points to the right rdata address.
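Here is a small Python sketch of the reordering, fed with the triples from the first listing. Picking the most used slot for each API is my own tie-break rule for the example; any rule works as long as two APIs never share a slot. The function and variable names are mine.

```python
from collections import Counter, defaultdict

# (api_address, rdata_slot, call_site) triples copied from the first listing above.
RESOLVED = [
    (0x7E3AC17E, 0x53327C, 0x51A656), (0x7E3AC17E, 0x53327C, 0x454509),
    (0x7E3AC17E, 0x53327C, 0x454149), (0x7E3AC17E, 0x533260, 0x453773),
    (0x7E3AC17E, 0x53327C, 0x4535BD),
    (0x7E39869D, 0x53329C, 0x51A686), (0x7E39869D, 0x53329C, 0x50B64E),
    (0x7E39869D, 0x53329C, 0x50B1E4), (0x7E39869D, 0x53327C, 0x4FDC5E),
    (0x7E39869D, 0x53329C, 0x4FD7CA), (0x7E39869D, 0x533284, 0x4FD718),
]


def plan_fixups(resolved):
    """Pick one rdata slot per API and map every call site to that slot."""
    by_api = defaultdict(list)
    for api, slot, _site in resolved:
        by_api[api].append(slot)
    slot_for_api, taken = {}, set()
    for api, slots in by_api.items():
        ranked = [slot for slot, _count in Counter(slots).most_common()]
        free = [slot for slot in ranked if slot not in taken]
        if not free:
            raise ValueError("no free rdata slot left for API %#x" % api)
        slot_for_api[api] = free[0]
        taken.add(free[0])
    site_patches = {site: slot_for_api[api] for api, _slot, site in resolved}
    return slot_for_api, site_patches


slot_for_api, site_patches = plan_fixups(RESOLVED)
for api, slot in slot_for_api.items():
    print(hex(api), "->", hex(slot))
```

slot_for_api says which address to write into each rdata slot, and site_patches says which operand to write at each txt address.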
Now you can look with ImportRec and see that all imports are restored correctly :)
To fix a jmp section Stxt774, we just have to replace the jmp with a call dword ptr[rdata]. But wait, jmp stxt774 is 5 bytes and we need 6 bytes for call dword ptr. Don't worry: after resolving the API and returning into it, the API returns to jmp stxt774 + 6, so there is enough room.
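The byte patch itself is tiny. A sketch in Python, with names of my own and an invented 8-byte buffer standing in for the code around the jmp:

```python
import struct


def encode_call_indirect(slot):
    """Encode `call dword ptr [slot]` (FF 15 imm32), 6 bytes long."""
    return b"\xFF\x15" + struct.pack("<I", slot)


def patch_jmp_stxt774(code, offset, slot):
    """Overwrite the 5-byte jmp at offset, plus the byte after it, in place."""
    if code[offset] != 0xE9:
        raise ValueError("expected a near jmp at this offset")
    code[offset:offset + 6] = encode_call_indirect(slot)


# Invented buffer: a 5-byte jmp, one junk byte, then two bytes of real code.
buf = bytearray(bytes.fromhex("e911223344 aa 85c0"))
patch_jmp_stxt774(buf, 0, 0x53327C)
print(buf.hex())
```

The sixth byte is free to overwrite precisely because execution resumes at jmp stxt774 + 6.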
And Import Reconstructor is happy (Invalid imports 0) :
Emulated opcodes
After fixing the Nanomites and restoring the imports, I ran into one last problem:
.text:00404909 push ecx
.text:0040490A push eax
.text:0040490B call sub_4013F3
.text:0040490B sub_404909 endp ; sp-analysis failed
.text:0040490B
.text:004013F3 mov eax, 1E1Bh
.text:004013F8 pop ecx
.text:004013F9 lea eax, [eax+ecx]
.text:004013FC mov eax, [eax]
.text:004013FE jmp eax
This code just computes an address in the .text section, reads the value stored at this address and jumps to it. The jump destination is an address inside ~df394b.tmp.
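Worked out on the listing above: "pop ecx" picks up the return address of the call, so the pointer is read from return address + 0x1E1B. A two-line sketch in Python (the function name is mine):

```python
CALL_SITE = 0x0040490B  # address of `call sub_4013F3` in the listing
CALL_LEN = 5            # E8 rel32
DELTA = 0x1E1B          # constant loaded by `mov eax, 1E1Bh`


def pointer_slot(call_site, delta=DELTA):
    """Address whose DWORD is read and then jumped to by sub_4013F3."""
    return_address = call_site + CALL_LEN
    return (return_address + delta) & 0xFFFFFFFF


print(hex(pointer_slot(CALL_SITE)))
```

For this call site the pointer lives at 0x40672B, inside the .text section as expected.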
The goal of sub 0x6673E090 is simply to check where it has been called from, look up a table of emulated opcodes and restore the right one.
Here only one emulation is performed, then the original opcode is written back.
As with restoring imports, we find each reference to the sub 0x00404909, set up a hook at the end of the sub 0x6673E090, call each reference, and the emulated opcodes are restored automatically :)
Conclusion
Safedisc v3 is really not difficult; you can find the source of all my code at the end of this post.
I will go back to my school project, hopefully graduating this year :)