Unpackme I am Famous
Introduction
Since I obviously have not posted on this blog for a while, here is a post about Safedisc v3.
Last week I was studying this protection in depth, each component under IDA, but I accidentally broke my external hard drive by hitting it. I lost a lot of .idb files from different games, software and malware, along with my personal toolz, unpackers, ...
So to smile again I decided to write about how to unpack this protection.
For those familiar with Safedisc, the only interesting part will be the Nanomites; restoring imports or emulated opcodes is a joke when you know how older versions work.
Extra data
In the introduction I mentioned different components; they are stored at the end of the file.
The size of the target game is 1 830 912 bytes, but if we look closely at the IMAGE_SECTION_HEADER entries :
Name     VirtSize   VirtAddr   SizeRaw    PtrRaw     Flags      Pointing Directories
-------------------------------------------------------------------------------------------
.text    00131148h  00401000h  00132000h  00001000h  60000020h
.rdata   0002F497h  00533000h  00030000h  00133000h  40000040h  Debug Data
.data    012CDCE8h  00563000h  0000D000h  00163000h  C0000040h
.rsrc    0003289Eh  01831000h  00033000h  00170000h  40000040h  Resource Table
stxt774  00002059h  01864000h  00003000h  001A4000h  E0000020h
stxt371  00003358h  01867000h  00004000h  001A7000h  E0000020h  Import Table
                                                                Import Address Table
If we sum PtrRaw and SizeRaw of the last section, stxt371 :
>>> 0x1A7000 + 0x4000
1748992
>>> hex(0x1A7000 + 0x4000)
'0x1ab000'
1 748 992 bytes != 1 830 912 bytes.
Clearly there is some extra data at the end of the file.
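The same check can be written as a small Python sketch. The section values are copied from the listing above; the function name `overlay_range` is only an illustrative choice, not something from the original tools.

```python
# Section table as (name, PtrRaw, SizeRaw), copied from the listing above.
SECTIONS = [
    (".text",   0x001000, 0x132000),
    (".rdata",  0x133000, 0x030000),
    (".data",   0x163000, 0x00D000),
    (".rsrc",   0x170000, 0x033000),
    ("stxt774", 0x1A4000, 0x003000),
    ("stxt371", 0x1A7000, 0x004000),
]
FILE_SIZE = 1830912  # size of the game executable on disk


def overlay_range(sections, file_size):
    """Return (start, size) of the data stored after the last section."""
    end_of_sections = max(ptr + size for _, ptr, size in sections)
    return end_of_sections, file_size - end_of_sections


start, size = overlay_range(SECTIONS, FILE_SIZE)
print("extra data starts at %#x and is %#x bytes long" % (start, size))
```

Anything past the end of the last section is not mapped by the loader, which is why the protection has to read it back from the file itself.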
Looking at the main executable under IDA, I found an interesting sub that locates and extracts this data.
First, here is the structure used for the extra data :
struct extra_data
{
    DWORD sig_1;
    DWORD sig_2;
    DWORD num_file;
    DWORD offset_1;
    DWORD offset_2;
    DWORD unknow_1;
    DWORD unknow_2;
    BYTE name[0xD];
};
sig_1 must always be set to 0xA8726B03 and sig_2 to 0xEF01996C.
After removing all their (weak?) obfuscation, we get the following "pseudo code" that walks the additional data :
do
{
    /* read one encrypted header at the current position */
    SetFilePointer(hFile, actual_pos, NULL, FILE_BEGIN);
    ReadFile(hFile, buff, 0x121, &bread, 0);
    /* the keystream is seeded with the file offset of the header */
    key = actual_pos;
    for (i = 0; i < bread; i++)
    {
        key = key * 0x13C6A5;
        key += 0x0D8430DED;
        buff[i] ^= (((((key >> 0x10) ^ (key >> 0x8)) ^ (key >> 0x18)) ^ (key & 0xFF)) & 0xFF);
    }
    memcpy(&data, buff, sizeof(struct extra_data));
    print_data_info(&data);
    /* skip to the next header */
    actual_pos += data.offset_1 + data.offset_2;
} while (data.sig_1 == 0xA8726B03 && data.sig_2 == 0xEF01996C);
Result :
Name : ~def549.tmp
Num : 1
Name : clcd32.dll
Num : 1100
Name : clcd16.dll
Num : 1100
Name : mcp.dll
Num : 1101
Name : SECDRV.SYS
Num : 2
Name : DrvMgt.dll
Num : 2
Name : SecDrv04.VxD
Num : 11
Name : ~e5.0001
Num : 0
Name : PfdRun.pfd
Num : 0
Name : ~df394b.tmp
Num : 0
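For readers without the Windows toolchain, here is a minimal Python sketch of the header decryption shown in the pseudo code above. The names `xor_header` and `parse_header` are hypothetical, and the sketch assumes the seven DWORDs and the 13-byte name are stored back to back with no padding, which is what the 41 bytes of fields in the struct suggest.

```python
import struct

HEADER_FMT = "<7I13s"                      # seven DWORDs + name[0xD]
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 41 bytes
SIG_1, SIG_2 = 0xA8726B03, 0xEF01996C


def xor_header(buf, file_pos):
    """XOR buf with the keystream seeded by its file offset (its own inverse)."""
    key = file_pos & 0xFFFFFFFF
    out = bytearray(buf)
    for i in range(len(out)):
        key = (key * 0x13C6A5 + 0xD8430DED) & 0xFFFFFFFF
        out[i] ^= ((key >> 16) ^ (key >> 8) ^ (key >> 24) ^ key) & 0xFF
    return bytes(out)


def parse_header(buf, file_pos):
    """Decrypt and unpack one header; return None if the signatures do not match."""
    plain = xor_header(buf[:HEADER_SIZE], file_pos)
    sig_1, sig_2, num, off_1, off_2, _unk_1, _unk_2, name = struct.unpack(HEADER_FMT, plain)
    if (sig_1, sig_2) != (SIG_1, SIG_2):
        return None
    return {
        "num": num,
        "offset_1": off_1,
        "offset_2": off_2,
        "name": name.split(b"\0")[0].decode("ascii", "replace"),
    }
```

Because the cipher is a plain XOR with a position-dependent keystream, the same function encrypts and decrypts, which makes it easy to test against a hand-built header.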
As you can see we can extract a lot of files, and here is the algorithm to decipher them :
ptr = (BYTE*)GlobalAlloc(GPTR, data.offset_1);
SetFilePointer(hFile, actual_pos - data.offset_1, NULL, FILE_BEGIN);
ReadFile(hFile, ptr, data.offset_1, &bread, NULL);
if (bread != data.offset_1)
{
    printf("[-] ReadFile() failed\n");
    exit(1);
}
/* key and init_key are both DWORDs */
init_key = 0x8142FEA1;
for (i = 0; i < bread; i++)
{
    key = init_key ^ 0x7F6D09ED;
    /* XOR the byte with the four bytes of the key folded together */
    ptr[i] ^= ((key >> 0x18) ^ (key >> 0x10) ^ (key >> 0x8) ^ key) & 0xFF;
    /* feed the decrypted byte back into the running state */
    init_key = (init_key << 0x8) + ptr[i];
}
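The file decryption loop above translates to Python as follows. `decrypt_payload` mirrors the C loop; `encrypt_payload` is a hypothetical inverse that I derived from it only so the sketch can be checked with a round trip, it does not come from the protection.

```python
STATE_SEED = 0x8142FEA1
STATE_MASK = 0x7F6D09ED


def _fold(key):
    """XOR the four bytes of a 32-bit value together."""
    return ((key >> 24) ^ (key >> 16) ^ (key >> 8) ^ key) & 0xFF


def decrypt_payload(data, seed=STATE_SEED):
    """Decrypt an extracted file; the state is fed with the plaintext bytes."""
    state = seed
    out = bytearray()
    for cipher_byte in data:
        plain_byte = cipher_byte ^ _fold(state ^ STATE_MASK)
        out.append(plain_byte)
        state = ((state << 8) + plain_byte) & 0xFFFFFFFF
    return bytes(out)


def encrypt_payload(data, seed=STATE_SEED):
    """Hypothetical inverse of decrypt_payload, used only for testing."""
    state = seed
    out = bytearray()
    for plain_byte in data:
        out.append(plain_byte ^ _fold(state ^ STATE_MASK))
        state = ((state << 8) + plain_byte) & 0xFFFFFFFF
    return bytes(out)
```

Since the state only keeps the last four plaintext bytes, a corrupted byte garbles at most the next four bytes of output before the stream resynchronizes.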
Each component is extracted into the %temp% path. They each have their own purpose, but we will not study all of them, there is no point.
- ~def549.tmp, a DLL, whose goal is to run different anti-debug techniques (not interesting), check files on the CD-ROM, ...
- ~e5.0001, an executable; this process debugs the main executable to manage the Nanomites.
- PfdRun.pfd, no particular type; this file is deciphered to build the instruction table used for emulated opcodes.
- ~df394b.tmp, another DLL; it loads and deciphers sections from other DLLs, and manages debug events for the ~e5.0001 process.
I will not say more about all this stuff: having lost all my idb files, I am bored of reversing (renaming subs) again and again with all this shitty C++ stuff. You can find some fun crypto where they decipher the pfd file or the code sections (a modified Rijndael, various XOR operations). Anyway, let's continue !
Find OEP
This is the easiest part :
stxt371:018670A2 mov ebx, offset start
stxt371:018670A7 xor ecx, ecx
stxt371:018670A9 mov cl, ds:byte_186703D
stxt371:018670AF test ecx, ecx
stxt371:018670B1 jz short loc_18670BF
stxt371:018670B3 mov eax, offset loc_1867113
stxt371:018670B8 sub eax, ebx
stxt371:018670BA sub eax, 5
stxt371:018670BD jmp short loc_18670CD
stxt371:018670BF ; ---------------------------------------------------------------------------
stxt371:018670BF
stxt371:018670BF loc_18670BF: ; CODE XREF: start+13j
stxt371:018670BF push ecx
stxt371:018670C0 mov ecx, offset loc_1867159
stxt371:018670C5 mov eax, ecx
stxt371:018670C7 sub eax, ebx
stxt371:018670C9 add eax, [ecx+1]
stxt371:018670CC pop ecx
stxt371:018670CD
stxt371:018670CD loc_18670CD: ; CODE XREF: start+1Fj
stxt371:018670CD mov byte ptr [ebx], 0E9h
stxt371:018670D0 mov [ebx+1], eax
This code replaces the module entry point with a jump to the real OEP, so if you like using OllyDbg, execute the first instructions and put a breakpoint on that jump.
But you will run into a "deadlock" problem: before jumping to the real OEP, it deciphers sections, loads DLLs AND calls CreateProcess on "~e5.0001", giving the pid of the game process as argument.
This process loads ~df394b.tmp aka SecServ.dll. All strings inside this dll are encrypted, but we can decrypt all of them :
int decrypt_func_01(char *mem_alloc, char *addr_to_decrypt)
{
    DWORD count;
    DWORD key;
    char actual;

    if (mem_alloc && addr_to_decrypt)
    {
        count = 0;
        key = 0x522CFDD0;
        while (1)
        {
            actual = *addr_to_decrypt++;
            actual = actual ^ (char)key;
            *mem_alloc++ = actual;
            key = 0xA065432A - 0x22BC897F * key;
            if (!actual)
                break;
            if (count != 127)
            {
                count++;
                continue;
            }
            return 0;
        }
        return 1;
    }
    else
        return 0;
}
Here is the result of all the deciphered strings :
Addr = 667A9240 : drvmgt.dll
Addr = 667A9264 : secdrv.sys
Addr = 667A9298 : SecDrv04.VxD
Addr = 667A92BC : ALT_
Addr = 667A9C78 : Kernel32
Addr = 667AA71C : \\.\NTICE
Addr = 667AA73C : \\.\SICE
Addr = 667AA75C : \\.\SIWVID
Addr = 667AAB80 : .text
Addr = 667A9928 : Ntdll.dll
Addr = 667A9948 : Kernel32
Addr = 667AA3F0 : GetVersionExA
Addr = 667AA6BC : ZwQuerySystemInformation
Addr = 667AA6EC : NtQueryInformationProcess
Addr = 667AA780 : IsDebuggerPresent
Addr = 667AAB50 : ZwQuerySystemInformation
Addr = 667AADBC : ExitProcess
Addr = 667A99F8 : DeviceIoControl
Addr = 667A9A40 : CreateFileA
Addr = 667A9A64 : ReadProcessMemory
Addr = 667A9A8C : WriteProcessMemory
Addr = 667A9AB8 : VirtualProtect
Addr = 667A9AE0 : CreateProcessA
Addr = 667A9B08 : CreateProcessW
Addr = 667A9B30 : GetStartupInfoA
Addr = 667A9B58 : GetStartupInfoW
Addr = 667A9B80 : GetSystemTime
Addr = 667A9BA4 : GetSystemTimeAsFileTime
Addr = 667A9BD4 : TerminateProcess
Addr = 667A9BFC : Sleep
Addr = 667AB8C0 : WriteProcessMemory
Addr = 667AB8EC : FlushInstructionCache
Addr = 667AB918 : VirtualProtect
Addr = 667ABB90 : SetThreadContext
Addr = 667ABBB8 : GetThreadContext
Addr = 667ABBE0 : SuspendThread
Addr = 667ABB64 : FlushInstructionCache
Addr = 667ABB38 : WriteProcessMemory
Addr = 667ABC84 : ContinueDebugEvent
Addr = 667ABB0C : DebugActiveProcess
Addr = 667ABAE4 : WaitForDebugEvent
Addr = 667A99F8 : DeviceIoControl
Addr = 667ACF00 : System\CurrentControlSet\Services\VxD
Addr = 667ACF5C : cmapieng.vxd
Addr = 667ACF3C : StaticVxD
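The string routine decrypt_func_01 shown earlier can be ported to Python in a few lines, which is handy for scripting the decryption outside the debugger. This is a sketch: `decrypt_string` follows the C code, while `encrypt_string` is a hypothetical helper added only to produce test input.

```python
SEED = 0x522CFDD0


def _next_key(key):
    """One step of the key schedule used by decrypt_func_01."""
    return (0xA065432A - 0x22BC897F * key) & 0xFFFFFFFF


def decrypt_string(blob, max_len=128):
    """Return the decrypted string, or None if no NUL shows up in max_len bytes."""
    key = SEED
    out = bytearray()
    for cipher_byte in blob[:max_len]:
        plain_byte = cipher_byte ^ (key & 0xFF)
        key = _next_key(key)
        if plain_byte == 0:
            return out.decode("ascii", "replace")
        out.append(plain_byte)
    return None


def encrypt_string(text):
    """Hypothetical inverse, used only to build test data."""
    key = SEED
    out = bytearray()
    for plain_byte in text.encode("ascii") + b"\0":
        out.append(plain_byte ^ (key & 0xFF))
        key = _next_key(key)
    return bytes(out)
```

Every string restarts from the same seed, so identical plaintexts always encrypt to identical bytes inside the dll.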
The most interesting things are DebugActiveProcess, ContinueDebugEvent, WriteProcessMemory, FlushInstructionCache, SetThreadContext.
As I said earlier, this dll is in charge of debugging the game process, which prevents us from debugging it with Olly or any other Ring3 debugger.
After calling CreateProcess, the game process waits (WaitForSingleObject) for a signal: the temp executable attaches to it, signals it, and keeps debugging it. But if you are already debugging the game process, the temp executable cannot attach and WaitForSingleObject never receives this signal.
All the code below can be found inside ~df394b.tmp aka SecServ.dll :
.text:667250C1
.text:667250C1 loc_667250C1: ; CODE XREF: sub_66724FDE+D5j
.text:667250C1 push 0FFFFFFFFh ; dwMilliseconds
.text:667250C3 push edi ; hHandle
.text:667250C4 call ds:WaitForSingleObject
.text:667250CA push [ebp+hObject] ; hObject
.text:667250CD mov [ebp+return_value], eax
.text:667250D0 call esi ; CloseHandle
.text:667250D2 push edi ; hObject
.text:667250D3 call esi ; CloseHandle
.text:667250D5 cmp [ebp+return_value], 0 ; WAIT_OBJECT_0
.text:667250D9 pop edi
.text:667250DA pop esi
.text:667250DB jz short exit_func
.text:667250DD call ebx ; GetLastError
.text:667250DF call Exit_Process
.text:667250E4
.text:667250E4 exit_func: ; CODE XREF: sub_66724FDE+11j
.text:667250E4 ; sub_66724FDE+20j ...
.text:667250E4 pop ebx
.text:667250E5 leave
.text:667250E6 retn
.text:667250E6 sub_66724FDE endp
.text:667250E6
If you want to use OllyDbg, put a breakpoint on the WaitForSingleObject call, change the dwMilliseconds argument to something other than INFINITE, and flip the ZF flag when the return value is tested.
Nanomites
Now the fun stuff can start. If you followed what I said, you can continue to debug the game process, but at some point you will run into something like this :
.text:004519FC call ds:dword_5331B0 ; kernel32.IsBadReadPtr
.text:004519FC ; ---------------------------------------------------------------------------
.text:00451A02 db 0CCh
.text:00451A03 db 0CCh ;
.text:00451A04 db 0CCh ;
.text:00451A05 db 0CCh ;
What are these 0xCC (int 3), aka Trap to Debugger or software breakpoint, doing after a call to a kernel32 API ?
It's a well known technique: instructions are replaced by this opcode and information about the removed opcodes is stored in a table. (Remember the pfd file ?)
Then, by using self-debugging, when one of these breakpoints is hit, the debugger process handles the debug exception and looks up information about the breakpoint.
- Is it a Nanomite ?
- Yes ! So I have to emulate the removed opcode
- And restore the context of the thread correctly
But the problem is, if a Nanomite is executed many times, it hurts performance a little, right ? (Not so much today.) So Safedisc counts how many times each Nanomite is executed, and if a Nanomite is executed too many times, it restores the replaced opcodes by writing them back into the debugged process.
So if we want to fix these Nanomites, we just have to patch the branch instruction that says "This Nanomite has been executed too many times, restore the opcode !", scan the .text section of the game process to find all the Nanomites, and trigger each of them; the debugger process will then restore all the removed opcodes :).
How To
When unpacking a (real?) protection you need to write cool toolz. Here are all the steps I followed :
- Create the game process in suspended state.
- Inject a first (malicious?) dll into it and continue execution.
- This first dll sets up a hook on CreateProcessA: when the debugger process ( ~e5.0001 ) is about to be created, the hook changes dwCreationFlags to CREATE_SUSPENDED and injects a second dll into it.
- A second hook from the first dll is set up on GetVersionExA to gain execution just after the jump to the real OEP.
- Once GetVersionExA is called, we scan the .text section for 0xCC, and for each one we create a thread at the address of the Nanomite.
- The second dll patches the branch condition so that the emulated opcode is always written back with WriteProcessMemory, and hooks SetThreadContext to terminate the thread in question instead of letting it continue.
Need a diagram ?
I ran into a little problem during these operations: if we create a thread at an address containing 0xCC followed by a nop (0x90), the Safedisc debugger crashes or emulates shit...
Visual Studio uses 0xCC, 0x90 and 0x00 opcodes for padding; don't ask me why they don't just use 0x00 only, I don't know.
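The scan itself is easy to prototype on a dump of the .text section. The sketch below is my own simplification, not the code of the first dll: it reports the start of each run of 0xCC bytes and drops runs that are immediately followed by 0x90, which matches the padding case described above. The function name and the "start of run" policy are assumptions.

```python
def find_nanomite_candidates(code, base):
    """Return the address of each 0xCC run that is not followed by a NOP."""
    hits = []
    i = 0
    while i < len(code):
        if code[i] != 0xCC:
            i += 1
            continue
        start = i
        # Walk to the end of this run of 0xCC bytes.
        while i < len(code) and code[i] == 0xCC:
            i += 1
        if i < len(code) and code[i] == 0x90:
            continue  # 0xCC followed by NOP: treated as compiler padding
        hits.append(base + start)
    return hits


dump = bytes([0x90, 0xCC, 0xCC, 0x85, 0xC0, 0xCC, 0x90, 0xCC])
print([hex(addr) for addr in find_nanomite_candidates(dump, 0x401000)])
```

A byte scan like this also matches 0xCC bytes that live inside instruction operands, so the candidates still have to be validated (here, by letting the debugger process decide whether the address is a known Nanomite).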
Just so you know, if you don't provide the full path of these dlls when injecting them, the first dll must be placed in the folder of the game process, and the second one in the %temp% path, because the debugger process is extracted and executed there.
You can find the branch instruction inside ~df394b.tmp (SecServ.dll) at addr 0x6678F562 :
.txt5:6678F562 cmp ax, 1
.txt5:6678F566 jnz not_write_process_memory
Result
Just some debug information :
---
Process id : 894
EventCode : 1
Exception Code : 80000003
Exception Addr : 40170F
---
[+] GetThreadContext(0xB8, 0x635080); return_addr = 66733C55
lpContext->EIP = 7C91120F
[+] WriteProcessMemory(0x5C, 0x40170F, 0x61F58C, 0x2, 0x61F0F0); return_addr = 6672BA45
85 C0
[+] SetThreadContext(0xB8, 0x635080); return_addr = 66733C23
lpContext->EIP = 40170F
---
As you can see, at address 0x40170F an event 0x1 (EXCEPTION_DEBUG_EVENT) occurred with code 0x80000003 (EXCEPTION_BREAKPOINT), so the debugger process replaces the 0xCC 0xCC with 0x85 0xC0 ("test eax, eax") and tries to call SetThreadContext, but we hooked it to terminate the thread.
Restoring Imports
Like in previous versions, imports point to some virtual address where the code calls a routine to find the correct import.
By using the algorithm against itself we can resolve the correct addresses of all imports.
Inside the .text section we can find different types of calls to imports :
- call dword ptr[virtual_addr]
- jmp dword ptr[virtual_addr]
- jmp section Stxt774
The idea is simple: scan the .text section for call dword ptr, jmp dword ptr or jmp section Stxt774, hook the function that resolves the api, get the result and save it into a linked list.
The function in question is in ~df394b.tmp :
.txt:6678D644 call resolve_api
.txt:6678D649 pop ecx
.txt:6678D64A pop ecx
Just replace the pop ecx with "add esp, X; ret" and get the result in register eax.
BUT ! Sometimes calling the same virtual_addr from another location does not resolve to the same API address.
API (0x7E3AC17E) has rdata.0x53327C (txt.0x51A656) rdata.0x53327C (txt.0x454509) rdata.0x53327C (txt.0x454149) rdata.0x533260 (txt.0x453773) rdata.0x53327C (txt.0x4535BD)
API (0x7E39869D) has rdata.0x53329C (txt.0x51A686) rdata.0x53329C (txt.0x50B64E) rdata.0x53329C (txt.0x50B1E4) rdata.0x53327C (txt.0x4FDC5E) rdata.0x53329C (txt.0x4FD7CA) rdata.0x533284 (txt.0x4FD718)
As you can see, the rdata address 0x53327C can resolve to different APIs when it is called from different locations (txt addresses).
To fix it, it's very simple: we reorder the linked list by api address, choose one rdata slot for each api, and change the operand of the call or jmp dword ptr at each txt address so that every entry of an api uses that slot.
After reorder
Output after reordering :
API (0x7E3AC17E) has rdata.0x53327C (txt.0x51A656) rdata.0x53327C (txt.0x454509) rdata.0x53327C (txt.0x454149) rdata.0x53327C (txt.0x453773) rdata.0x53327C (txt.0x4535BD)
API (0x7E39869D) has rdata.0x53329C (txt.0x51A686) rdata.0x53329C (txt.0x50B64E) rdata.0x53329C (txt.0x50B1E4) rdata.0x53327C (txt.0x4FDC5E) rdata.0x53329C (txt.0x4FD7CA) rdata.0x533284 (txt.0x4FD718)
We can now write the real address of the api back into its rdata address and fix the call or jmp at each address in the .text section to point to the right rdata address.
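The reordering step can be sketched in Python with plain dictionaries instead of a linked list. The input tuples are taken from the first output above; the function name and the "most used slot wins" rule are my assumptions, the original tool may pick the slot differently.

```python
from collections import Counter, defaultdict


def plan_import_fixes(resolved):
    """resolved: iterable of (api_addr, rdata_slot, call_site) tuples.

    Returns (slot_owner, patches): slot_owner maps each chosen rdata slot to
    its api address, patches lists (call_site, new_slot) operands to rewrite.
    """
    by_api = defaultdict(list)
    for api, slot, site in resolved:
        by_api[api].append((slot, site))

    slot_owner, patches = {}, []
    for api, refs in by_api.items():
        ranked = [s for s, _ in Counter(slot for slot, _ in refs).most_common()]
        # Most used slot that is still free; raises StopIteration if none is.
        chosen = next(s for s in ranked if s not in slot_owner)
        slot_owner[chosen] = api
        patches.extend((site, chosen) for slot, site in refs if slot != chosen)
    return slot_owner, patches


RESOLVED = [
    (0x7E3AC17E, 0x53327C, 0x51A656), (0x7E3AC17E, 0x53327C, 0x454509),
    (0x7E3AC17E, 0x53327C, 0x454149), (0x7E3AC17E, 0x533260, 0x453773),
    (0x7E3AC17E, 0x53327C, 0x4535BD),
    (0x7E39869D, 0x53329C, 0x51A686), (0x7E39869D, 0x53329C, 0x50B64E),
    (0x7E39869D, 0x53329C, 0x50B1E4), (0x7E39869D, 0x53327C, 0x4FDC5E),
    (0x7E39869D, 0x53329C, 0x4FD7CA), (0x7E39869D, 0x533284, 0x4FD718),
]
slots, patches = plan_import_fixes(RESOLVED)
```

Each entry of `slots` is then written into rdata, and each entry of `patches` gives a call site whose 4-byte operand has to be replaced with the chosen slot address.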
Now you can look with ImportRec and see that all imports are restored correctly :)
To fix jmp section Stxt774, we just have to replace the jmp with a call dword ptr[rdata]. But wait, jmp stxt774 is 5 bytes and we need 6 bytes for call dword ptr. Don't worry: after resolving the api and returning into it, the api returns to jmp stxt774 + 6, so there is enough room.
And Import Reconstructor is happy (Invalid imports 0) :
Emulated opcodes
After fixing Nanomites and restoring imports, I encountered one last problem.
.text:00404909 push ecx
.text:0040490A push eax
.text:0040490B call sub_4013F3
.text:0040490B sub_404909 endp ; sp-analysis failed
.text:0040490B
.text:004013F3 mov eax, 1E1Bh
.text:004013F8 pop ecx
.text:004013F9 lea eax, [eax+ecx]
.text:004013FC mov eax, [eax]
.text:004013FE jmp eax
This code just computes an address in the .text section, reads the value stored at this address and jumps to it. The jump destination is an address in ~df394b.tmp.
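To make the address computation concrete, here is the arithmetic done by the stub, as a tiny Python sketch. The return address popped into ecx is the address of the call plus its 5 bytes; the function name is hypothetical.

```python
CALL_SIZE = 5  # E8 + rel32


def emulation_slot(call_addr, delta=0x1E1B):
    """Address of the dword that the stub reads and jumps through."""
    return_addr = call_addr + CALL_SIZE  # value popped into ecx by the stub
    return (return_addr + delta) & 0xFFFFFFFF


# For the call at 0x0040490B shown in the listing above.
print(hex(emulation_slot(0x0040490B)))
```

So for the listing above, the stub reads the pointer stored at 0x40672B, and that pointer is what leads into ~df394b.tmp.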
The goal of sub 0x6673E090 is simply to check where it was called from, look up a table of emulated opcodes and restore the original one.
Here only one emulation is performed, then it writes the original opcode back.
Like for restoring imports, we find each reference to sub 0x00404909, set up a hook at the end of sub 0x6673E090, call each reference, and the emulated opcodes are restored automatically :)
Conclusion
Safedisc v3 is really not difficult. You can find the source of all my code at the end of this post.
I will go back to my school project, hopefully graduating this year :)
Sources
Injector
First dll
Second dll
Binary Auditing Training Package unpackme_03
Introduction
Today I was bored, so I decided to have fun with http://www.binary-auditing.com/.
And I found a very fun challenge among the unpacking exercises :
Unpackme
Packer starts here :
seg009:00411800 start proc near
seg009:00411800 call $+5
seg009:00411805 xor ebp, ebp
seg009:00411807 pop ebp
seg009:00411808 sub ebp, offset word_401A1E
seg009:0041180E xor ebx, ebx
seg009:00411810 lea eax, byte_401A4D[ebp]
seg009:00411816
seg009:00411816 loc_411816: ; CODE XREF: start+1Fj
seg009:00411816 inc bl
seg009:00411818 mov cl, [eax]
seg009:0041181A xor cl, bl
seg009:0041181C cmp cl, 55h
seg009:0041181F jnz short loc_411816
seg009:00411821 mov ecx, 55Ah
seg009:00411826 lea esi, byte_401A4D[ebp]
seg009:0041182C mov edi, esi
seg009:0041182E
seg009:0041182E loc_41182E: ; CODE XREF: start+32j
seg009:0041182E lodsb
seg009:0041182F xor al, bl
seg009:00411831 stosb
seg009:00411832 loop loc_41182E
The first loop computes the key for the XOR operation; ebx will be equal to 0x77.
The second loop decrypts the first stage of the packer (0x55A bytes) with the key stored in bl.
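Both loops fit in a few lines of Python. This is a sketch of the logic in the listing above, with hypothetical function names: the first function increments a byte counter until the first encrypted byte decodes to 0x55 (the opcode of push ebp), the second applies that key to the stage.

```python
def find_xor_key(first_byte, marker=0x55):
    """Emulate the 'inc bl / xor / cmp cl, 55h' loop and return bl."""
    bl = 0
    while True:
        bl = (bl + 1) & 0xFF  # inc bl wraps around like the 8-bit register
        if first_byte ^ bl == marker:
            return bl


def decrypt_stage1(blob, length=0x55A):
    """Return (key, decrypted bytes) for the first stage of the packer."""
    key = find_xor_key(blob[0])
    return key, bytes(b ^ key for b in blob[:length])
```

The brute force is only there for show: the key is simply the first encrypted byte XOR 0x55, so a key of 0x77 means that byte is 0x22 in the packed file.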
Next the packer resolves the base address of kernel32.dll: it reads the current structured exception handling (SEH) frame from fs:[0], takes an address inside kernel32 from the SEH handler, and walks backward in memory from this address looking for the 'MZ' and 'PE' signatures.
At this point it has the base address of kernel32.dll.
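The backward scan can be illustrated on a byte buffer standing in for process memory. This is only a sketch of the general technique: the 0x1000 step and all the names are my assumptions, I did not check which granularity the packer really uses. The one fixed fact is that e_lfanew, the offset of the 'PE' header, lives at offset 0x3C of the 'MZ' header.

```python
import struct


def find_image_base(memory, region_base, addr, step=0x1000):
    """Walk backward from addr until a valid MZ/PE header pair is found."""
    cur = addr & ~(step - 1)  # align down to the scan granularity
    while cur >= region_base:
        off = cur - region_base
        if memory[off:off + 2] == b"MZ" and off + 0x40 <= len(memory):
            (e_lfanew,) = struct.unpack_from("<I", memory, off + 0x3C)
            pe_off = off + e_lfanew
            if memory[pe_off:pe_off + 4] == b"PE\0\0":
                return cur
        cur -= step
    return None
```

Checking both signatures avoids stopping on a stray "MZ" string somewhere in the middle of the module.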
Then it parses the PE header of this dll, gets the export name table and searches for GlobalAlloc().
It allocates some space and copies different portions of code into it. We will come back to the analysis of this code later (some of it is there for api resolution during the main execution).
To avoid wasting time analysing every copied portion of code, we set a memory breakpoint on access on the code section and run our debugger.
We land here :
00157BFC AD LODS DWORD PTR DS:[ESI]
00157BFD 35 DEC0ADDE XOR EAX,DEADC0DE
00157C02 AB STOS DWORD PTR ES:[EDI]
00157C03 ^ E2 F7 LOOPD SHORT 00157BFC
00157C05 C3 RET
At this point ecx is equal to 0x1E00 and the raw size of the code section is 0x7800 (0x1E00 dwords * 4 bytes), so it is deciphering the whole code section with 0xDEADC0DE as XOR key.
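If you prefer to decipher the section statically from the file on disk, the LODS/XOR/STOS loop is equivalent to the following Python sketch (the function name is hypothetical). The key is applied to little-endian dwords, exactly `count` of them, and anything after that is left untouched.

```python
import struct


def xor_dwords(data, count, key=0xDEADC0DE):
    """XOR the first `count` little-endian dwords of data with key."""
    fmt = "<%dI" % count
    words = struct.unpack_from(fmt, data)
    return struct.pack(fmt, *(w ^ key for w in words)) + data[count * 4:]


# ecx = 0x1E00 dwords covers the whole 0x7800-byte code section.
print(hex(0x1E00 * 4))
```

As with any XOR, running the function twice gives back the original bytes, so the same helper can re-encrypt a patched section.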
Disable the memory breakpoint on access and go to the ret, then do the operation again (set a memory breakpoint on access), and we land here :
00157BEB 8DBD BF6F4000 LEA EDI,DWORD PTR SS:[EBP+406FBF]
00157BF1 8B85 A71F4000 MOV EAX,DWORD PTR SS:[EBP+401FA7]
00157BF7 8B0F MOV ECX,DWORD PTR DS:[EDI]
00157BF9 81C1 00004000 ADD ECX,400000 ; ASCII "MZP"
00157BFF C601 E8 MOV BYTE PTR DS:[ECX],0E8
00157C02 83C1 05 ADD ECX,5
00157C05 50 PUSH EAX
00157C06 2BC1 SUB EAX,ECX
00157C08 8941 FC MOV DWORD PTR DS:[ECX-4],EAX
00157C0B 58 POP EAX
00157C0C 81C7 88000000 ADD EDI,88
00157C12 837F 04 00 CMP DWORD PTR DS:[EDI+4],0
00157C16 ^ 75 DF JNZ SHORT 00157BF7
00157C18 C3 RET
Do you recognize this operation ?
Opcode 0xE8, add 5 ? It is crafting a call.
The destination of the call (eax) goes to the first part of virtual memory I talked about; we will call it "api address solving".
The packer sets up a call redirection for each API.
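The arithmetic behind "0xE8, add 5" is the usual relative call encoding: the 4-byte operand is the target minus the address of the next instruction. A short Python sketch (hypothetical function name) reproduces the bytes of the crafted call that the post follows next, CALL 00158C70 located at 004085D4.

```python
import struct


def encode_call(site, target):
    """Encode 'call target' placed at address site (E8 + rel32)."""
    rel = (target - (site + 5)) & 0xFFFFFFFF  # relative to the next instruction
    return b"\xE8" + struct.pack("<I", rel)


print(encode_call(0x004085D4, 0x00158C70).hex())
```

This is the same computation the packer does with `SUB EAX,ECX` after `ADD ECX,5`.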
The next memory breakpoint on access will land us here :
004085D4 E8 9706D5FF CALL 00158C70
A call crafted just before, let's follow it.
00158C70 9C PUSHFD
00158C71 60 PUSHAD
00158C72 E8 00000000 CALL 00158C77
00158C77 5D POP EBP
00158C78 81ED 1C1C4000 SUB EBP,401C1C
00158C7E 8BBD B01C4000 MOV EDI,DWORD PTR SS:[EBP+401CB0]
00158C84 8B7424 24 MOV ESI,DWORD PTR SS:[ESP+24]
00158C88 83EE 05 SUB ESI,5
00158C8B 81EE 00004000 SUB ESI,400000 ; ASCII "MZP"
00158C91 81EF 88000000 SUB EDI,88
00158C97 81C7 88000000 ADD EDI,88
00158C9D 3B37 CMP ESI,DWORD PTR DS:[EDI]
00158C9F ^ 75 F6 JNZ SHORT 00158C97
00158CA1 68 2680ACC8 PUSH C8AC8026
00158CA6 FFB5 B41C4000 PUSH DWORD PTR SS:[EBP+401CB4]
00158CAC E8 A4000000 CALL 00158D55
00158CB1 8D4F 48 LEA ECX,DWORD PTR DS:[EDI+48]
00158CB4 8BD1 MOV EDX,ECX
00158CB6 E8 3B000000 CALL 00158CF6
00158CBB 51 PUSH ECX
00158CBC FFD0 CALL EAX
00158CBE 93 XCHG EAX,EBX
00158CBF 68 EEEAC01F PUSH 1FC0EAEE
00158CC4 FFB5 B41C4000 PUSH DWORD PTR SS:[EBP+401CB4]
00158CCA E8 86000000 CALL 00158D55
00158CCF 8D4F 08 LEA ECX,DWORD PTR DS:[EDI+8]
00158CD2 E8 1F000000 CALL 00158CF6
00158CD7 51 PUSH ECX
00158CD8 53 PUSH EBX
00158CD9 FFD0 CALL EAX
00158CDB 8D4F 08 LEA ECX,DWORD PTR DS:[EDI+8]
00158CDE E8 13000000 CALL 00158CF6
00158CE3 8D4F 48 LEA ECX,DWORD PTR DS:[EDI+48]
00158CE6 E8 0B000000 CALL 00158CF6
00158CEB 894424 1C MOV DWORD PTR SS:[ESP+1C],EAX
00158CEF 61 POPAD
00158CF0 9D POPFD
00158CF1 83C4 04 ADD ESP,4
00158CF4 FFE0 JMP EAX
- At 0x00158C84, it gets the address the call came from.
- It subtracts 5 and the ImageBase from it.
- And looks in the same table (the one used for crafting the calls) for this offset.
- Once it has found this offset in its table, it resolves the address of LoadLibrary (0xC8AC8026 rol 7).
- Before calling 0x00158CF6, it sets edx to the address at offset 0x48 of the table entry.
- 0x00158CF6 just applies a not (~) operation to each byte until it gets a null byte.
- And then it calls LoadLibrary with the string deciphered by the not operation.
- At 0x00158CCA it calls 0x00158D55 again, but with hash 1FC0EAEE, to resolve the address of "GetProcAddress".
- At 0x00158CCF, it deciphers the api name starting at offset 0x8.
- And finally calls GetProcAddress on it.
- All the strings are ciphered again just after, as you can see.
Maybe a diagram will make it clearer; here is an example of a typical call to an api :
So if you followed me, here is the structure for each api :
struct api
{
    DWORD offset;         /* +0x00 */
    DWORD unknow;         /* +0x04 */
    char api_name[0x40];  /* +0x08 */
    char dll_name[0x40];  /* +0x48 */
};
Here is one example :
Writing a test program :
> cat test.c
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* strings from the table, each byte stored as its bitwise NOT */
    char api_name[] = "\xB8\x9A\x8B\xB2\x90\x9B\x8A\x93\x9A\xB7\x9E\x91\x9B\x93\x9A\xBE\xFF\xFF";
    char dll_name[] = "\xB4\xBA\xAD\xB1\xBA\xB3\xCC\xCD\xD1\xBB\xB3\xB3\xFF\xFF";
    size_t i, len;

    /* compute each length once, before the buffer is modified */
    len = strlen(api_name);
    for (i = 0; i < len; i++)
        api_name[i] = ~api_name[i];
    len = strlen(dll_name);
    for (i = 0; i < len; i++)
        dll_name[i] = ~dll_name[i];
    printf("%s\n", api_name);
    printf("%s\n", dll_name);
    return 0;
}
> ./test
GetModuleHandleA
KERNEL32.DLL
This table entry was used to resolve the call to GetModuleHandleA() from kernel32.dll.
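To dump the whole table instead of one entry, the struct api layout and the NOT decoding can be combined in a short Python sketch. The names are hypothetical, and the sketch assumes entries are packed back to back with no padding, which fits the 0x88 stride used by the resolver.

```python
import struct

API_FMT = "<II64s64s"               # offset, unknow, api_name[0x40], dll_name[0x40]
API_SIZE = struct.calcsize(API_FMT)  # 0x88, the stride seen in 'ADD EDI,88'


def unmask(field):
    """Apply NOT to every byte and cut the string at the first NUL."""
    plain = bytes(~b & 0xFF for b in field)
    return plain.split(b"\0")[0].decode("ascii", "replace")


def parse_api_entry(raw):
    """Return (call offset, api name, dll name) for one table entry."""
    offset, _unknown, api_name, dll_name = struct.unpack(API_FMT, raw[:API_SIZE])
    return offset, unmask(api_name), unmask(dll_name)


print(unmask(bytes.fromhex("B4BAADB1BAB3CCCDD1BBB3B3FFFF")))
```

Walking the table is then a matter of slicing the dumped memory every `API_SIZE` bytes until the dword at +4 is zero, the stop condition used by the loop that crafts the calls.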
So to dump our program, we have to rebuild all those redirections; we will write a dll and inject it into the process.
What will the injected code do ?
- Search the code section for call addr, where the content of addr is itself a call into virtual memory (GlobalAlloc()), the "api address solving" code.
- Hook the jmp eax to gain control; we replace it with a jmp ebx.
- Store the result (api address) into the idata section.
- Replace each call into virtual memory with jmp dword ptr[idata_section].
But do we have enough room to replace the call ?
Opcode for jmp dword ptr [0x42424242] = FF 25 42 42 42 42, size : 6, so we have enough room.
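The replacement bytes are easy to generate; here is a minimal Python sketch with a hypothetical helper name. FF 25 is the opcode/ModRM pair for an indirect jmp through an absolute 32-bit address, followed by that address in little-endian order.

```python
import struct


def encode_jmp_indirect(slot):
    """Encode 'jmp dword ptr [slot]' (FF 25 + absolute address of the slot)."""
    return b"\xFF\x25" + struct.pack("<I", slot)


patch = encode_jmp_indirect(0x42424242)
print(patch.hex(), len(patch))
```

Unlike the relative E8 call it replaces, this encoding does not depend on where the instruction sits, only on the address of the idata slot.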
Code for the dll :
#include <stdio.h>
#include <Windows.h>

#define LDE_X86 0

#ifdef __cplusplus
extern "C"
#endif
int __stdcall LDE(void* address , DWORD type);

void fix_call(void)
{
    IMAGE_DOS_HEADER *idh = NULL;
    IMAGE_NT_HEADERS *inh = NULL;
    IMAGE_SECTION_HEADER *ish = NULL;
    IMAGE_SECTION_HEADER *ish_import = NULL;
    DWORD imagebase;
    DWORD imageend;
    FILE *fp = NULL;
    DWORD i;
    BYTE *ptr;
    int val;
    DWORD addr_call;
    DWORD nb_api = 0;

    fp = fopen("debug_msg.txt", "w");
    if (!fp)
    {
        MessageBoxA(NULL, "fopen failed :(", "failed", 0);
        return;
    }
    imagebase = (DWORD)GetModuleHandle(NULL);
    idh = (IMAGE_DOS_HEADER *)imagebase;
    inh = (IMAGE_NT_HEADERS *)((BYTE*)imagebase + idh->e_lfanew);
    /* first section header: the code section */
    ish = (IMAGE_SECTION_HEADER*)((BYTE*)inh + sizeof (IMAGE_NT_HEADERS));
    imageend = imagebase + inh->OptionalHeader.SizeOfImage;
    fprintf(fp, "Image Base : %08X\nImage End : %08X\n", imagebase, imageend);
    /* section header at index 4: used as the idata section for this target */
    ish_import = (IMAGE_SECTION_HEADER*)((BYTE*)inh + sizeof(IMAGE_NT_HEADERS) + sizeof(IMAGE_SECTION_HEADER) * 4);
    fprintf(fp, "Ish Import : %08X\n", imagebase + ish_import->VirtualAddress);
    for (i = 0; i < ish->Misc.VirtualSize; i++)
    {
        ptr = (BYTE*)(imagebase + ish->VirtualAddress + i);
        /* Look for a call */
        if (*(ptr) == 0xE8)
        {
            val = *(ptr + 4) << 0x18 | *(ptr + 3) << 0x10 | *(ptr + 2) << 0x8 | *(ptr + 1);
            val += imagebase + ish->VirtualAddress + i + 5;
            /* Is destination inside code section ? */
            if (val > imagebase + ish->VirtualAddress && val < imagebase + ish->VirtualAddress + ish->Misc.VirtualSize)
            {
                ptr = (BYTE*)val;
                addr_call = val;
                /* Look for a call */
                if (*(ptr) == 0xE8)
                {
                    val = *(ptr + 4) << 0x18 | *(ptr + 3) << 0x10 | *(ptr + 2) << 0x8 | *(ptr + 1);
                    val += (int)ptr + 5;
                    /* Is destination outside the code section ? */
                    if (val < imagebase + ish->VirtualAddress || val > imagebase + ish->VirtualAddress + ish->Misc.VirtualSize)
                    {
                        fprintf(fp, "Call Redirect found at %08X to %08X, ", imagebase + ish->VirtualAddress + i, val);
                        ptr = (BYTE*)val;
                        /* Change JMP EAX (FF E0) to JMP EBX (FF E3) at the end of the resolver */
                        val = imagebase + ish->VirtualAddress + i;
                        *(ptr + 0x85) = 0xE3;
                        /* Run the redirected call; the resolver jumps back to end_api with the api address in eax */
                        __asm
                        {
                            pushad
                            mov ebx, end_api
                            mov eax, val
                            call eax
                        end_api:
                            add esp, 8
                            mov val, eax
                            popad
                        }
                        fprintf(fp, "Addr Api : %08X\n", val);
                        /* Put Api Addr into idata section */
                        ptr = (BYTE*)(imagebase + ish_import->VirtualAddress + 0x58 + nb_api * 8);
                        memcpy(ptr, &val, 4);
                        val = (imagebase + ish_import->VirtualAddress + 0x58 + nb_api * 8);
                        /* Replace call by jmp dword ptr [idata_section] */
                        ptr = (BYTE*)addr_call;
                        *ptr = 0xFF;
                        *(ptr + 1) = 0x25;
                        memcpy(ptr + 2, &val, 4);
                        nb_api++;
                    }
                }
            }
        }
    }
    fclose(fp);
}

BOOL (__stdcall *Resume_GetVersionExA)(LPOSVERSIONINFO lpVersionInfo) = NULL;

BOOL __stdcall Hook_GetVersionExA(LPOSVERSIONINFO lpVersionInfo)
{
    DWORD return_addr;

    /* read our own return address from the stack frame */
    __asm
    {
        mov eax, [ebp + 4]
        mov return_addr, eax
    }
    if (return_addr == 0x00405ABF)
    {
        MessageBoxA(NULL, "ready ?", "Ready ?", 0);
        fix_call();
        /* infinite loop, so that a debugger can be attached */
        __asm jmp $
    }
    return Resume_GetVersionExA(lpVersionInfo);
}

void setup_hook(char *module, char *name_export, void *Hook_func, void *trampo)
{
    DWORD OldProtect;
    DWORD len;
    FARPROC Proc;

    Proc = GetProcAddress(GetModuleHandleA(module), name_export);
    if (!Proc)
    {
        MessageBoxA(NULL, name_export, module, 0);
        return;
    }
    /* copy whole instructions until at least 5 bytes are covered */
    len = 0;
    while (len < 5)
        len += LDE((BYTE*)Proc + len , LDE_X86);
    memcpy(trampo, Proc, len);
    /* trampoline: saved instructions, then a jmp back into the original function */
    *(BYTE *)((BYTE*)trampo + len) = 0xE9;
    *(DWORD *)((BYTE*)trampo + len + 1) = (BYTE*)Proc - (BYTE*)trampo - 5;
    VirtualProtect(Proc, len, PAGE_EXECUTE_READWRITE, &OldProtect);
    /* overwrite the function start with a jmp to the hook */
    *(BYTE*)Proc = 0xE9;
    *(DWORD*)((char*)Proc + 1) = (BYTE*)Hook_func - (BYTE*)Proc - 5;
    VirtualProtect(Proc, len, OldProtect, &OldProtect);
}

BOOL WINAPI DllMain(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpReserved)
{
    /* install the hook only once, when the dll is loaded */
    if (fdwReason == DLL_PROCESS_ATTACH)
    {
        DisableThreadLibraryCalls(GetModuleHandleA("inject_unpackme03.dll"));
        Resume_GetVersionExA = (BOOL(__stdcall *)(LPOSVERSIONINFO))VirtualAlloc(0, 0x1000, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
        memset(Resume_GetVersionExA, 0x90, 0x1000);
        setup_hook("kernel32.dll", "GetVersionExA", &Hook_GetVersionExA, Resume_GetVersionExA);
    }
    return (1);
}
My code uses the Length Disassembly Engine from BeatriX.
Now we code a little injector in masm :
.386
.model flat,stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.data
PInfo       PROCESS_INFORMATION <>
SInfo       STARTUPINFOA <>
Kernel      db "kernel32.dll", 0
LLib        db "LoadLibraryA", 0
exe_name    db "03_unpackme.exe", 0
dll_name    db "inject_unpackme03.dll", 0
Addr_name   dd 0

.code
start:
    invoke GetStartupInfo, addr SInfo
    invoke CreateProcess, addr exe_name, NULL, NULL, NULL, FALSE, CREATE_SUSPENDED, NULL, NULL, addr SInfo, addr PInfo
    invoke VirtualAllocEx, PInfo.hProcess, 0, 100h, MEM_COMMIT, PAGE_READWRITE
    mov [Addr_name], eax
    invoke WriteProcessMemory, PInfo.hProcess, Addr_name, addr dll_name, LENGTHOF dll_name, NULL
    invoke GetModuleHandleA, addr Kernel
    invoke GetProcAddress, eax, addr LLib
    invoke CreateRemoteThread, PInfo.hProcess, NULL, 0, eax, [Addr_name], 0, NULL
    invoke WaitForSingleObject, eax, INFINITE
    invoke ResumeThread, PInfo.hThread
exit:
    invoke ExitProcess, 0
end start
So as you can see, I used a little trick to wait until the executable is fully unpacked :
I set up a hook on GetVersionExA(), and if the call comes from one interesting address (near OEP), I call the "fix_call" function and enter an infinite loop.
With this infinite loop we can attach Olly to our process and watch the result :
It's cool, but wait, I forgot to talk about one thing: finding the real OEP !
Restart OllyDbg, let the loop xor the whole first stage, and set a breakpoint on :
0041196F E8 C2010000 CALL 03_unpac.00411B36
Step into and add a breakpoint on :
00411B61 FF95 4B1D4000 CALL DWORD PTR SS:[EBP+401D4B]
You should land here (Addr can change due to allocated memory) :
00157C44 ^\FFA5 BB1F4000 JMP DWORD PTR SS:[EBP+401FBB]
Trace the code until you get something like this :
00153BC4 9D POPFD
00153BC5 61 POPAD
00153BC6 5A POP EDX
00153BC7 58 POP EAX
00153BC8 E8 D35F0000 CALL 00159BA0
If we step into it, we find something very interesting :
00159BA0 60 PUSHAD
00159BA1 E8 00000000 CALL 00159BA6
00159BA6 5D POP EBP
00159BA7 81ED E21B4000 SUB EBP,401BE2
00159BAD 8B7424 20 MOV ESI,DWORD PTR SS:[ESP+20]
00159BB1 83EE 05 SUB ESI,5
00159BB4 8B9D 111C4000 MOV EBX,DWORD PTR SS:[EBP+401C11]
00159BBA 83EB 28 SUB EBX,28
00159BBD 83C3 28 ADD EBX,28
00159BC0 3973 10 CMP DWORD PTR DS:[EBX+10],ESI
00159BC3 ^ 75 F8 JNZ SHORT 00159BBD
00159BC5 8B73 08 MOV ESI,DWORD PTR DS:[EBX+8]
00159BC8 89B5 0C1C4000 MOV DWORD PTR SS:[EBP+401C0C],ESI
00159BCE 61 POPAD
00159BCF 68 DEC0ADDE PUSH DEADC0DE
00159BD4 C3 RET
This is not api resolution, but call resolution ! This sub is quite simple: like api resolution, it looks up the offset of the call in a table and replaces 0xDEADC0DE with the addr of the (stolen ?) call.
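The lookup done by this sub can be modelled in Python on a dumped copy of the table. The layout used here is read straight from the listing above: entries are 0x28 bytes apart, the dword at +0x10 is compared with the return address minus 5, and the dword at +0x08 is the destination patched over 0xDEADC0DE. The function name is hypothetical, and the original loop has no end-of-table check, which the sketch adds.

```python
import struct

ENTRY_SIZE = 0x28


def resolve_stolen_call(table, return_addr):
    """Return the real destination for the call that pushed return_addr."""
    site = return_addr - 5  # address of the 5-byte call instruction
    for off in range(0, len(table) - ENTRY_SIZE + 1, ENTRY_SIZE):
        (target,) = struct.unpack_from("<I", table, off + 0x08)
        (key,) = struct.unpack_from("<I", table, off + 0x10)
        if key == site:
            return target
    return None  # not in the table (the packer itself would keep scanning)
```

Note that, unlike the api table, the comparison uses the full address of the call site and not an offset relative to the ImageBase.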
I think (I am not sure) the packer has stolen some calls from the original file and rebuilds them with a push addr / ret.
Let's put a conditional log on the ret address ( Expression = "[esp]" ).
We run the program, exit it and look at the log.
00159BD4 COND: 004085D4
00159BD4 COND: 00402128
00159BD4 COND: 004014C4
00159BD4 COND: 0040212C
00159BD4 COND: 00402D44
00159BD4 COND: 004085D4
00159BD4 COND: 00403D20
00159BD4 COND: 00403FA4
00159BD4 COND: 00403394
00159BD4 COND: 004033A4
00159BD4 COND: 00402F2C
00159BD4 COND: 004085B6
00159BD4 COND: 004085AA
00159BD4 COND: 00405BD4
00159BD4 COND: 00405C18
00159BD4 COND: 004066AC
00159BD4 COND: 004066B4
00159BD4 COND: 00406978
00159BD4 COND: 004085D4
7E390000 Module C:\WINDOWS\system32\USER32.DLL
00159BD4 COND: 00405B84
77EF0000 Module C:\WINDOWS\system32\GDI32.dll
Process terminated, exit code 0
If you remember, at the beginning of the article, the second breakpoint on access on the code section landed us on the first entry of our log. Is it the OEP ? I don't think so, it's a call to GetModuleHandleA(), ... strange, ... strange.
If you look closely, there is another strange thing: just before the log line for the loading of module "USER32.dll", we can see a call to 0x004085D4, but this call is just a redirection to GetModuleHandleA, so what happens in between ?
We restart our debugger, put a breakpoint on the ret of the call redirection function and wait until it reaches the last 0x004085D4.
We trace the code, go through "api address solving", put a breakpoint on the JMP EAX, trace into GetModuleHandleA(), and execute till return.
We are back in the virtual memory code, and trace it until we get :
00154119 9D POPFD
0015411A 61 POPAD
0015411B 5A POP EDX
0015411C 58 POP EAX
0015411D FF56 18 CALL DWORD PTR DS:[ESI+18]
[ESI + 18] is equal to 0x00401000. Is it the real OEP ?
00401000 55 PUSH EBP
00401001 8BEC MOV EBP,ESP
00401003 6A 00 PUSH 0
00401005 68 20104000 PUSH 03_unpac.00401020
0040100A 6A 00 PUSH 0
0040100C 68 E7030000 PUSH 3E7
00401011 8B45 08 MOV EAX,DWORD PTR SS:[EBP+8]
00401014 50 PUSH EAX
00401015 E8 56760000 CALL 03_unpac.00408670
0040101A 33C0 XOR EAX,EAX
0040101C 5D POP EBP
0040101D C2 1000 RET 10
CALL 03_unpac.00408670 goes through api resolution and calls DialogBoxParamA(). But wait, the code loads its first argument into eax, so this function needs an argument.
If we look at the msdn documentation, the first parameter of DialogBoxParamA() is a handle to the module whose executable file contains the dialog box template.
So the parameter of this function should be the result of GetModuleHandleA(NULL) (this will be the first stolen-code fix).
A second problem: when we return from DialogBoxParamA and then from sub_00401000, we should return to a function which calls ExitProcess().
Launch the injector, attach Olly to the process, search for references to kernel32.ExitProcess, and we find this sub :
0040669C 55 PUSH EBP
0040669D 8BEC MOV EBP,ESP
0040669F 8B45 08 MOV EAX,DWORD PTR SS:[EBP+8]
004066A2 50 PUSH EAX
004066A3 E8 F01E0000 CALL 03_unpac.00408598 ; JMP to kernel32.ExitProcess
004066A8 5D POP EBP
004066A9 C3 RET
Now we just have to find some room for the stolen bytes; just after all the jmp dword ptr [idata_section] will do.
- Add a call to GetModuleHandleA(NULL) to get the base address required by DialogBoxParamA
- Push the addr of ExitProcess()'s sub
- Push the real OEP
- Ret to it
Now we can dump our process with our favorite dumper toolz, fix the iat with ImportRec and set the new OEP.
We test the dumped file and it works :].
Conclusion
As deroko said, this unpackme is really not difficult, but I enjoyed solving it.