Let’s analyze the PE file in detail and see what it’s up to. Like most malware, this sample is packed, so before we can analyze it properly we must unpack the binary. With this in mind, I began by debugging the file, looking for a reference to the data section that would show precisely where the encrypted data was stored.
Fortunately, I was not disappointed and was soon able to find the reference point.
Figure 1: Obfuscated data |
After further debugging, we are able to see the code decrypted in memory. The decryption occurs in multiple iterations, until the data is completely decrypted.
Figure 2: Decryption of obfuscated data |
Now we have a full view of the decrypted code in memory. The portion that was decrypted contains position-independent code (i.e., shellcode).
Figure 3: Decrypted data in memory |
Since the code is decrypted in memory, we can assume that control will be transferred to that region at some point, which in this case happens immediately. We can also see that the VirtualProtectEx API is used to change the protection of the memory region, so that the malware can execute and modify it.
Figure 4: Change memory protection |
After this, control is transferred to the region by a JMP EDI instruction. EDI holds the address at which EIP (the instruction pointer) lands, and we can see that it is the same region of code that was decrypted earlier.
Figure 5: Control transferred to new code |
There's an interesting bit of code here if we look at the first few instructions of the region we landed in. We can see a NOP instruction, followed by SUB EAX,EAX, a CALL, and a POP EBX. If we look carefully at the address being called, it is that of the POP EBX itself. This is a common technique in shellcode and file infectors, where the code needs to find the address of the region currently being executed.
When this CALL is executed, it pushes the return address onto the stack (in this case, the address of POP EBX). POP EBX then executes and pops that value from the top of the stack into EBX. The constant 0x33 is then added to this address so that it points to the region that the decryption loop decrypts next. This reveals more code, after which a JMP instruction transfers control to the newly revealed code.
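To make the CALL/POP trick concrete, here is a toy Python model of what happens on the stack. It is only a sketch: the address 0x1000 is made up, and I am assuming the CALL is the usual 5-byte near CALL whose target is the very next instruction. Only the 0x33 offset comes from the sample.

```python
# Toy model of the CALL/POP "get current address" trick.
# Assumption: a 5-byte near CALL (opcode E8 + 4-byte displacement) that
# targets the instruction immediately after it, i.e. the POP EBX.
CALL_SIZE = 5


def get_pc_via_call_pop(call_address):
    """Return the value EBX holds after CALL next / POP EBX."""
    stack = []
    return_address = call_address + CALL_SIZE  # address of the POP EBX
    stack.append(return_address)  # CALL pushes the return address
    ebx = stack.pop()             # POP EBX takes it back off the stack
    return ebx


def encrypted_region_start(call_address, offset=0x33):
    """Add the constant from the sample (0x33) to the recovered address."""
    return get_pc_via_call_pop(call_address) + offset


if __name__ == "__main__":
    # 0x1000 is a hypothetical address, chosen only for illustration.
    print(hex(get_pc_via_call_pop(0x1000)))
    print(hex(encrypted_region_start(0x1000)))
```

The point is that the code never needs to know where it was loaded: whatever address the CALL sits at, EBX ends up holding a pointer into the running code.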
Further, I was able to identify another interesting piece of code here. The code below retrieves the address of the PEB (process environment block) and navigates to PEB_LDR_DATA->InLoadOrderModuleList, where it retrieves the names of the loaded modules (DLLs).
Figure 6: Fetch base address of kernel32.dll |
There's another catch here. The malware looks for specific DLLs (in this case kernel32.dll), but instead of comparing the string kernel32.dll against the module names retrieved from the PEB, it carries a hash of each DLL name, calculates the hash of each retrieved module name, and compares the two. This keeps recognizable strings out of the binary, which helps the malware avoid some antivirus rules.
Figure 7: DLL name hash |
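The lookup can be sketched in a few lines of Python. Note the assumptions: I am not claiming this is the hash the sample uses. The rotate-right-by-13 additive hash below is simply one that is widely seen in shellcode. The module list is a hypothetical stand-in for InLoadOrderModuleList, and the names are treated as uppercase ASCII for simplicity. Only the kernel32.dll base of 0x7c800000 comes from the analysis.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    value &= 0xFFFFFFFF
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def name_hash(name):
    """Assumed ROR-13 additive hash over the uppercased ASCII name."""
    h = 0
    for byte in name.upper().encode("ascii"):
        h = (ror32(h, 13) + byte) & 0xFFFFFFFF
    return h


def find_module(loaded_modules, wanted_hash):
    """Walk (name, base) pairs in load order and match by hash only."""
    for name, base in loaded_modules:
        if name_hash(name) == wanted_hash:
            return base
    return None


# Hypothetical load-order list; only the kernel32.dll base is from the sample.
MODULES = [
    ("sample.exe", 0x00400000),
    ("ntdll.dll", 0x7C900000),
    ("kernel32.dll", 0x7C800000),
]

if __name__ == "__main__":
    wanted = name_hash("KERNEL32.DLL")  # the only value the code would carry
    base = find_module(MODULES, wanted)
    print(hex(wanted), hex(base) if base is not None else None)
```

The string kernel32.dll never has to appear in the shellcode itself; only the 32-bit constant does.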
Once the malware finds kernel32.dll, it retrieves the module's base address, which in this case is 0x7c800000. Then, following the PE file format, the malware moves to the export table of kernel32.dll, as illustrated in the code below.
Figure 8: Finding the export address table of kernel32.dll |
In the code above, the instruction MOV EBX, DWORD PTR DS:[EAX+78] lands us at the export table entry of the data directory of kernel32.dll. The malware retrieves the value stored there (an RVA) and adds it to the image base (i.e., 0x7c800000) to reach the export table, where it looks up the addresses of exported functions. Here too, the malware never uses function names; instead it uses stored hashes.
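For readers wondering where the 0x78 comes from: assuming EAX points at the PE signature of a 32-bit (PE32) image, the offset is 4 bytes of signature, plus 20 bytes of file header, plus 96 bytes of optional header fields that precede the data directory. The Python sketch below performs the same lookup on a synthetic header. The helper make_fake_image and the RVA and size values in it are made up for illustration; only the 0x7c800000 image base is taken from the sample.

```python
import struct

IMAGE_BASE = 0x7C800000  # kernel32.dll base reported in the analysis


def export_directory_va(image, image_base):
    """Return (virtual address, size) of the export directory.

    image holds the module headers as mapped in memory, starting at offset 0.
    """
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C of the DOS header points to the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # PE32 only: 4 (signature) + 20 (file header) + 96 (optional header) = 0x78
    export_rva, export_size = struct.unpack_from("<II", image, e_lfanew + 0x78)
    return image_base + export_rva, export_size


def make_fake_image(e_lfanew=0xF0, export_rva=0x2000, export_size=0x100):
    """Build a minimal synthetic header; all values here are hypothetical."""
    image = bytearray(0x200)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, e_lfanew)
    image[e_lfanew:e_lfanew + 4] = b"PE\0\0"
    struct.pack_into("<II", image, e_lfanew + 0x78, export_rva, export_size)
    return bytes(image)


if __name__ == "__main__":
    va, size = export_directory_va(make_fake_image(), IMAGE_BASE)
    print(hex(va), hex(size))
```

The value read from the header is relative to the image base, which is why the malware has to add 0x7c800000 before it can dereference it.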
After further analysis, we stumble onto another piece of code, which copies data again from the data section to a newly allocated memory region.
Figure 9: Copy more data |
Investigating further, we see that this data is decrypted to reveal what looks like some sort of an address table.
Figure 10: Address table |
The table is significant because it is used to calculate the address of the region from which the malware copies a bulk of data. That data is then decrypted to form what looks like a compressed file.
Figure 11: Compressed data |
And there it is. Moving ahead, we land in the decompression routine, which quickly reveals that the data is compressed using “aPLib”.
Figure 12: Aplib decompression routine |
Once decompression is complete, the malware performs a familiar sequence: it clears out the bytes of the original EXE file starting at the image base 0x400000 and copies the decompressed data to its new image base (i.e., 0x400000).
Figure 13: Copy decompressed PE file |
Finally, using “LoadLibraryEx” and “GetProcAddress”, the IAT is rebuilt in memory, after which control is transferred to the new code at address 0x401021.
Figure 14: Rebuild IAT in memory |
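The IAT rebuild follows the same pattern in most unpackers: load each imported DLL, resolve each imported function, and store the resulting address. The Python sketch below models that loop with stand-in resolvers in place of the real Windows APIs. Everything in FAKE_EXPORTS, including the addresses, is hypothetical. The two function names are simply APIs that show up later in this analysis, not a claim about the sample's actual import list.

```python
# Stand-in export data; the addresses are made up for illustration.
FAKE_EXPORTS = {
    "kernel32.dll": {"GetTempPathA": 0x7C801000},
    "shell32.dll": {"ShellExecuteA": 0x7CA02000},
}


def fake_load_library(name):
    """Stand-in for LoadLibraryEx: return a handle, or None on failure."""
    key = name.lower()
    return key if key in FAKE_EXPORTS else None


def fake_get_proc_address(handle, function_name):
    """Stand-in for GetProcAddress: return an address, or None."""
    return FAKE_EXPORTS[handle].get(function_name)


def rebuild_iat(imports, load_library, get_proc_address):
    """Resolve every (dll, function) pair and return the filled-in table."""
    iat = {}
    for dll, functions in imports.items():
        handle = load_library(dll)
        if handle is None:
            raise LookupError(f"cannot load {dll}")
        for function_name in functions:
            address = get_proc_address(handle, function_name)
            if address is None:
                raise LookupError(f"{dll} does not export {function_name}")
            iat[(dll, function_name)] = address
    return iat


if __name__ == "__main__":
    wanted = {"kernel32.dll": ["GetTempPathA"], "shell32.dll": ["ShellExecuteA"]}
    table = rebuild_iat(wanted, fake_load_library, fake_get_proc_address)
    for (dll, function_name), address in table.items():
        print(dll, function_name, hex(address))
```

In the sample, once this step finishes, execution continues in the unpacked code at 0x401021.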
The job of this code is limited. It writes a PE file embedded within itself to the temporary folder as “Adobe.exe”, using the API “GetTempPathA” to locate that folder.
Figure 15: Transfer control to OEP |
In the end, the file (Adobe.exe) is dropped in the temp folder and executed using the API “ShellExecuteA”.
Figure 16: Execute dropped "Adobe.exe" |
A dummy PDF file named “Bestellung.pdf” is also written to the current directory. In a subsequent blog post, we will see why the malware dropped this PDF file.
That’s all for now. In the next post, we’ll continue the analysis of the dropped file “Adobe.exe”.