In the last few months, we have seen many blog posts on PDF exploits that use filters such as “ASCIIHexDecode” and “FlateDecode” to avoid antivirus detection. Attackers chain different filters to hide malicious data so that it is difficult to read and decode. We have encountered many PDF exploits that use either the “[/FlateDecode /ASCIIHexDecode]” or the “[/FlateDecode /ASCII85Decode]” filter chain. As defined by gnupdf, “the ASCIIHexDecode filter decodes data that has been encoded in ASCII hexadecimal form” and “the ASCII85Decode filter decodes data that has been encoded in ASCII base-85 encoding and produces binary data”. Interestingly, we have found another case, which Zscaler blocks, in which both filters are used in the same PDF on different objects. This technique can be used to hide malicious code inside the PDF.
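To make the filter chains concrete, here is a minimal Python sketch that undoes them using only the standard library. It is an illustration, not the code of any particular tool: the names `ascii_hex_decode`, `ascii85_decode`, `DECODERS` and `decode_stream` are made up for this example, and the payload is a harmless stand-in. Filters are applied in the order they are listed in the /Filter array, so “[/FlateDecode /ASCIIHexDecode]” means inflate first, then hex-decode.

```python
import base64
import binascii
import zlib


def ascii_hex_decode(data):
    """ASCIIHexDecode: pairs of hex digits, whitespace ignored, '>' marks the end."""
    body = data.split(b">", 1)[0]
    digits = b"".join(body.split())
    if len(digits) % 2:
        digits += b"0"  # an odd final digit is treated as if followed by 0
    return binascii.unhexlify(digits)


def ascii85_decode(data):
    """ASCII85Decode: base-85 text terminated by the '~>' marker."""
    body = data.strip()
    if body.startswith(b"<~"):
        body = body[2:]
    if body.endswith(b"~>"):
        body = body[:-2]
    return base64.a85decode(body)


# Hypothetical lookup table covering only the three filters discussed here.
DECODERS = {
    "FlateDecode": zlib.decompress,
    "ASCIIHexDecode": ascii_hex_decode,
    "ASCII85Decode": ascii85_decode,
}


def decode_stream(raw, filters):
    """Apply each filter in the order it appears in the /Filter array."""
    for name in filters:
        raw = DECODERS[name](raw)
    return raw


# Build a harmless stand-in stream: hex-encode first, then deflate.
payload = b"app.alert(1);"
stream = zlib.compress(binascii.hexlify(payload) + b">")
print(decode_stream(stream, ["FlateDecode", "ASCIIHexDecode"]))
```

Neither encoding adds any secrecy by itself; the point of stacking them is only that a scanner which does not unwrap every layer never sees the script.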
The sample is still live on the web. Let’s open it in Notepad and search for the “ASCIIHexDecode” and “ASCII85Decode” filters to see whether they are used. The screenshot shows where the “ASCIIHexDecode” filter is used.
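The same search can be scripted instead of done by eye in a text editor. The sketch below is a rough, regex-based scan that assumes a simple, uncompressed object layout; `find_filters` and both patterns are hypothetical names for this example, and the embedded bytes are a made-up miniature file, not the real sample. It will miss filter names that are obfuscated (for example with #xx hex escapes) or objects packed inside object streams.

```python
import re

# Filter names of interest, as they appear in an object's dictionary.
FILTER_RE = re.compile(rb"/(ASCIIHexDecode|ASCII85Decode|FlateDecode)\b")
# "<number> <generation> obj ... endobj"
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj\b", re.DOTALL)


def find_filters(pdf_bytes):
    """Return {object number: [filter names]} for objects that name a known filter."""
    hits = {}
    for match in OBJ_RE.finditer(pdf_bytes):
        obj_num = int(match.group(1))
        # Look only at the dictionary, not at the stream data that follows it.
        header = match.group(3).split(b"stream", 1)[0]
        names = [m.group(1).decode("ascii") for m in FILTER_RE.finditer(header)]
        if names:
            hits[obj_num] = names
    return hits


# Made-up miniature file for illustration only.
sample = (
    b"%PDF-1.4\n"
    b"18 0 obj\n<< /Length 4 /Filter [/FlateDecode /ASCIIHexDecode] >>\n"
    b"stream\nxxxx\nendstream\nendobj\n"
    b"20 0 obj\n<< /Length 4 /Filter [/FlateDecode /ASCII85Decode] >>\n"
    b"stream\nyyyy\nendstream\nendobj\n"
)
for number, names in find_filters(sample).items():
    print(number, names)
```

On a real file, read the bytes with `open(path, "rb").read()` and pass them to `find_filters`.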
The screenshot shows something suspicious: an unreadable block of data in object 18, with a length of 19343. The PDF is not blank; it contains four pages of text. The malicious code is injected toward the bottom of the PDF to avoid detection. Let’s decode it to see whether it contains malicious JavaScript. The “pdf-parser.py” tool from PDF Tools supports both filters and easily decodes the data.
The decoded script is shown in the screenshot. It pads one of its variables with special characters such as @, _, ?, ! and $. If we remove these characters from the variable, clear-text malicious JavaScript appears. However, the decoded JavaScript contains no code to replace or remove these characters, and without it the script is incomplete. We therefore need to look for that functionality elsewhere in the PDF file. We later found another object, this one using the “ASCII85Decode” filter, that contained additional suspicious code. Here it is:
Let’s decode this object with the “pdf-parser.py” tool, using the following command:
D:\pdf-parser.py --object=20 --raw --filter withSearch.pdf > out2.log
Here is the decoded script for this filter:
That’s it. It does indeed contain additional malicious JavaScript. This is an interesting case in which the script is split into two parts, each encoded with a different filter and stored in a different object. The attacker does this deliberately to fool antivirus engines and avoid detection. Let’s decode this script to see which PDF vulnerabilities it targets.
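Removing the filler characters from the padded variable is the core of that decoding step. Here is a minimal sketch of the clean-up in Python, assuming the filler set is exactly the five characters named earlier (the real sample may use more) and using a harmless made-up string in place of the actual payload. `FILLER` and `strip_filler` are hypothetical names.

```python
# Characters the article names as padding; the real list may be longer.
FILLER = "@_?!$"


def strip_filler(obfuscated, filler=FILLER):
    """Delete every filler character, leaving the underlying script text."""
    return obfuscated.translate(str.maketrans("", "", filler))


# Harmless stand-in for the padded variable found in the first object.
padded = "a@p_p.a?l!e$rt(1);"
print(strip_filler(padded))
```

One caveat with this approach: because “_” is also a legal character in JavaScript identifiers, stripping it blindly could damage names in a payload that legitimately uses underscores.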
The decoded JavaScript targets three old vulnerabilities:
- Collab.collectEmailInfo() – CVE-2007-5659
- Collab.getIcon() – CVE-2009-0927
- util.printf() – CVE-2008-2992
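Once a script has been decoded and cleaned up, the calls listed above can be spotted mechanically. The following sketch checks a script for those three function names and reports the matching CVE. The mapping comes straight from the list above; `KNOWN_TARGETS` and `match_cves` are hypothetical names, and the test string is a made-up fragment rather than the real payload. A name match only suggests which bug is being targeted; it does not prove the call is exploitable.

```python
import re

# Function name -> CVE, taken from the list above.
KNOWN_TARGETS = {
    "collectEmailInfo": "CVE-2007-5659",
    "getIcon": "CVE-2009-0927",
    "printf": "CVE-2008-2992",
}


def match_cves(script):
    """Return {function name: CVE} for each known function the script calls."""
    found = {}
    for name, cve in KNOWN_TARGETS.items():
        # Match the name as a whole word followed by an opening parenthesis.
        if re.search(r"\b" + re.escape(name) + r"\s*\(", script):
            found[name] = cve
    return found


# Made-up fragment for illustration only.
decoded = 'Collab.getIcon(buf); util.printf("%f", n);'
for name, cve in match_cves(decoded).items():
    print(name, cve)
```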
That’s it for now. Be Safe.
Umesh