In January 2022, Adobe released a security update for vulnerabilities in Adobe Acrobat and Reader. The update fixed five vulnerabilities (CVE-2021-44703, CVE-2021-44708, CVE-2021-44709, CVE-2021-44740, and CVE-2021-44741) discovered by Zscaler’s ThreatLabz. All five existed in the Solid Framework shipped with Adobe Acrobat Pro DC, which Adobe uses to convert PDF files to Microsoft Office files. In this blog, we present our analysis of CVE-2021-44708, a heap-based buffer overflow vulnerability in Adobe Acrobat Pro DC. Foxit’s PDF Editor also uses the Solid Framework to convert PDF files to other file formats and is therefore impacted by this vulnerability as well. Foxit has also released a security update to fix CVE-2021-44708.
Vulnerability Description
CVE-2021-44708 is a heap overflow vulnerability due to the insecure handling of a maliciously crafted file, potentially resulting in arbitrary code execution in the context of the current user. The exploitation of this issue requires user interaction in that a victim must open a malicious file.
Known Affected Software Configurations
- Acrobat DC Continuous 21.007.20099 and earlier versions in Windows
- Acrobat DC Continuous 21.007.20099 and earlier versions in macOS
- Acrobat 2020 Classic 2020 20.004.30017 and earlier versions in Windows & macOS
- Acrobat 2017 Classic 2017 17.011.30204 and earlier versions in Windows & macOS
- Foxit PDF Editor 11.2.0.53415 and all previous 11.x versions, 10.1.6.37749 and earlier
Proof of Concept
The vulnerability can be triggered by opening a malicious PDF file and exporting it to a Microsoft Word document. Zscaler ThreatLabz created a PoC file that triggers the crash. The issue can be reproduced with the following steps:
- Enable Page Heap on Acrobat.exe
- Open the PoC file with Adobe Acrobat Pro DC.
- Click the menu File → Export to → Microsoft Word → Word Document. It will produce the following crash.
Test Environment
Adobe Acrobat Pro DC, Product version: 21.7.20095.60881
PDFLibTool.dll, Product version: 10.0.12082.1
Solid Framework (x86), Product version: 10.0.12082.1
Technical Analysis
The heap-based buffer overflow vulnerability CVE-2021-44708 exists in Solid Framework, a third-party library used by Adobe Acrobat Pro DC and located in the directory C:\Program Files (x86)\Adobe\Acrobat DC\Acrobat\plug_ins\SaveAsNonPDF\Solid. Figure 1 shows a comparison between a properly structured PDF file and a minimized PoC file that triggers this vulnerability.
Figure 1. Comparison between a normal PDF file and the minimized PoC file that triggers CVE-2021-44708
As shown in Figure 1, the single modified byte is located in obj 261. The structure of the minimized PoC PDF file is shown below in Figure 2.
Figure 2. The structure of the minimized PoC PDF file for CVE-2021-44708
In Figure 2, obj 65 uses obj 261 as its content and references obj 72 as its color space. Obj 72 is an ICCBased color space that references obj 73 as its International Color Consortium (ICC) profile. ICCBased color spaces can have 1, 3, or 4 components, as defined by the N value in the dictionary; these typically correspond to Gray (1 component), RGB (3 components), or CMYK (4 components). Here, the parameter “/N 4” indicates that the color space described by the ICC profile data has four color components. The stream data in both obj 261 and obj 73 has been compressed using the deflate algorithm (for more information on deflate, refer to https://tools.ietf.org/html/rfc1951). Zlib is a C library that implements the deflate algorithm, and Adobe Acrobat uses zlib to decompress the deflate-compressed data. An example Python script to decompress the data in obj 261 and obj 73 is shown in Figure 3.
Figure 3. Python script to decompress Zlib stream data
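A minimal sketch of such a decompression step is given here. It assumes the raw bytes between the stream and endstream keywords of an object have already been extracted; the helper name inflate_pdf_stream and the sample content are illustrative and are not taken from the PoC file.

```python
import zlib


def inflate_pdf_stream(raw: bytes) -> bytes:
    """Decompress the body of a deflate-compressed (zlib-wrapped) PDF stream."""
    # A decompression object tolerates trailing bytes (for example the
    # end-of-line marker before "endstream"): they end up in .unused_data
    # instead of raising an error.
    decompressor = zlib.decompressobj()
    return decompressor.decompress(raw) + decompressor.flush()


if __name__ == "__main__":
    # Illustrative content stream, not the PoC data.
    sample = b"0.1 0.2 0.3 0.4 scn\n0 0 10 10 re\nf*\n"
    compressed = zlib.compress(sample) + b"\r\n"
    print(inflate_pdf_stream(compressed).decode("latin-1"))
```

The same helper can be applied to the stream bytes of obj 261 and obj 73 once they have been carved out of the file.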
The uncompressed stream content in obj 261 is a sequence of graphics operators, shown in Figure 4. Many of these operators are invalid or have abnormal operands.
Figure 4. Uncompressed stream content in obj 261 in the minimized PoC file for CVE-2021-44708
The uncompressed stream content in obj 73 is shown in Figure 5. It is an ICC profile that is 0x39D0 bytes long. The ICC Profile Format Specification is available at https://www.color.org/icc32.pdf.
Figure 5. Uncompressed stream content (containing an ICC profile) in obj 73
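To inspect a decompressed profile such as the one in obj 73, the fixed-offset header fields and the tag table can be read with a few lines of Python. The sketch below follows the header layout in the ICC specification (profile size at offset 0, data color space at offset 16, the acsp signature at offset 36, tag table at offset 128). The function names are hypothetical, and the demo profile is synthetic: only its size (0x39D0) and CMYK color space echo the text, while the device class, PCS, and tag entry are placeholders.

```python
import struct


def parse_icc_header(profile: bytes) -> dict:
    """Read a few fixed-offset fields from the 128-byte ICC profile header."""
    if len(profile) < 132:
        raise ValueError("too short to hold an ICC header and tag count")
    (size,) = struct.unpack_from(">I", profile, 0)
    (tag_count,) = struct.unpack_from(">I", profile, 128)
    return {
        "size": size,
        "device_class": profile[12:16].decode("ascii", "replace"),
        "color_space": profile[16:20].decode("ascii", "replace"),
        "pcs": profile[20:24].decode("ascii", "replace"),
        "signature": profile[36:40].decode("ascii", "replace"),
        "tag_count": tag_count,
    }


def list_tags(profile: bytes) -> list:
    """Return (signature, offset, size) for each entry in the tag table."""
    (count,) = struct.unpack_from(">I", profile, 128)
    tags = []
    for i in range(count):
        sig, offset, size = struct.unpack_from(">4sII", profile, 132 + 12 * i)
        tags.append((sig.decode("ascii", "replace"), offset, size))
    return tags


def build_demo_profile() -> bytes:
    """Build a synthetic header plus a one-entry tag table (placeholder values)."""
    header = bytearray(128)
    struct.pack_into(">I", header, 0, 0x39D0)  # profile size from the text
    header[12:16] = b"prtr"                    # placeholder device class
    header[16:20] = b"CMYK"                    # four-component color space
    header[20:24] = b"Lab "                    # placeholder connection space
    header[36:40] = b"acsp"                    # ICC file signature
    tag_table = struct.pack(">I4sII", 1, b"A2B0", 144, 0x100)
    return bytes(header) + tag_table


if __name__ == "__main__":
    demo = build_demo_profile()
    print(parse_icc_header(demo))
    print(list_tags(demo))
```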
The methods CPdfPageContentProcessor::ProcessCommandStreamMultiThread and CPdfPageContentProcessor::ProcessGeneralCommand stand out in the stack backtrace described in the Proof of Concept section. The method CPdfPageContentProcessor::ProcessGeneralCommand(GrStateCommands const &,char const *,std::vector<CPdfLexem> &,int) is responsible for processing the command stream of graphic operators and operands. The following breakpoint can be set to trace the order of the graphic operators processed before the crash occurs.
bu PDFLibTool!CPdfPageContentProcessor::ProcessGeneralCommand ".printf \"CPdfPageContentProcessor::ProcessGeneralCommand hit\\n\"; dps esp L7; db poi(esp+8); dd poi(esp+4) L1; gc; "
This breakpoint is hit thousands of times until the crash occurs. Example output when this breakpoint is hit is shown in Figure 6.
Figure 6. Output of the breakpoint at CPdfPageContentProcessor::ProcessGeneralCommand
The crash occurs when the graphic operator f* is processed. The operator f* is a path-painting operator without operands. The description of path-painting operators is listed in Figure 7.
Figure 7. Path-Painting operators
In Figure 6, there’s one operator c21244.7568469.3 that is close to the operator f* before the crash occurs. The string c21244.7568469.3 occurs once in the uncompressed stream content in obj 261 as shown in Figure 8.
Figure 8. The string c21244.7568469.3 in the uncompressed stream content in obj 261
When the crash occurs, the this pointer points to a heap buffer with a size of 0x15c bytes, as shown in Figure 9. The output of the !heap command shows that the buffer was allocated on a call stack that starts in CPdfPageContentProcessor::ProcessCommandStreamMultiThread and passes through the method CICMConverter::Initialize, which calls malloc to allocate a heap buffer with a size of 0x15c bytes.
Figure 9. The call stack and this pointer when the crash occurs
The following breakpoint is set to trace how the heap buffer is allocated and initialized.
0:002> bm PDFLibTool!CICMConverter::Initialize
1: 67db43e0 @!"PDFLibTool!CICMConverter::Initialize"
2: 67db4460 @!"PDFLibTool!CICMConverter::Initialize"
0:002> bu PDFLibTool!CxManagedImage::Set+0x5e871
When the breakpoint at CICMConverter::Initialize(uchar *,uint,solid::pal::COLORTYPE) is hit, the three function parameters can be inspected as shown in Figure 10.
Figure 10. The parameters of CICMConverter::Initialize(uchar *,uint,solid::pal::COLORTYPE) after the breakpoint is hit
The first parameter is a pointer to a heap buffer that stores the ICC profile data (the uncompressed stream content in obj 73, shown in Figure 5). The second parameter is the size of the ICC profile data. The third parameter is 0x07, which represents the color type CMYK.
When the breakpoint at PDFLibTool!CxManagedImage::Set+0x5e871 is hit, the code pushes 0x15c onto the stack as the argument to malloc, which allocates a heap buffer with a size of 0x15c bytes, as seen in Figure 11.
Figure 11. Allocated heap buffer with the size 0x15c bytes
The corresponding snippet of pseudocode in IDA Pro is shown in Figure 12. After the heap buffer is allocated, the code performs some initialization for the heap buffer. First, it calls the function sub_102B7BC0 to initialize the vtable pointer and other fields. The name of the vftable indicates that the heap buffer is a CIccCLUT object (which represents the ICC Color Lookup Tables).
Figure 12. The relevant snippet of pseudocode in IDA Pro
The code then calls the function sub_102BAA30 to continue the initialization of the heap buffer. As seen in Figure 13, the function sub_102BAA30 calls the function sub_102BAA70 to perform the initialization. At the end of sub_102BAA70, the code allocates a new heap buffer with a size of 0x708c bytes, and then it stores the pointer to the newly allocated heap buffer at the offset 0x64 of the CIccCLUT object.
Figure 13. The pseudocode of the function sub_102BAA70
A memory write breakpoint can be set using the command “ba w4 addr” on the heap buffer. In Figure 12, after the function sub_102BAA30 is called, there is a function call to obtain the starting offset for the tables of the ICC profile file. The return value is 0x17c. Next, the code calls the function sub_102BE280, which converts the tables in the mft2 tag in the ICC profile to a color lookup table. The color lookup table is stored in the newly allocated heap buffer (with a size of 0x708c bytes), as shown in Figure 14.
Figure 14. The ICC profile and color lookup table
Finally, after the initialization of the heap buffer (CIccCLUT object) is completed, the memory layout is similar to the following in Figure 15.
Figure 15. The memory layout of CIccCLUT object
In Figure 6, the crash occurs during the process of handling the graphic operator f*. Since there are over 200 f* operators in the graphic stream, the following conditional breakpoint can be set to trace the malformed operator c21244.7568469.3.
bu PDFLibTool!CPdfPageContentProcessor::ProcessGeneralCommand ".if(poi(poi(esp+8))=0x32313263) {.printf \"CPdfPageContentProcessor::ProcessGeneralCommand hit\\n\"; dps esp L7; db poi(esp+8); dd poi(esp+4) L1;} .else {gc; } "
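The constant 0x32313263 in the condition is the first four characters of the operator string, c212, read as a little-endian 32-bit value, which is what poi() returns on this x86 target. A short sketch showing how such a constant can be derived for any operator prefix (the helper name is hypothetical):

```python
def dword_for_prefix(token: bytes) -> int:
    """Value of the first four bytes of a string when read as a little-endian dword."""
    if len(token) < 4:
        raise ValueError("need at least four bytes")
    return int.from_bytes(token[:4], "little")


if __name__ == "__main__":
    print(hex(dword_for_prefix(b"c21244.7568469.3")))
```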
The operator f* of interest is close to the operator c21244.7568469.3 in the graphic stream that is shown in Figure 8. Once the conditional breakpoint is hit, the breakpoint at PDFLibTool!CPdfPageContentProcessor::ProcessGeneralCommand can be set again without the condition, so that execution stops on each subsequent operator.
bu PDFLibTool!CPdfPageContentProcessor::ProcessGeneralCommand ".printf \"CPdfPageContentProcessor::ProcessGeneralCommand hit\\n\"; dps esp L7; db poi(esp+8); dd poi(esp+4) L1; "
The breakpoint is hit several times until the following output in Figure 16 is reached.
Figure 16. Output of breakpoint reached at PDFLibTool!CPdfPageContentProcessor::ProcessGeneralCommand
At this stage, the code that processes the f* operator can be analyzed. In the Proof of Concept section, the stack backtrace contains the method CCSICCBased::ConvertToRGB, which is called before the crash occurs. This method converts a CCSICCBased color into an RGB color. In obj 73 (in Figure 2), the parameter “/N 4” indicates the number of color components in the color space described by the ICC profile. The method therefore converts CMYK (4 components) into RGB by means of the ICC color lookup tables (CLUT).
Therefore, a breakpoint can be set at the method PDFLibTool!CCSICCBased::ConvertToRGB.
bm PDFLibTool!CCSICCBased::ConvertToRGB
When this breakpoint is hit, the output shown in Figure 17 is expected.
Figure 17. Breakpoint triggered at the method PDFLibTool!CCSICCBased::ConvertToRGB
The method PDFLibTool!CCSICCBased::ConvertToRGB takes three parameters. The second parameter is a pointer to an array of four double values: 98929, 5511.01, 962.467, and 5. These four numbers are the operands for the operator scn. The description of the scn operator is listed in Figure 18.
Figure 18. The scn (set color) operator definition
The method CCSICCBased::ConvertToRGB calls the method CICMConverter::Convert(struct tagRGBTRIPLE *retstr, const double *a2), whose second parameter points to a stack buffer that stores the four double values (98929, 5511.01, 962.467, 5). The method CICMConverter::Convert converts the double array to a float array, as shown in Figure 19.
Figure 19. The pseudocode of the method CICMConverter::Convert
The memory layout of the float array is shown in Figure 20.
Figure 20. The memory layout of the float array [98929, 5511.01, 962.467, 5]
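The byte patterns to look for in a memory dump can be reproduced outside the debugger. The sketch below packs the four operands as IEEE 754 single-precision values in little-endian order, which is the representation a float array has on x86; the helper names are illustrative.

```python
import struct

OPERANDS = [98929.0, 5511.01, 962.467, 5.0]


def float32_words(values):
    """Return each value's single-precision bit pattern as a 32-bit integer."""
    return [struct.unpack("<I", struct.pack("<f", v))[0] for v in values]


def float32_bytes(values) -> bytes:
    """Return the values as they would be laid out in a little-endian float array."""
    return struct.pack("<%df" % len(values), *values)


if __name__ == "__main__":
    for value, word in zip(OPERANDS, float32_words(OPERANDS)):
        print(f"{value:>10} -> 0x{word:08x}")
    print(float32_bytes(OPERANDS).hex(" "))
```

Values such as 5511.01 and 962.467 are not exactly representable in single precision, so the stored floats are the nearest representable values rather than the exact decimals.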
Tracing the function call flow, the crash occurs in the function sub_102BB250(int this, int a2, float *a3). The pseudocode of sub_102BB250 is shown in Figure 21.
Figure 21. The pseudocode of sub_102BB250
When the program reaches the function sub_102BB250, the memory layout of the function parameters and the this pointer is as shown in Figure 22.
Figure 22. The memory layout of the parameters and this pointer of sub_102BB250
As seen in Figure 22, the this pointer points to a CIccCLUT object with a size of 0x15c bytes. The pointer stored at offset 0x64 points to the heap buffer of the color lookup table, whose size is 0x708c bytes. In Figure 21, we can see that the program performs some arithmetic on the input values before dereferencing the variable v22.
The four numbers (98929, 5511.01, 962.467, 5) passed to the scn operator are malformed, causing the code to calculate an offset in the color lookup tables that far exceeds the length of the heap buffer that stores the color lookup tables. As a result, when the code dereferences the variable v22, a heap buffer overflow occurs.
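The effect of unchecked operands on a lookup-table offset can be illustrated with a simplified model. This is not the decompiled Solid Framework code: it assumes a row-major table with 7 grid points per input, 4 inputs, and 3 float outputs per node. That layout is an inference, chosen only because 7^4 × 3 × 4 = 28812 = 0x708c matches the buffer size reported above; all names are hypothetical.

```python
GRID_POINTS = 7       # assumed grid resolution per input channel
INPUT_CHANNELS = 4    # CMYK
OUTPUT_CHANNELS = 3   # assumed number of output values per grid node
ENTRY_SIZE = 4        # assumed 32-bit float entries

# 7**4 * 3 * 4 == 28812 == 0x708c
TABLE_BYTES = GRID_POINTS ** INPUT_CHANNELS * OUTPUT_CHANNELS * ENTRY_SIZE


def clut_byte_offset(components):
    """Byte offset of the grid node selected by the input components.

    No range check is performed, to model the class of bug described above.
    """
    index = 0
    for value in components:
        node = int(value * (GRID_POINTS - 1))  # truncate to a grid coordinate
        index = index * GRID_POINTS + node     # row-major node index
    return index * OUTPUT_CHANNELS * ENTRY_SIZE


def clamp_unit(components):
    """Clamp each component into the range 0..1 before the lookup."""
    return [min(1.0, max(0.0, c)) for c in components]


if __name__ == "__main__":
    malformed = [98929, 5511.01, 962.467, 5]
    print(hex(TABLE_BYTES))
    print(clut_byte_offset([1.0, 1.0, 1.0, 1.0]))      # last node, inside the table
    print(clut_byte_offset(malformed))                 # far beyond the table
    print(clut_byte_offset(clamp_unit(malformed)))     # back inside after clamping
```

In this model, clamping (or rejecting) the components before the index computation keeps every access inside the table, which is the kind of validation the malformed operands bypass.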
Figure 23 compares the malformed operands with the operands of a valid scn operator. Each operand value should fall within the range 0 to 1.
Figure 23. Comparison between normal operands and malformed operands for the scn operator
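A decompressed content stream can be screened for this pattern with a small scanner. The sketch below is a simplification under stated assumptions: it splits on whitespace only, so it ignores strings, arrays, inline images, and comments, and it treats 0 to 1 as the valid range for every component. The function name and sample stream are illustrative.

```python
import re

# PDF numeric tokens: optional sign, digits with an optional decimal point.
NUMBER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)")
COLOR_OPERATORS = {"sc", "scn", "SC", "SCN"}


def find_out_of_range_colors(content: str, low: float = 0.0, high: float = 1.0):
    """Return (operator, operands) pairs whose numeric operands leave [low, high]."""
    findings = []
    operands = []
    for token in content.split():
        if NUMBER.fullmatch(token):
            operands.append(float(token))
            continue
        if token in COLOR_OPERATORS and any(not low <= v <= high for v in operands):
            findings.append((token, list(operands)))
        operands = []  # any non-numeric token ends the current operand run
    return findings


if __name__ == "__main__":
    stream = "0.1 0.2 0.3 0.4 scn\n98929 5511.01 962.467 5 scn\n0 0 10 10 re\nf*\n"
    for operator, values in find_out_of_range_colors(stream):
        print(operator, values)
```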
To summarize, the PoC PDF file contains malformed operands passed to the scn operator in the graphics stream. As a result, the scn operator sets a malformed color, which the f* operator then uses to fill the path using the even-odd rule. When the f* operator is processed, the code converts the CMYK color (4 components) into RGB by means of Color Lookup Tables (CLUTs), and the out-of-range components cause a heap buffer overflow.
Mitigation
All users of Adobe Acrobat and Reader are encouraged to upgrade to the latest version of this software. Zscaler’s Advanced Threat Protection and Advanced Cloud Sandbox can protect customers against this vulnerability.
References
https://helpx.adobe.com/security/products/acrobat/apsb22-01.html
https://www.foxit.com/support/security-bulletins.html
https://blog.idrsolutions.com/2010/05/understanding-the-pdf-file-format-color/
https://gregstoll.com/~gregstoll/floattohex/
https://www.color.org/icc32.pdf
About ThreatLabz
ThreatLabz is the security research arm of Zscaler. This world-class team is responsible for hunting new threats and ensuring that the thousands of organizations using the global Zscaler platform are always protected. In addition to malware research and behavioral analysis, team members are involved in the research and development of new prototype modules for advanced threat protection on the Zscaler platform, and regularly conduct internal security audits to ensure that Zscaler products and infrastructure meet security compliance standards. ThreatLabz regularly publishes in-depth analyses of new and emerging threats on its portal, research.zscaler.com.