CVE-2025-66516: XXE vulnerability exposes Apache Tika

A critical vulnerability, CVE-2025-66516 (CVSS 10.0), has been identified in Apache Tika, affecting how the framework processes PDF files containing XFA (XML Forms Architecture) data. The vulnerability resides in tika-core, which means any system using Tika’s default parsing behavior remains vulnerable even if the PDF parser module was previously patched.

No special configuration or insecure application code is required; simply ingesting a malicious PDF is enough to trigger the exploit. In vulnerable versions, Tika processes attacker-controlled XFA content in a way that allows unauthorized access to sensitive files or internal resources during parsing, making this a high-impact issue for any workflow that handles user-supplied PDFs.

What Is CVE-2025-66516?

Risk Analysis

Severity: CRITICAL
CVSSv4.0: Base Score: 10.0 CRITICAL
Vector: CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:H/SI:H/SA:H

Exploit available in public: No
Exploit complexity: Low

CVE-2025-66516 is a critical XXE (XML External Entity) vulnerability caused by improper handling of XML embedded in the XFA layers of PDF files. When Apache Tika processes a malicious PDF containing attacker-controlled XFA data, the core parser does not properly restrict external entity resolution. This allows the attacker to embed XML entities that reference sensitive filesystem paths, internal URLs, cloud metadata services, or other restricted resources.

The XXE vulnerability also introduces a path traversal vector, allowing attackers to access arbitrary files on the server during PDF processing.

All of the following versions are impacted:

tika-core: 1.13 through 3.2.1
tika-parser-pdf-module: 2.0.0 through 3.2.1
tika-parsers (1.x series): 1.13 through 1.28.5, where the PDF parser was bundled

Any application running these versions is exposed, regardless of how securely the surrounding code is written.
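To check a deployed version against the affected range, a small comparison helper can be enough. The sketch below assumes the affected tika-core range is 1.13 through 3.2.1 and that version strings are plain dotted numbers with an optional qualifier such as -SNAPSHOT. The class and method names are hypothetical and not part of Tika.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Tika): flags tika-core versions from 1.13 through 3.2.1. */
final class TikaVersionCheck {

    private static final int[] FIRST_AFFECTED = {1, 13, 0};
    private static final int[] LAST_AFFECTED = {3, 2, 1};

    private TikaVersionCheck() {}

    /** Parses "3.2.1" or "1.13" into three numeric components; missing components count as 0. */
    static int[] parse(String version) {
        // Drop qualifiers such as "-SNAPSHOT" before splitting on dots.
        String numeric = version.trim().split("-", 2)[0];
        String[] parts = numeric.split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < out.length && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    /** Returns true when the version falls inside the assumed affected range, inclusive. */
    static boolean isAffectedCore(String version) {
        int[] v = parse(version);
        // Arrays.compare orders int arrays lexicographically, which matches dotted versions.
        return Arrays.compare(v, FIRST_AFFECTED) >= 0
            && Arrays.compare(v, LAST_AFFECTED) <= 0;
    }
}
```

For example, TikaVersionCheck.isAffectedCore("3.2.1") reports the version as affected, while "3.2.2" does not. A dependency report from the build tool remains the more reliable source, since tika-core is often pulled in transitively.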

How the XXE Exploit Unfolds in Apache Tika

When Tika encounters a PDF with embedded XFA, it processes the XML to extract text. In vulnerable versions, this XML is parsed without properly restricting external entity resolution. A malicious PDF can therefore:

Force Tika to read sensitive files from the server
Trigger outbound network calls to internal systems
Leak cloud metadata or credentials
Potentially disrupt document processing services

What makes this vulnerability especially dangerous is that it requires no unsafe code or special configuration. Simply using Tika’s standard parsing APIs such as AutoDetectParser or Tika().parseToString() automatically triggers the vulnerable pathway.

Any workflow that processes PDFs by default, including search indexing, ETL pipelines, content classification, or document preview generation, will parse the malicious XFA content without any visible sign of it. Because these operations typically run in the background, the attack executes silently, giving attackers the opportunity to extract sensitive files or probe internal systems long before anyone notices something is wrong.

Even routine background operations, such as automated text extraction or metadata scanning, can unknowingly trigger the exploit.
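The mechanism described in this section can be illustrated with the JDK's own XML parser, without Tika on the classpath. The sketch below is not Tika's code and not the upstream fix. It shows the general JAXP hardening pattern of refusing DOCTYPE declarations, applied to an illustrative XFA-like payload. The class name, the payload, and the file path inside it are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

/** Illustrative only: general XXE hardening with the JDK parser, not Tika's implementation. */
final class XxeHardeningDemo {

    // Hypothetical XFA-like payload: the entity asks the parser to read a local file.
    static final String MALICIOUS_XML =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE xfa [<!ENTITY leak SYSTEM \"file:///etc/hostname\">]>\n"
      + "<xfa><field>&leak;</field></xfa>";

    private XxeHardeningDemo() {}

    /** Builds a parser that rejects any document carrying a DOCTYPE declaration. */
    static DocumentBuilder hardenedBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Without a DOCTYPE there is nowhere to declare an external entity.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory.newDocumentBuilder();
    }

    /** Returns true when the hardened parser refuses the document. */
    static boolean isRejected(String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            hardenedBuilder().parse(new ByteArrayInputStream(bytes));
            return false;
        } catch (SAXException e) {
            // Raised for the forbidden DOCTYPE (and for any other malformed input).
            return true;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this configuration the payload is refused before the entity is ever resolved, while ordinary XML without a DOCTYPE still parses. A parser left at its defaults would instead attempt to read the referenced file, which is the behavior the vulnerable code path exposes.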

Preventing XXE Exploitation in Apache Tika

The most reliable fix is upgrading to tika-core 3.2.2 or later, where external entity resolution is properly restricted. This closes the XXE attack path and should be applied as soon as possible. To minimize risk until a full patch is applied, the following measures help protect applications that rely on Apache Tika.

Disable PDF Parsing if Upgrade Is Delayed – If patching cannot happen immediately, you can temporarily disable PDF parsing through a custom tika-config.xml. This prevents Tika from processing potentially malicious PDFs and avoids triggering the vulnerable code path.
Preprocess PDFs Before Sending Them to Tika – Using tools like qpdf or pdfid.py to scan incoming PDFs helps identify XFA structures or /AcroForm markers. Rejecting such files early greatly reduces the chance of XXE exploitation.
Enforce Strong Network Egress Controls – Strict outbound network restrictions limit damage even if XXE is triggered. Blocking access to metadata services, internal APIs, and sensitive endpoints prevents attackers from retrieving data through external entity calls.
Isolate Document-Processing Workloads – Long term, Tika should run in isolated, sandboxed environments with limited file system and network access. Treating document parsing as an untrusted workload helps contain any future parsing vulnerabilities.
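The pre-processing step above can be approximated in a few lines when qpdf or pdfid.py is not available in the ingestion path. The following is a minimal sketch with a hypothetical class name: it scans the raw bytes of an upload for the /XFA and /AcroForm name tokens. It is a coarse heuristic under stated assumptions, since PDF names can be written with #xx escapes or sit inside compressed object streams, so a clean result does not prove the file is safe.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical pre-screen: flags PDFs whose raw bytes mention /XFA or /AcroForm. */
final class PdfPreScreen {

    private static final byte[] XFA = "/XFA".getBytes(StandardCharsets.ISO_8859_1);
    private static final byte[] ACRO_FORM = "/AcroForm".getBytes(StandardCharsets.ISO_8859_1);

    private PdfPreScreen() {}

    /** Naive byte search; adequate for a coarse filter on uploads of moderate size. */
    static boolean contains(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return true;
        }
        return false;
    }

    /** Returns true when the file should be rejected or quarantined before reaching Tika. */
    static boolean looksRisky(byte[] pdfBytes) {
        // Misses escaped names and compressed object streams; treat a false result as "unknown".
        return contains(pdfBytes, XFA) || contains(pdfBytes, ACRO_FORM);
    }
}
```

In an upload handler, the bytes read from the request (for example via Files.readAllBytes on a temporary file) would be passed to PdfPreScreen.looksRisky, and flagged files held back until the upgrade is in place. This complements the other measures in the list rather than replacing them.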

AppTrana WAAP Coverage for CVE-2025-66516

AppTrana WAAP has protected against exploitation of this vulnerability since day 0, using advanced inspection rules to detect and block malicious XFA-based XML payloads inside PDFs before they reach Apache Tika. The platform identifies harmful structures such as embedded external entities, suspicious XML signatures, and abnormal PDF patterns, so XXE attacks are stopped at the edge even when Tika is running a vulnerable version.

Similar to pre-processing tools like qpdf or pdfid.py that flag XFA or /AcroForm markers, AppTrana performs deep file inspection automatically during upload to prevent malicious PDFs from entering the parsing workflow. In addition to inbound filtering, AppTrana restricts unauthorized outbound calls that XXE exploits typically attempt, blocking access to internal URLs or metadata services.

AppTrana’s managed security team continues to track this vulnerability and emerging exploit techniques, with additional protections deployed as new intelligence or PoCs become available.
