A4: XML External Entities (XXE)

Introduction

XML is a useful format for sending data from service to service and for processing data internally, but as with anything, things get dangerous as soon as user input is involved. Processing XML carries an inherent risk because many XML processors have historically shipped with external entities enabled by default. Not everyone knows about this setting, which makes it easy to leave enabled by accident. External entities can be used to read files and, in some configurations, even to execute code. Needless to say, we do not want this to happen.

Stepan Ilyin

Author

Threat agents/attack vectors: When malicious attackers want to exploit this vulnerability, they look for ways to submit their own XML files or to insert content into files that the developers might not have thought of securing.

Security weakness: When XML libraries and processors were first introduced, they enabled external entities by default, and a lot of web applications inherit this behaviour without using it and without knowing that it is even active. This is part of the specification, so the tools are not to blame, but it does allow the evaluation of expressions that are potentially harmful to the system. What makes this problem worse is that XXE is not something testers regularly check for.

Impact: The technical impact can range from the disclosure of files to the execution of code, contact with internal functionality or servers, and even DoS attacks. The business impact depends on what functionality is affected and in what manner.

What is an XXE attack?

An XXE attack occurs when malicious actors submit data in an XML-based format they control (for example an XML upload, a SOAP request, or a DOCX file, which consists of XML documents once extracted). The attacker declares what is called an external entity in the XML and references that entity in one of the nodes. This can cause the parser to resolve the external entity and, for example, read a local file or, in some configurations, execute code. An example is shown below:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [ <!ENTITY smbConf SYSTEM "file:///smb.conf"> ]>
<TestDocument>
<name>test</name>
<lastName>&smbConf;</lastName>
</TestDocument>

I deliberately placed the &smbConf; entity reference in the second node of the document to make clear that XXE attacks can occur in any node of the document.

XXE Attack types

Retrieving files with the help of XXE

As you may have noticed, the example shown above retrieves a file, and an XXE attack of this kind has two parts. First, the external entity is declared:

<!DOCTYPE fakeDocType [ <!ENTITY smbConf SYSTEM "file:///smb.conf"> ]>

Second, the entity is referenced in one of the nodes of the document:

<lastName>&smbConf;</lastName>

The attacking XML contains an external entity called smbConf, which attempts to retrieve an SMB configuration file. As stated before, an attacker then tests every possible node with this entity reference to see whether the file's contents are returned and displayed.
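To make the declaration half of the attack concrete, here is a minimal Python sketch that lists the external entities a document declares without resolving them. It assumes the standard library's xml.parsers.expat is an acceptable way to inspect the input; the function name list_external_entities is hypothetical.

```python
import xml.parsers.expat


def list_external_entities(xml_bytes: bytes) -> list[tuple[str, str]]:
    """Return (entity name, system identifier) pairs declared in the DTD.

    Nothing is fetched: expat only reports the declarations, and no
    ExternalEntityRefHandler is installed, so references are not resolved.
    """
    found = []

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        # Internal entities carry a value; external ones carry a system id.
        if system_id is not None:
            found.append((name, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_bytes, True)
    return found


payload = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [ <!ENTITY smbConf SYSTEM "file:///smb.conf"> ]>
<TestDocument>
<name>test</name>
<lastName>&smbConf;</lastName>
</TestDocument>"""

print(list_external_entities(payload))
```

A check like this is only useful for logging or triage; it does not replace disabling external entities in the parser that actually processes the data.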

Performing SSRF attacks using XXE

Another potentially harmful way XXE can impact an organization is server-side request forgery (SSRF). In this case the server is made to send an HTTP request on behalf of the attacker, with all kinds of serious side effects.

To execute an SSRF attack through XXE, we define the URL we want the server to send a request to, like so:

<!DOCTYPE fakeDocType [ <!ENTITY adminPanel SYSTEM "http://192.168.1.12/admin/"> ]>

As the example above shows, the XML processor will send a request to a server on the internal network that hosts an admin panel reachable only from inside that network. If the response is reflected back, the attacker can browse this admin panel by means of SSRF. If no data is returned, a blind SSRF might still be possible.
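When reviewing logged payloads, it can help to flag system identifiers that point at internal addresses like the one above. The following Python sketch is one possible heuristic, not a complete SSRF defence: the function name targets_internal_host is hypothetical, and it only recognises literal IP addresses, so DNS names that resolve to internal hosts are not caught.

```python
import ipaddress
from urllib.parse import urlparse


def targets_internal_host(system_id: str) -> bool:
    """Return True if the URL's host is a literal private, loopback,
    or link-local IP address."""
    host = urlparse(system_id).hostname
    if host is None:
        return False  # e.g. file:/// URLs have no host part
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name; this sketch does not resolve it
    return address.is_private or address.is_loopback or address.is_link_local


for url in ("http://192.168.1.12/admin/", "http://127.0.0.1/creditCards/", "http://8.8.8.8/"):
    print(url, targets_internal_host(url))
```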

Blind XXE attacks

Just like blind SSRF vulnerabilities, blind XXE vulnerabilities also exist. The external entity can still be processed without any data being returned in the response. These vulnerabilities are harder to find and abuse, but with more creative techniques attackers can still find and exploit them.

Locating hidden attack surface for XXE attacks

It may seem that only XML files directly controlled by the user are vulnerable, but nothing could be further from the truth. Attackers can also execute what is known as an XInclude attack. Here the attacker inserts an XXE attack vector into a non-XML parameter, and the server later merges that input into an XML document. In this situation the attacker controls only the inserted string, not the entire document, so they cannot define or change the DOCTYPE and have to be more creative. XInclude, a W3C companion specification to XML, makes this possible: an xi:include element can pull in an external resource without any DOCTYPE. An example of an XInclude attack would be:

<fakeElement xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///smb"/></fakeElement>

This is one of the nastiest of the OWASP Top 10 vulnerabilities, as it is often missed and slips through the net while still having a devastating impact.

How Can I Detect XML External Entities?

We should adopt source code analysis tools that scan our code and report issues to us. We should also note down every entry point for XML, such as XML file imports, DOCX file uploads, SVG image uploads, and SOAP endpoints. We should test all of these entry points and not limit ourselves to the regular XXE issues we know, but also look for blind XXE issues, which are harder to test for and require a different strategy. This includes anything that might carry the vulnerability, such as SAML, DTDs, and SOAP, and every node of the XML should be tested.
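For container formats such as DOCX, one way to start is to look inside the archive for XML parts that declare a DTD, since the XML parts of a normally generated document are not expected to contain one. The Python sketch below is a rough heuristic under that assumption: the names find_suspicious_parts and build_archive are hypothetical, and a plain byte search can be evaded, for example by parts encoded as UTF-16.

```python
import io
import re
import zipfile

# Markers that indicate a DTD or entity declaration in an XML part.
DTD_MARKER = re.compile(rb"<!(DOCTYPE|ENTITY)")


def find_suspicious_parts(archive_bytes: bytes) -> list[str]:
    """Return the names of XML parts in a ZIP-based document that declare a DTD."""
    suspicious = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith((".xml", ".rels")) and DTD_MARKER.search(archive.read(name)):
                suspicious.append(name)
    return suspicious


def build_archive(parts: dict[str, bytes]) -> bytes:
    """Build an in-memory ZIP archive, used here only to create a test sample."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in parts.items():
            archive.writestr(name, content)
    return buffer.getvalue()


sample = build_archive({
    "word/document.xml": b'<!DOCTYPE foo [ <!ENTITY key SYSTEM "file:///.ssh/id_rsa"> ]><doc>&key;</doc>',
    "word/styles.xml": b"<styles/>",
})
print(find_suspicious_parts(sample))
```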

If the attacker can only control part of the XML document, they should aim to test for XInclude attacks.
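On the defensive side, the same idea can be turned into a check on the merged document. The Python sketch below looks for xi:include elements after user input has been merged; it assumes the merged document carries no DOCTYPE, and the function name find_xinclude_hrefs is hypothetical. The standard library's ElementTree does not process XInclude unless asked to, so parsing here does not fetch the referenced resource.

```python
import xml.etree.ElementTree as ET

# Fully qualified tag name of an XInclude element.
XINCLUDE_TAG = "{http://www.w3.org/2001/XInclude}include"


def find_xinclude_hrefs(xml_text: str) -> list[str]:
    """Return the href of every xi:include element in the document."""
    root = ET.fromstring(xml_text)
    return [element.get("href", "") for element in root.iter(XINCLUDE_TAG)]


merged = (
    '<fakeElement xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include parse="text" href="file:///smb"/></fakeElement>'
)
print(find_xinclude_hrefs(merged))
```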

XXE attack scenarios

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE sshKey [<!ELEMENT sshKey ANY >
<!ENTITY key SYSTEM "file:///.ssh/id_rsa" >]>
<sshKey>&key;</sshKey>

In the first attack scenario, an attacker wants to steal the private SSH key of their victim. They launch an XXE attack with an external entity that tries to read the victim's id_rsa file, which holds the private key. If they can also grab the known_hosts file from the .ssh folder using the same technique, they may be able to open connections to other servers.

In our second attack scenario, the attacker wants to reach a URL on a web server that only accepts connections from hosts within the same network, a restriction meant to stop attackers from stealing the sensitive but otherwise unprotected data behind it, which in this example is a list of credit card details.

<!DOCTYPE fakeDocType [ <!ENTITY xxe SYSTEM "http://127.0.0.1/creditCards/"> ]>

The web server hosts the list itself, so the SSRF attack does not even need to reach a different server and can instead connect to the loopback IP address.

These are all nice theoretical examples, but I find that practical, real-life examples work best, which is why we will go over CVE-2018-12463, an older CVE that shows how a single bad implementation of an XML parser can cause serious problems. In this case, an XXE vulnerability allowed an unauthenticated user to read files and perform SSRF attacks, which makes this vulnerability even worse.

https://www.cvedetails.com/cve/CVE-2018-12463/

Even big players with large budgets are not immune to this vulnerability, as IBM discovered with WebSphere Application Server. An XXE vulnerability was found that allowed attackers to consume resources and get their hands on sensitive user data. CVE-2021-20454 is a great example of why we should be very diligent in our XXE testing and might even need to consider other data formats such as JSON.

https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-20454

How to prevent XXE vulnerabilities

Prevention of XXE attacks relies heavily on inventorying and protecting all possible XML entry points and making sure they do not have external entities enabled where they are not needed. We need to be aware that XML is more complex than it seems at first glance and that it reaches far and wide. If possible, we should opt for a different data format such as JSON to rule out XXE completely.

If we do use SOAP, we should use SOAP 1.2 or higher, as OWASP recommends. XML processors and libraries used by the application should also be patched promptly. To aid this process, there are dependency checkers that go over the dependencies and report any outdated versions.

It goes without saying that wherever possible external entities should be disabled in the configurations where the application allows this.
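How external entities are disabled depends on the parser, so the following is only a minimal Python sketch under the assumption that the application never needs a DTD at all. It uses the standard library's xml.parsers.expat and rejects any document that declares a DOCTYPE, before any entity can be declared; the names DoctypeForbidden and parse_without_dtd are hypothetical.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when a document declares a DOCTYPE."""


def parse_without_dtd(xml_bytes: bytes) -> list[str]:
    """Parse a document and return its element names in document order.

    Raises DoctypeForbidden as soon as a DOCTYPE declaration starts.
    """
    names = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = lambda name, attributes: names.append(name)
    parser.Parse(xml_bytes, True)
    return names


safe = b"<TestDocument><name>test</name></TestDocument>"
malicious = (
    b'<!DOCTYPE foo [ <!ENTITY smbConf SYSTEM "file:///smb.conf"> ]>'
    b"<TestDocument><lastName>&smbConf;</lastName></TestDocument>"
)

print(parse_without_dtd(safe))
try:
    parse_without_dtd(malicious)
except DoctypeForbidden as error:
    print("rejected:", error)
```

Rejecting the DOCTYPE outright is stricter than only disabling external entities, which is a deliberate choice here: it also blocks entity expansion tricks that stay inside the document. It does not address XInclude, which needs no DOCTYPE.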

In all instances where user data ends up in an XML file (this can also be done by the application merging user input with an XML file) we should implement proper data hygiene and sanitise all the incoming data. The best way to do this is by implementing a whitelisting strategy but we realise this is not always feasible as it can cause business problems to only allow certain input.
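For the case where the application merges user input into an XML document, escaping the input before the merge keeps it from being interpreted as markup. The Python sketch below assumes the value is placed in element content; the function name build_profile_xml and the document layout are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_profile_xml(last_name: str) -> str:
    """Merge user input into an XML document as text, never as markup."""
    # escape() replaces &, < and > so the input cannot open new elements.
    return f"<TestDocument><lastName>{escape(last_name)}</lastName></TestDocument>"


attack = '<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" parse="text" href="file:///smb"/>'
document = build_profile_xml(attack)
root = ET.fromstring(document)

print(document)
print(len(root.find("lastName")))  # number of child elements under lastName
```

After escaping, the XInclude payload arrives as literal text inside lastName rather than as an xi:include element. Building the document through an XML library that sets element text would achieve the same effect.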

An XSD is a great technology to help us validate any incoming XML file and we should make sure every incoming file meets the requirements set forth in the XSD.

Code review can also help us detect these issues before they hit production. This can be done manually or with the help of source code review tools, though tools should always be used in conjunction with manual testing and manual code review. Reviews should pay special attention to any endpoint accepting XML input.

A last option is to install a WAF or API security firewall to increase the security of the application, but these should never be used in isolation. Instead, we should use them in conjunction with the preventive measures above.

Conclusion

XXE is an often overlooked issue type because of the way developers learn about XML: they often neglect its more intricate features, such as external entities or XInclude. Since these issues are easy to miss and generally have a large impact, it is important to pay close attention to any XML input point and to test it thoroughly.

Updated:

April 8, 2025

Stepan Ilyin

Author |

Verified Expert

Stepan is a cybersecurity expert proficient in Python, Java, and C++. With a deep understanding of security frameworks, technologies, and product management, they ensure robust information security programs. Their expertise extends to CI/CD, API, and application security, leveraging Machine Learning and Data Science for innovative solutions. Strategic acumen in sales and business development, coupled with compliance knowledge, shapes Wallarm's success in the dynamic cybersecurity landscape.

Ivan Novikov

Reviewer |

Verified Expert

With over a decade of experience in cybersecurity, well-versed in system engineering, security analysis, and solutions architecture. Ivan possesses a comprehensive understanding of various operating systems, programming languages, and database management. His expertise extends to scripting, DevOps, and web development, making them a versatile and highly skilled individual in the field. Bughunter, working with top tech companies such as Google, Facebook, and Twitter. Blackhat speaker.