
XML External Entity (XXE) Attack

In this article, I describe the XML External Entity (XXE) attack. For this attack to be possible, the application must contain logic that parses XML input.

The injection becomes possible when the XML parser is weakly configured. In a successful attack, the attacker can view files on the application server and interact with backend systems. An XXE vulnerability can also be used to perform server-side request forgery (SSRF) attacks, denial-of-service (DoS) attacks such as the Billion Laughs attack, and more.

What are XXE types?

There is no strict classification of XXE attacks, but we can divide them into two types: in-band and out-of-band (blind).

· In-band attacks are more common than out-of-band ones. In this case, the attacker receives an immediate response to the XXE payload.

· Out-of-band attacks, also called blind XXE, give no immediate response. This type involves creating an external Document Type Definition (DTD), and the XML parser must make an additional request to an attacker-controlled server.

When can an attacker carry out this injection?

· In old applications that use a SOAP version older than 1.2

· In applications that log users in through SAML, a single sign-on (SSO) standard. The chances of this attack are high here because SAML uses XML for identity assertions

· In applications that accept XML input or XML uploads from untrusted sources and then parse them with an XML processor

· In applications where Document Type Definition (DTD) processing is enabled, which carries a high risk
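
Because enabled DTD processing is the high-risk case in the list above, one common hardening step is to refuse any document that carries a DOCTYPE. The following is a minimal Java sketch, assuming the JDK’s built-in (Xerces-based) parser, which recognises the disallow-doctype-decl feature. The class and method names are illustrative and do not come from the article.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class NoDoctypeParser {
    // Xerces feature id understood by the JDK's built-in parser (assumption: no other parser is on the classpath).
    private static final String DISALLOW_DOCTYPE =
            "http://apache.org/xml/features/disallow-doctype-decl";

    /** Returns the root element's text content, or null if the parser rejected the input. */
    static String rootTextOrNull(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.setFeature(DISALLOW_DOCTYPE, true); // any <!DOCTYPE ...> becomes a fatal error
            factory.setXIncludeAware(false);
            factory.setExpandEntityReferences(false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Covers parse errors as well as unsupported-feature errors.
            return null;
        }
    }
}
```

A document without a DOCTYPE parses as usual, while a payload that declares an external entity is rejected before the entity is ever resolved. If the application genuinely needs DTDs, this switch is too strict and external entities have to be restricted in another way.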

When would an application parse XML?

XML is often used in both frontend and backend web development.

The frontend of the application can, for example, request an XML file from an API and build a UI form from the data in it. The user may then add a new field to the form and save the change, at which point the new XML input is added to the XML document.

On the backend, XML is used to transfer data in a standard format. In mobile development, Android applications also use it to define layouts and store configuration.

On the OWASP site, you can find more examples of XXE attacks. PortSwigger has a clearly explained example of this attack:

For example, suppose a shopping application checks for the stock level of a product by submitting the following XML to the server:

<?xml version="1.0" encoding="UTF-8"?>
<stockCheck><productId>381</productId></stockCheck>

The application performs no particular defenses against XXE attacks, so you can exploit the XXE vulnerability to retrieve the /etc/passwd file by submitting the following XXE payload:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<stockCheck><productId>&xxe;</productId></stockCheck>

This XXE payload defines an external entity &xxe; whose value is the contents of the /etc/passwd file and uses the entity within the productId value. This causes the application’s response to include the contents of the file:

Invalid product ID:
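
To try this scenario safely, the sketch below reproduces the stock check in Java against a throwaway file instead of /etc/passwd. The handler, its reply strings, and the class name are assumptions made for illustration. The restriction uses the standard JAXP properties ACCESS_EXTERNAL_DTD and ACCESS_EXTERNAL_SCHEMA, where an empty string permits no protocol at all.

```java
import java.io.StringReader;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class StockCheckDemo {
    /** Mimics the endpoint: parses the request and echoes the productId in its reply. */
    static String handle(String requestXml, boolean restrictExternalAccess) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            if (restrictExternalAccess) {
                // JAXP 1.5 properties: an empty value allows no protocol (file, http, ...).
                factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
                factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
            }
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(requestXml)));
            String productId = doc.getElementsByTagName("productId").item(0).getTextContent();
            return "Invalid product ID: " + productId;
        } catch (Exception e) {
            // Parse failures, including blocked external access, end up here.
            return "Bad request";
        }
    }

    /** Builds the XXE payload against a local file that stands in for /etc/passwd. */
    static String payloadFor(Path secretFile) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<!DOCTYPE foo [ <!ENTITY xxe SYSTEM \"" + secretFile.toUri() + "\"> ]>"
                + "<stockCheck><productId>&xxe;</productId></stockCheck>";
    }
}
```

With the restriction switched on, the reply to the payload should no longer contain the file’s contents, while an ordinary request is still answered normally. On a parser whose defaults resolve external entities, calling handle(payload, false) typically echoes the file, which is the leak described above.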


List of preventions for XXE

  • Use JSON instead of XML where possible, and avoid serializing sensitive data
  • As mentioned above, this attack is easier when the application uses a SOAP version older than 1.2, so upgrade to SOAP 1.2 or later
  • Implement XSD (XML Schema) validation for all XML inputs
  • Patch or upgrade all XML libraries
  • Use SAST tools to check for XXE vulnerabilities

How to prevent XXE if you are using SAML?

SAML is used to construct authorization statements whose authenticity is protected by an XML digital signature applied over the statements.

Many attacks succeed because of wrong assumptions made by developers, for example that the token is always well-formed XML that complies with the SAML schema.

Developers may assume that a SAML document contains exactly one Assertion element, as a properly formed one would. Based on that assumption, they validate only the first element returned when searching the XML document by tag name.

To get the list of nodes, the DOM method getElementsByTagName can be used (the snippets here use the Java DOM API):

NodeList xmlNodes = doc.getElementsByTagName("saml:Assertion");

xmlNodes is assigned the list of elements in the document whose tag name is saml:Assertion.

Because developers assume that this is properly formed SAML with a single Assertion element, they take the first element and validate only that one:

Element firstElement = (Element) xmlNodes.item(0);

*As you can guess, this is not a proper way to validate the assertion, because an attacker can anticipate that developers took this approach. The attacker can then insert a malicious assertion before the original one, so that it becomes the first element, and the substitution may never be detected.

Following the same logic, some developers use getElementsByTagNameNS, but the result is the same: a malicious assertion can easily be inserted as the first element.
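
One way to avoid relying on item(0) is to reject any document that does not contain exactly one Assertion element. The Java sketch below illustrates that single check. The class and method names are hypothetical, the namespace constant is the SAML 2.0 assertion namespace, and the check complements schema and signature validation rather than replacing them.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class SingleAssertionCheck {
    static final String SAML_NS = "urn:oasis:names:tc:SAML:2.0:assertion";

    /** Returns the only Assertion element, or null if the document has none or several. */
    static Element soleAssertionOrNull(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getElementsByTagNameNS to match
            // SAML messages have no need for a DOCTYPE, so refuse it outright.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList assertions = doc.getElementsByTagNameNS(SAML_NS, "Assertion");
            if (assertions.getLength() != 1) {
                return null; // reject instead of silently picking item(0)
            }
            return (Element) assertions.item(0);
        } catch (Exception e) {
            return null;
        }
    }
}
```

A response that carries an extra assertion ahead of the original one is refused as a whole, so the code never has to guess which element is the genuine one.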

Proper prevention would be:

· Parse the XML document and validate its structure against the supplied schema. Never allow schemas to be downloaded automatically from third parties; use local trusted copies instead. Where possible, also inspect the schemas and harden them, for example by disabling wildcard types and relaxed processing statements.

· Digital signature validation, which verifies the authenticity and integrity of the assertion embedded in the SAML document. This prevents forgery.

**The most important thing when writing a schema is to describe the intended document’s structure precisely.

How to prevent XXE using XSD validation?

I will explain how to create a C# solution to validate XML data.

The main reason to use XSD (XML Schema Definition) validation is that the sender and the receiver should have the same “expectations” about the content. A schema describes the data exactly, so both parties are clear about it.


· Add an XML file to the code

When you add an XML file, you will see only the XML declaration:

<?xml version="1.0" encoding="utf-8" ?>

I will add a User object with the properties FirstName, LastName, and Address, so the XML file would look like this:

· Create XML Schema for this file

You will get an XML schema structure like this:

· Modify XSD

Now you can modify the file by adding validations for FirstName and Address. Here I only show how to add validations for these fields; they will, of course, not prevent the attack by themselves, since they only validate the length and type of the mentioned fields.

· Validate XML using XSD

What am I doing in the code?

  • Getting the local path of the assembly, so that I can append the XML and XSD file names to get their full paths
  • Creating the schema using XmlSchemaSet and XmlSeverityType, both from System.Xml.Schema
  • Using XmlReader from System.Xml so that I can create an XDocument, imported from System.Xml.Linq
  • Once the document is created, calling its Validate method and passing the schema to validate against, together with a ValidationEventHandler method (my own name for it) that throws an exception if the severity is Error. All validation logic should go into this method.

This is just an example of how to create an XSD for an XML file and which libraries you can use for the validation.
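
The steps listed above can also be sketched in Java, the language of the DOM snippets earlier in the article, using javax.xml.validation. This is not the author’s C# listing: the inline schema is a hypothetical reconstruction for a User with FirstName, LastName, and Address, the 20-character limit is illustrative, and the class name is made up for the example.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

final class UserXsdValidator {
    // Hypothetical schema for the User example; the length limits are illustrative.
    static final String USER_XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:simpleType name=\"ShortText\"><xs:restriction base=\"xs:string\">"
        + "<xs:minLength value=\"1\"/><xs:maxLength value=\"20\"/>"
        + "</xs:restriction></xs:simpleType>"
        + "<xs:element name=\"User\"><xs:complexType><xs:sequence>"
        + "<xs:element name=\"FirstName\" type=\"ShortText\"/>"
        + "<xs:element name=\"LastName\" type=\"xs:string\"/>"
        + "<xs:element name=\"Address\" type=\"ShortText\"/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    /** Returns true only if the XML conforms to USER_XSD. */
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // Keep the schema compiler from fetching anything external.
            factory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            factory.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
            Schema schema = factory.newSchema(new StreamSource(new StringReader(USER_XSD)));

            Validator validator = schema.newValidator();
            // Apply the same restriction while reading the instance document.
            validator.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            validator.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
            // With no ErrorHandler set, the validator throws on the first error.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

As the article notes, length and type checks alone do not stop XXE. In this sketch it is the two access properties that keep an external entity from being fetched, while the schema rejects documents whose shape or field lengths are unexpected.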

How to prevent XXE using DTD validation?

We can also validate an XML file using a DTD. Some differences between XSD and DTD are listed on this site.

In this example, I am validating an XML file against a DTD file with DtdProcessing.

  • Setting the validation settings using XmlReaderSettings
  • Creating the XmlReader object so I can parse the file using the Read() method
  • Creating a ValidationEventHandler method that throws an exception if the severity is Error. All validation logic should go into this method.
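
A comparable flow can be sketched in Java with a validating DocumentBuilder. This is an analogue of the steps above and not the author’s C# code: the DTD text, the file name user.dtd, and the class name are assumptions. The entity resolver serves only a trusted copy of the DTD held by the application and refuses every other external entity.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class UserDtdValidator {
    // Hypothetical trusted DTD for the User example, kept inside the application.
    static final String USER_DTD =
          "<!ELEMENT User (FirstName, LastName, Address)>"
        + "<!ELEMENT FirstName (#PCDATA)>"
        + "<!ELEMENT LastName (#PCDATA)>"
        + "<!ELEMENT Address (#PCDATA)>";

    /** Returns true only if the XML is valid against USER_DTD. */
    static boolean isValid(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Serve only the trusted DTD; refuse every other external entity.
            builder.setEntityResolver((publicId, systemId) -> {
                if (systemId != null && systemId.endsWith("user.dtd")) {
                    return new InputSource(new StringReader(USER_DTD));
                }
                throw new SAXException("Blocked external entity: " + systemId);
            });
            // Plays the role of the validation event handler: errors abort the parse.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

A document that declares <!DOCTYPE User SYSTEM "user.dtd"> is checked against the trusted copy. Note that a document can still bring its own internal DTD subset, so this sketch shows the mechanics of DTD validation and is not a complete defence.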

List of SAST testing tools

SAST tools help you with static application security testing.

They can be free, commercial, or open source.

Some of the most popular SAST tools currently are:

  • Veracode
  • LGTM
  • Checkmarx
  • Klocwork
  • Reshift
  • SpectralOps
  • HCL AppScan
  • Codacy
  • Insider CLI
  • Argon


Why are SOAP versions older than 1.2 vulnerable to XXE attacks, and why should you use later versions?

Before version 1.2, external entities were allowed within SOAP messages.

Since version 1.2 some changes were introduced to the envelope and encoding schemas. Both schemas have been updated to be compliant with the XML Schema Recommendation.

You can see the list of recommendations which were used:






Additional changes in this version affected the names of datatypes in the XML Schema specification, and some datatypes were removed. If you want to check all the changes that were made, you can go to this site.



This article presented some prevention steps that can help you defend your application from XXE attacks.

The OWASP team, which constantly works to discover new ways attackers can exploit your application, keeps its Prevention Cheat Sheet up to date.

The best way to secure your application is to stay up to date with the latest prevention measures: the best libraries to use, the best detection tools, and so on.

In the end, secure code is the cheapest code!    

Cover photo by Joshua Woroniecki

#XXE_attack #XSD #DTD #SAML #vicarius_blog

As part of our mission to secure the world’s OT, IoT and Cyber Physical infrastructures, we invest resources into offensive research of vulnerabilities and attack techniques.

Ripple20 is a set of 19 vulnerabilities revealed by Israeli firm JSOF that affect millions of OT and IoT devices. The vulnerabilities reside in a TCP/IP stack developed by Treck, Inc. The stack is widely used by manufacturers in the OT and IoT industries and thus affects a very large number of devices.

Among the affected devices are Cisco Routers, HP Printers, Digi IoT devices, PLCs by Rockwell Automation and many more. Official advisories by companies who confirmed having affected devices can be found here, in the “More Information” section.

The most critical vulnerabilities are three that can cause a stable Remote Code Execution (CVE-2020-11896, CVE-2020-11897, CVE-2020-11901) and another that can cause the target device’s memory heap to be leaked (CVE-2020-11898).

On behalf of our customers, we set out to explore the real impact of these vulnerabilities, which we’re now sharing with the public.

The research has been conducted by researchers Maayan Fishelov and Dan Haim, and has been managed by SCADAfence’s Co-Founder and CTO, Ofer Shaked.

Exploitability Research
We set out to check the exploitability of these vulnerabilities, starting with CVE-2020-11898 (the heap memory leak vulnerability), one of the 19 published vulnerabilities.

We created a Python POC script that is based on JSOF official whitepaper for this vulnerability. According to JSOF, the implementation is very similar to CVE-2020-11896, which is an RCE vulnerability that is described in the whitepaper. Also mentioned about the RCE vulnerability: “Variants of this Issue can be triggered to cause a Denial of Service or a persistent Denial of Service, requiring a hard reset.”

Trial Results:
Test 1 target: Samsung ProXpress printer model SL-M4070FR firmware version V4.00.02.18 MAY-08-2017. This device is vulnerable according to the HP Advisory.

Test 1 result: The printer’s network crashed and required a hard reset to recover. We were unable to reproduce the heap memory leak as described, and this vulnerability would have been tagged as unauthenticated remote DoS instead, on this specific printer.

Test 2 target: HP printer model M130fw. This device is vulnerable according to the HP Advisory.

Test 2 result: Although reported as vulnerable by the manufacturer, we were unable to reproduce the vulnerability, and we believe that this device isn’t affected by this vulnerability. We believe that’s because the IPinIP feature isn’t enabled on this printer, which we’ve verified with a specially crafted packet.

Test 3 target: Undisclosed at this stage due to disclosure guidelines. We will reveal this finding in the near future.

Test 3 result: We found an unreported vendor and device, on which we can use CVE-2020-11898 to remotely leak 368 bytes from the device’s heap, disclosing sensitive information. No patch is available for this device. Due to our strict policy of using Google’s Responsible Disclosure, we’ve reported this to the manufacturer, to allow them to make a patch available prior to the publication date.

Key Takeaways
We’ve confirmed the exploitability of these vulnerabilities on our IoT lab devices.

On the negative side: The vulnerabilities exist on additional products that are unknown to the public. Attackers are likely to use this information gap to attack networks.
On the positive side: Some devices that are reported as affected by the manufacturers are actually not affected, or are affected by other vulnerabilities. It might require attackers to tailor their exploits to specific products, increasing the cost of exploitation, and prevent them from using the vulnerability on products that are reported as vulnerable.

SCADAfence Research Recommendations
Check your asset inventory and vulnerability assessment solutions for unpatched products affected by Ripple20.
The SCADAfence Platform creates an asset inventory with product and software versions passively and actively, and allows you to manage your CVEs across all embedded and Windows devices.
Prioritize patching or other mitigation measures based on: Exposure to the internet, exposure to insecure networks (business LAN and others), criticality of the asset.
This prioritization can automatically be obtained from tools such as the SCADAfence Platform.
Detect exploitation based on network traffic analysis.
The SCADAfence Platform detects usage of these exploits in network activity by searching for patterns that indicate usage of this vulnerability in the TCP/IP communications.
If you have any questions or concerns about Ripple20, please contact us and we’ll be happy to assist you and share our knowledge with you or with your security experts.

About Version 2 Limited
Version 2 Limited is one of the most dynamic IT companies in Asia. The company develops and distributes IT products for Internet and IP-based networks, including communication systems, Internet software, security, network, and media products. Through an extensive network of channels, point of sales, resellers, and partnership companies, Version 2 Limited offers quality products and services which are highly acclaimed in the market. Its customers cover a wide spectrum which include Global 1000 enterprises, regional listed companies, public utilities, Government, a vast number of successful SMEs, and consumers in various Asian cities.

About SCADAfence
SCADAfence helps companies with large-scale operational technology (OT) networks embrace the benefits of industrial IoT by reducing cyber risks and mitigating operational threats. Our non-intrusive platform provides full coverage of large-scale networks, offering best-in-class detection accuracy, asset discovery and user experience. The platform seamlessly integrates OT security within existing security operations, bridging the IT/OT convergence gap. SCADAfence secures OT networks in manufacturing, building management and critical infrastructure industries. We deliver security and visibility for some of the world’s most complex OT networks, including Europe’s largest manufacturing facility. With SCADAfence, companies can operate securely, reliably and efficiently as they go through the digital transformation journey.