What is Cross-Site Scripting (XSS) and how to prevent it
Research / Posted March 25, 2020
An Introduction to Cross-Site Scripting (XSS)
Cross-Site Scripting or “XSS” is one of the most common vulnerabilities found in web applications. Security researchers have reported that XSS made up nearly 40 per cent of all attacks they logged in recent years, and that almost 75 per cent of large companies across Europe had been targeted over the last year.
However, ask many developers and technical team members to explain exactly what it is or how to prevent it, and the answers may surprise you. They often consist of partially remembered examples, typically involving some form of classic pop-up dialogue box in a web browser, yet true XSS attacks are nearly always completely invisible to end users. Clearly there is still a lot of uncertainty around what exactly XSS is and how to prevent it. Very often the topic is made overly complex, when in reality it is relatively simple both to understand and to defend against. In this article we’re going to outline how XSS occurs and how it can be prevented, with a minimum of jargon and a focus on comprehension of the key issues.
A history lesson
Websites rely on the underlying HTTP protocol in a simple request-response cycle: your browser requests a resource, and the server that hosts the webpage returns it. Websites historically were completely static, returning requested assets such as documents, files, HTML pages and images – you requested a URL and the same content was returned regardless of who, what or where you were.
Over time websites became ‘dynamic’, and returned different content based on the server-side execution of a script or program that dynamically altered content before returning it. That could be as simple as returning “Hello, Daniel” when you were logged in as the user “Daniel”. What’s going on here is that the server was provided with the username “Daniel”, and embedded it in the returned response.
Where’s the problem?
Cross-site scripting is the slightly clunky and confusing initialism (“CSS” was already taken by “Cascading Style Sheets”, hence the use of the seemingly random “X”) for the malicious insertion by attackers of unauthorised script into a website or web application. We’ll look at how this is performed below.
What causes XSS?
At its simplest, XSS relies on a website returning client-side code to a client for it to execute, but in which an attacker has inserted or ‘injected’ code of their own, rather than the code that the web server is intending for your browser to run.
This can take a variety of forms but usually relies on the attacker tricking the website into returning code generated by data submitted to it. In a dynamic website, a server might build a response to a customer based on data that the customer has previously entered in an earlier web request.
A simple example
A trivial example of a website that could be vulnerable may be as simple as a site containing a single form that echoes your name back to you. You enter your name (“Daniel”) in a login form field, and the website returns “Hello, Daniel” in its HTTP response:
What’s meant to happen:
User: [requests webpage]
Server: Returns “What is your name?”
User: Enters “Daniel”
Server: Returns “Hello, Daniel”
User Browser: Renders the HTML returned and displays “Hello, Daniel”
What can go wrong
Before looking at how an attacker can exploit this, we’ll examine a simpler case where a user simply XSS-attacks themselves via a vulnerable website:
User: [requests webpage]
Server: Shows “What is your name?”
User: Enters “<script>…</script>” in the name field
Server: Returns “Hello, <script>…</script>”
User Browser: Sees the executable script content “<script>…</script>” in the response and executes it, performing the potentially malicious action it contains instructions to perform
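The two exchanges above can be sketched in a few lines. This is a minimal illustration only: the function name renderGreetingUnsafe and the placeholder payload are assumptions made for the example, not code from any real site.

```javascript
// Hypothetical server-side logic: builds the response body by pasting the
// submitted name straight into the HTML, with no encoding.
function renderGreetingUnsafe(name) {
  return `<p>Hello, ${name}</p>`;
}

// A benign name produces the intended page.
console.log(renderGreetingUnsafe("Daniel"));
// prints: <p>Hello, Daniel</p>

// A name containing markup is returned verbatim, so a browser would parse
// it as a script element rather than display it as text.
console.log(renderGreetingUnsafe("<script>/* attacker code */</script>"));
```

The server never "runs" the injected script itself; it simply fails to distinguish data from markup, and the browser does the rest.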
How does the attacker inject the script?
When a request is sent to a web server, dynamic content (such as the “name” in the above example) is frequently sent as a parameter in the URL.
If an attacker can get you to instead click on a link in which that parameter carries their script, then you have sent their script to the server, and the server will then “reflect” it back to you, where it will be executed by your browser.
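As an illustration of the difference between a normal link and a crafted one, the sketch below builds both with the standard URL API. The host vulnerable.example, the /hello path and the name parameter are hypothetical, chosen only for this example.

```javascript
// Hypothetical vulnerable page; host, path and parameter name are
// illustrative assumptions.
const benign = new URL("https://vulnerable.example/hello");
benign.searchParams.set("name", "Daniel");

const malicious = new URL("https://vulnerable.example/hello");
malicious.searchParams.set("name", "<script>/* attacker code */</script>");

console.log(benign.href);
// prints: https://vulnerable.example/hello?name=Daniel

// The crafted link carries the payload in percent-encoded form.
console.log(malicious.href);

// What the server receives once it has decoded the query string.
console.log(malicious.searchParams.get("name"));
```

Because the payload is percent-encoded inside the link, it is easy to overlook in an email or chat message.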
There are three forms of XSS, usually targeting users’ browsers:
- Reflected XSS: This is the simple example given above. In this type, the website includes user input as part of its response to a request, ‘reflecting’ it back. Typically the user will need to interact with a malicious link, delivered for example via an attacker-controlled page such as a ‘watering hole’ website, an advertisement, or similar.
- Stored XSS: In this type, the website stores user input that is then viewed at a later time by another user or an administrator. This could be, for example, a comment on a forum or news story. Any person visiting the page containing the stored XSS will be vulnerable.
- DOM-based XSS: In this type, the injection happens entirely in the browser: the page’s own client-side script reads attacker-controlled data (such as part of the URL) and writes it into the page in a way that allows it to execute.
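The stored form can be sketched with an in-memory list standing in for a database. The function names and sample comments here are assumptions made purely for illustration.

```javascript
// Hypothetical in-memory comment store standing in for a database.
const comments = [];

function postComment(author, text) {
  comments.push({ author, text });
}

// Unsafe rendering: stored text is embedded in the page without encoding,
// so every later visitor receives whatever markup was submitted.
function renderCommentsUnsafe() {
  return comments.map((c) => `<li>${c.author}: ${c.text}</li>`).join("\n");
}

postComment("alice", "Great article!");
postComment("mallory", "<script>/* attacker code */</script>");
console.log(renderCommentsUnsafe());
```

Unlike the reflected case, the victim does not need to click a crafted link: simply viewing the page is enough.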
How dangerous is XSS?
There are many ‘rewards’ for an attacker targeting a user with XSS attacks. Malicious payloads can include:
- Capturing user input such as passwords via a key-logger
- Sending cookies, tokens and other cached data to a third party
- Performing network requests and system operations that the user hasn’t requested
- Forcing downloads of files to the end user PC
Why is Cross-site Scripting so commonly seen?
The reason that XSS vulnerabilities are so prevalent is that whilst XSS is a trivial vulnerability to defend against, doing so requires robust coding practices that are consistently implemented over time. Consider how data is handled at input and output in almost every piece of code you write, and at every level of your stack.
How to prevent Cross-Site Scripting (XSS)
The best approach to protecting your website against XSS and other related vulnerabilities is to ensure that all input data is treated as raw data/text and is never allowed to be interpreted as code, context-jumping into an executable command. In practice this means performing some combination of:
- Sanitisation of any data received from an external context or user; and
- Encoding of any data output to another component
Sanitisation of input data
You should sanitise input, ideally against a type, or failing that against a whitelist regex of allowed values. In the example above, if you’re asking for someone’s name, you could for example allow only upper-case and lower-case letters plus a few other characters: there are no names containing “<”, for example.
However, this is simpler for some parameters and form fields than others. If you are processing a parameter representing a numeric item ID, then simply checking the type is an integer may be simple and sufficient. For other data that is richer, this is more difficult, and sanitisation is of more limited value for such parameters. When you sanitise input, you risk altering the data in ways that might make it unusable. Input sanitisation is therefore generally avoided in cases where the nature of the data is unknown, such as free-form text entry fields, especially if these may legitimately contain complex data sets such as code samples.
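A minimal sketch of both checks described above follows. The allowed character set, the length limits and the function names are assumptions for illustration; a real application would choose them to suit its own data.

```javascript
// Allowlist for a name field: Unicode letters plus space, apostrophe and
// hyphen, 1 to 100 characters. The exact character set is an assumption.
const NAME_PATTERN = /^\p{L}[\p{L} '-]{0,99}$/u;

function isValidName(input) {
  return typeof input === "string" && NAME_PATTERN.test(input);
}

// Type check for a numeric item ID: digits only, then parsed to an integer.
// Returns null for anything that is not a plain run of digits.
function parseItemId(input) {
  if (typeof input !== "string" || !/^[0-9]{1,15}$/.test(input)) {
    return null;
  }
  return Number.parseInt(input, 10);
}

console.log(isValidName("Daniel"));   // prints: true
console.log(isValidName("<script>")); // prints: false
console.log(parseItemId("42"));       // prints: 42
console.log(parseItemId("42<b>"));    // prints: null
```

Rejecting invalid input outright, rather than silently stripping characters from it, avoids the risk of altering data into something unusable.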
Encoding of output data (server-side)
The more effective measure to prevent XSS is to ensure that every function in your code that passes data to another context encodes the data for that system, ensuring that it continues to be interpreted as data, and not permitted to jump contexts into being interpreted as executable code. There is no universal encoding standard that can be used, since the encoding mechanism will vary depending upon the context.
If exporting to the browser in HTML, HTML-encoding should be used, for example:
<script> → &lt;script&gt;
Server-side frameworks commonly provide helper functions that perform this encoding for you. Standard encoding libraries are available for common languages and frameworks including Ruby, PHP, Java and Python.
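To show what such a helper does, here is a minimal hand-written sketch. The names escapeHtml and renderGreeting are hypothetical, and in practice a framework's built-in helper is preferable to rolling your own.

```javascript
// Minimal HTML encoder for text placed in element content or in a quoted
// attribute value. It is not suitable for script, style or URL contexts.
const HTML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// The greeting example again, this time encoding the name on output.
function renderGreeting(name) {
  return `<p>Hello, ${escapeHtml(name)}</p>`;
}

console.log(renderGreeting("<script>/* attacker code */</script>"));
// prints: <p>Hello, &lt;script&gt;/* attacker code */&lt;/script&gt;</p>
```

The browser now displays the submitted text literally instead of executing it, and legitimate input such as "Daniel" passes through unchanged.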
The same encoding should be done at each language boundary, such as when passing off data to another system. If exporting to a SQL database, then SQL escape strings can be used, although fully parameterised queries are a much more robust defence. In larger teams and projects, standard classes can be used that automatically encode data at relevant system and language boundaries.
Encoding of output data (client-side)
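When client-side script inserts user-generated content into the page, assign it to the element’s textContent property rather than innerHTML, so that the browser treats it as text and not as markup:
document.querySelector('#myElement').textContent = theUserGeneratedInput
If you need to create a new node then similarly you can use document.createTextNode():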
const el = document.createTextNode(theUserGeneratedInput)
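If you need to add content to an HTML attribute, you can use the setAttribute() method of the element, which stores the value as plain text rather than parsing it as markup. This does not make event-handler attributes (such as onclick) or URL attributes (such as href) safe, since their values can still be executed.
And finally if you need to add content to the URL, then you can use the window.encodeURIComponent() function:
window.location.href = window.location.href + '?test=' + window.encodeURIComponent(theUserGeneratedInput)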
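The effect of encodeURIComponent() can be seen outside the browser too. The sketch below assumes a hypothetical app.example site and sample input, and also shows the standard URL API as an alternative that copes with a URL that already has a query string.

```javascript
// Hypothetical user input containing URL syntax and markup characters.
const userInput = "a&b=<script>";

// encodeURIComponent escapes characters that would otherwise be read as
// URL syntax, so the whole input stays inside the one parameter value.
const manual = "https://app.example/search?test=" + encodeURIComponent(userInput);
console.log(manual);
// prints: https://app.example/search?test=a%26b%3D%3Cscript%3E

// The URL API performs the encoding itself and appends correctly even
// when the base URL already carries other parameters.
const url = new URL("https://app.example/search?page=2");
url.searchParams.set("test", userInput);
console.log(url.searchParams.get("test") === userInput);
// prints: true
```

Without the encoding, the "&" and "=" in the input would be read as the start of a second parameter rather than as part of the value.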
Web Application Firewalls
A web application firewall (WAF) is a reverse proxy that sits between a user and a screened application. It can parse application-layer (HTTP) requests and determine whether their parameters and headers contain payloads that look like attempted XSS. WAFs are not foolproof, and can produce both false positives (false alerts blocking genuine functionality) and false negatives (missed detections), but can be an important tool in a defence-in-depth strategy when well configured.
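The trade-off between false positives and false negatives can be illustrated with a deliberately naive signature check. This is a toy sketch: the two patterns and the function name are assumptions, and real WAF rule sets are far larger and decode and normalise requests before matching.

```javascript
// Toy signature list, purely illustrative.
const SIGNATURES = [/<script\b/i, /javascript:/i];

function looksLikeXss(paramValue) {
  return SIGNATURES.some((re) => re.test(paramValue));
}

// True positive: a classic payload is caught.
console.log(looksLikeXss("<script>/* attacker code */</script>"));
// prints: true

// False negative: an event-handler payload matches no signature.
console.log(looksLikeXss('<img src=x onerror="/* attacker code */">'));
// prints: false

// False positive: a legitimate post that merely discusses HTML is flagged.
console.log(looksLikeXss("Put your code inside a <script> tag"));
// prints: true
```

This is why pattern matching at the network edge complements output encoding in the application but cannot replace it.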
Client Side (Browser) Controls
Some browsers have historically applied protection client-side via built-in browser filtering. However, this is becoming less common. The Chrome XSS Auditor was rolled out with ‘filter’ mode as default, which meant that web pages continued to be rendered but with any code suspected of presenting a potential XSS issue filtered out. Following the publication of several XSS bypass exploits, the feature was removed.
If you’d like any further information or would like to see what XSS vulnerabilities we can pick up in your web applications then feel free to email us at email@example.com where an AppCheck representative can set up your free vulnerability assessment.