What is Cross-Site Scripting (XSS) and how to prevent it


An Introduction to Cross-Site Scripting (XSS)

 

Cross-Site Scripting, or “XSS”, is one of the most common vulnerabilities found in web applications. Security researchers have reported that XSS made up nearly 40 per cent of all attacks they logged in recent years, and that almost 75 per cent of large companies across Europe had been targeted over the preceding year.

However, ask many developers and technical team members to explain exactly what it is or how to prevent it and the answers may surprise you. They often consist of partially-remembered examples, typically involving the classic pop-up dialogue box in a web browser, yet true XSS attacks are nearly always completely invisible to end users. The topic is very often made overly complex, when in reality it is relatively simple both to understand and to defend against. In this article we’re going to outline how XSS occurs, and how it can be prevented, with a minimum of jargon and a focus on comprehension of the key issues.

 

A history lesson

 

Websites rely on the underlying HTTP protocol in a simple request-response cycle: your browser requests a resource, and the server that hosts the webpage returns it. Websites historically were completely static, returning requested assets such as documents, files, HTML pages and images – you requested a URL and the same content was returned regardless of who, what or where you were.

Over time websites became ‘dynamic’, and returned different content based on the server-side execution of a script or program that dynamically altered content before returning it. That could be as simple as returning “Hello, Daniel” when you were logged in as the user “Daniel”. What’s going on here is that the server was provided with the username “Daniel”, and embedded it in the returned response.

On a modern website, web servers typically add ‘client-side’ scripts to the mix, to provide additional functionality and interactivity. These client-side scripts, typically written in a language such as JavaScript, are sent to your browser from the web server, and then execute on your own computer rather than on the server.

 

Where’s the problem?

 

The problem is that while you might trust JavaScript returned by your online banking website to perform automatic actions on your computer relating to that website, you almost certainly don’t trust JavaScript written by just anyone to execute on your computer. JavaScript can be dangerous because it executes client-side within your browser and has access to all data within the scope of the website that you’re accessing. If an attacker can control the script that executes in your browser, they can trick your computer into sending them your online banking password, invisibly and without your knowledge, simply because you visited your online banking site.

Cross-site scripting, known by the slightly clunky and confusing initialism “XSS” (“CSS” was already taken by “Cascading Style Sheets”, hence the use of the seemingly random “X”), is the malicious insertion by attackers of unauthorised script into a website or web application. We’ll look at how this is performed below.

 

What causes XSS?

 

At its simplest, XSS relies on a website returning client-side code to a client for it to execute, but in which an attacker has inserted or ‘injected’ code of their own, rather than the code that the web server is intending for your browser to run.

This can take a variety of forms but usually relies on the attacker tricking the website into returning code generated by data submitted to it. In a dynamic website, a server might build a response to a customer based on data that the customer has previously entered in an earlier web request.

 

A simple example

 

A trivial example of a website that could be vulnerable may be as simple as a site containing a single form that echoes your name back to you. You enter your name (“Daniel”) in a login form field, and the website returns “Hello, Daniel” in its HTTP response:

 

What’s meant to happen:

User: [requests webpage]

Server: Returns “What is your name?”

User: Enters “Daniel”

Server: Returns “Hello, Daniel”

User Browser: Renders the HTML returned and displays “Hello, Daniel”

 

What can go wrong

If the website can instead be made to receive a name of “<script>…</script>” in a request, then this will be returned to the user as part of the page and executed as JavaScript code in the unsuspecting visitor’s browser.

Before looking at how an attacker can exploit this, we’ll examine a simpler case where a user simply XSS-attacks themselves via a vulnerable website:

 

User: [requests webpage]

Server: Shows “What is your name?”

User: Enters “<script>…</script>” in the name field

Server: Returns “Hello, <script>…</script>”

User Browser: Sees the executable script content “<script>…</script>” in the response and executes it, performing the potentially malicious action it contains instructions to perform
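
The exchange above can be sketched in a few lines of JavaScript. The function name renderGreetingUnsafe is hypothetical and stands in for whatever server-side code builds the response; this is a minimal illustration of the flaw under that assumption, not a real application.

```javascript
// Hypothetical server-side logic: builds the HTML response by concatenating
// the submitted name directly into the markup. This is the vulnerable
// pattern, shown only to illustrate the flaw.
function renderGreetingUnsafe(name) {
  return '<p>Hello, ' + name + '</p>';
}

// A normal name yields harmless HTML.
const normal = renderGreetingUnsafe('Daniel');

// A name containing markup is returned verbatim, so a browser would treat
// the <script> element as code rather than as text.
const injected = renderGreetingUnsafe('<script>…</script>');

console.log(normal);
console.log(injected);
```

The server never distinguishes between the two inputs: both are simply pasted into the page, and it is the browser that later interprets the second one as script.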

 

How does the attacker inject the script?

 

When a request is sent to a web server, dynamic content (such as the “name” in the above example) is frequently sent as a parameter in the URL, e.g.:

 

http://www.test.com?name=Daniel

 

If an attacker can get you to instead click on a link such as:

 

http://www.test.com?name=<script>…</script>

 

then you have sent their script to the server, and the server will then “reflect” it back to you, where it will be executed by your browser.
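
As a rough sketch of this round trip, the snippet below uses the standard URL API to show how a server might read the “name” parameter from such a link. The helper extractName is a hypothetical name used for illustration only; real server frameworks expose query parameters in their own ways.

```javascript
// Hypothetical helper: reads the "name" query parameter from a request URL,
// as a server might do before echoing it into the page.
function extractName(requestUrl) {
  return new URL(requestUrl).searchParams.get('name');
}

// The benign link from the example above.
const benign = extractName('http://www.test.com?name=Daniel');

// An attacker would typically percent-encode the payload so that it survives
// inside a link; the server decodes it back to the original markup.
const payload = '<script>…</script>';
const craftedLink = 'http://www.test.com?name=' + encodeURIComponent(payload);
const reflected = extractName(craftedLink);

console.log(benign);
console.log(craftedLink);
console.log(reflected);
```

Note that URL encoding does nothing to protect the victim here: the server decodes the parameter before use, so the script arrives in the response intact.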

 

There are three forms of XSS, usually targeting users’ browsers:

 

  1. Reflected XSS: This is the simple example given above. In this type, the website includes user input as part of its response to a request, ‘reflecting’ it back. Typically the user will need to interact with some malicious link that points to an attacker-controlled page, such as malicious ‘watering hole’ websites, advertisements, or similar.
  2. Stored XSS: In this type, the website stores user input that is then viewed at a later time by another user or an administrator. This could be, for example, a comment on a forum or news story. Any person visiting the page where the XSS payload is stored will be vulnerable.
  3. DOM XSS: JavaScript frameworks, single-page applications, and APIs that dynamically include attacker-controllable data to a page are vulnerable to DOM XSS.

 

How dangerous is XSS?

 

There are many ‘rewards’ for an attacker targeting a user with XSS attacks. Malicious payloads can include:

 

  • Capturing user input such as passwords via a key-logger
  • Sending cookies, tokens and other cached data to a third party
  • Performing network requests and system operations that the user hasn’t requested
  • Forcing downloads of files to the end user’s PC

 

Why is Cross-site Scripting so commonly seen?

 

The reason that XSS vulnerabilities are so prevalent is that whilst XSS is a trivial vulnerability to defend against, doing so requires robust coding practices that are consistently implemented over time. Consider how data is handled at input and output in almost every piece of code you write, and at every level of your stack.

 

How to prevent Cross-Site Scripting (XSS)

 

The best approach to protecting your website against XSS and other linked vulnerabilities is to ensure that all input data is treated as raw data/text, and is never allowed to be interpreted as code and context-jump into an executable command. In practice this means performing some combination of:

 

  • Sanitisation of any data received from an external context or user; and
  • Encoding of any data output to another component

 

Sanitisation of input data

You should sanitise input, ideally against a type, or if not then against a whitelist regex of allowed values. In the example above, if you’re asking for someone’s name, then you could for example allow only upper case and lower case letters plus a few other characters – there are no names containing “<”, for example.

However, this is simpler for some parameters and form fields than others. If you are processing a parameter representing a numeric item ID, then simply checking the type is an integer may be simple and sufficient. For other data that is richer, this is more difficult, and sanitisation is of more limited value for such parameters. When you sanitise input, you risk altering the data in ways that might make it unusable. Input sanitisation is therefore generally avoided in cases where the nature of the data is unknown, such as free-form text entry fields, especially if these may legitimately contain complex data sets such as code samples.
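
A minimal sketch of both checks is shown below. The function names, the permitted character set and the length limits are all assumptions made for illustration; the right rules depend on the data your application actually needs to accept.

```javascript
// Assumed allowlist for a name field: Unicode letters, combining marks,
// spaces, apostrophes and hyphens, 1 to 100 characters, starting with a
// letter. Real requirements will differ.
const NAME_PATTERN = /^[\p{L}][\p{L}\p{M} '’-]{0,99}$/u;

// Hypothetical validator for the "name" field in the earlier example.
function isValidName(input) {
  return typeof input === 'string' && NAME_PATTERN.test(input);
}

// Hypothetical validator for a numeric item ID: accepts only a short string
// of decimal digits and returns it as a number, or null if it is rejected.
function parseItemId(input) {
  if (typeof input !== 'string' || !/^[0-9]{1,9}$/.test(input)) {
    return null;
  }
  return Number(input);
}

console.log(isValidName('Daniel'));
console.log(isValidName('<script>…</script>'));
console.log(parseItemId('42'));
console.log(parseItemId('42<script>'));
```

Rejecting invalid input outright, as here, avoids the problem of silently altering data, but it is only practical for fields whose format is well understood.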

 

Encoding of output data (server-side)

The more effective measure to prevent XSS is to ensure that every function in your code that passes data to another context encodes the data for that system, ensuring that it continues to be interpreted as data, and not permitted to jump contexts into being interpreted as executable code. There is no universal encoding standard that can be used, since the encoding mechanism will vary depending upon the context.

If exporting to the browser in HTML, HTML-encoding should be used, for example:

 

<script> → &lt;script&gt;

 

Server-side frameworks commonly provide helper functions to provide this functionality to you. Standard encoding libraries are available for common languages and frameworks including Ruby, PHP, Java and Python.
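
To make the idea concrete, here is a minimal sketch of HTML-encoding written in JavaScript, as it might run in a server-side JavaScript environment. The names escapeHtml and renderGreeting are hypothetical, and the function only covers text placed in element content or in a quoted attribute value; in practice you would normally rely on the helper supplied by your framework or templating engine.

```javascript
// Characters that are significant in HTML, mapped to their entity forms.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Hypothetical helper: HTML-encodes a value so that a browser renders it as
// text rather than interpreting it as markup.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// The greeting from the earlier example, with the name encoded on output.
function renderGreeting(name) {
  return '<p>Hello, ' + escapeHtml(name) + '</p>';
}

console.log(renderGreeting('Daniel'));
console.log(renderGreeting('<script>…</script>'));
```

With encoding applied at the point of output, the name field can contain any characters at all and the browser will still display them as text.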

The same encoding should be done at each language boundary, such as when passing off data to another system. If exporting to a SQL database, then SQL escape strings can be used, although fully parameterised queries are a much more robust defence. In larger teams and projects, standard classes can be used that automatically encode data at relevant system and language boundaries.

 

Encoding of output data (client-side)

The same techniques can be used in client-side code to prevent DOM-based XSS. Specifically, in client-side JavaScript we use a different encoding mechanism depending on the use case. If you need to add content to an HTML element, the best way is to assign the user-generated input to that element using the textContent property. The browser will do all the escaping for you:

 

document.querySelector('#myElement').textContent = theUserGeneratedInput

 

If you need to create a new node to hold the content, then similarly you can use document.createTextNode(), which always produces plain text rather than markup:

 

const el = document.createTextNode(theUserGeneratedInput)

 

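If you need to add content to an HTML attribute, you can use the setAttribute() method of the element. Note that this is only safe for attributes that are not themselves executable contexts: event handler attributes such as onclick, and URL attributes such as href or src, can still run script if they are given attacker-controlled values: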

 

document.querySelector('#myElement').setAttribute('attributeName', theUserGeneratedInput)

 

And finally if you need to add content to the URL, then you can use the window.encodeURIComponent() function:

 

window.location.href = window.location.href + '?test=' + window.encodeURIComponent(theUserGeneratedInput)
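
The effect of encodeURIComponent() can be seen in the short sketch below, which runs outside the browser as well. The helper buildTestUrl and the sample input are hypothetical and chosen only to show which characters are encoded.

```javascript
// Hypothetical helper: appends user input to a base URL as the value of a
// "test" query parameter, encoding it so it cannot alter the URL structure.
function buildTestUrl(baseUrl, userInput) {
  return baseUrl + '?test=' + encodeURIComponent(userInput);
}

// Input containing characters that are significant in URLs and in HTML.
const userInput = 'a&b=<script>';
const url = buildTestUrl('http://www.test.com/', userInput);

// Parsing the URL again recovers the original value as a single parameter.
const roundTripped = new URL(url).searchParams.get('test');

console.log(url);
console.log(roundTripped);
```

Because the ampersand and equals sign are encoded along with the angle brackets, the input cannot smuggle in extra parameters or break out of the value it was meant to occupy.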

 

Security Headers

Various HTTP response headers now exist that aim to mitigate XSS attacks. The foremost one is the Content Security Policy (CSP) response header, which sets bounds on the permitted sources for JavaScript code. CSP is enabled in a web server either globally or on a page-by-page basis, by adding the Content-Security-Policy HTTP response header when serving the page.
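
As an illustration, the sketch below sets a simple policy from server-side JavaScript. The policy shown is an example only and is an assumption rather than a recommendation for any particular site; the helper applyCsp is a hypothetical name, and the response object is assumed to expose a setHeader(name, value) method, as the Node.js http.ServerResponse does.

```javascript
// Example policy only: load scripts and other resources from the site's own
// origin and disallow plugin content. A real policy must be tailored to the
// application it protects.
const CONTENT_SECURITY_POLICY = [
  "default-src 'self'",
  "script-src 'self'",
  "object-src 'none'",
].join('; ');

// Hypothetical helper: adds the CSP header to a response before it is sent.
// `res` is assumed to provide a setHeader(name, value) method.
function applyCsp(res) {
  res.setHeader('Content-Security-Policy', CONTENT_SECURITY_POLICY);
}

console.log(CONTENT_SECURITY_POLICY);
```

Under a policy like this, a browser that supports CSP will refuse to run inline script injected into the page, so the header acts as a second line of defence if output encoding is missed somewhere.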

 

Web Application Firewalls

A web application firewall is a reverse proxy that sits between a user and a screened application. It can parse application-layer (HTTP) requests and determine whether their parameters and headers contain payloads that may be attempted XSS. Web application firewalls are not foolproof, and can produce both false positives (false alerts blocking genuine functionality) and false negatives (missed detection opportunities), but they can be an important tool in a defence-in-depth strategy when well configured.

 

Client Side (Browser) Controls

Some browsers have historically applied protection client-side via built-in browser filtering. However, this is becoming less common. The Chrome XSS Auditor was rolled out with ‘filter’ mode as default, which meant that web pages continued to be rendered, but with any code suspected of presenting a potential XSS issue filtered out. Following the publication of several XSS bypass exploits, the feature was removed.

 

Further Information

If you’d like any further information or would like to see what XSS vulnerabilities we can pick up in your web applications then feel free to email us at info@localhost where an AppCheck representative can set up your free vulnerability assessment.

Get started with Appcheck

No software to download or install.

Contact us or call us 0113 887 8380

About Appcheck

AppCheck is a software security vendor based in the UK, offering a leading security scanning platform that automates the discovery of security flaws within organisations’ websites, applications, network and cloud infrastructure. AppCheck are authorised by the Common Vulnerabilities and Exposures (CVE) Program as a CVE Numbering Authority (CNA).
