Injection Attacks: An Introduction
Research / Posted April 06, 2020
Injection attacks are unusual in that they have appeared in every edition of the OWASP Top 10 list since 2004 and have held the top position since 2010. For a decade, then, this type of vulnerability has been considered the most critical for organisations to be aware of when developing code for the web.
The OWASP Top 10 is intended to focus attention on the most effective areas that companies should address in their efforts to protect themselves and produce secure code – it makes intuitive sense to focus the most effort on the highest risks.
See how AppCheck can help with OWASP Top 10 Vulnerabilities here: AppCheck vs OWASP Top 10
The first public discussions of a specific form of injection attack known as “SQL injection” appeared in 1998. The attack targets SQL, the language that has become almost ubiquitous for managing data held in relational databases, including those that act as the backend storage for websites and web applications, since its standardisation in the mid-1980s. However, SQL is not the only technology vulnerable to injection attacks. So why do injection attacks remain so prevalent after such a long period of awareness by organisations?
Reasons for Longevity
Injection attacks appeal to hackers because they are relatively simple, can be tested for automatically (meaning that attackers can test many websites quickly), and carry a high “reward”, since they can lead to attackers being able to access and steal data or take over entire systems. Compared to more conceptually nebulous attacks such as XSS, injection attacks are simple and can be devastating. If you’ve ever watched a film in which a remote computer system is hacked and completely taken over… an injection attack can deliver exactly that.
Paired with this high impact (and therefore ‘reward’ to malicious hackers or attackers) is the fact that injection vulnerabilities are easy to introduce unless guarded against explicitly and deliberately by developers. It is not necessary to make a ‘mistake’ as such in order for code to be vulnerable to injection – rather, unless injection attacks are specifically considered and coded against, then code will likely be vulnerable. This means that much naive code or code written by junior developers will likely be vulnerable in some form or other.
Injection attacks are also common due to the ubiquity of the underlying stack that they can compromise – the majority of web servers will run (a) a Linux operating system and (b) an SQL data store, both of which can be targeted by variants of injection attacks – so an attacker knows that there is a high likelihood that certain attacks will work on a given system. Compared to language-specific or plugin-specific vulnerabilities in application code, injection vulnerabilities can be targeted at the majority of web services due to commonalities in their underlying technology.
But what exactly is an “Injection” attack anyway?
What is an injection attack?
Variants of injection attack can be exploited against technologies such as operating system shells (eg Linux’s “bash”), data manipulation languages (eg “SQL”) and even directory services such as LDAP. Although the exact syntax used in the attack payload varies somewhat between variants, the underlying principles are always the same and rely on the fact that:
- Web applications are dynamic, and are designed to take input from the user and process it in some way;
- This user input is passed to the back end code that executes on the server, where the code processes the data; and
- In certain circumstances, it’s possible for the code to be tricked into treating the user input as command instructions rather than as data to be processed.
In a formal definition, we would say that injection attacks occur when untrusted data is sent to an interpreter as part of a command or query. The attacker’s hostile data can trick the interpreter into executing unintended commands or accessing data without proper authorisation. What we mean by this is best explained by looking at how interpreters work, and an example of how this can be used against them.
How parameterized code works and why it is vulnerable
Computer code nowadays is written in high-level languages – that is, when you write computer code now, you don’t have to code in binary or machine code (eg “01001000 01100101 01101100”), you can write code using normal words that have special meaning.
This means that the string “delete” in a high-level language can be both:
- a sequence of characters that the computer processes and has no specific meaning to it (data), or
- actual code to cause a program to delete a file from disk (code).
It is therefore pretty important that computers remain clear on whether a given string is code, or data. However, computer code also has to be dynamic: that is, it must be able to respond to variable input. If code ran in a closed state and received no input, it would simply run from a starting state and end with the same result each time. This is fine for a simple piece of code, such as one that only ever prints the word “hello”.
However, this doesn’t let us build a dynamic website. We need to be able to take user input and process it. In a dynamic website, we might pass in a parameter in our URL, such as a username:
Request: http://www.test.com/login?username=dan
Output: “hello, dan”
Underneath the hood, our code to enable this functionality may look like this:
system("echo hello, $username");
The $username variable is being used to represent whichever username is passed in. Our developer expects this to be a name. However, a malicious attacker knows that there is a difference between what is expected and what is permitted. The attacker sends in a special character (which one depends on the underlying system being targeted) that effectively says to the computer “here’s some data first, then here’s the terminating character to signal the end of that data, and finally here’s something else to do”. The code diligently performs the request and executes the command. For example:
Request: http://www.test.com/login?username=dan; rm -f /etc/passwd
Output: “hello, dan”
The output is the same but underneath the hood here’s what was executed. A reminder of our code:
system("echo hello, $username");
And here’s how that was evaluated and executed:
system("echo hello, dan; rm -f /etc/passwd");
The system just printed the username out, and then deleted a file that the attacker told it to, because it failed to preserve the separation of code from data and executed user-provided data as code. This is the crux of injection attacks.
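The same failure can be reproduced harmlessly. The following is a minimal Python sketch, not the code of any real application: the function name greet_unsafely is hypothetical, a POSIX shell is assumed, and a harmless "echo INJECTED" stands in for the attacker's "rm" command.

```python
import subprocess


def greet_unsafely(username: str) -> str:
    # Vulnerable on purpose: the user-supplied value is pasted straight into
    # a shell command string, mirroring system("echo hello, $username").
    command = f"echo hello, {username}"
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout


# A well-behaved user supplies only a name.
benign_output = greet_unsafely("dan")

# An attacker supplies a name, a command separator, and a second command.
# "echo INJECTED" is a harmless stand-in for "rm -f /etc/passwd".
injected_output = greet_unsafely("dan; echo INJECTED")

print(repr(benign_output))
print(repr(injected_output))
```

In the second call the shell sees two commands, not one, because the semicolon inside the data was interpreted as shell syntax.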
How can I prevent injection attacks?
Preventing injection requires taking deliberate measures to ensure that the code and underlying system are clear at all times that given variables are data only, and to prevent them from being interpreted as commands or queries.
Since it is difficult to remember to do this every single time, and it must be done everywhere in order to be effective, the preferred option is to ensure that all handling of received data is channelled through a single “choke point” that is known to safely preserve the separation of data and code. This may be a standardised API or library.
For SQL, equivalents of the above include prepared statements and stored procedures, which are structured mechanisms that automatically enforce the separation between data and command. The difference between prepared statements and stored procedures is that the SQL code for a stored procedure is defined and stored in the database itself, and then called from the application.
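As an illustration of the prepared-statement approach, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table, its contents and the function names are assumptions made up for the example. The "?" placeholder is how sqlite3 expresses a parameterised query: the value is bound as data and never spliced into the SQL text.

```python
import sqlite3

# Hypothetical schema and data, created in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("dan", "d-secret"), ("eve", "e-secret")],
)


def find_user_unsafe(name: str) -> list:
    # Vulnerable: user input becomes part of the SQL statement itself.
    query = f"SELECT name, secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(name: str) -> list:
    # Parameterised: the driver binds the value as data only.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


payload = "nobody' OR '1'='1"
unsafe_rows = find_user_unsafe(payload)  # the OR clause matches every row
safe_rows = find_user_safe(payload)      # no user has this literal name
```

With the concatenated query the payload rewrites the WHERE clause and every row is returned. With the parameterised query the same payload is simply a name that matches nobody.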
Avoiding system calls
The use of system calls within code (functions such as PHP’s system(), exec(), passthru() and shell_exec(), for example) is inherently dangerous, since by design they pass input through to the OS/shell directly. Alternatives such as libraries or functions that perform a specific task, used in place of OS-native binaries invoked via these exec calls, typically provide far greater protection against command injection attacks.
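As a sketch of what such an alternative might look like, the Python example below deletes a file with a library call (pathlib) instead of handing a string to "rm" via a shell. The function delete_upload and the file names are hypothetical. No shell is involved, so shell metacharacters in the file name have no special meaning.

```python
import tempfile
from pathlib import Path


def delete_upload(upload_dir: Path, filename: str) -> bool:
    """Delete one file from upload_dir using a library call, not a shell."""
    target = (upload_dir / filename).resolve()
    # Refuse names that would resolve to somewhere outside the directory.
    if target.parent != upload_dir.resolve():
        return False
    if not target.is_file():
        return False
    target.unlink()
    return True


with tempfile.TemporaryDirectory() as tmp:
    upload_dir = Path(tmp)
    (upload_dir / "report.txt").write_text("quarterly figures")

    # The hostile value is treated as one (non-existent) file name.
    hostile_result = delete_upload(upload_dir, "report.txt; echo INJECTED")
    survived = (upload_dir / "report.txt").exists()

    # A legitimate request still works.
    benign_result = delete_upload(upload_dir, "report.txt")
```

The hostile input fails simply because no file has that name. Nothing is ever parsed as a command.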
Input Validation (Whitelisting)
Where user-provided values have to be used within calls, commands or queries, the parameter values should ideally be checked against a whitelist of permitted values. This should be an explicit list rather than a regex, since regexes can often be bypassed. This may or may not be feasible depending on the input: an input asking for a number between one and ten can easily be checked against a whitelist, whereas an input asking for a name cannot.
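The number-between-one-and-ten case could be sketched as follows in Python. The names ALLOWED_RATINGS and parse_rating are hypothetical. The check is an exact membership test against an explicit set, not a pattern match.

```python
from typing import Optional

# The complete, explicit set of permitted raw values: "1" through "10".
ALLOWED_RATINGS = frozenset(str(n) for n in range(1, 11))


def parse_rating(raw: str) -> Optional[int]:
    """Return the rating as an int, or None if the input is not permitted."""
    if raw in ALLOWED_RATINGS:
        return int(raw)
    return None


accepted = parse_rating("7")
rejected = parse_rating("7; rm -f /etc/passwd")
out_of_range = parse_rating("11")
```

Anything not on the list is rejected outright. No attempt is made to clean it up.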
Escaping User-Supplied Input
If a parameterized API is not available, you should carefully escape special characters using the specific escape syntax for that interpreter. This technique should only be used as a last resort, when none of the above are feasible, since it can often be bypassed using techniques such as double-encoding or other tricks.
OWASP suggest that source code review is the best method of detecting whether applications are vulnerable to injection, closely followed by thorough automated testing of all parameters, headers, URL, cookies, JSON, SOAP, and XML data inputs. Organisations can integrate static (SAST) and dynamic (DAST) application security testing tools into the CI/CD pipeline to identify newly introduced injection flaws prior to production deployment.
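For a POSIX shell, Python's standard library provides shlex.quote for this purpose. The sketch below is illustrative only: the function name greet_with_escaping is hypothetical, a POSIX shell is assumed, and the quoting rules of shlex.quote do not apply to other interpreters such as SQL engines or the Windows command prompt.

```python
import shlex
import subprocess


def greet_with_escaping(username: str) -> str:
    # shlex.quote wraps the value in single quotes where needed so that a
    # POSIX shell treats it as one literal argument.
    command = "echo hello, " + shlex.quote(username)
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout


quoted = shlex.quote("dan; echo INJECTED")
escaped_output = greet_with_escaping("dan; echo INJECTED")

print(quoted)
print(repr(escaped_output))
```

The semicolon is now printed as part of the greeting and no longer acts as a command separator. The safety of this approach depends on every call site remembering to escape, which is why it remains a last resort.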
How can AppCheck Help?
AppCheck performs comprehensive checks for a wide range of injection vulnerabilities including SQL Injection. The AppCheck Vulnerability Analysis Engine provides detailed rationale behind each finding including a custom narrative to explain the detection methodology, verbose technical detail and proof of concept evidence through safe exploitation.
As always, if you require any more information on this topic or want to see what Injection Vulnerabilities AppCheck can pick up in your website and applications then please get in contact with us: firstname.lastname@example.org