AppCheck is pleased to announce enhanced support for scanning GraphQL-based APIs. In this post we take a brief look at GraphQL and some of the security implications surrounding the technology.
The GraphQL Foundation describes it as:
“GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools.” — https://graphql.org/
Originally developed by Facebook, GraphQL allows developers to design their apps in a more flexible way than with traditional RESTful APIs.
For example, consider a bookstore application that provides information on each of its books, such as basic details, author information and a list of books by the same author. In an application powered by a traditional REST API, the client may need to make several requests to retrieve this information, such as:
    GET https://appcheckbooks.com/books/id/1
    Response: {"book name":"The Stand", "AuthorID":"stephenking"..
    GET https://appcheckbooks.com/authors/id/stephenking
    POST https://appcheckbooks.com/books/search
    byauthor=stephenking
In GraphQL, a single API endpoint is used, and the developer defines a query to retrieve the data required by the application. For example, a query to return the name and genre of a book may be structured as follows:
    {
      book(id: 1) {
        name
        genre
      }
    }
If, as in our previous example, the developer also wanted to retrieve other books by the same author along with their name and genre, the query could be extended so that everything is returned in a single request:
    {
      book(id: 1) {
        name
        genre
        author {
          books {
            name
            genre
          }
        }
      }
    }
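To make the "single endpoint, single request" point concrete, the following is a minimal Python sketch that builds (but does not send) one HTTP POST carrying the whole query. The endpoint URL and the helper name `build_graphql_request` are assumptions made for illustration; the JSON body with a `query` key follows the common convention for serving GraphQL over HTTP.

```python
import json
import urllib.request

# Hypothetical endpoint: the article does not name the GraphQL URL.
GRAPHQL_ENDPOINT = "https://appcheckbooks.com/graphql"

BOOK_QUERY = """
{
  book(id: 1) {
    name
    genre
    author {
      books {
        name
        genre
      }
    }
  }
}
"""


def build_graphql_request(endpoint, query):
    """Build (but do not send) a single POST request carrying the whole query."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_graphql_request(GRAPHQL_ENDPOINT, BOOK_QUERY)
print(request.get_method(), request.full_url)
```

The three REST calls from the earlier example collapse into this one request, with the shape of the response dictated by the query itself.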
To learn more about GraphQL we recommend checking out the excellent tutorials available on the GraphQL Foundation website.
A frequently asked question at our training events is “how does GraphQL impact application security?” Strictly speaking, the use of GraphQL should neither introduce nor resolve any particular security flaw and you should ensure the usual secure coding practices are still applied. In the real world however, there are some trends that are worth exploring.
Let’s start with how security could be improved in GraphQL over a standard REST API or scripting language such as PHP. In this example we’ll use a classic SQL injection vulnerability in PHP.
Example: news.php
    // Vulnerable: the id parameter is concatenated into the query unfiltered
    $query = "SELECT story FROM news where news.id = " . $_GET["id"];
    $result = mysql_query($query);
In the code above, the id parameter is read from the URL query string and included within a SQL database query without filtering. The developer expected only integers to be passed via the id parameter such as /news.php?id=100 which in turn would build the following query to return a specific news article.
SELECT story FROM news where news.id=100
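However, due to a lack of input validation, it is possible for the attacker to pass a string value via the id parameter to modify the query. For example, “/news.php?id=100 UNION SELECT CONCAT(username, ':', password) FROM users” injects an additional SELECT statement to return the username and password from the users table. The two values are concatenated into one column because a UNION requires both SELECT statements to return the same number of columns, and the original query returns only one.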
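The same flaw can be reproduced safely in a few lines of Python. This is a sketch under stated assumptions: it swaps in an in-memory SQLite database for MySQL, the table contents are invented, and the payload uses SQLite's `||` operator to fold the username and password into the single column that the original query returns. It contrasts the concatenated query with a parameterized one.

```python
import sqlite3


def build_db():
    """Create a throwaway in-memory database with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE news (id INTEGER PRIMARY KEY, story TEXT);
        CREATE TABLE users (username TEXT, password TEXT);
        INSERT INTO news VALUES (100, 'Launch day');
        INSERT INTO users VALUES ('admin', 's3cret');
        """
    )
    return conn


def fetch_story_unsafe(conn, raw_id):
    # Vulnerable: mirrors the PHP example by concatenating input into SQL.
    query = "SELECT story FROM news WHERE news.id = " + raw_id
    return [row[0] for row in conn.execute(query)]


def fetch_story_safe(conn, raw_id):
    # Bound parameter: the input is treated as a value, never as SQL text.
    query = "SELECT story FROM news WHERE news.id = ?"
    return [row[0] for row in conn.execute(query, (raw_id,))]


payload = "100 UNION SELECT username || ':' || password FROM users"
conn = build_db()
unsafe_rows = fetch_story_unsafe(conn, payload)
safe_rows = fetch_story_safe(conn, payload)
print("unsafe:", unsafe_rows)
print("safe:", safe_rows)
```

With the concatenated query the credentials come back alongside the news story, whereas the bound parameter is compared as a plain value and matches no row.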
Now let’s consider the same component as part of a GraphQL application. In order to use GraphQL, the developer must first define a schema for each query and include each field name along with its type. GraphQL is strongly typed, meaning that if the schema defines a field as being a given type such as Int, supplying a value of any other type will be rejected with an error. It is therefore likely that when defining the schema, if the developer is expecting an Int, this is what will be defined. To continue the example above, each news article returned by the query could be defined using the following schema:
    type NewsArticle {
      id: Int
      title: String
      story: String
    }
Even if the same vulnerable code is used to build the SQL statement, GraphQL would reject the malicious payload on the basis that it doesn’t match the defined type.
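The effect of that type check can be sketched in plain Python. The helpers `coerce_int` and `accepts_int` are hypothetical and do not belong to any GraphQL library; they only approximate how an Int argument is handled, where genuine integers within the signed 32-bit range are accepted and strings are refused.

```python
INT_MIN, INT_MAX = -(2**31), 2**31 - 1  # GraphQL Int is a signed 32-bit integer


def coerce_int(value):
    """Accept only genuine integers in range, roughly as an Int argument would."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"Int cannot represent non-integer value: {value!r}")
    if not INT_MIN <= value <= INT_MAX:
        raise TypeError(f"Int cannot represent value outside 32-bit range: {value!r}")
    return value


def accepts_int(value):
    """Return True if the value would pass the Int check, False otherwise."""
    try:
        coerce_int(value)
        return True
    except TypeError:
        return False


for candidate in (100, "100", "100 UNION SELECT password FROM users"):
    print(repr(candidate), "accepted" if accepts_int(candidate) else "rejected")
```

Because the payload never passes this check, it never reaches the code that builds the SQL statement.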
The bad news is that, as great as the type system is, it will only prevent “type juggling” attacks in specific scenarios, namely when the defined type prevents the attacker’s payload from being accepted. For example, consider that instead of “Int” the developer used the type “ID”, which would seem perfectly reasonable in this scenario. GraphQL defines the “ID” type as follows:
The ID scalar type represents a unique identifier, often used to refetch an object or as the key for a cache. The ID type is serialized in the same way as a String; however, defining it as an ID signifies that it is not intended to be human‐readable. — https://graphql.org/learn/schema/
Since strings are now accepted, the vulnerability becomes exploitable again.
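A short Python sketch shows why the ID type offers no protection on its own, and one way a resolver could compensate. Both helpers are hypothetical: `coerce_id` approximates how an ID argument accepts strings and integers and hands the value on as a string, while `parse_news_id` adds the explicit validation that the type system no longer provides, under the assumption that news IDs are numeric.

```python
import re


def coerce_id(value):
    """Roughly mimic ID handling: strings and integers pass, returned as a string."""
    if isinstance(value, bool) or not isinstance(value, (str, int)):
        raise TypeError(f"ID cannot represent value: {value!r}")
    return str(value)


def parse_news_id(raw):
    """Resolver-side validation (an assumption: news IDs are purely numeric)."""
    id_value = coerce_id(raw)
    if re.fullmatch(r"[0-9]+", id_value) is None:
        raise ValueError(f"Invalid news id: {id_value!r}")
    return int(id_value)


def is_valid_news_id(raw):
    """Return True if the value survives both the ID check and the validation."""
    try:
        parse_news_id(raw)
        return True
    except (TypeError, ValueError):
        return False


payload = "100 UNION SELECT password FROM users"
print("ID type passes payload through as:", coerce_id(payload))
print("Resolver validation accepts payload:", is_valid_news_id(payload))
```

Validation of this kind complements, rather than replaces, a parameterized SQL query in the resolver.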
Whilst the strongly typed nature of GraphQL could make some vulnerabilities less likely, another aspect of GraphQL, namely its flexibility, could make certain types of access control vulnerabilities more common.
For example, one of the key benefits of GraphQL is the ability for the client to submit a single request and receive exactly the data they requested, including nested data (a benefit of the graph structure). Consider a social media application that defines its users with the following schema. Notice that each user has a list of friends, which are also Users.
    type User {
      id: ID
      first_name: String
      last_name: String
      email: String
      friends: [User]
    }
The client may submit a query such as the following to return their profile information and a list of their friends:

    {
      user(id: 100) {
        first_name
        last_name
        email
        friends {
          first_name
          last_name
        }
      }
    }
This is expected behavior as far as the developer is concerned: the application enters the graph via the “user” query and passes in the user’s own ID. For the purpose of this example we assume the value “100” here is validated against the user’s authenticated session.
However, each friend in this example is also a User object with its own friends. Because the query operates on a graph, we can extend the path to return the email addresses belonging to friends of our friends, something which is likely to be considered an access control vulnerability. For example:

    {
      user(id: 100) {
        friends {
          friends {
            first_name
            last_name
            email
          }
        }
      }
    }
This flaw can of course be avoided if access control is properly enforced at the business logic layer. However, the flexibility of GraphQL could make access control mistakes like this one more likely than with the same feature built on a traditional API.
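One way to picture enforcement at the business logic layer is a check that runs every time the email field is resolved, however deep in the graph the User object was reached. The Python sketch below is illustrative only: the user data is invented, and the policy in `can_view_email` (an email is visible to its owner and to direct friends) is an assumption rather than something the schema dictates.

```python
# Invented sample data keyed by user id.
USERS = {
    100: {"first_name": "Alice", "email": "alice@example.com", "friends": [101]},
    101: {"first_name": "Bob", "email": "bob@example.com", "friends": [100, 102]},
    102: {"first_name": "Carol", "email": "carol@example.com", "friends": [101]},
}


def can_view_email(viewer_id, target_id):
    """Assumed policy: an email is visible to its owner and to direct friends."""
    return viewer_id == target_id or viewer_id in USERS[target_id]["friends"]


def resolve_user(viewer_id, user_id, depth):
    """Resolve a user and nested friends to `depth` levels, checking access per field."""
    user = USERS[user_id]
    result = {
        "first_name": user["first_name"],
        # The check runs for every User object, not just the entry point.
        "email": user["email"] if can_view_email(viewer_id, user_id) else None,
    }
    if depth > 0:
        result["friends"] = [
            resolve_user(viewer_id, friend_id, depth - 1)
            for friend_id in user["friends"]
        ]
    return result


# User 100 walks two levels into the graph: friends, then friends of friends.
profile = resolve_user(viewer_id=100, user_id=100, depth=2)
friends_of_bob = profile["friends"][0]["friends"]
print([(f["first_name"], f["email"]) for f in friends_of_bob])
```

Because the decision depends on the viewer and the object being read, not on the path taken through the graph, extending the query does not widen what the viewer can see.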
In summary, the design paradigm employed by GraphQL offers a flexible and predictable method of interacting with your APIs. Strong typing may prevent exploitation of some injection vulnerabilities, but the power and flexibility of the query language can also bring its own challenges.
Feature Summary:
By default, when AppCheck encounters a GraphQL endpoint it will attempt to enumerate all Queries, Mutations and Subscriptions by submitting an Introspection query (see: https://graphql.org/learn/introspection/). Each enumerated query is then submitted as-is to enumerate default values for supported arguments. The enumerated queries are then passed to the AppCheck scanning engine to be tested for security flaws.
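The enumeration step can be illustrated with a short Python sketch. This is not AppCheck's implementation: the helper names, the sample response and the `addBook` mutation are invented for illustration. The query itself uses the standard introspection fields `__schema`, `queryType`, `mutationType`, `subscriptionType`, `fields`, `args` and `defaultValue`.

```python
import json

INTROSPECTION_QUERY = """
query {
  __schema {
    queryType { fields { name args { name defaultValue } } }
    mutationType { fields { name args { name defaultValue } } }
    subscriptionType { fields { name args { name defaultValue } } }
  }
}
"""


def build_request_body(query):
    """Serialize the query into the JSON body commonly POSTed to a GraphQL endpoint."""
    return json.dumps({"query": query})


def enumerate_operations(response):
    """List the operation names exposed under each root type."""
    schema = response["data"]["__schema"]
    found = {}
    for key in ("queryType", "mutationType", "subscriptionType"):
        root = schema.get(key) or {}  # a root type is null when unsupported
        found[key] = [field["name"] for field in root.get("fields") or []]
    return found


# Invented response, shaped like the result of the query above.
SAMPLE_RESPONSE = {
    "data": {
        "__schema": {
            "queryType": {"fields": [
                {"name": "book", "args": [{"name": "id", "defaultValue": None}]},
            ]},
            "mutationType": {"fields": [
                {"name": "addBook", "args": [{"name": "name", "defaultValue": None}]},
            ]},
            "subscriptionType": None,
        }
    }
}

operations = enumerate_operations(SAMPLE_RESPONSE)
print(operations)
```

Each name found this way, together with its arguments and any default values, gives a scanner a starting point for building requests to test.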
In some cases, Introspection may be disabled on the target server. To accommodate testing in this scenario AppCheck includes a downloadable query you can run within a development environment and then configure within the scan definition.
Scans can also be triggered via a REST API to integrate with your Continuous Integration and build processes.
No software to download or install.
Contact us or call us on 0113 887 8380.
AppCheck is a software security vendor based in the UK, offering a leading security scanning platform that automates the discovery of security flaws within organisations’ websites, applications, network and cloud infrastructure. AppCheck is authorized by the Common Vulnerabilities and Exposures (CVE) Program as a CVE Numbering Authority (CNA).