VOOZH about

URL: https://learn.snyk.io/lesson/sql-injection/

⇱ What is SQL injection (SQLi)? | Tutorial & examples | Snyk Learn


SQL injection (SQLi)

Improper handling of input during SQL query generation

~15mins estimated

Select your ecosystem

SQL injection: the basics

Copy to Clipboard

What is SQL injection?

SQL injection (or SQLi) is one of the most widespread code vulnerabilities. To perform a SQL injection attack, an attacker inserts or "injects" malicious SQL code via the input data of the application. SQL injection allows the attacker to read, change, or delete sensitive data as well as execute administrative operations on the database.

About this lesson

In this lesson, you will learn how SQL injection works and how to protect your code against it. We will begin by using SQL injection to bypass the login screen of a vulnerable web application. We will then dive deeper into the code of that vulnerable application and explain why the SQL injection attack was effective. Finally, we will teach you how to fix and prevent SQL injection in your applications.

FUN FACT

Even Star Trek suffers

Did you know that, according to Star Trek, SQL injection will still plague humanity 300 years from now? In "If Memory Serves", a 2019 Star Trek Discovery episode, a probe used SQL injection when attacking a datastore on one of the ship's shuttlecraft. The attack was discovered by Commander Airiam. Sadly, in the 24th century, people are still writing insecure SQL queries.

👁 Image

SQL injection in action

Copy to Clipboard

To see how easy it is to perform a SQL injection attack, let's play with a vulnerable web application–the login screen of a fictional company called startup.io. We will attempt to bypass their login screen by supplying malicious input to the password field.

Let's start with a valid email and password. The form has already been prepopulated with their correct email and password.

Try it out. After you press the login button, you should see a successful login message. If you type in anything else, you'll get a failed login attempt. This all makes sense so far.

Startup.io login

Strange input, strange errors

It’s time to start hacking! Change the password field to evil' password. You are probably expecting to see a standard login failure message again. But, surprisingly, the app displays:

Incorrect syntax near il’

In fact, you may recognise this error message. Remember the last time you tried to run a malformed SQL query? You likely saw a similar error message returned by the database client you were using at the time. But why are we seeing it as the error of a web application?

The error implies that the input password (evil' password) was executed as part of a backend SQL query. The apostrophe character which we placed in evil’ made the syntax of that query invalid.

This is excellent news for us: an input string that we control is directly injected into some backend SQL code! Equipped with that knowledge, we can achieve something much more malicious than causing a harmless error.

Bypass login

Now that we know the app’s backend will execute anything we enter into the password field let’s fully exploit it. We’ll use a very particular password: idontknow' or 1=1; --.

Try to log in with the password above and our standard email (user1@startup.io).

Startup.io login

Bingo! Our new password is obviously incorrect, but we see a successful login message. Congratulations, you’ve just hacked the application! You supplied a peculiar input string that you knew would be injected and run in a SQL query. By doing so, you managed to bypass the login validation of the app. You’ve just executed a SQL injection attack.

But why was it successful? Unlike the previous example, our payload contains a valid SQL (or 1=1; --) after the apostrophe character. In a nutshell, this changed the meaning of the query which is used to decide if users can log in or not. Read on to discover how it actually worked under the hood! We will do a deep dive on both the backend code and the query in the next section. But first, let’s try one more hack.

Delete the database

Change the password to idontknow'; DROP TABLE credentials; -- and press the login button. You should see an error message:

Email user1@startup.io does not exist.

Startup.io login

What is going on here? Our email field has not changed but the application now claims that the email does not exist.

Did you notice the DROP TABLE credentials part of the password we just supplied? It gives a big hint to what just happened. You've managed to delete all the credential data from the application—the emails and passwords of all users!

SQL injection under the hood

Copy to Clipboard

We used the quote character (') in the password field to inject an unintended string into the query in order to change the query's logic. Using other metacharacters, such as - or ;, is also possible.

Let's look at this example in more detail. First, consider the backend logic of startup.io’s login functionality. The following code sample shows a function which aims to verify the user's credentials.

We used the quote character (') in the password field to inject an unintended string into the query in order to change the query's logic. Using other metacharacters, such as - or ;, is also possible.

Let's look at this example in more detail. First, consider the backend logic of our login functionality. The following Rust code sample from our handlers.rs file shows the vulnerable function that aims to verify a user's credentials.

So, what happened when we successfully bypassed the login page?

The key vulnerability lies in how the SQL query is constructed. The function takes the user's email and password and directly inserts them into a query string using format!.

What happened when we hacked the login page?

The email and password parameters come directly from an input form

What happened when we hacked the login page?

The req parameter comes directly from the input form.

What happened when we hacked the login page?

The $_GET parameter array comes directly from the input form via the URL.

What happened when we hacked the login page?

The req parameter comes directly from the input form.

What happened when we hacked the login page?

What happened when we hacked the login page?

The email and password parameters come directly from the input form.

The req parameter comes directly from the input form.

What happened when we hacked the login page?

When you provided the malicious password anything' OR 1=1; --, the format! macro produced the following final SQL query string:

SELECT *
FROM credentials
WHERE email = 'user1@startup.io'
AND password = 'anything' OR 1=1; --'

Let's break down why this query bypasses the security check:

  1. ... AND password = 'anything' completes the original AND clause with some dummy text.
  2. OR 1=1 is the core of the attack. Since 1=1 is always true, this condition makes the entire WHERE clause true for every single user in the database.
  3. ; ends the SELECT statement.
  4. -- is a SQL comment. It tells the database to ignore everything that follows, including the final, trailing single quote (') that the format! macro added, preventing a syntax error.

The database executed this query, found a valid row (because 1=1 is always true), and the application incorrectly granted access.

What happened when we hacked the login page?

The username and password are passed from the input form to this function and the SQL query is constructed by concatenating the input into the query. This meant we were able to manipulate the query and bypass authentication. The key here is the use of the % character placed before our parameters. This will pass the input from the client straight to the database, without performing any sort of checks or validation. You should never use % or + to merge any user-supplied values into your queries.

The connection object is used to create a “cursor” which allows you to send commands to the database and cursor.fetchone() will fetch the result from the query.

How does string manipulation work?

The key here is the construction of the query through string concatenation. Let's try to inject the malicious password which we use to bypass the login into the query. After the string concatenation is done, the WHERE clause of the query becomes:

WHERE email='user1@startup.io'
AND password='idontknow' OR 1=1;

If we consider the following:

  • AND has precedence over OR (true for most SQL implementations)
  • password='idontknow' is FALSE for the given email
  • 1=1 is always TRUE

We can reduce the WHERE clause as follows:

-> email='user1@startup.io' AND password='idontknow' OR 1=1
-> (TRUE AND FALSE) OR TRUE
-> FALSE OR TRUE
-> TRUE

A SQL injection attack illustration where a malicious input is injected into a login window of a web application

By injecting malicious input into the query, we managed to change the query's logic! The query will evaluate as TRUE even when the supplied password is invalid. Since the query result is used to decide if a given user should be allowed to log in, we’ve bypassed the application's authentication mechanism.

What is the impact of SQL injection?

Now that you know how SQL injection works, imagine the implications. By using SQL injection, an attacker can change the logic of the vulnerable query. They can read or modify any data stored in the database. If the vulnerable query is executed as a privileged user, the attacker can run any administrative operations, including deleting the entire database schema.

Why is SQL injection common?

SQL injection is widespread because it is easy for attackers to detect where it can be exectued. Once detected, it is easy to exploit. Any website or application which uses a database is subject to a SQL injection attack. In the past, almost all popular SQL clients and libraries were vulnerable to SQL injection attacks.

FUN FACT

Naming is hard

SQL injection has been used by creative people in many unexpected ways. For example, in 2014, an individual in Poland legally renamed his business to "Dariusz Jakubowski x'; DROP TABLE users; SELECT '1" to disrupt the operation of spammers' harvesting bots. Wonder what this peculiar name means? Keep reading to find out!

👁 Image

Scan your code & stay secure with Snyk - for FREE!

Did you know you can use Snyk for free to verify that your code
doesn't include this or other vulnerabilities?

Scan your code

SQL injection mitigation

Copy to Clipboard

How do you mitigate SQL injection?

To protect your app against SQL injection, you need to prevent user-supplied input from affecting the logic of the executed query. One way to achieve that is through parameterized queries which almost all SQL libraries support. In a parameterized query, the input is "escaped" and handled literally.

For instance, if our example used a parameterized query, and you were to insert ' OR 1=1' as a password input to that query, a database client would not interpret that input as SQL. Instead, it would look for a password string ' OR 1=1'.

A SQL injection mitigation illustration where the malicious input injection is prevented by parametrized query

Parameterized queries

We can fix our vulnerable checkLogin function by using a parameterized query. Take a look at the diff below, which compares the vulnerable code with the fixed code. Notice the use of ? to indicate the location of the query parameters. The parameters are passed as an input to the query method. Internally, the query method ensures that the input parameters are interpreted literally and not as separate SQL statements.

Parameterized queries bind user-supplied input, preventing SQL injection. Under the hood, they typically use the database's prepared statement functionality. Go's database/sql package automatically creates prepared statements for you with bound parameters whenever additional arguments are passed.

Consider a possible fix to our SQL injection problem, which uses bound parameters. You can see the difference in the diff below.

LINQ lambda expressions

Consider a possible fix to our SQL injection problem, which uses a built-in LINQ lambda predicate. Entity Framework correctly generates a parameterized query by evaluating the provided expression. You can see the difference in the diff below.

Parameterized queries

We can fix our vulnerable check_login function by using a parameterized query. Take a look at the diff below, which compares the vulnerable code with the fixed code. Notice that we removed the % and the ‘ ‘ around the placeholders. The parameters are now passed as an input to the execute method. Internally, this ensures that the input parameters are escaped by the library and are not treated as literal SQL statements.

Prepared statements explained

Consider a possible fix to our SQL injection problem, which uses a PreparedStatement. You can see the comparison in the diff below.

MySQL parameterized query

We can fix our vulnerable checkLogin function by using a MySQL parameterized query. Take a look at the diff below, which compares the vulnerable code with the fixed code. Notice the use of ? to indicate the location of the query parameters. The parameters are passed as an input to the query method. Internally, the query method ensures that the input parameters are interpreted literally and not as separate SQL statements.

We can fix our vulnerable check_login function by using an SQLite parameterized query. Take a look at the Patched snippet below, which compares the vulnerable code with the fixed code. Notice the use of ? to indicate the location of the query parameters. The parameters are passed as input using sqlite3_bind_text() functions after preparing the statement. Internally, the SQLite engine ensures that the input parameters are interpreted literally as data values and not as separate SQL statements.

Parameterized queries

We can fix our vulnerable vulnerable_login function by using a parameterized query, which is exactly what we've done in the secure_login function. Take a look at the comparison below, which highlights the vulnerable code against the fixed code. Notice the use of ? as a placeholder for our parameters. The .bind() method then safely attaches our variables to the query. Internally, sqlx ensures that these input parameters are sent to the database separately from the query template, making it impossible for them to be interpreted as SQL statements.

Vunerable code (`vulnerable_login'):

Parameterized queries

We can fix our vulnerable check_login by switching to a parameterized query. In the diff below, compare the unsafe code with the corrected version. Notice that we removed Ruby’s % string formatting and the manual single quotes around values. Instead, we use positional placeholders ($1, $2, …) and pass the parameters separately via exec_params. Internally, the driver binds and escapes these inputs so they’re handled as data, not executable SQL.

Most SQL Frameworks and libraries - whether for MySQL, Postgres, Oracle or your SQL flavor of choice - support parameterized queries. You should aim to use them by default, and avoid any methods in your framework that use string concatenation.

Bound parameters explained

Most SQL client libraries allow for queries with parameterized values. In our case, this is conn.QueryRowContext. Notice the usage of placeholders ? as query parameters, and email and password as additional arguments. In the broken code example, we inject parameters directly into the query. In the fixed code example, we put placeholders in all places where we previously injected input parameters. This way, we tell the SQL database to expect input parameters in all locations where it sees ?. Our call to conn.QueryRowContext notices the additional email and password arguments, and sends a preparted statement to the database, then runs it with the additional arguments.

When the prepared statement is sent to the database, the statement does not contain any user-supplied data. When the database executes the statement, it uses our additional arguments as bound parameters, replacing the ? placeholders in the query with the values of the additional arguments in a safe manner.

This way we keep the parameters of the query and the query itself cleanly separated. The job of combining the two falls into the database engine, which eliminates the SQL injection vulnerability.

Notice the usage of .AnyAsync() from the DatabaseContext object in the fixed code. This call allows Entity Framework to prepare the statement underneath the hood, by using built-in LINQ methods.

A lambda expression is a convenient way of defining an anonymous (unnamed) function. Many LINQ methods take lambda expression as a parameter.

Entity Framework leverages this powerful capability to generate SQL queries, that are safe and sanitized. Take our example, we've defined a strongly typed model that maps to our table. Once the statement is executed the lambda predicate is evaluated, and converted to a SQL statement that will be sent to the database.

Useful resources to learn more:

Notice the creation of a PreparedStatement from the Connection object in the fixed code. This call sends our SQL statement to the database, that is, it "prepares" the statement for future execution. Also, notice the usage of placeholders ? as query parameters. In the broken code example (left), we inject parameters directly into the query. In the fixed code example (right), we put placeholders in all places where we previously injected input parameters. This way, we tell the SQL database to expect input parameters in all locations where it sees ?.

When we send the prepared statement to the database, the statement does not contain any user-supplied data. We supply the input parameters separately by first calling the statement.setString method with the parameter value and later calling the executeQuery method. When the executeQuery method runs, a second request is sent to the database. The request instructs the database to execute the prepared statement with the input parameters we set using the setString method.

Most importantly, the second request does not contain the SQL statement itself. The SQL statement was already sent to the database by our first request (preparedStatement). The second request (executeQuery) sends the parameters and asks the database to execute the query. This way we keep the parameters of the query and the query itself cleanly separated. The job of combining the two falls into the database engine, which eliminates the SQL injection vulnerability.

Secure code (secure_login):

Modern Rust database libraries like sqlx are designed with security as a top priority. Their APIs strongly encourage the use of parameterized queries. You should always use these features and avoid any method that relies on building queries with string formatting like format!.

Quiz

Copy to Clipboard

Test your knowledge!

Quiz

What is the issue with the following code: const sqlQuery = "SELECT email FROM credentials WHERE " + "email='" + req.body.email + "' AND " + "password='" + req.body.password + "'";

Test your knowledge!

Quiz

What is wrong with the following code: cursor.execute("SELECT * FROM users WHERE username = '%s' AND password = '%s'" % (username, password))

Test your knowledge!

Quiz

What is wrong with the following code? conn.exec("SELECT * FROM users WHERE username = '%s' AND password = '%s'" % [username, password])

Keep learning

To learn more about SQL injection, check out some other great content produced by Snyk:

Congratulations

You’ve learned what SQL injection is and how to protect your systems from it. We hope you will apply your new knowledge wisely and make your code much safer.

FAQs

What is an SQL injection?

SQL injection (or SQLi) is one of the most widespread code vulnerabilities. To perform a SQL injection attack, an attacker inserts or "injects" malicious SQL code via the input data of the application.

How does an SQL injection work?

The quote character (') in the password field can be used to inject an unintended string into the query in order to change the query's logic. Using other metacharacters, such as - or ;, is also possible.

What to learn next?

Expression Language injection (ELI)

Learn how Expression Language injection (ELI) works and how to protect your applications against it with real-world examples from security experts.
Java
0% Completed

Spring4Shell

Learn what Spring4Shell is, why you should be aware of it, and how you can prevent and remediate the vulnerability in your organization.
Java
0% Completed

Insecure deserialization

Learn how an insecure deserialization attack works, and how to mitigate and remediate the vulnerability with real-world examples from security experts.
Java
Python
JavaScript
0% Completed