If you are attending this workshop at GitHub Universe, please follow the instructions below to prepare for the workshop in advance.
Closer to the workshop date, the detailed workshop steps will be available below, which the facilitators will guide you through.
- Prerequisites and setup instructions
- On your local machine
- On Codespaces
- Useful commands
- Workshop
Please complete this section before the workshop, if possible.
- Install Visual Studio Code.
- Install the CodeQL extension for Visual Studio Code.
- You do not need to install the CodeQL CLI: the extension will handle this for you.
- Clone this repository:
git clone https://github.com/githubuniverseworkshops/codeql
- Use `git pull origin main` regularly to keep this clone up to date with the latest state of the repository.
- Open the repository in Visual Studio Code: File > Open (or Open Folder) > Browse to the checkout of `githubuniverseworkshops/codeql`.
- Follow Common setup steps (local and Codespaces) below.
- Import the CodeQL database to be used in the workshop:
- Click the CodeQL rectangular icon in the left sidebar.
- The first time you do this, the CodeQL extension will download the CodeQL CLI.
- In the Databases panel, hover over the title and click the cloud-shaped icon labelled `Download Database`, or click the button From a URL (as a zip file).
- Copy and paste this URL into the box, then press OK/Enter: https://github.com/githubuniverseworkshops/codeql/releases/download/universe-2022/codeql-ruby-workshop-opf-openproject.zip
- The CodeQL extension will download the chosen database.
- After the database is downloaded, it will appear in the left sidebar under Databases. Look for a checkmark to the left of the database name, indicating it is selected.
- If the database is not selected, hover your cursor on the database name, and click Set Current Database.
- Install the CodeQL library package for analyzing Ruby code.
- From the Command Palette (`Cmd/Ctrl+Shift+P`), search for and run the command `CodeQL: Install Pack Dependencies`.
- At the top of your VS Code window, type `github` in the box to filter the list.
- Check the box next to `githubuniverseworkshops/codeql-workshop-2022-ruby`.
- Click OK/Enter.
- Run a test CodeQL query:
- Open the file `workshop-2022/example.ql`.
- From the Command Palette (`Cmd/Ctrl+Shift+P`) or the right-click context menu, click the command `CodeQL: Run Query`.
- After the query compiles and runs, you should see the results in a new `CodeQL Query Results` tab.
- Create a new file in the `workshop-2022` directory called `UrlRedirect.ql`. You'll develop this query during the workshop.
- Go to https://github.com/githubuniverseworkshops/codeql/codespaces.
- Click Create codespace on main.
- A Codespace will open in a new browser tab.
- When the Codespace is ready, it will open a VS Code workspace file `workshop-2022.code-workspace`.
- Click Open Workspace in the bottom right. The Codespace will reload.
- After the Codespace reloads, follow Common setup steps (local and Codespaces) under On your local machine.
- Run a query using the following commands from the Command Palette (`Cmd/Ctrl + Shift + P`) or right-click menu:
- `CodeQL: Run Query` (run the entire query)
- `CodeQL: Quick Evaluation` (run only the selected predicate or snippet)
- Click the links in the query results to navigate to the source code.
- Explore the CodeQL libraries in your IDE using:
- autocomplete suggestions (`Cmd/Ctrl + Space`)
- jump-to-definition (`F12`, or `Cmd/Ctrl + F12` in a Codespace in the browser)
- documentation hovers (place your cursor over an element)
- the AST viewer on an open source file (`View AST` from the CodeQL sidebar or Command Palette)
In this workshop we will look for URL redirection vulnerabilities in Ruby code that uses the Ruby on Rails framework. Such vulnerabilities can occur in web applications when a URL string that is controlled by an external user makes its way to application code that redirects the current user's browser to the supplied URL.
The example that we will find was a potential vulnerability in the open-source project management software OpenProject, which was introduced in a pull request, identified by CodeQL static analysis on the PR, diagnosed during PR review, and fixed before the PR was merged. Note that it remained a potential problem, not a real vulnerability, thanks to the efforts of the project maintainers and a safe default setting built into Rails 7. However, it is a good example to help us understand and detect serious URL redirection vulnerabilities that may occur elsewhere.
(OpenProject is licensed under the GNU GPL v3.0.)
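To make this class of vulnerability concrete, here is a minimal, self-contained Ruby sketch of the pattern the workshop looks for. It is not OpenProject code and it does not use Rails: `FakeController`, its `params` hash, the `back_url` parameter name, and the recording `redirect_to` method are all hypothetical stand-ins, used only to show user input reaching a redirect unchanged.

```ruby
# Hypothetical stand-in for a Rails controller; NOT real Rails code.
# `params` plays the role of the request parameters (user-controlled), and
# `redirect_to` only records the target instead of sending an HTTP response.
class FakeController
  attr_reader :params, :redirected_to

  def initialize(params)
    @params = params
    @redirected_to = nil
  end

  def redirect_to(location)
    @redirected_to = location
  end

  # A GET-style handler: the redirect target comes straight from user input.
  def show
    redirect_to params[:back_url]
  end
end

# The attacker chooses the value of `back_url` in the request.
controller = FakeController.new({ back_url: "https://attacker.example/phish" })
controller.show
puts controller.redirected_to # prints https://attacker.example/phish
```

In a real application the browser would follow that location, which is why the redirect argument is treated as a sink later in the workshop.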
The workshop is split into several steps. You can write one query per step, or work with a single query that you refine at each step. Each step has a hint that describes useful classes and predicates in the CodeQL standard libraries for Ruby.
In this section, we will reason about the abstract syntax tree (AST) of a Ruby program.
We will use this reasoning to identify specific Ruby on Rails method calls, which:
- redirect the application to another URL
- are invoked as part of HTTP `GET` request handler methods configured in the application.
The arguments of these method calls are the URLs being redirected to, and hence are potential sinks for URL redirection vulnerabilities.
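To get a feel for what a syntax tree of such a call looks like, the following sketch parses a one-line Ruby snippet with Ripper, the parser that ships with Ruby. This is only an analogy: Ripper's S-expression tree is not the CodeQL AST and its node names differ from the `Ast` classes used in this workshop. The snippet being parsed and the `identifiers` helper are assumptions made for illustration.

```ruby
require "ripper"

# Parse a one-line Ruby snippet into Ripper's S-expression tree.
# Ripper ships with Ruby; its tree shape differs from CodeQL's Ast classes.
source = "redirect_to params[:back_url]"
tree = Ripper.sexp(source)

# Recursively collect the text of every identifier token (:@ident) in the tree.
def identifiers(node)
  return [] unless node.is_a?(Array)
  return [node[1]] if node[0] == :@ident

  node.flat_map { |child| identifiers(child) }
end

names = identifiers(tree)
puts names.inspect
```

The method name and the expression passed as its argument are separate nodes in the tree, which is the structure the queries in this section rely on when they select a call and then its first argument.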
- Find all method calls in the program. To reason about the abstract syntax tree (AST) of a Ruby program, start by adding `import ruby` to your CodeQL query, and use the types defined in the `Ast` module.
  Hint
  - Start typing `from Ast::` to see the types available in the AST library.
  - A method call is represented by the `Ast::MethodCall` type in the CodeQL Ruby library.
  Solution
  import ruby
  from Ast::MethodCall call
  select call
- Find all calls in the program to methods named `redirect_to`.
  Hint
  - Add a `where` clause.
  - `MethodCall` has a predicate called `getMethodName()` that returns the method name as a `string`.
  - CodeQL string literals are written in "double quotes".
  - Use the equality operator `=` to assert that two CodeQL expressions are the same.
  Solution
  import ruby
  from Ast::MethodCall redirectCall
  where redirectCall.getMethodName() = "redirect_to"
  select redirectCall
- Calls to the `redirect_to` method use its first argument as the target URL. Update your query to report the redirection argument.
  Hint
  - `MethodCall.getAnArgument()` returns all possible arguments of the method call.
  - `MethodCall.getArgument(int i)` returns the argument at (0-based) index `i` of the method call.
  - The argument is an expression in the program, represented by the CodeQL class `Ast::Expr`.
  - Introduce a new variable in the `from` clause to hold this expression, and output the variable in the `select` clause.
  Solution
  import ruby
  from Ast::MethodCall redirectCall, Ast::Expr arg
  where
    redirectCall.getMethodName() = "redirect_to" and
    arg = redirectCall.getArgument(0)
  select redirectCall, arg
- Recall that predicates allow you to encapsulate logical conditions in a reusable format. Convert your previous query to a predicate that identifies the set of expressions in the program that are arguments of `redirect_to` method calls. You can use the following template:
  predicate isRedirect(Ast::Expr redirectLocation) {
    exists(Ast::MethodCall redirectCall |
      // TODO fill me in
    )
  }
  `exists` is a mechanism for introducing temporary variables with a restricted scope. You can think of an `exists` as its own `from`-`where`-`select`. In this case, we use `exists` to introduce the variable `redirectCall` with type `Ast::MethodCall`.
  Hint
  - You can translate from the previous query to a predicate by:
    - Converting some variable declarations in the `from` part to the variable declarations of an `exists`.
    - Placing the `where` clause conditions (if any) in the body of the `exists`.
    - Adding a condition which equates the `select` to one of the parameters of the predicate.
  Solution
  import ruby
  predicate isRedirect(Ast::Expr redirectLocation) {
    exists(Ast::MethodCall redirectCall |
      redirectCall.getMethodName() = "redirect_to" and
      redirectLocation = redirectCall.getArgument(0)
    )
  }
- When you've written your predicate, you can evaluate it directly using Quick Evaluation (click on the prompt above the predicate name, or right-click on the predicate), or using a query that calls the predicate.
  Solution
  import ruby
  predicate isRedirect(Ast::Expr redirectLocation) {
    exists(Ast::MethodCall redirectCall |
      redirectCall.getMethodName() = "redirect_to" and
      redirectLocation = redirectCall.getArgument(0)
    )
  }
  from Ast::Expr e
  where isRedirect(e)
  select e
- Like predicates, classes in CodeQL can be used to encapsulate reusable portions of logic. Classes represent sets of values, and they can also include operations (known as member predicates) specific to that set of values. You have already seen some CodeQL classes (`MethodCall`, `Expr`, etc.) and associated member predicates (`MethodCall.getMethodName()`, `MethodCall.getArgument(int i)`, etc.).
  `Ast::MethodBase` is the class of all Ruby methods. Create a subclass named `GetHandlerMethod`. To begin with, your subclass will contain all the values from the superclass.
  Hint
  - Use the `class` keyword to declare a class, and the `extends` keyword to declare the supertypes of your class.
  Solution
  import ruby
  class GetHandlerMethod extends Ast::MethodBase { }
- Ruby on Rails allows developers to define routing logic, describing the various kinds of URL route that are accepted by the Rails application, and which Ruby methods handle HTTP requests to each of those URL routes. The request handlers are Ruby methods in a controller class, usually a subclass of `ActionController`. We want to find all methods that are request handlers for HTTP `GET` requests, because these are potentially susceptible to URL redirection vulnerabilities.
  Refine your class so that it describes only the set of methods that are guaranteed to be request handlers for HTTP `GET` requests in Rails, according to the routing logic written in Ruby code. You do not need to identify request handlers yourself: the CodeQL standard library has a module called `ActionController` and a class called `ActionControllerActionMethod` that already do this for you.
  Hint
  - Add `import codeql.ruby.frameworks.ActionController`. This library helps reason about the Rails `ActionController` class and its subclasses and methods, which define routing and handling for server-side Rails applications.
  - Create a characteristic predicate for your class. This looks like a constructor: `GetHandlerMethod() { ... }`
  - Within the characteristic predicate, use the special `this` variable to refer to the methods whose properties we are describing in the class.
  - Use an inline cast of the form `this.(ActionControllerActionMethod)` to assert that `this` is a public Rails controller method. Then you can call further Rails-specific predicates on this value.
  - Use `ActionControllerActionMethod.getARoute()` to find all URL routes that are directed to this handler, according to the code.
  - Use `Route.getHttpMethod()` to find the HTTP method name (e.g. "get") for a given route.
  - Use Quick Evaluation on the characteristic predicate to see all values of your new class.
  Solution
  import ruby
  import codeql.ruby.frameworks.ActionController
  class GetHandlerMethod extends Ast::MethodBase {
    GetHandlerMethod() {
      this.(ActionControllerActionMethod).getARoute().getHttpMethod() = "get"
    }
  }
- The previous step only finds handler methods that we know for sure are handling `GET` requests, according to the routes identified in the code by CodeQL. What about handler methods where CodeQL isn't sure about the request type? Let's try to handle those too. Expand your class's characteristic predicate to include handler methods where we cannot statically find a route definition.
  Hint
  - Use the `or` keyword to expand the set of values that satisfy a logical formula.
  - Use the `not exists` keywords to assert that a logical formula does not hold, or that a particular value does not exist.
  - Use Quick Evaluation on the characteristic predicate to see all values of your new class.
  To view the differences in results, select the results of both query runs in the Query History view, right-click, and use the `Compare Results` command.
  Solution
  import ruby
  import codeql.ruby.frameworks.ActionController
  class GetHandlerMethod extends Ast::MethodBase {
    GetHandlerMethod() {
      this.(ActionControllerActionMethod).getARoute().getHttpMethod() = "get"
      or
      not exists(this.(ActionControllerActionMethod).getARoute())
    }
  }
- The previous step may now find too many possible methods! Limit your class to methods declared within `ActionController` classes.
  Hint
  - Use the `exists` or `any` quantifiers to declare a variable of type `ActionControllerControllerClass`, and assert that `this` is one of the methods of such a class.
  - Use Quick Evaluation on the characteristic predicate to see all values of your new class.
  - Use the Compare Results command in the Query History view to view the differences in results.
  Solution
  import ruby
  import codeql.ruby.frameworks.ActionController
  class GetHandlerMethod extends Ast::MethodBase {
    GetHandlerMethod() {
      this.(ActionControllerActionMethod).getARoute().getHttpMethod() = "get"
      or
      not exists(this.(ActionControllerActionMethod).getARoute()) and
      this = any(ActionControllerControllerClass c).getAMethod()
    }
  }
- The previous step may still find too many possible methods! Methods named `create/update/destroy/delete` are probably not HTTP `GET` handlers if we can't find a `GET` route in the code. Exclude them from your class.
  Hint
  - Use the `and not` keywords to exclude values from a logical formula.
  - Use the `regexpMatch` built-in predicate to match a `string` value using a (Java-style) regular expression. `.*` matches any string pattern. `|` is the alternation/or operator.
  - Use Quick Evaluation on the characteristic predicate to see all values of your new class.
  - Use the Compare Results command in the Query History view to view the differences in results.
  Solution
  import ruby
  import codeql.ruby.frameworks.ActionController
  class GetHandlerMethod extends Ast::MethodBase {
    GetHandlerMethod() {
      this.(ActionControllerActionMethod).getARoute().getHttpMethod() = "get"
      or
      not exists(this.(ActionControllerActionMethod).getARoute()) and
      this = any(ActionControllerControllerClass c).getAMethod() and
      not this.getName().regexpMatch(".*(create|update|destroy).*")
    }
  }
- Change your `isRedirect` predicate to find redirect calls only within methods we think are HTTP `GET` handlers.
  Hint
  - Add a parameter to the predicate, and declare its type to be the class you just defined. This states that all values of the parameter must belong to the class.
  - We need to find the enclosing method of either `redirectCall` or `redirectLocation` from the previous predicate. (They should both have the same enclosing method; choose one.)
  - Look for a predicate on `Expr` that finds the enclosing method.
  Solution
  import ruby
  import codeql.ruby.frameworks.ActionController
  class GetHandlerMethod extends Ast::MethodBase {
    GetHandlerMethod() {
      this.(ActionControllerActionMethod).getARoute().getHttpMethod() = "get"
      or
      not exists(this.(ActionControllerActionMethod).getARoute()) and
      this = any(ActionControllerControllerClass c).getAMethod() and
      not this.getName().regexpMatch(".*(create|update|destroy).*")
    }
  }
  predicate isRedirect(Ast::Expr redirectLocation, GetHandlerMethod method) {
    exists(Ast::MethodCall redirectCall |
      redirectCall.getMethodName() = "redirect_to" and
      redirectLocation = redirectCall.getArgument(0) and
      redirectCall.getEnclosingMethod() = method
    )
  }
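The selection logic in `GetHandlerMethod` can be easier to follow when replayed on toy data. The sketch below is plain Ruby, not CodeQL: the `ROUTES` table, the action names, and the `get_handler?` helper are hypothetical, and every action is assumed to already belong to a controller class. It mirrors the shape of the characteristic predicate, where `and` binds more tightly than `or`, so the name filter only applies to methods with no known route.

```ruby
# Hypothetical data: controller actions and the HTTP verbs of the routes
# found for each one. An empty list means "no route definition was found".
ROUTES = {
  "show"         => ["get"],
  "update"       => ["patch"],
  "search"       => [],
  "create_draft" => [],
}.freeze

# \A and \z anchor the pattern so that, like regexpMatch, it has to match
# the whole method name rather than a substring.
WRITE_ACTION = /\A.*(create|update|destroy).*\z/

# Known GET route, OR (no known route AND the name does not look like a write action).
def get_handler?(action, verbs)
  verbs.include?("get") || (verbs.empty? && !WRITE_ACTION.match?(action))
end

handlers = ROUTES.select { |action, verbs| get_handler?(action, verbs) }.keys
puts handlers.inspect # prints ["show", "search"]
```

Here `update` is dropped because it has a route that is not `GET`, and `create_draft` is dropped by the name filter, while `search` is kept even though no route was found for it.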
In this section, we will move from reasoning about the AST to reasoning about data flow. The data flow graph is built on top of the AST, but contains more detailed semantic information about the flow of information through the program. We will also use more concepts that are already modelled in the CodeQL standard libraries for Ruby, instead of having to manually model each pattern.
- The `DataFlow` library models the flow of data through the program. This is already imported by `import ruby`, but you can also explicitly import it using `import codeql.ruby.DataFlow`. The class `DataFlow::Node` from this library represents semantic elements in the program that may have a value. Data flow nodes typically have corresponding AST nodes, but we can perform more sophisticated reasoning on the data flow graph. Modify your predicate from the previous section to reason about data flow nodes instead of AST nodes.
  Hint
  - Change the type of `redirectLocation` from `Ast::Expr` to `DataFlow::Node`. This is the generic type of all data flow nodes. Most nodes correspond to expressions or parameters in the AST.
  - Change the type of `redirectCall` from `Ast::MethodCall` to `DataFlow::CallNode`. This is a more specialised type of data flow node, corresponding to a particular type of expression in the AST: a `Call`.
  - There are still compilation errors! Methods are a concept in the AST, not the data flow graph. We cannot call `getEnclosingMethod` on a `DataFlow::Node`, so we have to convert it first into an AST node.
  - Use `asExpr()` to convert from a `DataFlow::Node` into a `Cfg::ExprCfgNode`, which is a type of node in the "control flow" graph.
  - Use `getExpr()` to convert from an `ExprCfgNode` into an `Ast::Expr`, which is a type of AST node.
  - The rest of the predicate continues to compile without errors. This is because structural predicates like `getArgument` are defined in parallel on both the AST library and the data flow library.
  Solution
  import ruby
  import codeql.ruby.frameworks.ActionController
  predicate isRedirect(DataFlow::Node redirectLocation, GetHandlerMethod method) {
    exists(DataFlow::CallNode redirectCall |
      redirectCall.getMethodName() = "redirect_to" and
      redirectLocation = redirectCall.getArgument(0) and
      redirectCall.asExpr().getExpr().getEnclosingMethod() = method
    )
  }
- We have manually modelled one method that performs redirects: `redirect_to`. There may be others! Instead of manually modelling each possible case ourselves, let's use the modelling already provided in the CodeQL standard library. The `Concepts` library models common semantic concepts in Ruby programs, such as HTTP requests and responses. Import this library using `import codeql.ruby.Concepts`, and modify your predicate to use its modelling of HTTP redirect responses.
  Hint
  - Add `import codeql.ruby.Concepts`.
  - Change the type of `redirectCall` to `Http::Server::HttpRedirectResponse`.
  - Remove the logical condition that states the method name of `redirectCall` must be `"redirect_to"`.
  - Use `HttpRedirectResponse.getRedirectLocation()` to identify the redirect URL (previously this was the call argument).
  Solution
  import ruby
  import codeql.ruby.frameworks.ActionController
  import codeql.ruby.Concepts
  predicate isRedirect(DataFlow::Node redirectLocation, GetHandlerMethod method) {
    exists(Http::Server::HttpRedirectResponse redirectCall |
      redirectCall.getRedirectLocation() = redirectLocation and
      redirectCall.asExpr().getExpr().getEnclosingMethod() = method
    )
  }
- `params` is a method available on Rails controller classes. It returns a hash (specifically of type `ActionController::Parameters`) that has been instantiated (by Rails) with the parameters of the incoming HTTP request.
  These parameters are a source of remote user input. In the CodeQL standard library for Ruby, they are modelled by the `ParamsSource` class, which is a subclass of the more general `RemoteFlowSource` class.
  Define a new predicate `isSource(DataFlow::Node source)` that describes all sources of remote user input in the program.
  Hint
  - Add `import codeql.ruby.dataflow.RemoteFlowSources`.
  - Use the `RemoteFlowSource` class, which is a type of `DataFlow::Node`.
  - Use the `instanceof` operator to assert that a particular value belongs to a particular class.
  Solution
  import codeql.ruby.dataflow.RemoteFlowSources
  predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }
We have now identified (a) places in the program which can perform URL redirection within HTTP GET handlers and (b) places in the program which receive untrusted data. We now want to tie these two together to ask: does the untrusted data ever flow to the potentially unsafe URL redirection call?
In program analysis we call this a data flow or taint tracking problem. Data flow helps us answer questions like: does this expression ever hold a value that originates from a particular other place in the program?
We can visualize the data flow problem as one of finding paths through a directed graph, where the nodes of the graph are elements in the program that have a value, and the edges represent the flow of data between those elements. If a path exists, then the data flows between those two nodes.
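The graph view described above can be sketched in a few lines of plain Ruby. This is only an illustration of path finding over a directed graph, not how CodeQL computes data flow: the `EDGES` table, its node labels, and the `find_path` helper are hypothetical.

```ruby
# Hypothetical data flow graph: each key is a program element that holds a
# value, and each edge points to an element that the value flows into.
EDGES = {
  "params[:back_url]" => ["url"],
  "url"               => ["redirect_to(url)", "logger.info(url)"],
  "current_user.name" => ["greeting"],
}.freeze

# Breadth-first search that returns one path from source to sink, or nil.
def find_path(edges, source, sink)
  queue = [[source]]
  visited = { source => true }

  until queue.empty?
    path = queue.shift
    node = path.last
    return path if node == sink

    edges.fetch(node, []).each do |successor|
      next if visited[successor]

      visited[successor] = true
      queue << (path + [successor])
    end
  end

  nil
end

path = find_path(EDGES, "params[:back_url]", "redirect_to(url)")
puts path.join(" -> ") # prints params[:back_url] -> url -> redirect_to(url)
```

A `nil` result, as for `find_path(EDGES, "current_user.name", "redirect_to(url)")`, corresponds to "no flow": the value never reaches the redirect.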
CodeQL for Ruby provides data flow analysis as part of the standard library. You can import the data flow library using `import ruby` (which in turn imports `codeql.ruby.DataFlow`), and you can import the taint tracking library using `import codeql.ruby.TaintTracking`. Data flow tracks the flow of the same precise values through the program. Taint tracking is less precise, and tracks the flow of values that may change slightly through the program. Both libraries model program elements using the `DataFlow::Node` CodeQL class. These nodes are separate and distinct from the AST (Abstract Syntax Tree) nodes, which represent the basic structure of the program. This allows greater flexibility in how data flow is modeled.
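The difference between the two kinds of tracking can be seen in a small Ruby snippet. This is an informal illustration with hypothetical variable names; which steps a given CodeQL library version treats as taint steps is defined by the library itself, and string concatenation is assumed here as a typical example.

```ruby
# Hypothetical user-controlled input.
input = "attacker.example"

# Value-preserving step: `copied` refers to the very same string object.
# This is the kind of step plain data flow follows.
copied = input

# Value-changing step: `derived` is a new string built from the input.
# Taint tracking typically follows steps like this one as well.
derived = "https://" + input + "/login"

puts copied.equal?(input)    # prints true
puts derived.equal?(input)   # prints false
puts derived.include?(input) # prints true
```

The derived string is no longer the original value, yet it is still controlled by whoever supplied the input, which is why a URL redirection query is usually written as a taint tracking query.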
There are a small number of data flow node types; expression nodes and parameter nodes are the most common. We have seen the `asExpr()` method, which converts a `DataFlow::Node` into the corresponding control flow node, and the `getExpr()` method, which converts a control flow node into the corresponding AST node; there is also `asParameter()`.
In this section we will create a taint tracking query by populating this template:
/**
* @name URL redirection
* @kind problem
* @id rb/url-redirection
*/
import ruby
import codeql.ruby.frameworks.ActionController
import codeql.ruby.Concepts
import codeql.ruby.dataflow.RemoteFlowSources
import codeql.ruby.TaintTracking
// TODO add previous class and predicate definitions here
class UrlRedirectionConfig extends TaintTracking::Configuration {
  UrlRedirectionConfig() { this = "UrlRedirectionConfig" }
  override predicate isSource(DataFlow::Node source) {
    /** TODO fill me in **/
  }
  override predicate isSink(DataFlow::Node sink) {
    /** TODO fill me in **/
  }
}
from UrlRedirectionConfig config, DataFlow::Node source, DataFlow::Node sink
where config.hasFlow(source, sink)
select sink, "Potential URL redirection"
- Complete the `isSource` predicate, using the logic you wrote for Section 2.
  Solution
  override predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }
- Complete the `isSink` predicate, using the logic you started in Section 1 and completed in Section 2. Here, redirect URLs are sinks.
  Hint
  - Call the `isRedirect` predicate you defined earlier.
  - Use `_` when you don't care about the value of a particular parameter in a predicate call.
  Solution
  override predicate isSink(DataFlow::Node sink) { isRedirect(sink, _) }
- You can now run the completed query. You should find exactly one result.
  Solution
  /**
   * @name URL redirection
   * @kind problem
   * @id rb/url-redirection
   */
  import ruby
  import codeql.ruby.frameworks.ActionController
  import codeql.ruby.Concepts
  import codeql.ruby.dataflow.RemoteFlowSources
  import codeql.ruby.TaintTracking
  // Add the GetHandlerMethod class and isRedirect predicate from the previous sections here
  class UrlRedirectionConfig extends TaintTracking::Configuration {
    UrlRedirectionConfig() { this = "UrlRedirectionConfig" }
    override predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }
    override predicate isSink(DataFlow::Node sink) { isRedirect(sink, _) }
  }
  from UrlRedirectionConfig config, DataFlow::Node source, DataFlow::Node sink
  where config.hasFlow(source, sink)
  select sink, "Potential URL redirection"
- For some results, it is easy to verify whether the results are valid, because both the source and sink may be in the same method in the code. However, for many data flow problems this is not the case, and the path from source to sink is not always obvious.
  We can update the query so that it reports not only the sink, but also the source and the path from that source to the sink. This is done by converting the query to a path problem query. There are five parts we will need to change:
  - Convert the `@kind` from `problem` to `path-problem`. This tells the CodeQL toolchain to interpret the results of this query as path results.
  - Add a new import `import DataFlow::PathGraph`, which will report the path data alongside the query results.
  - Change the `source` and `sink` variables from `DataFlow::Node` to `DataFlow::PathNode`, to ensure that the nodes retain path information.
  - Use `hasFlowPath` instead of `hasFlow`.
  - Change the `select` clause to report the `source` and `sink` as the second and third columns. The toolchain combines this data with the path information from `PathGraph` to build the paths.
  Convert your previous query to a path-problem query. Run the query to see the paths in the results view. You should find exactly two results.
  Solution
  /**
   * @name URL redirection
   * @kind path-problem
   * @id rb/url-redirection
   */
  import ruby
  import codeql.ruby.frameworks.ActionController
  import codeql.ruby.Concepts
  import codeql.ruby.dataflow.RemoteFlowSources
  import codeql.ruby.TaintTracking
  import DataFlow::PathGraph
  class GetHandlerMethod extends Ast::MethodBase {
    GetHandlerMethod() {
      this.(ActionControllerActionMethod).getARoute().getHttpMethod() = "get"
      or
      not exists(this.(ActionControllerActionMethod).getARoute()) and
      this = any(ActionControllerControllerClass c).getAMethod() and
      not this.getName().regexpMatch(".*(create|update|destroy).*")
    }
  }
  predicate isRedirect(DataFlow::Node redirectLocation, GetHandlerMethod method) {
    exists(Http::Server::HttpRedirectResponse redirectCall |
      redirectCall.getRedirectLocation() = redirectLocation and
      redirectCall.asExpr().getExpr().getEnclosingMethod() = method
    )
  }
  class UrlRedirectionConfig extends TaintTracking::Configuration {
    UrlRedirectionConfig() { this = "UrlRedirectionConfig" }
    override predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }
    override predicate isSink(DataFlow::Node sink) { isRedirect(sink, _) }
  }
  from UrlRedirectionConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
  where config.hasFlowPath(source, sink)
  select sink, source, sink, "Potential URL redirection"
For more information on how this potential vulnerability was identified early and fixed, please read the discussion in this pull request. This potential problem never made it into the development branch or production code, thanks to the efforts of the project maintainers, and the codebase was also safe due to the use of Rails 7, which blocks open redirects by default. However, it is a good example to help us understand and detect more serious URL redirection vulnerabilities that may occur elsewhere.
- CodeQL overview
- CodeQL for Ruby
- Analyzing data flow in Ruby
- Using the CodeQL extension for VS Code
- Try out the Capture-the-Flag challenges on the GitHub Security Lab website!
- Read about more vulnerabilities found using CodeQL on the GitHub Security Lab research blog.
- Explore the open-source CodeQL queries and libraries, and learn how to contribute a new query.
- Configure CodeQL code scanning in your open-source repository.