codeql-javascript-multiflow's Introduction

[Image: under construction]

SQL injection example

This material is primarily an introduction to CodeQL for Python. The section Reading Order lists the basic CodeQL queries to work through.

The Python code is intentionally trivial but mirrors the structure of production code. Likewise, the steps needed to set up CodeQL in production CI/CD pipelines have the same structure as the ones shown here.

It is therefore convenient to illustrate intermediate and advanced topics here as well. The section Additional Topics does this by applying descriptions from the CodeQL documentation to the Python code in this repository, so it serves as a set of additional examples for parts of the documentation. Read these topics in order of appearance.

For system administration and devops, the section CodeQL setup provides a rudimentary guide.

Node setup and Running Sequence

npm i sqlite3

node add-user.js

./admin  -c
./admin  -s

echo frank | node add-user.js

./admin  -s

echo 'Johnny"); DROP TABLE users; --' | node add-user.js

./admin  -s

CodeQL setup

It’s best to have a full development setup for CodeQL on your laptop/desktop. This requires you to

  1. download VS Code
  2. install the CodeQL extension. Instructions are here: https://codeql.github.com/docs/codeql-for-visual-studio-code/setting-up-codeql-in-visual-studio-code/
  3. install a CodeQL binary (containing the CodeQL CLI) for your platform and unpack it

    The binary for 2.13.5 is found here: https://github.com/github/codeql-cli-binaries/releases/tag/v2.13.5

    See script below.

  4. use the gh codeql extension to select the CLI version
    gh codeql set-version 2.15.2
    
        
  5. (recommended for browsing) Install the codeql standard library matching the binary version. This is not needed to write or run queries anymore, but the library has many examples and searching it is much easier after extracting this archive: https://github.com/github/codeql/releases/tag/codeql-cli%2Fv2.13.5

    See script below.

  6. clone this repository.
    mkdir ~/local && cd ~/local && \
        git clone https://github.com/hohn/codeql-dataflow-sql-injection-python.git
        
  7. open the workspace directory in VS Code. This should just be
    cd ~/local/codeql-dataflow-sql-injection-python
    code python-sqli.code-workspace
        
  8. add the downloaded CodeQL CLI to VS Code’s search path. Find the CodeQL extension settings, then paste the full path to the CodeQL CLI into the field
    Code QL > Cli: Executable Path

  9. install the pack dependencies for the CLI. In a shell, use
    cd ~/local/codeql-javascript-multiflow/
    codeql pack install session
    codeql pack install solutions
    codeql pack install tests
    
    XX:  Does pack install use too-new libraries?
    0:$ rm session/codeql-pack.lock.yml
    (base) 
    hohn@gh-hohn ~/local/codeql-javascript-multiflow
    0:$ codeql pack install session
    Dependencies resolved. Installing packages...
    Install location: /Users/hohn/.codeql/packages
    Package install location: /Users/hohn/.codeql/packages
    
    
    0:$ rm -fR /Users/hohn/.codeql/packages 
    (base) 
    hohn@gh-hohn ~/local/codeql-javascript-multiflow
    0:$ rm session/codeql-pack.lock.yml
    (base) 
    hohn@gh-hohn ~/local/codeql-javascript-multiflow
    
    XX: no, same versions.
        
  10. Run the tests.
    cd ~/local/codeql-javascript-multiflow/
    codeql test run tests/UltimateSource/UltimateSource.qlref 
    
    gh codeql set-version 2.15.2
    codeql test run tests/UltimateSource/UltimateSource.qlref 
    
    Executing 1 tests in 1 directories.
    Extracting test database in /Users/hohn/local/codeql-javascript-multiflow/tests/UltimateSource.
    Compiling queries in /Users/hohn/local/codeql-javascript-multiflow/tests/UltimateSource.
    Executing tests in /Users/hohn/local/codeql-javascript-multiflow/tests/UltimateSource.
    [1/1 comp 557ms eval 255ms] PASSED /Users/hohn/local/codeql-javascript-multiflow/tests/UltimateSource/UltimateSource.qlref
    Completed in 3s (extract 1.2s comp 557ms eval 255ms).
    All 1 tests passed.
    
    XX: with 2.13.5
    One troubleshooting step is to run
    codeql resolve library-path --query=solutions/UltimateSource.ql
    to see which --dbscheme location it prints, then check whether
    the file at that location is the same as the
    javascript/semmlecode.javascript.dbscheme in the unpacked CLI.
    
    0:$ gh codeql debug on
    
    0:$ codeql resolve library-path --query=solutions/UltimateSource.ql
    ++ dirname /Users/hohn/.local/share/gh/extensions/gh-codeql/gh-codeql
    + rootdir=/Users/hohn/.local/share/gh/extensions/gh-codeql
    ++ gh config get extensions.codeql.channel
    + channel=
    + :
    ++ gh config get extensions.codeql.version
    + version=v2.13.5
    + '[' resolve = local-version ']'
    ++ gh config get extensions.codeql.local-version
    + local_version=
    + :
    + '[' -e .codeql-version ']'
    + version=v2.13.5
    + '[' -z resolve ']'
    + '[' -z '' ']'
    + channel=release
    + repo=github/codeql-cli-binaries
    ++ gh config get extensions.codeql.platform
    + platform=
    + :
    + [[ -z '' ]]
    + [[ darwin23 == \d\a\r\w\i\n* ]]
    + platform=osx64
    + '[' resolve = debug ']'
    + '[' resolve = list-versions ']'
    + '[' resolve = set-channel ']'
    + '[' resolve = download ']'
    + '[' resolve = set-version ']'
    + '[' resolve = set-local-version ']'
    + '[' resolve = unset-local-version ']'
    + '[' resolve = list-installed ']'
    + '[' resolve = cleanup ']'
    + '[' resolve = cleanup-all ']'
    + '[' resolve = install-stub ']'
    + '[' -z v2.13.5 ']'
    + download v2.13.5
    + local version=v2.13.5
    + '[' -z v2.13.5 ']'
    + '[' v2.13.5 = latest ']'
    + '[' -x /Users/hohn/.local/share/gh/extensions/gh-codeql/dist/release/v2.13.5/codeql ']'
    + return 0
    + export CODEQL_DIST=/Users/hohn/.local/share/gh/extensions/gh-codeql/dist/release/v2.13.5
    + CODEQL_DIST=/Users/hohn/.local/share/gh/extensions/gh-codeql/dist/release/v2.13.5
    + exec /Users/hohn/.local/share/gh/extensions/gh-codeql/dist/release/v2.13.5/codeql resolve library-path --query=solutions/UltimateSource.ql
    --dbscheme=/Users/hohn/.codeql/packages/codeql/javascript-all/0.8.3/semmlecode.javascript.dbscheme
    --full-library-path=/Users/hohn/local/codeql-javascript-multiflow/solutions:/Users/hohn/.codeql/packages/codeql/javascript-all/0.8.3:/Users/hohn/.codeql/packages/codeql/javascript-queries/0.8.3:/Users/hohn/.codeql/packages/codeql/mad/0.2.3:/Users/hohn/.codeql/packages/codeql/regex/0.2.3:/Users/hohn/.codeql/packages/codeql/suite-helpers/0.7.3:/Users/hohn/.codeql/packages/codeql/tutorial/0.2.3:/Users/hohn/.codeql/packages/codeql/typos/0.2.3:/Users/hohn/.codeql/packages/codeql/util/0.2.3:/Users/hohn/.codeql/packages/codeql/yaml/0.2.3
    --no-default-compilation-cache
    --compilation-cache=/Users/hohn/.codeql/compile-cache
    
    # and manually
    export CODEQL_DIST=/Users/hohn/.local/share/gh/extensions/gh-codeql/dist/release/v2.13.5
    $CODEQL_DIST/codeql resolve library-path --query=solutions/UltimateSource.ql
    
    --dbscheme=/Users/hohn/.codeql/packages/codeql/javascript-all/0.8.3/semmlecode.javascript.dbscheme
    --full-library-path=/Users/hohn/local/codeql-javascript-multiflow/solutions:/Users/hohn/.codeql/packages/codeql/javascript-all/0.8.3:/Users/hohn/.codeql/packages/codeql/javascript-queries/0.8.3:/Users/hohn/.codeql/packages/codeql/mad/0.2.3:/Users/hohn/.codeql/packages/codeql/regex/0.2.3:/Users/hohn/.codeql/packages/codeql/suite-helpers/0.7.3:/Users/hohn/.codeql/packages/codeql/tutorial/0.2.3:/Users/hohn/.codeql/packages/codeql/typos/0.2.3:/Users/hohn/.codeql/packages/codeql/util/0.2.3:/Users/hohn/.codeql/packages/codeql/yaml/0.2.3
    --no-default-compilation-cache
    --compilation-cache=/Users/hohn/.codeql/compile-cache
    
    0:$ find $CODEQL_DIST | grep 'javascript/semmlecode.javascript.dbscheme'
    /Users/hohn/.local/share/gh/extensions/gh-codeql/dist/release/v2.13.5/javascript/semmlecode.javascript.dbscheme
    /Users/hohn/.local/share/gh/extensions/gh-codeql/dist/release/v2.13.5/javascript/semmlecode.javascript.dbscheme.stats
    
    0:$ cmp /Users/hohn/.codeql/packages/codeql/javascript-all/0.8.3/semmlecode.javascript.dbscheme \
        /Users/hohn/.local/share/gh/extensions/gh-codeql/dist/release/v2.13.5/javascript/semmlecode.javascript.dbscheme
    /Users/hohn/.codeql/packages/codeql/javascript-all/0.8.3/semmlecode.javascript.dbscheme /Users/hohn/.local/share/gh/extensions/gh-codeql/dist/release/v2.13.5/javascript/semmlecode.javascript.dbscheme differ: char 3917, line 165
    
    1:$ diff /Users/hohn/.codeql/packages/codeql/javascript-all/0.8.3/semmlecode.javascript.dbscheme     /Users/hohn/.local/share/gh/extensions/gh-codeql/dist/release/v2.13.5/javascript/semmlecode.javascript.dbscheme
    165d164
    < | 40 = @using_decl_stmt
    168c167
    < @decl_stmt = @var_decl_stmt | @const_decl_stmt | @let_stmt | @legacy_let_stmt | @using_decl_stmt;
    ---
    > @decl_stmt = @var_decl_stmt | @const_decl_stmt | @let_stmt | @legacy_let_stmt;
        
  11. install the pack dependencies in VS Code. Open the command palette, run
    CodeQL: Install Pack Dependencies

    and select all the packs it lists. This generates a codeql-pack.lock.yml file.

  12. use the following to build a CodeQL database.
    #* Build the db with source commit id.
    codeql --version
    : CodeQL command-line toolchain release 2.13.5.
    
    cd ~/local/codeql-javascript-multiflow/
    
    DB=./js-sqli-db-$(git rev-parse --short HEAD)
    echo "$DB"
    
    test -d "$DB" && rm -fR "$DB"
    mkdir -p "$DB"
    
    codeql database create --language=javascript -s . -j 8 -v "$DB"
    
    # Check it
    unzip -v "$DB/src.zip" | grep -E '(add|sample)'
        
  13. add the database to the editor. The left side of the editor has a widget labeled QL; selecting it shows a databases panel with options to add a database from an archive or from a folder. Select the “from folder” option and add the database folder you created above.
  14. open the query trivial.ql and run it via
    right click > run query on selected database
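
The database-build step above can also be scripted. The following is a minimal Python sketch and not part of this repository: the helper names db_name, database_create_argv, and current_commit are hypothetical, and the flags are only the ones used in the shell commands above.

```python
import subprocess


def db_name(commit_id, prefix="./js-sqli-db-"):
    """Database directory name, as in DB=./js-sqli-db-$(git rev-parse --short HEAD)."""
    return prefix + commit_id


def database_create_argv(db, language="javascript", source_root=".", threads=8):
    """Argument list for the `codeql database create` call shown above."""
    return [
        "codeql", "database", "create",
        "--language=" + language,
        "-s", source_root,
        "-j", str(threads),
        "-v", db,
    ]


def current_commit():
    """Short commit id of HEAD; requires git and a checked-out repository."""
    out = subprocess.run(
        ["git", "rev-parse", "--short", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return out.stdout.strip()
```

Calling database_create_argv(db_name(current_commit())) from the repository root yields the argument list, which can be printed for inspection or passed to subprocess.run to build the database.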
        

There are several ways to install the CodeQL binaries and libraries. Here is a shell script that does it one way:

# grab -- retrieve and extract the CodeQL CLI and library
# Usage: grab version platform prefix
grab() {
    version=$1; shift
    platform=$1; shift
    prefix=$1; shift
    mkdir -p "$prefix/codeql-$version" &&
        cd "$prefix/codeql-$version" || return

    # Get cli
    wget "https://github.com/github/codeql-cli-binaries/releases/download/$version/codeql-$platform.zip"
    # Get lib
    wget "https://github.com/github/codeql/archive/refs/tags/codeql-cli/$version.zip"
    # Fix attributes (clear extended attributes on macOS)
    if [ "$(uname)" = Darwin ] ; then
        xattr -c *.zip
    fi
    # Extract
    unzip -q "codeql-$platform.zip"
    unzip -q "$version.zip"
    # Rename library directory for VS Code
    mv "codeql-codeql-cli-$version/" ql
    # remove archives?
    # rm codeql-$platform.zip
    # rm $version.zip
}

# Try one of these, depending on platform:
grab v2.13.5 osx64 $HOME/local/xefm

grab v2.13.5 linux64 $HOME/local/xefm

ls $HOME/local/xefm/codeql-v2.13.5/
: codeql/  codeql-osx64.zip  ql/  v2.13.5.zip
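
With a CLI unpacked this way, the dbscheme check from the troubleshooting notes in the setup steps can be repeated without reading diffs by hand. The following is a minimal Python sketch of that comparison. The function names are hypothetical, and the only assumption about the CLI output is the --dbscheme=PATH line format visible in the transcript above.

```python
import difflib


def dbscheme_path(resolve_output):
    """Return the PATH from the `--dbscheme=PATH` line, or None if absent."""
    for line in resolve_output.splitlines():
        line = line.strip()
        if line.startswith("--dbscheme="):
            return line[len("--dbscheme="):]
    return None


def dbscheme_diff(pack_text, cli_text):
    """Unified diff between two dbscheme texts; an empty list means identical."""
    return list(difflib.unified_diff(
        pack_text.splitlines(), cli_text.splitlines(),
        fromfile="pack", tofile="cli", lineterm="",
    ))
```

To use it, capture the output of codeql resolve library-path, pass it to dbscheme_path, read that file and the javascript/semmlecode.javascript.dbscheme file from the unpacked CLI, and hand both texts to dbscheme_diff. A non-empty result corresponds to the differences shown by cmp and diff in the transcript.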

Sample Application Setup and Run

Execute the following in a Bourne-style shell, one block at a time, to see the results. This requires a working Python installation.

# Prepare db
./admin -r
./admin -c
./admin -s 

# Add regular user
./add-user.py 2>> log
First User

# Check
./admin -s

# Add Johnny Droptable 
./add-user.py 2>> log
Johnny'); DROP TABLE users; --

# See the problem:
./admin -s

# Check the log
tail log

Reading Order

The queries introduce CodeQL concepts from the basics up and should be read in this order:

  1. source.ql: introduces Value, ControlFlowNode and DataFlow::Node.
  2. sink.ql: introduces AstNode.
  3. TaintFlowTemplate.ql: introduces the taint flow template.
  4. TaintFlow.ql: taint flow with endpoints only, using a class. This is the old way, but it still works and is a good introduction to using classes – not writing them.
  5. TaintFlowPath.ql: taint flow with full path. Again, the old way.
  6. TaintFlowWithModule.ql: taint flow with endpoints only, using modules. The way forward.
  7. TaintFlowPathQueryWithModule.ql: taint flow with full path, using modules.

Note on the Python code

The Python call

conn.execute(query)

to sqlite3 allows only one statement per call; passing more than one raises an exception:

sqlite3.Warning: You can only execute one statement at a time.

This makes it safer than the raw

sqlite3_exec() 

or Python’s

conn.executescript

For this tutorial, we use the multi-statement executescript() call.
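
The difference between the two calls can be checked with a small self-contained sketch. It is not the repository's add-user.py: the table layout and the build_query helper are assumptions made for illustration, and only the input string is taken from the sample run above. Depending on the Python version, the exception raised by execute() may be sqlite3.Warning or sqlite3.ProgrammingError, so the sketch catches both.

```python
import sqlite3

# Input taken from the sample run above.
PAYLOAD = "Johnny'); DROP TABLE users; --"


def build_query(name):
    # Hypothetical string-built query; the one-column table is an assumption.
    return "INSERT INTO users VALUES ('%s')" % name


def users_table_exists(conn):
    row = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name = 'users'"
    ).fetchone()
    return row is not None


def try_execute(payload):
    """Single-statement call: the two-statement query is rejected."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    try:
        conn.execute(build_query(payload))
        rejected = False
    except (sqlite3.Warning, sqlite3.ProgrammingError):
        rejected = True
    return rejected, users_table_exists(conn)


def try_executescript(payload):
    """Multi-statement call: the injected DROP TABLE runs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executescript(build_query(payload))
    return users_table_exists(conn)
```

Here try_execute(PAYLOAD) reports that the query was rejected and the users table still exists, while try_executescript(PAYLOAD) reports that the table is gone.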

Additional Topics

This repository and its source code are used to illustrate some additional topics from the CodeQL Python documentation.

Dataflow in Python

https://codeql.github.com/docs/codeql-language-guides/analyzing-data-flow-in-python/

Using and extending the CodeQL standard library:

  • StdLibPlain.ql Illustrates using the CodeQL standard library’s
    RemoteFlowSource 
        
  • StdLibExt.ql Illustrates extension of the CodeQL standard library via
    class SqlAccess extends FileSystemAccess::Range ...
        

    and

    class TerminalInput extends RemoteFlowSource::Range ...
        

Various data flow / taint flow examples from the documentation, modified as needed:

  • using-local-data-flow.ql
  • using-local-sources.ql
  • using-local-taint-tracking.ql

API graphs

https://codeql.github.com/docs/codeql-language-guides/using-api-graphs-in-python/

API graphs are a uniform interface for referring to functions, classes, and methods defined in external libraries.

  • ApiGraphs.ql: various sample queries

Type Tracking

Documentation for JavaScript, also applicable here: https://codeql.github.com/docs/codeql-language-guides/using-type-tracking-for-api-modeling/#using-type-tracking-for-api-modeling

The files

  • sqlite-info.py
  • TypeTracking.ql

use type tracking. From the docs: You can track data through an API by creating a model using the CodeQL type-tracking library. The type-tracking library makes it possible to track values through properties and function calls.

The file

  • TypeTrackingWithData.ql

goes further. From the docs: The type-tracking library makes it possible to track values through properties and function calls. Here, we also track some associated data. See https://codeql.github.com/docs/codeql-language-guides/using-type-tracking-for-api-modeling/#tracking-associated-data

Flow State

The query TaintFlowPathQueryWithSanitizer.ql illustrates the use of a flow state that represents whether user input has been sanitized.

It introduces algebraic datatypes (ADTs) via the newtype declaration of TInputSanitizationState.
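
Conceptually, a flow state pairs each data flow node with extra information, here whether the value has passed a sanitizer. The following Python sketch is only an analogy for that idea. It is not CodeQL and not the query in this repository, and the graph representation, function name, and state names are all hypothetical.

```python
from collections import deque
from enum import Enum


class State(Enum):
    UNSANITIZED = 0
    SANITIZED = 1


def tainted_sinks(edges, sanitizers, source, sinks):
    """Sinks reachable from source while still in the UNSANITIZED state."""
    seen = {(source, State.UNSANITIZED)}
    queue = deque(seen)
    hits = set()
    while queue:
        node, state = queue.popleft()
        if node in sinks and state is State.UNSANITIZED:
            hits.add(node)
        for succ in edges.get(node, ()):
            # Entering a sanitizer node switches the state; otherwise keep it.
            nxt = State.SANITIZED if succ in sanitizers else state
            if (succ, nxt) not in seen:
                seen.add((succ, nxt))
                queue.append((succ, nxt))
    return hits
```

The search visits (node, state) pairs rather than bare nodes, so a sink reached only through a sanitizer is not reported, while the same sink reached along an unsanitized path would be.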
