graphql-gate's Introduction

GraphQLGate


A GraphQL rate-limiting library with query complexity analysis for Node.js and Express


Summary

Developed under tech-accelerator OSLabs, GraphQLGate strives for a principled approach to complexity analysis and rate-limiting for GraphQL queries by accurately estimating an upper-bound of the response size of the query. Within a loosely opinionated framework with lots of configuration options, you can reliably throttle GraphQL queries by complexity and depth to protect your GraphQL API. Our solution is inspired by this paper from IBM research teams.

Getting Started

Install the package

npm i graphql-limiter

Import the package and add the rate-limiting middleware to the Express middleware chain before the GraphQL server.

NOTE: a Redis server instance will need to be started in order for the limiter to cache data.

// import package
import { expressGraphQLRateLimiter } from 'graphql-limiter';

/**
 * Import other dependencies
 * */

// Add the middleware into your GraphQL middleware chain
app.use(
    '/gql',
    expressGraphQLRateLimiter(schemaObject, {
        rateLimiter: {
            type: 'TOKEN_BUCKET',
            refillRate: 10,
            capacity: 100,
        },
    }) /** add GraphQL server here */
);

Configuration

  1. schema: GraphQLSchema | required

  2. config: ExpressMiddlewareConfig | required

    • rateLimiter: RateLimiterOptions | required

      • type: 'TOKEN_BUCKET' | 'FIXED_WINDOW' | 'SLIDING_WINDOW_LOG' | 'SLIDING_WINDOW_COUNTER'
      • capacity: number
      • refillRate: number | bucket algorithms only
      • windowSize: number | (in ms) window algorithms only
    • redis: RedisConfig

      • options: RedisOptions | ioredis configuration options | defaults to standard ioredis connection options (localhost:6379)
      • keyExpiry: number (ms) | custom expiry of keys in redis cache | defaults to 24 hours
    • typeWeights: TypeWeightObject

      • mutation: number | assigned weight to mutations | defaults to 10
      • query: number | assigned weight of a query | defaults to 1
      • object: number | assigned weight of GraphQL object, interface and union types | defaults to 1
      • scalar: number | assigned weight of GraphQL scalar and enum types | defaults to 0
    • depthLimit: number | throttle queries by the depth of their nested structure | defaults to Infinity (ie. no limit)

    • enforceBoundedLists: boolean | if true, an error will be thrown if any list types are not bounded by slicing arguments [first, last, limit] or directives | defaults to false

    • dark: boolean | if true, the package will calculate complexity, depth and tokens but not throttle any queries. Use this to dark launch the package and monitor the rate limiter's impact without limiting user requests.

    All configuration options

    expressGraphQLRateLimiter(schemaObject, {
        rateLimiter: {
            type: 'SLIDING_WINDOW_LOG', // rate-limiter selection
            windowSize: 6000, // 6 seconds
            capacity: 100,
        },
        redis: {
            keyExpiry: 14400000, // 4 hours, defaults to 86400000 (24 hours)
            options: {
                host: 'localhost', // ioredis connection options
                port: 6379,
            },
        },
        typeWeights: { // weights of GraphQL types
            mutation: 10,
            query: 1,
            object: 1,
            scalar: 0,
        },
        enforceBoundedLists: false, // defaults to false
        dark: false, // defaults to false
        depthLimit: 7, // defaults to Infinity (ie. no depth limiting)
    });

Notes on Lists

For queries that return a list, the complexity can be determined by providing a slicing argument to the query (first, last, limit), or using a schema directive.

  1. Slicing arguments: lists must be bounded by one integer slicing argument in order to calculate the complexity for the field. This package supports the slicing arguments first, last and limit. The complexity of the list will be the value passed as the argument to the field.

  2. Directives: To use directives, @listCost must be defined in your schema with directive @listCost(cost: Int!) on FIELD_DEFINITION. Then, on any field which resolves to an unbounded list, add @listCost(cost: x) where x is the integer complexity of this field.

(Note: Slicing arguments are preferred and will override the @listCost directive. @listCost is in place as a fallback.)

directive @listCost(cost: Int!) on FIELD_DEFINITION
type Human {
    id: ID!
}
type Query {
    humans: [Human] @listCost(cost: 10)
}

How It Works

Requests are rate-limited based on the IP address associated with the request.

On startup, the GraphQL (GQL) schema is parsed to build an object that maps GQL types/fields to their corresponding weights. Type weights can be provided during initial configuration. When a request is received, this object is used to cross reference the fields queried by the user and compute the complexity of each field. The total complexity of the request is the sum of these values.

Complexity is determined statically (before any resolvers are called) to estimate the upper bound of the response size, a proxy for the work done by the server to build the response. The total complexity is then used to allow or block the request based on popular rate-limiting algorithms.

Requests for each user are processed sequentially by the rate limiter.

Example (with default weights):

query {
    # 1 query
    hero(episode: EMPIRE) {
        # 1 object
        name # 0 scalar
        id # 0 scalar
        friends(first: 3) {
            # 3 objects
            name # 0 scalar
            id # 0 scalar
        }
    }
    reviews(episode: EMPIRE, limit: 5) {
        # 5 objects
        stars # 0 scalar
        commentary # 0 scalar
    }
} # total complexity of 10
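The arithmetic in the comments above can be reproduced with a small standalone function. The node shape below (weight, multiplier, children) is invented for illustration and is not the package's internal AST; it is a minimal sketch of type-complexity analysis.

```javascript
// Minimal sketch of type-complexity analysis (not the package's internals).
// Hypothetical node shape: weight (1 for objects, 0 for scalars), an optional
// multiplier from a slicing argument, and child fields.
function complexity(node) {
  const m = node.multiplier ?? 1; // slicing argument (first/last/limit), if any
  const childSum = (node.children ?? []).reduce((sum, c) => sum + complexity(c), 0);
  return m * (node.weight + childSum);
}

// The example query above, as nodes:
const query = {
  weight: 1, // query
  children: [
    {
      weight: 1, // hero
      children: [
        { weight: 0 }, // name
        { weight: 0 }, // id
        { weight: 1, multiplier: 3, children: [{ weight: 0 }, { weight: 0 }] }, // friends(first: 3)
      ],
    },
    { weight: 1, multiplier: 5, children: [{ weight: 0 }, { weight: 0 }] }, // reviews(limit: 5)
  ],
};

complexity(query); // -> 10
```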

Response

  1. Blocked Requests: blocked requests receive a response with:

    • a status of 429 Too Many Requests
    • a Retry-After header indicating the time to wait in seconds before the request can succeed (Infinity if the complexity is greater than the rate-limiting capacity)
    • a JSON body with the remaining tokens available, the complexity of the query, the depth of the query, success set to false, and the UNIX timestamp of the request
  2. Successful Requests: successful requests are passed on to the next function in the middleware chain with the following properties saved to res.locals

{
   graphqlGate: {
      success: boolean, // true when successful
      tokens: number, // tokens available after request
      complexity: number, // complexity of the query
      depth: number, // depth of the query
      timestamp: number, // UNIX timestamp
   }
}

Error Handling

  • Incoming queries are validated against the GraphQL schema. If the query is invalid, a response with status code 400 is returned along with an array of GraphQL Errors that were found.
  • To avoid disrupting server activity, errors thrown during the analysis and rate-limiting of the query are logged and the request is passed on to the next piece of middleware in the chain.

Internals

This package also exposes three functions that make up its internals. Brief documentation on them follows.

Complexity Analysis

  1. typeWeightsFromSchema | function to create the type weight object from the schema for complexity analysis

    • schema: GraphQLSchema | GraphQL schema object

    • typeWeightsConfig: TypeWeightConfig = defaultTypeWeightsConfig | type weight configuration

    • enforceBoundedLists = false

    • returns: TypeWeightObject

    • usage:

      import { typeWeightsFromSchema } from 'graphql-limiter';
      import { GraphQLSchema } from 'graphql/type/schema';
      import { buildSchema } from 'graphql';
      
      let schema: GraphQLSchema = buildSchema(`...`);
      
      const typeWeights: TypeWeightObject = typeWeightsFromSchema(schema);
  2. QueryParser | class to calculate the complexity of the query based on the type weights and variables

    • typeWeights: TypeWeightObject

    • variables: Variables | variables on request

    • returns a class with method:

      • processQuery(queryAST: DocumentNode): number

      • returns: complexity of the query and exposes maxDepth property for depth limiting

        import { QueryParser, typeWeightsFromSchema } from 'graphql-limiter';
        import { parse, validate } from 'graphql';
        
        let queryAST: DocumentNode = parse(`...`);
        
        const queryParser: QueryParser = new QueryParser(typeWeights, variables);
        
        // the query must be validated against the schema before processing
        const validationErrors = validate(schema, queryAST);
        
        const complexity: number = queryParser.processQuery(queryAST);

Rate-limiting

  1. rateLimiter | returns a rate limiting class instance based on selections

    • rateLimiter: RateLimiterConfig | see "configuration" -> rateLimiter

    • client: Redis | an ioredis client

    • keyExpiry: number | time (ms) for key to persist in cache

    • returns a rate limiter class with method:

      • processRequest(uuid: string, timestamp: number, tokens = 1): Promise<RateLimiterResponse>
      • returns: { success: boolean, tokens: number, retryAfter?: number } | where tokens is the number of tokens available, retryAfter is the time in seconds to wait before the request would succeed, and success is false if the request is blocked
      import { rateLimiter } from 'graphql-limiter';
      
      const limiter: RateLimiter = rateLimiter(
          {
              type: 'TOKEN_BUCKET',
              refillRate: 1,
              capacity: 10,
          },
          redisClient,
          86400000 // 24 hours
      );
      
      const response: RateLimiterResponse = await limiter.processRequest(
          'user-1',
          new Date().valueOf(),
          5
      );

Future Development

  • Ability to use this package with other caching technologies or libraries
  • Implement "resolve complexity analysis" for queries
  • Implement leaky bucket algorithm for rate-limiting
  • Experiment with performance improvements
    • caching optimization
  • Ensure connection pagination conventions can be accurately accounted for in complexity analysis
  • Ability to use middleware with other server frameworks

Contributions

Contributions to the code, examples, documentation, etc. are very much appreciated.

License

This product is licensed under the MIT License - see the LICENSE.md file for details.

This is an open source product.

This product is accelerated by OS Labs.

graphql-gate's People

Contributors

evanmcneely, feiw101, jondeweydev, shalarewicz

graphql-gate's Issues

Complexity of nested lists adds complexities instead of multiplying

Nested results should multiply the complexity of inner results. Currently these are added.

For example, the query below should have a complexity of 17 (1 query + 1 human + 5 * 3 friends * children):

 query { 
      human(id: 1) { 
          name, 
          friends(first: 5) { 
              name, 
              children(first: 3){ 
                  name 
              } 
          } 
      }
  }

Nested results with a complexity of 0 should not zero out outer results. In the above example, if children resolved to a list of scalars (which have weight 0), the overall complexity should be 7 (1 query + 1 human + 5 friends), not 2 (1 query + 1 human + 5 * 0 friends * children).

TODO

  • Enable the "with nested lists" tests in typeComplexityAnalysis.test.ts to verify resolution of this issue. (tests updated by #61).
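One possible accounting that satisfies both requirements, multiplying nested list complexities while letting a 0-weight inner list fall out harmlessly, is sketched below. The node shape is hypothetical, and exact totals depend on whether each list element's own weight is counted, so this is illustrative rather than the package's fix.

```javascript
// Hypothetical accounting in which nested list complexities multiply, and a
// 0-weight inner list cannot zero out the outer list: the node's own weight
// is added to the child sum before multiplying.
function complexity(node) {
  const m = node.multiplier ?? 1; // slicing argument (first/last/limit), if any
  const childSum = (node.children ?? []).reduce((sum, c) => sum + complexity(c), 0);
  return m * (node.weight + childSum);
}

// friends(first: 5) whose children(first: 3) resolve to 0-weight scalars:
const query = {
  weight: 1, // query
  children: [
    {
      weight: 1, // human
      children: [
        { weight: 0 }, // name
        {
          weight: 1, multiplier: 5, // friends(first: 5)
          children: [
            { weight: 0 }, // name
            { weight: 0, multiplier: 3, children: [{ weight: 0 }] }, // children(first: 3), scalar list
          ],
        },
      ],
    },
  ],
};

complexity(query); // -> 7, not 2
```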

Send request info to client on rejected request

At the time of writing, the express rate limiter middleware function sends a status code 429 when a request is throttled. Refactor this to also send data such as remaining tokens in the bucket, requested tokens, etc. on the response.

Add tests for Leaky Bucket Algorithm

A full testing suite using Jest should be written for the LeakyBucket class based on the spec decided on in issue #50. Tests can be modelled on those written for TokenBucket.

Write spec for LeakyBucket class

A spec for the Leaky Bucket algorithm should be decided on prior to writing tests. The constructor should accept parameters for the following:

  • capacity - Number of tokens each bucket should hold
  • outflowAmount - Number of tokens that flow out of the bucket each period; combined with outflowPeriod this defines the outflow rate
  • outflowPeriod - Time (seconds) that must elapse before tokens flow out of the bucket
  • client - redis client instance

The LeakyBucket class must implement RateLimiter

This can be written in similar fashion to rateLimiters/tokenBucket.ts.

Implement Fixed Window Algorithm

Implement FixedWindow class per the spec in #51. The implementation should pass the tests written in #53.

In essence the Fixed Window algorithm rate limits requests by tracking the complexity of requests that are received in the current window and limits any requests that exceed the capacity.

Requests should be limited using IP address.

Complexity data should be stored for each user by connecting to a developer provided Redis client.
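A minimal in-memory sketch of the algorithm described above (the real implementation stores per-user complexity in Redis; the class below is hypothetical):

```javascript
// In-memory fixed-window sketch (hypothetical class; the package stores
// per-user complexity in Redis, keyed by IP address).
class FixedWindow {
  constructor(capacity, windowSize) {
    this.capacity = capacity; // complexity allowed per window
    this.windowSize = windowSize; // window length in ms
    this.windows = new Map(); // uuid -> { start, used }
  }

  processRequest(uuid, timestamp, tokens = 1) {
    let w = this.windows.get(uuid);
    // start a fresh window if none exists or the current one has elapsed
    if (!w || timestamp - w.start >= this.windowSize) {
      w = { start: timestamp, used: 0 };
      this.windows.set(uuid, w);
    }
    if (w.used + tokens > this.capacity) {
      return { success: false, tokens: this.capacity - w.used };
    }
    w.used += tokens;
    return { success: true, tokens: this.capacity - w.used };
  }
}
```

A request costing 6 against a capacity of 10 fails once 5 tokens are already used in the window, then succeeds again after the window rolls over.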

Handle GraphQL Non-null wrappers in tests and implementation

GraphQLNonNull types can wrap any GraphQLInputType. We should add tests and update buildTypeWeights and complexity analysis to ensure both incoming queries and schemas including non null types are accounted for and successfully parsed.

Per the GraphQL type def files GraphQLInputType can consist of the following types. Note the level of nesting that can occur.

GraphQLInputType = 
  | GraphQLScalarType
  | GraphQLEnumType
  | GraphQLInputObjectType
  | GraphQLList<GraphQLInputType>
  | GraphQLNonNull<
      | GraphQLScalarType
      | GraphQLEnumType
      | GraphQLInputObjectType
      | GraphQLList<GraphQLInputType>
 >;
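Unwrapping can be sketched as a loop that peels wrappers until it reaches the named type. graphql-js wrapper types expose the inner type via ofType (and the library's getNamedType helper does exactly this); the plain objects below stand in for real GraphQLNonNull/GraphQLList instances.

```javascript
// Sketch of stripping non-null/list wrappers to reach the named type.
// Plain { ofType } objects stand in for GraphQLNonNull/GraphQLList instances.
function unwrap(type) {
  let t = type;
  while (t.ofType) t = t.ofType; // peel GraphQLNonNull and GraphQLList wrappers
  return t;
}

const human = { name: 'Human' };
// [Human!]! modelled as NonNull(List(NonNull(Human))):
unwrap({ ofType: { ofType: { ofType: human } } }); // -> human
```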

Implement a function to calculate the complexity of a field returning a list of determinate size. Update tests

See the notes below for implementation. The type weight object used in testing should be updated to reflect the implementation.

  • type weight object used for all complexity analysis tests
  • buildTypeWeights test 'fields returning lists of objects of determinate size'
  • buildTypeWeights test 'fields returning lists of objects of indeterminate size'

Notes

The end goal is to have a function that can be used to calculate the total type complexity for a field by multiplying the arg value times the weight of the Resolved Type.

For example, if we receive a query reviews(episode: 'NEWHOPE', first: 5), the type complexity algorithm would call the function with the value of first and the weight of a Review to get 5 * 1 = 5.

This function could be a global that receives both arguments, or a unique function could be assigned to each applicable field, as below.

When looking up a field weight, we would check the type of the value. If the weight is a number, simply return the number otherwise call the associated function with relevant args.

We will also need to determine if the relevant information (variable names and resolve types) is available in the GraphQLSchemaObject or the AST created by parsing the schema object

first, last, limit are conventional keywords for limiting results. Later on we could make these keywords configurable to provide a more unopinionated solution.

Query {
  weight: 1,
  fields: {
    reviews: (args, type) => args[multiplierName] * typeWeightObject[type].weight,
  },
}

Review {
  weight: 1,
  fields: {
    stars: 0,
    commentary: 0,
  },
}
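The lookup described above can be made concrete as follows. All shapes and names here are hypothetical: a field weight is either a plain number or a function of the field's arguments.

```javascript
// Concrete sketch of the lookup described above: a field weight is either a
// number or a function of the field's arguments (names are hypothetical).
function fieldComplexity(typeWeights, typeName, fieldName, args) {
  const weight = typeWeights[typeName].fields[fieldName];
  return typeof weight === 'function' ? weight(args) : weight;
}

const typeWeights = {
  query: {
    weight: 1,
    fields: { reviews: (args) => args.first * 1 }, // 1 = weight of Review
  },
  review: { weight: 1, fields: { stars: 0, commentary: 0 } },
};

fieldComplexity(typeWeights, 'query', 'reviews', { first: 5 }); // -> 5
fieldComplexity(typeWeights, 'review', 'stars', {}); // -> 0
```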

Write a function to iterate through the GraphQL schema object

Iterate through the graphQL schema object to determine field weights for query cost calculation.

Input: GraphQL schema object
Output: An object with properties as types and values as weights. These could be nested in some way.
Example output, where the schema has a type user with fields username and password:

{
  user: {
    weight: 1,
    fields: {
      username: 0,
      password: 0,
    }
  }
}

Write tests for Express rate limiter middleware

Unit tests should be added for the Express rate-limiting middleware, with an initial focus on the Token Bucket algorithm.

Tests should focus on the initial configuration steps and on the returned middleware behaving as expected, without testing the implementation details of each individual algorithm, as these are covered by their own unit tests.

Allow rate limiting by user id, not just ip

At the time of writing, the express rate limiter function is rate limiting requests by IP address by default. Add a configuration option to allow the developer to throttle requests by user ID or some other 'uuid'.

Implement testing for Token Bucket algorithm

Use a TDD approach to verify correctness of the Token Bucket algorithm used for rate limiting incoming GraphQL requests.

Testing should not be concerned with computing an incoming request's complexity. Each request should be considered to consume 1 token.

Testing should verify the following:

  1. Requests are limited as expected based on user IP address.
  2. CAPACITY and REFILL_RATE are accepted input parameters.
  3. Size of the token bucket never exceeds the set CAPACITY
  4. Token bucket is refilled at the provided REFILL_RATE

Implement Leaky Bucket Algorithm

Implement LeakyBucket class per the spec in #50. The implementation should pass the tests written in #52

The LeakyBucket algorithm works similarly to TokenBucket except that requests are processed at a fixed rate. The algorithm is usually implemented with a FIFO queue and works as follows:

  • When a request arrives, the limiter checks if the queue is full. If it is not full, the request is added to the queue.
  • Otherwise, the request is dropped (a 429 status is sent)
  • Requests are pulled from the queue and processed at regular intervals
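The steps above can be sketched with an in-memory bounded FIFO queue drained at a fixed rate. Parameter names follow the spec in issue #50, but the implementation is illustrative only (the package would persist state in Redis).

```javascript
// In-memory leaky-bucket sketch: a bounded FIFO queue drained at a fixed rate.
class LeakyBucket {
  constructor(capacity, outflowAmount, outflowPeriod) {
    this.capacity = capacity; // max queued requests
    this.outflowAmount = outflowAmount; // requests processed per period
    this.outflowPeriod = outflowPeriod; // period length in seconds
    this.queue = [];
    this.lastDrain = 0;
  }

  drain(timestamp) {
    const elapsedSec = (timestamp - this.lastDrain) / 1000;
    const drained = Math.floor(elapsedSec / this.outflowPeriod) * this.outflowAmount;
    if (drained > 0) {
      this.queue.splice(0, drained); // pull requests off the front, FIFO
      this.lastDrain = timestamp;
    }
  }

  processRequest(uuid, timestamp) {
    this.drain(timestamp);
    if (this.queue.length >= this.capacity) {
      return { success: false }; // queue full: drop the request (429)
    }
    this.queue.push(uuid);
    return { success: true };
  }
}
```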

Undefined fieldWeight throws an error during complexity analysis

If the fieldWeight is undefined during complexity analysis then an unhandled error is thrown. These errors should be handled in development environments to provide clarity on why the fieldWeight is undefined. Production environments should attempt to default to a logical complexity value.

Update buildTypeWeights to handle lists that resolve to scalars or enums

Currently the buildTypeWeights tests for queries that resolve to lists of scalars or enums fail. The type weight for these lists should be set to the configured or default type weight for enums or scalars.

  • TODO Enable tests under "fields returning lists of objects of determinate size and...the list resolves to an enum or scalar"
  • TODO Enable tests under "fields returning lists of objects of determinate size and...the list resolves to an enum or scalar and a custom scalar weight was configured"

Related to #61

Implement Token Bucket Algorithm

Implement a Token Bucket algorithm using TDD which limits requests based on a user's IP address and allows developers to provide the desired CAPACITY and REFILL_RATE (in milliseconds).

The algorithm should consider each request to consume exactly 1 token. It should not consider GraphQL query complexity when limiting requests.

The TokenBucket instance of a RateLimiter limits requests based on a unique user ID.
Whenever a user makes a request the following steps are performed:

  1. Refill the bucket based on time elapsed since the previous request
  2. Update the timestamp of the last request.
  3. Allow the request and remove the requested amount of tokens from the bucket if the user has enough.
  4. Otherwise, disallow the request and do not update the token total.

Update buildTypeWeights to handle lists on types other than the root Query type

Currently buildTypeWeights only generates a WeightFunction for queries that live on the root Query type. This functionality should be expanded to handle fields resolving to bounded lists that live on any type.

  • TODO Enable tests under "fields returning lists of objects of determinate size and...are not on the Query type" to show resolution of this issue.

Update complexity analysis to handle variables

When analyzing the complexity of a query that returns a bounded list, the length of the list is currently determined by searching the provided parameters for one of three limiting keywords: first, last and limit. Currently, the algorithm only parses numbers when determining the value of these arguments. Update the algorithm to also use values passed as variables.

This implementation should pass the tests added in #61.

  • TODO Enable test "a default value is provided in the schema...and a value is not provided with the query"
  • TODO Enable test "a default value is provided in the schema...and the argument is passed in as a variable"
  • TODO Enable test "a default value is not provided in the schema...and the argument is passed in as a variable"

Extend the 'buildTypeWeights' function to account for LISTS, ENUMS, INTERFACES, UNIONS, INPUT TYPES in a schema object

Once this issue gets picked up, our buildTypeWeightsFromSchemaObject should be creating the type weight object out of a schema with query and basic types AND PASSING THE TESTS. Extend this functionality to work for all additional GraphQL schema types, including lists, enums, interfaces, unions, and input types (is this just an object type). Figure out if there is anything else we need to account for.

  1. Add functionality to the algorithm to pass the tests.

Framework for Rate limiting middleware

The primary entry point for our rate limiting middleware will be a function that accepts:

  • rate limiting method
  • GraphQL Schema
  • optional TypeWeightConfig parameters
  • redis client

and outputs Express middleware.

Rate Limiter options should be presented as an enum consisting of Token Bucket, Leaky Bucket, Fixed Window, Sliding Window Log and Sliding Window Counter choices.

During configuration this function should:

  • Setup a Redis Client
  • Parse the schema using the provided or default TypeWeightConfig to obtain a TypeWeightObject.
  • Configure the selected RateLimiter
  • Configure complexity analysis algorithm using the TypeWeightObject
  • Return express middleware based on the above configuration.

The middleware will need access to the GraphQL query, a unique user identifier (such as IP address) and the request timestamp. It will call the next middleware if the request is allowed, or reject the request with a 429 status code if the rate limiter denies it.

Look into the req.ip field in more depth and determine if it is an adequate way to uniquely identify users.

There are numerous ways to get the IP address off of the request object.

The header x-forwarded-for will hold the originating IP address if a proxy is placed in front of the server. This would be common for a production build.

  • req.ips will hold the array of IP addresses from the x-forwarded-for header; the client is likely at index zero
  • req.ip will hold the IP address
  • req.socket is an instance of net.Socket; its remoteAddress property is another way of getting the IP address
  • req.ip and req.ips work in Express but not in other frameworks
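A hypothetical helper combining these options (header parsing is simplified; in Express, enabling the 'trust proxy' setting makes req.ip resolve X-Forwarded-For for you):

```javascript
// Illustrative IP lookup: prefer the originating address from
// x-forwarded-for when behind a proxy, then fall back to the socket address.
function clientIp(req) {
  const forwarded = req.headers['x-forwarded-for'];
  if (forwarded) {
    return forwarded.split(',')[0].trim(); // originating client is first
  }
  return req.socket?.remoteAddress ?? req.ip;
}

clientIp({ headers: { 'x-forwarded-for': '203.0.113.7, 10.0.0.1' } }); // -> '203.0.113.7'
```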

Add parameter to set token bucket refill time, not just refill rate.

The token bucket parameter refillRate is being used to set the bucket refill amount for every 1 second. Add a parameter refillFrequency to adjust to time between refills other than 1 second. Maybe it defaults to 1 second and maybe the parameter refillRate needs to be changed to refillAmount. The end goal is to be able to configure the algorithm to refill at a rate of 10 tokens a minute, for example.
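The desired behaviour could reduce to a refill calculation like the following, where refillAmount and refillFrequency are the hypothetical parameter names suggested above:

```javascript
// Hypothetical refill calculation: refillAmount tokens are added every
// refillFrequency milliseconds, capped at capacity. With refillAmount: 10 and
// refillFrequency: 60000, the bucket refills at 10 tokens per minute.
function refill(tokens, elapsedMs, { refillAmount, refillFrequency, capacity }) {
  const periods = Math.floor(elapsedMs / refillFrequency);
  return Math.min(capacity, tokens + periods * refillAmount);
}

refill(0, 60000, { refillAmount: 10, refillFrequency: 60000, capacity: 100 }); // -> 10
```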

Implement "resolve" complexity analysis

At the time of writing, the express middleware function is hard coded to use the type complexity algorithm for complexity analysis. Refactor this function to make the choice of resolve or type complexity analysis a configuration option for the developer.

Add 'retry after' header on 429 response to client when request is throttled

At the time of writing, the express rate limiter function is responding to the client with a status code of 429 if the query has been throttled. Refactor this function to return the response with a Retry-After header set to the time in seconds that the client should wait before retrying the request.
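For a token bucket, the header value could be derived as below (a sketch with hypothetical names, not the package's implementation):

```javascript
// Hypothetical Retry-After calculation for a token bucket: how many seconds
// until enough tokens have refilled for the blocked query to succeed.
// refillRate is tokens per second.
function retryAfterSeconds(complexity, availableTokens, refillRate, capacity) {
  if (complexity > capacity) return Infinity; // the query can never succeed
  return Math.max(0, Math.ceil((complexity - availableTokens) / refillRate));
}

retryAfterSeconds(50, 10, 10, 100); // -> 4
```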

Write RateLimiter class spec

Create a class specification for RateLimiters in order to allow a user to interchange different rate-limiting algorithms. It should expose a constructor, a method to check whether a request is allowed, and a method to connect to the Redis store.

Resolve complexity analysis with test

Implement the resolve complexity analysis algorithm, including thorough testing of your implementation. Model this algorithm after the type complexity algorithm.

Write spec for FixedWindow class

A spec for the FixedWindow algorithm should be decided on prior to writing tests. The constructor should accept parameters for the following:

  • windowSize - Size of the window in milliseconds
  • capacity - Number of requests allowed during the window
  • client - redis client instance

The FixedWindow class must implement RateLimiter

Write the basic framework for the complexity analysis algorithm.

The algorithm should do roughly these things.

  1. Accepts a GraphQL query string and a 'field weight object' from issue #3
  2. Parses the string and iterates through the AST
  3. Counts and totals complexity by checking the query fields against field weight object
  4. Returns the total complexity

Use the GraphQL.js library.
