
graphql.0's Issues

Simple queries (without a name) cause an error

Query:

{ 
  services(uid: 123)
  {
    uid
    p1
    p2
  }
}

Error:

2018-02-22 14:56:04.333 [60448] main/453/main utils.lua:188 E> [request_id: front-01-00000] Error catched: attempt to index field 'name' (a nil value)
2018-02-22 14:56:04.333 [60448] main/453/main utils.lua:190 E> [request_id: front-01-00000] Error occured at '...ida/.rocks/share/tarantool/graphql/tarantool_graphql.lua:422'
2018-02-22 14:56:04.333 [60448] main/453/main utils.lua:192 E> [request_id: front-01-00000]
2018-02-22 14:56:04.333 [60448] main/453/main utils.lua:171 E> [request_id: front-01-00000] [Lua ] function 'assert_gql_query_ast' at <...ida/.rocks/share/tarantool/graphql/tarantool_graphql.lua:422>
2018-02-22 14:56:04.333 [60448] main/453/main utils.lua:171 E> [request_id: front-01-00000] [Lua ] function 'compile' at <...ida/.rocks/share/tarantool/graphql/tarantool_graphql.lua:459>

Canonical data set

Add a canonical data set to the test suite to evaluate the effectiveness of the optimizer based on the cost model tarantool/graphql#22.

Having a canonical data set and a test set of graphql queries, we can cover the optimizer with functional tests.

Cost model

Develop a cost model with which each query can be assigned a cost: an estimated one before execution and a factual one after it. The cost model should be used by the planner to assess alternative query plans, and by tests to assess the overall quality of the planner.
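To make the estimated/factual split concrete, here is a minimal sketch in Python. All names (statistics shape, plan steps, cost unit) are illustrative assumptions, not part of the library:

```python
# Hypothetical two-phase cost model: the planner compares estimated costs
# of alternative plans before execution; tests compare the estimate against
# the factual (measured) cost after execution.

def estimated_cost(plan, stats):
    """Predict cost as rows scanned, using per-collection statistics."""
    cost = 0
    for step in plan:
        rows = stats[step["collection"]]["row_count"]
        if step.get("index"):
            # An index lookup narrows the scan by the index fan-out.
            rows /= stats[step["collection"]]["fan_out"]
        cost += rows
    return cost

def factual_cost(trace):
    """After execution, the factual cost is the rows actually touched."""
    return sum(step["rows_scanned"] for step in trace)

stats = {"user": {"row_count": 1000, "fan_out": 100}}
plan = [{"collection": "user", "index": "uid_idx"}]
print(estimated_cost(plan, stats))  # 10.0
```

A functional test for the planner could then assert that the estimate stays within some factor of the factual cost on the canonical data set.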

Support destination_collection deducible from a parent object

http://graphql.org/learn/schema/#union-types

Current connection format:

{
    name = 'connection_name_bar',
    destination_collection = 'collection_baz',
    type = '1:1',
    parts = {
        {
            source_field = 'field_name_source_1',
            destination_field = 'field_name_destination_1'
        },
        ...
    },
    index_name = 'index_name'
}   

Proposed second connection format:

{
    name = 'connection_name_bar',
    type = '1:1',
    variants = {
        {
            filter = {foo = 1, bar = 'id_1'},
            destination_collection = 'collection_baz',
            parts = {
                {
                    source_field = 'field_name_source_1',
                    destination_field = 'field_name_destination_1'
                },
                ...
            },
            index_name = 'index_name'
        },
        ...
    }
}

We could move source_fields upward out of variants, but I like the idea of maximum reusability of the current code (for now, at least). The format of 'filter' was chosen for the same reason.

  • tarantool_graphql (validation): extend connection validation to support the second connection format;
  • tarantool_graphql (graphql schema generation): for such connections, generate all possible types and construct a union as the graphql type of the corresponding connection field;
  • tarantool_graphql (resolve functions): expand the from parameter of accessor:select() to be a list of filters (from pairs), and expand collection_name to be a list of such collection names (in the corresponding order);
  • accessor_general: keep select_internal as is, but before invoking it do the following: match the parent against the filters from the from argument one by one, choose the Nth collection_name from the collection_name list, and pass the found collection name and the matched from variant to the unchanged select_internal.
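The variant-matching step described in the last bullet could look roughly like the following sketch (Python for illustration; the matching semantics and data shapes are assumptions):

```python
# Hypothetical variant resolution: match the parent object against each
# variant's filter in order and pick the first variant whose filter fields
# all equal the parent's values.

def match_variant(parent, variants):
    for variant in variants:
        if all(parent.get(k) == v for k, v in variant["filter"].items()):
            return variant["destination_collection"], variant
    raise ValueError("no variant matches the parent object")

variants = [
    {"filter": {"foo": 1, "bar": "id_1"}, "destination_collection": "collection_baz"},
    {"filter": {"foo": 2}, "destination_collection": "collection_qux"},
]
parent = {"foo": 2, "bar": "id_9"}
collection, _ = match_variant(parent, variants)
print(collection)  # collection_qux
```

The found collection name and the matched variant would then be handed to the unchanged select_internal.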

Debatable: the avro_schema_changes.org document restricts the tag value type to number / string and relies on type conversion, which does not seem good to me. Maybe we should specify the tag value as the value of some field, not as a key. I proposed a more powerful way that allows reusing the existing code of our library as much as possible.

Create a list of planned optimizations

  • optimize offset
  • join reordering (default: top to bottom, but when there is an index in a nested level: reorder execution)
    • further task: evaluate selectivity
  • we can start with simple join loop
  • block nested loop
  • map-reduce + pushdown conditions
  • determine the index name at query compile time for top-level objects when the query has a certain list of non-null arguments

Limit result size

Limit the length of result lists (overall item count or item count for each list), or limit the result size in bytes.
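Both proposed limits can be sketched in a few lines; the threshold names and values below are illustrative assumptions:

```python
# Hypothetical result-limit checks: cap the item count of a result list and
# cap its overall encoded size in bytes.
import json

MAX_ITEMS_PER_LIST = 1000
MAX_RESULT_BYTES = 1024 * 1024

def check_limits(result_list):
    if len(result_list) > MAX_ITEMS_PER_LIST:
        raise RuntimeError("too many items in result list")
    size = len(json.dumps(result_list).encode("utf-8"))
    if size > MAX_RESULT_BYTES:
        raise RuntimeError("result size exceeds byte limit")
    return result_list

print(check_limits([1, 2, 3]))  # [1, 2, 3]
```

In practice the byte limit would likely be checked incrementally during result construction rather than after serializing the whole list.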

Support directives: @skip, @include

It seems that graphql-lua already supports directives, so we need to write a test to check that everything works as expected. An example can be found here: http://graphql.org/learn/queries/#directives . I think the test can reuse the common_testdata dataset and conditionally include/skip a connection field. I think we need to check only a boolean variable as the expression of a directive and postpone more complex cases until they are requested explicitly.
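For reference, a directive query of the shape described on the linked page (a boolean variable driving @include on a nested field) looks like this; the field names are from the graphql.org example, not from our test data:

```graphql
query Hero($withFriends: Boolean!) {
  hero {
    name
    friends @include(if: $withFriends) {
      name
    }
  }
}
```

A test against common_testdata would follow the same shape, with a connection field in place of friends.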

Export execution plan

Allow exporting the execution plan in some plain text format, to be able to cover optimizer decisions with functional tests.

Support compound primary indexes

  • tests:
    • lookup by a compound index on the top level
    • offset by a compound primary index at top level
    • connection on top of compound primary index (full/partial); related to 1st item in #30
  • accessor_general.get_index_name()
    • construct pivot_value_list to pass to the index:pairs()
    • construct value_list to pass to the index:pairs()
  • accessor_general.build_lookup_index_name(): remove assert
  • accessor_general.new()::list_args(): construct compound type for the offset field
  • tarantool_graphql.parse_cfg():
    • support non-scalar types in arguments (InnerObject) parsing + register in graphql-lua
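Constructing the value_list / pivot_value_list for a compound index amounts to ordering the argument values by index parts and stopping at the first unspecified part (a partial lookup). A minimal sketch, with assumed names:

```python
# Hypothetical helper for a compound-index lookup: collect argument values
# in index-part order; a missing part ends the list (partial key lookup).

def build_value_list(index_parts, args):
    values = []
    for part in index_parts:
        if part not in args:
            break  # partial lookup: the remaining parts stay unbound
        values.append(args[part])
    return values

print(build_value_list(["p1", "p2", "p3"], {"p1": 1, "p2": 2}))  # [1, 2]
```

The resulting list is what would be passed to index:pairs() in accessor_general.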

Avro schema with a record as a record field causes an error

Schema:

"user": {
    "type": "record",
    "name": "user",
    "fields": [
        {"name": "uid", "type": "long"},
        {"name": "p1", "type": "string"},
        {"name": "p2", "type": "string"},
        {
            "name": "nested",
            "type": {
                "type": "record",
                "name": "nested",
                "fields": [
                    {"name": "x", "type": "long"},
                    {"name": "y", "type": "long"}
                ]
            }
        }
    ]
}

Error:
Encountered multiple types named "nested"

JOIN semantic for nested objects filtering

There are two approaches to handling nested objects that fail to be fetched via a 1:1 connection:

  • Raise an error. This is the current behaviour and seems to be the GraphQL way.
  • Remove all parent objects as long as the parent-child connections have the 1:1 type (the first 1:N-connected parent or the top-level object will yield an empty list). This is a sort of backtracking and feels like an SQL JOIN.
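The second option can be sketched as follows; the data shapes and helper names are hypothetical, for illustration only:

```python
# Hypothetical SQL-JOIN-like semantics for a 1:1 connection: a parent object
# is dropped from the result when its connected child cannot be fetched,
# instead of raising an error.

def fetch_child(child_index, parent):
    return child_index.get(parent["child_id"])  # None when the child is missing

def resolve_1_1(parents, child_index):
    result = []
    for parent in parents:
        child = fetch_child(child_index, parent)
        if child is None:
            continue  # backtrack: remove the parent instead of erroring
        result.append({**parent, "child": child})
    return result

children = {1: {"name": "a"}}
parents = [{"uid": 10, "child_id": 1}, {"uid": 11, "child_id": 2}]
print(resolve_1_1(parents, children))  # only uid 10 survives
```

With nested 1:1 connections the removal would propagate upward until the first 1:N connection or the top level.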

I'm not sure which option should be implemented (or both?).

Maintain cluster-wide statistics about data and its distribution

To be able to perform cost-based query analysis, we need to maintain cluster-wide statistics about data distribution: number of records, index fan-out (unique records vs. all records), data set size for column/space.

Where to store these statistics is yet to be investigated (let's look at NewSQL vendors and see what they do).
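For illustration, the per-space statistics listed above could be shaped like this; the structure and the selectivity formula are assumptions, not the library's actual format:

```python
# Hypothetical per-space statistics for a cost-based planner: row count,
# index fan-out (unique keys vs. all rows) and data size.
from dataclasses import dataclass

@dataclass
class SpaceStats:
    row_count: int
    unique_keys: int       # distinct values in the indexed field
    data_size_bytes: int

    def selectivity(self):
        """Expected fraction of rows matched by an equality lookup."""
        return 1.0 / self.unique_keys if self.unique_keys else 1.0

stats = SpaceStats(row_count=10_000, unique_keys=500, data_size_bytes=4_000_000)
print(stats.selectivity() * stats.row_count)  # 20.0 rows expected per lookup
```

A cluster-wide view would aggregate such records across shards, which is part of what needs investigating.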

Test record within a record

  • generate appropriate GraphQL types;
  • test for selecting such objects;
  • test for filtering by internal fields using the foo.bar: … or foo: {bar: …} syntax.

Accessor should consider avro schema evolution

We have two versions of the Avro schema and four service fields:
0001:

"service": {
    "type": "record",
    "name": "service",
    "fields": [
        {"name": "uid", "type": "string"},
        {"name": "p1", "type": "long"},
        {"name": "p2", "type": "long"}
    ]
}

0002:

"service": {
    "type": "record",
    "name": "service",
    "fields": [
        {"name": "uid", "type": "string"},
        {"name": "p1", "type": "long"},
        {"name": "p2", "type": "long"},
        {"name": "p3", "type": "string", "default": "test avro default"}
    ]
}

If data has been pushed into Tarantool, the tuple ['79031234566', '2451111545', '0002', 1519231048.4021, '79031234566', 2, 2, 'test avro default'] will be stored. So if I try to receive the data through graphql, I will get an error in the unflatten function, because the first version of the schema does not have the 'p3' field.
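One way the accessor could handle evolution is to unflatten against a chosen schema version and use Avro "default" values when a stored tuple is shorter than the schema. This is a hypothetical sketch, not the library's actual unflatten:

```python
# Hypothetical schema-aware unflattening: fields missing from an older tuple
# are filled from the field's "default"; a missing field with no default is
# an error (this mirrors Avro schema-resolution rules).

def unflatten(fields, tuple_tail):
    obj = {}
    for i, field in enumerate(fields):
        if i < len(tuple_tail):
            obj[field["name"]] = tuple_tail[i]
        elif "default" in field:
            obj[field["name"]] = field["default"]  # old tuple, newer schema
        else:
            raise ValueError("missing field without a default: " + field["name"])
    return obj

fields_0002 = [
    {"name": "uid", "type": "string"},
    {"name": "p1", "type": "long"},
    {"name": "p2", "type": "long"},
    {"name": "p3", "type": "string", "default": "test avro default"},
]
# A tuple written under schema 0001 (no p3) read with schema 0002:
print(unflatten(fields_0002, ["79031234566", 2, 2]))
```

The reverse direction (a newer tuple read with an older schema) would instead drop the trailing fields the older schema does not know about.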

Whether it is worth deducing the connection type (1:1 or 1:N) from the index parts provided

The proposal was to consider:

  • a full primary index lookup as unique,
  • a partial primary index lookup as non-unique,
  • a full/partial secondary index lookup as non-unique.
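The three bullets above reduce to one rule, sketched below; the index description shape is an assumption for illustration:

```python
# Hypothetical deduction rule: a lookup is unique (1:1) only when it covers
# a full primary index; partial primary and any secondary lookup are 1:N.

def deduce_connection_type(index, lookup_fields):
    full = list(lookup_fields) == list(index["fields"])
    if index["primary"] and full:
        return "1:1"
    return "1:N"

pk = {"primary": True, "fields": ["uid"]}
sk = {"primary": False, "fields": ["p1", "p2"]}
print(deduce_connection_type(pk, ["uid"]))       # 1:1
print(deduce_connection_type(pk, []))            # 1:N (partial primary)
print(deduce_connection_type(sk, ["p1", "p2"]))  # 1:N (secondary)
```

Note that this rule deliberately ignores unique secondary indexes, which is exactly the flexibility concern discussed below.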

I'm not sure we want to reduce flexibility here, e.g. to treat a unique secondary index as a non-unique one. On the other hand, for space/shard accessors this affects only the result shape: an object versus a list with one item, which is that object. We need to elaborate on how this really constrains possible accessor implementations.

The original decision to move the index description out of the graphql part was inspired by the feeling that it looks more like the accessor's part than graphql's. But there is a possible compromise: link a connection with an index in the data accessor, then fetch the connection type information in the graphql part from the accessor.
