vaticle / typedb-behaviour

TypeDB Behaviour Test Specification

License: GNU Affero General Public License v3.0

Gherkin 98.59% Starlark 1.41%

typedb-behaviour's Introduction

TypeDB Behaviour Specification

typedb-behaviour's People

Contributors

jmsfltchr flyingsilverfin alexjpwalker lolski adammitchelldev vaticle-test coplanetary maxbaxt phillammon haikalpribadi erikedin dmitrii-ubskii krishnangovindraj jamesreprise dmikhalin shiladitya-mukherjee cxdorn farost

typedb-behaviour's Issues

Password update scenario

Add password update scenario

Check unhappy paths in Connection tests

Problem to Solve

We have recently exposed a number of segfault and stalling issues caused by simple user errors: closing the transaction before a query finishes running, deleting the database and then trying to close a session... and so on.

While these are most certainly user errors, they should be handled gracefully, like any other user error. A segfault or stall is not an acceptable response in these scenarios.

Proposed Solution

We should add unhappy path tests to our Connection tests (Database, Session and Transaction). Basically, think of any scenario that could cause trouble because the user performed a DB/session/tx operation they weren't meant to, and write a test for it. (By tx operations, we do not mean syntactically/semantically invalid queries, but rather opening or closing a transaction at an illegal time.)

ACID Behaviour Tests

We currently limit our transaction and connection BDD scenarios to open, isRead/Write, close and commit, sequentially and concurrently. However, we don't actually perform any operations within each open transaction, such as attempting a write in a read transaction and expecting an exception.

We need to test operations running in parallel across various transactions. We should keep in mind that this amounts to testing the isolation and serialisability of transactions in the DB, so these tests should be designed properly.

List of all steps

Problem to Solve

It could be useful to have a list of all possible steps with descriptions for some of them.

When we add a new step it is simpler to check if this step already exists in one file than in all feature-files.
Some steps are not self-describing, especially if they have a table with parameters (e.g. "uniquely identify answer concepts").
Some steps could be spelled in different ways (e.g. with/without verb ending).

Proposed Solution

The list itself could be auto-generated (but descriptions need to be added by hand, obviously).
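As a rough illustration of how such a list could be auto-generated, here is a minimal Python sketch. It assumes that steps are the lines starting with Given, When, Then, And, But or `*`, that query bodies sit inside `"""` blocks, and that replacing quoted strings and numbers with placeholders is an acceptable way to collapse variants. The function names (`collect_steps`, `normalise`) are hypothetical and not part of any existing tooling.

```python
import re
from collections import defaultdict

# Gherkin keywords that introduce a step; "*" is also a valid step prefix.
STEP_RE = re.compile(r"^\s*(Given|When|Then|And|But|\*)\s+(.*\S)\s*$")


def normalise(step_text):
    """Replace quoted strings and numbers with placeholders so variants collapse."""
    text = re.sub(r'"[^"]*"', '"<str>"', step_text)
    return re.sub(r"\b\d+\b", "<n>", text)


def collect_steps(features):
    """Map each normalised step to the sorted list of feature names that use it.

    `features` maps a feature name to its Gherkin source text.
    """
    index = defaultdict(set)
    for name, source in features.items():
        in_docstring = False
        for line in source.splitlines():
            if line.strip().startswith('"""'):
                # Toggle on every docstring delimiter and skip the query body.
                in_docstring = not in_docstring
                continue
            if in_docstring:
                continue
            match = STEP_RE.match(line)
            if match:
                index[normalise(match.group(2))].add(name)
    return {step: sorted(names) for step, names in sorted(index.items())}
```

In practice the `features` mapping would be built by reading every `.feature` file in the repository; hand-written descriptions could then be kept in a separate file keyed by the normalised step text.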

Testing opening a transaction when the DB is deleted crashes in spectacular fashion in 2.0

The affected test is Scenario: delete a database causes open sessions to fail in database.feature.

In Grakn 1.8, attempting to open a transaction throws an exception if the DB is deleted.

In Grakn 2.0, it fails spectacularly, crashing and killing the entire JVM with a segfault:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000000000000, pid=99594, tid=0x0000000000001003
#
# JRE version: OpenJDK Runtime Environment (8.0_252-b09) (build 1.8.0_252-b09)
# Java VM: OpenJDK 64-Bit Server VM (25.252-b09 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# C  0x0000000000000000
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /private/var/tmp/_bazel_aw/5782084d1ce012ddde90f3bc6bd3ed03/execroot/graknlabs_grakn_core/bazel-out/darwin-fastbuild/bin/test/behaviour/connection/database/test.runfiles/graknlabs_grakn_core/hs_err_pid99594.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

The issue appears to be caused by RocksDB cleaning up and removing objects from memory when the DB is deleted, and then attempting to access invalid memory when opening a tx.

We should either fix the test (which seems unlikely) or delete it.

Test "get subtypes explicit" steps

Make sure we check if transaction is closed when a query throws

Enforce that all data have keys

Objective

Resolution testing depends upon identifying the original concepts inserted by key, and so this should be enforced.

Test overriding 'plays', 'owns', etc. when defining types

We need Graql Define tests for overriding 'plays', 'owns' and so on.

We discovered recently that:

define
locates relates located;
contractor-locates sub locates, relates contractor-located as located;

employment sub relation, relates employee, plays locates:located;
contractor-employment sub employment, plays contractor-locates:contractor-located as located;

causes a NPE to be thrown on the server.

See vaticle/typedb#6032 - we would have caught this severe bug much sooner, if we had these scenarios.

Test converting a key ownership to a regular attribute ownership

We have

Scenario: an attribute ownership can be converted to a key ownership
    When graql define
      """
      define person owns name @key;
      """
    Then transaction commits
    Then the integrity is validated
    When session opens transaction of type: read
    When get answers of graql query
      """
      match $x owns name @key;
      """
    Then uniquely identify answer concepts
      | x            |
      | label:person |

but we don't have the reverse: converting a key ownership back to a regular attribute ownership. We should add it.

Implement variable roles Match tests

A lot of nuanced match cases can be found by considering variable roles in a relation. A quick check shows that we have no such match tests in the current match.feature.

For example, see the test added in #171.

We should write test cases to cover these, and discuss what the expected outcomes are for queries such as the following:

match ($role: $x) isa $rel; $role type friendship:friend;

match ($role: $x) isa rel; $role sub friendship:friend;

and any other fun cases that can be found!

Tests broken by down and side casting ban

All tests that have been disabled are tagged with the following comment:
# TODO: disabled because down and side casting no longer allowed

All these tests need to be edited to coincide with the new restrictions, or deleted.

Differentiate between 'throws syntax error' and 'throws semantic error'

Currently we only test that exceptions are thrown. We should also test whether they are syntax errors or semantic errors.

RBE-ify circleCI

Currently CircleCI runs without RBE since we spawn the Grakn Server in a separate command from the integration tests themselves, and then disable RBE by using spawn-strategy=local.

The fix would be to follow client-java's approach by passing the grakn distribution as data and executing it from the code. However, this requires moving the code that does this from client-java to another location that can be easily shared from different repositories, including console, verification, and client-java. All three of these currently use a different approach (console pulls GraknTestServer from Grakn's internals that should not be used externally anyway)

Change List entry in BDD to be """ entry for all Grakn queries

Multiline graql queries using | | in BDD are confusing: it looks like one can write multiple valid graql queries in the same step.

However, we can avoid this by using """ instead of | to denote a query. A """ block is allowed to span multiple lines and has the same highlighting advantages as using |

Connection tests use a lot of memory

Problem to Solve

When running the feature files connection/session.feature and connection/database.feature in parallel, our CI machines run out of memory.

Proposed Solution

We could do one of the following:

Reduce the number of concurrent databases opened in these features (currently 6)
Increase the amount of memory in the CI machine that runs them (currently 2GB)
Limit the parallelisation of test runs using --jobs=1 (this is what we currently do, along with a TODO message saying we should fix this issue. It may be that this is, however, the best solution, in which case we should remove that TODO)

Behavioural tests for explanations

Objective

We wish to build behavioural tests to assert the structure and content of explanations and patterns that are returned with answers.

Limitations

These tests will be seriously limited: each scenario written must have only one possible resolution path that the reasoner can choose, so that the tests can be relied upon and remain relatively simple in implementation. Resolution testing is designed to handle the more complex cases.

Tests broken by banning rule inferences adding roles to existing relations

All tests that have been disabled are tagged with the following comment:
# TODO: disabled because added new roles to existing relations no longer allowed in rule heads

All these tests need to be edited to coincide with the new restrictions, or deleted.

Reasoner completeness checking

Objective

In resolution testing, we would like to check two things:

rules are triggered when they should be
rules are not triggered when they shouldn't be

The first is tested against a completed knowledge base, where the resolution path can be checked exactly.

The second, which this issue concerns, is to check that inferring all possible facts produces exactly the same number of facts that are expected if inference is working correctly. This doesn't test with certainty, but a false positive will only be given if there are multiple failures which zemblanitantly sum to give the expected count.
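To make the false-positive case concrete, here is a small Python sketch that compares a count-based check with an exact set comparison. Facts are modelled as plain tuples and the function names are hypothetical; this is an illustration of the counting argument only, not of how the resolution test framework is implemented.

```python
def count_check(expected_facts, inferred_facts):
    """Pass when both sides hold the same number of distinct facts."""
    return len(set(expected_facts)) == len(set(inferred_facts))


def exact_check(expected_facts, inferred_facts):
    """Pass only when both sides hold exactly the same facts."""
    return set(expected_facts) == set(inferred_facts)


expected = {("friendship", "alice", "bob"), ("friendship", "bob", "carol")}

# One rule failed to fire and another fired when it should not have:
# the two failures cancel out in the count.
inferred = {("friendship", "alice", "bob"), ("friendship", "alice", "dave")}

print(count_check(expected, inferred))  # True
print(exact_check(expected, inferred))  # False
```

A single failure changes the count and is caught; only two or more failures that happen to cancel each other slip through.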

Document how to integrate BDD into a new client

Problem to Solve

It isn't currently clear how to integrate BDD into a new client.

Proposed Solution

Document how to integrate BDD into a new client

End-to-end resolution testing

Objective

Broadly, the generation of a query from the explanations of an inferred answer is complete #24, and forward chaining to complete a KB is also working #23.

What remains is to interface the two, and iron out any issues that arise. This includes:

All BDD tests should use "throws exception containing" rather than "throws" to ensure we are failing in the expected fashion

Test inserting Relation instances in various cases to do with role hierarchies

Problem to Solve

We don't currently have any Concept or Graql tests that test inserting Relation instances in the case of a role hierarchy.

Proposed Solution

These tests should be added to relation.feature in concept and insert.feature in graql/language.

An example test case is: if we do

define 
  super-rel sub relation, relates abc-role;
  sub-rel sub super-rel;

  abc sub entity, plays super-rel:abc-role;

will we be able to insert abc instances playing role abc-role in sub-rel?

Add RoleType Concept tests

Problem to Solve

RoleType Concept methods were all broken in client-java and I didn't know about it, because we have no RoleType Concept tests.

Proposed Solution

Add RoleType Concept tests

Split up Graql Define BDD into several features

define.feature is 2600 lines long and takes 23 minutes to execute the tests in CircleCI (without caching).

It would be organised more cleanly if the feature was broken down into sub-features, and put into a new package named test/graql/language/schema (or perhaps even test/graql/language/define).

This might also make it possible to parallelise the tests better, speeding up our builds.

Upgrade Explanation tests for Grakn 2.0

Problem to Solve

The current Explanation feature files are not run anywhere, meaning Explanations are untested by BDD.

Proposed Solution

The feature specifications should be upgraded to be correct in Grakn 2.0. Then, Grakn Core and the Clients should be updated to run these specs.

Add Test Cases for Validations

We need to provide further test scenarios that cover all of our expected validation and failure cases.
For example, nothing except relations should be able to have relates, and only attributes can be connected by has or key.

Test "explicit" and "overridden" concept methods

The newly added *Explicit and *Overridden concept methods are untested, and should have BDD tests added for them, to be run in Core and all clients.

Null aggregates are now checked by detecting isNaN; update client implementations accordingly

Integrity Test Tool (part 2) for Schema + Data

The Integrity tool currently only builds and tests the sets required for analysing the schema integrity following the formal semantics. We should extend this tool to also validate all the data, and update it to match the expected Semantics for the target release of Grakn (currently Grakn 2.0)

Implement Concept API -- Graql comparison tests

We should have a set of tests that check that there is an equivalence between the Concept API and Graql queries.

For instance:
match $x id V123; $x has attribute $a; get $a; should be equivalent to calling tx.getConcept("V123").attributes()

Forward chaining for Knowledge Graph Completion

Objective

Implement a very naive forward chaining that can take an existing knowledge base and apply the rules contained to infer facts to exhaustion. All the while, each application of a rule needs to be recorded such that the resolution of a query's answer can be traced back to the original facts used.

Full approach detailed in (5) of #20
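A very naive forward chainer with provenance recording might look like the following Python sketch. It is an illustration under simplifying assumptions: facts are flat tuples of strings, variables are strings starting with `$`, each rule has a conjunctive body and a single head, and there is no negation. None of these names (`forward_chain`, `match_body`, `unify`) come from the actual implementation.

```python
from itertools import product


def is_var(term):
    return term.startswith("$")


def unify(pattern, fact, bindings):
    """Extend `bindings` so that `pattern` equals `fact`, or return None."""
    if len(pattern) != len(fact):
        return None
    result = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if result.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return result


def match_body(body, facts):
    """Yield (bindings, facts_used) for every way the body matches the facts."""
    for combo in product(sorted(facts), repeat=len(body)):
        bindings = {}
        for pattern, fact in zip(body, combo):
            bindings = unify(pattern, fact, bindings)
            if bindings is None:
                break
        else:
            yield bindings, combo


def forward_chain(facts, rules):
    """Apply rules to exhaustion; record the rule and facts behind each inference."""
    initial = set(facts)
    known = set(initial)
    provenance = {}  # inferred fact -> list of (rule name, facts used)
    changed = True
    while changed:
        changed = False
        for name, body, head in rules:
            for bindings, used in list(match_body(body, known)):
                new_fact = tuple(bindings.get(term, term) for term in head)
                if new_fact in initial:
                    continue  # material facts need no explanation
                if new_fact not in known:
                    known.add(new_fact)
                    changed = True
                records = provenance.setdefault(new_fact, [])
                if (name, used) not in records:
                    records.append((name, used))
    return known, provenance


facts = {("located-in", "london", "uk"), ("located-in", "uk", "europe")}
rules = [("transitive-location",
          [("located-in", "$a", "$b"), ("located-in", "$b", "$c")],
          ("located-in", "$a", "$c"))]
kb_complete, provenance = forward_chain(facts, rules)
```

Because the final pass adds nothing new, every rule sees the full set of facts in that pass, so each inferred fact ends up with all of its possible derivations recorded. The cost is exponential in the size of a rule body, which is acceptable only for small test knowledge bases.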

Improve user management features

Problem to Solve

Our user management feature is fairly bare-bones right now, with TypeDB Cluster carrying the bulk of the weight for testing user management.

Current Workaround

We test much of user management with TypeDB Cluster integration tests.

Proposed Solution

We should expand the test suite and the available steps that can be executed during a test. These will be more stateful and akin to our tests which open and close sessions and transactions.

Rule Resolution Testing Methodology

We have considered a number of approaches to testing that rules resolve correctly, which we are recording here. All of these solutions involve creating a competing method against which to test.

To test the resolution of rules, there are two failure modes that we wish to catch:

i. A rule doesn't fire when it should
ii. A rule fires when it shouldn't

(1) Query Comparison

Centres around manipulating queries.

We start with an initial query, which is executed to find an answer.
We expand the explanations of that query recursively until we reach the set of material facts used, which we substitute back into the original query to find a query that contains nothing inferred.
Find all of the ways that the query could be resolved by the reasoner by making substitutions of a rule's body for a rule's head to the query. This will create a tree of queries that can possibly be resolved (we are calling these the resolutions).
Check that the initial query is present in this resolution tree

This cannot check for failure mode (ii), since not everything in the resolution tree is checked for correctness (no completeness check).
This does not check for failure mode (i) with certainty, since it is possible that a rule doesn't fire when it should, and yet thanks to a different rule (correct or otherwise) the expected resolution is still present in the resolution tree.
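The substitution of rule bodies for rule heads can be sketched in Python as follows. This is a heavily simplified, propositional illustration: a query is a frozenset of opaque atom labels and a rule is a (head, body) pair, so variables and unification are ignored entirely. The `resolutions` function and the atom labels are hypothetical.

```python
def resolutions(query, rules):
    """Return every query reachable by replacing a rule head with that rule's body."""
    seen = {query}
    frontier = [query]
    while frontier:
        current = frontier.pop()
        for head, body in rules:
            if head in current:
                rewritten = (current - {head}) | body
                if rewritten not in seen:
                    seen.add(rewritten)
                    frontier.append(rewritten)
    return seen


rules = [
    ("grandparent", frozenset({"parent-1", "parent-2"})),
    ("parent-1", frozenset({"mother-1"})),
]
tree = resolutions(frozenset({"grandparent"}), rules)

# A query rebuilt from the material facts found in the explanations
# can then be looked up in the set of resolutions.
print(frozenset({"mother-1", "parent-2"}) in tree)  # True
```

The lookup only shows that some sequence of substitutions leads to the query, which is why this approach cannot establish with certainty that a particular rule fired.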

(2) "Mega Query"

The basis here is that the sub-queries made by the reasoner for each rule body can be brought together to form one big query which should contain the original answer in its answer set.

We execute an initial query to gain an answer.
We expand the explanations of the answer, and make substitutions into the initial query with the bodies of the rules encountered.
Once the explanations have been exhausted, we have a "Mega Query", which we execute. This gives us concrete instances with IDs.
We check that, in one of the answers returned, these instances correspond to those given by the explanations earlier OR We substitute all of the concrete instance IDs into the "Mega Query" and check that there is one result.

We believe this does not check for failure mode (i) or (ii), since the query is built from explanations given by Grakn: if those are incorrect, we are not testing against a competing solution, but against Grakn itself. This seems to render the solution useless.

(3) Data Generated from Rules

The basis is to choose the ordering in which rules will be applied ahead of time, and insert data that will trigger them accordingly.

Create a set of trees of rules that will become the expected explanation.
Conjunct (or otherwise) the bodies of the rules at the leaves of the trees to build an insert query, and make that insertion.
Conjunct (or otherwise) the bodies of the rules at the roots of the trees, and execute that query.
Examine the explanation to assert that the initial facts were used and the rules were applied in the correct order OR assert that there was exactly one answer

It seems that this can test for failure modes (i) and (ii)
Building the rule trees seems very difficult, and it seems very difficult to ensure or cater for the fact that there may be more than one path to an inferred fact.

(4) Prolog Translation

Neither Prolog nor Grakn has a way to deterministically produce IDs for inferred facts. Therefore comparing IDs between Grakn and Prolog remains an issue if a translation can be made.

(5) Naive Forward Chaining

The basis is that we use naive forward chaining to expand a set of facts to create a graph that contains all possible inferences and how they can be arrived at.

1. Begin with a KB with some facts and rules present (let's call this state KB-initial).
2. Naively iterate through the rules repeatedly, inserting all possible facts that are inferred by each rule. For each insertion also record the name of the rule that was used, and the facts that were used for the inference. We can call this state KB-complete.
3. Create a set of queries which have answers present in KB-complete, where reasoning is disabled (and/or rules have been removed) for KB-complete.
4. Create KB-test with the same state as KB-initial.
5. Run the set of queries from KB-complete on KB-test, and check that the explanation for each inferred concept matches one of the possible paths recorded in KB-complete.
6. Run a query such as match $x isa thing; get; in order to trigger all rules on KB-test and (taking de-duplication into account) check that the total number of material+inferred facts in KB-test is equal to that in KB-complete.

This approach stands to test in full for failure mode (i) and (ii). Step 6 should test for completeness against failure mode (ii) (that there are no rules that fire when they shouldn't once all possible concepts are inferred).

Attached is a pic of our whiteboard in case it's useful later

Test attribute assignment and reassignment

Problem to Solve

We currently have a set of errors and gaps in functionality and documentation to do with moving 'has' attribute ownerships around.

Current Workaround

We can currently do everything one would expect by creating a value concept first:

match $x isa person, has name $name; ?n = $name;
insert $x isa person, has nickname ?n;

Proposed Solution

We should complete the following set of tests in each of the 'match', 'insert', and 'delete' clauses if they don't already exist:

$x has $attr;

$x has name $attr;

$x has name = "name";

$x has name = $name;

$x has name = ?name;

In particular, we should complete the coverage for cases where the attribute value is 'matched' before, or updated before (in an 'update' query).

Additional Information

We should validate it's possible to move an attribute value from one owned type to another owned type.

Define attribute subtype does not throw if you try to override 'value'

This behaviour should be fixed in core, then the @ignore flag should be removed from the behaviour:

"define attribute subtype throws if you try to override 'value'"

See behaviour/graql/language/define.feature

Handle all types of pattern in resolution testing

Objective

Presently resolution testing only considers conjunctions and statements, this needs to be extended to also consider disjunctions and negations.

"[do not] contain" steps should be replaced by "equals"

Problem to Solve

All steps that reference "[do not] contain" are just a wordier way to check for exact set equality.

Then entity(customer) get owns key types contain:
  | username   |
  | reference  |
  | work-email |
Then entity(customer) get owns key types do not contain:
  | email |
Then entity(customer) get owns explicit key types contain:
  | reference  |
  | work-email |
Then entity(customer) get owns explicit key types do not contain:
  | username   |
  | email      |

Proposed Solution

The tests could be made terser and more explicit by the introduction of an "equals" or "is exactly" step instead.

Then entity(customer) get owns key types equals:
  | username   |
  | reference  |
  | work-email |
Then entity(customer) get owns explicit key types equals:
  | reference  |
  | work-email |
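The difference between the paired steps and a single "equals" step can be sketched with plain sets in Python. The function names below are hypothetical stand-ins for step implementations, and the extra `phone` label is invented for illustration. One thing the sketch brings out: the "contain" / "do not contain" pair only amounts to exact equality when the excluded list happens to cover every unwanted label, whereas an "equals" step is strict by construction.

```python
def contains(actual, expected):
    """'contain' step: every expected label is present."""
    return set(expected) <= set(actual)


def does_not_contain(actual, excluded):
    """'do not contain' step: none of the excluded labels is present."""
    return set(actual).isdisjoint(excluded)


def equals(actual, expected):
    """Proposed 'equals' step: the labels match exactly."""
    return set(actual) == set(expected)


expected_keys = ["username", "reference", "work-email"]

# Hypothetical result with an unexpected extra key type.
actual_keys = ["username", "reference", "work-email", "phone"]

print(contains(actual_keys, expected_keys))    # True
print(does_not_contain(actual_keys, ["email"]))  # True
print(equals(actual_keys, expected_keys))      # False
```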

Un-ignore client-java explanation tests

Description

These tests have been ignored for now until we have bandwidth to return to them. They are needed to ensure that explanations and answer patterns are returned correctly on the client-side. The tests are found at //behaviour/graql/explanation

Construct resolution query from answer's explanation(s)

Objective

In order to verify that the reasoner arrived at inferences correctly, we need to construct a query that will be made on a completed knowledge base (with no rules), using only information given in the answer explanations of a query of an inference-enabled knowledge base (with rules).

Given a query that uses inference, execute it, and for each answer build a query that will assert that all facts used were correct and inferred facts were inferred by the correct application of each rule used.

Full approach detailed in (5) of #20

Missing functionality in Resolution Test framework

The following issues have been highlighted in the Resolution Test framework while writing resolution tests:

Matching attribute types

when {
  $x isa name;
};

will currently fail structural validation. The error is: A structural validation error has occurred. The type [name] of role player [V20672] is not allowed to play Role [instance]

Attribute re-attachment

Currently, it is not possible to complete the materialised keyspace upon defining

define
      transfer-string-attribute-to-other-people sub rule,
      when {
        $x isa person, has string-attribute $r1;
        $y isa person;
      },
      then {
        $y has string-attribute $r1;
      };

because attaching an existing attribute to a second owner is not supported. A blank AssertionError is thrown.

Re-attachment of unrelated attributes

A different error is thrown when the inferred attribute has an unrelated type to the non-inferred attribute:

transfer-attribute-value-to-unrelated-attribute sub rule,
      when {
        $x isa person, has string-attribute $r1;
      },
      then {
        $x has unrelated-attribute $r1;
      };

The error is: grakn.core.kb.graql.exception.GraqlSemanticException: Downcasting concepts from type Base Type [ATTRIBUTE_TYPE] - Id [V16560] - Label [string-attribute] to type Base Type [ATTRIBUTE_TYPE] - Id [V12464] - Label [unrelated-attribute] is not allowed.

This one might need a bit of thought, since the variable $r1 refers to two different concepts, which is a difference between rules and the rest of graql.

Materialised keyspace fails to de-duplicate attributes when counting them

Given

define
      lucky-number sub attribute, value long;
      person has lucky-number;
      rule-1337 sub rule, when { $x isa person; }, then { $x has lucky-number $n; $n 1337; };
      rule-1667 sub rule, when { $x isa person; }, then { $x has lucky-number $n; $n 1667; };
      rule-1997 sub rule, when { $x isa person; }, then { $x has lucky-number $n; $n 1997; };
insert
      $x isa person, has ref 0;
      $y isa person, has ref 1;

then the completeness test should return [3] inferred concepts in the materialised keyspace, but it in fact counts [6] as it fails to de-duplicate the inferred attributes.
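The two counts can be reproduced with a short Python sketch. It assumes that an attribute is identified by its type label and value, so two people owning lucky-number 1337 share a single attribute concept; the tuple model and function names are hypothetical and do not mirror the framework's code.

```python
def count_inferred_attributes(ownerships):
    """Count distinct attributes, identified by (type label, value)."""
    return len({(label, value) for _owner, label, value in ownerships})


def count_inferred_ownerships(ownerships):
    """Count one concept per owner, which double-counts shared attributes."""
    return len(set(ownerships))


people = ["ref-0", "ref-1"]
lucky_numbers = [1337, 1667, 1997]

# Each of the three rules gives every person one lucky-number.
ownerships = [(p, "lucky-number", n) for p in people for n in lucky_numbers]

print(count_inferred_attributes(ownerships))  # 3
print(count_inferred_ownerships(ownerships))  # 6
```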

Slowness of query

The scenario 3-hop transitivity takes too long to run.

Infinite materialised keyspace

From Scenario: when resolution produces an infinite stream of answers, limiting the answer size allows it to terminate in relation-inference.feature:

Given for each session, graql define
      """
      define

      dream sub relation,
        relates dreamer,
        relates dream-subject,
        plays dream-subject;

      person plays dreamer, plays dream-subject;

      inception sub rule,
      when {
        $x isa person;
        $z (dreamer: $x, dream-subject: $y) isa dream;
      }, then {
        (dreamer: $x, dream-subject: $z) isa dream;
      };
      """
    Given for each session, graql insert
      """
      insert
      $x isa person, has name "Yusuf";
      # If only Yusuf didn't dream about himself...
      (dreamer: $x, dream-subject: $x) isa dream;
      """
    When materialised keyspace is completed

It's currently unclear what this step should do - by definition, forward chaining never terminates.
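The non-termination can be seen in a small Python sketch of the inception rule. It models a dream as a nested tuple `("dream", dreamer, subject)` and treats every dreamer as a person, which is a simplification of the schema above; `complete_with_limit` and its round limit are hypothetical and are not an existing step.

```python
def complete_with_limit(dreams, max_rounds):
    """Apply the inception rule until a fixpoint or until max_rounds is reached.

    Returns (known dreams, rounds run, whether a fixpoint was reached).
    """
    known = set(dreams)
    for round_number in range(1, max_rounds + 1):
        # For each dream z with dreamer x, infer a dream where x dreams about z.
        new = {("dream", z[1], z) for z in known} - known
        if not new:
            return known, round_number - 1, True
        known |= new
    return known, max_rounds, False


yusuf_dream = ("dream", "Yusuf", "Yusuf")
known, rounds, finished = complete_with_limit({yusuf_dream}, 5)

print(len(known))  # 6
print(finished)    # False
```

Every round produces a dream about the newest dream, so the fact set grows by one each time and no fixpoint is reached; any "completed" keyspace for this scenario would have to be bounded by something like the round limit used here.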

Type generation

Given

define

      duelist sub person;
      poet sub person;

      romeo-is-a-duelist sub rule,
      when {
        $x isa person, has name "Romeo";
      }, then {
        $x isa duelist;
      };

the step "materialised keyspace is completed" fails with an error saying that downcasting concepts (person to duelist) is not allowed.

Schema queries (variable types)

The step all answers are correct in reasoned keyspace throws various errors, including: ["Currently we only handle Things, and not Types", "The concept RELATION_TYPE - label [friendship] is not of type concept.api.Thing"] when the query to test contains a variable type.

Unidentified issue 1

Given

define
      lucky-number sub attribute, value long;
      person has lucky-number;
      rule-1337 sub rule, when { $x isa person; }, then { $x has lucky-number $n; $n 1337; };
      rule-1667 sub rule, when { $x isa person; }, then { $x has lucky-number $n; $n 1667; };
insert
      $x isa person, has ref 0;
      $y isa person, has ref 1;

Then

      match
        $x isa person, has lucky-number $m;
        $y isa person, has lucky-number $n;
        $m >= $n;
      get;

should work fine, but in fact it throws an especially bizarre error message when verifying that answers are correct in the reasoned keyspace:

grakn.core.test.behaviour.resolution.framework.Resolution$CorrectnessException: Resolution query had 0 answers, it should have had 1. The query is:
 match { $x10 (instance: $r1-x) isa isa-property, has type-label "person"; $r1-n 1337; $x11 (owner: $r1-x) isa has-attribute-property, has lucky-number $r1-n; $r0-y has ref 0; $r1-x has ref 0; $_ (body: $x10, head: $x11) isa resolution, has rule-label "rule-1337"; $r0-x has lucky-number $r0-m; $r0-x has ref 1; $r0-n 1337 isa lucky-number; $x9 (owner: $r1-x) isa has-attribute-property, has lucky-number $r1-n; $_ (body: $x8, head: $x9) isa resolution, has rule-label "rule-1337"; $r1-x has ref 1; $r0-y has lucky-number $r0-n; $r0-m 1337 isa lucky-number; $x8 (instance: $r1-x) isa isa-property, has type-label "person"; $r1-x has lucky-number $r1-n; $r0-x isa person; $r0-y isa person; $r1-x isa person; }; get;

Unidentified issue 2

Given

define
      lucky-number sub attribute, value long;
      person has lucky-number;
      rule-1337 sub rule, when { $x isa person; }, then { $x has lucky-number $n; $n 1337; };
      rule-1667 sub rule, when { $x isa person; }, then { $x has lucky-number $n; $n 1667; };
insert
      $x isa person, has ref 0;
      $y isa person, has ref 1;

Then

      match
        $x isa person, has lucky-number $m;
        $y isa person, has lucky-number $n;
        $m > $n;
        $n > 1667;
      get;

should work fine, but in fact it throws: No resolution queries were constructed for query

Unidentified issue 3

Given for each session, graql define
      """
      define

      iceland-sells-drinks sub rule,
      when {
        $x isa soft-drink;
      },
      then {
        $x has retailer 'Iceland';
      };

      poundland-sells-drinks sub rule,
      when {
        $x isa soft-drink;
      },
      then {
        $x has retailer 'Poundland';
      };

      londis-sells-drinks sub rule,
      when {
        $x isa soft-drink;
      },
      then {
        $x has retailer 'Londis';
      };
      """
    Given for each session, graql insert
      """
      insert $x isa soft-drink, has name "Fanta", has ref 0;
      """
    When materialised keyspace is completed
    Then for graql query
      """
      match
        $x has retailer $rx;
        $rx contains "land";
      get;
      """

This fails the completeness check: The complete KB contains 0 inferred concepts, whereas the test KB contains 3 inferred concepts. It also sometimes fails to complete the materialised keyspace: java.lang.IllegalArgumentException: Neither the sideEffects, map, nor path has a #1594749248821829-key: WherePredicateStep(eq(#1594749248821829)) at org.apache.tinkerpop.gremlin.process.traversal.step.Scoping.getScopeValue(Scoping.java:124) ...

Unidentified issue 4

Given for each session, graql define
      """
      define

      iceland-sells-drinks sub rule,
      when {
        $x isa soft-drink;
      },
      then {
        $x has retailer 'Iceland';
      };

      poundland-sells-drinks sub rule,
      when {
        $x isa soft-drink;
      },
      then {
        $x has retailer 'Poundland';
      };

      londis-sells-drinks sub rule,
      when {
        $x isa soft-drink;
      },
      then {
        $x has retailer 'Londis';
      };
      """
    Given for each session, graql insert
      """
      insert
      $x isa soft-drink, has name "Fanta", has ref 0;
      $y isa soft-drink, has name "Tango", has ref 1;
      """
    When materialised keyspace is completed
    Then for graql query
      """
      match
        $x has retailer $rx;
        $y has retailer $ry;
        $rx == $ry;
        $ry contains 'land';
      get;
      """

This fails the correctness check -

Resolution query had 0 answers, it should have had 1. The query is:
 match { $r0-rx "Iceland" isa retailer; $r1-x has retailer $r1-1594203238859810; $r0-ry "Iceland" isa retailer; $x5 (owner: $r1-x) isa has-attribute-property, has retailer $r1-1594203238859810; $r0-y has ref 0; $r1-x has ref 0; $r0-x has retailer $r0-rx; $r0-x has ref 1; $r1-x isa soft-drink; $r1-x has ref 1; $r0-ry contains "land"; $_ (body: $x2, head: $x3) isa resolution, has rule-label "iceland-sells-drinks"; $r0-y has retailer $r0-ry; $r1-x has retailer $r1-1594203238859788; $x3 (owner: $r1-x) isa has-attribute-property, has retailer $r1-1594203238859788; $_ (body: $x4, head: $x5) isa resolution, has rule-label "iceland-sells-drinks"; $x2 (instance: $r1-x) isa isa-property, has type-label "soft-drink"; $x4 (instance: $r1-x) isa isa-property, has type-label "soft-drink"; }; get;

Unidentified issue 5

Given for each session, graql define
      """
      define

      iceland-sells-drinks sub rule,
      when {
        $x isa soft-drink;
      },
      then {
        $x has retailer 'Iceland';
      };

      poundland-sells-drinks sub rule,
      when {
        $x isa soft-drink;
      },
      then {
        $x has retailer 'Poundland';
      };

      londis-sells-drinks sub rule,
      when {
        $x isa soft-drink;
      },
      then {
        $x has retailer 'Londis';
      };
      """
    Given for each session, graql insert
      """
      insert
      $x isa soft-drink, has name "Fanta", has ref 0;
      $y isa soft-drink, has name "Tango", has ref 1;
      """

The scenario then fails with: The complete KB contains 0 inferred concepts, whereas the test KB contains 3 inferred concepts.

Unidentified issue 6

    Given for each session, graql define
      """
      define
      person has age;
      age sub attribute, value long;
      not-ten sub rule,
      when {
        $x isa person;
        not { $x has age 10; };
      }, then {
        $x has name "Not Ten";
      };
      """
    Given for each session, graql insert
      """
      insert
      $x isa person, has age 10;
      $y isa person, has age 20;
      """
    When materialised keyspace is completed
    Then for graql query
      """
      match $x has name "Not Ten", has age 20; get;
      """
    Then all answers are correct in reasoned keyspace

The last step should pass, but instead it fails to find any answers.
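The expected answer can be worked out by hand with a small model of the rule. The sketch below is plain Python, not Graql or the test harness; the dictionary layout and the names `apply_not_ten` and `query` are illustrative assumptions.

```python
# Hypothetical in-memory model of the two inserted persons (not the TypeDB API).
people = [
    {"id": "x", "age": {10}, "name": set()},
    {"id": "y", "age": {20}, "name": set()},
]

def apply_not_ten(people):
    """Mimic the not-ten rule: a person without age 10 gains the name 'Not Ten'."""
    for person in people:
        if 10 not in person["age"]:
            person["name"].add("Not Ten")
    return people

def query(people):
    """Mimic: match $x has name "Not Ten", has age 20; get;"""
    return [p["id"] for p in people if "Not Ten" in p["name"] and 20 in p["age"]]

answers = query(apply_not_ten(people))
print(answers)  # ['y']
```

Under this model the query has exactly one answer, the person with age 20, so an empty result from the reasoned keyspace is not what the rule implies.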

Unidentified issue 7

    Given for each session, graql define
      """
      define
      dominion sub relation, relates ruler, relates ruled-person;
      giant-turtle sub entity, plays ruler;
      person plays ruled-person;

      giant-turtles-rule-the-world sub rule,
      when {
        $r (ruled-person: $p) isa dominion;
        $gt isa giant-turtle;
      }, then {
        $r (ruler: $gt) isa dominion;
      };
      """
    Given for each session, graql insert
      """
      insert
      $x isa person;
      $y isa person;
      $z isa giant-turtle;

      (ruled-person: $x) isa dominion;
      (ruled-person: $y) isa dominion;
      """
    When materialised keyspace is completed
    Then for graql query
      """
      match
        (ruled-person: $x, ruler: $y) isa dominion;
      get;
      """
    Then all answers are correct in reasoned keyspace

The last step should find answers, but it fails to find any.

Unidentified issue 8

From Scenario: rules can divide entities into groups, linking each entity group to a specific concept by attribute value in value-predicate.feature:

    Given for each session, graql define
      """
      define

      soft-drink plays priced-item;

      price-range sub attribute, value string,
        plays price-category;

      price-classification sub relation,
        relates priced-item,
        relates price-category;

      expensive-drinks sub rule,
      when {
        $x has price >= 3.50;
        $y "expensive" isa price-range;
      }, then {
        (priced-item: $x, price-category: $y) isa price-classification;
      };

      not-expensive-drinks sub rule,
      when {
        $x has price < 3.50;
        $y "not expensive" isa price-range;
      }, then {
        (priced-item: $x, price-category: $y) isa price-classification;
      };

      low-price-drinks sub rule,
      when {
        $x has price < 1.75;
        $y "low price" isa price-range;
      }, then {
        (priced-item: $x, price-category: $y) isa price-classification;
      };

      cheap-drinks sub rule,
      when {
        (priced-item: $x, price-category: $y) isa price-classification;
        $y "not expensive" isa price-range;
        (priced-item: $x, price-category: $y2) isa price-classification;
        $y2 "low price" isa price-range;
        $y3 "cheap" isa price-range;
      }, then {
        (priced-item: $x, price-category: $y3) isa price-classification;
      };
      """
    Given for each session, graql insert
      """
      insert

      $x isa soft-drink, has name "San Pellegrino Limonata", has price 3.99;
      $y isa soft-drink, has name "Sprite", has price 2.00;
      $z isa soft-drink, has name "Tesco Value Lemonade", has price 0.39;

      $p1 "expensive" isa price-range;
      $p2 "not expensive" isa price-range;
      $p3 "low price" isa price-range;
      $p4 "cheap" isa price-range;
      """
    When materialised keyspace is completed
    Then for graql query
      """
      match
        $x "not expensive" isa price-range;
        ($x, priced-item: $y) isa price-classification;
      get;
      """
    Then all answers are correct in reasoned keyspace
    Then answer size in reasoned keyspace is: 2
    Then for graql query
      """
      match
        $x "low price" isa price-range;
        ($x, priced-item: $y) isa price-classification;
      get;
      """
    Then all answers are correct in reasoned keyspace
    Then answer size in reasoned keyspace is: 1
    Then for graql query
      """
      match
        $x "cheap" isa price-range;
        ($x, priced-item: $y) isa price-classification;
      get;
      """
#    Then all answers are correct in reasoned keyspace

The last step should succeed, but the resolution query returns no answers.
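As a cross-check on the expected answer sizes, the four rules can be evaluated by hand. This is a minimal Python sketch, assuming drinks are identified by name and each price-range value maps to a set of drinks; the `classify` function and the data layout are illustrative, not part of the test framework.

```python
# Hypothetical model of the inserted drinks: name -> price.
drinks = {
    "San Pellegrino Limonata": 3.99,
    "Sprite": 2.00,
    "Tesco Value Lemonade": 0.39,
}

def classify(drinks):
    """Return {price-range value: set of drink names} after applying the four rules."""
    ranges = {"expensive": set(), "not expensive": set(), "low price": set()}
    for name, price in drinks.items():
        if price >= 3.50:          # expensive-drinks
            ranges["expensive"].add(name)
        if price < 3.50:           # not-expensive-drinks
            ranges["not expensive"].add(name)
        if price < 1.75:           # low-price-drinks
            ranges["low price"].add(name)
    # cheap-drinks: classified as both "not expensive" and "low price".
    ranges["cheap"] = ranges["not expensive"] & ranges["low price"]
    return ranges

ranges = classify(drinks)
for label in ("not expensive", "low price", "cheap"):
    print(label, len(ranges[label]))
```

The first two counts (2 and 1) agree with the answer-size steps in the scenario. In this model the "cheap" range contains one drink, Tesco Value Lemonade, so the third query would be expected to have an answer.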

Unidentified issue 9

From Scenario: the relation type constraint can be excluded from a reasoned match query in relation-inference.feature:

    Given for each session, graql define
      """
      define
      transitive-location sub rule,
      when {
        (location-subordinate: $x, location-superior: $y) isa location-hierarchy;
        (location-subordinate: $y, location-superior: $z) isa location-hierarchy;
      }, then {
        (location-subordinate: $x, location-superior: $z) isa location-hierarchy;
      };
      """
    Given for each session, graql insert
      """
      insert
      $x isa place, has name "Turku Airport";
      $y isa place, has name "Turku";
      $z isa place, has name "Finland";

      (location-subordinate: $x, location-superior: $y) isa location-hierarchy;
      (location-subordinate: $y, location-superior: $z) isa location-hierarchy;
      """
    When materialised keyspace is completed
    Then for graql query
      """
      match
        $a isa place, has name "Turku Airport";
        ($a, $b);
        $b isa place, has name "Turku";
        ($b, $c);
      get;
      """
    Then all answers are correct in reasoned keyspace

The last step should pass, but in fact it returns the wrong number of answers: the materialised keyspace has 4 answers, but it should have had 2.
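The expected count of 2 can be reproduced with naive forward chaining over the inserted data. The Python sketch below is an assumption-laden model: places are identified by name, a relation is an ordered (subordinate, superior) pair, `($a, $b)` is read as "some relation has $a and $b as two distinct role players", and answers are counted as distinct bindings of $a, $b and $c. The function names are illustrative.

```python
from itertools import permutations

# (location-subordinate, location-superior) pairs inserted by the scenario.
facts = {("Turku Airport", "Turku"), ("Turku", "Finland")}

def transitive_closure(pairs):
    """Naive forward chaining of the transitive-location rule to a fixed point."""
    closed = set(pairs)
    while True:
        derived = {(x, z) for (x, y1) in closed for (y2, z) in closed if y1 == y2}
        if derived <= closed:
            return closed
        closed |= derived

def linked(relations):
    """All ordered pairs (p, q) of distinct role players sharing a relation."""
    return {pq for rel in relations for pq in permutations(rel, 2)}

relations = transitive_closure(facts)
pairs = linked(relations)

# match $a has name "Turku Airport"; ($a, $b); $b has name "Turku"; ($b, $c); get;
answers = {
    (a, b, c)
    for (a, b) in pairs
    for (b2, c) in pairs
    if a == "Turku Airport" and b == "Turku" and b2 == b
}
print(len(answers))  # 2
```

The two bindings for $c are "Turku Airport" and "Finland"; the inferred relation between Turku Airport and Finland does not involve Turku, so it adds no further answers in this model.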

Unidentified issue 10

From Scenario: inferred relations can be filtered by shared attribute ownership in relation-inference.feature:

    Given for each session, graql define
      """
      define
      selection sub relation, relates choice1, relates choice2;
      person plays choice1, plays choice2;
      symmetric-selection sub rule,
      when {
        (choice1: $x, choice2: $y) isa selection;
      }, then {
        (choice1: $y, choice2: $x) isa selection;
      };
      transitive-selection sub rule,
      when {
        (choice1: $x, choice2: $y) isa selection;
        (choice1: $y, choice2: $z) isa selection;
      }, then {
        (choice1: $x, choice2: $z) isa selection;
      };
      """
    Given for each session, graql insert
      """
      insert
      $x isa person, has name "a";
      $y isa person, has name "b";
      $z isa person, has name "c";

      (choice1: $x, choice2: $y) isa selection;
      (choice1: $y, choice2: $z) isa selection;
      """
    When materialised keyspace is completed
    Then for graql query
      """
      match
        (choice1: $x, choice2: $y) isa selection;
        $x has name $n;
        $y has name $n;
      get;
      """
    Then all answers are correct in reasoned keyspace

Firstly, materialisation takes about 4 minutes, which is far too long.
Secondly, the 'all answers are correct' step takes a very long time to execute.
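For scale, the logical closure of the two rules over this data is very small. The Python sketch below computes it by naive forward chaining, assuming a selection is modelled as a distinct (choice1, choice2) pair of person names; the materialised keyspace may store more relation instances than this model counts.

```python
# (choice1, choice2) pairs inserted by the scenario, identified by person name.
facts = {("a", "b"), ("b", "c")}

def closure(pairs):
    """Apply symmetric-selection and transitive-selection until nothing new is derived."""
    closed = set(pairs)
    while True:
        symmetric = {(y, x) for (x, y) in closed}
        transitive = {(x, z) for (x, y1) in closed for (y2, z) in closed if y1 == y2}
        new = (symmetric | transitive) - closed
        if not new:
            return closed
        closed |= new

selections = closure(facts)

# match (choice1: $x, choice2: $y) isa selection; $x has name $n; $y has name $n; get;
# Names are unique here, so sharing a name means $x and $y are the same person.
same_name = {(x, y) for (x, y) in selections if x == y}
print(len(selections), len(same_name))  # 9 3
```

With only 9 distinct pairs and 3 matching answers in this model, the size of the result alone does not account for the running time.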

Unidentified issue 11

Scenario: when evaluating negation blocks, global subgoals are not updated

    The test highlights a potential issue with eagerly updating global subgoals when branching out to determine whether
    negation conditions are met. When checking negation satisfiability, we are interested in a first answer that can
    prove us wrong - we are not exhaustively exploring all answer options.

    Consequently, if we use the same subgoals as for the main loop, we can end up with a query whose answers weren't
    fully consumed but which was marked as visited.

    As a result, if a negated query has multiple answers and is visited more than once because of the admissibility
    check, answers might be missed.

    Given for each session, graql define
      """
      define

      session sub entity,
          plays parent-session;
      fault sub entity,
          plays relevant-fault,
          plays identified-fault,
          plays diagnosed-fault;
      question sub entity,
          has response,
          plays identifying-question,
          plays question-logged,
          plays question-not-answered;

      response sub attribute, value string;

      reported-fault sub relation,
          relates relevant-fault,
          relates parent-session;

      logged-question sub relation,
          relates question-logged,
          relates parent-session;

      unanswered-question sub relation,
          relates question-not-answered,
          relates parent-session;

      fault-identification sub relation,
          relates identifying-question,
          relates identified-fault;

      diagnosis sub relation,
          relates diagnosed-fault,
          relates parent-session;


      no-response-means-unanswered-question sub rule,
      when {
          $ques isa question;
          (question-logged: $ques, parent-session: $ts) isa logged-question;
          not {
              $ques has response $r;
          };
      }, then {
          (question-not-answered: $ques, parent-session: $ts) isa unanswered-question;
      };

      determined-fault sub rule,
      when {
          (relevant-fault: $flt, parent-session: $ts) isa reported-fault;
          not {
              (question-not-answered: $ques, parent-session: $ts) isa unanswered-question;
              ($flt, $ques) isa fault-identification;
          };
      }, then {
          (diagnosed-fault: $flt, parent-session: $ts) isa diagnosis;
      };
      """
    Given for each session, graql insert
      """
      insert
      $sesh isa session;
      $q1 isa question;
      $q2 isa question;
      $f1 isa fault;
      $f2 isa fault;
      (relevant-fault: $f1, parent-session: $sesh) isa reported-fault;
      (relevant-fault: $f2, parent-session: $sesh) isa reported-fault;

      (question-logged: $q1, parent-session: $sesh) isa logged-question;
      (question-logged: $q2, parent-session: $sesh) isa logged-question;

      (identified-fault: $f1, identifying-question: $q1) isa fault-identification;
      (identified-fault: $f2, identifying-question: $q2) isa fault-identification;
      """
    When materialised keyspace is completed
    Then for graql query
      """
      match (diagnosed-fault: $flt, parent-session: $ts) isa diagnosis; get;
      """
    Then answer size in reasoned keyspace is: 0
    Then answers are consistent across 5 executions in reasoned keyspace
#    Then materialised and reasoned keyspaces are the same size

The final check fails because: The complete KB contains 15 inferred concepts, whereas the test KB contains 13 inferred concepts.
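The expected answer size of 0 can be confirmed by evaluating the two rules in order. This is a minimal Python sketch using hypothetical set-based data structures, not the reasoner's internals; each negation check is evaluated independently, with no shared "visited" state between checks.

```python
# Hypothetical model of the inserted data.
responses = {}  # question -> response; empty because no question has a response
logged = {("q1", "sesh"), ("q2", "sesh")}       # (question-logged, parent-session)
reported = {("f1", "sesh"), ("f2", "sesh")}     # (relevant-fault, parent-session)
identification = {("f1", "q1"), ("f2", "q2")}   # (fault, question)

# no-response-means-unanswered-question
unanswered = {(q, s) for (q, s) in logged if q not in responses}

def blocked(fault, session):
    """True if an unanswered question in this session identifies the fault.

    any() stops at the first counterexample, which is all a negation check needs.
    """
    return any((fault, q) in identification for (q, s) in unanswered if s == session)

# determined-fault
diagnoses = {(f, s) for (f, s) in reported if not blocked(f, s)}
print(len(unanswered), len(diagnoses))  # 2 0
```

Both faults are blocked by an unanswered identifying question, so no diagnosis is inferred, which matches the "answer size in reasoned keyspace is: 0" step.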

Complete Axiomatic Tests - implement missing Graql query scenarios

The BDD scenarios currently cover the following relatively comprehensively (except verification/errors that should throw, as in #11):

undefine
define
insert
match
delete

The scenario names of some missing tests are already written in the respective feature files: match, get and insert all have scenarios awaiting implementation.

I will supervise as needed when completing this task.

Rename "user expiry-seconds" step

Problem to Solve

We have a step "user expiry-seconds" that should check whether the user's password expiration field is set, but this is not clear from the step name.

Proposed Solution

Rename this step to something more meaningful, such as "user has password expiry". Then rename it in all clients and verify that each performs the expected check (java-client appears not to).

Error type testing

Problem to Solve

We should test whether our Graql queries throw semantic, syntax, or other exceptions.

Current Workaround

The BDD tests, which do not support error type testing at present, test that an error was thrown, but do not inspect the error type to verify that it is correct.
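The difference between checking that an error was thrown and checking its type can be sketched as follows. This is illustrative Python only: `GraqlSyntaxError`, `GraqlSemanticError`, `run_query` and `error_type_of` are hypothetical names invented for the example and do not correspond to any client API.

```python
class GraqlSyntaxError(Exception):
    """Hypothetical error type for malformed queries."""

class GraqlSemanticError(Exception):
    """Hypothetical error type for well-formed queries that violate the schema."""

def run_query(query):
    """Stand-in for a client call; raises a typed error for two toy cases."""
    if not query.rstrip().endswith(";"):
        raise GraqlSyntaxError("query must end with ';'")
    if "isa unknown-type" in query:
        raise GraqlSemanticError("type 'unknown-type' is not defined")
    return []

def error_type_of(query):
    """Return the class of the exception raised by the query, or None if it succeeds."""
    try:
        run_query(query)
    except Exception as error:
        return type(error)
    return None

# Current behaviour of the tests: only "some error was thrown" is asserted.
assert error_type_of("match $x isa person") is not None

# Proposed behaviour: the error type itself is asserted.
assert error_type_of("match $x isa person") is GraqlSyntaxError
assert error_type_of("match $x isa unknown-type; get;") is GraqlSemanticError
assert error_type_of("match $x isa person; get;") is None
```

A step that only asserts "an error was thrown" would pass even if a syntax error were reported for a semantically invalid query; asserting the type catches that case.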

Move resolution tests to negation feature where applicable

Some of our tests are labelled with "TODO: move to negation.feature".

This should be done once resolution tests are finished.
