Giter Club home page Giter Club logo

verifiable-tests's Introduction

Verifiable Code-Writing Tests for CS Students

This application attempts to apply formal methods (D. Gries style) to specification and verification of code-writting tasks for CS students.

It is still under development, but you can already use it for all the task examples we have and the other ones you come up with (see task files in /tasks for example of a task structure).

Steps You Need to Get it Runnging

  1. Install Node.js.

  2. Clone repository with

    git clone https://github.com/harrytallbelt/verifiable-tests.git

    or just download and unzip it.

  3. Enter the repo directory:

    cd verification-tests

  4. Install dependencies:

    npm install

  5. Run it with

    node server.js

    and check out localhost:3000 in your browser.

  6. If you want to run on a different interface or choose a different port, set VTESTS_HOST and VTESTS_PORT environment variables.

verifiable-tests's People

Contributors

harrytallbelt avatar

Stargazers

 avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Forkers

alkhizha

verifiable-tests's Issues

doTheoremPt6 leaves commands[] empty

doTheoremPt6 leaves commands[] of a loop branch empty
if the branch contains nothing but loops. Try to repeat (there will be
a null dereference at some point):

do i <> n ->
  do j <> m -> skip od
od

Invariant-variant pairs as objects

Shouldn't we store invariant-variant pairs as objects and keep
an array of these objects as a task description object field?
(Currently invariants and variants are two separate array-fields of a task.)

Document axiom triggers

Comment and add examples (of what is being prsed and what's not) to all axiom triggers, as they contain a bit tricky code.

Keeping track of the number of loops

Currently we do not support errors of too many or too few loops
in a program code. (Task description assumes a specific number
of loops.)
In the first case, it will probably result in a nasty server error
at some point.

Custom axiom parsing code support

Some axioms require changing the actual code of the system
(see to-simplify-syntax.js, wp.js, and PredicatesRepresentationBuilder.js),
which is not a good thing. This can be solved by allowing the user (teacher)
to write their own parsing code. This way a new axiom definition will contain:

  1. the actual Simplify definition file,
  2. JS file with a function for recognition of the parts of the predicate,
    relevant to the defined axiom use.

Reusable SUM, PROD, and COUNT quantifiers

We cannot define SUM, PROD, and COUNT quantifiers for all possible
inner expression/predicate structure, because Simplify does not support
function- and predicate-variables. For example (SUM k : Q(k) : a[k])
cannot be defined with the same axiom as (SUM k : Q(k) : a[k]*b[k]).

To solve this we might change any quantifier (SUM k : Q(k) : f(k, <outer-var-list>))
into (SUM k : Q(k) : @a[k]) while adding a condition that (A k : Q(k) : @a[k] = f(k, ...)).
This way we can only define SUM, PROD and COUNT for an array.
(In fact, COUNT can be defined as (N k : Q(k) : P(k, ...)) == (SUM k : Q(k) : a[k]),
where (A k : Q(k) : P(k, ...) ^ @a[k] = 1 | ~P(k, ...) ^ @a[k] = 0).)

One complication might be name generation (for @a), because the id format
for Simplify (Modula-3) is the same as for our predicates, so we cannot use any
symbols, we know wouldn't be in the predicate. If we generate a random name,
we'll probably still want to check if it isn't already used in predicate. We also can
concatenate all the names in predicate (UPD: or just find the longest one and
make sure the generated name is longer) and add a number in the end for each
new array. It might not be a big part of the task, as we will (probably) need
to change the implication premises.

The last issue is does this all mean that our system becomes extremely Simplify-oriented?
UPD: This is probably not an issue, because all the feature code will be in
the exact part of the system responsible for working with Simplify.

DTO docs

The program makes use of DTO objects for parsed programs, predicates and WP context objects, but generally lacks documentation on those.

Interface support for multiple loops

The current take on UI assumes only one loop in the program, but the backend implements the generalised approach for any number of loops. The UI needs to become aware of that and new examples should be added.

wp.js/withoutLoops bug

In wp.js, function withoutLoops does not remove loops from inside if statements, but the code, apparently, expects it to.

Run appropriate Simplify binaries

Currently Windows Simplify binary is run regardless of the user's system (it works on Windows and on Linux via wine). But there are also Simplify binaries for Linux and MacOS, so we can choose which one to use, depending on the OS.

The binaries can be found in bin/versions or downloaded from via the link on this page.

Asynchronous verification script

There is the possibility that input code verification might take too much time for a node.js server. This should be researched and tested, and, if needed, we might want to run the script asynchronously or, better yet, in a separate thread.

The reason this might be irrelevant is because the script itself handles file readings and prover callings asynchronously, so it might not take such an amount of interpreting time in one go. (The prover calling is not implemented on the moment of issue firing, but there seems to be no way it would be synchronous in node.js and we certainly do not want this.).

UPD: Apparently the way to do it is to call setTimeout for 0 ms.

Incorrect handling of IF statements in pt.6 of DO proofs

When proving a DO statement, we toss off all the inner loops of a loop.
This leads to IF statements proving problems, because we lose the conditions,
provided by the inner loop guards negations:
For example consider a loop body

x := -1;
do x < 0 -> ... od;
if x < 0 -> abort | ... fi

By throwing the inner loop out, we lose the knowledge that x will never be < 0 at IF.

We already have problems of this sort, for example proving matrix_search
with an incorrect inner loop initialisation command j := 1 yields not only the
correct error, but also errors in loop finalisation IF statement.

Custom definitions support

Can we allow the user (teacher) to add their own definitions to simplify description of the task? For example

{
  "precondition": "n = N && x = X && equalArrays(a, A) && contains(A, X, 0, N-1)",
  ...
  "definitions": [
  {
    "name": "contains",
    "arguments": [ "a", "x", "from", "to" ],
    "definition": "(E k : from <= k <= to : a[k] = x)"
  },
  ...
  ]
  ...
}

Would this be consistent with perm - a predicate that we do not define, only describe some of its properties?

Precompiled Handlebars templates

Handlebars docs have a note about precompiled templates here, but it seems to talk about client side. Anyway it is probably possible to reuse compiled templates on server side (in the end, it can be a singleton object).

Add bubble sort

Apparently bubble sort is not implemented, because we thought inversion count would be hard to implement, but it wouldn't (see axioms/array-perm).

Perm predicate handling

After solving #28 and #30 we should be able to throw out perm predicate from the predicate grammar and the system code.

Record types

It seems the record types can be implemented pretty easily. The name of the record field being referenced would just be one more type of selector. Then, analogous to arrays, a record can be thought of like a function: field-name -> field-value.

For arrays these two lines are the same.
a[n] := a[m]
a := (a; n:a[m])

For records these two lines are the same.
john.age := samuel.age
john := (john; age:samuel.age)

Simplify supports this kind of operations via its theory of maps.

Script for testing on all the tasks

  • Store all the solutions somewhere (a secret gist?) in json format {"name":"solution"}.
  • Write a script that tests a list of task description files and outputs the failing predicates for each of them.

Syntax

node test.js SOLUTION.json [TASK_0.json ... TASK_N.json]

Calling example

node testing-utils/test.js testing-utils/solutions.json `ls tasks/*.json` | tee out.txt

Output format

; TASKNAME, TASKFILE.json
; ------------------------
[AXIOMS]
; FAILING_PREDICATE_IN_PREDICATE_SYNTAX
SIMPLIFY_FAILING_PREDICATE

Output example

; array_sum, sum.json
; ------------------------

(BG_PUSH
  (FORALL (i j b) (PATS (sum i j b))
    (AND
      (IMPLIES
        (> i j)
        (EQ (sum i j b) 0) )
      (IMPLIES
        (<= i j)
        (AND
          (EQ
            (sum i j b)
            (+ (select b j) (sum i (- j 1) b)) )
          (EQ
            (sum i j b)
            (+ (select b i) (sum (+ i 1) j b)) ))))))

; n=N && (A k : 0 <= k < N : b[k]=B[k]) =>
;   n=N && (A k : 0 <= k < N : b[k]=B[k]) &&
;   0<=1<=N && s=(SUM k : 0<=k<1 : B[k])
(IMPLIES
  (AND
    (EQ n N)
    (FORALL (k)
      (IMPLIES
        (AND (<= 0 k) (< k N))
        (EQ (select b k) (select B k)))))
  (AND
    (EQ n N)
    (FORALL (k)
      (IMPLIES
        (AND (<= 0 k) (< k N))
        (EQ (select b k) (select B k))))
    (<= 0 1)
    (<= 1 N)
    (FORALL (k)
      (IMPLIES
        (AND (<= 0 k) (< k 1))
        (EQ s (sum B 0 (- 1 1)))))))

exec for Windows

In prove.js quotation marks in exec's argument command are rendered in cmd.

We should test the code on Windows.

Keeping track of invariant and bondary functions

The current WP implementation does not behave well with nested loops and if's.
When branches of a loop or an if are considered one-by-one, each of them is given
the same innerSpec object. Those should actually be different, because WP might
remove some number of invariants and boundary functions on each of branches
and we will need to give the updated spec to the next branch proof.

There is a possibility that it might just work in a hacky way the way it is now.
That is because we do not copy specs and effectively all branches have the same
spec object. When map is run over these objects, WP will remove the invariants and
boundary functions just in the right order. Anyway, this is way to subtle, and frankly
just a coincidence.

UPT from 27.04.2017
For multiple loops we also need to differentiate t's of each loop (used to prove pt. 5 of the loop theorem).

Incorrect picking of loop invariant and variant pair

Choosing loop variant and invariant is currently done by taking the first element
of the arrays (using shift), which won't work for several sequential loops. On the
other hand, taking the last pair (with pop) will work for sequential loops, but won't
for the nested ones.

Taking the first loop in the sequence (instead of the last one) might solve this.
Otherwise, the way invariant-variant pairs are saved needs to be reconsidered.

Use linked lists

The current take on program specification uses arrays to store invariants, boundary functions, result predicates, and context objects. (Double?) linked list seem to better suit this task, because we do not need random access, but do want persistent slices when adding/removing first/last elements.

The program representation also uses arrays, while it probably can also benefit from lists.

The suggested package is this one: https://www.npmjs.com/package/yallist

Note that:

  • this is not performance critical right now,
  • but may result in a better looking code (without so much array copying).

Antlr parser errors

Currently antlr parser errors aren't handled. We need to add support for syntactic errors and notify the user in case they make one.

UPD: We might want to reuse the error format antlr uses for errors in ProgramDtoBuilder.js (search for TODO).

UPD: Look into antlr semantic predicates to check if an assignment statement has identica lvalues (probably won't work out, but still).

Localisation

Considering the project origins, it seems sensible to add support for Russian and Ukrainian languages.

Task syntax checking script

Implement a script that

  • checks predicate syntax without shorthands
  • checks shorthands
  • applies shorthands and checks resulting predicate syntax
  • checks axioms somehow ???

Identical lvalues in assignment

We claim that using the same lvalues is a syntax error, but it is definitely a semantic one, because we cannot check all these options:

a[i + 1], a[1 + i] := 123, 321
x := 1;
a[i + 1], a[i + x] := 123, 321
x := 0;
do k <> n && a[k] <> val ->
  x := x + 1
od;
a[i + 1], a[i + x] := 123, 321  // sometimes x will be 1 and sometimes not

We should probably throw out the code, that checks that lvalues are identical (it is obviously wrong) and just state a specific order in which the assignment is performed (left to right, for example).

Note: this makes errors in ProgramRepresenatationBuilder class not needed anymore.

Generating PATS forbids usage of unique bound variables

Fixing #47 we used an axiom trigger to generate PATS clojure, but the callback in the trigger cannot use getUniqueName() function and so uses the unchanged bound variable, which is prone to errors.

Maybe, in trigger, we should provide a way of getting the standard conversion result.

Predicate rendering problems

  • We should render parentheses, because they are needed sometimes: a && (x => y).
  • We should find a way to indent predicates, as they can be quite long.

Implement predicate parser

We would like to parse predicates from mathematical notation. It should also be documented (connected to #8).

UPD: quirks of predicate notation we'll have to do something about (either implemented or ban from usage)

  • B[0:n-1], array slices; (ruled out)
  • perm((x,y,z), (X,Y,Z)), permutations of vectors;
  • x, y = X, Y, vector equalities;
  • division (used in "reverse" task), (ruled out)
  • a <= x <= b, range defining comparisons;
  • b = B, array equivalence; (ruled out)
  • sum(A,a,b), Contains(A,a,b,x), functions and predicates defined via axioms. (ruled out)

Storing number constants as numbers vs strings

Currently, when parsing pseudocode, the number constants are converted to int's which might in some cases cause overflow. To prevent that, we may store int constants as strings, but it isn't clear if we should.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.