datacamp / sqlwhat Goto Github PK

View Code? Open in Web Editor NEW

3.0 24.0 8.0 1.28 MB

Home Page: https://sqlwhat.readthedocs.io

License: GNU Affero General Public License v3.0

Python 99.56% Makefile 0.44%

fs le

sqlwhat's Introduction

sqlwhat

sqlwhat enables you to write Submission Correctness Tests (SCTs) for interactive SQL exercises on DataCamp.

If you are new to teaching on DataCamp, check out https://authoring.datacamp.com.
If you want to learn what SCTs are and how they work, visit this article specifically.
For more information about writing SCTs for SQL exercises, consult https://sqlwhat.readthedocs.io.

Installing

pip install sqlwhat     # install from pypi
make install            # install from source

Reference

API Docs: https://sqlwhat.readthedocs.io
AST viewer: https://ast-viewer.datacamp.com
Extensions: https://github.com/datacamp/sqlwhat-ext

Raising issues with how SQL is parsed

Please raise an issue on the respsective parser repo:

Basic Use

from sqlwhat.State import State    # State holds info needed for tests
from sqlwhat.Reporter import Reporter
from sqlwhat.checks import *       # imports all SCTs
from sqlalchemy import create_engine

code = "SELECT * FROM artists WHERE id < 100"

state = State(
    student_code = code,
    solution_code = code,
    pre_exercise_code = "",
    student_conn = create_engine('sqlite:///'),
    solution_conn = create_engine('sqlite:///'),
    student_result = {'id': [1,2,3], 'name': ['greg', 'jon', 'martha']},
    solution_result = {'id': [1,2,3], 'name': ['toby', 'keith', 'deb']},
    reporter = Reporter()
    )

# test below passes, since code is equal for student and solution
has_equal_ast(state)

# test below raises a TestFail error, since 'name' col of results
# doesn't match between student and solution results
check_result(state)
# shows error data
state.reporter.build_payload()

# can also be done using a chain
from sqlwhat.sct_syntax import Ex
Ex(state).check_result()

Running unit tests

pytest -m "not backend"

If you also want to run the backend tests, you need to set a GITHUB_TOKEN environment variable with access to the (private) sqlbackend repository. After this, you can:

make install
pytest

Rules of testing

Running queries is the backend's job
If a test doesn't run queries, it doesn't need the backend
Very few tests should run queries

Building Docs

Install sqlwhat and run ..

cd docs
make html

sqlwhat's People

Contributors

Stargazers

Watchers

Forkers

optionalg fossabot codecamp akafle1003 fagan2888 4rtess ra1ze505 shraysalvi

sqlwhat's Issues

Implement basic messaging system

generic pattern for AST libraries to report human friendly name of AstNode instances
sqlwhat only uses most recent message for now, relies on highlighting for context. (e.g. rather than "Check the 1st Select statement. Check the WHERE clause. Did you _____?", Use highlighting + "Did you ____?".
sqlwhat uses human friendly names when can't find node or field (e.g. "Did you define a {ordinal} {node_name}" -> "Did you define a 2nd SELECT statement?")

Move student query errors to state, rather than reporter

Developer friendly syntax error message

where -> check_result - "incorrect third condition in WHERE?"

In general, producing a statement like this is very hard. It's unclear how this should be done. For reference, a WHERE clause might look like this...

(Note that we essentially run into the same problem here as for test_operator in pythonwhat.)

move out postgres parser and put it on pypi

Currently, we use travis to build sqlwhat (by generating the antlr python targets). However, if we move the parser out, then we can move all the building to the parser, which will likely be updated less frequently than sqlwhat itself. This will also help for creating a small app to visualize the AST, since it and sqlwhat can just depend on the parser.

use subset of unit tests to examine feedback messages

Many tests have the form

def test_test_has_columns_fail():
    state = prepare_state({'a': [1,2,3]}, {})
    with pytest.raises(TF): cr.test_has_columns(state)

For some of them, it would be useful to optionally view their feedback message. Could do this by doing something like

@pytest.mark.feedback
def test_something():
    ....

Then they can be viewed by running from command-line...

pytest -s -m "feedback"    # plus some meta-parameter telling tests to print feedback

test_correct

should be able to pull from pythonwhat

SCTErrors and TestFail should indicate which SCT raised the exception

SCT chains can store calls as actions, which can be put together in string form on fail.

Cases:

if SCT uses test_column(), but no results

replace ast list with ListNode

Currently, when you get the target_list of a SelectStmt, you get a literal python list. That's fine for comparing currently, since things like has_equal_ast use repr to compare AST representations. However, it could cause problems down the road.

override_sql function

has_equal_ast will be equivalent to using override_sql -> testing the asts are identical

use single dispatch to call different functions for check_result

plsql grammar - select statement may end with ';'

question about check_field with DISTINCT

Query

SELECT DISTINCT country
FROM films;

SCT

distinct = sel.check_field('pref').has_equal_ast(msg='blah')

sqlwhat-viewer

Error

Question

Has the field for DISTINCT changed to something else?

cc @ncarchedi

defer SCTs if not chained from Ex()

Swipe it from pythonwhattt

Adding test_ncols to check_result

Overview

Currently check_result() will pass if the columns in the student submission are some superset of the columns in the solution.

This means that

SELECT *
FROM films;

will pass, when the desired solution is:

SELECT title, release_year
FROM films;

Enhancement Request

Adding test_ncols() to check_result() so the number of columns in student submission must match the number of columns in the solution code. I would argue that this is desirable default behavior, but it's up for discussion: cc @ncarchedi

Code font in Incorrect Submission message

Enhancement request:

duration_days --> duration_days

code highlighting

check_clause - gets AST node attribute

can be skipped to get next statement (e.g. check select stmt -> check select stmt)
missing msg, if attribute is None
can be followed by has_equal_ast

raise NotImplemented error if has_equal_ast does not have a custom message.

This is since a good, robust message for has_equal_ast is going to take time, and exercise data to figure out.

check_result - "have you included all columns?"

Dump AST to dict (for JSON serialization)

Allow has_equal_ast to pass if student AST contains solution

Similar to pythonwhat. This will allow SCTs like

*** =solution

SELECT * 
FROM film
WHERE release_year = 2006
AND film_id < 5;

*** =sct


Ex().check_node('SelectStmt') \
    .check_field('where_clause') \
    .has_equal_ast(sql = "film_id < 5", start="expression", exact = False)

Note that we'll need to work out how to show CDs the names of parser rules in sqlwhat-viewer (so they can specify the start arg).

test_or function

Dialect specific handling of case sensitivity

Currently, to workaround tsql issues...

AST representations are converted to lowercase
Column names are converted to lowercase

In postgres, these are not issues because unquoted column names should be made lowercase in the AST (as the official parser does). This also means that column names tend to be lowercase, too. Using casing is possible, though, so we don't want to convert everything to lowercase for postgres.

column check incorrectly fails when there are no rows

Support row order independent checking in check_column

When checking columns, there should be an option to compare the columns without accounting for the order of the rows. This can be done brutally, by simply ordering the column and seeing what happens, or ordering by a certain column, and keeping the permutation to order other columns accordingly (which is more correct I think).
As this can raise the number of steps in checking quite a bit, I think you should make this opt-in to start with.

add LIMIT to grammar

e.g. SELECT * FROM artists LIMIT 10

check_result - automatic feedback "1 row when should be 2"

PEC

CREATE TABLE simple ( id INTEGER);
INSERT INTO simple (1) (2)

Solution

SELECT * FROM simple

Submission

SELECT * FROM simple LIMIT 1

Give feedback if blanks are left, even if they cause syntax error

Raised by @colinricardo in https://github.com/datacamp/courses-msft-transact-sql/issues/49#issuecomment-302792248

See screencast: https://v.usetapes.com/WayrkiARsE

Basically, adding blanks can have the exercise pass the SCT.

Ignore AST tests if parsing error in antlr

It's possible that students will submit code that is valid postgresql, but the parser can't handle. In this case..

It should raise (e.g.) a ParsingError, that is caught and short-circuits chaining
Things like checking results should still run

raise InstructorError exception for informative CD feedback

AST node -> position in raw submission text

for ctx = ast_node._ctx, should be able to use something like

start = ctx.start     # starting token
start.start              # it's first letter
start.stop
start.line

stop = ctx.stop      # stopping token
# etc..

(I'm not sure what happens if a token spans multiple lines, but also can't think of when that would happen)

Provide option in check_result to make case-insenstive

Feedback from Graeme, Microsoft contact for the transact-sql course:

Also, I noticed that I get an error based on the column name case not matching the expected answer (e.g. if I were to use the alias CustomerName when it expects Customername) – by default , Transact-SQL is case-insensitive so this should not cause an error. Can you fix that?

I suggest we put an option in check_result() and test_column() to be forgiving for case sensitivity. Pretty urgent.

Success message for plain MCE's is always the same in sqlwhat

@ncarchedi commented on Wed Apr 26 2017

Summary of Issue

In sqlwhat, the success message for PlainMultipleChoiceExercises is always Great work!, overriding whatever custom message has been specified by the author.

Steps to Reproduce

Go here and submit the correct answer. The success message should be different.

Screenshots/Links

@ramnathv commented on Thu Apr 27 2017

@filipsch @machow can you take a look at sqlwhat?

Mapping column aliases should work in clauses

If I understand the spec correctly people should be able to specify that these two options are both correct even when testing ASTs...

SELECT id FROM sometable ORDER BY id ASC

and

SELECT id as id2 FROM sometable ORDER BY id2 ASC

However, if the order by clause is incorrect in the second give feedback:

-- "incorrect ORDER BY clause"
SELECT id as id2 FROM sometable ORDER BY id2 DESC

implementation

This will require mapping identifiers from the solution onto the submission AST. An analogous case is mapping imports in pythonwhat. One solution is, like pythonwhat, keep a dictionary that maps each identifier and alias to the table it is referring to. E.g.

# where these are encountered in order
id as id2 -> sometable.id
id -> sometable.id
id2 -> sometable.id
sometable.id -> sometable.id

Note that this doesn't account for cases where an alias is given to, say, the result of COUNT or a CASE expression. For those situations it would just map to the corresponding COUNT or CASE ASTs...

# for CASE ... as id2
id2 -> CASE AST
# for COUNT(*) as count
count -> COUNT(*)

unit tests for AST parsing

this will let us fix mis-parsings quickly, and avoid regressions.

test_student_typed

related to #26 (since need to know where in raw text we are)

check_result option to be flexible on column name/order

Minimal example:

CREATE TABLE simple ( id INTEGER)
INSERT INTO simple (1)

Pass both SELECT id FROM simple and SELECT id as id2 FROM simple.

Realistic Example:

For example, both should pass:

SELECT 
    EXTRACT(MONTH FROM start_time) AS month, 
    COUNT(*) 
FROM trips 
GROUP BY month;

SELECT 
    EXTRACT(MONTH FROM start_time) AS month, 
    COUNT(*) as ttl
FROM trips 
GROUP BY month;

Complete AST parsing for most SELECT clauses

Most of these just involve removing the keyword

(bonus: HAVING, WINDOW, FETCH, FOR)

should sqlwhat fail student errors by default?

Allow test_column to specify tolerance, so floating point doesn't fail

For example:

AVG(duration) / 60 and AVG(duration / 60) should both be accepted.

Is the best way to do this using test_or() with test_student_typed()?

regex matching seemingly not working

Am I doing something wrong?

See screenshot below:

Test is on line 1071, and works fine when fixed=True but I want to check for and|AND.

Another example is here:

function names should reflect what is happening to state

Currently the three functions

check_result
check_statement
check_clause

all begin with check, but only the last two change state (by fetching a new AST node). It seems like we should change check_result, whose behavior will change depending on the state (e.g. called after check_statement('select', ...)) to test or test_result.

basic messages in all check_results subtests

Multiple feedback messages when submission has syntax error

The first comes from the backend, and is passed to test_exercise. The second is from running the SCT.

Should...

add test_error function, which fails if backend sent in an error.
ensure that if SCT completes, and there is already an error, then it fails and returns the original error message.

Note that this does not include the case where someone wants an exercise to succeed, even if there is a syntax error (we can always through in an allow_error function if it comes to it, seems unlikely).

parser does not have offset clause for select statement

Very basic sphinx api doc

make test to automatically fail sct

for debugging, highlight checking

ORDER BY parsing is incorrect

edit:
UNION is treated as compound statement, so for two select statements, a trailing ORDER BY is the outermost piece. It's probably okay for now, but the ORDER BY will incorrectly be part of the final SELECT statement. Example below.

SELECT a FROM b
UNION
SELECT x FROM y
ORDER BY c

the postgres ast is

  [{"SelectStmt"=>
     {"sortClause"=>
       [{"SortBy"=>
          {"node"=>
            {"ColumnRef"=>
              {"fields"=>[{"String"=>{"str"=>"c"}}], "location"=>47}},
           "sortby_dir"=>0,
           "sortby_nulls"=>0,
           "location"=>-1}}],
      "op"=>1,
      "larg"=>
       {"SelectStmt"=>
         {"targetList"=>
           [{"ResTarget"=>
              {"val"=>
                {"ColumnRef"=>
                  {"fields"=>[{"String"=>{"str"=>"a"}}], "location"=>7}},
               "location"=>7}}],
          "fromClause"=>
           [{"RangeVar"=>
              {"relname"=>"b",
               "inhOpt"=>2,
               "relpersistence"=>"p",
               "location"=>14}}],
          "op"=>0}},
      "rarg"=>
       {"SelectStmt"=>
         {"targetList"=>
           [{"ResTarget"=>
              {"val"=>
                {"ColumnRef"=>
                  {"fields"=>[{"String"=>{"str"=>"x"}}], "location"=>29}},
               "location"=>29}}],
          "fromClause"=>
           [{"RangeVar"=>
              {"relname"=>"y",
               "inhOpt"=>2,
               "relpersistence"=>"p",
               "location"=>36}}],
          "op"=>0}}}}],

test_student_did_not_type functionality

Summary of Issue

At present we can check if a student typed something in the submission. It would be useful to be able to check that they did not type something.

Example

Currently for the TransactSQL course, the triple underscore (___ ) sometimes causes a parsing error in sqlwhat.

Additionally, for exercises using ISQL, at present we can only check the AST.

Enhancement Idea

So, it might be useful to check that there is no ___ present in the submission using test_student_did_not_type(), rather than failing when trying to parse the submission. Thus in the SCT, test_student_did_not_type() would only pass and allow the AST checking to occur if there were no blanks present.

Related to #96

datacamp / sqlwhat Goto Github PK

sqlwhat's Introduction

sqlwhat

Installing

Reference

Raising issues with how SQL is parsed

Basic Use

Running unit tests

Rules of testing

Building Docs

sqlwhat's People

Contributors

Stargazers

Watchers

Forkers

sqlwhat's Issues

Query

SCT

sqlwhat-viewer

Error

Question

Overview

Enhancement Request

Summary of Issue

Steps to Reproduce

Screenshots/Links

implementation

Minimal example:

Realistic Example:

Summary of Issue

Example

Enhancement Idea

Recommend Projects

Recommend Topics

Recommend Org