Giter Club home page Giter Club logo

preql's Introduction

๐Ÿ‘‹ Welcome to my Github profile! Have a look at my open-source projects:

  • Lark, an innovative parsing toolkit, that's grown to be the most popular parser in the Python ecosystem!
  • Data-diff, a tool for cross-database diffing, that was generously funded by Datafold.
  • Preql, an ambitious new programming language that compiles to SQL. (inactive, but I plan to resume working on it some day!)
  • Runtype, a collection of run-time type utilities including validation and multiple dispatch.
  • Many more!

๐Ÿ‘ฉโ€๐Ÿ’ป My stack is Python for everything, and Svelte+Typescript for webdev. I also write in C++, Rust, Julia, and many others.

๐ŸŒŽ I'm living the life of a digital nomad; I spend a lot of my time in Amsterdam.

๐Ÿ‘… I speak Hebrew, English and pocito de Spanish.

๐Ÿ’ž๏ธ Iโ€™m looking to collaborate! Let me know if you have a cool project in mind, or want to help with one of mine. I'm especially interested in programming languages, the semantic-web, machine learning, static analysis, and recommendation systems.

๐Ÿ‘ทโ€โ™‚๏ธ I am a freelancer, and may be available for hire for cool and interesting projects, either open-source and proprietry.

๐Ÿ”— Would you like to know more? Visit my website.

preql's People

Contributors

erezsh avatar manor avatar tals avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

preql's Issues

Error on saving table to BigQuery - required field id cannot be null

Givent this schema in a source table:
2021-08-28_11-45-23

attempting to persist a table in BigQuery thows an error:

connect("bigquery:///gcp-project-name-here")

calendar = import_table("eCom_US.calendar")
temp = calendar{
    Fin_Year,       // INTEGER
    Fin_Year_Label  // STRING
}[..5]
print(temp)

table test_data.temp_cal = temp
// Got error: 400 Required field id cannot be null

Add Automatic joins via attribute access

Allow to auto-join when accessing an attribute, based on declaration:

table Country {
	name: string
}

table Person {
	country: Country
}

Person {country.name}    // results in a join on Country & Person

Incorrect SQL code generated

Came to this from the HN discusson 'against SQL'.
In https://github.com/erezsh/Preql

The translation of this...

print Continent {
... // Include existing fields
density: population / area // Create new a field

} order{^density}

...to this...

WITH subq_1(id, name, area, population, density) AS (
SELECT id, name, area, population, (CAST(population AS float) / area) AS density
FROM Continent
ORDER BY density DESC)
SELECT * FROM subq_1

...is almost certainly wrong. Certainly it is in mssql and likely every other DB. The order by is not honoured except in the last (outermost) select statement.

If tried in MSSQL you correctly get this error:

Msg 1033, Level 15, State 1, Line 6
The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified.

Any 'order by' in a subquery, including a CTE, does not guarantee any ordering at all.

It needs to be

WITH subq_1(id, name, area, population, density) AS (
SELECT id, name, area, population, (CAST(population AS float) / area) AS density
FROM Continent)
SELECT * FROM subq_1
ORDER BY density DESC

(which mssql accepts without complaint)

HTH

`random()` is evaluated multiple times

I noticed the following:

 >> new Point(x: random(), y: random())
Row{id: 1, x: 0.4692004100377667, y: 0.7031168818890262}

 >> Point
                  table Point =1                   
โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”“
โ”ƒ id โ”ƒ                   x โ”ƒ                    y โ”ƒ
โ”กโ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ฉ
โ”‚  1 โ”‚ 0.11453542091477337 โ”‚ 0.005568786776541362 โ”‚
โ””โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

The row as returned is not the same as the one that ends up in the table.

This seems to be related to lazy evaluation:

>> a = new Point(x: random(), y: random())
Row{id: 2, x: 0.20436837218255102, y: 0.4469266234140903}
 >> a
Row{id: 2, x: 0.41022233186312107, y: 0.42288136749245864}
 >> a
Row{id: 2, x: 0.30889023608879995, y: 0.10890735147249475}
 >> a
Row{id: 2, x: 0.8751359884952026, y: 0.6736662191962679}
 >> a
Row{id: 2, x: 0.3742608736379257, y: 0.6794840732600752}

Error on saving table to BigQuery - type mismatch

Given this schema in a source table in BigQuery:
2021-08-28_11-45-23

attempting to persist a table as a subset of the source table throws an error:

connect("bigquery:///gcp-project-id-here")

calendar = import_table("eCom_US.calendar")
temp = calendar{
    Calendar_Date,  // DATE
    Fin_Year,       // INTEGER
    Fin_Year_Label  // STRING
}[..5]
print(temp)

print(columns(temp))
// {Calendar_Date: string?, Fin_Year: int?, Fin_Year_Label: string?}

table test_data.temp_cal = temp
// Got error: 400 Query column 1 has type DATE which cannot be inserted into column Calendar_Date, which has type STRING

Grammar railroad diagram

I did a kick/dirty manual translation to the EBNF understood by https://www.bottlecaps.de/rr/ui to show a railroad diagram for preql/core/preql.lark and I think that it could help document/understand/debug Preql.

Copy and paste the EBNF bellow on https://www.bottlecaps.de/rr/ui tab Edit Grammar then switch to the tab View Diagram.

module::= _NL? stmt+   //-> as_list
/*?*/stmt::= table_def
     | struct_def
     | func_def
     | set_value
     | insert_rows
     | print
     | assert_
     | if_stmt
     | while_stmt
     | for_stmt
     | return_stmt
     | import_stmt
     | throw
     | try_catch
     | expr _NL

pql_dict::= "{" _ml_sep? ml_list (named_expr)* _ml_sep? "}"
key_value::= name ":" expr

pql_list::= "[" _NL? _ml_seperated (expr)* "]"
expr_list::= "[" _NL? _ml_seperated (expr)* "]" //-> as_list
proj_exprs::= _ml_seperated (named_expr)* //-> as_list

ml_list/*{item}*/::= _ml_seperated (item)* //-> as_list   // multi-line list

table_def::= "table" name "{" _ml_sep? _table_contents _ml_sep? "}" _NL
         | CONST? "table" name "=" expr _NL //-> table_def_from_expr

CONST::= "const"

func_defs::= func_def* //-> as_list

table_cols::= _ml_seperated (col_def)* ellipsis? //-> as_list
_table_contents::= table_cols func_defs

col_def::= name ":" coltype ("->" expr)? ("=" expr)?
/*?*/coltype::= type
        | m2m
type/*.-10*/::= molecule typemod?
m2m::= "[" type "]"

struct_def::= "struct" name "{" _ml_sep? ml_list (member_def)* "}" _NL
member_def::= name ":" type //-> as_list


typemod::= PK | NULLABLE

func_def::= "func" name func_params "=" expr [_NL string_raw] _NL //-> func_def_short
        | "func" name func_params codeblock _NL

try_catch::= "try" codeblock _NL? "catch" "(" [name ":"] expr ")" codeblock _NL
if_stmt::= "if" "(" expr ")" codeblock (_NL? "else" (codeblock _NL|if_stmt) | _NL)
for_stmt::= "for" "(" name "in" expr ")" codeblock _NL
while_stmt::= "while" "(" expr ")" codeblock _NL
codeblock::= _NL? "{" _NL? stmt* "}"
return_stmt::= "return" expr _NL
import_stmt::= "import" NAME _NL
throw::= "throw" expr _NL
print::= "print" _cs_list expr* _NL
assert_::= "assert" expr _NL
func_params::= "(" _separated (param ("," param)*)? ("," param_variadic)? ")" //-> as_list
           | "(" param_variadic ")" //-> as_list
param::= name (":" expr)? ("=" expr)?  // TODO type? default value?
param_variadic::= "..." name
insert_rows::= atom "+=" expr _NL
set_value::= atom "=" expr _NL

selection::= molecule expr_list
slice::= molecule range
range::=  "[" (molecule? ".." molecule | molecule ".." molecule?) "]"

projection::= molecule "{" _NL? proj_exprs _NL? "}"
          | molecule "{" _NL? proj_exprs _NL? "=>" _NL? proj_exprs _NL? "}" //-> projection_grouped
          | molecule "{" _NL? "=>" _NL? proj_exprs "}" //-> projection_grouped_nokeys
          | molecule "{" proj_exprs _NL? "=>" _NL? "}" //-> projection_grouped_novalues


order::= molecule "order" "{" _NL? ml_list expr* _NL? "}"
update::= molecule "update" "{" _NL? proj_exprs _NL? "}"
delete::= molecule "delete" expr_list


/*?*/expr::= or_test

/*!*/comp_op::= "<"|">"|"=="|">="|"<="|"<>"|"!="| "in" | NOT_IN //|"is"|"is" "not"
NOT_IN::= "!in"

/*?*/or_test::= (or_test "or")? and_test
/*?*/and_test::= (and_test "and")? not_test
/*?*/not_test::= "not" not_test //-> not_test
         | compare
/*?*/compare::= (compare comp_op)? like
/*?*/like::= (like ("~"|"like"))? add_expr
/*?*/add_expr::= (add_expr add_op)? term
/*?*/term::= (term mul_op)? power
/*?*/factor::= ("+" | "-")* molecule
/*?*/power::= molecule ("**" factor)?

/*!*/add_op::= "+"|"-" | "&" | "|"
/*!*/mul_op::= "*"|"/"|"%"|"/~"

/*?*/molecule::= atom
     | func_call
     | pql_list
     | pql_dict
     | range
     | selection
     | projection
     | slice
     | order
     | update
     | delete
     | "(" _NL? expr _NL? ")"
     | "-" molecule //-> neg
     | "^" molecule //-> desc
     | new
     | one
     | ellipsis

/*?*/atom::= var
     | marker
     | getattr
     | const

one::= "one" [NULLABLE] molecule

new::= "new" name func_args
     | _NEW_ARRAY name func_args //-> new_rows
//    | _GET_OR_NEW name func_args //-> get_or_new

_GET_OR_NEW.10::= "new?"
_NEW_ARRAY.10::= "new[]"

func_call::= molecule func_args
func_args::= "(" _separated (named_expr ("," named_expr)*)? ("," ellipsis)? ")" //-> as_list
         | "(" ellipsis ")" //-> as_list

named_expr::= (name ":")? expr

/*?*/const::= SIGNED_INT //-> int
      | SIGNED_FLOAT //-> float
      | "null" //-> null
      | "false" //-> false
      | "true" //-> true
      | string_raw //-> string

/*?*/string_raw::= STRING | LONG_STRING

marker::= MARKER
ellipsis::= "..." expr? exclude*
/*?*/exclude::= "!" (name | marker)

getattr::= molecule? "." (name | marker)
name::= NAME | QUOTED_NAME | SPECIAL_NAME
var::= name

PK::= "pk"
NULLABLE::= "?"
/*
COMMENT::= /#[^\n]* / | /\/\/[^\n]* /
_NL::= ( /\r?\n[\t ]* / | COMMENT )+

DECIMAL::= INT "." INT
       | INT "." /(?![.])/
       | "." INT

_EXP::= ("e"|"E") SIGNED_INT
FLOAT::= INT _EXP
     | DECIMAL _EXP?
SIGNED_FLOAT::= ["+"|"-"] FLOAT
SIGNED_INT::= ["+"|"-"] INT

QUOTED_NAME::= "`" /[^`\r\n]|``/+ "`"

STRING ::= /("(?!"").*?(?<!\\)(\\\\)*?"|'(?!'').*?(?<!\\)(\\\\)*?')/i
LONG_STRING::= /(""".*?(?<!\\)(\\\\)*?"""|'''.*?(?<!\\)(\\\\)*?''')/is
*/
_ml_seperated/*{item}*/::= (item _ml_sep)* item _ml_sep?
_ml_sep::= "," | "," _NL | _NL
_separated/*{item, sep}*/::= (item sep)* item
_cs_list/*{item}*/::= (item ",")* item ","?


SPECIAL_NAME::= "$" NAME
/*
%import common.CNAME -> NAME

%import common.WS_INLINE
%import common.INT
%ignore WS_INLINE
%ignore /\\[\t \f]*\r?\n/   // LINE_CONT
%ignore COMMENT

%declare MARKER             // Used for internal analysis
*/

ModuleNotFoundError: No module named 'preql.core'

trying to run pipx run preql
or in venv pip install preql and running preql throws this same error.

Traceback (most recent call last):
  File "C:\Users\j\scoop\apps\python\current\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\j\scoop\apps\python\current\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\j\ok\scripts\crawlformtool\Scripts\preql.exe\__main__.py", line 4, in <module>
  File "C:\Users\j\ok\scripts\crawlformtool\lib\site-packages\preql\__init__.py", line 1, in <module>
    from . import _base_imports
  File "C:\Users\j\ok\scripts\crawlformtool\lib\site-packages\preql\_base_imports.py", line 1, in <module>
    from .core import interp_common, pql_types, pql_objects, types_impl
ModuleNotFoundError: No module named 'preql.core'

This is in windows

Assertion error with BigQuery when running .pql scripts but not REPL

Running the following (or any other) script with preql -m foo command:

connect("bigquery:///gcp-project-id-here")

cal = import_table("test_data.temp_cal")
print(cal)

Results in the correct execution (judging by the output) but an assertion error afterwards. Doing the same in REPL works without any errors.

(INFO) state -- [Preql] Connecting to bigquery:///gcp-project-id-here
                         table temp_cal =5
โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”“
โ”ƒ Fin_Year โ”ƒ Fin_Year_Label โ”ƒ id                                   โ”ƒ
โ”กโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ฉ
โ”‚     2016 โ”‚ FY16           โ”‚ 1849f81a-51c6-4135-ac16-120d2e5a92fd โ”‚
โ”‚     2016 โ”‚ FY16           โ”‚ bf20059c-7089-4b7b-a0df-a583d3a7fad9 โ”‚
โ”‚     2016 โ”‚ FY16           โ”‚ 158140c2-9228-49a2-90e6-a44aafd7f9c5 โ”‚
โ”‚     2016 โ”‚ FY16           โ”‚ ea7f8b5b-8f80-4b56-9f97-4cdd5a26105a โ”‚
โ”‚     2016 โ”‚ FY16           โ”‚ 9b6abcc5-6da7-4d9d-923b-84e7d33e311e โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Traceback (most recent call last):
  File "/PyCharmVenv/PreQL/bin/preql", line 8, in <module>
    sys.exit(main())
  File "/PyCharmVenv/PreQL/lib/python3.9/site-packages/preql/__main__.py", line 76, in main
    p('import ' + args.module)
  File "/PyCharmVenv/PreQL/lib/python3.9/site-packages/preql/api.py", line 25, in inner
    return f(*args, **kwargs)
  File "/PyCharmVenv/PreQL/lib/python3.9/site-packages/preql/api.py", line 197, in __call__
    res = self._run_code(code, '<inline>', args)
  File "/PyCharmVenv/PreQL/lib/python3.9/site-packages/preql/api.py", line 193, in _run_code
    return self._interp.execute_code(code + "\n", source_name, pql_args)
  File "/PyCharmVenv/PreQL/lib/python3.9/site-packages/preql/core/interpreter.py", line 36, in inner
    return f(interp, *args, **kwargs)
  File "/PyCharmVenv/PreQL/lib/python3.9/site-packages/preql/core/interpreter.py", line 86, in _execute_code
    last = execute(stmt)
  File "/PyCharmVenv/PreQL/lib/python3.9/site-packages/preql/core/evaluate.py", line 383, in execute
    return stmt._execute() or objects.null
  File "/PyCharmVenv/PreQL/lib/python3.9/site-packages/preql/core/evaluate.py", line 364, in _execute
    module = import_module(context.state, r)
  File "/PyCharmVenv/PreQL/lib/python3.9/site-packages/preql/core/evaluate.py", line 355, in import_module
    assert state.db is i.state.db
AssertionError

OracleInterface not working with TNSNAMES

Hi,

when a connection uri specify just the TNSNAMES.ORA entry name (e.g. oracle://:@<tnsadmin_entry>), PREQL generates the root error:
cx_Oracle.DatabaseError: ORA-12154: TNS:could not resolve the connect identifier specified

The problem is due to OracleInterface class that creates a wrong dns in the code at line 225:

...
self.args = dict(dsn="%s/%s" % (host, database), user=user, password=password)
...

In fact, using the TNSNAMES entry only, the host variable contains <tnsadmin_entry>, while the database variable is empty, hence the resulting dsn is corrupted with a trailing "/" that generates the error when the instant client searches the reference in TNSNAMES.ORA

Documentation for "..." in projection

I get why this:

join(c: Country.name, n:["Palau", "Nauru"].item) {...c}

returns this:

image

and even why this:

join(c: Country.name, n:["Palau", "Nauru"].item) {...n}

returns this:

image

but what does this even mean?

join(c: Country.name, n:["Palau", "Nauru"].item) {...}

because this is what it returns:

image

In the context of a project, '...' means all the columns, but the type of the result of the join and/or the meaning of "..." are a bit unclear. Also, I understand rows and tables but it looks like entire structs are being returned in the last case:

image

Provide a way to specify data when executing a script directly

When writing a preql script I may want to specify some information I may not have at the moment of writing it (eg, dynamically obtained data from another source). Since it is not possible to directly implement IO operations in preql, I think the next option would be that of implementing the functionality to pass data to the script during execution.

We should probably include some kind of placeholder syntax in the script to say that a variable is externally provided (I guess their type can always be string), and they may potentially also have default values. Something like:

external USER_ID = "default value"

And as for providing the actual data for these external variables, one (or more) of the following should suffice:

  • Pass them as key=value pairs when invoking the command
  • Read them from an input file (eg, json, ini, etc)
  • Read them from environment variables at the moment of executing the script

In all of the above, the name of the key should match exactly the name of the variable. A question here could be whether preql should throw a compilation exception if no value is found for a key. I would vote for yes, since it makes the functionality safer.

Not sure if this is possible or how hard it is, but it would make preql much more useful ๐Ÿ˜„

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.