mechanicalrabbit / funsql.jl Goto Github PK
View Code? Open in Web Editor NEWJulia library for compositional construction of SQL queries
Home Page: https://mechanicalrabbit.github.io/FunSQL.jl
License: Other
Julia library for compositional construction of SQL queries
Home Page: https://mechanicalrabbit.github.io/FunSQL.jl
License: Other
using FunSQL: SQLTable, From, Select, Join, Where, Group, Define, Fun, Get, Agg, render
person = SQLTable(:person, :person_id, :year_of_birth, :location_id)
q = Define(:complex_expression => From(person) |> Group() |> Select(Agg.count())) |>
Select(Get.complex_expression .+ Get.complex_expression)
print(render(q))
#=>
SELECT ((
SELECT COUNT(*) AS "count"
FROM "person" AS "person_1"
) + (
SELECT COUNT(*) AS "count"
FROM "person" AS "person_1"
)) AS "+"
=#
Hi @xitology and co!
I am in the process of making an all-in-one package to support connecting to a variety of databases. It is unregistered, but the draft is here: https://github.com/JuliaDatabases/DBConnector.jl
In short, I am running into a problem with the reflect
statement for PostgreSQL when using a JDBC driver and JDBC.jl. I have a working example of the package here for SQLite:
pkg> add OMOPCDMCohortCreator
pkg> add HealthSampleData
pkg> add DBInterface
pkg> add DataFrames
pkg> add https://github.com/JuliaDatabases/DBConnector.jl
using DBConnector, DataFrames, OMOPCDMCohortCreator, HealthSampleData
eunomia = HealthSampleData.Eunomia()
conn = DBConnector.DBConnection(connection_string = "jdbc:sqlite:$(eunomia)", driver_path =
"path/to/sqlite-jdbc-3.41.0.0.jar", connectivity = "jdbc")
GenerateDatabaseDetails(:sqlite, "main")
GenerateTables(conn)
GetDatabasePersonIDs(conn)
As you can see, the above works perfectly for SQLite. However, when doing something similar for PostgreSQL, it fails. Sadly, I don't have a PostgreSQL DB set-up for reproducibility that you could use, but here is the script:
using DBConnector, DataFrames, OMOPCDMCohortCreator, HealthSampleData
eunomia = HealthSampleData.Eunomia()
conn = DBConnector.DBConnection(connection_string = "jdbc:postgresql:db/path", driver_path =
"path/to/postgresql-jdbc.jar", connectivity = "jdbc")
GenerateDatabaseDetails(:postgresql, "synpuf5")
GenerateTables(conn)
And I get the following error:
Exception in thread "main" org.postgresql.util.PSQLException: ERROR: bind message suppl
ies 0 parameters, but prepared statement "" requires 1
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorI
mpl.java:2553)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.ja
va:2285)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:323)
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:473)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:393)
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:322)
at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:308)
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:284)
at org.postgresql.jdbc.PgStatement.executeQuery(PgStatement.java:236)
ERROR: JavaCall.JavaCallError("Error calling Java: org.postgresql.util.PSQLException: E
RROR: bind message supplies 0 parameters, but prepared statement \"\" requires 1")
Stacktrace:
[1] geterror(allow::Bool)
@ JavaCall ~/.julia/packages/JavaCall/MlduK/src/core.jl:418
[2] geterror
@ ~/.julia/packages/JavaCall/MlduK/src/core.jl:403 [inlined]
[3] _jcall(obj::JavaCall.JavaObject{Symbol("java.sql.Statement")}, jmethodId::Ptr{Not
hing}, callmethod::Ptr{Nothing}, rettype::Type, argtypes::Tuple{DataType}, args::String
)
@ JavaCall ~/.julia/packages/JavaCall/MlduK/src/core.jl:373
[4] jcall(obj::JavaCall.JavaObject{Symbol("java.sql.Statement")}, method::String, ret
type::Type, argtypes::Tuple{DataType}, args::String)
@ JavaCall ~/.julia/packages/JavaCall/MlduK/src/core.jl:245
[5] executeQuery
@ ~/.julia/packages/JDBC/2ruzk/src/JDBC.jl:146 [inlined]
[6] #prepare#8
@ ~/FOSS/DBConnector.jl/src/jdbc.jl:20 [inlined]
[7] prepare
@ ~/FOSS/DBConnector.jl/src/jdbc.jl:18 [inlined]
[8] reflect(conn::JavaCall.JavaObject{Symbol("java.sql.Connection")}; schema::String,
dialect::Symbol, cache::Int64)
@ FunSQL ~/.julia/packages/FunSQL/Ufc3L/src/reflect.jl:89
[9] GenerateTables(conn::JavaCall.JavaObject{Symbol("java.sql.Connection")}; inplace:
:Bool, exported::Bool)
@ OMOPCDMCohortCreator ~/.julia/packages/OMOPCDMCohortCreator/0hUCQ/src/generators.
jl:168
[10] GenerateTables(conn::JavaCall.JavaObject{Symbol("java.sql.Connection")})
@ OMOPCDMCohortCreator ~/.julia/packages/OMOPCDMCohortCreator/0hUCQ/src/generators.
jl:166
[11] top-level scope
@ REPL[6]:1
Doing some more digging in the reflect
source code, I discovered that for some reason here:
Lines 17 to 28 in 2741b75
My assigned schema
is not being interpolated correctly into the SQL block despite my prepare statement in DBConnector accepting the interpolation syntax. I have isolated the error to my extension of the function DBInterface.prepare
in DBConnector as I can confirm that using my dispatch for DBInterface.connect
and DBInterface.execute
does work with code querying a PostgreSQL DB like DBInterface.execute(conn, "SELECT * FROM person LIMIT 1;") |> DataFrame
Here is the source code for my JDBC DBInterface.prepare
dispatch (link here too: https://github.com/JuliaDatabases/DBConnector.jl/blob/main/src/jdbc.jl):
"""
Dispatch for JDBC interface to DBInterface `prepare` function
BUG: Doesn't seem to work for all JDBC versions yet
"""
function DBInterface.prepare(conn::JDBC.JavaObject{Symbol("java.sql.Connection")}, args...; kws...)
stmt = JDBC.createStatement(conn)
result = executeQuery(stmt, args...)
return result
end
Any thoughts as to what I may be doing wrong?
Thanks!
Consider a query where Define
overrides an existing column of a table:
using FunSQL: SQLTable, From, Define, Select, Get, render
person = SQLTable(:person, :person_id, :year_of_birth, :location_id)
q = From(person) |>
Define(:year_of_birth => 0)
When the query does not have an explicit Select
node, this Define
node gets ignored:
print(render(q))
#=>
SELECT "person_1"."person_id", "person_1"."year_of_birth", "person_1"."location_id"
FROM "person" AS "person_1"
=#
One would expect that year_of_birth
respects the redefinition. To fix this, we should make the columns created with Define
to be included to the default output.
Describe the bug
I was using the documentation on the Define
functionality in building a query that defines a new column.
Currently, this function seems to be broken
To Reproduce
Julia Version (i.e. output of julia -v
): 1.6.1
Operating system (Mac, Linux, Windows): macOS Catalina 10.15.7
FunSQL Version: FunSQL v0.7.0
Minimum working code example that led to bug:
using LibPQ
const conn = LibPQ.Connection("CREDENTIALS")
const person = SQLTable(:person, columns = [:person_id, :year_of_birth, :location_id])
const location = SQLTable(:location, columns = [:location_id, :city, :state])
q = From(person) |>
Define(:age => 2021 .- Get.year_of_birth) |>
Where(Get.age .> 16)
sql = render(q, dialect = :postgresql)
res = LibPQ.execute(conn, sql)
test = DataFrame(res)
Expected Behavior and Actual Behavior
I was expecting a final table that contains a new column named age
but after SQL rendering, there is no new column being generated.
julia> print(sql)
SELECT "person_1"."person_id", "person_1"."year_of_birth", "person_1"."location_id", [...other columns...]
FROM "synpuf5"."person" AS "person_1"
WHERE ((2021 - "person_1"."year_of_birth") > 16)
Stacktrace (If Applicable)
No error available.
Screenshots
N/A
Additional context
N/A
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml
to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix
on this issue.
I'll open a PR within a few hours, please be patient!
Currently, FunSQL tracks the names and the order of the output columns, but it does not distinguishes their types.
To fix this, FunSQL needs to:
Column information could be stored in a structure SQLColumn(name, type)
with SQLTable.columns
becoming an OrderedDict{Symbol, SQLColumn}
. Accidentally, it will allow a column to have a FunSQL name that differs from its SQL name.
It is much harder to implement type resolution for arbitrary expressions. We could start by allowing the user to explicitly declare the result type of a particular function or an operator call.
often times I'd like to add a top-level sort column but otherwise keep subordinate sorts
It would be lovely to directly support grouping sets, rollup and cube. This is related also to sorting nulls last when the sort is otherwise descending order and nulls would come first. Supporting automatic naming for aggregates is also important.
See hacked implementation at TuftsCTSI/TRDW.jl@89cb917#diff-754e6441f2054c00a80b0953b1876f78a4f3e85993eebd6fa0e3b883611a58caR182-R212
Hi there,
Firstly just want to say a huge thank you for making this library. It really is the perfect solution to my current needs.
I am currently working on a project which uses multiple catalogues/schemas within a single datasource. When trying to implement this with the FunSQL library I get an unexpected behaviour for joins. Using the Join example found in the API reference for this library the following sql is rendered when no qualifiers are used on a SQLTable
object:
using FunSQL: SQLTable, render, From, Join, Select, Get
person0 = SQLTable(:person, columns=[:person_id, :location_id])
location0 = SQLTable(:location, columns=[:location_id, :state])
query = From(:person) |> Join(:location => From(:location), Get.location_id .== Get.location.location_id) |> Select(Get.person_id, Get.location.state)
println(render(query, tables=[person0, location0]))
Output:
SELECT
"person_1"."person_id",
"location_1"."state"
FROM "person" AS "person_1"
JOIN "location" AS "location_1" ON ("person_1"."location_id" = "location_1"."location_id")
This is the expected behaviour, namely that the Join clause created joins directly to the table. This does not happen when using 2 or more qualifiers (the situation I find myself in: one for the DB and another for the schema).
person2 = SQLTable(:person, qualifiers=["EXAMPLE_DB", "EXAMPLE_SCHEMA"], columns=[:person_id, :location_id])
location2 = SQLTable(:location, qualifiers=["EXAMPLE_DB", "EXAMPLE_SCEHMA"], columns=[:location_id, :state])
println(render(query, tables=[person2, location2]))
Output:
SELECT
"person_1"."person_id",
"location_2"."state"
FROM "EXAMPLE_DB"."EXAMPLE_SCHEMA"."person" AS "person_1"
JOIN (
SELECT
"location_1"."location_id",
"location_1"."state"
FROM "EXAMPLE_DB"."EXMAPLE_SCHEMA"."location" AS "location_1"
) AS "location_2" ON ("person_1"."location_id" = "location_2"."location_id")
As you can see, whenever one uses 2+ qualifiers then the join clause is created using an inner/nested/sub query. While I am not very experienced in DB tuning, I don't think that is as efficient as a direct join.
I am also unsure if this expected behaviour for the library and perhaps there is a configuration setting I can change to amend the behaviour.
Thank you :)
When I run this line, copied from the documentation, error message appears => CrossJoin is undefined
using FunSQL: FunSQL, Agg, Append, As, Asc, Bind, CrossJoin, Define, Desc, Fun, From, Get, Group, Highlight, Iterate, Join, LeftJoin, Limit, Lit, Order, Partition, Select, Sort, Var, Where, With, WithExternal, render
Hello, I'm really glad this project is starting to exist. I've been checking out the documentation and trying to learn about it.
One part of the design that I'm curious about is I'm interested in why Get is used.
In the first query example, we have
const person =
SQLTable(:person, columns = [:person_id, :year_of_birth, :location_id])
const location =
SQLTable(:location, columns = [:location_id, :city, :state])
const visit_occurrence =
SQLTable(:visit_occurrence, columns = [:visit_occurrence_id, :person_id, :visit_start_date])
q = person |>
Where(Get.year_of_birth .<= 1950) |>
Join(:location => location,
on = Get.location_id .== Get.location.location_id) |>
Where(Get.location.state .== "IL") |>
Join(:visit_group => visit_occurrence |>
Group(Get.person_id),
on = Get.person_id .== Get.visit_group.person_id,
left = true) |>
Select(Get.person_id,
:max_visit_start_date =>
Get.visit_group |> Agg.max(Get.visit_start_date))
Why is it designed to use Get.visit_start_date
instead of visit_occurrence.visit_start_date
as it would appear in SQL? Likewise, what is the advantage of an API that wants Get.location.location_id
instead of location.location_id
? Is this style inspired by some other query builder?
One point of friction using FunSQL.jl is that I still often need to switch between two different syntax. I still need to write SQL in a query editor while debugging and performance testing, but I eventually need to translate to modular FunSQL.jl syntax in a package. I'm not sure how hard it would be at this stage to reverse the render
function to generate the Julia object I want to use in my code... perhaps that kind of functionality is out of scope? I do think it would be really slick and it'd be really cool if this function could identify parts of a dialect that aren't currently supported.
using FunSQL: SQLTable, From, Select, Join, Where, Group, Define, Bind, Fun, Get, Agg, Var, render
visit_occurrence = SQLTable(:visit_occurrence, :visit_occurrence_id, :person_id, :visit_start_date, :visit_end_date)
condition_occurrence = SQLTable(:condition_occurrence, :condition_occurrence_id, :person_id, :condition_start_date, :condition_end_date)
ConditionAfter(person_id, date) =
From(condition_occurrence) |>
Where(Fun.and(Get.person_id .== Var.person_id, Get.condition_start_date .> Var.date)) |>
Bind(:person_id => person_id, :date => date)
q = From(visit_occurrence) |>
Group(Get.person_id) |>
Where(Fun.exists(ConditionAfter(Get.person_id, Agg.max(Get.visit_start_date))))
print(render(q))
#=>
SELECT "visit_occurrence_1"."person_id"
FROM "visit_occurrence" AS "visit_occurrence_1"
GROUP BY "visit_occurrence_1"."person_id"
HAVING (EXISTS (
SELECT NULL
FROM "condition_occurrence" AS "condition_occurrence_1"
WHERE (("condition_occurrence_1"."person_id" = "visit_occurrence_1"."person_id") AND ("condition_occurrence_1"."condition_start_date" > MAX("visit_occurrence_1"."visit_start_date")))
=#
In most databases, NULL
values are returned first, unless NULLS LAST
is used.
This request is asking for something similar to DESC() decorator.
For use with @funsql
macro, FunSQL currently exports a small number of common SQL functions and operators:
Any SQL function not in these lists has to be declared like this:
const var"funsql_%" = FunSQL.Fun."%"
const funsql_array_get = FunSQL.Fun."?[?]"
const funsql_as_integer = FunSQL.Fun."(?::integer)"
const funsql_bool_and = FunSQL.Agg.bool_and
const funsql_regexp_matches = FunSQL.Fun.regexp_matches
const funsql_string_agg = FunSQL.Agg."string_agg(?, NULL ORDER BY ?)"
Ideally, FunSQL should provide all built-in functions and operators, for every supported database engine. Since every engine has its own list of supported functions, these declarations could be added to engine-specific modules:
using FunSQL, FunSQL.PostgreSQL
@funsql select(as_integer("0")) #-> SELECT ('0'::integer)
Currently, the SQL that gets printed looks like this:
julia> println(sql)
SELECT "person_1"."person_id", "person_1"."gender_concept_id", "person_1"."year_of_birth",
"person_1"."month_of_birth", "person_1"."day_of_birth", "person_1"."birth_datetime", "per
son_1"."race_concept_id", "person_1"."ethnicity_concept_id", "person_1"."location_id", "pe
rson_1"."provider_id", "person_1"."care_site_id", "person_1"."person_source_value", "perso
n_1"."gender_source_value", "person_1"."gender_source_concept_id", "person_1"."race_source
_value", "person_1"."race_source_concept_id", "person_1"."ethnicity_source_value", "person
_1"."ethnicity_source_concept_id"
FROM "person" AS "person_1"
Which is fine if you are not later interfacing with the SQL.
However, for printing to reports/tutorials/materials, the strings can come out looking very cluttered or hard to read.
It would be nice if there could be a display
dispatch or something that takes a SQLStatement
and could print it out like this:
SELECT "person_1"."person_id",
"person_1"."gender_concept_id",
"person_1"."year_of_birth",
"person_1"."month_of_birth",
"person_1"."day_of_birth",
"person_1"."birth_datetime",
"person_1"."race_concept_id",
"person_1"."ethnicity_concept_id",
"person_1"."location_id",
"person_1"."provider_id",
"person_1"."care_site_id",
"person_1"."person_source_value",
"person_1"."gender_source_value",
"person_1"."gender_source_concept_id",
"person_1"."race_source_value",
"person_1"."race_source_concept_id",
"person_1"."ethnicity_source_value",
"person_1"."ethnicity_source_concept_id"
FROM "person" AS "person_1"
Is this something that could be implemented?
Thanks!
using FunSQL: SQLTable, From, Define, Where, Get, render
person = SQLTable(:person, :person_id, :year_of_birth, :location_id)
ByYOB(yob) =
(q = Define(:yob => Get.year_of_birth)) |>
Where(q.yob .== yob)
q = From(person) |>
ByYOB(1950)
print(render(q))
#=>
ERROR: GetError: cannot find yob in:
let person = SQLTable(:person, …),
q1 = From(person),
q2 = q1 |> Define(Get.year_of_birth |> As(:yob)),
q3 = Define(Get.year_of_birth |> As(:yob)),
q4 = q2 |> Where(Fun."=="(q3.yob, Lit(1950)))
q4
end
=#
using FunSQL: SQLTable, From, Select, Get, Agg, render
person = SQLTable(:person, :person_id, :year_of_birth, :location_id)
q = From(person) |> Select(Get.person_id, Agg.count())
print(render(q))
#=>
SELECT "person_1"."person_id", COUNT(*) AS "count"
FROM "person" AS "person_1"
=#
This should instead raise an error.
Currently, Define
adds any new columns at the end of the current column list. This is fine as the default behavior, but it should be possible to insert them at the beginning or at any particular position.
Definitions that are replacing existing columns currently retain their positions. It should be possible to re-position them as if they are new columns.
The @funsql
macro was officially released in v0.12.0. Going forward, it should become a preferred method for assembling FunSQL queries. For this reason, the documentation, in particular, Usage Guide, API Reference and Examples, should be updated to use @funsql
macro.
Hi all,
Running into a strange type issue with datetimes.
Is your feature request related to a problem? Please explain.
I am using the Eunomia sqlite database for some prototyping of ideas and for some reason, the birth_datetime
field is misinterpreted as a Float object.
Even if I tried to modify the query to Fun.datetime(Get.birth_datetime)
, this results in the objects being reported missing
in my resulting dataframe.
To Reproduce
Julia Version (i.e. output of julia -v): 1.6.1
Operating system (Mac, Linux, Windows): macOS Catalina 10.15.7
FunSQL Version: FunSQL v0.8.0
Minimum working code example that led to bug:
using RCall
path_to_db = mktempdir() * "file.sqlite"
R"""
library("Eunomia")
Eunomia::getEunomiaConnectionDetails(databaseFile = $(path_to_db))
"""
person = SQLTable(
:person,
columns = [
:person_id,
:gender_concept_id,
:year_of_birth,
:month_of_birth,
:day_of_birth,
:birth_datetime,
:race_concept_id,
:ethnicity_concept_id,
:location_id,
:provider_id,
:care_site_id,
:person_source_value,
:gender_source_value,
:gender_source_concept_id,
:race_source_value,
:race_source_concept_id,
:ethnicity_source_value,
:ethnicity_source_concept_id,
],
)
sql =
From(person) |>
Select(
Get.person_id,
Get.gender_concept_id,
Get.year_of_birth,
Get.month_of_birth,
Get.day_of_birth,
Get.birth_datetime,
Get.race_concept_id,
Get.ethnicity_concept_id,
Get.location_id,
Get.provider_id,
Get.care_site_id,
Get.person_source_value,
Get.gender_source_value,
Get.gender_source_concept_id,
Get.race_source_value,
Get.race_source_concept_id,
Get.ethnicity_source_value,
Get.ethnicity_source_concept_id,
) |>
x -> render(x, dialect = :sqlite)
result = SQLite.DB(path_to_db) |>
eunomia -> DBInterface.execute(eunomia, sql) |> DataFrame
Expected Behavior and Actual Behavior
I expected the datetime object to be converted to a Julia DateTime
object in Julia.
Instead, the datetime is either converted to a Float
or missing
.
In the compiled SQL string, the CTE tables are not aliased. This results in an invalid query when doing something like a self join.
using FunSQL:
SQLTable,
From, Join, Get, As, Select, With,
render;
const tab =
SQLTable(:foo, columns=[:x, :y])
q = From(:foo_plus) |>
Join(:foo_plus_dup => From(:foo_plus), on = Get.x .== Get.foo_plus_dup.x2) |>
With(:foo_plus => From(tab) |> Select(Get.x, Get.x .+ 1 |> As(:x2)))
print(render(q))
#=>
WITH "foo_plus_1" ("x", "x2") AS (
SELECT
"foo_1"."x",
("foo_1"."x" + 1) AS "x2"
FROM "foo" AS "foo_1"
)
SELECT
"foo_plus_1"."x",
"foo_plus_1"."x2"
FROM "foo_plus_1"
JOIN "foo_plus_1" ON ("foo_plus_1"."x" = "foo_plus_1"."x2")
=#
A From
node referencing a CTE subquery is specialized to a FromReference
node during the annotation pass. Looking at the assemble
routine for FromTable
, an alias is created everytime the node is encountered. We could port over the same logic to FromReference
.
# current impl
function assemble(n::FromReferenceNode, refs, ctx)
cte_a = ctx.cte_map[n.over]
a = unwrap_repl(cte_a.a)
# remove
# c = FROM(over = ID(over = cte_a.schema, name = cte_a.name))
# subs = make_subs(a, cte_a.name)
# add
alias = allocate_alias(ctx, cte_a.name)
c = FROM(over = ID(over = cte_a.schema, name = alias))
subs = make_subs(a, alias)
trns = Pair{SQLNode, SQLClause}[]
for ref in refs
push!(trns, ref => subs[ref])
end
repl, cols = make_repl_cols(trns)
return Assemblage(c, cols = cols, repl = repl)
end
I would commit a PR but I'm not yet comfortable with the Julia workflow.
When constructing component queries, I sometimes would like to add an additional sort column to an existing sort order. However, currently it seems that I'm only able to overwrite sort columns. Conversely, it'd be nice to have a stable sort, which would do the opposite, add a high-level sort order but keeping existing subordinate sorts in place.
I can connect to my server using
conn = DBInterface.connect(FunSQL.DB{ODBC.Connection},
"Driver={ODBC Driver 18 for SQL Server};server=xxxx.database.windows.net;database=xxxx;UID=xxxx;Authentication=ActiveDirectoryInteractive;Encrypt=yes;",
dialect = :sqlserver)
, and can query the database OK with handwritten sql.
However, the FunSQL catalog object is not populated with any tables; so From(:xxsymbol)
just throws a FunSQL.ReferenceError
All the table names in my database have a dot in them, I wonder if that has any impact?
Hi all,
I seem to be having a strange issue where the following code does not work as expected:
From(tab) |> Select(Get.foo) |> Limit(1) |> q -> render(q, dialect = :mysql) |> x -> DBInterface.execute(conn, x) |> DataFrame
ERROR: MethodError: no method matching query(::MySQL.API.MYSQL, ::FunSQL.SQLString)
Closest candidates are:
query(::MySQL.API.MYSQL, ::String) at ~/.julia/packages/MySQL/0vHyV/src/api/capi.jl:1219
Stacktrace:
[1] execute(conn::MySQL.Connection, sql::FunSQL.SQLString, params::Tuple{}; mysql_store_result::Bool, mysql_date_and_time::Bool)
@ MySQL ~/.julia/packages/MySQL/0vHyV/src/execute.jl:158
[2] execute(conn::MySQL.Connection, sql::FunSQL.SQLString, params::Tuple{}) (repeats 2 times)
@ MySQL ~/.julia/packages/MySQL/0vHyV/src/execute.jl:155
[3] (::var"#8#10")(x::FunSQL.SQLString)
@ Main ./REPL[16]:1
[4] |>(x::FunSQL.SQLString, f::var"#8#10")
@ Base ./operators.jl:966
[5] (::var"#7#9")(q::FunSQL.SQLNode)
@ Main ./REPL[16]:1
[6] |>(x::FunSQL.SQLNode, f::var"#7#9")
@ Base ./operators.jl:966
[7] top-level scope
@ REPL[16]:1
Where my connection looks like this:
conn = DBInterface.connect(MySQL.Connection, "my.host.com", "username", "password"; db = "database", port = 3306, unix_socket = MySQL.API.MYSQL_DEFAULT_SOCKET, client_flag=MySQL.API.CLIENT_MULTI_STATEMENTS, opts = Dict())
A workaround to this I found was doing the following:
From(tab) |> Select(Get.foo) |> Limit(1) |> q -> render(q, dialect = :mysql) |> x -> DBInterface.execute(conn, String(x)) |> DataFrame
Which returned the expected result.
This seems like a bug where FunSQL.SQLString
is not accepted when I expected it to be.
Is this a bug?
Details:
FunSQL Version: [cf6cc811] FunSQL v0.10.1
Julia Version: 1.7.0
OS: Mac OSX
Sometimes it is easier to remove columns from the default output than to build the new output list. A new node, Unselect()
, should do it.
The SQLCatalog
mapping stores available SQL tables indexed by name. It is used for resolving query nodes From(name::Symbol)
.
We should extend SQLCatalog
to allow entries that are arbitrary FunSQL queries. Such entries can be interpreted as CTEs. This will be useful, in particular, for composable ETL pipelines.
Hi @xitology and @clarkevans !
I was wondering, would it be possible to support DuckDB syntax here? I had read the syntax was very similar to PostgreSQL, so I ran OHDSICohortExpressions with the model flavor set to PostgreSQL but I got an error that looked like this
julia> DBInterface.execute(conn, queries[3] |> String)
ERROR: Binder Error: No function matches the given name and argument types '+(DATE, BIGINT)'. You might need to add explicit type casts.
Candidate functions:
+(TINYINT) -> TINYINT
+(TINYINT, TINYINT) -> TINYINT
+(SMALLINT) -> SMALLINT
+(SMALLINT, SMALLINT) -> SMALLINT
+(INTEGER) -> INTEGER
+(INTEGER, INTEGER) -> INTEGER
+(BIGINT) -> BIGINT
+(BIGINT, BIGINT) -> BIGINT
+(HUGEINT) -> HUGEINT
+(HUGEINT, HUGEINT) -> HUGEINT
+(FLOAT) -> FLOAT
+(FLOAT, FLOAT) -> FLOAT
+(DOUBLE) -> DOUBLE
+(DOUBLE, DOUBLE) -> DOUBLE
+(DECIMAL) -> DECIMAL
+(DECIMAL, DECIMAL) -> DECIMAL
+(UTINYINT) -> UTINYINT
+(UTINYINT, UTINYINT) -> UTINYINT
+(USMALLINT) -> USMALLINT
+(USMALLINT, USMALLINT) -> USMALLINT
+(UINTEGER) -> UINTEGER
+(UINTEGER, UINTEGER) -> UINTEGER
+(UBIGINT) -> UBIGINT
+(UBIGINT, UBIGINT) -> UBIGINT
+(DATE, INTEGER) -> DATE
+(INTEGER, DATE) -> DATE
+(INTERVAL, INTERVAL) -> INTERVAL
+(DATE, INTERVAL) -> DATE
+(INTERVAL, DATE) -> DATE
+(TIME, INTERVAL) -> TIME
+(INTERVAL, TIME) -> TIME
+(TIMESTAMP, INTERVAL) -> TIMESTAMP
+(INTERVAL, TIMESTAMP) -> TIMESTAMP
+(TIME, DATE) -> TIMESTAMP
+(DATE, TIME) -> TIMESTAMP
+(ANY[], ANY[]) -> ANY[]
I could give my whole SQL expression for what I am trying but am curious if there'd ever be interest to support DuckDB. Thanks!
Hi all,
I was experimenting with FunSQL today - super easy to work with by the way! - and ran into this personally irritating situation.
What do you all think?
Is your feature request related to a problem? Please explain.
Currently, I need to specify what are the exact columns I need to query when creating a SQLTable
entity like so:
const person = SQLTable(:person, columns = [:person_id, :year_of_birth, :location_id])
I find having to define each column I need to be painful as sometimes I do not know all the columns for a given table easily when I write a query.
Describe the solution you'd like
Ideally, I would like to be able to provide nothing to the columns
variable and have it default to select all columns in the query - analogous to SELECT * FROM person
.
I was wondering if the syntax could like something like this:
const person = SQLTable(:person)
and a dispatch or underlying functionality to look like:
SQLTable(name; schema = nothing, columns = nothing) =
SQLTable(schema = schema, name = name, columns = columns)
Describe alternatives you've considered
I could have the data dictionary at hand but that is quite tedious for data analysis on the fly.
Additional context
N/A
using FunSQL: SQLTable, From, Select, Join, Where, Group, Fun, Get, Agg, render
person = SQLTable(:person, :person_id, :year_of_birth, :location_id)
visit_occurrence = SQLTable(:visit_occurrence, :visit_occurrence_id, :person_id, :visit_start_date, :visit_end_date)
q = From(person) |>
Join(:visit_group => From(visit_occurrence) |> Group(Get.person_id), on = Get.person_id .== Get.visit_group.person_id) |>
Where(Agg.count(over = Get.visit_group) .>= 2) |>
Select(Get.person_id, Agg.count(over = Get.visit_group))
print(render(q))
#=>
SELECT "person_1"."person_id", "visit_group_1"."count_2" AS "count"
FROM "person" AS "person_1"
JOIN (
SELECT "visit_occurrence_1"."person_id", COUNT(*) AS "count_1", COUNT(*) AS "count_2"
FROM "visit_occurrence" AS "visit_occurrence_1"
GROUP BY "visit_occurrence_1"."person_id"
) AS "visit_group_1" ON ("person_1"."person_id" = "visit_group_1"."person_id")
WHERE ("visit_group_1"."count_1" >= 2)
=#
I find that I have to write the following for Or to be useful.
function build_or(items...)
if length(items) == 0
return @funsql(false)
end
if length(items) == 1
return items[1]
end
return @funsql(or($items...))
end
render(conn, From(:anytable) |> Select(:rowid))
produces an error: ReferenceError: cannot find rowid
render(conn, From(:anytable) |> Select(“rowid"))
Almost works but it produces SELECT ‘rowid’ AS “_”
I wonder if it would be possible for FunSQL to support databases/engines that can execute SQL queries specified as regular strings, but don't have query parameter support.
A specific example that I played with and it didn't work is so-called "Virtual Observatory Table Access Protocol" (https://www.ivoa.net/documents/TAP/). There's my VirtualObservatory.jl
package that implements DBInterface
for such databases.
This is what I tried:
using VirtualObservatory
using FunSQL
conn = TAPService(:simbad)
# queries work same as elsewhere in dbinterface
execute(conn, "select top 10 * from basic")
# not 100% correct, just put something here for now
FunSQL.SQLDialect(::Type{TAPService}) = FunSQL.SQLDialect(:sqlite)
catalog = FunSQL.reflect(conn)
but it throws MethodError: no method matching prepare(::VirtualObservatory.TAPService, ::String)
. Of course, prepare
could be trivially implemented as a no-op, but I see that the next line in reflect
(https://github.com/MechanicalRabbit/FunSQL.jl/blob/master/src/reflect.jl#L90) executes the prepared query together with some parameter values. This part isn't natively supported by TAP.
Would be nice to be able to use FunSQL here!
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.