Giter Club home page Giter Club logo

funsql.jl's People

Contributors

clarkevans avatar thecedarprince avatar xitology avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

funsql.jl's Issues

do not collapse Define nodes

using FunSQL: SQLTable, From, Select, Join, Where, Group, Define, Fun, Get, Agg, render
person = SQLTable(:person, :person_id, :year_of_birth, :location_id)
q = Define(:complex_expression => From(person) |> Group() |> Select(Agg.count())) |>
    Select(Get.complex_expression .+ Get.complex_expression)
print(render(q))
#=>
SELECT ((
  SELECT COUNT(*) AS "count"
  FROM "person" AS "person_1"
) + (
  SELECT COUNT(*) AS "count"
  FROM "person" AS "person_1"
)) AS "+"
=#

Prepared Statement Not Being Properly Interpolated for PostgreSQL `reflect`

Hi @xitology and co!

I am in the process of making an all-in-one package to support connecting to a variety of databases. It is unregistered, but the draft is here: https://github.com/JuliaDatabases/DBConnector.jl

In short, I am running into a problem with the reflect statement for PostgreSQL when using a JDBC driver and JDBC.jl. I have a working example of the package here for SQLite:

  1. Create a test environment and add the following packages:
pkg> add OMOPCDMCohortCreator
pkg> add HealthSampleData
pkg> add DBInterface
pkg> add DataFrames
pkg> add https://github.com/JuliaDatabases/DBConnector.jl
  1. Download the SQLite and PostgreSQL JDBC drivers
  2. SQLite: https://github.com/xerial/sqlite-jdbc/releases/tag/3.41.0.0
  3. PostgreSQL: https://jdbc.postgresql.org
  4. Run this test to see it work correctly:
using DBConnector, DataFrames, OMOPCDMCohortCreator, HealthSampleData

eunomia = HealthSampleData.Eunomia()

conn = DBConnector.DBConnection(connection_string = "jdbc:sqlite:$(eunomia)", driver_path = 
"path/to/sqlite-jdbc-3.41.0.0.jar", connectivity = "jdbc")

GenerateDatabaseDetails(:sqlite, "main")

GenerateTables(conn)

GetDatabasePersonIDs(conn)

As you can see, the above works perfectly for SQLite. However, when doing something similar for PostgreSQL, it fails. Sadly, I don't have a PostgreSQL DB set-up for reproducibility that you could use, but here is the script:

using DBConnector, DataFrames, OMOPCDMCohortCreator, HealthSampleData

eunomia = HealthSampleData.Eunomia()

conn = DBConnector.DBConnection(connection_string = "jdbc:postgresql:db/path", driver_path = 
"path/to/postgresql-jdbc.jar", connectivity = "jdbc")

GenerateDatabaseDetails(:postgresql, "synpuf5")

GenerateTables(conn)

And I get the following error:

Exception in thread "main" org.postgresql.util.PSQLException: ERROR: bind message suppl
ies 0 parameters, but prepared statement "" requires 1
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorI
mpl.java:2553)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.ja
va:2285)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:323)
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:473)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:393)
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:322)
        at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:308)
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:284)
        at org.postgresql.jdbc.PgStatement.executeQuery(PgStatement.java:236)
ERROR: JavaCall.JavaCallError("Error calling Java: org.postgresql.util.PSQLException: E
RROR: bind message supplies 0 parameters, but prepared statement \"\" requires 1")
Stacktrace:
  [1] geterror(allow::Bool)
    @ JavaCall ~/.julia/packages/JavaCall/MlduK/src/core.jl:418
  [2] geterror
    @ ~/.julia/packages/JavaCall/MlduK/src/core.jl:403 [inlined]
  [3] _jcall(obj::JavaCall.JavaObject{Symbol("java.sql.Statement")}, jmethodId::Ptr{Not
hing}, callmethod::Ptr{Nothing}, rettype::Type, argtypes::Tuple{DataType}, args::String
)
    @ JavaCall ~/.julia/packages/JavaCall/MlduK/src/core.jl:373
  [4] jcall(obj::JavaCall.JavaObject{Symbol("java.sql.Statement")}, method::String, ret
type::Type, argtypes::Tuple{DataType}, args::String)
    @ JavaCall ~/.julia/packages/JavaCall/MlduK/src/core.jl:245
  [5] executeQuery
    @ ~/.julia/packages/JDBC/2ruzk/src/JDBC.jl:146 [inlined]
  [6] #prepare#8
    @ ~/FOSS/DBConnector.jl/src/jdbc.jl:20 [inlined]
  [7] prepare
    @ ~/FOSS/DBConnector.jl/src/jdbc.jl:18 [inlined]
  [8] reflect(conn::JavaCall.JavaObject{Symbol("java.sql.Connection")}; schema::String,
 dialect::Symbol, cache::Int64)
    @ FunSQL ~/.julia/packages/FunSQL/Ufc3L/src/reflect.jl:89
  [9] GenerateTables(conn::JavaCall.JavaObject{Symbol("java.sql.Connection")}; inplace:
:Bool, exported::Bool)
    @ OMOPCDMCohortCreator ~/.julia/packages/OMOPCDMCohortCreator/0hUCQ/src/generators.
jl:168
 [10] GenerateTables(conn::JavaCall.JavaObject{Symbol("java.sql.Connection")})
    @ OMOPCDMCohortCreator ~/.julia/packages/OMOPCDMCohortCreator/0hUCQ/src/generators.
jl:166
 [11] top-level scope
    @ REPL[6]:1

Doing some more digging in the reflect source code, I discovered that for some reason here:

FunSQL.jl/src/reflect.jl

Lines 17 to 28 in 2741b75

const postgresql_reflect_clause =
FROM(:n => (:pg_catalog, :pg_namespace)) |>
JOIN(:c => (:pg_catalog, :pg_class), on = FUN("=", (:n, :oid), (:c, :relnamespace))) |>
JOIN(:a => (:pg_catalog, :pg_attribute), on = FUN("=", (:c, :oid), (:a, :attrelid))) |>
WHERE(FUN(:and, FUN("=", (:n, :nspname), FUN(:coalesce, VAR(:schema), "public")),
FUN(:in, (:c, :relkind), "r", "v"),
FUN(">", (:a, :attnum), 0),
FUN(:not, (:a, :attisdropped)))) |>
ORDER((:n, :nspname), (:c, :relname), (:a, :attnum)) |>
SELECT(:schema => (:n, :nspname),
:name => (:c, :relname),
:column => (:a, :attname))

My assigned schema is not being interpolated correctly into the SQL block despite my prepare statement in DBConnector accepting the interpolation syntax. I have isolated the error to my extension of the function DBInterface.prepare in DBConnector as I can confirm that using my dispatch for DBInterface.connect and DBInterface.execute does work with code querying a PostgreSQL DB like DBInterface.execute(conn, "SELECT * FROM person LIMIT 1;") |> DataFrame

Here is the source code for my JDBC DBInterface.prepare dispatch (link here too: https://github.com/JuliaDatabases/DBConnector.jl/blob/main/src/jdbc.jl):

"""
Dispatch for JDBC interface to DBInterface `prepare` function
BUG: Doesn't seem to work for all JDBC versions yet
"""
function DBInterface.prepare(conn::JDBC.JavaObject{Symbol("java.sql.Connection")}, args...; kws...)
    stmt = JDBC.createStatement(conn)
    result = executeQuery(stmt, args...)
    return result
end

Any thoughts as to what I may be doing wrong?

Thanks!

`Define()` is ignored by `default_list()`

Consider a query where Define overrides an existing column of a table:

using FunSQL: SQLTable, From, Define, Select, Get, render

person = SQLTable(:person, :person_id, :year_of_birth, :location_id)

q = From(person) |>
    Define(:year_of_birth => 0)

When the query does not have an explicit Select node, this Define node gets ignored:

print(render(q))
#=>
SELECT "person_1"."person_id", "person_1"."year_of_birth", "person_1"."location_id"
FROM "person" AS "person_1"
=#

One would expect that year_of_birth respects the redefinition. To fix this, we should make the columns created with Define to be included to the default output.

[BUG] `Define` Not Creating New Column for SQL Query

Describe the bug

I was using the documentation on the Define functionality in building a query that defines a new column.
Currently, this function seems to be broken

To Reproduce

  1. Julia Version (i.e. output of julia -v): 1.6.1

  2. Operating system (Mac, Linux, Windows): macOS Catalina 10.15.7

  3. FunSQL Version: FunSQL v0.7.0

  4. Minimum working code example that led to bug:

using LibPQ

const conn = LibPQ.Connection("CREDENTIALS")

const person = SQLTable(:person, columns = [:person_id, :year_of_birth, :location_id])
const location = SQLTable(:location, columns = [:location_id, :city, :state])

q = From(person) |>
           Define(:age => 2021 .- Get.year_of_birth) |>
           Where(Get.age .> 16)

sql = render(q, dialect = :postgresql)
res = LibPQ.execute(conn, sql)
test = DataFrame(res)

Expected Behavior and Actual Behavior

I was expecting a final table that contains a new column named age but after SQL rendering, there is no new column being generated.

julia> print(sql)
SELECT "person_1"."person_id", "person_1"."year_of_birth", "person_1"."location_id",  [...other columns...] 
FROM "synpuf5"."person" AS "person_1" 
WHERE ((2021 - "person_1"."year_of_birth") > 16)     

Stacktrace (If Applicable)

No error available.

Screenshots

N/A

Additional context

N/A

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Track scalar types

Currently, FunSQL tracks the names and the order of the output columns, but it does not distinguishes their types.

To fix this, FunSQL needs to:

  • Introspect the types of table columns and preserve them in the catalog;
  • Implement type resolution for SQL functions and operators.

Column information could be stored in a structure SQLColumn(name, type) with SQLTable.columns becoming an OrderedDict{Symbol, SQLColumn}. Accidentally, it will allow a column to have a FunSQL name that differs from its SQL name.

It is much harder to implement type resolution for arbitrary expressions. We could start by allowing the user to explicitly declare the result type of a particular function or an operator call.

[FEATURE] stable order

often times I'd like to add a top-level sort column but otherwise keep subordinate sorts

Adding multiple qualifiers to table results in joins being subqueries

Hi there,

Firstly just want to say a huge thank you for making this library. It really is the perfect solution to my current needs.

I am currently working on a project which uses multiple catalogues/schemas within a single datasource. When trying to implement this with the FunSQL library I get an unexpected behaviour for joins. Using the Join example found in the API reference for this library the following sql is rendered when no qualifiers are used on a SQLTable object:

using FunSQL: SQLTable, render, From, Join, Select, Get


person0 = SQLTable(:person, columns=[:person_id, :location_id])
location0 = SQLTable(:location, columns=[:location_id, :state])

query = From(:person) |> Join(:location => From(:location), Get.location_id .== Get.location.location_id) |> Select(Get.person_id, Get.location.state)

println(render(query, tables=[person0, location0]))

Output:

SELECT
  "person_1"."person_id",
  "location_1"."state"
FROM "person" AS "person_1"
JOIN "location" AS "location_1" ON ("person_1"."location_id" = "location_1"."location_id")

This is the expected behaviour, namely that the Join clause created joins directly to the table. This does not happen when using 2 or more qualifiers (the situation I find myself in: one for the DB and another for the schema).

person2 = SQLTable(:person, qualifiers=["EXAMPLE_DB", "EXAMPLE_SCHEMA"], columns=[:person_id, :location_id])
location2 = SQLTable(:location, qualifiers=["EXAMPLE_DB", "EXAMPLE_SCEHMA"], columns=[:location_id, :state])

println(render(query, tables=[person2, location2]))

Output:

SELECT
  "person_1"."person_id",
  "location_2"."state"
FROM "EXAMPLE_DB"."EXAMPLE_SCHEMA"."person" AS "person_1"
JOIN (
  SELECT
    "location_1"."location_id",
    "location_1"."state"
  FROM "EXAMPLE_DB"."EXMAPLE_SCHEMA"."location" AS "location_1"
) AS "location_2" ON ("person_1"."location_id" = "location_2"."location_id")

As you can see, whenever one uses 2+ qualifiers then the join clause is created using an inner/nested/sub query. While I am not very experienced in DB tuning, I don't think that is as efficient as a direct join.

I am also unsure if this expected behaviour for the library and perhaps there is a configuration setting I can change to amend the behaviour.

Thank you :)

Documentation code gives error for "CrossJoin"

When I run this line, copied from the documentation, error message appears => CrossJoin is undefined
using FunSQL: FunSQL, Agg, Append, As, Asc, Bind, CrossJoin, Define, Desc, Fun, From, Get, Group, Highlight, Iterate, Join, LeftJoin, Limit, Lit, Order, Partition, Select, Sort, Var, Where, With, WithExternal, render

`Get` attributes versus table attributes

Hello, I'm really glad this project is starting to exist. I've been checking out the documentation and trying to learn about it.

One part of the design that I'm curious about is I'm interested in why Get is used.

In the first query example, we have

const person =
    SQLTable(:person, columns = [:person_id, :year_of_birth, :location_id])
const location =
    SQLTable(:location, columns = [:location_id, :city, :state])
const visit_occurrence =
    SQLTable(:visit_occurrence, columns = [:visit_occurrence_id, :person_id, :visit_start_date])

q = person |>
    Where(Get.year_of_birth .<= 1950) |>
    Join(:location => location,
         on = Get.location_id .== Get.location.location_id) |>
    Where(Get.location.state .== "IL") |>
    Join(:visit_group => visit_occurrence |>
                         Group(Get.person_id),
         on = Get.person_id .== Get.visit_group.person_id,
         left = true) |>
    Select(Get.person_id,
           :max_visit_start_date =>
               Get.visit_group |> Agg.max(Get.visit_start_date))

Why is it designed to use Get.visit_start_date instead of visit_occurrence.visit_start_date as it would appear in SQL? Likewise, what is the advantage of an API that wants Get.location.location_id instead of location.location_id? Is this style inspired by some other query builder?

Reverse render: constructing query nodes from existing SQL string

One point of friction using FunSQL.jl is that I still often need to switch between two different syntax. I still need to write SQL in a query editor while debugging and performance testing, but I eventually need to translate to modular FunSQL.jl syntax in a package. I'm not sure how hard it would be at this stage to reverse the render function to generate the Julia object I want to use in my code... perhaps that kind of functionality is out of scope? I do think it would be really slick and it'd be really cool if this function could identify parts of a dialect that aren't currently supported.

do not collapse a `Group` node when an aggregate is used in `Bind`

using FunSQL: SQLTable, From, Select, Join, Where, Group, Define, Bind, Fun, Get, Agg, Var, render
visit_occurrence = SQLTable(:visit_occurrence, :visit_occurrence_id, :person_id, :visit_start_date, :visit_end_date)
condition_occurrence = SQLTable(:condition_occurrence, :condition_occurrence_id, :person_id, :condition_start_date, :condition_end_date)
ConditionAfter(person_id, date) =
    From(condition_occurrence) |>
    Where(Fun.and(Get.person_id .== Var.person_id, Get.condition_start_date .> Var.date)) |>
    Bind(:person_id => person_id, :date => date)
q = From(visit_occurrence) |>
    Group(Get.person_id) |>
    Where(Fun.exists(ConditionAfter(Get.person_id, Agg.max(Get.visit_start_date))))
print(render(q))
#=>
SELECT "visit_occurrence_1"."person_id"
FROM "visit_occurrence" AS "visit_occurrence_1"
GROUP BY "visit_occurrence_1"."person_id"
HAVING (EXISTS (
  SELECT NULL
  FROM "condition_occurrence" AS "condition_occurrence_1"
  WHERE (("condition_occurrence_1"."person_id" = "visit_occurrence_1"."person_id") AND ("condition_occurrence_1"."condition_start_date" > MAX("visit_occurrence_1"."visit_start_date")))
=#

Make built-in SQL functions and operators available in @funsql macro

For use with @funsql macro, FunSQL currently exports a small number of common SQL functions and operators:

Any SQL function not in these lists has to be declared like this:

const var"funsql_%" = FunSQL.Fun."%"
const funsql_array_get = FunSQL.Fun."?[?]"
const funsql_as_integer = FunSQL.Fun."(?::integer)"
const funsql_bool_and = FunSQL.Agg.bool_and
const funsql_regexp_matches = FunSQL.Fun.regexp_matches
const funsql_string_agg = FunSQL.Agg."string_agg(?, NULL ORDER BY ?)"

Ideally, FunSQL should provide all built-in functions and operators, for every supported database engine. Since every engine has its own list of supported functions, these declarations could be added to engine-specific modules:

using FunSQL, FunSQL.PostgreSQL
@funsql select(as_integer("0")) #-> SELECT ('0'::integer)

[FEATURE REQUEST] Pretty Printing of Rendered SQL

Currently, the SQL that gets printed looks like this:

julia> println(sql)
SELECT "person_1"."person_id", "person_1"."gender_concept_id", "person_1"."year_of_birth",
 "person_1"."month_of_birth", "person_1"."day_of_birth", "person_1"."birth_datetime", "per
son_1"."race_concept_id", "person_1"."ethnicity_concept_id", "person_1"."location_id", "pe
rson_1"."provider_id", "person_1"."care_site_id", "person_1"."person_source_value", "perso
n_1"."gender_source_value", "person_1"."gender_source_concept_id", "person_1"."race_source
_value", "person_1"."race_source_concept_id", "person_1"."ethnicity_source_value", "person
_1"."ethnicity_source_concept_id"
FROM "person" AS "person_1"   

Which is fine if you are not later interfacing with the SQL.
However, for printing to reports/tutorials/materials, the strings can come out looking very cluttered or hard to read.
It would be nice if there could be a display dispatch or something that takes a SQLStatement and could print it out like this:

SELECT "person_1"."person_id",
       "person_1"."gender_concept_id",
       "person_1"."year_of_birth",
       "person_1"."month_of_birth",
       "person_1"."day_of_birth",
       "person_1"."birth_datetime",
       "person_1"."race_concept_id",
       "person_1"."ethnicity_concept_id",
       "person_1"."location_id",
       "person_1"."provider_id",
       "person_1"."care_site_id",
       "person_1"."person_source_value",
       "person_1"."gender_source_value",
       "person_1"."gender_source_concept_id",
       "person_1"."race_source_value",
       "person_1"."race_source_concept_id",
       "person_1"."ethnicity_source_value",
       "person_1"."ethnicity_source_concept_id"
FROM "person" AS "person_1"

Is this something that could be implemented?
Thanks!

bound references are not rebased

using FunSQL: SQLTable, From, Define, Where, Get, render

person = SQLTable(:person, :person_id, :year_of_birth, :location_id)

ByYOB(yob) =
    (q = Define(:yob => Get.year_of_birth)) |>
    Where(q.yob .== yob)

q = From(person) |>
    ByYOB(1950)

print(render(q))
#=>
ERROR: GetError: cannot find yob in:
let person = SQLTable(:person, …),
    q1 = From(person),
    q2 = q1 |> Define(Get.year_of_birth |> As(:yob)),
    q3 = Define(Get.year_of_birth |> As(:yob)),
    q4 = q2 |> Where(Fun."=="(q3.yob, Lit(1950)))
    q4
end
=#

`Agg` should always require a `Group` node.

using FunSQL: SQLTable, From, Select, Get, Agg, render
person = SQLTable(:person, :person_id, :year_of_birth, :location_id)
q = From(person) |> Select(Get.person_id, Agg.count())
print(render(q))
#=>
SELECT "person_1"."person_id", COUNT(*) AS "count"
FROM "person" AS "person_1"
=#

This should instead raise an error.

Define: add new columns to a specific position

Currently, Define adds any new columns at the end of the current column list. This is fine as the default behavior, but it should be possible to insert them at the beginning or at any particular position.

Definitions that are replacing existing columns currently retain their positions. It should be possible to re-position them as if they are new columns.

Migrate documentation to @funsql macro

The @funsql macro was officially released in v0.12.0. Going forward, it should become a preferred method for assembling FunSQL queries. For this reason, the documentation, in particular, Usage Guide, API Reference and Examples, should be updated to use @funsql macro.

[BUG] Improper Handling of `datetime`-Typed Objects

Hi all,

Running into a strange type issue with datetimes.

Is your feature request related to a problem? Please explain.

I am using the Eunomia sqlite database for some prototyping of ideas and for some reason, the birth_datetime field is misinterpreted as a Float object.
Even if I tried to modify the query to Fun.datetime(Get.birth_datetime), this results in the objects being reported missing in my resulting dataframe.

To Reproduce

Julia Version (i.e. output of julia -v): 1.6.1

Operating system (Mac, Linux, Windows): macOS Catalina 10.15.7

FunSQL Version: FunSQL v0.8.0

Minimum working code example that led to bug:

using RCall
path_to_db = mktempdir() * "file.sqlite"

R"""
library("Eunomia")
Eunomia::getEunomiaConnectionDetails(databaseFile = $(path_to_db))
"""

person = SQLTable(
    :person,
    columns = [
        :person_id,
        :gender_concept_id,
        :year_of_birth,
        :month_of_birth,
        :day_of_birth,
        :birth_datetime,
        :race_concept_id,
        :ethnicity_concept_id,
        :location_id,
        :provider_id,
        :care_site_id,
        :person_source_value,
        :gender_source_value,
        :gender_source_concept_id,
        :race_source_value,
        :race_source_concept_id,
        :ethnicity_source_value,
        :ethnicity_source_concept_id,
    ],
)

sql =
    From(person) |>
    Select(
    	Get.person_id,
        Get.gender_concept_id,
        Get.year_of_birth,
        Get.month_of_birth,
        Get.day_of_birth,
        Get.birth_datetime,
        Get.race_concept_id,
        Get.ethnicity_concept_id,
        Get.location_id,
        Get.provider_id,
        Get.care_site_id,
        Get.person_source_value,
        Get.gender_source_value,
        Get.gender_source_concept_id,
        Get.race_source_value,
        Get.race_source_concept_id,
        Get.ethnicity_source_value,
        Get.ethnicity_source_concept_id,
    ) |>
    x -> render(x, dialect = :sqlite)

result = SQLite.DB(path_to_db) |> 
	 eunomia -> DBInterface.execute(eunomia, sql) |> DataFrame

Expected Behavior and Actual Behavior

I expected the datetime object to be converted to a Julia DateTime object in Julia.
Instead, the datetime is either converted to a Float or missing.

Bug: CTEs are not aliased when used in the query

In the compiled SQL string, the CTE tables are not aliased. This results in an invalid query when doing something like a self join.

Repro

using FunSQL:
    SQLTable, 
    From, Join, Get, As, Select, With,
    render;

const tab = 
    SQLTable(:foo, columns=[:x, :y])

q = From(:foo_plus) |>
    Join(:foo_plus_dup => From(:foo_plus), on = Get.x .== Get.foo_plus_dup.x2) |>
    With(:foo_plus => From(tab) |> Select(Get.x, Get.x .+ 1 |> As(:x2)))
print(render(q))

#=>
WITH "foo_plus_1" ("x", "x2") AS (
  SELECT
    "foo_1"."x",
    ("foo_1"."x" + 1) AS "x2"
  FROM "foo" AS "foo_1"
)
SELECT
  "foo_plus_1"."x",
  "foo_plus_1"."x2"
FROM "foo_plus_1"
JOIN "foo_plus_1" ON ("foo_plus_1"."x" = "foo_plus_1"."x2")
=#

Possible Fix

A From node referencing a CTE subquery is specialized to a FromReference node during the annotation pass. Looking at the assemble routine for FromTable, an alias is created everytime the node is encountered. We could port over the same logic to FromReference.

# current impl
function assemble(n::FromReferenceNode, refs, ctx)
    cte_a = ctx.cte_map[n.over]
    a = unwrap_repl(cte_a.a)

    # remove
    # c = FROM(over = ID(over = cte_a.schema, name = cte_a.name))
    # subs = make_subs(a, cte_a.name)

    # add
    alias = allocate_alias(ctx, cte_a.name)
    c = FROM(over = ID(over = cte_a.schema, name = alias))
    subs = make_subs(a, alias)

    trns = Pair{SQLNode, SQLClause}[]
    for ref in refs
        push!(trns, ref => subs[ref])
    end
    repl, cols = make_repl_cols(trns)
    return Assemblage(c, cols = cols, repl = repl)
end

I would commit a PR but I'm not yet comfortable with the Julia workflow.

add additional sort column

When constructing component queries, I sometimes would like to add an additional sort column to an existing sort order. However, currently it seems that I'm only able to overwrite sort columns. Conversely, it'd be nice to have a stable sort, which would do the opposite, add a high-level sort order but keeping existing subordinate sorts in place.

Empty catalog after connect to Azure SQL server using ODBC

I can connect to my server using

conn = DBInterface.connect(FunSQL.DB{ODBC.Connection}, 
    "Driver={ODBC Driver 18 for SQL Server};server=xxxx.database.windows.net;database=xxxx;UID=xxxx;Authentication=ActiveDirectoryInteractive;Encrypt=yes;", 
    dialect = :sqlserver)

, and can query the database OK with handwritten sql.

However, the FunSQL catalog object is not populated with any tables; so From(:xxsymbol) just throws a FunSQL.ReferenceError

All the table names in my database have a dot in them, I wonder if that has any impact?

[BUG] Using `DBInterface.execute` Does Not "Just Work" with MySQL

Hi all,

I seem to be having a strange issue where the following code does not work as expected:

From(tab) |> Select(Get.foo) |> Limit(1) |> q -> render(q, dialect = :mysql) |> x -> DBInterface.execute(conn, x) |> DataFrame

ERROR: MethodError: no method matching query(::MySQL.API.MYSQL, ::FunSQL.SQLString)
Closest candidates are:
  query(::MySQL.API.MYSQL, ::String) at ~/.julia/packages/MySQL/0vHyV/src/api/capi.jl:1219
Stacktrace:
 [1] execute(conn::MySQL.Connection, sql::FunSQL.SQLString, params::Tuple{}; mysql_store_result::Bool, mysql_date_and_time::Bool)
   @ MySQL ~/.julia/packages/MySQL/0vHyV/src/execute.jl:158
 [2] execute(conn::MySQL.Connection, sql::FunSQL.SQLString, params::Tuple{}) (repeats 2 times)
   @ MySQL ~/.julia/packages/MySQL/0vHyV/src/execute.jl:155
 [3] (::var"#8#10")(x::FunSQL.SQLString)
   @ Main ./REPL[16]:1
 [4] |>(x::FunSQL.SQLString, f::var"#8#10")
   @ Base ./operators.jl:966
 [5] (::var"#7#9")(q::FunSQL.SQLNode)
   @ Main ./REPL[16]:1
 [6] |>(x::FunSQL.SQLNode, f::var"#7#9")
   @ Base ./operators.jl:966
 [7] top-level scope
   @ REPL[16]:1

Where my connection looks like this:

conn = DBInterface.connect(MySQL.Connection, "my.host.com", "username", "password"; db = "database", port = 3306, unix_socket = MySQL.API.MYSQL_DEFAULT_SOCKET, client_flag=MySQL.API.CLIENT_MULTI_STATEMENTS, opts = Dict())

A workaround to this I found was doing the following:

From(tab) |> Select(Get.foo) |> Limit(1) |> q -> render(q, dialect = :mysql) |> x -> DBInterface.execute(conn, String(x)) |> DataFrame

Which returned the expected result.
This seems like a bug where FunSQL.SQLString is not accepted when I expected it to be.
Is this a bug?

Details:

FunSQL Version: [cf6cc811] FunSQL v0.10.1
Julia Version: 1.7.0
OS: Mac OSX

`Unselect()`

Sometimes it is easier to remove columns from the default output than to build the new output list. A new node, Unselect(), should do it.

Allow FunSQL "views" in SQLCatalog

The SQLCatalog mapping stores available SQL tables indexed by name. It is used for resolving query nodes From(name::Symbol).

We should extend SQLCatalog to allow entries that are arbitrary FunSQL queries. Such entries can be interpreted as CTEs. This will be useful, in particular, for composable ETL pipelines.

Support for DuckDB?

Hi @xitology and @clarkevans !

I was wondering, would it be possible to support DuckDB syntax here? I had read the syntax was very similar to PostgreSQL, so I ran OHDSICohortExpressions with the model flavor set to PostgreSQL but I got an error that looked like this

julia> DBInterface.execute(conn, queries[3] |> String)
ERROR: Binder Error: No function matches the given name and argument types '+(DATE, BIGINT)'. You might need to add explicit type casts.
        Candidate functions:
        +(TINYINT) -> TINYINT
        +(TINYINT, TINYINT) -> TINYINT
        +(SMALLINT) -> SMALLINT
        +(SMALLINT, SMALLINT) -> SMALLINT
        +(INTEGER) -> INTEGER
        +(INTEGER, INTEGER) -> INTEGER
        +(BIGINT) -> BIGINT
        +(BIGINT, BIGINT) -> BIGINT
        +(HUGEINT) -> HUGEINT
        +(HUGEINT, HUGEINT) -> HUGEINT
        +(FLOAT) -> FLOAT
        +(FLOAT, FLOAT) -> FLOAT
        +(DOUBLE) -> DOUBLE
        +(DOUBLE, DOUBLE) -> DOUBLE
        +(DECIMAL) -> DECIMAL
        +(DECIMAL, DECIMAL) -> DECIMAL
        +(UTINYINT) -> UTINYINT
        +(UTINYINT, UTINYINT) -> UTINYINT
        +(USMALLINT) -> USMALLINT
        +(USMALLINT, USMALLINT) -> USMALLINT
        +(UINTEGER) -> UINTEGER
        +(UINTEGER, UINTEGER) -> UINTEGER
        +(UBIGINT) -> UBIGINT
        +(UBIGINT, UBIGINT) -> UBIGINT
        +(DATE, INTEGER) -> DATE
        +(INTEGER, DATE) -> DATE
        +(INTERVAL, INTERVAL) -> INTERVAL
        +(DATE, INTERVAL) -> DATE
        +(INTERVAL, DATE) -> DATE
        +(TIME, INTERVAL) -> TIME
        +(INTERVAL, TIME) -> TIME
        +(TIMESTAMP, INTERVAL) -> TIMESTAMP
        +(INTERVAL, TIMESTAMP) -> TIMESTAMP
        +(TIME, DATE) -> TIMESTAMP
        +(DATE, TIME) -> TIMESTAMP
        +(ANY[], ANY[]) -> ANY[]

I could give my whole SQL expression for what I am trying but am curious if there'd ever be interest to support DuckDB. Thanks!

[FEATURE REQUEST] Analogous Functionality for `SELECT * FROM table` in `SQLTable`

Hi all,

I was experimenting with FunSQL today - super easy to work with by the way! - and ran into this personally irritating situation.
What do you all think?

Is your feature request related to a problem? Please explain.

Currently, I need to specify what are the exact columns I need to query when creating a SQLTable entity like so:

const person = SQLTable(:person, columns = [:person_id, :year_of_birth, :location_id])

I find having to define each column I need to be painful as sometimes I do not know all the columns for a given table easily when I write a query.

Describe the solution you'd like

Ideally, I would like to be able to provide nothing to the columns variable and have it default to select all columns in the query - analogous to SELECT * FROM person.
I was wondering if the syntax could like something like this:

const person = SQLTable(:person)

and a dispatch or underlying functionality to look like:

SQLTable(name; schema = nothing, columns = nothing) =
    SQLTable(schema = schema, name = name, columns = columns)

Describe alternatives you've considered

I could have the data dictionary at hand but that is quite tedious for data analysis on the fly.

Additional context

N/A

deduplicate aggregates

using FunSQL: SQLTable, From, Select, Join, Where, Group, Fun, Get, Agg, render
person = SQLTable(:person, :person_id, :year_of_birth, :location_id)
visit_occurrence = SQLTable(:visit_occurrence, :visit_occurrence_id, :person_id, :visit_start_date, :visit_end_date)
q = From(person) |>
    Join(:visit_group => From(visit_occurrence) |> Group(Get.person_id), on = Get.person_id .== Get.visit_group.person_id) |>
    Where(Agg.count(over = Get.visit_group) .>= 2) |>
    Select(Get.person_id, Agg.count(over = Get.visit_group))
print(render(q))
#=>
SELECT "person_1"."person_id", "visit_group_1"."count_2" AS "count"
FROM "person" AS "person_1"
JOIN (
  SELECT "visit_occurrence_1"."person_id", COUNT(*) AS "count_1", COUNT(*) AS "count_2"
  FROM "visit_occurrence" AS "visit_occurrence_1"
  GROUP BY "visit_occurrence_1"."person_id"
) AS "visit_group_1" ON ("person_1"."person_id" = "visit_group_1"."person_id")
WHERE ("visit_group_1"."count_1" >= 2)
=#

smarter Or

I find that I have to write the following for Or to be useful.

function build_or(items...)
    if length(items) == 0
        return @funsql(false)                       
    end
    if length(items) == 1
        return items[1]    
    end
    return @funsql(or($items...))
end

No way to use rowid

render(conn, From(:anytable) |> Select(:rowid))

produces an error: ReferenceError: cannot find rowid

render(conn, From(:anytable) |> Select(“rowid"))

Almost works but it produces SELECT ‘rowid’ AS “_”

Support for databases without parametrized queries

I wonder if it would be possible for FunSQL to support databases/engines that can execute SQL queries specified as regular strings, but don't have query parameter support.
A specific example that I played with and it didn't work is so-called "Virtual Observatory Table Access Protocol" (https://www.ivoa.net/documents/TAP/). There's my VirtualObservatory.jl package that implements DBInterface for such databases.

This is what I tried:

using VirtualObservatory
using FunSQL

conn = TAPService(:simbad)

# queries work same as elsewhere in dbinterface
execute(conn, "select top 10 * from basic")

# not 100% correct, just put something here for now
FunSQL.SQLDialect(::Type{TAPService}) = FunSQL.SQLDialect(:sqlite)

catalog = FunSQL.reflect(conn)

but it throws MethodError: no method matching prepare(::VirtualObservatory.TAPService, ::String). Of course, prepare could be trivially implemented as a no-op, but I see that the next line in reflect (https://github.com/MechanicalRabbit/FunSQL.jl/blob/master/src/reflect.jl#L90) executes the prepared query together with some parameter values. This part isn't natively supported by TAP.

Would be nice to be able to use FunSQL here!

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.