tconbeer / sqlfmt Goto Github PK
View Code? Open in Web Editor NEWsqlfmt formats your dbt SQL files so you don't have to
Home Page: https://sqlfmt.com
License: Apache License 2.0
sqlfmt formats your dbt SQL files so you don't have to
Home Page: https://sqlfmt.com
License: Apache License 2.0
Describe the bug
A clear and concise description of what the bug is.
It looks like whitespace is removed from {{ dbt_utils.group_by }}
, sticking it directly into the clause above (a where clause, for example), which breaks the query.
To Reproduce
Steps to reproduce the behavior. If this is a formatting issue, include the input code here.
If you sqlfmt
a file that looks like this:
select
...
from <table_name>
where <where_clause>
{{ dbt_utils.group_by(10) }}
Expected behavior
A clear and concise description of what you expected to happen. If this is a formatting issue, include the expected formatting here, by manually formatting your code from above.
select
...
from <table_name>
where <where_clause>
{{ dbt_utils.group_by(10) }}
Actual behavior
Provide any output generated by sqlfmt here. If this is a formatting issue, include the code generated by sqlfmt.
select
...
from <table_name>
where <where_clause>{{ dbt_utils.group_by(10) }}
Additional context
What is the output of sqlfmt --version
?
0.6.0
Describe the bug
sqlfmt should always produce the same output on a given query, even when run multiple times. When this is not the case, we call this "unstable formatting".
There is an interaction between jinja expressions (and jinjafmt) and operators that is causing unstable formatting.
To Reproduce
This file in the gitlab project has already been formatted by 0.8.0.
Run it through sqlfmt --diff
Expected behavior
sqlfmt should no-op
Actual behavior
--- source_query
+++ formatted_query
@@ -29,25 +29,38 @@
as dim_sales_qualified_source_id,
{{ get_keyed_nulls("order_type.dim_order_type_id") }} as dim_order_type_id
from {{ ref("sheetload_sales_funnel_targets_matrix_source") }}
- left join date on {{
+ left join
+ date
+ on
+ {{
sales_funnel_text_slugify(
"sheetload_sales_funnel_targets_matrix_source.month"
)
- }} = {{ sales_funnel_text_slugify("date.fiscal_month_name_fy") }}
- left join sales_qualified_source on {{
+ }}
+ = {{ sales_funnel_text_slugify("date.fiscal_month_name_fy") }}
+ left join
+ sales_qualified_source
+ on
+ {{
sales_funnel_text_slugify(
"sheetload_sales_funnel_targets_matrix_source.opportunity_source"
)
- }} = {{
+ }}
+ =
+ {{
sales_funnel_text_slugify(
"sales_qualified_source.sales_qualified_source_name"
)
}}
- left join order_type on {{
+ left join
+ order_type
+ on
+ {{
sales_funnel_text_slugify(
"sheetload_sales_funnel_targets_matrix_source.order_type"
)
- }} = {{ sales_funnel_text_slugify("order_type.order_type_name") }}
+ }}
+ = {{ sales_funnel_text_slugify("order_type.order_type_name") }}
),
fy22_user_hierarchy as (
@@ -94,11 +107,9 @@
from target_matrix
left join
fy22_user_hierarchy
- on {{ sales_funnel_text_slugify("target_matrix.area") }} = {{
- sales_funnel_text_slugify(
- "fy22_user_hierarchy.crm_opp_owner_area_stamped"
- )
- }}
+ on {{ sales_funnel_text_slugify("target_matrix.area") }}
+ =
+ {{ sales_funnel_text_slugify("fy22_user_hierarchy.crm_opp_owner_area_stamped") }}
where target_matrix.fiscal_year = 2022
union all
@@ -126,7 +137,8 @@
left join
fy23_and_beyond_user_hierarchy
on {{ sales_funnel_text_slugify("target_matrix.area") }}
- = {{
+ =
+ {{
sales_funnel_text_slugify(
"fy23_and_beyond_user_hierarchy.crm_opp_owner_sales_segment_geo_region_area_stamped"
)
Additional context
What is the output of sqlfmt --version
?
0.8.0
These currently raise a parsing error, but:
Given a column with the following sql
_airbyte_data:"AccountId"::varchar as accountid,
sqlfmt will add additional spaces around :
like so
_airbyte_data : "AccountId"::varchar as accountid,
Describe the bug
Sometimes aliases don't get merged onto the line above, even when there is space. Could we be using the token's raw prefix somewhere, instead of the computed prefix?
To Reproduce
from the gitlab project
), final AS (
SELECT DISTINCT
...
state_or_province AS geozone_state_or_province,
MIN(valid_from) OVER (PARTITION BY country,state_or_province,locality_group)::DATE AS first_file_date,
MAX(valid_to) OVER (PARTITION BY country,state_or_province,locality_group)::DATE AS last_file_date,
...
)
Expected behavior
),
final as (
select distinct
...
min(valid_from) over (
partition by country, state_or_province, locality_group
)::date as first_file_date,
max(valid_to) over (
partition by country, state_or_province, locality_group
)::date as last_file_date,
...
Actual behavior
),
final as (
select distinct
...
min(valid_from) over (
partition by country, state_or_province, locality_group
)::date
as first_file_date,
max(valid_to) over (
partition by country, state_or_province, locality_group
)::date
as last_file_date,
...
Additional context
What is the output of sqlfmt --version
?
Describe the bug
add support for the within group
keyword, which seems analagous to over
, e.g.,
array_agg(distinct my_field) within group (order by another_field)
This is necessary for proper formatting when this spills onto multiple lines, as below.
To Reproduce
from gitlab's project
, services_by_marketing_contact_id AS (
SELECT
marketing_contact_order.dim_marketing_contact_id AS dim_marketing_contact_id,
COUNT(*) AS pql_nbr_integrations_installed,
ARRAY_AGG(DISTINCT services.service_type) WITHIN GROUP (ORDER BY services.service_type) AS pql_integrations_installed
FROM services
LEFT JOIN project
ON services.project_id = project.dim_project_id
LEFT JOIN marketing_contact_order
ON marketing_contact_order.dim_namespace_id = project.dim_namespace_id
GROUP BY 1
), users_role_by_marketing_contact_id AS (
SELECT
marketing_contact_order.dim_marketing_contact_id,
ARRAY_AGG(DISTINCT marketing_contact.job_title) WITHIN GROUP (ORDER BY marketing_contact.job_title) AS pql_namespace_creator_job_description
FROM marketing_contact_order
...
Expected behavior
services_by_marketing_contact_id as (
select
marketing_contact_order.dim_marketing_contact_id as dim_marketing_contact_id,
count(*) as pql_nbr_integrations_installed,
array_agg(distinct services.service_type) within group (
order by services.service_type
) as pql_integrations_installed
from services
left join project on services.project_id = project.dim_project_id
left join
marketing_contact_order
on marketing_contact_order.dim_namespace_id = project.dim_namespace_id
group by 1
),
users_role_by_marketing_contact_id as (
select
marketing_contact_order.dim_marketing_contact_id,
array_agg(distinct marketing_contact.job_title) within group (
order by marketing_contact.job_title
) as pql_namespace_creator_job_description
from marketing_contact_order
...
Actual behavior
services_by_marketing_contact_id as (
select
marketing_contact_order.dim_marketing_contact_id as dim_marketing_contact_id,
count(*) as pql_nbr_integrations_installed,
array_agg(
distinct services.service_type) within group(order by services.service_type
) as pql_integrations_installed
from services
left join project on services.project_id = project.dim_project_id
left join
marketing_contact_order
on marketing_contact_order.dim_namespace_id = project.dim_namespace_id
group by 1
),
users_role_by_marketing_contact_id as (
select
marketing_contact_order.dim_marketing_contact_id,
array_agg(
distinct marketing_contact.job_title
) within group(order by marketing_contact.job_title
) as pql_namespace_creator_job_description
from marketing_contact_order
...
Additional context
What is the output of sqlfmt --version
?
Describe the bug
Jinjafmt adds a space between the macro name and its opening parens. Thankfully jinja doesn't seem to mind, but it's bad style.
To Reproduce
{{ dbt_utils.surrogate_key(['users.id', "'users'"]) }} as id,
Expected behavior
{{ dbt_utils.surrogate_key(['users.id', "'users'"]) }} as id,
Actual behavior
{{ dbt_utils.surrogate_key (['users.id', "'users'"]) }} as id,
Additional context
What is the output of sqlfmt --version
?
0.5.1
Describe the bug
The formatter is unstable with dbt config blocks.
Likely related to #175 but not sure.
To Reproduce
Start with a simple file with a config block definition, such as config_block.txt, and make a minor change. Run the formatter and then make another minor change and run the formatter again. You should see the indentation of the config variables increasing each time (4 more spaces).
Expected behavior
The formatter should produce a stable result.
Actual behavior
The formatter is unstable because it keeps indenting the config variables.
Additional context
sqlfmt, version 0.9.0
Also consider replacing ~
(jinja string concatenation) with +
or similar?
Search for an unused operator (+, -, *, /) and substitute that. black doesn't care that you can't actually divide strings.
Originally posted by @tconbeer in #177 (comment)
Describe the bug
sqlfmt doesn't support the ClickHouse dialect which is case sensitive.
To Reproduce
input code:
SELECT toString(1) AS Test_string, toDateTime64('2022-05-25', 3) AS Test_dateTime64, ifNull(null, 'TestNull') as testIf, JSONExtractString('{"abc": "hello"}', 'abc') as testJSON
Expected behavior
select toString(1) as Test_string, toDateTime64('2022-05-25', 3) as Test_dateTime64, ifNull(null, 'TestNull') as testIf, JSONExtractString('{"abc": "hello"}', 'abc') as testJSON
Actual behavior
select tostring(1) as test_string, todatetime64('2022-05-25', 3) as test_datetime64, ifnull(NULL, 'TestNull') as testif, jsonextractstring('{"abc": "hello"}', 'abc') as testjson
this output is not a valid ClickHouse code
Additional context
What is the output of sqlfmt --version
?
sqlfmt, version 0.9.0
Describe the bug
Sometimes sqlfmt increases the indentation on a mulitline jinja tag when it's run multiple times.
To Reproduce
with
base_spine as (
{{
dbt_utils.date_spine(
datepart="day",
start_date="'2021-01-01'::date",
end_date="sysdate() + interval '1 year'"
)
}}
),
Expected behavior
with
base_spine as (
{{
dbt_utils.date_spine(
datepart="day",
start_date="'2021-01-01'::date",
end_date="sysdate() + interval '1 year'"
)
}}
),
Actual behavior
with
base_spine as (
{{
dbt_utils.date_spine(
datepart="day",
start_date="'2021-01-01'::date",
end_date="sysdate() + interval '1 year'"
)
}}
),
Additional context
What is the output of sqlfmt --version
?
0.5.1
File line.py
has 785 lines of code (exceeds 250 allowed). Consider refactoring.
https://codeclimate.com/github/tconbeer/sqlfmt/src/sqlfmt/line.py#issue_61f46a9f597a09000100003e
Describe the bug
Looks like #138 should not have been closed.
I suspect this is happening when jinjafmt is enabled, but black is not able to parse the string, so we abort formatting; we don't detect this today and probably need to handle this case and not modify the indentation (same as other instances where jinjafmt no-ops)
To Reproduce
...
select
{{
dbt_utils.star(
from=ref('prep_crm_user'),
except=['CREATED_BY','UPDATED_BY','MODEL_CREATED_DATE','MODEL_UPDATED_DATE','DBT_UPDATED_AT','DBT_CREATED_AT']
)
}}
from {{ ref("prep_crm_user") }}
...
Expected behavior
This was produced by 0.8.0, so it should no-op
Actual behavior
...
select
{{
dbt_utils.star(
from=ref('prep_crm_user'),
except=['CREATED_BY','UPDATED_BY','MODEL_CREATED_DATE','MODEL_UPDATED_DATE','DBT_UPDATED_AT','DBT_CREATED_AT']
)
}}
from {{ ref("prep_crm_user") }}
...
Additional context
What is the output of sqlfmt --version
?
0.8.0
Describe the bug
Jinja doesn't like trailing commas in macro arguments (?) but black inserts them when the args wrap to multiple lines
To Reproduce
{%- macro my_macro(
arg_here,
kwarg1=true,
kwarg2="''",
kwarg3="model",
kwarg4="permanent",
kwarg5=256
) -%}
Expected behavior
{%- macro my_macro(
arg_here,
kwarg1=true,
kwarg2="''",
kwarg3="model",
kwarg4="permanent",
kwarg5=256
) -%}
Actual behavior
{%- macro my_macro(
arg_here,
kwarg1=true,
kwarg2="''",
kwarg3="model",
kwarg4="permanent",
kwarg5=256,
) -%}
Unfortunately this doesn't compile because of the trailing comma after 256
Additional context
What is the output of sqlfmt --version
?
0.6.0
Describe the bug
I believe that union
gets improperly indented. The first and second blocks are of the same "importance" but union
indents it because I assume it thinks it's a function and the opening parentheses ((
) at the start. I think this is a bug.
To Reproduce
Paste the query in http://sqlfmt.com/
Input code
(
select *
from "data_warehouse"."order_status"
where var1 is not null and var2 is not null
limit 100
)
union
(
select *
from "data_warehouse"."order_status"
where var1 is not null and var2 is not null
limit 100
)
Expected behavior
(
select *
from "data_warehouse"."order_status"
where var1 is not null and var2 is not null
limit 100
)
union
(
select *
from "data_warehouse"."order_status"
where var1 is not null and var2 is not null
limit 100
)
Actual behavior
(
select *
from "data_warehouse"."order_status"
where var1 is not null and var2 is not null
limit 100
)
union
(
select *
from "data_warehouse"."order_status"
where var1 is not null and var2 is not null
limit 100
)
Function _maybe_merge_lines_split_by_operators
has a Cognitive Complexity of 24 (exceeds 5 allowed). Consider refactoring.
https://codeclimate.com/github/tconbeer/sqlfmt/src/sqlfmt/merger.py#issue_61fc0af0ef6b070001000051
#98 should have been caught in CI, but wasn't, since our test env also installs black by default.
Poetry's next release will support dependency groups, which we can use to solve this. Just waiting for that new version to be released.
We should also use this opportunity to clean up the Lint and Test workflow so we only lint the code once.
Describe the bug
Today we don't split before a between's and, see #124
But this is wrong, we should still split before it, we should just merge it more aggressively than other ands.
To Reproduce
From Gitlab
Expected behavior
between namespace_snapshots.valid_from::date
and namespace_snapshots.valid_to_::date
Actual behavior
between namespace_snapshots.valid_from
::date and namespace_snapshots.valid_to_::date
Additional context
What is the output of sqlfmt --version
?
commit 88617d2 (0.10pre)
File line.py
has 785 lines of code (exceeds 250 allowed). Consider refactoring.
https://codeclimate.com/github/tconbeer/sqlfmt/src/sqlfmt/line.py#issue_61f46a9f597a09000100003e
While it's possible to run sqlfmt from the terminal inside VS Code, they support 3rd party formatters via their extensions API, and it would be great if we could support that.
The extension has to be written in Typescript.
There is a scaffold to get started: https://code.visualstudio.com/api/get-started/your-first-extension
As an example of a formatting extension, someone created this for black (before the official python extension added black support): https://github.com/joslarson/black-vscode
Describe the bug
jinjafmt passes jinja to black, but jinja allows some things that python does not. Specifically, variables (esp. macro arguments) can have names that are reserved words. For example:
{{ dbt_utils.star(from=ref('snowplow_page_views_30'), except=fields_to_exclude|upper) }}
This is valid jinja and dbt sql (from gitlab's project), but from
and except
are both reserved words in python. Running black -c "from"
returns error: cannot format <string>: Cannot parse: 1:4: from
Right now jinjafmt no-ops in these situations, but it wouldn't be that hard to replace these reserved words (followed by an eq, e.g. from=
) with a sentinel (e.g., from__sqlfmt__=
), format that, and then replace the sentinel with the keyword again. There are 33 reserved words in python
Describe the bug
When using the dbt_utils.unpivot SQL Generator after sqlfmt has run the compiled SQL is missing spaces around the union all between select statements. This does not happen if the statement is not put through sqlfmt first. A working format for the unpivot generator is provided under additional context
To Reproduce
sqlfmt . && dbt build -s <model using unpivot>
Expected behavior
When complied each union all should be separated:
WITH unpivoted AS (
select
cast('ID' as
varchar
) as column_name,
cast(
ID
as decimal) as percentage
from <database>.<schema>.<model>
union all
select
cast('USERNAME' as
varchar
) as column_name,
cast(
USERNAME
as decimal) as percentage
from <database>.<schema>.<model>
union all
select
cast('LAST_NAME' as
varchar
) as column_name,
cast(
LAST_NAME
as decimal) as percentage
from <database>.<schema>.<model>
union all
select ...
Actual behavior
with
unpivoted as (
select
cast('ID' as varchar) as column_name,
cast(
ID
as decimal
) as percentage
from <database>.<schema>.<model>union allselect
cast('USERNAME' as varchar) as column_name,
cast(
USERNAME
as decimal
) as percentage
from <database>.<schema>.<model>union allselect
cast('LAST_NAME' as varchar) as column_name,
cast(
LAST_NAME
as decimal
) as percentage
from <database>.<schema>.<model>union allselect...
Additional context
What is the output of sqlfmt --version
?
sqlfmt, version 0.6.0
Working format
WITH unpivoted AS (
{{ dbt_utils.unpivot(
relation = <model_name>,
cast_to = 'decimal',
field_name = 'column_name',
value_name = 'percentage'
) }}
)
SELECT
*
FROM
unpivoted
e.g., today this won't merge onto a single line, since it's too long, so we leave it:
foofoofoofoofoofoofoofoofoofoo
/ barbarbarbarbarbarbar
+ foofoofoofoofoofoofoofoofoofoo
/ bazbazbazbazbazbaz
+ foofoofoofoofoofoofoofoofoofoo
/ quxquxquxquxqux
but it would be better to merge the /
lines:
foofoofoofoofoofoofoofoofoofoo / barbarbarbarbarbarbar
+ foofoofoofoofoofoofoofoofoofoo / bazbazbazbazbazbaz
+ foofoofoofoofoofoofoofoofoofoo / quxquxquxquxqux
Primer should run against already-formatted repos.
semicolons end queries, so they always have a depth of zero
when we reach a semicolon, open_brackets
should be reset to []
Segment is a List[Line]
. Would really help with readability/maintenance of merger.py to have segment methods in a different class:
merger._split_into_segments
becomes segment.create_segments_from_lines
merger._tail_closes_head
becomes Segment.tail_closes_head
merger._get_first_nonblank_line
becomes Segment.head
(add a Segment.tail
also)merger._segment_continues_operator_sequence
becomes Segment.continues_operator_sequence
merger._get_remainder_of_segment
becomes Segment.split
We use bigquery, which supports column names in mixed case though doesn't allow names which differ only in case:
https://cloud.google.com/bigquery/docs/schemas#column_names
A column name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and it must start with a letter or underscore. Duplicate column names are not allowed even if the case differs.
When we started using bigquery, we settle on (perhaps misguided) approach of using camelCase for column names. Now it is too late to fix it as that convention is used in hundreds of column names. It works mostly fine - as bigquery persists the case of column name.
Currently sqlfmt lowercases column names. An alternative way to think is that column names are like variable names and it is responsibility of author to get it right - so sqlfmt should just leave it as is.
If it is really common for sqlfmt users to use inconsistent case, and so it is a desirable feature to convert it all to lowercase - perhaps considering an option to leave column names alone.
Today, sqlfmt does not do any formatting inside the curlies of a jinja tag. We should enforce a single space inside each curly, and use black to format the python between.
Hi, wanted to lead off by saying how excited we are by this tool. Thanks for your work on it. Now, to the issue at hand.
sqlfmt 0.4.2 formats this query with a line break between "cross" and "join":
select foo.*, bar.*
from {{ ref('foo_foo_foo') }} as foo cross
join {{ ref('bar_bar_bar') }} as bar
I would prefer to have cross join
together on the new line, like so:
select foo.*, bar.*
from {{ ref('foo_foo_foo') }} as foo
cross join {{ ref('bar_bar_bar') }} as bar
Feel free to close the issue if this is intended behavior, but I found it surprising and suspect it might be a bug.
Describe the bug
When using between ___ and ____
, the line is always split before the and
To Reproduce
Use the between
operator in a where
clause, such as the following:
select
radio,
mcc,
net as mnc,
area as lac,
cell % 65536 as cid,
cell / 65536 as rnc,
cell as long_cid,
lon,
lat
from
towershift
where
radio != 'CDMA'
and mcc between 200 and 799
and net between 1 and 999
and area between 0 and 65535
and cell between 0 and 268435455
and lon between -180 and 180
and lat between -90 and 90
Expected behavior
select
radio,
mcc,
net as mnc,
area as lac,
cell % 65536 as cid,
cell / 65536 as rnc,
cell as long_cid,
lon,
lat
from towershift
where
radio != 'CDMA'
and mcc between 200 and 799
and net between 1 and 999
and area between 0 and 65535
and cell between 0 and 268435455
and lon between -180 and 180
and lat between -90 and 90
Actual behavior
select
radio,
mcc,
net as mnc,
area as lac,
cell % 65536 as cid,
cell / 65536 as rnc,
cell as long_cid,
lon,
lat
from towershift
where
radio != 'CDMA'
and mcc between 200
and 799
and net between 1
and 999
and area between 0
and 65535
and cell between 0
and 268435455
and lon between -180
and 180
and lat between -90
and 90
Additional context
% sqlfmt --version
sqlfmt, version 0.5.0
sqlfmt -l 80 test.sql
1 file formatted.
0 files left unchanged.
test.sql formatted.
Consider simplifying this complex logical expression.
https://codeclimate.com/github/tconbeer/sqlfmt/src/sqlfmt/splitter.py#issue_61f46a9f597a090001000042
sqlfmt does not have many configuration options, but it would be nice to be able to set the config we do have in a pyproject.toml file in the root of a project directory
Describe the bug
It's possible for a single token to be composed of multiple words, like union all
, select top 25
, not in
, etc. We parse those tokens even if they contain newlines or extra whitespace, like union\n\n all
, but we don't then remove those extra characters from the formatted output
To Reproduce
union
all
Expected behavior
union all
Actual behavior
union
all
Additional context
What is the output of sqlfmt --version
?
0.9.0
Describe the bug
A clear and concise description of what the bug is.
To Reproduce
{{- return(adapter.dispatch('my_macro', 'my_package')(arg1, arg2)) -}}
Expected behavior
{{- return(adapter.dispatch('my_macro', 'my_package')(arg1, arg2)) -}}
Actual behavior
This is not valid jinja, and won't compile
{{- return adapter.dispatch('my_macro', 'my_package')(arg1, arg2) -}}
Additional context
What is the output of sqlfmt --version
?
0.6.0
Describe the bug
We support common jinja blocks, but we forgot about {% snapshot my_snapshot %}
, so the {% endsnapshot %}
tag can get put on the wrong line.
To Reproduce
{% snapshot snp_my_snapshot %}
{{
config(
target_database='analytics',
target_schema=target.schema + '_snapshots',
unique_key='id',
strategy='timestamp',
updated_at='updated_at',
)
}}
select * from {{ ref('stg_my_model') }}
{% endsnapshot %}
Expected behavior
{% snapshot snp_my_snapshot %}
{{
config(
target_database='analytics',
target_schema=target.schema + '_snapshots',
unique_key='id',
strategy='timestamp',
updated_at='updated_at',
)
}}
select *
from {{ ref('stg_my_model') }}
{% endsnapshot %}
Actual behavior
{% snapshot snp_my_snapshot %}
{{
config(
target_database='analytics',
target_schema=target.schema + '_snapshots',
unique_key='id',
strategy='timestamp',
updated_at='updated_at',
)
}}
select *
from {{ ref('stg_my_model') }}{% endsnapshot %}
Additional context
What is the output of sqlfmt --version
? 0.5.1
other tools support this; would be helpful to avoid formatting the target
and dbt_packages
directories in a dbt project
Describe the bug
If an operator ends up on a line by itself (after splitting), we don't try hard enough to merge the segment after that operator into the operator's line, which can result in ugly output (for somewhat pathological input)
To Reproduce
From gitlab's project:
{{column|lower}} ||
ENCRYPT_RAW(
to_binary('{{ get_salt(column|lower) }}', 'utf-8'),
to_binary('{{ env_var("SALT_PASSWORD") }}', 'HEX'),
to_binary('416C736F4E637265FFFFFFAB', 'HEX')
)
Expected behavior
{{ column | lower }} || encrypt_raw(
to_binary('{{ get_salt(column|lower) }}', 'utf-8'),
to_binary('{{ env_var("SALT_PASSWORD") }}', 'HEX'),
to_binary('416C736F4E637265FFFFFFAB', 'HEX')
)
Actual behavior
{{ column | lower }} ||
encrypt_raw(
to_binary('{{ get_salt(column|lower) }}', 'utf-8'),
to_binary('{{ env_var("SALT_PASSWORD") }}', 'HEX'),
to_binary('416C736F4E637265FFFFFFAB', 'HEX')
)
Additional context
What is the output of sqlfmt --version
?
Commit 88617d2 (v0.10)
Describe the bug
Reported by @Rainymood on discourse.
We're not very smart about building a hierarchy of chained boolean expressions; mixing simple expressions with complex expressions can produce unexpected and hard-to-read formatting.
To Reproduce
select *
from historic_orders h
where 1 = 1
and order_type = 'clothing'
and category = "pants"
and dt_utc > date('2019-10-01')
and is_test is false
Expected behavior
select *
from historic_orders h
where
1 = 1
and order_type = 'clothing'
and category = "pants"
and dt_utc > date('2019-10-01')
and is_test is false
Actual behavior
select *
from historic_orders h
where
1 = 1 and order_type = 'clothing' and category = "pants" and dt_utc > date(
'2020-10-01'
) and is_test is false
Additional context
What is the output of sqlfmt --version
?
0.8.0
Consider simplifying this complex logical expression.
https://codeclimate.com/github/tconbeer/sqlfmt/src/sqlfmt/line.py#issue_61f46a9f597a09000100003d
Describe the bug
The --fmt: off
tags should stop all formatting of the contained code, but today that is not the case, since comments may be rendered differently in those blocks
To Reproduce
-- fmt: off
where something_long = something_else_long -- and there is a long comment that should stay on this line
-- fmt: on
Expected behavior
-- fmt: off
where something_long = something_else_long -- and there is a long comment that should stay on this line
-- fmt: on
Actual behavior
-- fmt: off
-- and there is a long comment that should stay on this line
where something_long = something_else_long
-- fmt: on
Additional context
What is the output of sqlfmt --version
?
0.5.1
?
, the "existence" operator in pgMaybe many others here: https://www.postgresql.org/docs/current/functions.html
Describe the feature
When we have multiple queries in a single file, it would be nice if those queries were formatted the same as if they were in their own files. These queries may be separated by semicolons or set operations, like union
Desired behavior
select * from my_table
union all
select * from another_table
;
select * from another_query_entirely
Current behavior
select *
from my_table
union all
select *
from another_table
;
select *
from another_query_entirely
Additional context
What is the output of sqlfmt --version
?
0.5.1
Describe the bug
sqlfmt breaks placeholders used by psycopg (python postgres driver), such as %s
and %(name)s
To Reproduce
Format the following code:
INSERT INTO some_table VALUES (%s)
INSERT INTO some_table VALUES (%(foo)s)
Expected behavior
The placeholders should remain intact:
insert into some_table values(%s)
insert into some_table values(%(foo)s)
Actual behavior
sqlfmt inserts extra spaces:
insert into some_table values(% s)
insert into some_table values(% (foo) s)
Additional context
What is the output of sqlfmt --version
?
sqlfmt, version 0.9.0
helpful for debugging unstable formatting
-k/--reset-cache
When doing a first-time, fresh install of sqlfmt
, I get a ModuleNotFoundError
for black
.
I added shandy-sqlfmt==0.4.2
to the requirements.txt
in my dbt project, and—in a clean virtualenv—ran pip install -r requirements.txt
.
(I know the recommended install method is pipx
, but since pip
is given as an option, I figured I'd log this FWIW.)
pip install black
in the virtualenvadditional_dependencies
line to .pre-commit-config.yaml
:- repo: https://github.com/tconbeer/sqlfmt
rev: v0.4.2
hooks:
- id: sqlfmt
language_version: python
additional_dependencies: ['black']
args: [--line-length=90]
sqlfmt
works 👍
Some jinja expressions contain keywords, but they're all just formatted like names.
For example, {{ dbt_utils.group_by(n) }}
tends to get wrapped onto the line above it. sqlfmt doesn't have any way to know that the macro compiles to a keyword, so it wants to one-line it with the end of the where
statement. If sqlfmt knew this compiled to GROUP BY, it would not do that.
I've thought about this before, and I don't have a great solution except for hard-coding it in as an exception (it's one of very few macros that start with a keyword).
other than the hacky/hard-coded method, we could consider an inline comment that could tell sqlfmt how to treat a jinja expression (if the default treatment doesn't work). So something like:
{{ dbt_utils.group_by(6) }} -- fmt: group by
If anyone else has solutions/ideas, I'm open to them, but I do not want to go down the sqlfluff road and jinja-template the sql.
For now, I'd suggest either 1) using a -- fmt: off
comment on this line, or b) not using dbt_utils.group_by()
and instead implement a macro called range
:
{%- macro range(n) -%}
{% for i in range(1, n + 1) -%}
{{ i }}{{ ',' if not loop.last }}
{%- endfor -%}
{%- endmacro -%}
and then call it with:
...
where my_condition
group by {{ range(10) }}
Originally posted by @tconbeer in #165 (comment)
Describe the bug
This is ugly. We force merge the lines that have function calls, since they open brackets, but we probably shouldn't, so that all the sibling operators stay at the same level.
The new rule should be: don't stubborn-merge p1 operators that open brackets if the previous segment also starts with an operator. Also, consider that p1 operators might need a different sequence of stubborn-merge strategies from p0 operators.
To Reproduce
From gitlab's project (indented a few more times, so it doesn't fit on a single line):
SELECT
LOWER(table_catalog)||'.'||LOWER(table_schema)||'.'||LOWER(table_name) AS fqd_name,
...
Expected behavior
select
lower(table_catalog)
|| '.'
|| lower(table_schema)
|| '.'
|| lower(table_name) as fqd_name,
Actual behavior
select
lower(table_catalog)
|| '.' || lower(
table_schema
)
|| '.'
|| lower(table_name) as fqd_name,
Additional context
What is the output of sqlfmt --version
?
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.