Giter Club home page Giter Club logo

Comments (7)

amardeep avatar amardeep commented on May 30, 2024 1

Maybe my description wasn't clear - but disabling formatting doesn't work - as we use camelCase column names everywhere.

Different databases has different notion of case sensitivity for column names:
https://seeq.atlassian.net/wiki/spaces/KB/pages/443088907/SQL+Column+Names+and+Case+Sensitivity

For bigquery - it preserves case in column names. As bigquery preserves case, it would be nice if sqlfmt doesn't insist on making it lowercase. It is somewhat similar to the other bug #193.

It is totally fine to be opinionated and choose a single style for formatting and sql keywords - but insisting on column names be lowercase makes it unusable for anyone working with dbs where column name case is preserved and have chosen to use camelCase column names.

from sqlfmt.

rcaddell avatar rcaddell commented on May 30, 2024

You can disable formatting on those items

If you would like sqlfmt to ignore a file, or part of a file, you can add -- fmt: off and -- fmt: on comments to your code (or # fmt: off on MySQL or BigQuery). sqlfmt will not change any code between those comments; a single -- fmt: off at the top of a file will keep the entire file intact.

We implemented sqlfmt for this very reason, a mix of CamelCase, UPPERCASE and lowercase users, amongst other styling differences.

from sqlfmt.

tconbeer avatar tconbeer commented on May 30, 2024

Thanks for opening an issue, @amardeep, and thanks for chiming in, Randy.

A few comments from me to explain where we are, how we got here, and what we can do about it:

  1. I do have a preference for snake_case for all keywords and identifiers, and I believe this preference is shared by most data people I know. But I realize that naming and style conventions aren't always under our users' control, and this preference ideally would not create a barrier to adopting sqlfmt (I think generally, line breaks and indentation are a bigger benefit than consistent capitalization)
  2. For occasional deviations from snake_case, you can (and should) quote your identifiers. sqlfmt will never change anything about the formatting of a quoted identifier. So if your project is mostly snake_case but you have a few tables that use camelCase, spaces, or anything else, just quote it (double quotes in most dialects; backticks in BQ and MySql). Of course, if your entire project/team/company use camelCase everywhere, this is probably a nonstarter. But `myField` is easier and better than # fmt: off
  3. In BigQuery, table names are case-sensitive, but column names are not. I understand that if you are creating tables/views/etc. from your select statements and you want those relations to have camelCase fields, then you don't want those names to be lowercased, but right now sqlfmt does not produce invalid sql by lowercasing your query (unlike in #193 , where we create syntax errors)
  4. Lowercasing all keywords, and leaving identifiers intact, is very challenging and much more complex than the current application can handle. We intentionally parse a minimum subset of sql keywords to make sqlfmt compatible with every dialect. This means we're mostly parsing only the most common, reserved keywords (like select). We don't parse any function names (not event sum or count), nor do we have any way of distinguishing between table names, field names, aliases, functions, macros, or other keywords. My proposed fix to #193 is "only lowercase it if we recognize it, otherwise leave it be." That's because in ClickHouse, function names, identifiers, and aliases can all be case-sensitive, and right now I'm not willing to maintain a list of every non-reserved keyword in every dialect, and add the complexity of parsing whether a given token is an overloaded identifier or an actual function name or keyword (and what about UDFs?).

Which is all a long way of saying, once #193 is fixed, you can run your project with -- dialect clickhouse and preserve your camelCase field names. But you may also see SUM instead of sum and other undesirable side effects.

If anyone reading this wants to propose and contribute another solution, I'm open to it, but I don't think I'll be building and maintaining this myself. Going to leave this issue open for further discussion, though.

from sqlfmt.

amardeep avatar amardeep commented on May 30, 2024

Thanks for understanding Ted. You are right - that bigquery column names are not case sensitive, and so the query produced by sqlfmt is not invalid. And going with snake case would have been a better option. Though we can't change that now - so stuck with camelCase and those column names are much harder to read when all lowercased.

I understand that properly implementing case preservation will make the code much more comples. Clickhouse dialect option will definitely help. Sqlfmt is mostly attractive for us for its consistent formatting especially with jinja templates and dbt. Without dbt/jinja, jetbrains datastudio did a good enough job, but it fails with jinja. Changing case is not actually that useful for our use case - if there was a switch to not do any case changes - that would work too - as datastudio or other tools can handle case changes when needed. I will try with clickhouse dialect once that is released.

from sqlfmt.

nfcampos avatar nfcampos commented on May 30, 2024

FYI BigQuery JSON access is indeed case-sensitive, see here https://cloud.google.com/bigquery/docs/reference/standard-sql/json-data ie. if your JSON object contains a property named aProperty trying to access aproperty won't retrieve its value

from sqlfmt.

tconbeer avatar tconbeer commented on May 30, 2024

@nfcampos interesting. Can you quote JSON property names using the field access operator, like my_json_field."aProperty" or my_json_field.`aProperty` ?

from sqlfmt.

nfcampos avatar nfcampos commented on May 30, 2024

yea that's right, all 4 of these are equivalent (summary being a field of JSON type)
select summary.lastTouched, `summary`.`lastTouched`, summary.`lastTouched`, `summary`.lastTouched
This however is an error
select `summary.lastTouched`
And so is this
select summary."lastTouched"

from sqlfmt.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.