Comments (4)

nicor88 avatar nicor88 commented on May 28, 2024

I like the proposed naming, just some more suggestions:

  • table_type: can be external (the default; "external" is the term used in Hive, and we are creating external tables) or iceberg (which is managed).
    Regarding format: with table_type=iceberg we cannot support all formats, only parquet/orc I guess.
    I was also thinking about that when I saw your PR about iceberg.

from dbt-athena.

Jrmyy avatar Jrmyy commented on May 28, 2024

For table_type, the idea is just to stick to the table_type property in the CTAS definition. So you mean then we will need to map external to hive, right?

pixie79 avatar pixie79 commented on May 28, 2024

How does that affect things if table_type is not set?
Currently we are using format=JSON as a slightly hacky but easy way to export hourly data (with incremental) and write it to a partner S3 bucket.
I.e., we have our normal dbt models, and then some partner product models which are really export views of the last hour/24 hours of data. These then get written as JSON via dbt and Athena to the partner's specific bucket.

This has two nice side benefits (even if it is a bit hacky). Support can use Athena to quickly check what data was sent, since it is a registered table and they are querying the actual data. The engineers only need to worry about dbt for the code, which is the same place as the rest of the pipeline.

nicor88 avatar nicor88 commented on May 28, 2024

So you mean then we will need to map external to hive, right?

@Jrmyy So pretty much we have external tables (HIVE) and non-external tables (ICEBERG). I was not aware that Athena already had the concept of table_type. That said, I agree with you: let's stick with that and with the Athena values. Please ignore my comment about having external as a table_type, since using HIVE as the table type already means that it is an external table.

How does that affect things if table_type is not set?

@pixie79 we will have a default behavior, HIVE, that pretty much leaves the current setup unchanged. That means you still need to set a format explicitly in your models if you want to use JSON.
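
To make the behavior discussed in this thread concrete, here is a minimal Python sketch of how the defaulting and the format check could work. It is not the adapter's actual code: the helper name resolve_table_config, the SUPPORTED_FORMATS mapping, and the fallback to parquet when no format is given are all assumptions made for illustration, and the parquet/orc restriction for iceberg simply follows the guess made earlier in the thread.

```python
from typing import Optional

# Assumption: illustrative format lists. The iceberg entry follows the
# thread's guess that only parquet/orc would be supported.
SUPPORTED_FORMATS = {
    "hive": {"parquet", "orc", "json", "avro", "textfile"},
    "iceberg": {"parquet", "orc"},
}

# Per the thread: an unset table_type falls back to hive.
DEFAULT_TABLE_TYPE = "hive"


def resolve_table_config(table_type: Optional[str] = None,
                         file_format: Optional[str] = None) -> dict:
    """Hypothetical helper: return the effective table_type and format."""
    effective_type = (table_type or DEFAULT_TABLE_TYPE).lower()
    if effective_type not in SUPPORTED_FORMATS:
        raise ValueError(f"Unknown table_type: {table_type!r}")

    # Assumption: parquet is used when the model sets no format at all.
    effective_format = (file_format or "parquet").lower()
    if effective_format not in SUPPORTED_FORMATS[effective_type]:
        raise ValueError(
            f"format={effective_format!r} is not supported "
            f"with table_type={effective_type!r}"
        )
    return {"table_type": effective_type, "format": effective_format}


# A model that only sets format=JSON keeps working as a hive table.
print(resolve_table_config(file_format="JSON"))
```

Under these assumptions, a model with only format=JSON resolves to a hive table and is unaffected, while combining table_type=iceberg with format=json is rejected up front.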
