Comments (4)
I like the proposed naming, just some more suggestions:
- table_type: can be external (the default; external is the name used in Hive, and we are creating external tables) or iceberg (which is managed)
Then, regarding format: when table_type=iceberg we cannot support every format, only parquet/orc I guess.
I was also thinking about that when I saw your PR about iceberg.
from dbt-athena.
For table_type, the idea is just to stick to the table_type property in the CTAS definition. So you mean we would then need to map external to hive, right?
from dbt-athena.
How does that affect things if table_type is not set?
Currently we are using format=JSON as a slightly hacky but easy way to export hourly data (with incremental) to a partner S3 bucket.
i.e. we have our normal dbt models, and then some partner product models that are really export views of the last hour/24 hours of data. These get written as JSON via dbt and Athena to the partner's specific bucket.
This has two nice side benefits (if a bit hacky). Support can use Athena to quickly check what data was sent, since it is a registered table that queries the actual data. The engineers only need to worry about dbt for the code, which is the same place as the rest of the pipeline.
from dbt-athena.
So you mean then we will need to map external to hive right?
@Jrmyy So pretty much we have external tables (HIVE) and non-external tables (ICEBERG). I was not aware that Athena already had the concept of table_type. That said, I agree with you: let's stick with that and with the Athena values. Please ignore my comment about having external as a table_type, since using HIVE as the table type already means it is an external table.
How does that affect things if table_type is not set?
@pixie79 we will have a default behavior, HIVE, that pretty much leaves the current setup unchanged. That means you still need to set a format explicitly in your models if you want to use JSON.
from dbt-athena.
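The behavior discussed in this thread (table_type defaulting to hive when unset, and iceberg accepting only a subset of formats) could be sketched roughly as below. This is only an illustration, not dbt-athena's implementation: `resolve_table_config`, `DEFAULT_TABLE_TYPE` and `ICEBERG_FORMATS` are hypothetical names, and the parquet/orc restriction is the guess made in the first comment rather than a confirmed rule.

```python
from typing import Optional

# Hypothetical names and values for illustration; not taken from dbt-athena.
DEFAULT_TABLE_TYPE = "hive"
# Assumption from the thread: iceberg would be limited to parquet/orc.
ICEBERG_FORMATS = {"parquet", "orc"}


def resolve_table_config(table_type: Optional[str] = None,
                         file_format: Optional[str] = None) -> dict:
    """Return the effective table_type/format pair for a model config."""
    # An unset table_type falls back to hive, leaving existing models unchanged.
    effective_type = (table_type or DEFAULT_TABLE_TYPE).lower()
    if effective_type not in ("hive", "iceberg"):
        raise ValueError(f"unsupported table_type: {table_type!r}")

    # The format is only passed through; None means "not set in the model".
    effective_format = file_format.lower() if file_format else None
    if (effective_type == "iceberg"
            and effective_format is not None
            and effective_format not in ICEBERG_FORMATS):
        raise ValueError(
            f"format {file_format!r} is not allowed with table_type=iceberg"
        )
    return {"table_type": effective_type, "format": effective_format}


# A model that only sets format=JSON keeps working as a hive table.
print(resolve_table_config(file_format="JSON"))
```

Under these assumptions, the JSON export models described above would need no change, while a model combining table_type=iceberg with format=JSON would be rejected early.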