dbt-athena's Introduction

dbt-athena's People

Contributors

cmcarthur, daha, dandandan, drewbanin

dbt-athena's Issues

dbt docs generate fails with "Column 'table_schema' is ambiguous"

When using the out-of-the-box project from dbt init with dbt-athena (git+https://github.com/daha/[email protected]), dbt run succeeds but dbt docs generate fails. Output copied from PowerShell on Windows 10:

PS C:\athena-test> dbt run
Running with dbt=0.17.0
Found 2 models, 4 tests, 0 snapshots, 0 analyses, 134 macros, 0 operations, 0 seed files, 0 sources

15:42:52 | Concurrency: 8 threads (target='dev')
15:42:52 |
15:42:52 | 1 of 2 START table model dev.my_first_dbt_model...................... [RUN]
15:42:58 | 1 of 2 OK created table model dev.my_first_dbt_model................. [OK in 6.29s]
15:42:58 | 2 of 2 START view model dev.my_second_dbt_model...................... [RUN]
15:43:03 | 2 of 2 OK created view model dev.my_second_dbt_model................. [OK in 4.48s]
15:43:03 |
15:43:03 | Finished running 1 table model, 1 view model in 16.43s.

Completed successfully

Done. PASS=2 WARN=0 ERROR=0 SKIP=0 TOTAL=2
PS C:\athena-test> dbt docs generate
Running with dbt=0.17.0
Found 2 models, 4 tests, 0 snapshots, 0 analyses, 134 macros, 0 operations, 0 seed files, 0 sources

15:43:21 | Concurrency: 8 threads (target='dev')
15:43:21 |
15:43:21 | Done.
15:43:21 | Building catalog
Encountered an error while generating catalog: Runtime Error
Runtime Error
SYNTAX_ERROR: line 42:24: Column 'table_schema' is ambiguous
dbt encountered 1 failure while writing the catalog
15:43:22 | Catalog written to C:\athena-test\target\catalog.json
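
For readers unfamiliar with this class of error: it typically appears when a query joins two relations that both expose a column with the same name and then references that column without a table qualifier. The sketch below reproduces the pattern with SQLite rather than Athena, and the table names `tables_info` and `columns_info` are hypothetical stand-ins, not the adapter's actual catalog query.

```python
import sqlite3

# In-memory database with two hypothetical tables that share column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table tables_info (table_schema text, table_name text);
    create table columns_info (table_schema text, table_name text, column_name text);
""")


def try_query(sql):
    """Return 'ok' if the query runs, otherwise the engine's error message."""
    try:
        conn.execute(sql).fetchall()
        return "ok"
    except sqlite3.OperationalError as exc:
        return str(exc)


# Unqualified: both joined tables expose table_schema, so the name is ambiguous.
ambiguous = try_query("""
    select table_schema, column_name
    from tables_info
    join columns_info on tables_info.table_name = columns_info.table_name
""")

# Qualified with the table name: the reference resolves to exactly one column.
qualified = try_query("""
    select tables_info.table_schema, columns_info.column_name
    from tables_info
    join columns_info on tables_info.table_name = columns_info.table_name
""")

print(ambiguous)
print(qualified)
```

If the catalog query has the same shape, qualifying each `table_schema` reference with its table or alias would be the likely fix, though that is an assumption until the generated SQL is inspected.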

Status of the adapter

Thanks for the great work on this!
I'm very keen to use dbt at our company, and we're using Athena in production.
I gave the adapter a try using pip install dbt==1.14.0 && pip install git+https://github.com/Dandandan/dbt-athena.git

However, running dbt debug, I get:
Profile loading failed for the following reason: Runtime Error Credentials in profile "athena_data_warehouse", target "awscatalog" invalid: Runtime Error Could not find adapter type athena!

The profile and project YAML files have the relevant database/schema settings and pass debug.

I realise the repo does say 'work-in-progress'. Should this be functional right now, or should I check back on progress later?

Thanks again!
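
A quick way to narrow down a "Could not find adapter type" error is to check whether the adapter module is importable in the same Python environment that runs dbt. The sketch below assumes that dbt looks adapters up as `dbt.adapters.<type>`; that lookup path is an assumption here, and `adapter_installed` is a hypothetical helper, not part of dbt.

```python
import importlib.util


def adapter_installed(adapter_type: str) -> bool:
    """Return True if a module named dbt.adapters.<adapter_type> can be located."""
    try:
        return importlib.util.find_spec(f"dbt.adapters.{adapter_type}") is not None
    except ImportError:
        # Raised when a parent package (dbt or dbt.adapters) cannot be imported.
        return False


print("athena adapter importable:", adapter_installed("athena"))
```

If this prints False, the pip install of the adapter most likely went into a different environment than the one dbt runs from, or the installed package does not provide that module.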

Catalog does not exist

Hi, thanks for the great work on this adapter!

When I try to run dbt run, I get:

Running with dbt=0.15.1
Found 1 model, 0 tests, 0 snapshots, 0 analyses, 128 macros, 0 operations, 0 seed files, 0 sources

Encountered an error:
Runtime Error
  SYNTAX_ERROR: line 2:10: Catalog testing does not exist

Here is the dbt_project.yml:

name: 'test_athena'
version: '0.0.1'
profile: 'athena'
source-paths: ["models"]
analysis-paths: ["analysis"] 
test-paths: ["tests"]
data-paths: ["data"]
macro-paths: ["macros"]

And this is profiles.yml:

athena:
  outputs:
    dev:
      type: athena
      threads: 1
      database: testing
      region_name: ap-southeast-1
      schema: testing
      s3_staging_dir: s3://test_staging_dbt
  target: dev

And these are the dbt debug results:

dbt debug
Running with dbt=0.15.1
dbt version: 0.15.1
python version: 3.7.4
python path: /home/user/python/dbt_impl/bin/python3
os info: Linux-4.15.0-88-generic-x86_64-with-debian-buster-sid
Using profiles.yml file at /home/user/.dbt/profiles.yml

Configuration:
  profiles.yml file [OK found and valid]
  dbt_project.yml file [OK found and valid]
  profile: athena [OK found]
  target: dev [OK found]

Required dependencies:
 - git [OK found]

Connection:
  s3_staging_dir: s3://test_staging_dbt
  database: testing
  schema: testing
  region_name: ap-southeast-1
  Connection test: OK connection ok

As far as I know, the catalog corresponds to schema_name in PyAthena.
I assumed this error came from the Athena query rather than from dbt itself, so I created a connection in PyAthena and ran the same queries. They ran perfectly.
Do you have any clues?

Thanks again!
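
One possible reading of the error, not confirmed in this thread, is that the profile's `database` value is rendered as the catalog part of a three-part `catalog.schema.table` name. The sketch below only illustrates that rendering; `qualified_name` and the table name `my_model` are hypothetical, and `awsdatacatalog` is used on the assumption that the account relies on Athena's default data catalog.

```python
def qualified_name(database: str, schema: str, table: str) -> str:
    """Render a three-part relation name with each part double-quoted."""
    parts = (database, schema, table)
    return ".".join(f'"{part}"' for part in parts)


# With database: testing, the first part is looked up as a catalog named "testing".
print(qualified_name("testing", "testing", "my_model"))

# With the default catalog name, "testing" is only used as the schema part.
print(qualified_name("awsdatacatalog", "testing", "my_model"))
```

If that reading is right, setting `database` to the catalog name and keeping `schema: testing` would be worth trying, while a plain PyAthena connection would not hit the problem because it never builds the three-part name.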

Error: mismatched input WITH - CTEs in views

I tried setting up a project in dbt using the adapter today and ran into a problem using a WITH statement in a view.

I get the error: An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 2:5: mismatched input 'WITH' expecting {'(', 'SELECT', 'VALUES', 'TABLE'}

Looking at the logs, this appears to be because dbt is wrapping the entire SELECT statement, including the CTEs introduced by the WITH clause, in parentheses, as:

create view my_table as ( WITH my_other_table AS ( ... ) SELECT .... )

On closer inspection there actually only appears to be an opening parenthesis and not a closing one in the logs. However, running the statement with one or both parentheses in Athena yields the same error.

Are WITH statements currently supported in views using the adapter?

Thanks!
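
The statement shape described above can be contrasted with an unparenthesized form. The sketch below is a hypothetical helper, not the adapter's view macro, and it assumes that Athena accepts a view body beginning with WITH when no parentheses wrap it, which the report implies but does not state.

```python
def create_view_sql(view_name: str, select_sql: str) -> str:
    """Build a CREATE VIEW statement without wrapping the query in parentheses."""
    body = select_sql.strip().rstrip(";")
    return f"create view {view_name} as\n{body}"


sql = create_view_sql(
    "my_table",
    """
    with my_other_table as (select 1 as id)
    select id from my_other_table
    """,
)
print(sql)
```

Keeping the model SQL as the direct body of the statement means the parser sees WITH immediately after AS rather than after an opening parenthesis.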

Error while generating docs

dbt docs generate --profiles-dir=./
Running with dbt=0.15.1
Found 21 models, 0 tests, 0 snapshots, 0 analyses, 128 macros, 0 operations, 0 seed files, 0 sources

17:47:03 | Concurrency: 20 threads (target='xxx')
17:47:03 |
17:47:03 | Done.
17:47:03 | Building catalog
Encountered an error:
Runtime Error
SYNTAX_ERROR: line 38:19: Column 'table_schema' is ambiguous

dbt-athena doesn't handle HIVE_CANNOT_OPEN_SPLIT: File does not exist: errors gracefully

As discussed in the dbt Slack (I can't find the thread now because of its age), Athena queries sometimes return as finished before the CTAS table file is actually ready for querying. The error you get is "HIVE_CANNOT_OPEN_SPLIT: File does not exist". Running dbt run again usually fixes the problem.

The best way to handle this would probably be to retry a query that fails with this error a set number of times.

Until then, I'm looking at adding a 1-second wait between DAG nodes. Lib\site-packages\dbt\adapters\athena\connections.py looks like the right place to try. Could I get a pointer on where to add it?
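
The retry idea could look roughly like the sketch below. It is not code from the adapter: `run_with_retry` and its parameters are hypothetical, and `execute` stands for whatever callable actually submits the query (for example, a cursor's execute method). It retries only when the error message contains HIVE_CANNOT_OPEN_SPLIT, waits a fixed interval between attempts, and re-raises anything else immediately.

```python
import time

RETRYABLE_MARKER = "HIVE_CANNOT_OPEN_SPLIT"


def run_with_retry(execute, sql, max_attempts=3, wait_seconds=1.0, sleep=time.sleep):
    """Call execute(sql), retrying a limited number of times on the split error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return execute(sql)
        except Exception as exc:
            retryable = RETRYABLE_MARKER in str(exc)
            # Give up on unrelated errors, or once the attempt budget is spent.
            if not retryable or attempt == max_attempts:
                raise
            sleep(wait_seconds)
```

Matching on the message text is fragile, so a real change would ideally key on an error code if the driver exposes one. The `sleep` parameter is injectable only to make the wait easy to stub out in tests.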

Thanks!

-Jeff
