Comments (4)
Just realized that the Docker build from the dbt-core project (https://github.com/dbt-labs/dbt-core/tree/main/docker) does not work for dbt-spark when using the PyHive extra (or the default all). Running
docker build --tag my-dbt-spark:1.6.0 --target dbt-spark --build-arg [email protected] --build-arg [email protected] --build-arg dbt_spark_version=PyHive .
produces an error:
...
107.2 note: This error originates from a subprocess, and is likely not a problem with pip.
107.2 ERROR: Failed building wheel for sasl
107.2 Running setup.py clean for sasl
107.4 Building wheel for future (setup.py): started
108.0 Building wheel for future (setup.py): finished with status 'done'
108.0 Created wheel for future: filename=future-0.18.3-py3-none-any.whl size=492023 sha256=0dced4fde8484b7cf07f3ca722cbe787880c6fcb8eb27af37c82213dd20b48b8
108.0 Stored in directory: /tmp/pip-ephem-wheel-cache-pgg_8qmj/wheels/da/19/ca/9d8c44cd311a955509d7e13da3f0bea42400c469ef825b580b
108.0 Building wheel for PyHive (setup.py): started
108.3 Building wheel for PyHive (setup.py): finished with status 'done'
108.3 Created wheel for PyHive: filename=PyHive-0.6.5-py3-none-any.whl size=51554 sha256=b78987c7c11b9d3a18704d5339f9d1caf6221976e1f4c572f609fac9dd9da102
108.3 Stored in directory: /tmp/pip-ephem-wheel-cache-pgg_8qmj/wheels/cc/b2/8d/74115da1b8e1ee44544ec7870783c9fbf1127b66d296f6c4be
108.3 Building wheel for pure-sasl (setup.py): started
108.6 Building wheel for pure-sasl (setup.py): finished with status 'done'
108.6 Created wheel for pure-sasl: filename=pure_sasl-0.6.2-py3-none-any.whl size=11423 sha256=ef452afe0aeb515f2ad15f63e0df15ea5c620fef4e4f7d4413de8ebdb05b064e
108.6 Stored in directory: /tmp/pip-ephem-wheel-cache-pgg_8qmj/wheels/be/bd/15/23761a50b737a712aacac51c718906ce3563705a336d2c4ffc
108.6 Successfully built pyspark thrift dbt-spark logbook minimal-snowplow-tracker future PyHive pure-sasl
108.6 Failed to build sasl
108.6 ERROR: Could not build wheels for sasl, which is required to install pyproject.toml-based projects
------
Dockerfile:104
--------------------
102 | /tmp/* \
103 | /var/tmp/*
104 | >>> RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_spark_ref}#egg=dbt-spark[${dbt_spark_version}]"
105 |
106 |
--------------------
ERROR: failed to solve: process "/bin/sh -c python -m pip install --no-cache-dir \"git+https://github.com/dbt-labs/${dbt_spark_ref}#egg=dbt-spark[${dbt_spark_version}]\"" did not complete successfully: exit code: 1
It works fine if I'm using the ODBC Spark version.
Update: I have the same problem locally (not in Docker) if I switch from Python 3.10 to 3.11, so the problem is related to #864 from dbt-spark.
The issue here is related to the sasl package, which no longer builds on Python 3.11. To work around it, you need to install pyhive with the extra hive_pure_sasl, which uses pure-sasl instead of the sasl package. Accordingly, dbt-spark should depend on pyhive[hive_pure_sasl] instead of just pyhive when installing dbt-spark[pyhive].
You can easily reproduce this issue by running pip install pyhive vs. pip install pyhive[hive_pure_sasl] on a Python 3.11 installation.
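The dependency change proposed above would live in dbt-spark's extras. A minimal sketch of what the PyHive extra could look like follows; the pins, extra names, and layout here are illustrative assumptions, not dbt-spark's actual setup.py:

```python
# Sketch of the proposed extras change for dbt-spark's setup.py.
# Version pins are illustrative assumptions. The hive_pure_sasl extra
# (available in PyHive >= 0.7.0) swaps the C-based sasl package for
# pure-sasl, which installs cleanly on Python 3.11.
extras_require = {
    "PyHive": [
        "PyHive[hive_pure_sasl]>=0.7.0",  # instead of PyHive[hive]
        "thrift>=0.11.0",
    ],
    "ODBC": ["pyodbc"],
}
```

Until such a change lands, installing pyhive[hive_pure_sasl] yourself before (or instead of) relying on the dbt-spark extra is one way to avoid the sasl wheel build entirely.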
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.