Comments (15)

marknorkin commented on July 30, 2024

Facing the same issue. Is there a workaround for this?

from emr-serverless-samples.

dacort commented on July 30, 2024

I'm not quite sure why this started happening - will have to do some tests later this week.

Do you need the --constraint in there?

MrThomasWagner commented on July 30, 2024

Yeah, it works OK without the constraints flag for my proof of concept. I have more dependencies I'll want to add in the future, though, and would like to be able to include it.

Awesome plugin btw

dacort commented on July 30, 2024

Unfortunately EMR Serverless requires a newer version of boto3 than what's in that constraints file. I don't know if there's a way to override that...

MrThomasWagner commented on July 30, 2024

I noticed it doesn't conflict with Airflow 2.4.2 which is out - MWAA is just a little behind on that. I.e.

https://raw.githubusercontent.com/apache/airflow/constraints-2.4.2/constraints-3.7.txt

dacort commented on July 30, 2024

Yup, MWAA is still on 2.2.2. I'm curious, can you help me understand why you're including the constraints line? I know you kind of mentioned it, but I'm still not sure what it's used for / why it's needed?

MrThomasWagner commented on July 30, 2024

I was following this best practices guide in the MWAA docs: https://docs.aws.amazon.com/mwaa/latest/userguide/best-practices-dependencies.html

There is an Option 2 there using wheel fwiw.. maybe I'll look into that if 2.2.2 is SOL

dacort commented on July 30, 2024

Ahhh got it thank you. Yea, the boto3 will be an issue just because of when EMR Serverless support was added to it.

marknorkin commented on July 30, 2024

@dacort thank you for the response. Curious: what features of boto3>=1.23.9 and ~=1.23 do the EMR Serverless operators and sensors use that are not present in boto3==1.18.65? For example, we are using MWAA 2.2.2 and EMR Serverless 6.7.0 on our project and cannot use the library because of this boto3 issue.

dacort commented on July 30, 2024

@marknorkin EMR Serverless was made generally available this year, and boto3 1.23.9 is when support for EMR Serverless was added. You can still use the Operator on MWAA 2.2.2, you just need to upgrade boto3 (which will happen automatically if you use the Operator from this repo).

I wasn't aware of the recommendation in our docs to add the constraints line to the requirements.txt - that said, I've tried this operator with the upgraded boto3 with MWAA and haven't seen any issues.
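
To make the version gap concrete, here is a minimal sketch that expresses the combined requirement ~=1.23,>=1.23.9 as a plain tuple comparison. The helper names are hypothetical, and it assumes purely numeric version strings; real tooling would rely on pip's resolver or the packaging library rather than this shortcut.

```python
def parse_version(text):
    """Turn '1.26.10' into (1, 26, 10). Assumes numeric release segments only."""
    return tuple(int(part) for part in text.split("."))


def satisfies_operator_requirement(boto3_version):
    """Check a boto3 version against '~=1.23,>=1.23.9'.

    '~=1.23' means '>=1.23, ==1.*', so together with '>=1.23.9' the
    accepted range is 1.23.9 <= version < 2.
    """
    version = parse_version(boto3_version)
    return (1, 23, 9) <= version < (2,)


if __name__ == "__main__":
    # boto3 version mentioned in this thread for MWAA 2.2.2
    print(satisfies_operator_requirement("1.18.65"))  # False
    # boto3 version shown as installed in the dependency tree
    print(satisfies_operator_requirement("1.26.10"))  # True
```

The boto3 version preinstalled on MWAA 2.2.2 (1.18.65) fails this check, which is why installing the operator upgrades boto3.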

dacort commented on July 30, 2024

Going to close this for now as EMR Serverless requires a newer version of boto3. If you're willing to forgo the constraints, you can still use the operator on MWAA, but I don't think there's a workaround that keeps them. The Operator is in use in MWAA environments.

For reference, the dependency tree of the EMR Serverless operator is included below. You could potentially update the constraints file with the relevant versions. I also see now that there is a constraints-no-providers file. Maybe that will help if the concern is preventing upgrades of Airflow's core libraries?

https://raw.githubusercontent.com/apache/airflow/constraints-2.4.2/constraints-no-providers-3.7.txt

emr-serverless==1.0.1
  - boto3 [required: ~=1.23,>=1.23.9, installed: 1.26.10]
    - botocore [required: >=1.29.10,<1.30.0, installed: 1.29.10]
      - jmespath [required: >=0.7.1,<2.0.0, installed: 1.0.1]
      - python-dateutil [required: >=2.1,<3.0.0, installed: 2.8.2]
        - six [required: >=1.5, installed: 1.16.0]
      - urllib3 [required: >=1.25.4,<1.27, installed: 1.26.12]
    - jmespath [required: >=0.7.1,<2.0.0, installed: 1.0.1]
    - s3transfer [required: >=0.6.0,<0.7.0, installed: 0.6.0]
      - botocore [required: >=1.12.36,<2.0a.0, installed: 1.29.10]
        - jmespath [required: >=0.7.1,<2.0.0, installed: 1.0.1]
        - python-dateutil [required: >=2.1,<3.0.0, installed: 2.8.2]
          - six [required: >=1.5, installed: 1.16.0]
        - urllib3 [required: >=1.25.4,<1.27, installed: 1.26.12]

dlecina commented on July 30, 2024

Hello, even without constraints files, we are having this issue on a new MWAA 2.2.2 environment. Our only peculiarity is that we are hosting your released .zip file in our nexus repository (the file is unmodified):

adding trusted host: 'nexus.REDACTED' (from line 1 of /usr/local/airflow/requirements/requirements.txt)
adding trusted host: 'nexusmaster.REDACTED' (from line 2 of /usr/local/airflow/requirements/requirements.txt)
Looking in indexes: https://nexus.REDACTED/repository/pypi-public/simple/
Collecting emr_serverless@ https://nexusmaster.REDACTED/repository/REDACTED/REDACTED/mwaa_plugin.zip
  Downloading https://nexusmaster.REDACTED/repository/REDACTED/REDACTED/mwaa_plugin.zip (6.7 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Collecting boto3>=1.23.9,~=1.23
  Downloading https://nexus.REDACTED/repository/pypi-public/packages/boto3/1.26.15/boto3-1.26.15-py3-none-any.whl (132 kB)
Collecting s3transfer<0.7.0,>=0.6.0
  Downloading https://nexus.REDACTED/repository/pypi-public/packages/s3transfer/0.6.0/s3transfer-0.6.0-py3-none-any.whl (79 kB)
Requirement already satisfied: jmespath<2.0.0,>=0.7.1 in ./.local/lib/python3.7/site-packages (from boto3>=1.23.9,~=1.23->emr_serverless@ https://nexusmaster.REDACTED/repository/REDACTED/REDACTED/mwaa_plugin.zip->-r /usr/local/airflow/requirements/requirements.txt (line 5)) (0.10.0)
Collecting botocore<1.30.0,>=1.29.15
  Downloading https://nexus.REDACTED/repository/pypi-public/packages/botocore/1.29.15/botocore-1.29.15-py3-none-any.whl (9.9 MB)
Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in ./.local/lib/python3.7/site-packages (from botocore<1.30.0,>=1.29.15->boto3>=1.23.9,~=1.23->emr_serverless@ https://nexusmaster.REDACTED/repository/REDACTED/REDACTED/mwaa_plugin.zip->-r /usr/local/airflow/requirements/requirements.txt (line 5)) (2.8.2)
Requirement already satisfied: urllib3<1.27,>=1.25.4 in ./.local/lib/python3.7/site-packages (from botocore<1.30.0,>=1.29.15->boto3>=1.23.9,~=1.23->emr_serverless@ https://nexusmaster.REDACTED/repository/REDACTED/REDACTED/mwaa_plugin.zip->-r /usr/local/airflow/requirements/requirements.txt (line 5)) (1.26.7)
Requirement already satisfied: six>=1.5 in ./.local/lib/python3.7/site-packages (from python-dateutil<3.0.0,>=2.1->botocore<1.30.0,>=1.29.15->boto3>=1.23.9,~=1.23->emr_serverless@ https://nexusmaster.REDACTED/repository/REDACTED/REDACTED/mwaa_plugin.zip->-r /usr/local/airflow/requirements/requirements.txt (line 5)) (1.16.0)
Building wheels for collected packages: emr-serverless
  Building wheel for emr-serverless (setup.py): started
  Building wheel for emr-serverless (setup.py): finished with status 'done'
  Created wheel for emr-serverless: filename=emr_serverless-1.0.1-py3-none-any.whl size=7414 sha256=da8ce9ab8a2ff91d9a3b883ddaafbc3c9e892133a4ffb499e420236b70068f0f
  Stored in directory: /tmp/pip-ephem-wheel-cache-lpa7pkzp/wheels/13/92/50/475b17c65c8d67d0c9ecba04a3df4e16188d880c57c8d90d8f
Successfully built emr-serverless
Installing collected packages: botocore, s3transfer, boto3, emr-serverless
  Attempting uninstall: botocore
    Found existing installation: botocore 1.21.65
    Uninstalling botocore-1.21.65:
      Successfully uninstalled botocore-1.21.65
  Attempting uninstall: s3transfer
    Found existing installation: s3transfer 0.5.0
    Uninstalling s3transfer-0.5.0:
      Successfully uninstalled s3transfer-0.5.0
  Attempting uninstall: boto3
    Found existing installation: boto3 1.18.65
    Uninstalling boto3-1.18.65:
      Successfully uninstalled boto3-1.18.65
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
apache-airflow-providers-amazon 2.4.0 requires boto3<1.19.0,>=1.15.0, but you have boto3 1.26.15 which is incompatible.
apache-airflow-providers-amazon 2.4.0 requires watchtower~=1.0.6, but you have watchtower 2.0.1 which is incompatible.
Successfully installed boto3-1.26.15 botocore-1.29.15 emr-serverless-1.0.1 s3transfer-0.6.0

Our requirements file is as follows:

--trusted-host nexus.REDACTED
--trusted-host nexusmaster.REDACTED
--index https://nexus.REDACTED/repository/pypi-public/
--index-url https://nexus.REDACTED/repository/pypi-public/simple/
emr_serverless @ https://nexusmaster.REDACTED/repository/REDACTED/REDACTED/mwaa_plugin.zip

As a bit of an aside, we have tried getting around this by setting this:

apache-airflow==2.2.2
apache-airflow-providers-amazon>=v5.1.0

This solves the version issue and install works correctly everywhere except on WebServer (as in https://repost.aws/questions/QUmgPhWhgmTFGMc18d7De40A/airflow-webserver-not-installing-python-requirements). However, if we set this and then try to use the operator in a DAG, the DAG gets processed correctly, but we never get a Task to actually run. We have also tried this with different versions of apache-airflow-providers-amazon (3.1.1, 5.1.0, 6.0.0). In the latter case we removed mwaa_plugin.zip as the library itself should already be providing the operator. We are unsure of the reason why this is not working (it may be our fault), hence why we are not opening a new issue yet.

In any case, we just wanted to let you know that just setting the emr_serverless requirement is not working for us, even without constraints.
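
The conflict pip reports in the log above comes down to two boto3 ranges that do not intersect: apache-airflow-providers-amazon 2.4.0 wants boto3<1.19.0,>=1.15.0, while the plugin wants boto3>=1.23.9,~=1.23. The following is a rough sketch of that comparison; the function and constant names are made up for illustration, and it assumes simple numeric versions with an inclusive lower bound and an exclusive upper bound.

```python
def parse_version(text):
    """Turn '1.19.0' into (1, 19, 0). Assumes numeric release segments only."""
    return tuple(int(part) for part in text.split("."))


def ranges_overlap(first, second):
    """Each range is (inclusive_lower, exclusive_upper) as version tuples."""
    lower = max(first[0], second[0])
    upper = min(first[1], second[1])
    return lower < upper


# boto3<1.19.0,>=1.15.0 (apache-airflow-providers-amazon 2.4.0, from the pip log)
PROVIDER_RANGE = (parse_version("1.15.0"), parse_version("1.19.0"))
# boto3>=1.23.9,~=1.23 (emr-serverless 1.0.1); '~=1.23' caps the range below 2.0
PLUGIN_RANGE = (parse_version("1.23.9"), parse_version("2.0"))

if __name__ == "__main__":
    print(ranges_overlap(PROVIDER_RANGE, PLUGIN_RANGE))  # False
```

Because no boto3 version satisfies both ranges, pip installs the newer boto3 for the plugin and then warns that the preinstalled provider package is left with an incompatible dependency.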

dacort commented on July 30, 2024

@dlecina Interesting, thank you for all the detail. I know the MWAA team has been doing some work on Python requirements lately so I wonder if something changed here.

I will try to reproduce this and reopen this if I run into the same. Between the US holiday this week and re:Invent next week it may take me a bit, but I'll try to take a look ASAP.

dlecina commented on July 30, 2024

Thanks @dacort! Yes, I expect there have been some changes in the background that explain the different behavior.

In case it's helpful to anyone, in the end the following combination seemed to work for us; we were able to reach EMR Serverless with this:

--trusted-host nexus.REDACTED
--trusted-host nexusmaster.REDACTED
--index https://nexus.REDACTED/repository/pypi-public/
--index-url https://nexus.REDACTED/repository/pypi-public/simple/
apache-airflow==2.2.2
apache-airflow-providers-amazon==6.0.0
boto3>=1.23.9

Context:
Setting apache-airflow-providers-amazon==6.1.0 would be ideal, as it has the correct boto3 requirement, but it demands apache-airflow>=2.3.0, which does not work with MWAA 2.2.2. Instead we set boto3 explicitly, and that seemed to work, as it does not conflict with either library. Leaving boto3 unset does not work in this configuration: although 6.0.0 includes the EMR Serverless Operator, its boto3 requirement is set to an older version that lacks the emr-serverless API, so the task fails when it runs.

In short:
apache-airflow-providers-amazon==6.1.0 -> apache-airflow>=2.3.0 ❌ boto3>=1.24.0 ✔️
apache-airflow-providers-amazon==6.0.0 -> apache-airflow>=2.2.0 ✔️ boto3>=1.15.0 ❌
apache-airflow-providers-amazon==6.0.0 + boto3>=1.23.9 -> apache-airflow>=2.2.0 ✔️ boto3>=1.23.9 ✔️
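
One way to see which of these cases an environment actually ended up in is to ask the installed boto3 whether it knows about the emr-serverless service. This is only a sketch: the function name is hypothetical, and it assumes the check runs in the same Python environment as the Airflow workers.

```python
def emr_serverless_available():
    """Return True if the installed boto3 can create an 'emr-serverless' client.

    Returns False when boto3 is missing or too old to list the service.
    """
    try:
        import boto3
    except ImportError:
        return False
    # get_available_services() lists the service names that Session.client() accepts.
    return "emr-serverless" in boto3.session.Session().get_available_services()


if __name__ == "__main__":
    print("emr-serverless client available:", emr_serverless_available())
```

If this reports False while the provider package already ships the operator, the task will likely fail at run time as described above, and boto3 needs to be raised explicitly.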

dacort commented on July 30, 2024

Just to confirm, I was still able to use MWAA 2.2.2 with the release from this repository without a problem.

My requirements file is just this plugin, though.

emr_serverless @ https://github.com/aws-samples/emr-serverless-samples/releases/download/v1.0.1/mwaa_plugin.zip

I'm using the CDK stack from this repository.

I'll also try with the constraints-no-providers file and see if that works.
