azkaban's People

Contributors

acharneski, aeroevan, dfdx, hongmi, juhoautio, kjmrknsn, mtth, pmerienne, ralnoc, sankethkatta, xiaogaozi

azkaban's Issues

Wait for job log

When the Azkaban server is overloaded, jobs can hang in the preparing state. This delays the creation of the log file, which in turn causes azkabanpig to error out.
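
One possible approach is to poll for the log until it appears, instead of failing on the first miss. A minimal sketch, assuming a hypothetical `fetch_log` callable that returns the log text, or `None` while the job is still preparing (neither the name nor that behavior comes from azkabanpig itself):

```python
import time


def wait_for_log(fetch_log, timeout=60.0, interval=1.0):
    """Poll `fetch_log` until it returns content or `timeout` seconds pass.

    `fetch_log` is a hypothetical callable: it returns the log text, or
    None while the job is still preparing and no log file exists yet.
    """
    deadline = time.monotonic() + timeout
    while True:
        log = fetch_log()
        if log is not None:
            return log
        if time.monotonic() >= deadline:
            raise TimeoutError('job log not available after %s seconds' % timeout)
        time.sleep(interval)
```

The timeout keeps the client from waiting forever if the job never leaves the preparing state.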

Future features

Revamp .azkabanrc:

[azkaban]
default_alias = ...

[azkabanpig]
default_project = pig_you
default_type = pig

[aliases]
foo = you@http://that
bar = http://this

[session_ids]
foo = abc

Add Job Name Validation Check

Azkaban requires all job names to be unique within a project. The current behavior when there's a naming collision is for Azkaban to throw a 500 error that doesn't provide useful details to the deploying user about what went wrong. This leads to a lot of confusion for developers who are encountering the issue for the first time.

Instead of the current behavior, I propose a pre-deploy check for uniqueness among job names in a project. This provides the benefit of being able to provide a more detailed error message that will allow developers to resolve their naming collision more quickly.
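
The proposed check could look roughly like the sketch below. The function names are hypothetical, and it assumes the deploy step can see the full list of job names before anything is uploaded:

```python
from collections import Counter


def find_duplicate_job_names(job_names):
    """Return the sorted list of names that appear more than once."""
    counts = Counter(job_names)
    return sorted(name for name, count in counts.items() if count > 1)


def check_unique_job_names(job_names):
    """Raise ValueError naming every colliding job before the upload starts."""
    duplicates = find_duplicate_job_names(job_names)
    if duplicates:
        raise ValueError(
            'Duplicate job names in project: %s' % ', '.join(duplicates)
        )
```

Listing every colliding name at once lets the developer fix all collisions in one pass rather than one failed deploy at a time.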

Dynamic project import bug

Running azkaban info -p jobs.py2 still works when the actual file is named jobs.py. Should change the way the script is dynamically loaded: move from __import__ to imp.load_source.
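
Note that the imp module was removed in Python 3.12, so on current Python versions the same idea would have to be expressed with importlib. A sketch of that substitution (not the change proposed above); the function name and default module name are made up for illustration:

```python
import importlib.util
import os


def load_script(path, module_name='azkaban_project_script'):
    """Load the Python file at exactly `path`, failing if it does not exist."""
    if not os.path.isfile(path):
        raise IOError('No such project script: %r' % path)
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        # Raised for files whose suffix is not a recognized Python source suffix.
        raise ImportError('Cannot load %r as a Python module' % path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

Because the file is loaded by its exact path, a mistyped name such as jobs.py2 fails immediately instead of silently resolving to jobs.py.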

Potential dependency conflicts between azkaban and urllib3

Hi, as shown in the full dependency graph below, azkaban requires urllib3 * (any version), while the installed version of requests (2.22.0) requires urllib3 <1.26,>=1.21.1.

According to Pip's “first found wins” installation strategy, urllib3 1.25.7 is the version actually installed.

Although the first-found version, urllib3 1.25.7, happens to satisfy the later constraint (urllib3 <1.26,>=1.21.1), the build will fail once a urllib3 release outside that range (1.26 or later) is published, because azkaban's unconstrained requirement will pick it up.

Dependency tree--------

azkaban - 0.9.13
| +- docopt(install version:0.6.2 version range:*)
| +- requests(install version:2.22.0 version range:>=2.4.0)
| | +- certifi(install version:2019.9.11 version range:>=2017.4.17)
| | +- chardet(install version:3.0.4 version range:<3.1.0,>=3.0.2)
| | +- idna(install version:2.8 version range:>=2.5,<2.9)
| | +- urllib3(install version:1.25.6 version range:<1.26,>=1.21.1)
| +- six(install version:1.13.0 version range:>=1.6.1)
| +- urllib3(install version:1.25.6 version range:*)

Thanks for your attention.
Best,
Neolith

When uploading the ZIP, set the file name to the real one.

After I upload the job, here is what I have in the Azkaban log:

      > Time    User    Type    Message
      > 2014-07-07 16:29 35s    jdoe    Uploaded    Uploaded project files zip file.zip

What I would like to have

      > Time    User    Type    Message
      > 2014-07-07 16:29 35s    jdoe    Uploaded    Uploaded project files my-job-0.0.4-SNAPSHOT-azkaban.zip
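
If the upload goes through requests, the file name shown in the log can be set by passing a `(filename, fileobj, content_type)` tuple in `files`. A sketch of building that mapping; the helper name is hypothetical and the `'file'` form field name is an assumption about the upload form:

```python
import os


def build_upload_files(zip_path, archive_name=None):
    """Return a `files` mapping that carries the real archive name.

    The 'file' field name is an assumption about the upload form. The
    caller is responsible for closing the returned file object.
    """
    name = archive_name or os.path.basename(zip_path)
    return {'file': (name, open(zip_path, 'rb'), 'application/zip')}
```

When `archive_name` is omitted, the base name of the path on disk is used, so a temporary path would still need an explicit name.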

azkaban can not work. requests.packages.urllib3 does not exist

Traceback (most recent call last):
  File "/usr/local/bin/azkaban", line 9, in <module>
    load_entry_point('azkaban==0.9.5', 'console_scripts', 'azkaban')()
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 351, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 2363, in load_entry_point
    return ep.load()
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 2088, in load
    entry = __import__(self.module_name, globals(),globals(), ['__name__'])
  File "/usr/local/lib/python2.7/dist-packages/azkaban/__main__.py", line 98, in <module>
    from azkaban.project import Project
  File "/usr/local/lib/python2.7/dist-packages/azkaban/project.py", line 8, in <module>
    from .util import AzkabanError, Adapter, flatten, temppath, write_properties
  File "/usr/local/lib/python2.7/dist-packages/azkaban/util.py", line 13, in <module>
    from requests.packages.urllib3 import disable_warnings
ImportError: No module named packages.urllib3
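
A possible workaround is to fall back to the standalone urllib3 package when requests does not expose a bundled copy. This is only a sketch of the import pattern, not the fix that was applied in azkaban/util.py; the final no-op is an assumption about acceptable behavior when neither import works:

```python
try:
    # Some requests installations expose urllib3 under requests.packages.
    from requests.packages.urllib3 import disable_warnings
except ImportError:
    try:
        # Otherwise try the standalone urllib3 distribution.
        from urllib3 import disable_warnings
    except ImportError:
        def disable_warnings(*args, **kwargs):
            """No-op stand-in when neither module can be imported."""
```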

Create complex flows

Give the option to create complex flows with multiple jobs that depend on each other. For example, it would be really useful if we were able to generate job files like this:

# foo.job
type=command
command=echo foo
# bar.job
type=command
dependencies=foo
command=echo bar
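
To illustrate the request, here is a sketch of how such files could be rendered from Python. `render_job` is a hypothetical helper, not part of the library; it writes the dependency names as a comma-separated `dependencies` line:

```python
def render_job(options, dependencies=()):
    """Render the body of one .job file as key=value lines."""
    lines = ['%s=%s' % (key, value) for key, value in options.items()]
    if dependencies:
        # Multiple dependencies are joined with commas on a single line.
        lines.append('dependencies=%s' % ','.join(dependencies))
    return '\n'.join(lines) + '\n'


foo_job = render_job({'type': 'command', 'command': 'echo foo'})
bar_job = render_job({'type': 'command', 'command': 'echo bar'}, ['foo'])
```

The key order differs slightly from the example above (dependencies comes last), which should not matter for a properties-style file.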

AttributeError: 'AzkabanError' object has no attribute 'message'

To enable integration tests I created ~/.azkabanrc with

[azkaban]
test.alias = local

[alias.local]
url = http://localhost:8081

Then I ran

nosetests

There was an error:

..............................................Azkaban password for JuhoAutio@http://localhost:8081:
Azkaban password for JuhoAutio@http://localhost:8081: Azkaban password for JuhoAutio@http://localhost:8081: E
======================================================================
ERROR: test_remote.TestCreateDelete.test_create_delete_project
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/JuhoAutio/.pyenv/versions/3.7.0/lib/python3.7/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/Users/JuhoAutio/ideaprojects/AzkabanCli/azkaban/test/test_remote.py", line 97, in test_create_delete_project
    ok_(not self.project_exists(self.project))
  File "/Users/JuhoAutio/ideaprojects/AzkabanCli/azkaban/test/test_remote.py", line 90, in project_exists
    self.session.upload_project(project, path)
  File "/Users/JuhoAutio/ideaprojects/AzkabanCli/azkaban/azkaban/remote.py", line 651, in upload_project
    self._refresh() # ensure that the ID is valid
  File "/Users/JuhoAutio/ideaprojects/AzkabanCli/azkaban/azkaban/remote.py", line 765, in _refresh
    if not 'Incorrect Login.' in err.message:
AttributeError: 'AzkabanError' object has no attribute 'message'

Hmm, I'm not sure I can reproduce this test run consistently, but either way it shows the error AttributeError: 'AzkabanError' object has no attribute 'message', which is real. Exception.message was deprecated in Python 2.6 and no longer exists in Python 3.
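
A version-independent way to do the same check is to go through str(err). A minimal sketch, with a stand-in AzkabanError class and a hypothetical helper name (the real class lives in azkaban/util.py and may differ):

```python
class AzkabanError(Exception):
    """Stand-in for the library's exception class, for illustration only."""


def is_incorrect_login(err):
    """Check the error text via str(), which works on Python 2 and 3."""
    return 'Incorrect Login.' in str(err)
```

With this, the condition in _refresh would read `if not is_incorrect_login(err):` instead of touching err.message.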


Apart from that, how can I successfully run the integration tests?

I set the url with credentials:

[alias.local]
url = azkaban:azkaban@http://localhost:8081

After this, nosetests ran to completion, but there were quite a few errors. For example:

AssertionError: 'Execution queued successfully wi' != 'Execution submitted successfully'

Compatibility with Python 3 - ConfigParser

azkaban-cli requires the ConfigParser package, but on Python 3 the equivalent module is built in (as configparser). Ideally, we would try to import ConfigParser first and, if that doesn't work, fall back to configparser.
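
The fallback import could look like this. It is a sketch only; the choice of RawConfigParser and the sample option are assumptions for illustration:

```python
try:
    from ConfigParser import RawConfigParser  # Python 2 module name
except ImportError:
    from configparser import RawConfigParser  # Python 3 module name

# The same parser API is then available on both versions.
parser = RawConfigParser()
parser.add_section('azkaban')
parser.set('azkaban', 'default_alias', 'local')
```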

Support all concurrent choices

When running a flow, azkaban-cli only supports the 'concurrent' and 'skip' values for the concurrentOption request parameter. However, Azkaban server 3.0 supports 'skip', 'pipeline' and 'queue'.

It would be great to support all of these values.

Create and Delete of Projects

azkaban (create | delete)

create and delete should prompt the user to complete the action by filling in the project name and description.

Error when creating a new project on deploy in Azkaban 3.59.0

We are currently running Azkaban 3.1.0 in our production environment, and all aspects of this code work great on 3.1.0.

However, we are in the process of getting 3.59.0 working in our lab in preparation for upgrading to a more current version. Almost everything works, but we have discovered one issue.

Issue

  • We execute the following to build and upload a project: azkaban build -cp project.py -a lab-azkaban
  • When a Project does exist in Azkaban, the build and upload works without issue.
  • When the project does not exist in Azkaban, we get this error after entering the password and pressing Enter to upload the project:
Azkaban password for username@http://azkaban.example.com:
Traceback (most recent call last):
  File "/path/to/virtualenvs/venv_name/lib/python2.7/site-packages/azkaban/util.py", line 310, in wrapper
    return func(*args, **kwargs)
  File "/path/to/virtualenvs/venv_name/lib/python2.7/site-packages/azkaban/__main__.py", line 502, in main
    ['ZIP', '--url', '--alias', '--replace', '--create', '--option']
  File "/path/to/virtualenvs/venv_name/lib/python2.7/site-packages/azkaban/__main__.py", line 460, in build_project
    res = _upload_zip(session, project.name, _zip, _create, archive_name)
  File "/path/to/virtualenvs/venv_namelib/python2.7/site-packages/azkaban/__main__.py", line 320, in _upload_zip
    callback=_callback,
  File "/path/to/virtualenvs/venv_name/lib/python2.7/site-packages/azkaban/remote.py", line 673, in upload_project
    data=form,
  File "/path/to/virtualenvs/venv_name/lib/python2.7/site-packages/azkaban/remote.py", line 902, in _request
    raise err
HTTPError: 410 Client Error: Gone for url: http://azkaban.example.com/manager

This error only manifests when the -c argument is passed and the project does not already exist. If the project already exists, the build and upload work without issue.

Are there any recommendations regarding where this issue might be arising? I was hoping you might have some insight.

Thank you.
