metadata-parser's People

Contributors

badasshenkka, dewan-ahmed, floord, ftisiot, inifares23lab, jlprat, laysauchoa, safa-topal, tibsatwork

metadata-parser's Issues

Make program robust when expected fields are not present

At the moment, if an expected field is not present in the data returned for a service, an exception is raised and the program stops.

A first level of handling this would be to catch exceptions at the "call the method for this type of service" level, and instead return [], []

Later on, localised handling within the methods would allow better messages when this occurs.
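
As a rough illustration of the first level of handling described above, the wrapper below catches exceptions around a per-service call and returns `[], []` instead. This is a minimal sketch, not the project's code: `explore_safely` and `explore_example` are hypothetical names, and the set of exceptions caught is an assumption about what a missing field would raise.

```python
import logging

logger = logging.getLogger(__name__)


def explore_safely(explore_fn, *args, **kwargs):
    """Call a per-service explore function, returning ([], []) on failure.

    `explore_fn` stands in for any per-service function that returns a
    (nodes, edges) tuple. The name is hypothetical.
    """
    try:
        return explore_fn(*args, **kwargs)
    except (KeyError, IndexError, TypeError) as exc:
        # Assumption: a missing or malformed field surfaces as one of these.
        name = getattr(explore_fn, "__name__", repr(explore_fn))
        logger.warning("Skipping %s: %r", name, exc)
        return [], []


def explore_example(service):
    """Hypothetical explorer that assumes fields which may be absent."""
    nodes = [service["service_name"]]
    edges = [service["connection_info"]["host"]]
    return nodes, edges
```

Calling `explore_safely(explore_example, {"service_name": "pg"})` logs a warning and returns two empty lists rather than stopping the run. More specific messages would still need the localised handling mentioned above.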

MySQL handling can require the `cryptography` package

Observed as:

Traceback (most recent call last):
  File "metadata-parser/main.py", line 69, in <module>
    (newnodes, newedges) = explore_service.explore(myclient, service["service_type"], service["service_name"], project=config['DEFAULT']['PROJECT'])
  File "metadata-parser/src/explore_service.py", line 115, in explore
    (newnodes, newedges) = explore_mysql(self, service_name, project)
  File "metadata-parser/metadata-parser/src/explore_service.py", line 624, in explore_mysql
    conn = pymysql.connect(
  File "metadata-parser/venv/lib/python3.10/site-packages/pymysql/connections.py", line 353, in __init__
    self.connect()
  File "metadata-parser/venv/lib/python3.10/site-packages/pymysql/connections.py", line 633, in connect
    self._request_authentication()
  File "metadata-parser/venv/lib/python3.10/site-packages/pymysql/connections.py", line 932, in _request_authentication
    auth_packet = _auth.caching_sha2_password_auth(self, auth_packet)
  File "metadata-parser/venv/lib/python3.10/site-packages/pymysql/_auth.py", line 265, in caching_sha2_password_auth
    data = sha2_rsa_encrypt(conn.password, conn.salt, conn.server_public_key)
  File "venv/lib/python3.10/site-packages/pymysql/_auth.py", line 143, in sha2_rsa_encrypt
    raise RuntimeError(
RuntimeError: 'cryptography' package is required for sha256_password or caching_sha2_password auth methods

(note: prefix directories removed from file paths to make them shorter)

The solution is probably to add cryptography to requirements.txt (this appears to work when I add it and re-install the dependencies)
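
Besides adding the package to requirements.txt, a start-up check could give a clearer message than the traceback above. The following is only a sketch under that assumption: `missing_modules` is a hypothetical helper, and where such a check would be called is not something the issue specifies.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: a check like this would run before the MySQL connection is made.
missing = missing_modules(["cryptography"])
if missing:
    print("Missing dependencies: " + ", ".join(missing))
```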

Missing numpy in requirements.txt

What happened?

WHEN running python app.py
THEN there is a dependency error ModuleNotFoundError: No module named 'numpy'

What did you expect to happen?

WHEN running python app.py
THEN program starts without errors

What else do we need to know?

Running on Python 3.11.5 on Fedora Linux 38, in a dedicated virtual environment, after installing dependencies from requirements.txt.

Parse MirrorMaker services

What is currently missing?

As of now, explore_service.py doesn't explore MirrorMaker services (see explore_m3db, explore_mirrormaker, explore_coordinator).
We should write code to parse MirrorMaker instances and retrieve their metadata.

Is this a feature you would work on yourself?

  • I plan to open a pull request for this feature

Parse InfluxDB services

What is currently missing?

As of now, explore_service.py doesn't explore InfluxDB services (see explore_influxdb).
We should write code to parse InfluxDB instances and retrieve their metadata.

Is this a feature you would work on yourself?

  • I plan to open a pull request for this feature

Cope when a service has "gone away"

When get_service is called, it is possible that the service responds with something other than success. At the moment this causes an exception - it would be better to output a message and return an "empty" result.
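
One way to picture this is the sketch below. It assumes that the client raises an exception when the response is not a success, and that `get_service` accepts `project` and `service` keyword arguments; both are assumptions. `get_service_or_none` and `GoneAwayClient` are hypothetical names used only for illustration.

```python
def get_service_or_none(client, project, service_name):
    """Return the service description, or None if it cannot be fetched."""
    try:
        return client.get_service(project=project, service=service_name)
    except Exception as exc:
        # Assumption: a non-success response is reported as an exception.
        print(f"Could not fetch service {service_name!r}: {exc}")
        return None


class GoneAwayClient:
    """Hypothetical stand-in for a client whose service has disappeared."""

    def get_service(self, project, service):
        raise RuntimeError("service not found")


def explore_if_present(client, project, service_name):
    """Return (nodes, edges), or two empty lists when the service is gone."""
    service = get_service_or_none(client, project, service_name)
    if service is None:
        return [], []
    return [service_name], []
```

Catching `Exception` is deliberately broad here; a real change would probably narrow it to whatever error type the client actually raises.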

[EPIC] Apache Kafka Connect Connectors

What is currently missing?

Parsing Apache Kafka Connect connectors is tedious, since we need to parse the JSON definition of each one. As of now, the basic Debezium PG source and OpenSearch sink are done but can be improved. Other connectors (see explore_kafka_connect in explore_service.py) are:

Source

Sink

Add secrets and projects information to CI

What is needed?

The CI needs an Aiven Console project and user credentials. This information needs to be stored securely.

Further details

Refer to the conv.env.sample file and you'll see that the project name and user credentials must be provided. This information is required during the build, but we cannot hardcode the project name and credentials in the CI.

Suggested steps:

  1. Use GitHub Actions Environment Variables for project names.
  2. Use GitHub Actions Secrets for user credentials.
  3. Clone the repository to the Linux VM within GitHub Actions and run main.py/app.py with the parameters from the first two steps.
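
On the Python side, the values supplied by the CI could be read from the environment rather than from a file. The sketch below assumes that approach; `AIVEN_PROJECT` and `AIVEN_TOKEN` are hypothetical variable names, and `load_ci_settings` is not an existing function in the project.

```python
import os


def load_ci_settings(environ=os.environ):
    """Read the project name and credentials from environment variables.

    AIVEN_PROJECT and AIVEN_TOKEN are hypothetical names; the CI would map
    its environment variable and secret onto them.
    """
    required = ("AIVEN_PROJECT", "AIVEN_TOKEN")
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {"project": environ["AIVEN_PROJECT"], "token": environ["AIVEN_TOKEN"]}
```

Failing early with the names of the missing variables makes a misconfigured workflow easier to diagnose, without ever printing the secret values themselves.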

Additional reference

Parse ClickHouse services

What is currently missing?

As of now, explore_service.py doesn't explore ClickHouse services (see explore_m3db, explore_clickhouse, explore_coordinator).
We should write code to parse ClickHouse instances and retrieve their metadata.

Is this a feature you would work on yourself?

  • I plan to open a pull request for this feature

Error when running `pip install -r requirements.txt`

What can we help you with?

When I run pip install -r requirements.txt I get the following:

ERROR: Failed building wheel for psycopg2
  Running setup.py clean for psycopg2
Failed to build psycopg2
Installing collected packages: psycopg2, dash, cryptography, configparser, colour
    Running setup.py install for psycopg2 ... error

Not sure what I'm missing.

Where would you expect to find this information?

Parse M3 instances

What is currently missing?

As of now, explore_service.py doesn't explore M3 services (see explore_m3db, explore_aggregator, explore_coordinator).
We should write code to parse M3 instances and retrieve their metadata.

Is this a feature you would work on yourself?

  • I plan to open a pull request for this feature

Build plumbing for the CI

What is needed?

The plumbing for the CI (Continuous Integration) using GitHub Actions.

Further details

If you run python main.py, the application will want to connect to an active Aiven console project and fetch the services. This means that you would need to pass a project name and authentication token to the CI.

This issue focuses on the structure/setup of the CI rather than the specific issue of handling secrets within a CI. Possible steps:

  1. Identify a suitable GitHub Action for this repository, e.g. actions/setup-python
  2. Follow the GitHub Actions docs to add that action to this project
  3. Rather than actual build or test commands, add some echo statements to ensure that the CI is triggered and run successfully on push to master or pull requests.

requirements-typo

What happened?

There is a typo in the requirements.txt file
$ pip install -r requirements.txt
ERROR: Invalid requirement: 'networkx=2.8.8'

What did you expect to happen?

Installing dependencies goes smoothly
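
pip pins an exact version with `==`, so the line in the error above would presumably need to read `networkx==2.8.8`. As a sketch of how such typos could be caught before installation, the hypothetical helper below flags requirement lines that use a single `=`. It is a simple regular-expression check, not a full requirements parser.

```python
import re

# Matches "name=version" where a single "=" is used instead of "==".
SINGLE_EQUALS = re.compile(r"^[A-Za-z0-9._-]+\s*=(?!=)\s*\S+")


def find_single_equals(lines):
    """Return requirement lines that pin a version with '=' instead of '=='."""
    return [line.strip() for line in lines if SINGLE_EQUALS.match(line.strip())]


# Example input; only the first entry comes from the error message above.
sample = ["networkx=2.8.8\n", "dash>=2.0\n", "pyvis==0.3.1\n"]
print(find_single_equals(sample))
```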

Testing

What is currently missing?

As of now, there's no testing apart from running the src/create_services.sh script, running the code, and visually checking the output.
Some more testing is needed to validate changes

Is this a feature you would work on yourself?

  • I plan to open a pull request for this feature

Parse Cassandra services

What is currently missing?

As of now, explore_service.py doesn't explore Cassandra services (see explore_m3db, explore_cassandra, explore_coordinator).
We should write code to parse Cassandra instances and retrieve their metadata.

Is this a feature you would work on yourself?

  • I plan to open a pull request for this feature

Adopt pylint and lint the existing modules

What is currently missing?

A set of standards around code style would help code readability and consistency among the different modules of the project. Pylint seems like the way to go.

Is this a feature you would work on yourself?

  • I plan to open a pull request for this feature

Dot file writing can fail because of node values with `:` in them?

When working with our internal sandbox project, I have seen occurrences of:

Traceback (most recent call last):
  File "/Users/tony.ibbs/temp/metadata-parser/metadata-parser/main.py", line 79, in <module>
    pyvis_display.pyviz_graphy(nodes, edges)
  File "/Users/tony.ibbs/temp/metadata-parser/metadata-parser/src/pyvis_display.py", line 109, in pyviz_graphy
    write_dot(g, 'graph_data.dot')
  File "/Users/tony.ibbs/temp/metadata-parser/metadata-parser/venv/lib/python3.10/site-packages/networkx/utils/decorators.py", line 845, in func
    return argmap._lazy_compile(__wrapper)(*args, **kwargs)
  File "<class 'networkx.utils.decorators.argmap'> compilation 5", line 5, in argmap_write_dot_1
  File "/Users/tony.ibbs/temp/metadata-parser/metadata-parser/venv/lib/python3.10/site-packages/networkx/drawing/nx_pydot.py", line 51, in write_dot
    P = to_pydot(G)
  File "/Users/tony.ibbs/temp/metadata-parser/metadata-parser/venv/lib/python3.10/site-packages/networkx/drawing/nx_pydot.py", line 263, in to_pydot
    raise ValueError(
ValueError: Node names and attributes should not contain ":" unless they are quoted with "".                For example the string 'attribute:data1' should be written as '"attribute:data1"'.                Please refer https://github.com/pydot/pydot/issues/258

(Line numbers may be a little off, as this was on a branch.) This causes the program to stop. Suggested steps:

  1. If writing the dot file fails, catch the exception
  2. Work out what to escape in the output so that this does not happen
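
Both steps can be sketched without depending on the graph library itself. In the example below, `quote_if_needed` follows the hint in the error message by wrapping strings containing `:` in double quotes, and `try_write_dot` catches the `ValueError` so the program can continue. Both names are hypothetical, and the quoting rule is an assumption: it does not handle values that already contain embedded double quotes.

```python
def quote_if_needed(value):
    """Wrap strings containing ':' in double quotes, as the error suggests."""
    already_quoted = (
        isinstance(value, str)
        and len(value) >= 2
        and value.startswith('"')
        and value.endswith('"')
    )
    if isinstance(value, str) and ":" in value and not already_quoted:
        return f'"{value}"'
    return value


def try_write_dot(write_fn, graph, path):
    """Call write_fn(graph, path); report a ValueError instead of stopping.

    `write_fn` stands in for the function that writes the dot file.
    """
    try:
        write_fn(graph, path)
        return True
    except ValueError as exc:
        print(f"Could not write {path}: {exc}")
        return False
```

Passing the writer in as `write_fn` keeps the sketch independent of any particular library; in the program it would be the `write_dot` call shown in the traceback.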

Parse Elasticsearch services

What is currently missing?

As of now, explore_service.py doesn't explore Elasticsearch services (see explore_elasticsearch).
We should write code to parse Elasticsearch instances and retrieve their metadata.

Is this a feature you would work on yourself?

  • I plan to open a pull request for this feature

Expose backup metadata

What is currently missing?

Currently, no backup-related information is exposed in the project. It would be a useful feature to add.

Is this a feature you would work on yourself?

Yes

  • I plan to open a pull request for this feature

Fix dependency issues during installation

What happened?

During installation of the project, I encountered psycopg2 dependency problems that blocked the process. Also, requirements.txt seems to be missing the pydot library.

What did you expect to happen?

metadata-parser installs without obstacles using the installation command given in the README file.

What else do we need to know?

Error message:

WARNING: Discarding https://files.pythonhosted.org/packages/3a/7a/968afcb86b1958ae963a3aaa42c561e3ed2c2d4a8b773622b03856a16248/psycopg2-2.0.13.tar.gz#sha256=a15e622e101b16aa8ad44813d8fb1eced91379396c054aacbfa3ad658352332b (from https://pypi.org/simple/psycopg2/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
  Using cached psycopg2-2.0.12.tar.gz (256 kB)
    ERROR: Command errored out with exit status 1:
     command: /home/safa.topal/code/metadata-parser/metadata-parser-env/bin/python3 -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-revt39xn/psycopg2_7d096204ae3d497e80e2a9b22270457b/setup.py'"'"'; __file__='"'"'/tmp/pip-install-revt39xn/psycopg2_7d096204ae3d497e80e2a9b22270457b/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-z189u7gk
         cwd: /tmp/pip-install-revt39xn/psycopg2_7d096204ae3d497e80e2a9b22270457b/
    Complete output (6 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-revt39xn/psycopg2_7d096204ae3d497e80e2a9b22270457b/setup.py", line 225
        except Warning, w:
               ^^^^^^^^^^
    SyntaxError: multiple exception types must be parenthesized
    ----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/6a/8d/ee5c330823d527a5cd14c833063f825211d7b5de6e4897f72e250c107d85/psycopg2-2.0.12.tar.gz#sha256=542c187531e756867fb60034c393b6f2beca34eeeb3ce2e0089a2b6fb8be1292 (from https://pypi.org/simple/psycopg2/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
  Using cached psycopg2-2.0.11.tar.gz (255 kB)
    ERROR: Command errored out with exit status 1:
     command: /home/safa.topal/code/metadata-parser/metadata-parser-env/bin/python3 -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-revt39xn/psycopg2_3feede73f28e43c5bd7a1e221943860d/setup.py'"'"'; __file__='"'"'/tmp/pip-install-revt39xn/psycopg2_3feede73f28e43c5bd7a1e221943860d/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-gl5az38m
         cwd: /tmp/pip-install-revt39xn/psycopg2_3feede73f28e43c5bd7a1e221943860d/
    Complete output (6 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-revt39xn/psycopg2_3feede73f28e43c5bd7a1e221943860d/setup.py", line 225
        except Warning, w:
               ^^^^^^^^^^
    SyntaxError: multiple exception types must be parenthesized
    ----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/2d/d7/496da11d7c81971870ddd36800419c4f84e8f6208aac5eabedf9f7748729/psycopg2-2.0.11.tar.gz#sha256=e6b4e0e41df97441eff34e00065376414da6488e0d55848a45cd77551dbae434 (from https://pypi.org/simple/psycopg2/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
  Using cached psycopg2-2.0.10.tar.gz (255 kB)
    ERROR: Command errored out with exit status 1:
     command: /home/safa.topal/code/metadata-parser/metadata-parser-env/bin/python3 -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-revt39xn/psycopg2_6c7f956b264a4ab6a887e218e8cc7c53/setup.py'"'"'; __file__='"'"'/tmp/pip-install-revt39xn/psycopg2_6c7f956b264a4ab6a887e218e8cc7c53/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-x96zesj3
         cwd: /tmp/pip-install-revt39xn/psycopg2_6c7f956b264a4ab6a887e218e8cc7c53/
    Complete output (5 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-revt39xn/psycopg2_6c7f956b264a4ab6a887e218e8cc7c53/setup.py", line 50, in <module>
        import ConfigParser
    ModuleNotFoundError: No module named 'ConfigParser'
    ----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/19/79/35c7596bab4456f3610c12ec542a94d51c6781ced587d1d85127210b879b/psycopg2-2.0.10.tar.gz#sha256=e40cc04b43849085725076ae134bfef9e3b087f6dd7c964aeeb930e2f0bc14ab (from https://pypi.org/simple/psycopg2/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Could not find a version that satisfies the requirement psycopg2 (from versions: 2.0.10, 2.0.11, 2.0.12, 2.0.13, 2.0.14, 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1, 2.3.2, 2.4, 2.4.1, 2.4.2, 2.4.3, 2.4.4, 2.4.5, 2.4.6, 2.5, 2.5.1, 2.5.2, 2.5.3, 2.5.4, 2.5.5, 2.6, 2.6.1, 2.6.2, 2.7, 2.7.1, 2.7.2, 2.7.3, 2.7.3.1, 2.7.3.2, 2.7.4, 2.7.5, 2.7.6, 2.7.6.1, 2.7.7, 2.8, 2.8.1, 2.8.2, 2.8.3, 2.8.4, 2.8.5, 2.8.6, 2.9, 2.9.1, 2.9.2, 2.9.3)
ERROR: No matching distribution found for psycopg2

Parse Redis services

What is currently missing?

As of now, explore_service.py doesn't explore Redis services (see explore_m3db, explore_redis, explore_coordinator).
We should write code to parse Redis instances and retrieve their metadata.

Is this a feature you would work on yourself?

  • I plan to open a pull request for this feature
