
1. one-click-mlflow

A tool to deploy a mostly serverless MLflow on a GCP project with one command

1.1. How to use

1.1.1. Pre-requisites

  • A GCP project on which you have the Owner role
  • Terraform, make, and jq installed (a quick check is sketched below)
  • The gcloud SDK initialized with your owner account
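
A minimal way to verify the required CLI tools are on your PATH (an illustrative snippet, not part of the repo):

import shutil

# Check each required tool; shutil.which returns None when a tool is missing.
for tool in ("terraform", "make", "jq", "gcloud"):
    print(tool, "ok" if shutil.which(tool) else "MISSING")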

1.1.2. Deploying

Clone the repo

Run make one-click-mlflow and let the wizard guide you.

If you want to see the innards, you can run it in debug mode: DEBUG=true make one-click-mlflow

1.1.3. What it does

  • Enables the necessary GCP services
  • Builds and deploys the MLflow Docker image
  • Creates a private-IP Cloud SQL (MySQL) database for the tracking server
  • Creates an App Engine Flex app on the default service for the web UI, secured by IAP
  • Sets up all the required networking between these components
  • Creates the mlflow-log-pusher service account

Architecture diagram

1.1.4. Other available make commands

  • make deploy: builds and pushes the application image and (re)deploys the infrastructure
  • make docker: builds and pushes the application image
  • make apply: (re)deploys the infrastructure
  • make destroy: destroys the infrastructure. It will not delete the OAuth consent screen or the App Engine application.

1.1.5. Pushing your first parameters, logs, artifacts

Once the deployment is successful, you can start pushing to your MLflow instance.

cd examples
python3 -m venv venv 
source venv/bin/activate
pip install -r requirements.txt
python track_experiment.py

You can then adapt examples/track_experiment.py and examples/mlflow_config.py to suit your application's needs.
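
For reference, here is a minimal sketch of what such a tracking script can look like. The experiment name and logged values are illustrative; it assumes that importing examples/mlflow_config.py sets the tracking URI and IAP token environment variables, as the example does:

import mlflow
import mlflow_config  # assumed to set the tracking URI and IAP token env vars

mlflow.set_experiment("my-first-experiment")  # hypothetical experiment name

with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)  # a parameter
    mlflow.log_metric("accuracy", 0.93)      # a metric
    with open("notes.txt", "w") as f:
        f.write("hello mlflow")
    mlflow.log_artifact("notes.txt")         # an artifact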

one-click-mlflow's Issues

As an Artefact Data Scientist, I want to have access to an MLFlow instance "as a service"

Is your feature request related to a problem? Please describe.
It is not always possible or desirable to use one-click-mlflow on a client's infrastructure. For example, when:

  • Organisation policies block the deployment
  • The DS does not have the required roles/permissions
  • The client is not on GCP
  • The mission is short in duration (a POC, ...)

Describe the solution you'd like
A web app to request an instance of a GCP project with MLFlow deployed

The requester does not have direct access to the project, but is issued a service account to push logs/params/artifacts, along with the link to the MLflow web app

The process is automated, so no one else needs to be involved.
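
As a rough sketch of how the issued service account could be used, assuming the standard IAP identity-token flow (the key-file path and client ID below are hypothetical):

from google.auth.transport.requests import Request
from google.oauth2 import service_account

# Mint an OIDC token for the IAP-protected MLflow app from the issued key.
credentials = service_account.IDTokenCredentials.from_service_account_file(
    "issued-sa-key.json",  # hypothetical key file
    target_audience="IAP_CLIENT_ID.apps.googleusercontent.com",  # hypothetical
)
credentials.refresh(Request())
print(credentials.token)  # usable as MLFLOW_TRACKING_TOKEN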

Make destroy does not work properly

Describe the bug
Need to destroy multiple times to get rid of all the resources

To Reproduce
run make destroy

Expected behavior
All resources are destroyed without errors

Desktop (please complete the following information):

  • macOS Catalina
  • Terraform 0.13.2

Error when running migrations

Describe the bug
Error when deploying the MLflow server:

pymysql.err.ProgrammingError: (1146, "Table 'mlflow.experiments' doesn't exist")
ValueError: Invalid IPv6 URL

To Reproduce
Steps to reproduce the behavior:
Just run make one-click-mlflow on an empty project

Full traceback

 Error: Error waiting to create FlexibleAppVersion: Error waiting for Creating FlexibleAppVersion: Error code 9, message: Flex operation projects/sandbox-thomas-323814/regions/europe-west1/operations/8cade860-1654-4007-91d4-a70f350bed1a error [FAILED_PRECONDITION]: An internal error occurred while processing task /app-engine-flex/flex_await_healthy/flex_await_healthy>2021-08-23T15:51:13.172Z56066.wm.0: 2021/08/23 15:52:57 INFO mlflow.store.db.utils: Updating database tables in preparation for MLflow 1.0 schema migrations 
│ INFO  [alembic.runtime.migration] Context impl MySQLImpl.
│ INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
│ INFO  [alembic.runtime.migration] Running upgrade  -> ff01da956556, ensure_unique_constraint_names
│ Traceback (most recent call last):
│   File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
│     cursor, statement, parameters, context
│   File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/engine/default.py", line 588, in do_execute
│     cursor.execute(statement, parameters)
│   File "/usr/local/lib/python3.7/dist-packages/pymysql/cursors.py", line 163, in execute
│     result = self._query(query)
│   File "/usr/local/lib/python3.7/dist-packages/pymysql/cursors.py", line 321, in _query
│     conn.query(q)
│   File "/usr/local/lib/python3.7/dist-packages/pymysql/connections.py", line 505, in query
│     self._affected_rows = self._read_query_result(unbuffered=unbuffered)
│   File "/usr/local/lib/python3.7/dist-packages/pymysql/connections.py", line 724, in _read_query_result
│     result.read()
│   File "/usr/local/lib/python3.7/dist-packages/pymysql/connections.py", line 1069, in read
│     first_packet = self.connection._read_packet()
│   File "/usr/local/lib/python3.7/dist-packages/pymysql/connections.py", line 676, in _read_packet
│     packet.raise_for_error()
│   File "/usr/local/lib/python3.7/dist-packages/pymysql/protocol.py", line 223, in raise_for_error
│     err.raise_mysql_exception(self._data)
│   File "/usr/local/lib/python3.7/dist-packages/pymysql/err.py", line 107, in raise_mysql_exception
│     raise errorclass(errno, errval)
│ pymysql.err.ProgrammingError: (1146, "Table 'mlflow.experiments' doesn't exist")
│ 
│ The above exception was the direct cause of the following exception:
│ 
│ Traceback (most recent call last):
│   File "/usr/local/bin/mlflow", line 8, in <module>
│     sys.exit(cli())
│   File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1137, in __call__
│     return self.main(*args, **kwargs)
│   File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1062, in main
│     rv = self.invoke(ctx)
│   File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1668, in invoke
│     return _process_result(sub_ctx.command.invoke(sub_ctx))
│   File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1668, in invoke
│     return _process_result(sub_ctx.command.invoke(sub_ctx))
│   File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1404, in invoke
│     return ctx.invoke(self.callback, **ctx.params)
│   File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 763, in invoke
│     return __callback(*args, **kwargs)
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/db.py", line 29, in upgrade
│     mlflow.store.db.utils._upgrade_db_initialized_before_mlflow_1(engine)
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/store/db/utils.py", line 179, in _upgrade_db_initialized_before_mlflow_1
│     command.upgrade(config, "heads")
│   File "/usr/local/lib/python3.7/dist-packages/alembic/command.py", line 298, in upgrade
│     script.run_env()
│   File "/usr/local/lib/python3.7/dist-packages/alembic/script/base.py", line 489, in run_env
│     util.load_python_file(self.dir, "env.py")
│   File "/usr/local/lib/python3.7/dist-packages/alembic/util/pyfiles.py", line 98, in load_python_file
│     module = load_module_py(module_id, path)
│   File "/usr/local/lib/python3.7/dist-packages/alembic/util/compat.py", line 184, in load_module_py
│     spec.loader.exec_module(module)
│   File "<frozen importlib._bootstrap_external>", line 728, in exec_module
│   File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/temporary_db_migrations_for_pre_1_users/env.py", line 84, in <module>
│     run_migrations_online()
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/temporary_db_migrations_for_pre_1_users/env.py", line 78, in run_migrations_online
│     context.run_migrations()
│   File "<string>", line 8, in run_migrations
│   File "/usr/local/lib/python3.7/dist-packages/alembic/runtime/environment.py", line 846, in run_migrations
│     self.get_context().run_migrations(**kw)
│   File "/usr/local/lib/python3.7/dist-packages/alembic/runtime/migration.py", line 518, in run_migrations
│     step.migration_fn(**kw)
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/temporary_db_migrations_for_pre_1_users/versions/ff01da956556_ensure_unique_constraint_names.py", line 180, in upgrade
│     condition=column("lifecycle_stage").in_(["active", "deleted"]),
│   File "/usr/lib/python3.7/contextlib.py", line 119, in __exit__
│     next(self.gen)
│   File "/usr/local/lib/python3.7/dist-packages/alembic/operations/base.py", line 354, in batch_alter_table
│     impl.flush()
│   File "/usr/local/lib/python3.7/dist-packages/alembic/operations/batch.py", line 83, in flush
│     fn(*arg, **kw)
│   File "/usr/local/lib/python3.7/dist-packages/alembic/ddl/impl.py", line 244, in add_constraint
│     self._exec(schema.AddConstraint(const))
│   File "/usr/local/lib/python3.7/dist-packages/alembic/ddl/impl.py", line 140, in _exec
│     return conn.execute(construct, *multiparams, **params)
│   File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/engine/base.py", line 982, in execute
│     return meth(self, multiparams, params)
│   File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/sql/ddl.py", line 72, in _execute_on_connection
│     return connection._execute_ddl(self, multiparams, params)
│   File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/engine/base.py", line 1044, in _execute_ddl
│     compiled,
│   File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/engine/base.py", line 1250, in _execute_context
│     e, statement, parameters, cursor, context
│   File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception
│     util.raise_from_cause(sqlalchemy_exception, exc_info)
│   File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
│     reraise(type(exception), exception, tb=exc_tb, cause=cause)
│   File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/util/compat.py", line 152, in reraise
│     raise value.with_traceback(tb)
│   File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
│     cursor, statement, parameters, context
│   File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/engine/default.py", line 588, in do_execute
│     cursor.execute(statement, parameters)
│   File "/usr/local/lib/python3.7/dist-packages/pymysql/cursors.py", line 163, in execute
│     result = self._query(query)
│   File "/usr/local/lib/python3.7/dist-packages/pymysql/cursors.py", line 321, in _query
│     conn.query(q)
│   File "/usr/local/lib/python3.7/dist-packages/pymysql/connections.py", line 505, in query
│     self._affected_rows = self._read_query_result(unbuffered=unbuffered)
│   File "/usr/local/lib/python3.7/dist-packages/pymysql/connections.py", line 724, in _read_query_result
│     result.read()
│   File "/usr/local/lib/python3.7/dist-packages/pymysql/connections.py", line 1069, in read
│     first_packet = self.connection._read_packet()
│   File "/usr/local/lib/python3.7/dist-packages/pymysql/connections.py", line 676, in _read_packet
│     packet.raise_for_error()
│   File "/usr/local/lib/python3.7/dist-packages/pymysql/protocol.py", line 223, in raise_for_error
│     err.raise_mysql_exception(self._data)
│   File "/usr/local/lib/python3.7/dist-packages/pymysql/err.py", line 107, in raise_mysql_exception
│     raise errorclass(errno, errval)
│ sqlalchemy.exc.ProgrammingError: (pymysql.err.ProgrammingError) (1146, "Table 'mlflow.experiments' doesn't exist")
│ [SQL: ALTER TABLE experiments ADD CONSTRAINT experiments_lifecycle_stage CHECK (lifecycle_stage IN ('active', 'deleted'))]
│ (Background on this error at: http://sqlalche.me/e/f405)
│ 2021/08/23 15:52:58 ERROR mlflow.cli: Error initializing backend store
│ 2021/08/23 15:52:58 ERROR mlflow.cli: Invalid IPv6 URL
│ Traceback (most recent call last):
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/cli.py", line 385, in server
│     initialize_backend_stores(backend_store_uri, default_artifact_root)
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/server/handlers.py", line 146, in initialize_backend_stores
│     _get_tracking_store(backend_store_uri, default_artifact_root)
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/server/handlers.py", line 131, in _get_tracking_store
│     _tracking_store = _tracking_store_registry.get_store(store_uri, artifact_root)
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/tracking/_tracking_service/registry.py", line 37, in get_store
│     builder = self.get_store_builder(store_uri)
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/tracking/registry.py", line 75, in get_store_builder
│     scheme = store_uri if store_uri == "databricks" else get_uri_scheme(store_uri)
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/utils/uri.py", line 157, in get_uri_scheme
│     scheme = urllib.parse.urlparse(uri_or_path).scheme
│   File "/usr/lib/python3.7/urllib/parse.py", line 368, in urlparse
│     splitresult = urlsplit(url, scheme, allow_fragments)
│   File "/usr/lib/python3.7/urllib/parse.py", line 459, in urlsplit
│     raise ValueError("Invalid IPv6 URL")
│ ValueError: Invalid IPv6 URL
│ 
│ 
│   with module.mlflow.module.server.google_app_engine_flexible_app_version.mlflow_app,
│   on modules/mlflow/server/main.tf line 112, in resource "google_app_engine_flexible_app_version" "mlflow_app":
│  112: resource "google_app_engine_flexible_app_version" "mlflow_app" {
│ 
╵
make: *** [apply-terraform] Error 1

Technical Story: refactor the way configuration variables are set prior to deploying

Definition of ready
Ready

Description
We have vars, vars_base, and vars_additionnal files that are treated as sh scripts. This is neither clear nor well designed.

Refactor this to use a JSON file and a parser.

Definition of done

  • Behavior is unchanged from the user's perspective
  • A single JSON file contains all the variables
  • A parser to export the variables from the JSON file so they are accessible to Terraform through env vars (one possible shape is sketched after this list)
  • We don't want to just create a Terraform variables file, because some variables depend on each other, and we want to avoid a monolithic sh script that does everything at once
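
A minimal sketch of such a parser; the file name vars.json and the invocation are illustrative, not decided in this story:

import json

# Read the single JSON variables file and emit `export` lines, so that a
# shell (or the Makefile) can eval them and each value reaches Terraform
# as a TF_VAR_* environment variable.
with open("vars.json") as f:  # hypothetical file name
    variables = json.load(f)

for key, value in variables.items():
    print(f'export TF_VAR_{key}="{value}"')

A Makefile target could then consume it with something like eval "$(python parse_vars.py)".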

make one-click-mlflow not working after make destroy because of undeleted bucket

Describe the bug
Problem encountered by @ucsky. Running make one-click-mlflow fails after make destroy because the artifacts bucket still exists.
It produces the following error:


Setting up your GCP project...
╷
│ Error: googleapi: Error 409: You already own this bucket. Please select another name., conflict
│ 
│   with module.bucket_backend.google_storage_bucket.this,
│   on ../modules/mlflow/artifacts/main.tf line 18, in resource "google_storage_bucket" "this":
│   18: resource "google_storage_bucket" "this" {

To Reproduce
Steps to reproduce the behavior:

  1. run make one-click-mlflow and finish it
  2. run make destroy
  3. run make one-click-mlflow
  4. See error

Expected behavior
The second make one-click-mlflow run should succeed
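
As a hedged workaround (not part of the repo), the leftover bucket can be deleted by hand before redeploying; the bucket name below is hypothetical:

from google.cloud import storage  # pip install google-cloud-storage

client = storage.Client()
bucket = client.bucket("my-project-mlflow-artifacts")  # hypothetical name
bucket.delete(force=True)  # force=True also deletes the objects inside
                           # (only works for buckets with few objects)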

MLflow app creation crashes because of protobuf version

Hi there,

Describe the bug
When running make one-click-mlflow, an error appears while creating the App Engine Flex app:

Error: Error waiting to create FlexibleAppVersion: Error waiting for Creating FlexibleAppVersion: Error code 9, message: An internal error occurred while processing task /app-engine-flex/flex_await_healthy/flex_await_healthy>2023-03-28T10:26:33.695Z25166.wd.0: Traceback (most recent call last):
│   File "/usr/local/bin/mlflow", line 5, in <module>
│     from mlflow.cli import cli
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/__init__.py", line 32, in <module>
│     import mlflow.tracking._model_registry.fluent
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/tracking/__init__.py", line 8, in <module>
│     from mlflow.tracking.client import MlflowClient
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/tracking/client.py", line 8, in <module>
│     from mlflow.entities import ViewType
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/entities/__init__.py", line 6, in <module>
│     from mlflow.entities.experiment import Experiment
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/entities/experiment.py", line 2, in <module>
│     from mlflow.entities.experiment_tag import ExperimentTag
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/entities/experiment_tag.py", line 2, in <module>
│     from mlflow.protos.service_pb2 import ExperimentTag as ProtoExperimentTag
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/protos/service_pb2.py", line 18, in <module>
│     from .scalapb import scalapb_pb2 as scalapb_dot_scalapb__pb2
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/protos/scalapb/scalapb_pb2.py", line 35, in <module>
│     serialized_options=None, file=DESCRIPTOR)
│   File "/usr/local/lib/python3.7/dist-packages/google/protobuf/descriptor.py", line 561, in __new__
│     _message.Message._CheckCalledFromGeneratedFile()
│ TypeError: Descriptors cannot not be created directly.
│ If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
│ If you cannot immediately regenerate your protos, some other possible workarounds are:
│  1. Downgrade the protobuf package to 3.20.x or lower.
│  2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
│ 
│ More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
│ Traceback (most recent call last):
│   File "/usr/local/bin/mlflow", line 5, in <module>
│     from mlflow.cli import cli
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/__init__.py", line 32, in <module>
│     import mlflow.tracking._model_registry.fluent
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/tracking/__init__.py", line 8, in <module>
│     from mlflow.tracking.client import MlflowClient
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/tracking/client.py", line 8, in <module>
│     from mlflow.entities import ViewType
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/entities/__init__.py", line 6, in <module>
│     from mlflow.entities.experiment import Experiment
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/entities/experiment.py", line 2, in <module>
│     from mlflow.entities.experiment_tag import ExperimentTag
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/entities/experiment_tag.py", line 2, in <module>
│     from mlflow.protos.service_pb2 import ExperimentTag as ProtoExperimentTag
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/protos/service_pb2.py", line 18, in <module>
│     from .scalapb import scalapb_pb2 as scalapb_dot_scalapb__pb2
│   File "/usr/local/lib/python3.7/dist-packages/mlflow/protos/scalapb/scalapb_pb2.py", line 35, in <module>
│     serialized_options=None, file=DESCRIPTOR)
│   File "/usr/local/lib/python3.7/dist-packages/google/protobuf/descriptor.py", line 561, in __new__
│     _message.Message._CheckCalledFromGeneratedFile()
│ TypeError: Descriptors cannot not be created directly.
│ If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
│ If you cannot immediately regenerate your protos, some other possible workarounds are:
│  1. Downgrade the protobuf package to 3.20.x or lower.
│  2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
│ 
│ More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
│ 
│ 
│   with module.mlflow.module.server.google_app_engine_flexible_app_version.mlflow_app,
│   on modules/mlflow/server/main.tf line 112, in resource "google_app_engine_flexible_app_version" "mlflow_app":
│  112: resource "google_app_engine_flexible_app_version" "mlflow_app" {
│ 
╵

To Reproduce
Steps to reproduce the behavior:

terraform version
Terraform v1.4.2
on linux_amd64

  1. Install Terraform v1.4.2
  2. run make one-click-mlflow
  3. See error

Expected behavior
The Terraform script should create the MLflow app without error

Desktop (please complete the following information):

  • OS: Ubuntu
  • Version: 22.04
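
The error message itself suggests two workarounds: pin protobuf to 3.20.x or lower, or force the pure-Python protobuf implementation. A minimal sketch of the second option, which must run before mlflow is imported:

import os

# Must be set before any *_pb2 module is loaded; pure-Python parsing is
# slower but avoids the descriptor error quoted above.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"

import mlflow  # imported after the env var on purpose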

Turn ocmlf into a Terraform module

Description

Right now, OCMLF tightly couples the CLI workflow and the infrastructure deployment.
Let's decouple these by making the Terraform part a standalone module available in the Terraform registry.
This repo would then contain only the CLI workflow that generates the deployment config and invokes the TF module.

This will allow the Terraform part of OCMLF to be seamlessly integrated into existing IaC projects.

MLflow 2.2.0 Support

Why this feature is wanted
MLflow 2.2.0 offers the following over MLflow 1.X:

  • A sleeker UI design
  • Simpler management of end-to-end MLOps workflows with the MLflow Pipelines module

Describe the solution you'd like
MLflow Tracking Server running 2.2.0 for enhanced UI and additional features

A new, different AppEngine service is created when doing 2+ deployments on the same project

Describe the bug
If a first deployment created the default App Engine service for MLflow, deploying again will create a new mlflow service instead of updating the first one.

The main issue for the end user is that the URL of the MLflow server depends on how many deployments were made: https://mlflow-dot-{PROJECT_ID}.ew.r.appspot.com versus https://{PROJECT_ID}.ew.r.appspot.com.

To Reproduce
Steps to reproduce the behavior:

  1. make one-click-mlflow -> AppEngine default is created
  2. make one-click-mlflow a second time on the same project -> AppEngine mlflow is created

Expected behavior
If the first deployment created the default service, we expect it to be updated rather than a new one created

Desktop (please complete the following information):

  • OS: macOS Big Sur
  • Terraform 0.14.6

Have an `Editor` role deployment option

Allow deployment with only Editor-level access on a GCP project

Adds pre-requisites:

  • roles/cloudsql.client, roles/secretmanager.secretAccessor, and roles/compute.networkUser to <project-id>@appspot.gserviceaccount.com
  • roles/storage.objectAdmin to <project-id>@gae-api-prod.google.com.iam.gserviceaccount.com

TODO:

  • A Bash script (or similar) to import these bindings into the TF state
  • Readme section "Editor deployment"

Deploying on a shared VPC is not working properly

Describe the bug
Deploying on a shared VPC from another GCP project is not working properly

module.network.google_compute_global_address.private_ip_addresses: Creating...

Error: Error creating GlobalAddress: googleapi: Error 400: Invalid value for field 'resource.network': 'projects/<another project>/global/networks/<shared VPC name>'. The specified network can not come from a different project., invalid

To Reproduce
run make deploy with a shared VPC as TF_VAR_network_name

Expected behavior
MLFlow server deployed on the shared VPC

Failed to build docker image

Describe the bug
When running apt-get update, the build failed with the following error:

#6 1.552 Get:1 http://deb.debian.org/debian buster InRelease [122 kB]                                                                                                                          
#6 1.552 Get:2 http://security.debian.org/debian-security buster/updates InRelease [65.4 kB]                                                                                                   
#6 1.625 Get:3 http://deb.debian.org/debian buster-updates InRelease [51.9 kB]                                                                                                                 
#6 1.719 Get:4 https://packages.cloud.google.com/apt cloud-sdk-buster InRelease [6774 B]
#6 6.642 Get:5 https://packages.cloud.google.com/apt cloud-sdk-buster/main amd64 Packages [180 kB]
#6 7.197 Reading package lists...
#6 11.71 E: Repository 'http://security.debian.org/debian-security buster/updates InRelease' changed its 'Suite' value from 'stable' to 'oldstable'
#6 11.71 E: Repository 'http://deb.debian.org/debian buster InRelease' changed its 'Suite' value from 'stable' to 'oldstable'
#6 11.71 E: Repository 'http://deb.debian.org/debian buster-updates InRelease' changed its 'Suite' value from 'stable-updates' to 'oldstable-updates'

To Reproduce
Steps to reproduce the behavior:
Build the docker image

Error: Failed to get existing workspaces

Hi there,

Describe the bug
When running make one-click-mlflow, I get the following error:

Error: Failed to get existing workspaces: querying Cloud Storage failed: storage: bucket doesn't exist

It seems that Terraform is unable to get my workspace.

To Reproduce

> terraform version
Terraform v1.4.2
on linux_amd64
  1. Create a project and make sure your project is part of an organization
  2. Run make one-click-mlflow
  3. See the error

Expected behavior
Terraform should recognize my workspace.

Problem with get_token()

Describe the bug

I get the following error when I try to test one-click-mlflow:

(venv) ucsky@machine:~/try/one-click-mlflow$ python examples/track_experiment.py 
Enter your project ID: ofi-ai-try
Enter the name of your MLFlow experiment: test
Traceback (most recent call last):
  File "/home/ucsky/try/one-click-mlflow/examples/track_experiment.py", line 5, in <module>
    import mlflow_config
  File "/home/ucsky/try/one-click-mlflow/examples/mlflow_config.py", line 61, in <module>
    os.environ["MLFLOW_TRACKING_TOKEN"] = get_token()
  File "/home/ucsky/try/one-click-mlflow/examples/mlflow_config.py", line 17, in get_token
    token = _get_token()
  File "/home/ucsky/try/one-click-mlflow/examples/mlflow_config.py", line 35, in _get_token
    open_id_connect_token = id_token.fetch_id_token(Request(), client_id)
  File "/home/ucsky/try/one-click-mlflow/examples/venv/lib/python3.9/site-packages/google/oauth2/id_token.py", line 252, in fetch_id_token
    credentials = service_account.IDTokenCredentials.from_service_account_info(
  File "/home/ucsky/try/one-click-mlflow/examples/venv/lib/python3.9/site-packages/google/oauth2/service_account.py", line 528, in from_service_account_info
    signer = _service_account_info.from_dict(
  File "/home/ucsky/try/one-click-mlflow/examples/venv/lib/python3.9/site-packages/google/auth/_service_account_info.py", line 46, in from_dict
    missing = keys_needed.difference(six.iterkeys(data))
  File "/home/ucsky/try/one-click-mlflow/examples/venv/lib/python3.9/site-packages/six.py", line 599, in iterkeys
    return iter(d.keys(**kw))
AttributeError: 'NoneType' object has no attribute 'keys'

To Reproduce
Install with make one-click-mlflow, then run:

cd examples
python3 -m venv venv 
source venv/bin/activate
pip install -r requirements.txt
python track_experiment.py

Expected behavior
The experiment is tracked in MLflow.

Desktop (please complete the following information):

lsb_release -a
LSB Version:	core-11.1.0ubuntu2-noarch:security-11.1.0ubuntu2-noarch
Distributor ID:	Pop
Description:	Pop!_OS 21.04
Release:	21.04
Codename:	hirsute
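
The traceback shows fetch_id_token trying to build service-account credentials from empty data, which points at missing or unusable application-default credentials. A hedged sketch of one way to satisfy it, assuming a service-account key file is available (the path and client ID below are hypothetical):

import os

from google.auth.transport.requests import Request
from google.oauth2 import id_token

# fetch_id_token() needs credentials it can mint an ID token from; pointing
# GOOGLE_APPLICATION_CREDENTIALS at a service-account key file is one way.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/sa-key.json"  # hypothetical

token = id_token.fetch_id_token(
    Request(), "IAP_CLIENT_ID.apps.googleusercontent.com"  # hypothetical audience
)
print(token)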
