
metaflow-ui's Introduction

Metaflow UI

Metaflow UI is a tool to monitor Metaflow workflows in real-time.

Getting started

Using Metaflow UI requires Metaflow Service for now.

To set up a local development environment, see docs/README.md.

Deploying the UI

Deploying Metaflow UI requires setting up a UI service (which is different from the Metaflow service but uses the same backing database). To deploy the UI service, follow instructions at Metaflow UI Service.

Docker support

Dockerfile provides support for an nginx container hosting the production build of the application.

# Build Docker image
$ docker build --tag metaflow-ui:latest .
# Run Docker container on port 3000
$ docker run -p 3000:3000 metaflow-ui:latest
# Run Docker container using custom API endpoint
$ docker run -p 3000:3000 -e METAFLOW_SERVICE=http://custom-ui-backend/api metaflow-ui:latest

For example, when used with a locally deployed Metaflow UI Service, the UI can be launched with

docker run -p 3000:3000 -e METAFLOW_SERVICE=http://localhost:8083/ metaflow-ui:latest

Dockerfile also supports the following environment variables to inject content into the UI's index.html:

  • METAFLOW_HEAD - Inject content to head element
  • METAFLOW_BODY_BEFORE - Inject content at the beginning of body element
  • METAFLOW_BODY_AFTER - Inject content at the end of body element

Use cases for these variables range from additional meta tags to analytics script injection.

Example on how to add a keyword meta tag to Metaflow UI:

METAFLOW_HEAD='<meta name="keywords" content="metaflow" />'

Plugins development

See docs/plugin-system.md to get started with plugins development.

Documentation

See docs/README.md to learn more.

General Metaflow documentation available here:

Contributing

We welcome contributions to Metaflow. Please see our contribution guide for more details.

Get in Touch

There are several ways to get in touch with us:

metaflow-ui's People

Contributors

0xrushi, darinyu, jackie-ob, msavela, oavdeev, obgibson, romain-intel, rsanteri, saikonen, savingoyal, tuulos, valaydave

metaflow-ui's Issues

add another breakdown of running jobs, by current-step

Summary

Adding another column for the "current step" of each run, and allowing filters on it (just like the Running/Failed/Completed status, only with more options), would be very useful.

Once you submit dozens of jobs which will be running for hours, it is very nice to see where they currently stand (that we've passed something that was recently fixed, for instance), and make sure that the pace makes sense and we can estimate how long it will take.

An alternative solution will be a different page entirely, grouping the current steps of running-jobs only showing some kind of graph. I'll take that as well.

Alternatives: None, except for manually querying the relevant S3 paths or opening lots of tabs.
I don't think the AWS UI has a per-step Step Functions dashboard either.

The complications I can think of are having lots of steps (but this can be handled in the UI) and needing a higher refresh rate, since steps can change more quickly than other fields.
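
The requested breakdown amounts to grouping running runs by their current step. A minimal sketch of that grouping, assuming hypothetical run records with `status` and `current_step` fields (neither the field names nor the sample data come from the UI or the service):

```python
from collections import Counter

# Hypothetical run records; the field names are assumptions for illustration.
runs = [
    {"run": "101", "status": "running", "current_step": "train"},
    {"run": "102", "status": "running", "current_step": "train"},
    {"run": "103", "status": "running", "current_step": "evaluate"},
    {"run": "104", "status": "completed", "current_step": "end"},
    {"run": "105", "status": "failed", "current_step": "train"},
]


def running_by_step(runs):
    """Count runs that are still running, grouped by their current step."""
    return Counter(r["current_step"] for r in runs if r["status"] == "running")


breakdown = running_by_step(runs)
for step, count in breakdown.most_common():
    print(f"{step}: {count}")
```

The same counts could back either a filterable column or a separate per-step summary page.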

Unable to view DAG

Description

Unable to view DAG.

DAG encountered an unexpected error. This should not happen and might be caused by unexpected data.

Steps to Reproduce

  1. Navigate to Metaflow UI
  2. Click on flow that completed successfully, it will show the Timeline tab
  3. Click on DAG tab

Expected behavior:

See the DAG.

Reproduces how often:

Every time.

Versions

OS: macOS catalina 10.15.7
Application version: v1.0.0
Service version: 2.2.1--

Additional Information

These flows are deployed to Step Functions. I see this in the Developer console:

react-dom.production.min.js:209 TypeError: Cannot read properties of undefined (reading 'box_ends')
    at gO (DAGUtils.ts:58:28)
    at hO (DAGUtils.ts:104:10)
    at tE (index.tsx:22:33)
    at Zi (react-dom.production.min.js:153:146)
    at Fa (react-dom.production.min.js:175:309)
    at _l (react-dom.production.min.js:263:406)
    at gs (react-dom.production.min.js:246:265)
    at ms (react-dom.production.min.js:246:194)
    at ls (react-dom.production.min.js:239:172)
    at react-dom.production.min.js:123:115

Time zone selection does not seem to adapt to daylight saving

Description

If I select America/Los Angeles as my timezone, it is marked as GMT-7 but the current time should be GMT-8.

Steps to Reproduce

  1. Select the America/Los Angeles time zone
  2. Notice that things are off by 1h.
  3. Selecting the America/Juneau timezone makes it all work :) .

Expected behavior:

Timezone would respect daylight saving.

Actual behavior:

Timezone selection does not respect daylight saving.
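
The expected behavior can be illustrated with Python's standard `zoneinfo` module, which resolves the offset for a zone at a specific date rather than using a fixed value. This is only an illustration of DST-aware offsets, not how the UI computes them; the two dates are arbitrary examples.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def utc_offset_hours(zone_name: str, moment: datetime) -> float:
    """Return the UTC offset in hours for zone_name at the given naive local time."""
    localized = moment.replace(tzinfo=ZoneInfo(zone_name))
    return localized.utcoffset().total_seconds() / 3600


winter = datetime(2021, 12, 1, 12, 0)  # standard time in Los Angeles
summer = datetime(2021, 7, 1, 12, 0)   # daylight saving time in Los Angeles

print(utc_offset_hours("America/Los_Angeles", winter))  # -8.0
print(utc_offset_hours("America/Los_Angeles", summer))  # -7.0
```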

Reproduces how often:

Always

Versions

Application version: v1.0.2
Service version: 2.1.0-2021-11-15 16:13:11 PST-15cd9f7fa9d5df15d9fe7dac6a60667383c4fa6b|75ca8de715480c84198fbde993908e6a9cb9747d

Additional Information

Internal deployment at Netflix.

Distinguish "killed by orchestrator" and actual failures

Summary

When a task dies, it can kill all the other concurrently running tasks, and the UI does not distinguish these two "deaths".

Motivation

With a wide foreach, if there is a single failure somewhere, it can be difficult to figure out which one it is because all the tasks are marked as failed. It may be nice to be able to distinguish "user failure" from "killed by orchestrator" type of failures so the user may more easily narrow down on the actual cause of the failure.

Metaflow-UI won't connect

Description

When I run the Docker image of the UI and map it to port 3000, the terminal shows that the processes are being started, but the UI shows "waiting for connection".

Expected behavior:

  • The UI would start up and I would be able to submit DAGs.

Actual behavior:

  • The UI keeps waiting for connection and the submitted tags show [Error111] Connection refused

Reproduces how often:

  • Every time

Versions

  • Metaflow 2.5 & 2.7
  • Python 3.8
  • Ubuntu 20.04

Additional Information

  • The DAG works fine when submitted without the UI parameters
    Screenshot from 2022-08-07 11-47-52
    Screenshot from 2022-08-07 11-48-48

Running flow does not update UI with latest steps

Description

Running flow does not update UI with latest steps. Issue was observed when running a flow from Argo workflows x Kubernetes.

The flow had a start step that branches into a 3-way foreach before joining again.

Steps to Reproduce

  1. Trigger Argo Workflows to run a flow
  2. MFGUI shows the ongoing run. Click into it.
  3. Start task shows up. New tasks don't show up automatically
  4. Refresh shows the latest running / completed tasks of the run.

Expected behavior:

Tasks show up dynamically without refresh as they run.

Actual behavior:

Refresh needed to see latest tasks

Reproduces how often:

We only tried a couple of times.

Versions

<not sure / weeks old release as of 4/5/2023>

Request an example case for Flow from metaflow-ui

Summary

It would be nice to have an example case in metaflow-ui.
I proceeded with a docker container, but I don't know how to add Flow to metaflow-ui.
If you add some use cases for metaflow and metaflow-ui, users will be able to use them well.
You must be busy with a lot of work, but please create a use case to activate metaflow-ui open source.

Thanks :)

image

 METAFLOW_SERVICE_URL=http://0.0.0.0:8083/ METAFLOW_DEFAULT_METADATA=service python3 ./case1.py run
Metaflow 2.8.0 executing LinearFlow for user:test
Validating your flow...
    The graph looks good!
Running pylint...
    Pylint is happy!
    Metaflow service error:
    Metadata request (/flows/LinearFlow) failed (code 405): 405: Method Not Allowed

Tags do not seem to update when present in the URL but also in the session

Description

If I go to the UI with a given tag in the URL (i.e. &_tags=XYZ) but I had previously been on the UI with a tag ABC selected, the tag in the URL is ignored and only the tag that was previously selected is used.

Steps to Reproduce

This is always reproducible as described above.
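
The precedence the report expects (a tag in the URL wins over a tag remembered from a previous session) can be sketched as follows. The `_tags` parameter name comes from the report; the function name, the comma-separated format, and the example URL are assumptions for illustration, not the UI's actual code.

```python
from urllib.parse import parse_qs, urlparse


def effective_tags(url, session_tags):
    """Prefer tags given in the URL; fall back to the remembered session tags."""
    query = parse_qs(urlparse(url).query)
    if "_tags" in query:
        # Assumption: multiple tags are comma-separated in a single parameter.
        return [tag for value in query["_tags"] for tag in value.split(",") if tag]
    return list(session_tags)


print(effective_tags("http://localhost:3000/?status=completed&_tags=XYZ", ["ABC"]))
print(effective_tags("http://localhost:3000/", ["ABC"]))
```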

Versions

Application version: v1.0.2
Service version: 2.1.0-2021-11-15 16:13:11 PST-15cd9f7fa9d5df15d9fe7dac6a60667383c4fa6b|75ca8de715480c84198fbde993908e6a9cb9747d

Additional Information

Card listens to plugin height message

Description

The cards and plugins are rendered as iframes. The application talks to them using messages in order to get the height of the card or plugin so that it can be displayed correctly.

The message listeners are added to the window, so the height message from a plugin can be used by a card, and vice versa.

This can cause the heights to be incorrect.

Steps to Reproduce

  1. Add a plugin to MFGUI
  2. Run a flow with a card
  3. Go to the task page that contains the card
  4. Look at the iframe heights of the plugin and the card.
  5. Refresh if necessary, the order of messages is indeterminate.

Expected behavior:

The plugin should be the correct height, and the card should be the correct height.

Actual behavior:

The heights of the plugins and cards are sometimes incorrect.

Reproduces how often:

50%.

Versions

v1.1.4

Additional Information

The relevant code is in CardIframe.tsx. The message handler needs to check the source.
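
The idea of checking the source can be modeled outside the browser. The sketch below is a language-neutral illustration in Python, not the code in CardIframe.tsx: every name is hypothetical, and in the browser the equivalent check would compare the message event's source against the iframe's own window.

```python
class FrameHeights:
    """Tracks the last reported height per frame, keyed by the sender's identity."""

    def __init__(self, known_sources):
        self._heights = {source: None for source in known_sources}

    def on_message(self, source, message):
        """Apply a height message only to the frame that sent it."""
        # Ignore unknown senders and anything that is not a height message.
        if source not in self._heights or message.get("type") != "height":
            return False
        self._heights[source] = message["height"]
        return True

    def height_of(self, source):
        return self._heights[source]


frames = FrameHeights(["card", "plugin"])
frames.on_message("plugin", {"type": "height", "height": 120})
print(frames.height_of("plugin"))  # 120
print(frames.height_of("card"))    # None
```

Because each height is stored under its sender, the order in which the card and the plugin report no longer matters.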

Make the timeline view state part of the URL

Summary

One of the interesting features of the UI is the ability to share a URL and have the recipient of the URL see the same thing. This doesn't work for the timeline view, though: the state of the expand/collapse step view is not kept in the URL and reverts to a default setting. This makes it hard to share a timeline view that shows a specific task or set of tasks. It would be nice if this state could be kept.

Motivation

See above

Describe alternatives you've considered

Alternative would be as it is today. Another alternative might be to have a button to create a "shared URL". The reason for this is that putting it in the URL would mess with the back behavior which may not be desired.

Additional context

The behavior of the "back" button needs to be considered.
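
One way to picture the request is a round trip between the set of expanded steps and a query string. This is a minimal sketch under stated assumptions: the `expanded` parameter name and both function names are hypothetical, and it does not address the "back" button question raised above.

```python
from urllib.parse import parse_qs, urlencode


def encode_timeline_state(expanded_steps):
    """Serialize the expanded step names into a query string."""
    return urlencode({"expanded": ",".join(sorted(expanded_steps))})


def decode_timeline_state(query):
    """Recover the set of expanded step names from a query string."""
    values = parse_qs(query).get("expanded", [""])
    return {step for step in values[0].split(",") if step}


query = encode_timeline_state({"train", "start"})
print(query)
print(decode_timeline_state(query))
```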

Color "caught" tasks differently

Summary

Right now tasks are either marked "green" (done successfully), "light green" (running) or "red" (failed); we could introduce a new color to mark tasks that have failed but their failure was caught with the @catch decorator.

Motivation

Better visibility/insight about what is going on in the workflow. It would also better support patterns of "this has failed but I don't want to stop".

Describe alternatives you've considered

N/A

Additional context

N/A

Better identification of tasks in the UI

Summary

Currently, when wide foreaches (or worse, nested ones) are running, it is hard to tell which task corresponds to which "position in the foreach tree". For example, if you are doing a foreach on countries (let's say 3 countries) and inside you are doing a foreach on movie titles within each country (let's say 100 movies), your step per_title_compute for example would have 300 tasks and it is very hard to know, as they are running within the UI, which one is which. Identifying them and showing the country/title the task is processing would be very helpful. Even showing something like indices may help.

Motivation

This gives better visibility into what is executing.

Describe alternatives you've considered

One alternative is to print out the position of the task in the log at the beginning. This does give that information (so is useful) but doesn't make it highly visible in the UI and also does not make it filter-able. For example, if the user is looking for the execution of a particular country/title pair, they would have to click through 300 tasks to figure out which one they want to look at.
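
The country/title example can be made concrete with a small sketch that assigns each task of the inner step its position in the foreach tree. The sample values and the label format are assumptions for illustration; this does not use the Metaflow API.

```python
from itertools import product

# Illustrative inputs matching the example: 3 countries x 100 titles.
countries = ["US", "FR", "JP"]
titles = [f"title_{i}" for i in range(100)]


def foreach_labels(countries, titles):
    """Label every (country, title) task with its indices and a readable name."""
    labels = []
    for (ci, country), (ti, title) in product(enumerate(countries), enumerate(titles)):
        labels.append({"indices": (ci, ti), "label": f"{country}/{title}"})
    return labels


labels = foreach_labels(countries, titles)
print(len(labels))  # 300
print(labels[0])
```

With labels like these attached to tasks, a user could filter for a particular country/title pair instead of clicking through all 300 tasks.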

DAG view links use run_number even when run_id exists

Description

When clicking on the DAG on a step, the URL that is produced contains the run number even if a run id exists.

Steps to Reproduce

  1. Have a run that was run on step-functions (or equivalent)
  2. Navigate to its DAG page
  3. Click on any step
  4. Notice the URL no longer has something like sfn-XXXX but instead a numeric ID for the run ID
  5. If you replace this with the run ID (sfn-XXXX) it still works
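
The expected link behavior can be sketched as "prefer the run id when one exists, otherwise fall back to the run number". The `run_id` and `run_number` names come from the issue title; the record shape, the function name, and the URL layout are assumptions for illustration.

```python
def run_path(flow_id, run):
    """Build a link path that prefers run_id over run_number when run_id is set."""
    run_key = run.get("run_id") or run["run_number"]
    return f"/{flow_id}/{run_key}"


print(run_path("MyFlow", {"run_number": 42, "run_id": "sfn-XXXX"}))  # /MyFlow/sfn-XXXX
print(run_path("MyFlow", {"run_number": 42, "run_id": None}))        # /MyFlow/42
```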

Expected behavior:

Correct URL

Actual behavior:

Incorrect URL

Reproduces how often:

100%

Versions

1.1.3

Additional Information

Plugins sometimes fail to register

Description

When there are two (or more) plugins in a slot, one of them may fail to register. This issue is intermittent.

Steps to Reproduce

  1. Add two plugins to a slot
  2. Load a run or task page in MFGUI
  3. Observe the DevTools to see one of the plugins not loaded into the slot

Expected behavior:

Both plugins should be registered and their corresponding iframes should be in the correct slot in the DOM

Actual behavior:

One plugin will be in the correct place. The other plugin will be in the "hidden" element of the DOM, at the bottom of the page.

Reproduces how often:

On the Brave browser, we have seen it about 75% of the time. On Chrome it happens less often.

Versions

1.1.2 of metaflow-ui and 2.2.4 of metaflow-service

Additional Information

We have only seen this issue on Netflix's system. Attempts to reproduce on a smaller system have failed.

We have tracked down the issue to a mismatch between the plugin's iframe name and the javascript within the iframe. We have seen that the javascript uses the incorrect name to send postMessages.

how to use in static ip

Description

I want to access metaflow-ui using static ip.
However, the screen does not appear when I try to access it from outside.
If you tell me what part needs to be corrected, I want to fix it.

Steps to Reproduce

  1. Run Docker
     docker run -p 3000:3000 -e METAFLOW_SERVICE=http://localhost:8083/ metaflow-ui:latest
  2. Access the URL
     localhost:3000

Expected behavior:

ui

Actual behavior:

404: Not Found

Reproduces how often:

Versions

ubuntu 20.04
metaflow-ui : 1.2.4

Additional Information

DAG Error Because of Python Script

Description

I want to check the DAG of the flow script, but an error occurs in the Python script

Steps to Reproduce

  1. Build and run metaflow-ui
     docker build --tag metaflow-ui:latest .
     docker run -p 3000:3000 -e METAFLOW_SERVICE=http://localhost:8083/ metaflow-ui:latest
     git clone https://github.com/Netflix/metaflow-service.git
     cd metaflow-service
     docker-compose -f docker-compose.development.yml up
  2. Run the flow script in the metaflow-ui container
     # install python package and metaflow
     METAFLOW_SERVICE_URL=http://0.0.0.0:8080/ METAFLOW_DEFAULT_METADATA=service python3 helloworld.py run

Expected behavior:

DAG GRAPH SHOW

image

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/services/ui_backend_service/data/cache/client/cache_server.py", line 307, in <module>
    cli(auto_envvar_prefix='MFCACHE')
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/root/services/ui_backend_service/data/cache/client/cache_server.py", line 301, in cli
    Scheduler(store, max_actions).loop()
  File "/root/services/ui_backend_service/data/cache/client/cache_server.py", line 199, in __init__
    maxtasksperchild=512,  # Recycle each worker once 512 tasks have been completed
  File "/usr/local/lib/python3.7/multiprocessing/context.py", line 119, in Pool
    context=self.get_context())
  File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 176, in __init__
    self._repopulate_pool()
  File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 241, in _repopulate_pool
    w.start()
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 112, in start
    self._popen = self._Popen(self)
  File "/usr/local/lib/python3.7/multiprocessing/context.py", line 277, in _Popen
    return Popen(process_obj)
  File "/usr/local/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/usr/local/lib/python3.7/multiprocessing/popen_fork.py", line 74, in _launch
    code = process_obj._bootstrap()
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/root/services/ui_backend_service/data/cache/client/cache_worker.py", line 29, in execute_action
    execute(tempdir, action_cls, request)
  File "/root/services/ui_backend_service/data/cache/client/cache_worker.py", line 56, in execute
    invalidate_cache=req.get('invalidate_cache', False))
  File "/root/services/ui_backend_service/data/cache/generate_dag_action.py", line 97, in execute
    results[result_key] = json.dumps(dag)
  File "/usr/local/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/root/services/ui_backend_service/data/cache/utils.py", line 130, in streamed_errors
    get_traceback_str()
  File "/root/services/ui_backend_service/data/cache/utils.py", line 124, in streamed_errors
    yield
  File "/root/services/ui_backend_service/data/cache/generate_dag_action.py", line 93, in execute
    dag = DataArtifact("{}/_graph_info".format(param_step.task.pathspec)).data
  File "/usr/local/lib/python3.7/site-packages/metaflow/client/core.py", line 825, in data
    obj = filecache.get_artifact(ds_type, location[6:], meta, *components)
  File "/usr/local/lib/python3.7/site-packages/metaflow/client/filecache.py", line 216, in get_artifact
    [name],
  File "/usr/local/lib/python3.7/site-packages/metaflow/datastore/task_datastore.py", line 364, in load_artifacts
    for (key, blob) in self._ca_store.load_blobs(to_load.keys()):
  File "/usr/local/lib/python3.7/site-packages/metaflow/datastore/content_addressed_store.py", line 140, in load_blobs
    with open(file_path, "rb") as f:

TypeError: expected str, bytes or os.PathLike object, not NoneType
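
The final frame of the traceback shows `open()` being called with a path that is `None`. The sketch below reproduces that failure mode in isolation and shows a guard that would surface it earlier. `read_blob` is a hypothetical helper, not part of Metaflow, and the sketch does not explain why the path was `None` in this deployment.

```python
from typing import Optional


def read_blob(file_path: Optional[str]) -> bytes:
    """Read a blob from disk, failing with a descriptive error if the path is missing."""
    if file_path is None:
        # Hypothetical guard: report the unresolved path instead of passing None to open().
        raise FileNotFoundError("blob path is None; the artifact location could not be resolved")
    with open(file_path, "rb") as f:
        return f.read()


# Reproduce the failure from the traceback: open() rejects None with a TypeError.
try:
    open(None, "rb")
except TypeError as err:
    message = str(err)
    print(message)
```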

Versions

Metaflow 2.8.0

Additional Information

Dynamic graph view

Summary

The static graph view is a nice way to get an overview of the DAG but it would sometimes be nicer to be able to have a fuller dynamic and "exploded" view of the graph. This is somewhat related to #88 .

Motivation

Users sometimes find that the DAG view does not present a good enough picture of things. It can also be confusing with nested foreaches.

Describe alternatives you've considered

An alternative approach may be to provide a view directly in the task view (as a plugin) and show where that task sits within the graph.

Improve access to code tarball in the UI

Summary

Provide a simple button to download the code for a Metaflow run.

Motivation

The code for the run is typically available but hard to access. The path to the code package is visible in the UI but somewhat hidden and there is no simple way to download it.

Describe alternatives you've considered

None

Additional context

Related to Netflix/metaflow#1009

Cards always display with height == 0px

Description

I tried to create a card with the Markdown and HTML templates, but the card section shows nothing, as shown below.
image

I checked the iframe style, it shows the "height: 0px", if I change the value of height to 200px for example, then it can be displayed correctly.

Steps to Reproduce

  1. Metaflow step code example
    @card(type='blank')
    @step
    def start(self):
        logging.info('pipeline start')
        current.card.append(Markdown('# Timestamp'))
        current.card.append(Table([['first', 'second'],
                                   [1, 2]]))
        self.next(xxx)
  2. Run the pipeline with Metaflow
  3. Open the Metaflow UI and check the task in the start step

Expected behavior:
The card should display with correct height
image

Actual behavior:
It displayed a card iframe with height == 0px
image

Reproduces how often:
always

Versions

metaflow-ui v1.1.3
metaflow==2.7.1
metaflow-card-html==1.0.1

Additional Information

None

Step status color not always correct

Description

In some cases, the color of the step shows as "red" (failed) even though all the tasks in the step succeeded. The step should show green. The correct color sometimes returns on a page reload but not always. This behavior was observed on both Brave and Chrome.
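
The expected rule (a step is red only if at least one of its own tasks failed) can be written down as a small sketch. The status strings and the function name are assumptions for illustration and do not reflect how the UI currently derives step colors.

```python
def step_status(task_statuses):
    """Derive a step's status from the statuses of its own tasks only."""
    if not task_statuses:
        return "pending"
    if any(status == "failed" for status in task_statuses):
        return "failed"
    if any(status == "running" for status in task_statuses):
        return "running"
    return "completed"


print(step_status(["completed", "completed"]))          # completed
print(step_status(["completed", "failed", "running"]))  # failed
```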

Steps to Reproduce

  1. Navigate to a page where one task in one step failed. Notice that more than one step is red.

Expected behavior:

Proper colors

Actual behavior:

Some successful steps are marked as failures.
Reproduces how often:

It's fairly frequent but not sure what triggers it. It doesn't always happen. It's somewhat random.

Versions

Latest version (Netflix internal)

Additional Information

See screenshot.

Screenshot 2023-12-11 at 10 03 02 AM

DAG view does not auto update

Description

The colors on the DAG view to show the progression of execution do not update even if websocket is properly connected

Steps to Reproduce

  1. Go to a fairly large run
  2. Go to the DAG view
  3. Notice that the colors do not change unless you reload the page

Expected behavior:

DAG view updates state of tasks just like timeline would

Actual behavior:

It doesn't update

Reproduces how often:

All the time

Versions

1.1.3

Additional Information

None

Crashed flows appear to run forever

Description

I've had a few flows which have crashed. The crashes are due to my own errors, not Metaflow itself. However, in the UI they continue to be marked as "Running".

Steps to Reproduce

Not sure how to reproduce it on demand.

Expected behavior:
Should be marked as failed

Actual behavior:
Stays in a "Running" state.

Reproduces how often:
1 in 16 so far.

Versions

Additional Information

setVisibility() not working

Description

Metaflow.setVisibility() in the plugin code does not work.

Steps to Reproduce

  1. Create a plugin
  2. In plugin.html add setVisibility(false) in the onReady callback
  3. Install the plugin

Expected behavior:

The plugin should be hidden

Actual behavior:

The plugin is not hidden

Reproduces how often:

100%

Versions

1.1.1 and 2.2.2

Additional Information

The issue is that the slot is not getting set in the message that is sent from setVisibility

docker build failing because "download_ui.sh not found"

Description

Trying to setup Metaflow UI locally to run on a Windows machine but building the docker image is failing.

Steps to Reproduce

When running docker build --tag metaflow-ui:latest ., I'm getting the error below.

=> [stage-1 15/20] WORKDIR /root                                                                                                                                                                                        0.1s 
 => [stage-1 16/20] RUN /opt/latest/bin/pip install .                                                                                                                                                                   12.9s 
 => ERROR [stage-1 17/20] RUN /root/services/ui_backend_service/download_ui.sh                                                                                                                                           0.4s 
------
 > [stage-1 17/20] RUN /root/services/ui_backend_service/download_ui.sh:
0.403 /bin/sh: 1: /root/services/ui_backend_service/download_ui.sh: not found
------
Dockerfile:43
--------------------
  41 |
  42 |     # Install Netflix/metaflow-ui release artifact
  43 | >>> RUN /root/services/ui_backend_service/download_ui.sh
  44 |
  45 |     # Migration Service
--------------------
ERROR: failed to solve: process "/bin/sh -c /root/services/ui_backend_service/download_ui.sh" did not complete successfully: exit code: 127
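
The report does not identify a cause. One common reason a script that exists is reported as "not found" when building on Windows is CRLF line endings introduced at checkout, which is an assumption here and not confirmed by the log. The sketch below shows how such line endings could be detected and normalized; the helper names and sample content are hypothetical.

```python
def has_crlf(script: bytes) -> bool:
    """Report whether the script contains Windows-style line endings."""
    return b"\r\n" in script


def to_lf(script: bytes) -> bytes:
    """Convert Windows-style line endings to Unix-style ones."""
    return script.replace(b"\r\n", b"\n")


# Hypothetical script content as it might look after a CRLF checkout.
sample = b"#!/bin/sh\r\necho download\r\n"
print(has_crlf(sample))         # True
print(has_crlf(to_lf(sample)))  # False
```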

Task status does not update

Description

When looking at the view of a task, the state of a task is highlighted by a little red/green/light-green line next to the task's name. This does not update live as the task finishes. It works fine when on the timeline view though.

Steps to Reproduce

  1. Go to a running flow
  2. Select a running task from the timeline view and click on it
  3. Wait for it to finish. You will see that the task's state does not update but will if you reload the page.

Expected behavior:

The websocket connection should stream the task's state result to the UI and it should update live like it does for the timeline view.

Actual behavior:

Live updates do not happen

Reproduces how often:

Every time

Versions

v1.1.3

Additional Information

yarn start fails

Description

Trying to spin up Metaflow UI locally on a Windows machine to test Metaflow.

Yarn installs successfully, but the getting-started guide (https://github.com/Netflix/metaflow-ui/blob/master/docs/README.md#getting-started) mentions running yarn start, and that script is not found.

Steps to Reproduce

C:\Windows\System32>yarn install
➤ YN0087: Migrated your project to the latest Yarn version

➤ YN0000: · Yarn 4.0.0-rc.53.git.20231006.hash-202e568
➤ YN0000: ┌ Resolution step
➤ YN0000: └ Completed
➤ YN0000: ┌ Fetch step
➤ YN0000: └ Completed
➤ YN0000: ┌ Link step
➤ YN0000: └ Completed
➤ YN0000: · Done in 0s 52ms

C:\Windows\System32>yarn start
Usage Error: Couldn't find a script named "start".

$ yarn run [--inspect] [--inspect-brk] [-T,--top-level] [-B,--binaries-only] [--require #0] <scriptName> ...

DAG, stderr, stdout logs not being retrieved and displayed in Metaflow UI

Description

We've got Metaflow and Metaflow UI deployed on AWS (on a local IP, so not publicly accessible), but the logs aren't being retrieved. There was initially an issue with our ServiceInfoUI container not having enough memory, but this was upped. The RDS burst balance was also too low, but upping the storage to 1000 GiB removed this queue and changed the error message to a generic error, so I don't think this is the issue any more.

The RDS is accessible, and appears to be storing the logs. The logs are also available from the relevant S3 buckets, Step Functions and Batch.

I can't find exactly where the UI is trying to pull data from, so not sure whether it's a permissions issue with access to the RDS, but the S3 bucket seems to be accessible. As far as I can see, the permissions/configuration is the same as the metaflow UI CF template, so was interested to know if anyone else had had/is having this issue.

Steps to Reproduce

  1. Not exactly certain, after running a few flows for a while, view the metaflow UI

Expected behavior:

The DAG, stderr and stdout views display the error messages being logged in CloudWatch.

Actual behavior:

Error messages don't appear:
image

Reproduces how often:

Every time the UI is used. I've previously looked at a public example from Outerbounds, but can't view that at the moment. This one wasn't having the same issue a couple of months ago.

Versions

Application version: 1.1.4
Service version: 2.3.2
My machine: MacOS 12.6
Viewing on Safari: v16.0
