cooelf / auto-ui

Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (stay tuned and more will be updated)

License: Apache License 2.0

Python 100.00%

auto-ui's Issues

Any demo for inference with a mobile screenshot and a prompt?

Thanks for the nice work.
Is there any demo code that runs inference on a prompt and a mobile screenshot and returns the response from Auto-UI?

Hello, great work, guys!

I wanted to use your model from the Hugging Face model library, but I don't see any usage instructions. Will you be adding usage instructions or a model card any time soon?

Link for Dataset and trained models is not working.

Hello, I find this project interesting, but the link provided for accessing the preprocessed data and the trained models is not working. The link you provided is:
https://huggingface.co/cooelf/Auto-UI/tree/main
Can you provide the right link so we can look into the dataset structure and format?

What are the BLIP-2 feature extractor details?

Thanks for the work. I'd like to run inference with this model on custom images and goals, so I tried to write the inference code myself.

However, I found that the obj file unpickles the image as a tensor, so I'd like to know which conversion method was used to load the image.

According to utils_data.py, image_ids is obtained via image_ids = torch.tensor(source_image).squeeze().
According to the paper, "Given a screenshot $X_{\text{screen}} \in \mathbb{R}^{h \times w \times 3}$ with height $h$ and width $w$ at step $t \in [1, k]$, we first feed it to a frozen image encoder (e.g., BLIP-2 (Li et al., 2023)) and extract vision features $H_{\text{screen}} \in \mathbb{R}^{1 \times d_s}$ where $d_s$ is the dimension of the vision features."

So I believe the images are pickled after their features have been extracted into tensors. But there are no details about the feature extraction or about the BLIP-2 model used for it.
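One way to check what the pickled file holds, before guessing at the extractor, is to load it and print a structural summary. The sketch below is only an illustration: the helper `describe`, the file name `demo.obj`, and the record layout (a list of dicts with `goal` and `image` keys) are assumptions made up for the demo, not the repository's actual format. The 1408-element vector mirrors the `--img_dim 1408` value used with `--img_type blip` elsewhere on this page. Only unpickle files from a source you trust, since `pickle.load` can execute arbitrary code.

```python
import pickle
import tempfile
from pathlib import Path


def describe(obj, depth=0, max_depth=4):
    """Return a short structural summary of a nested Python object."""
    if hasattr(obj, "shape"):  # NumPy arrays and torch tensors expose .shape
        return f"{type(obj).__name__}{tuple(obj.shape)}"
    if depth >= max_depth:
        return type(obj).__name__
    if isinstance(obj, dict):
        items = list(obj.items())[:5]  # summarise at most five keys
        inner = ", ".join(f"{k!r}: {describe(v, depth + 1, max_depth)}" for k, v in items)
        return "{" + inner + "}"
    if isinstance(obj, (list, tuple)):
        if not obj:
            return f"{type(obj).__name__}(empty)"
        # Summarise the first element only; records are assumed homogeneous.
        return f"{type(obj).__name__}[{len(obj)}] of {describe(obj[0], depth + 1, max_depth)}"
    return type(obj).__name__


# Demo with a made-up record layout (hypothetical, not the real dataset format).
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.obj"
    with open(path, "wb") as handle:
        pickle.dump([{"goal": "open settings", "image": [0.0] * 1408}], handle)
    with open(path, "rb") as handle:
        print(describe(pickle.load(handle)))
```

Running the same helper on one of the real obj files would show whether each image entry is already a fixed-size feature vector rather than raw pixels.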

The loss of the base model during training does not decrease

Hello, I am trying to replicate the results of the base model using "declare-lab/flan-alpaca-base" from Hugging Face. I followed the training commands in the README; however, the loss does not decrease, and inference fails to produce any meaningful output. Below is a partial excerpt from my trainer_state for reference:
{
"epoch": 0.02,
"learning_rate": 3.135779241141424e-06,
"loss": 17.987,
"step": 500
},
{
"epoch": 0.03,
"learning_rate": 6.271558482282848e-06,
"loss": 17.9571,
"step": 1000
},
……
{
"epoch": 9.99,
"learning_rate": 1.320328101533231e-07,
"loss": 16.2255,
"step": 318500
},
{
"epoch": 10.0,
"eval_gen_len": 1.0,
"eval_loss": 17.40145492553711,
"eval_rouge1": 0.007,
"eval_rouge2": 0.0,
"eval_rougeL": 0.0069,
"eval_rougeLsum": 0.007,
"eval_runtime": 411.3956,
"eval_samples_per_second": 21.349,
"eval_steps_per_second": 0.168,
"step": 318900
}
When I run inference with the resulting model, the generated content is entirely useless:
'- nooutput> - nooutput> - nooutput> - nooutput> - nooutput> - nooutput> - nooutput> - nooutput> - nooutput> - nooutput> - nooutput> - nooutput> - nooutput> - nooutput> '

What could be causing the problems above? Looking forward to your answer, thank you!
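A quick sanity check on an excerpt like the one above is to separate "the schedule is wrong" from "the model is not learning". The sketch below assumes the run used a peak learning rate of 1e-4 with 5% warmup (the values in the training command quoted elsewhere on this page) and a linear warmup followed by linear decay to zero; the function names are hypothetical helpers, not part of the repository. Under those assumptions the three logged learning rates are reproduced, which suggests the scheduler behaved as configured, while the loss fell by only about a tenth over the whole run.

```python
import math


def linear_schedule_lr(step, peak_lr, total_steps, warmup_steps):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


def relative_loss_drop(log_history):
    """Fractional decrease from the first to the last logged training loss."""
    losses = [entry["loss"] for entry in log_history if "loss" in entry]
    return (losses[0] - losses[-1]) / losses[0]


# Values copied from the trainer_state excerpt above.
log_history = [
    {"step": 500, "learning_rate": 3.135779241141424e-06, "loss": 17.987},
    {"step": 1000, "learning_rate": 6.271558482282848e-06, "loss": 17.9571},
    {"step": 318500, "learning_rate": 1.320328101533231e-07, "loss": 16.2255},
]

TOTAL_STEPS = 318900   # final step in the excerpt
WARMUP_STEPS = 15945   # 5% of TOTAL_STEPS (assumed warmup_ratio of 0.05)
PEAK_LR = 1e-4         # assumed, taken from the quoted training command

for entry in log_history:
    predicted = linear_schedule_lr(entry["step"], PEAK_LR, TOTAL_STEPS, WARMUP_STEPS)
    matches = math.isclose(predicted, entry["learning_rate"], rel_tol=1e-6)
    print(entry["step"], entry["learning_rate"], predicted, matches)

print(f"relative loss drop: {relative_loss_drop(log_history):.3f}")
```

If the schedule checks out, the next things to inspect would be the inputs and labels fed to the model rather than the optimizer settings.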

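The single_parsed_episode_t5_blip data for inference is not in the blip dataset at the provided link

After unzipping the blip.zip file from https://huggingface.co/cooelf/Auto-UI/tree/main, I could not find the single_parsed_episode_t5_blip data used for inference. Where can I get this data? I would like to try inference.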
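To confirm whether a name is really absent from an archive, rather than nested in an unexpected subfolder, the member list can be searched without unpacking anything. This is a minimal sketch using only the standard library; `find_in_zip` is a hypothetical helper, and the demo archive and its member name are made up for illustration.

```python
import tempfile
import zipfile
from pathlib import Path


def find_in_zip(zip_path, needle):
    """Return archive member names that contain `needle`."""
    with zipfile.ZipFile(zip_path) as archive:
        return [name for name in archive.namelist() if needle in name]


# Demo with a tiny stand-in archive; the member name is invented.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("blip/general_blip_demo.obj", b"")
    print(find_in_zip(demo_zip, "general_blip"))
    print(find_in_zip(demo_zip, "single_parsed_episode_t5_blip"))
```

Pointing `find_in_zip` at the downloaded blip.zip with the name from the question would show directly whether any member matches.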

How to deploy this model on SageMaker?

I tried deploying this model on SageMaker following the instructions at https://huggingface.co/docs/sagemaker/inference#deploy-a-model-from-the-hub, and the inference calls fail with the following error:

OSError: /.sagemaker/mms/models/cooelf__Auto-UI does not appear to have a file named config.json. Checkout 'https://huggingface.co//.sagemaker/mms/models/cooelf__Auto-UI/None' for available files.

Any pointers on how to get this running on SageMaker?
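The error above says the loader expected a config.json at the top level of the model directory and did not find one. A local pre-flight check along these lines can catch that before deploying. It is only a sketch: `missing_files` is a hypothetical helper, and the default list of required files contains just the one file named in the error message.

```python
import tempfile
from pathlib import Path


def missing_files(model_dir, required=("config.json",)):
    """Return the names in `required` that are absent from `model_dir`."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


# Demo on a throwaway directory standing in for a downloaded model folder.
with tempfile.TemporaryDirectory() as tmp:
    print(missing_files(tmp))  # config.json is reported missing
    (Path(tmp) / "config.json").write_text("{}")
    print(missing_files(tmp))  # nothing is missing now
```

If the check fails on a local copy of the repository contents, the files probably need to be unpacked or rearranged into a standard model layout before any hosted loader can read them.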

Unable to run the model

Hi, I am following the steps in the README to run the model.
My goal is to run the model on my own inputs; I don't want to train it.

I did the following:

Downloaded the blip dataset from https://huggingface.co/cooelf/Auto-UI/tree/main and placed it in the dataset folder.

On running the command:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py \
    --data_root blip \
    --model declare-lab/flan-alpaca-base \
    --epoch 10 --lr 1e-4 \
    --user_msg seq_future_blip_axis_all0.1_hist8_future4 --img_type blip --img_dim 1408 \
    --bs 4 --eval_bs 16 --input_len 512 --output_len 128 --eval_acc 40 \
    --transform_axis --warmup_ratio 0.05 \
    --all_data 0.1 \
    --use_history 8 \
    --use_future 4 \
    --eval_subset dataset/blip/general_blip \
    --output_dir experiments

I get the following error:

args Namespace(all_data=0.1, bs=4, data_ratio=None, data_root='blip', debug_num=None, epoch=10, eval_acc=40, eval_bs=16, eval_name=None, eval_subset='dataset/blip/general_blip', evaluate_dir=None, final_eval=False, img_dim=1408, img_type='blip', input_len=512, local_rank=-1, lr=0.0001, model='declare-lab/flan-alpaca-base', output_dir='experiments', output_len=128, seed=42, transform_axis=True, use_future=4, use_generate=True, use_history=8, use_img_history=False, use_layout=False, user_msg='seq_future_blip_axis_all0.1_hist8_future4', warmup_ratio=0.05)
====Input Arguments====
{
  "data_root": "blip",
  "output_dir": "experiments",
  "model": "declare-lab/flan-alpaca-base",
  "data_ratio": null,
  "eval_name": null,
  "local_rank": -1,
  "epoch": 10,
  "lr": 0.0001,
  "warmup_ratio": 0.05,
  "bs": 4,
  "debug_num": null,
  "input_len": 512,
  "output_len": 128,
  "img_dim": 1408,
  "eval_bs": 16,
  "eval_acc": 40,
  "all_data": 0.1,
  "eval_subset": "dataset/blip/general_blip",
  "use_history": 8,
  "use_img_history": false,
  "use_future": 4,
  "use_layout": false,
  "transform_axis": true,
  "use_generate": true,
  "final_eval": false,
  "user_msg": "seq_future_blip_axis_all0.1_hist8_future4",
  "img_type": "blip",
  "evaluate_dir": null,
  "seed": 42
}
[20:18:30] [Model]: Loading declare-lab/flan-alpaca-base...    main.py:83
           [Data]: Reading data...    main.py:84
experiments/seq_future_blip_axis_all0.1_hist8_future4_declare-lab-flan-alpaca-base_blip_lr0.0001_bs0_ip512_op128_ep10
model.safetensors: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 990M/990M [00:17<00:00, 56.1MB/s]
Some weights of T5ForMultimodalGeneration were not initialized from the model checkpoint at declare-lab/flan-alpaca-base and are newly initialized: ['mha_layer.out_proj.bias', 'gate_dense.bias', 'mha_layer.in_proj_bias', 'image_dense.weight', 'mha_layer.out_proj.weight', 'mha_layer.in_proj_weight', 'gate_dense.weight', 'image_dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of T5ForMultimodalGeneration were not initialized from the model checkpoint at declare-lab/flan-alpaca-base and are newly initialized: ['mha_layer.out_proj.bias', 'gate_dense.bias', 'mha_layer.in_proj_bias', 'gate_dense.weight', 'mha_layer.in_proj_weight', 'mha_layer.out_proj.weight', 'image_dense.weight', 'image_dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of T5ForMultimodalGeneration were not initialized from the model checkpoint at declare-lab/flan-alpaca-base and are newly initialized: ['image_dense.bias', 'mha_layer.out_proj.weight', 'image_dense.weight', 'mha_layer.in_proj_bias', 'gate_dense.bias', 'gate_dense.weight', 'mha_layer.out_proj.bias', 'mha_layer.in_proj_weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of T5ForMultimodalGeneration were not initialized from the model checkpoint at declare-lab/flan-alpaca-base and are newly initialized: ['mha_layer.out_proj.weight', 'mha_layer.in_proj_weight', 'mha_layer.in_proj_bias', 'gate_dense.bias', 'gate_dense.weight', 'mha_layer.out_proj.bias', 'image_dense.bias', 'image_dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of T5ForMultimodalGeneration were not initialized from the model checkpoint at declare-lab/flan-alpaca-base and are newly initialized: ['image_dense.weight', 'mha_layer.out_proj.weight', 'image_dense.bias', 'mha_layer.out_proj.bias', 'mha_layer.in_proj_bias', 'gate_dense.bias', 'mha_layer.in_proj_weight', 'gate_dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of T5ForMultimodalGeneration were not initialized from the model checkpoint at declare-lab/flan-alpaca-base and are newly initialized: ['mha_layer.out_proj.bias', 'gate_dense.bias', 'gate_dense.weight', 'mha_layer.in_proj_bias', 'mha_layer.out_proj.weight', 'mha_layer.in_proj_weight', 'image_dense.weight', 'image_dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of T5ForMultimodalGeneration were not initialized from the model checkpoint at declare-lab/flan-alpaca-base and are newly initialized: ['mha_layer.in_proj_bias', 'mha_layer.in_proj_weight', 'gate_dense.bias', 'image_dense.weight', 'mha_layer.out_proj.weight', 'mha_layer.out_proj.bias', 'gate_dense.weight', 'image_dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of T5ForMultimodalGeneration were not initialized from the model checkpoint at declare-lab/flan-alpaca-base and are newly initialized: ['mha_layer.in_proj_bias', 'gate_dense.weight', 'gate_dense.bias', 'mha_layer.out_proj.bias', 'mha_layer.in_proj_weight', 'image_dense.bias', 'mha_layer.out_proj.weight', 'image_dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
generation_config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 142/142 [00:00<00:00, 25.5kB/s]
loading general 0
loading general 0
loading general 0
loading general 0loading general
 0
loading general loading general0 
0
loading general 0
loading google_apps 7580
loading google_apps 7580
loading google_apps 7580
loading google_apps 7580
loading google_apps 7580
loading google_apps 7580
loading google_apps 7580
loading google_apps 7580
[2024-01-07 20:20:07,853] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 19300 closing signal SIGTERM
[2024-01-07 20:20:07,855] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 19301 closing signal SIGTERM
[2024-01-07 20:20:07,855] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 19302 closing signal SIGTERM
[2024-01-07 20:20:07,855] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 19303 closing signal SIGTERM
[2024-01-07 20:20:07,855] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 19304 closing signal SIGTERM
[2024-01-07 20:20:07,855] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 19306 closing signal SIGTERM
[2024-01-07 20:20:07,855] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 19307 closing signal SIGTERM
[2024-01-07 20:20:08,928] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: -9) local_rank: 5 (pid: 19305) of binary: /home/skirti/.pyenv/versions/3.8.11/bin/python
Traceback (most recent call last):
  File "/home/skirti/.pyenv/versions/3.8.11/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/skirti/.pyenv/versions/3.8.11/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/skirti/.pyenv/versions/3.8.11/lib/python3.8/site-packages/torch/distributed/launch.py", line 196, in <module>
    main()
  File "/home/skirti/.pyenv/versions/3.8.11/lib/python3.8/site-packages/torch/distributed/launch.py", line 192, in main
    launch(args)
  File "/home/skirti/.pyenv/versions/3.8.11/lib/python3.8/site-packages/torch/distributed/launch.py", line 177, in launch
    run(args)
  File "/home/skirti/.pyenv/versions/3.8.11/lib/python3.8/site-packages/torch/distributed/run.py", line 797, in run
    elastic_launch(
  File "/home/skirti/.pyenv/versions/3.8.11/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/skirti/.pyenv/versions/3.8.11/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
main.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-01-07_20:20:07
  host      : 211b70a3
  rank      : 5 (local_rank: 5)
  exitcode  : -9 (pid: 19305)
  error_file: <N/A>
  traceback : Signal 9 (SIGKILL) received by PID 19305
============================================================

Any pointers on what is causing this?
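The log ends with exitcode -9 and "Signal 9 (SIGKILL)" for one worker. A negative exit code is the usual POSIX-style way of saying the process was terminated by that signal number rather than exiting on its own, so no Python traceback from main.py is available. The sketch below decodes such codes with the standard library; `describe_exitcode` is a hypothetical helper. A SIGKILL arriving while several processes each load the dataset is often the kernel's out-of-memory killer, though that is an assumption the log alone does not confirm.

```python
import signal


def describe_exitcode(code):
    """Translate a process exit code into a short explanation."""
    if code < 0:
        try:
            return f"killed by {signal.Signals(-code).name}"
        except ValueError:
            # The number does not map to a known signal on this platform.
            return f"killed by unknown signal {-code}"
    if code == 0:
        return "exited normally"
    return f"exited with error status {code}"


print(describe_exitcode(-9))
print(describe_exitcode(0))
print(describe_exitcode(1))
```

Checking the system log for out-of-memory messages, or retrying with fewer processes than the 8 set by --nproc_per_node, would help tell a memory problem apart from other causes.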
