AndroidWorld

Website • Paper

Overview

AndroidWorld is an environment for building and benchmarking autonomous computer control agents.

It runs on a live Android emulator and contains a highly reproducible benchmark of 116 hand-crafted tasks across 20 apps, which are dynamically instantiated with randomly-generated parameters to create millions of unique task variations.

In addition to the built-in tasks, AndroidWorld also supports the popular web benchmark MiniWoB++ from Liu et al.

Key features of AndroidWorld include:

  • ๐Ÿ“ 116 diverse tasks across 20 real-world apps
  • ๐ŸŽฒ Dynamic task instantiation for millions of unique variations
  • ๐Ÿ† Durable reward signals for reliable evaluation
  • ๐ŸŒ Open environment with access to millions of Android apps and websites
  • ๐Ÿ’พ Lightweight footprint (2 GB memory, 8 GB disk)
  • ๐Ÿ”ง Extensible design to easily add new tasks and benchmarks
  • ๐Ÿ–ฅ๏ธ Integration with MiniWoB++ web-based tasks

See demo videos on our website.

Installation

  1. Set up the Android Emulator

    1. Download Android Studio here
    2. Create an Android Virtual Device (AVD) by following these instructions. For hardware, select Pixel 6; for System Image, select Tiramisu (API Level 33); and set the AVD name to AndroidWorldAvd. Watch the setup video.
  2. Launch the Android Emulator from the command line

    Launch the emulator from the command line, not from the Android Studio UI, and pass the -grpc 8554 flag, which is needed for communication with the accessibility forwarding app.

    # Typically it's located in ~/Android/Sdk/emulator/emulator or
    # ~/Library/Android/sdk/emulator/emulator
    EMULATOR_NAME=AndroidWorldAvd # From previous step
    ~/Library/Android/sdk/emulator/emulator -avd $EMULATOR_NAME -no-snapshot -grpc 8554
  3. [Optional] It's recommended to use conda, which you can download here.

    conda create -n android_world python=3.11.8
    conda activate android_world
    
  4. Install the latest AndroidEnv:

    git clone https://github.com/google-deepmind/android_env.git
    cd android_env
    python setup.py install
  5. Install AndroidWorld. Note: Python 3.11 or above is required.

    git clone https://github.com/google-research/android_world.git
    cd ./android_world
    pip install -r requirements.txt
    python setup.py install
  6. Add model provider APIs as environment variables.

    # Add to .bashrc.
    export OPENAI_API_KEY=your-key
    export GCP_API_KEY=your-key
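
Before moving on, it can help to confirm that the pieces from the steps above are in place. The following is a minimal sketch and is not part of AndroidWorld: the helper names find_emulator and missing_api_keys are made up for illustration, the candidate paths are only the two typical locations mentioned in step 2, and which API keys you actually need depends on the agent you plan to run.

```python
import os
from pathlib import Path

# The two typical emulator locations mentioned in step 2 (an assumption:
# your SDK may be installed elsewhere).
CANDIDATE_EMULATOR_PATHS = (
    "~/Android/Sdk/emulator/emulator",
    "~/Library/Android/sdk/emulator/emulator",
)

# The environment variables exported in step 6.
EXPECTED_KEYS = ("OPENAI_API_KEY", "GCP_API_KEY")


def find_emulator(candidates=CANDIDATE_EMULATOR_PATHS):
    """Return the first candidate path that exists as a file, or None."""
    for candidate in candidates:
        path = Path(candidate).expanduser()
        if path.is_file():
            return path
    return None


def missing_api_keys(environ=None, expected=EXPECTED_KEYS):
    """Return the names of expected variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in expected if not environ.get(name)]


if __name__ == "__main__":
    print("emulator binary:", find_emulator() or "not found in the typical locations")
    print("missing API keys:", missing_api_keys() or "none")
```

If the emulator is reported as not found, pass your own path to find_emulator rather than relying on the defaults.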

Quickstart

Run the minimal_task_runner.py script to see the basic mechanics of AndroidWorld components. It initializes the environment, sets up a task, and runs the default agent, M3A, on it.

python minimal_task_runner.py --task=ContactsAddContact

If you don't specify a task, a random task will be selected. NOTE: If you want to try open-source apps, i.e., apps not included with Android OS, pass --perform_emulator_setup to the script below.

Run the benchmark

# --tasks is optional: it restricts the run to a subset of tasks.
python run.py \
  --suite_family=android_world \
  --agent_name=t3a_gpt4 \
  --perform_emulator_setup \
  --tasks=ContactsAddContact,ClockStopWatchRunning \
  -v=-2

The first time you run this script, you must install the necessary apps and set permissions by specifying --perform_emulator_setup. This is a one-time setup.

Above we specify the optional --tasks flag to run on a subset of tasks. Leave it empty to run on the entire AndroidWorld suite.

The n_task_combinations argument specifies how many parameter permutations to use for each task. For example, for an SMS task, it would correspond to different phone number/message combinations for each run.
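
To make the idea of parameter permutations concrete, here is an illustrative sketch of seeded task instantiation. It does not use AndroidWorld's code: the SmsParams type, the sample_sms_params helper, the message pool, and the phone number format are all hypothetical stand-ins chosen for this example.

```python
import random
from dataclasses import dataclass

# Hypothetical message pool; the real benchmark defines its own parameters.
MESSAGES = ("Running late", "See you at 5", "Call me back", "Happy birthday!")


@dataclass(frozen=True)
class SmsParams:
    phone_number: str
    message: str


def sample_sms_params(n_task_combinations, seed):
    """Draw n parameter sets for one task template, reproducibly from a seed."""
    rng = random.Random(seed)  # Local generator, so the global RNG is untouched.
    combos = []
    for _ in range(n_task_combinations):
        # Made-up format: "555" followed by seven random digits.
        number = "555" + "".join(str(rng.randint(0, 9)) for _ in range(7))
        combos.append(SmsParams(phone_number=number, message=rng.choice(MESSAGES)))
    return combos


if __name__ == "__main__":
    for params in sample_sms_params(n_task_combinations=3, seed=30):
        print(f"Send the text message '{params.message}' to {params.phone_number}")
```

Reusing the same seed reproduces the same parameter sets, which is one plausible way to keep randomly instantiated tasks comparable across runs.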

If a run fails part-way through, you can resume it by re-running the script with the --checkpoint_dir flag pointing to the output directory from the original run.
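
The general idea behind resuming from a checkpoint directory can be sketched as follows. This is an assumption-laden illustration, not how AndroidWorld stores or reads its checkpoints: the one-JSON-file-per-task layout and the completed_tasks and run_with_resume helpers are invented for this example.

```python
import json
import tempfile
from pathlib import Path


def completed_tasks(checkpoint_dir):
    """Return names of tasks that already have a result file (hypothetical layout)."""
    return {path.stem for path in Path(checkpoint_dir).glob("*.json")}


def run_with_resume(task_names, checkpoint_dir, run_task):
    """Run only tasks without a saved result, saving each result as it finishes."""
    directory = Path(checkpoint_dir)
    directory.mkdir(parents=True, exist_ok=True)
    done = completed_tasks(directory)
    executed = []
    for name in task_names:
        if name in done:
            continue  # Finished in an earlier run, so skip it.
        result = run_task(name)
        (directory / f"{name}.json").write_text(json.dumps(result))
        executed.append(name)
    return executed


if __name__ == "__main__":
    tasks = ["ContactsAddContact", "ClockStopWatchRunning"]
    with tempfile.TemporaryDirectory() as tmp:
        # Pretend the first run stopped after one task.
        print(run_with_resume(tasks[:1], tmp, lambda name: {"success": True}))
        # The second call skips the task that already has a saved result.
        print(run_with_resume(tasks, tmp, lambda name: {"success": True}))
```

Writing each result as soon as its task finishes means a crash loses at most the task that was in flight.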

You can control verbosity with -v. The -2 verbosity level is equivalent to DEBUG.

Running MiniWoB++ tasks

To run the MiniWoB++ web-based tasks in AndroidWorld, simply set --suite_family=miniwob in the command above.

A key advantage of running MiniWoB++ tasks is that common input elements are rendered as native, commonly used Android UI widgets rather than as HTML. Thus agents must learn to use universal widgets such as time- and date-pickers.

Create your own agent

In addition to the agents we provide here, you can also easily create your own agent and run the benchmark with it as follows.

  1. Create an agent class that inherits from EnvironmentInteractingAgent and implement its step method. In the current workflow, the agent attempts a task in a for loop, and each iteration calls step, which is where your agent's logic goes. A typical step first gathers information, such as the current screenshot and the UI elements (buttons, icons, and so on), through the AndroidEnv instance held by the agent. It then selects one of the supported actions, executes it through AndroidEnv, and returns an AgentInteractionResult. Set the done property on AgentInteractionResult to True to indicate that the task is finished.

  2. Import your agent in run.py and add it to the _get_agent method, which takes your agent's name and returns an instance of it.

  3. Now you can run the benchmark with your new agent using the command above with the agent_name flag changed to your agent's name.
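
The shape of such an agent can be sketched as below. To stay self-contained, the block does not import AndroidWorld: EnvironmentInteractingAgent and AgentInteractionResult are simplified stand-ins that borrow only the names from the steps above, while FakeEnv, ClickMatchingAgent, and run_episode are hypothetical. The real base class, result type, and environment interface may differ, so check the repository before adapting this.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class AgentInteractionResult:  # Stand-in; the real class lives in AndroidWorld.
    done: bool
    data: dict[str, Any] = field(default_factory=dict)


class EnvironmentInteractingAgent:  # Stand-in base class for this sketch only.
    def __init__(self, env):
        self.env = env

    def step(self, goal: str) -> AgentInteractionResult:
        raise NotImplementedError


class FakeEnv:
    """Hypothetical environment: exposes UI element labels and accepts clicks."""

    def __init__(self, ui_elements):
        self.ui_elements = list(ui_elements)
        self.clicked = []

    def get_ui_elements(self):
        return list(self.ui_elements)

    def click(self, label):
        # Record the click and drop the element, as if the screen changed.
        self.clicked.append(label)
        self.ui_elements.remove(label)


class ClickMatchingAgent(EnvironmentInteractingAgent):
    """Clicks the first UI element whose label appears in the goal text."""

    def step(self, goal):
        # 1. Gather information about the current screen.
        elements = self.env.get_ui_elements()
        # 2. Select an action.
        target = next((e for e in elements if e.lower() in goal.lower()), None)
        if target is None:
            # Nothing left to act on, so report the task as finished.
            return AgentInteractionResult(done=True, data={"action": None})
        # 3. Execute the action and report that the episode continues.
        self.env.click(target)
        return AgentInteractionResult(done=False, data={"action": f"click {target}"})


def run_episode(agent, goal, max_steps=10):
    """The for loop from step 1: call step() until done or out of budget."""
    for step_index in range(max_steps):
        if agent.step(goal).done:
            return step_index + 1
    return max_steps


if __name__ == "__main__":
    env = FakeEnv(["Save", "Cancel"])
    steps = run_episode(ClickMatchingAgent(env), "Press Save to store the contact")
    print(f"finished after {steps} steps; clicked: {env.clicked}")
```

Keeping the gather, select, and execute phases separate inside step makes it easier to swap in a model-driven action selector later without touching the loop that drives the episode.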

This is not an officially supported Google product.