@iter-ai/toolbox-macos

toolbox-macos is a minimal package that enables OpenAI GPTs to interact with macOS apps like iMessage, email, or calendar through Shortcuts actions.

  • Simple Integration: Easy setup with a local server and GPT API schema.
  • Privacy-Focused: Runs locally to keep your data secure.
  • Versatile: Gives access to 128 APIs from Apple Shortcuts.

For a demo see: https://x.com/LinzhiQ/status/1729555314217734240?s=20

Running the macOS Toolbox

On a macOS machine with Node.js installed, run:

git clone https://github.com/iter-ai/toolbox-macos.git
cd toolbox-macos
npm install
npm run dev

Running npm run dev starts a Cloudflare Tunnel so that GPTs can connect to your machine.

Agent architecture

toolbox-macos is designed to support custom GPTs. Custom GPTs provide a flexible interface, but they come with constraints such as a single-agent design and a character limit on schema descriptions.

Our custom GPT is designed to perform the following five steps:

  1. listTools (/list): returns the list of available action names to the model.
  2. selectTools (/schema): returns the schema details for the actions named in the request.
  3. submitPlan (/plan): receives a plan from the model in plain text and always returns success. Its only purpose is to hide the plan from the user.
  4. submitCritique (/critique): similarly, receives a critique of the plan and always returns success. This dummy endpoint hides the critique from the user.
  5. runTool (/run): executes the action that the GPT decides to take, with the given parameters.
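The five operations above can be pictured as a small path-to-handler table. The following is a minimal sketch, not the project's actual server code: only the operation names and paths come from this README, while the `Json` and `Handler` types, the `routes` table, the `dispatch` function, the response shapes, and the placeholder action names are assumptions made for illustration.

```typescript
// Hypothetical sketch: the five operations as a path-to-handler table.
// Only the paths and operation names come from the README; the types,
// response shapes, and placeholder data are assumptions.

type Json = Record<string, unknown>;
type Handler = (body: Json) => Json;

const routes: Record<string, Handler> = {
  // 1. listTools: action names only (placeholder names).
  "/list": () => ({ tools: ["send_message", "create_event"] }),

  // 2. selectTools: schema details for the requested names (stubbed).
  "/schema": (body) => {
    const requested = Array.isArray(body.names) ? (body.names as string[]) : [];
    return { schemas: requested.map((tool) => ({ name: tool, parameters: {} })) };
  },

  // 3 and 4. submitPlan / submitCritique: accept text, always succeed.
  "/plan": () => ({ success: true }),
  "/critique": () => ({ success: true }),

  // 5. runTool: a real server would execute the action here;
  // this stub only echoes what it was asked to run.
  "/run": (body) => ({ ran: body.tool, parameters: body.parameters ?? {} }),
};

function dispatch(path: string, body: Json = {}): Json {
  const handler = routes[path];
  if (!handler) {
    return { error: `unknown endpoint: ${path}` };
  }
  return handler(body);
}

console.log(dispatch("/plan", { plan: "Look up the contact, then send the message." }));
console.log(dispatch("/run", { tool: "send_message", parameters: { text: "hi" } }));
```

Because /plan and /critique ignore their input, the model can write out its reasoning through an action call without that reasoning showing up as chat output.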

The hierarchical design of /list and /schema enables toolbox-macos to expose more than a hundred actions to a single GPT: the model first sees only the action names, then queries the schemas of the actions it decides to use. /plan and /critique hide the Chain of Thought and Self-Critique steps, so the user can focus on the conversation with the model.
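To make the two-level lookup concrete, here is a small sketch under stated assumptions. The `ToolSchema` shape, the `registry` entries, and the function bodies are hypothetical placeholders, not real Shortcuts actions or the project's implementation; only the names listTools and selectTools come from this README.

```typescript
// Hypothetical sketch of two-level tool discovery.
// The registry entries are placeholders, not real Shortcuts actions.

interface ToolSchema {
  name: string;
  description: string;
  parameters: Record<string, string>; // parameter name -> type (simplified)
}

const registry: ToolSchema[] = [
  {
    name: "send_message",
    description: "Send a text message to a contact.",
    parameters: { recipient: "string", text: "string" },
  },
  {
    name: "create_event",
    description: "Create a calendar event.",
    parameters: { title: "string", start: "string", end: "string" },
  },
  {
    name: "get_battery_level",
    description: "Read the current battery level.",
    parameters: {},
  },
];

// Level 1 (/list): names only, so the response stays small
// even when the registry holds many actions.
function listTools(): string[] {
  return registry.map((tool) => tool.name);
}

// Level 2 (/schema): full details, but only for the requested names.
function selectTools(names: string[]): ToolSchema[] {
  const wanted = new Set(names);
  return registry.filter((tool) => wanted.has(tool.name));
}

console.log(listTools());
console.log(JSON.stringify(selectTools(["create_event"]), null, 2));
```

The model pays the cost of a full schema only for the actions it actually intends to call, which is what keeps a large catalog manageable within a single GPT.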

You can check the system prompt (in cli/src/index.tsx) for more details on how we instruct the agent to use these endpoints. There are several considerations when designing the agent architecture:

  • Providing user information, including time zones and names
  • Explaining specific quirks about Apple Shortcuts, such as timezone formats and how to find certain identifiers
  • Instructing the model to follow the above five steps
  • Instructing the model on some interaction patterns, such as when to ask for clarification and confirmation
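As an illustration of how the four considerations above might be combined, the sketch below assembles a prompt from separate sections. It is not the prompt in cli/src/index.tsx: the `UserInfo` type, the `buildSystemPrompt` function, the example user, and all of the wording are assumptions, and the Shortcuts notes are left as a caller-supplied placeholder.

```typescript
// Hypothetical sketch: assembling a system prompt from the four
// considerations. None of this wording is taken from cli/src/index.tsx.

interface UserInfo {
  name: string;
  timeZone: string; // for example an IANA name such as "America/Los_Angeles"
}

// Operation names as listed in this README, in the intended order.
const STEPS = ["listTools", "selectTools", "submitPlan", "submitCritique", "runTool"];

function buildSystemPrompt(user: UserInfo, shortcutsNotes: string[]): string {
  const sections = [
    // 1. User information.
    `The user's name is ${user.name}. Their time zone is ${user.timeZone}.`,
    // 2. Shortcuts quirks, supplied by the caller.
    ["Notes about Apple Shortcuts:", ...shortcutsNotes.map((note) => `- ${note}`)].join("\n"),
    // 3. The five steps.
    `For every request, call these operations in order: ${STEPS.join(", ")}.`,
    // 4. Interaction patterns (placeholder wording).
    "Ask for clarification when a request is ambiguous, and ask for confirmation before running an action.",
  ];
  return sections.join("\n\n");
}

const examplePrompt = buildSystemPrompt(
  { name: "Alex", timeZone: "America/Los_Angeles" },
  ["(placeholder) describe the expected time zone format here"],
);
console.log(examplePrompt);
```

Keeping each consideration in its own section makes it easier to adjust one of them, for example the Shortcuts notes, without touching the rest of the prompt.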

Apple Shortcuts

See integration/shortcuts/README

Contributors

tiiiger, linzhiq, yanndubs
