
codex-cli's Introduction

Codex CLI - Natural Language Command Line Interface

This project uses GPT-3 Codex to convert natural language commands into commands in PowerShell, Z shell and Bash.

[Codex CLI demo GIF]

The Command Line Interface (CLI) was the first major User Interface we used to interact with machines. It's incredibly powerful: you can do almost anything with a CLI, but it requires the user to express their intent extremely precisely. The user needs to know the language of the computer.

With the advent of Large Language Models (LLMs), particularly those that have been trained on code, it's possible to interact with a CLI using Natural Language (NL). In effect, these models understand natural language and code well enough that they can translate from one to another.

This project aims to offer a cross-shell NL->Code experience to allow users to interact with their favorite CLI using NL. The user enters a command, like "what's my IP address", hits Ctrl + G and gets a suggestion for a command idiomatic to the shell they're using. The project uses the GPT-3 Codex model off-the-shelf, meaning the model has not been explicitly trained for the task. Instead we rely on a discipline called prompt engineering (see section below) to coax the right commands from Codex.

Note: The model can still make mistakes! Don't run a command if you don't understand it. If you're not sure what a command does, hit Ctrl + C to cancel it.

This project took technical inspiration from the zsh_codex project, extending its functionality to span multiple shells and to customize the prompts passed to the model (see prompt engineering section below).

Statement of Purpose

This repository aims to grow the understanding of using Codex in applications by providing an example of implementation and references to support the Microsoft Build conference in 2022. It is not intended to be a released product. Therefore, this repository is not for discussing OpenAI API or requesting new features.

Requirements

Installation

Please follow the installation instructions for PowerShell, bash or zsh from here.

Usage

Once configured for your shell of preference, you can use the Codex CLI by writing a comment (starting with #) into your shell, and then hitting Ctrl + G.

The Codex CLI supports two primary modes: single-turn and multi-turn.

By default, multi-turn mode is off. It can be toggled on and off using the # start multi-turn and # stop multi-turn commands.

If the multi-turn mode is on, the Codex CLI will "remember" past interactions with the model, allowing you to refer back to previous actions and entities. If, for example, you asked the Codex CLI to change your time zone to mountain, and then said "change it back to pacific", the model would have the context from the previous interaction to know that "it" is the user's timezone:

# change my timezone to mountain
tzutil /s "Mountain Standard Time"

# change it back to pacific
tzutil /s "Pacific Standard Time"

The tool creates a current_context.txt file that keeps track of past interactions, and passes them to the model on each subsequent command.
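
To make this concrete, here is a rough sketch in Python of how an interaction might be appended to that file. This is illustrative only, not the project's actual code; the comment/command layout mirrors the context examples shown later in this README.

# Illustrative sketch, not the project's actual implementation.
# Interactions are stored as comment/command pairs, matching the
# context format shown in the Prompt Engineering section below.
def append_interaction(context_path, command, response):
    with open(context_path, "a") as f:
        f.write(f"\n# {command}\n{response}\n")

append_interaction("current_context.txt",
                   "change my timezone to mountain",
                   'tzutil /s "Mountain Standard Time"')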

When multi-turn mode is off, this tool will not keep track of interaction history. There are tradeoffs to using multi-turn mode - though it enables compelling context resolution, it also increases overhead. If, for example, the model produces the wrong script for the job, the user will want to remove that from the context, otherwise future conversation turns will be more likely to produce the wrong script again. With multi-turn mode off, the model will behave completely deterministically - the same command will always produce the same output.

Any time the model seems to output consistently incorrect commands, you can use the # stop multi-turn command to stop the model from remembering past interactions and load in your default context. Alternatively, the # default context command does the same while keeping multi-turn mode on.

Commands

Command                            Description
start multi-turn                   Starts a multi-turn experience
stop multi-turn                    Stops a multi-turn experience and loads the default context
load context <filename>            Loads a context file from the contexts folder
default context                    Loads the default shell context
view context                       Opens the context file in a text editor
save context <filename>            Saves the context file to the contexts folder; if no name is specified, uses the current date-time
show config                        Shows the current configuration of your interaction with the model
set <config-key> <config-value>    Sets the configuration of your interaction with the model

Feel free to improve your experience by changing the token limit, engine id and temperature using the set command. For example, # set engine cushman-codex, # set temperature 0.5, # set max_tokens 50.

Prompt Engineering and Context Files

This project uses a discipline called prompt engineering to coax GPT-3 Codex to generate commands from natural language. Specifically, we pass the model a series of examples of NL->Commands, to give it a sense of the kind of code it should be writing, and also to nudge it towards generating commands idiomatic to the shell you're using. These examples live in the contexts directory. See snippet from the PowerShell context below:

# what's the weather in New York?
(Invoke-WebRequest -uri "wttr.in/NewYork").Content

# make a git ignore with node modules and src in it
"node_modules
src" | Out-File .gitignore

# open it in notepad
notepad .gitignore

Note that this project models natural language commands as comments and provides examples of the kind of PowerShell scripts we expect the model to write. These examples include single-line completions, multi-line completions, and multi-turn completions (the "open it in notepad" example refers to the .gitignore file generated on the previous turn).

When a user enters a new command (say "what's my IP address"), we simply append that command onto the context (as a comment) and ask Codex to generate the code that should follow it. Having seen the examples above, Codex will know that it should write a short PowerShell script that satisfies the comment.
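
A minimal sketch of that flow, assuming the legacy openai Python bindings (the engine id comes from the set engine example above; everything else is illustrative, not the project's actual code):

import openai  # legacy openai<1.0 bindings

openai.api_key = "YOUR_API_KEY"

def suggest_command(context_path, user_command):
    # Append the new natural-language command to the examples, as a comment...
    with open(context_path) as f:
        prompt = f.read() + f"\n# {user_command}\n"
    # ...then ask Codex for the code that should follow it.
    response = openai.Completion.create(
        engine="cushman-codex",
        prompt=prompt,
        temperature=0,
        max_tokens=100,
        stop="#",  # stop before the model starts a new comment/turn
    )
    return response.choices[0].text.strip()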

Building your own Contexts

This project comes pre-loaded with contexts for each shell, along with some bonus contexts with other capabilities. Beyond these, you can build your own contexts to coax other behaviors out of the model. For example, if you want the Codex CLI to produce Kubernetes scripts, you can create a new context with examples of commands and the kubectl script the model might produce:

# make a K8s cluster IP called my-cs running on 5678:8080
kubectl create service clusterip my-cs --tcp=5678:8080
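
You might add several such pairs; for instance, another hypothetical example in the same format (kubectl scale is a real command; the deployment name is made up):

# scale a deployment called my-dep to 3 replicas
kubectl scale deployment my-dep --replicas=3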

Add your context to the contexts folder and run load context <filename> to load it. You can also change the default context to your context file inside src\prompt_file.py.

Note that Codex will often produce correct scripts without any examples. Having been trained on a large corpus of code, it frequently knows how to produce specific commands. That said, building your own contexts helps coax the specific kind of script you're looking for - whether it's long or short, whether it declares variables or not, whether it refers back to previous commands, etc. You can also provide examples of your own CLI commands and scripts, to show Codex other tools it should consider using.

One important consideration: if you add a new context, keep multi-turn mode on to avoid the automatic defaulting that was added to keep faulty contexts from breaking your experience.

As an example, we have added a cognitive services context that uses the Cognitive Services API to provide text-to-speech responses.

Troubleshooting

Use DEBUG_MODE to read from terminal input instead of stdin and debug the code. This is useful when adding new commands and trying to understand why the tool is unresponsive.

Sometimes the openai package throws errors that aren't caught by the tool; you can add an exception handler at the end of codex_query.py for that exception and print a custom error message.
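
For example, a minimal sketch (the exception type below exists in the legacy openai<1.0 package; adjust for your version):

import sys
import openai

try:
    ...  # the existing query logic in codex_query.py
except openai.error.OpenAIError as e:
    # Surface a custom message instead of an unhandled traceback.
    print(f"# OpenAI API error: {e}", file=sys.stderr)
    sys.exit(1)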

FAQ

What OpenAI engines are available to me?

You might have access to different OpenAI engines depending on your OpenAI organization. To check which engines are available to you, query the List engines API. See the following commands:

  • Shell

    curl https://api.openai.com/v1/engines \
      -H 'Authorization: Bearer YOUR_API_KEY' \
      -H 'OpenAI-Organization: YOUR_ORG_ID'
    
  • PowerShell

PowerShell v5 (the default version that ships with Windows)

    (Invoke-WebRequest -Uri https://api.openai.com/v1/engines -Headers @{"Authorization" = "Bearer YOUR_API_KEY"; "OpenAI-Organization" = "YOUR_ORG_ID"}).Content

    PowerShell v7

    (Invoke-WebRequest -Uri https://api.openai.com/v1/engines -Authentication Bearer -Token (ConvertTo-SecureString "YOUR_API_KEY" -AsPlainText -Force) -Headers @{"OpenAI-Organization" = "YOUR_ORG_ID"}).Content

Can I run the sample on Azure?

The sample code can currently be used with Codex on OpenAI’s API. In the coming months, the sample will be updated so you can also use it with the Azure OpenAI Service.

codex-cli's People

Contributors

abhimasand, codehruv, dluc, jennifermarsman, keijik, mgreenegit, microsoftopensource, ryanvolum, tawalke, vivihung


codex-cli's Issues

Bash - Cleanup script

Please modify the Bash cleanup script (currently empty) to support the following actions. When the user runs the cleanup script, it will:

  • Remove CLI setting code from .bashrc
  • Delete src/openaiapirc file
  • Print successful message to guide user to close the current shell

    NL-CLI Bash clean up completed. Please close this Bash session.

Setup doesn't work in PowerShell on MacOS

Setup doesn't work in PS on MacOS.

Repro steps

  1. Run ./scripts/powershell_setup.ps1 from NL-CLI folder in PowerShell terminal.

Results
Setup runs without errors.

Expected results
The script should prompt for required parameters.

zsh - No such widget 'create_completion'

Hit an error when pressing ^X in zsh. What have I done wrong?

NL-CLI path: ~/Code/NL-CLI

config in .zshrc

export ZSH_CUSTOM="Code"
source "$ZSH_CUSTOM/NL-CLI/nl_cli.plugin.zsh"
bindkey '^X' create_completion

Performance improvements

Use token_count metadata to avoid re-counting tokens every time.

Pick through the code for needless file operations.

Investigate Neural Speech Synthesis

Using similar approaches to powershell-voice.txt, use Cognitive Service Neural Speech to enable a high-quality conversational interface with the model.

Separate prompt and config

Using a yaml config (or similar) would be cleaner than the combined prompt and config. A config.yaml.example could then be provided, reducing the redundancy.
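
A sketch of what such a file and its loader might look like (file name and keys are hypothetical):

import yaml  # PyYAML: pip install pyyaml

# Hypothetical config.yaml, kept separate from the prompt/context file:
#   engine: cushman-codex
#   temperature: 0.5
#   max_tokens: 50
with open("config.yaml") as f:
    config = yaml.safe_load(f)

print(config["engine"], config["temperature"], config["max_tokens"])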

Detect and handle Offensive Prompts/Completions

Currently, it's possible to coax offensive content from the model. Though I've never seen the model proactively produce offensive language, prompts like "Make an array of offensive terms" produce unsavory outcomes. We should use the content filter API that's part of the OpenAI service to detect offensive prompts and completions (calling it with the full interaction) and handle them when found - in this case, a message like the one in the OAI playground should suffice.

The message should probably be a comment, and should include instructions to cancel the command (i.e. "Press Ctrl + C to cancel...").

When offensive prompts/completions are detected, we should not append them to the context.
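
A rough sketch of such a check, following the prompt format OpenAI documented for its legacy content filter (the engine name and label convention below come from that documentation; verify against the current API before relying on it):

import openai

def is_unsafe(text):
    # Legacy content filter pattern: the model returns "0" (safe),
    # "1" (sensitive) or "2" (unsafe). Simplified: the documented flow
    # also inspects logprobs before trusting a "2" label.
    response = openai.Completion.create(
        engine="content-filter-alpha",
        prompt=f"<|endoftext|>{text}\n--\nLabel:",
        temperature=0,
        max_tokens=1,
        top_p=0,
        logprobs=10,
    )
    return response.choices[0].text == "2"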

Readme Suggestions

I demoed the powershell instructions and I found it really straightforward and easy to use. Cool!

In the readme:

  • I suggest linking Powershell, ZShell, and Bash to their corresponding sections lower in the readme.
  • I would consider placing Bash as the first shell documented as that is the most commonly used shell.
  • I suggest adding a very early comment associating ZShell with MacOS. This comes from a long-time Mac user who had no idea that the shell is called ZShell. Maybe that's just on me.
  • It's not obvious to me how to turn multi-turn on or off. Do I run the command in the shell after I've set everything up? Do I have to run the commands before setup? Do I use the # format? Perhaps put the usage section a little higher and with more example usage.

Add CMD support

We should consider adding support for the command line, primarily for non-devs who occasionally use the command line

Add "Statement of Purpose" section in ReadMe

Please add the following statement in ReadMe to set the right expectation to users.

Statement of Purpose
This repository aims to grow the understanding of using codex in applications by providing an example of implementation and references to support the Microsoft Build conference in 2022. It is not intended to be a released product. Therefore, this repository is not for discussing OpenAI API or requesting new features.

Validate command values

Right now we are relying on users providing correct inputs for config settings and context commands. This is obviously going to cause issues down the line. Need to fix.

Provide a `disable context` command for performance improvement

If the script is taking too long to provide responses, it might be good for the user to choose to not have context management on.

Without context management, the file I/O latency would disappear and there would be a direct API call to Codex which will improve performance.

Powershell - multiple keybindings not working

Added a new keybinding for calling Codex without using the context. For some reason, even though there are two different keybindings (with and without context), only one of the two gets called every time.

Notes on WSL2 Bash Experience

Environment

  • Windows Machine running on WSL2
  • Set up Anaconda environment with conda create -n testing python=3.10
  • Working with Bash Terminal on WSL2

Setup Notes/Comments

  • Step 1 - Git Clone: Had to create an access token to use in git clone step, and give it permissions/authorized access to the Microsoft organization
  • Step 2 - Bash RC Update: Add a note to update your specific path to the repo. I was able to figure out how to do this, but I did a blind copy-paste into my .bashrc.
  • Step 3 - OpenAI Config: Might be good to link where we'd get this information (I know it's on the OpenAI keys website, but not sure if all will know); would especially be good to supply a sample engine

Usage Notes/Comments

  • After restarting the bash terminal with the activated conda environment, my instinct was to try to run a command, as suggested by the first line in the Usage section. I tried to run one of the example commands by typing # change my timezone to mountain and hitting CTRL+X, and got the error -bash: bash_execute_unix_command: cannot find keymap for command. Not sure if this is not 100% working yet, but just wanted to give a heads up on this. Happy to iterate further but will stop for now and try powershell as well.

Bash installation

  • ReadMe: Clear. The gif demo is very illustrative.
  • Setup:
    • Normally use fish, so going to try out the bash option
    • Bash nl_cli.plugin.sh uses $NL_CLI_PATH/codex_query.py for what should be $NL_CLI_PATH/src/codex_query.py and silently fails
    • Had to chmod +x codex_query.py
    • After doing both of above I can directly run codex_query.py but still cannot get autocompletions with ^x

Save inputs/outputs together

I've had a few occurrences where I type a command (e.g. "# Move up one directory") and then quickly cancel it, not getting a response from the model. The context still saves just the command without the result, which ends up messing things up in the future.

We should aim to come up with a way to save full interactions (command + response) together to prevent this from happening. Ideally, we should only save interactions if a user chooses to run the suggested code - practically, I'm not sure if this would be possible.

Notes after following Powershell instructions

Put requirements at the top of Installation

  • Include need for OpenAI key
  • Include need for OpenAI Organization ID (took me a while to find)
  • Include need for documenting engine choice

Powershell #3
- Need to go to C:\your\custom\path\NL-CLI, not what is listed
- Alert user about required values - the current redirect sounds like it's only needed for expert users, but it's required for everyone

Powershell #4 - Didn't work, no idea how to debug

PowerShell - feedback for README

Here is my feedback for README.md:

  1. Step 2: Add instructions to the notepad $profile step - notepad will prompt you to create a profile if it doesn't find one.
  2. Step 3: Add hyperlink to powershell_plugin.ps1
  3. Step 4: Replace python script with codex_query.py
  4. Step 5: I don't think the execution policy should be set to the current user. It's not secure in general. Can we use Process instead of CurrentUser? If we do need CurrentUser, then please consider adding instructions to remove the Policy afterward - refer to https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_execution_policies?view=powershell-7.2#remove-the-execution-policy
  5. We should list having Python registered in PATH as a prerequisite for PowerShell users.

Add multi-line response handling

Wrapping responses in ` could work as a way to tag multi-line responses while also keeping them executable. This could allow one to increase the token response length.

Also if someone calls the NL-CLI on an incomplete output, we have to add validation for that case, so that we don't write two copies of the same input-output to the context file. The responses get a little buggy after that.

Add a dev and prod mode

Testing within a new environment requires switching from stdin to input() until we figure out a way to plug the python script into the shell's source files.

Rename and Document Multi-turn support

The naming for Context Mode is a bit off - we should consider a different name (e.g. Multiturn) or something to that effect. We should also document how best to toggle between the modes.

I've also run into a few bugs when tweaking the modes and am not sure where I should be doing it - in the codex_query.py file, in the completion.txt files, or from the command-line commands. Specifically, I find that when I'm in multi-turn mode and interacting for a while, then switch to single-turn mode, it doesn't actually change. Similarly, I've also found that loading contexts doesn't always work from other directories.

Add bash support

Low-hanging fruit: the zsh plugin code can be ported over to bash easily.

Use pre-existing configuration instead of restamping every time

Allow users to reconfigure the model settings from the context file directly. Right now, we are restamping the defaults every time. This is obviously a bug. We want to allow people to share context files across different computers with the same shell but using different model settings.

Add powershell support

Figure out what the "source" file in powershell looks like and add a plugin straight there for mapping Ctrl+X to the python script

Move configuration file to main repo, and pull from there

Currently each shell requires an openaiapirc file in a different location. Instead of asking people to create a config file and put it in different locations, we should have a single config file in the repo that we reference.

We should consider using a .env file, as with the other Build examples, if that's idiomatic in the python community. Otherwise, let's use whatever idioms python tends to use for api keys.
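
For instance, a .env file loaded with python-dotenv might replace the per-shell openaiapirc files (a sketch; the variable names are illustrative):

# Hypothetical .env in the repo root (kept out of source control):
#   OPENAI_API_KEY=...
#   OPENAI_ORGANIZATION_ID=...
#   ENGINE_ID=cushman-codex
import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env from the current working directory
api_key = os.getenv("OPENAI_API_KEY")
org_id = os.getenv("OPENAI_ORGANIZATION_ID")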
