
wolverine's Introduction

DEPRECATED: Try Mentat instead! https://github.com/AbanteAI/mentat

Wolverine

About

Give your Python scripts regenerative healing abilities!

Run your scripts with Wolverine, and when they crash, GPT-4 edits them and explains what went wrong. Even if you have many bugs, it will repeatedly rerun the script until everything is fixed.
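
Conceptually, the healing loop looks like the sketch below. This is a minimal illustration of the idea rather than Wolverine's actual implementation; ask_gpt_for_fix and apply_changes are hypothetical helpers standing in for the GPT-4 call and the file edit:

import subprocess
import sys

def heal(script_path, args):
    # Rerun the script until it exits cleanly; on each crash, send the
    # traceback to the model and apply its suggested edits.
    while True:
        result = subprocess.run(
            [sys.executable, script_path, *args],
            capture_output=True, text=True,
        )
        if result.returncode == 0:
            return result.stdout  # fixed: stop healing
        suggestion = ask_gpt_for_fix(script_path, result.stderr)  # hypothetical helper
        apply_changes(script_path, suggestion)  # hypothetical helper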

For a quick demonstration, see my demo video on Twitter.

Setup

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.sample .env

Add your OpenAI API key to .env

Warning! By default, Wolverine uses GPT-4 and may make many repeated calls to the API.

Example Usage

To run with gpt-4 (the default, tested option):

python -m wolverine examples/buggy_script.py "subtract" 20 3

You can also run with other models, but be warned they may not adhere to the edit format as well:

python -m wolverine --model=gpt-3.5-turbo examples/buggy_script.py "subtract" 20 3

If you want to use GPT-3.5 by default instead of GPT-4, uncomment the default model line in .env:

DEFAULT_MODEL=gpt-3.5-turbo

You can also use the --confirm=True flag, which asks you yes or no before making changes to the file. If the flag is not used, the changes are applied to the file automatically:

python -m wolverine examples/buggy_script.py "subtract" 20 3 --confirm=True

Environment variables

env name              description                                                              default value
OPENAI_API_KEY        OpenAI API key                                                           None
DEFAULT_MODEL         GPT model to use                                                         "gpt-4"
VALIDATE_JSON_RETRY   Number of retries when requesting the OpenAI API (-1 means unlimited)    -1
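
For reference, a minimal sketch of how these values might be loaded, assuming the python-dotenv package (the cp .env.sample .env step suggests this pattern; Wolverine's actual loading code may differ):

import os

import openai
from dotenv import load_dotenv

load_dotenv()  # read the variables from .env into the environment
openai.api_key = os.getenv("OPENAI_API_KEY")
default_model = os.getenv("DEFAULT_MODEL", "gpt-4")
validate_json_retry = int(os.getenv("VALIDATE_JSON_RETRY", "-1"))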

Future Plans

This is just a quick prototype I threw together in a few hours. There are many possible extensions and contributions are welcome:

  • add flags to customize usage, such as asking for user confirmation before running changed code
  • further iterations on the edit format that GPT responds in. Currently it struggles a bit with indentation, but I'm sure that can be improved
  • a suite of example buggy files that we can test prompts on to ensure reliability and measure improvement
  • multiple files / codebases: send GPT everything that appears in the stacktrace
  • graceful handling of large files - should we just send GPT relevant classes / functions?
  • extension to languages other than python


wolverine's People

Contributors

alessandroannini, biobootloader, chriscarrollsmith, eltociear, epylar, fsboehme, hemangjoshi37a, juleshenry, ksfi, nervousapps, prayagnshah, twsomt, zillibub


wolverine's Issues

reorder venv commands in README

Hello,

I noticed an issue in the README file. The current instructions are:

python3 -m venv venv
pip install -r requirements.txt
source venv/bin/activate

I believe the correct order should be:

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

The term 'source' is not recognized

source : The term 'source' is not recognized as the name of a cmdlet, function,
script file, or operable program. Check the spelling of the name, or if a path
was included, verify that the path is correct and try again.
At line:1 char:1
+ source venv/bin/activate
+ ~~~~~~
    + CategoryInfo          : ObjectNotFound: (source:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException
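
This error comes from running the activation command in Windows PowerShell, where source does not exist. The standard venv equivalent on Windows is to run the activation script directly:

venv\Scripts\Activate.ps1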

just an idea

Hi.

I've been writing a package for probing ChatGPT using self-referential patterns, and figured I would try running Wolverine with your dict parser swapped out, or combined with what I have here.

Happy to send you PRs or collaborate, but I don't want to create bloat or deps you don't want.

-Peter

no package

I dunno if I'm being stupid, but this is my issue:

Traceback (most recent call last):
  File "", line 189, in _run_module_as_main
  File "", line 112, in _get_module_details
  File "/root/wolverine/wolverine/wolverine.py", line 8, in <module>
    import openai
ModuleNotFoundError: No module named 'openai'
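
This usually means the dependencies were never installed into the active environment. Activating the venv and running pip install -r requirements.txt, as in the Setup section, should resolve it.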

OPEN FEATURES

Some ideas for Future Features

  1. Configuration file: Add support for a configuration file where users can specify their API key, default model, and other settings, making it easier to manage and customize the script.

  2. Multiple models: Allow users to choose from different GPT models or use multiple models sequentially, which could potentially improve the quality of suggestions.

  3. Rate limiting and retries: Implement rate limiting and automatic retries for API requests, which can help avoid exceeding API limits and handle occasional API errors more gracefully (see the sketch after this list).

  4. Code formatting: Integrate with code formatters like Black or autopep8 to automatically format the fixed code according to Python style guidelines.

  5. Version control integration: Add support for automatically creating a new branch or commit in the version control system (e.g., Git) when changes are applied, making it easier to track and manage changes made by the script.

  6. Test execution: If the project includes unit tests, run them after applying changes to verify that the fixes haven't broken any existing functionality.

  7. Incremental improvements: Instead of applying all suggested changes at once, apply one change at a time and rerun the script to see if the issue has been resolved. This approach can help identify which suggestions are most effective and minimize unnecessary changes.

  8. Custom prompt: Allow users to provide a custom prompt for the GPT model, giving more control over the type of suggestions generated.

  9. Interactive mode: Implement an interactive mode where users can review and approve or reject each suggestion before applying it. This can help ensure that only the desired changes are made to the script.
    #23

  10. Performance metrics: Collect and display performance metrics, such as the number of iterations, time taken for each iteration, and total time taken to fix the script, helping users understand the efficiency of the script.

  11. Logging: Add proper logging to keep track of the actions taken by the script, which can be useful for debugging and monitoring purposes.
    #25

  12. User-friendly error messages: Improve error messages to be more descriptive and user-friendly, making it easier for users to understand and resolve issues.
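
As a rough illustration of item 3, a retry wrapper with exponential backoff might look like the sketch below. This is a minimal sketch against the openai 0.27-era API pinned in requirements.txt, not code from Wolverine; completion_with_backoff is a hypothetical name:

import time

import openai

def completion_with_backoff(max_retries=5, **kwargs):
    # Retry the chat completion, doubling the wait after every
    # rate-limit or transient API error.
    delay = 1.0
    for attempt in range(max_retries):
        try:
            return openai.ChatCompletion.create(**kwargs)
        except (openai.error.RateLimitError, openai.error.APIError):
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            time.sleep(delay)
            delay *= 2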

Please add 3.5

I think 3.5 would be worth adding: even though it may not get things right as often as GPT-4, it's still way cheaper, and GPT-4 is still only available to a small number of people.

Planning for smart utilization of 3.5

GPT-3.5 is much cheaper ($0.002 vs. $0.06 per 1K tokens), usually returns faster, and is less throttled.
Given that, it makes sense to always at least attempt GPT-3.5 first.

Given we are going to try GPT-3.5 first, how do we determine when to fall back to GPT-4?

  1. When compiling the prompt for our completion: fall back if the prompt leaves fewer than n tokens for the completion, where n is the smallest budget we expect a complete answer to need. IMO 500 tokens is a reasonable amount to reserve, but that's a variable that could use empirical measurement.
  2. When receiving the completion: prompt for the answer to be wrapped in delimiters so we can detect when GPT-3.5 ran out of tokens before finishing its attempted answer.
  3. When checking whether the completion actually fixed the current error. It may be worth retrying here while slowly ratcheting up the temperature, or feeding the new error back in. We also need a way to check whether GPT is just introducing new errors that occur before the original error would.

Additionally, the code should include future-proofing for falling back to the 32K model using rules 1 and 2 (since it's not smarter, just bigger), obviously disabled by a flag. Similarly, allow disabling the 8K GPT-4 model via the same system. A sketch of rules 1 and 2 follows.
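
Here is a minimal sketch of rules 1 and 2, assuming the tiktoken package for token counting; the context limits, the 500-token reserve, and the <END> delimiter are illustrative values rather than settled choices:

import tiktoken

COMPLETION_RESERVE = 500  # smallest budget we expect a full answer to need
CONTEXT_LIMITS = {"gpt-3.5-turbo": 4096, "gpt-4": 8192}

def pick_model(prompt: str) -> str:
    # Rule 1: prefer GPT-3.5, but only when the prompt leaves enough
    # room in its context window for a complete answer.
    enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
    prompt_tokens = len(enc.encode(prompt))
    if CONTEXT_LIMITS["gpt-3.5-turbo"] - prompt_tokens >= COMPLETION_RESERVE:
        return "gpt-3.5-turbo"
    return "gpt-4"

def looks_complete(answer: str) -> bool:
    # Rule 2: the prompt asks for the answer to end with a delimiter,
    # so a missing delimiter means the completion was cut off.
    return answer.rstrip().endswith("<END>")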

Extensions for Github Actions

This project seems great when combined with a CI/CD process.

For example...

  1. A user uploads a project with test code
  2. A CI workflow, such as a GitHub Action, runs the tests on top of Wolverine
  3. Wolverine catches the bug and creates pull requests

As a result, developers can accelerate debugging in TDD.

help

class Empleado:
    def __init__(self, nombre, apellido, sueldo_base, afp, fecha_ingreso, hijos):
        self.nombre = nombre
        self.apellido = apellido
        self.sueldo_base = sueldo_base
        self.afp = afp
        self.fecha_ingreso = fecha_ingreso
        self.hijos = hijos

    def calcular_base_imponible(self):
        # Months worked since the year in the hire date (DD/MM/YYYY)
        meses_trabajados = (2021 - int(self.fecha_ingreso.split("/")[-1])) * 12
        bonificacion = self.sueldo_base * (meses_trabajados * 0.01)
        asignacion_familiar = self.sueldo_base * (self.hijos * 0.05)
        base_imponible = self.sueldo_base + bonificacion + asignacion_familiar
        return base_imponible

    def calcular_descuentos(self):
        base_imponible = self.calcular_base_imponible()

        essalud = base_imponible * 0.07

        if self.afp == "AFP(X)":
            afp = base_imponible * 0.12
        elif self.afp == "AFP(Y)":
            afp = base_imponible * 0.114
        else:
            afp = 0  # default assignment if neither condition matches

        return essalud, afp

    def calcular_pago_total(self):
        essalud, afp = self.calcular_descuentos()
        base_imponible = self.calcular_base_imponible()

        pago_total = base_imponible - essalud - afp

        return pago_total

# Ask for the employees' data

empleados = []
for i in range(2):
    print(f"Enter the data for employee {i+1}:")
    nombre = input("First name: ")
    apellido = input("Last name: ")
    sueldo_base = float(input("Base salary: "))
    afp = input("AFP (AFP(X) or AFP(Y)): ")
    fecha_ingreso = input("Hire date (DD/MM/YYYY): ")
    hijos = int(input("Number of children: "))

    # Bug in the original: "Empleados"(...) calls a string, which raises
    # TypeError; the class Empleado must be instantiated instead.
    empleado = Empleado(nombre, apellido, sueldo_base, afp, fecha_ingreso, hijos)
    empleados.append(empleado)

# Calculate and display the individual payments

for empleado in empleados:
    base_imponible = empleado.calcular_base_imponible()
    essalud, afp = empleado.calcular_descuentos()
    pago_total = empleado.calcular_pago_total()

    print(f"\nEmployee: {empleado.nombre} {empleado.apellido}")
    print(f"Taxable base: {base_imponible:.2f}")
    print(f"ESSALUD deduction: {essalud:.2f}")
    print(f"AFP deduction: {afp:.2f}")
    print(f"Total pay: {pago_total:.2f}")

# Calculate and display the average payment

total_pagos = sum(empleado.calcular_pago_total() for empleado in empleados)
promedio_pago = total_pagos / len(empleados)

print(f"\nAverage payment to employees: {promedio_pago:.2f}")

openai.error.InvalidRequestError: The model `gpt-4` does not exist

[$USER@$OS wolverine]$ python3 -m venv venv
[$USER@$OS wolverine]$ source venv/bin/activate
(venv) [$USER@$OS wolverine]$ pip install -r requirements.txt
Collecting aiohttp==3.8.4
Using cached aiohttp-3.8.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)
Collecting aiosignal==1.3.1
Using cached aiosignal-1.3.1-py3-none-any.whl (7.6 kB)
Collecting async-timeout==4.0.2
Using cached async_timeout-4.0.2-py3-none-any.whl (5.8 kB)
Collecting attrs==22.2.0
Using cached attrs-22.2.0-py3-none-any.whl (60 kB)
Collecting certifi==2022.12.7
Using cached certifi-2022.12.7-py3-none-any.whl (155 kB)
Collecting charset-normalizer==3.1.0
Using cached charset_normalizer-3.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (197 kB)
Collecting fire==0.5.0
Using cached fire-0.5.0.tar.gz (88 kB)
Preparing metadata (setup.py) ... done
Collecting flake8==6.0.0
Using cached flake8-6.0.0-py2.py3-none-any.whl (57 kB)
Collecting frozenlist==1.3.3
Using cached frozenlist-1.3.3-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (154 kB)
Collecting idna==3.4
Using cached idna-3.4-py3-none-any.whl (61 kB)
Collecting mccabe==0.7.0
Using cached mccabe-0.7.0-py2.py3-none-any.whl (7.3 kB)
Collecting multidict==6.0.4
Using cached multidict-6.0.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (117 kB)
Collecting openai==0.27.2
Using cached openai-0.27.2-py3-none-any.whl (70 kB)
Collecting pycodestyle==2.10.0
Using cached pycodestyle-2.10.0-py2.py3-none-any.whl (41 kB)
Collecting pyflakes==3.0.1
Using cached pyflakes-3.0.1-py2.py3-none-any.whl (62 kB)
Collecting requests==2.28.2
Using cached requests-2.28.2-py3-none-any.whl (62 kB)
Collecting six==1.16.0
Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting termcolor==2.2.0
Using cached termcolor-2.2.0-py3-none-any.whl (6.6 kB)
Collecting tqdm==4.65.0
Using cached tqdm-4.65.0-py3-none-any.whl (77 kB)
Collecting urllib3==1.26.15
Using cached urllib3-1.26.15-py2.py3-none-any.whl (140 kB)
Collecting yarl==1.8.2
Using cached yarl-1.8.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (278 kB)
Using legacy 'setup.py install' for fire, since package 'wheel' is not installed.
Installing collected packages: urllib3, tqdm, termcolor, six, pyflakes, pycodestyle, multidict, mccabe, idna, frozenlist, charset-normalizer, certifi, attrs, async-timeout, yarl, requests, flake8, fire, aiosignal, aiohttp, openai
Running setup.py install for fire ... done
Successfully installed aiohttp-3.8.4 aiosignal-1.3.1 async-timeout-4.0.2 attrs-22.2.0 certifi-2022.12.7 charset-normalizer-3.1.0 fire-0.5.0 flake8-6.0.0 frozenlist-1.3.3 idna-3.4 mccabe-0.7.0 multidict-6.0.4 openai-0.27.2 pycodestyle-2.10.0 pyflakes-3.0.1 requests-2.28.2 six-1.16.0 termcolor-2.2.0 tqdm-4.65.0 urllib3-1.26.15 yarl-1.8.2

[notice] A new release of pip available: 22.2.2 -> 23.0.1
[notice] To update, run: pip install --upgrade pip
(venv) [$USER@$OS wolverine]$ python wolverine.py buggy_script.py "subtract" 20 3
Script crashed. Trying to fix...
Output: Traceback (most recent call last):
  File "/home/$USER/rse/open_source/wolverine/buggy_script.py", line 30, in <module>
    fire.Fire(calculate)
  File "/home/$USER/rse/open_source/wolverine/venv/lib64/python3.11/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/$USER/rse/open_source/wolverine/venv/lib64/python3.11/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/home/$USER/rse/open_source/wolverine/venv/lib64/python3.11/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/$USER/rse/open_source/wolverine/buggy_script.py", line 18, in calculate
    result = subtract_numbers(num1, num2)
             ^^^^^^^^^^^^^^^^
NameError: name 'subtract_numbers' is not defined

Traceback (most recent call last):
  File "/home/$USER/rse/open_source/wolverine/wolverine.py", line 153, in <module>
    fire.Fire(main)
  File "/home/$USER/rse/open_source/wolverine/venv/lib64/python3.11/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/$USER/rse/open_source/wolverine/venv/lib64/python3.11/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/home/$USER/rse/open_source/wolverine/venv/lib64/python3.11/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/$USER/rse/open_source/wolverine/wolverine.py", line 142, in main
    json_response = send_error_to_gpt(
                    ^^^^^^^^^^^^^^^^^^
  File "/home/$USER/rse/open_source/wolverine/wolverine.py", line 55, in send_error_to_gpt
    response = openai.ChatCompletion.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/$USER/rse/open_source/wolverine/venv/lib64/python3.11/site-packages/openai/api_resources/chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/$USER/rse/open_source/wolverine/venv/lib64/python3.11/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
                           ^^^^^^^^^^^^^^^^^^
  File "/home/$USER/rse/open_source/wolverine/venv/lib64/python3.11/site-packages/openai/api_requestor.py", line 226, in request
    resp, got_stream = self._interpret_response(result, stream)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/$USER/rse/open_source/wolverine/venv/lib64/python3.11/site-packages/openai/api_requestor.py", line 619, in _interpret_response
    self._interpret_response_line(
  File "/home/$USER/rse/open_source/wolverine/venv/lib64/python3.11/site-packages/openai/api_requestor.py", line 682, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: The model `gpt-4` does not exist
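
This error generally means the API key does not have GPT-4 access (GPT-4 API access was waitlisted at the time). Running with --model=gpt-3.5-turbo, or setting DEFAULT_MODEL=gpt-3.5-turbo in .env, works in the meantime.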

3.5 - has anyone tested?

First of all, this is f-ing brilliant and I'm going to be using it frequently. I'm wondering if the agent is able to do its job properly when you drop down to 3.5-turbo, because I have an idea that involves automated deployment of Wolverine at scale, and GPT-4 is too expensive for that purpose. Will 3.5 work, or is it a waste of time?

Use subprocess.run with encoding="utf-8"

wolverine/wolverine.py, lines 36 to 40 at commit 2f5a026:

try:
    result = subprocess.check_output(subprocess_args, stderr=subprocess.STDOUT)
except subprocess.CalledProcessError as e:
    return e.output.decode("utf-8"), e.returncode
return result.decode("utf-8"), 0

If you pass encoding="utf-8" to subprocess.run, your strings are automatically decoded:

>>> s = subprocess.run("/bin/echo 'hello'".split(" "), stdout=subprocess.PIPE)
>>> s.stdout
b"'hello'\n"
>>> s = subprocess.run("/bin/echo 'hello'".split(" "), stdout=subprocess.PIPE, encoding="utf-8")
>>> s.stdout
"'hello'\n"

So:

    try:
        result = subprocess.check_output(subprocess_args, stderr=subprocess.STDOUT, encoding="utf-8")
    except subprocess.CalledProcessError as e:
        return e.output, e.returncode
    return result, 0
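
Note that subprocess.check_output passes its keyword arguments through to subprocess.run, so the same encoding="utf-8" keyword works in either form.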

Suggestion: add pysnooper.snoop decorator to add failing function's variables' types/values/updates to error message context

pysnooper.snoop() is a decorator that helps automate printf-style debugging. For example:

import pysnooper

@pysnooper.snoop()
def number_to_bits(number):
    if number:
        bits = []
        while number:
            number, remainder = divmod(number, 2)
            bits.insert(0, remainder)
        return bits
    else:
        return [0]

number_to_bits(6)


I think this will help focus/improve GPT-4's debugging ability - https://github.com/cool-RR/PySnooper

There's also torchsnooper for even better snoop insight into PyTorch.

Add chroma to wolverine

Adding a vector database in a future update would help GPT-4 and GPT-3.5 make changes across large amounts of code.
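
As a rough sketch of what that could look like with Chroma (assuming the chromadb package; the chunking and IDs here are purely illustrative):

import chromadb

client = chromadb.Client()
collection = client.create_collection("codebase")

# Index each function/class of the codebase as a retrievable chunk.
collection.add(
    documents=["def subtract_numbers(a, b): ...", "def calculate(op, a, b): ..."],
    ids=["buggy_script.py::subtract_numbers", "buggy_script.py::calculate"],
)

# At fix time, retrieve only the chunks relevant to the traceback
# instead of sending the whole codebase to the model.
results = collection.query(
    query_texts=["NameError: name 'subtract_numbers' is not defined"],
    n_results=2,
)
print(results["documents"])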
