gpt4-image-api's Introduction

GPT4 Image Recognition API

A small tool that uses Selenium to expose a temporary API endpoint for the ChatGPT image input (image recognition) feature.
It was put together very quickly, so you should not rely on it in production.
It should be considered deprecated as soon as official OpenAI endpoints become available.
It works with Google authentication. If you use a different login method, please modify the code for your use case.

Installation

  • Create a virtual environment (venv), then clone the repo
  • Install requirements:
    pip install -r requirements.txt
  • Define a .env file with your OpenAI Google credentials (or other credentials, but make sure to modify the code accordingly)
  • Run FastAPI server: python main.py
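As a rough illustration of the .env step above, the following sketch reads KEY=VALUE pairs from a .env file using only the standard library. The README does not say which variable names the project expects, so the names in the usage comment (`OPENAI_EMAIL`, `OPENAI_PASSWORD`) are purely hypothetical; check the repository code for the real ones. The helper `load_env_file` is also an illustrative name, not part of the project.

```python
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines from a .env file into a dict.

    Blank lines, comment lines starting with '#', and lines without '='
    are skipped. Surrounding single or double quotes are stripped.
    """
    values: dict[str, str] = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Hypothetical usage; the variable names below are assumptions:
# credentials = load_env_file(".env")
# email = credentials["OPENAI_EMAIL"]
# password = credentials["OPENAI_PASSWORD"]
```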

Endpoints

GET

https://0.0.0.0:8000/start
Start a new session. Complete the login steps manually and press Enter when prompted.
Wait for the OpenAI popup to appear before pressing Enter.

https://0.0.0.0:8000/stop
Stop the current session.

POST

https://0.0.0.0:8000/action/
Post an image URL with a prompt. Example:

Request:
{
    "image_url": "https://www.reuters.com/resizer/NLk9k89J1tfmH-B7XKd598-6j_Y=/960x0/filters:quality(80)/cloudfront-us-east-2.images.arcpublishing.com/reuters/AHF2FYISNJO55J6N35YJBZ2JYY.jpg",
    "prompt": "Describe this image precisely."
}

Response:
{
    "status": "Success",
    "result": {
        "answer": "A night view of the Eiffel Tower illuminated, with its reflection visible in calm water in the foreground. The sky is dark blue, and there are two streetlights on either side of the scene."
    }
}
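The request and response above can be exercised from Python with a small client. This is a minimal sketch using only the standard library. It assumes the server is reachable at `http://localhost:8000` (adjust the scheme, host, and port to match your setup) and that a session has already been started via /start. The function names `build_action_request`, `extract_answer`, and `describe_image` are illustrative and not part of the project.

```python
import json
import urllib.request

# Assumption: the FastAPI server runs locally on port 8000 over plain HTTP.
BASE_URL = "http://localhost:8000"


def build_action_request(
    image_url: str, prompt: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build the POST request for /action/ with the JSON body shown above."""
    payload = json.dumps({"image_url": image_url, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/action/",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_answer(body: str) -> str:
    """Return result.answer from a response body, or raise if status is not Success."""
    data = json.loads(body)
    if data.get("status") != "Success":
        raise RuntimeError(f"Request failed: {data}")
    return data["result"]["answer"]


def describe_image(
    image_url: str, prompt: str, base_url: str = BASE_URL, timeout: float = 120.0
) -> str:
    """Send the image URL and prompt to the server and return the answer text."""
    request = build_action_request(image_url, prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_answer(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder image URL; replace it with a real, publicly reachable image.
    print(describe_image("https://example.com/picture.jpg", "Describe this image precisely."))
```

The generous timeout is a guess: the server drives a real browser session, so responses may take a while.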

gpt4-image-api's People

Contributors

florianmgs

gpt4-image-api's Issues

Code Breaking

Which Chrome browser version are you using? My code keeps breaking at random for no apparent reason.

Understanding limitations on the GPT-4 end

Could you please share any information about limits on the GPT-4 end?

The official limit is 50 messages per 3 hours for one account. Is that accurate, or is it possible to process many more images?

Any hints from your experience would be great. Thank you in advance!

action API endpoint error

I sometimes get the following error from the action API endpoint.

{"detail":"Message: \nStacktrace:\n0 undetected_chromedriver 0x0000000107996e08 undetected_chromedriver + 5025288\n1 undetected_chromedriver 0x000000010798dc23 undetected_chromedriver + 4987939\n2 undetected_chromedriver 0x000000010752fe67 undetected_chromedriver + 409191\n3 undetected_chromedriver 0x000000010757f1b9 undetected_chromedriver + 733625\n4 undetected_chromedriver 0x000000010757f371 undetected_chromedriver + 734065\n5 undetected_chromedriver 0x00000001075c5194 undetected_chromedriver + 1020308\n6 undetected_chromedriver 0x00000001075a650d undetected_chromedriver + 894221\n7 undetected_chromedriver 0x00000001075c2571 undetected_chromedriver + 1009009\n8 undetected_chromedriver 0x00000001075a62b3 undetected_chromedriver + 893619\n9 undetected_chromedriver 0x0000000107570eb9 undetected_chromedriver + 675513\n10 undetected_chromedriver 0x00000001075720ee undetected_chromedriver + 680174\n11 undetected_chromedriver 0x0000000107958819 undetected_chromedriver + 4769817\n12 undetected_chromedriver 0x000000010795d893 undetected_chromedriver + 4790419\n13 undetected_chromedriver 0x000000010796466e undetected_chromedriver + 4818542\n14 undetected_chromedriver 0x000000010795e5bd undetected_chromedriver + 4793789\n15 undetected_chromedriver 0x000000010793098c undetected_chromedriver + 4606348\n16 undetected_chromedriver 0x000000010797cb78 undetected_chromedriver + 4918136\n17 undetected_chromedriver 0x000000010797cd30 undetected_chromedriver + 4918576\n18 undetected_chromedriver 0x000000010798d85e undetected_chromedriver + 4986974\n19 libsystem_pthread.dylib 0x00007fff208338fc _pthread_start + 224\n20 libsystem_pthread.dylib 0x00007fff2082f443 thread_start + 15\n"}

I am using Chrome Version 118.0.5993.117
