Gemini Playground

Gemini Playground provides a Python interface and a UI for interacting with the latest version of Google's Gemini, models/gemini-1.5-pro-latest, and other Gemini variants. With Gemini Playground, you can:

  • Engage in conversation with your data: Upload images and videos through a simple API and generate responses based on your prompts.
  • Chat with your codebase: Ask Gemini to analyze your code, explain its functionality, suggest improvements, or even write documentation for it.
  • Explore multimodal capabilities: Combine different data types in your prompts, like asking Gemini to describe what's happening in a video and an image simultaneously.

Features

  • Intuitive API: The GeminiClient class offers a simple and easy-to-use interface for interacting with the Gemini API.
  • Multimodal Support: Upload and use text, images, videos, and code in your prompts.
  • File Management: Upload, list, and remove files from your Gemini storage.
  • Token Counting: Estimate the number of tokens required for a prompt and response.
  • Response Generation: Generate responses from Gemini based on your prompts and uploaded content.
  • Rich Logging: Get informative and colorful logging messages for better understanding of the process.

You can find usage examples in the examples directory.

Installation

pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ geminiplayground

Usage

  1. Set up your API key:

    • Obtain an API key from Google AI Studio.
    • Set the AISTUDIO_API_KEY environment variable with your API key.
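Before creating a client, it can help to confirm that the key is actually visible to the Python process. The following is a minimal sketch and not part of geminiplayground: `require_api_key` is a hypothetical helper, and only the variable name `AISTUDIO_API_KEY` comes from the step above.

```python
import os


def require_api_key(env=None, name="AISTUDIO_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; not part of geminiplayground.
    """
    # Default to the real process environment; a plain dict can be passed in tests.
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating a GeminiClient.")
    return key
```

Calling `require_api_key()` at the top of a script makes a missing key fail early with a clear message instead of surfacing later as a failed request.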
  2. Create a GeminiClient instance:

from geminiplayground.core import GeminiClient
from geminiplayground.parts import VideoFile, ImageFile
from geminiplayground.schemas import HarmCategory, HarmBlockThreshold

gemini_client = GeminiClient()
  3. Upload files:
video_file_path = "BigBuckBunny_320x180.mp4"
video_file = VideoFile(video_file_path, gemini_client=gemini_client)
video_file.upload()

image_file_path = "https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png"
image_file = ImageFile(image_file_path, gemini_client=gemini_client)
image_file.upload()
  4. Create a prompt:
multimodal_prompt = [
    "See this video",
    video_file,
    "and this image",
    image_file,
    "Explain what you see."
]
  5. Generate a response:
response = gemini_client.generate_response("models/gemini-1.5-pro-latest", multimodal_prompt,
                                           generation_config={"temperature": 0.0, "top_p": 1.0},
                                           safety_settings={
                                               "category": HarmCategory.DANGEROUS_CONTENT,
                                               "threshold": HarmBlockThreshold.BLOCK_NONE
                                           })

# Print the response
for candidate in response.candidates:
    for part in candidate.content.parts:
        if part.text:
            print(part.text)
Sample output:

The video is a short animated film called "Big Buck Bunny." It is a comedy about a large, white rabbit
who is bullied by three smaller animals. The rabbit eventually gets revenge on his tormentors. The film
was created using Blender, a free and open-source 3D animation software.

The image is of four dice, each a different color. The dice are transparent and have white dots. The
image is isolated on a black background.
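The nested loop over candidates and parts recurs in every example, so it may be convenient to wrap it in a function that returns the text instead of printing it. This is a sketch under one assumption: the response exposes `candidates`, `content.parts`, and `part.text` exactly as in the loop above. `collect_text` is a hypothetical name, and `fake_response` is a stand-in object used only to demonstrate the shape.

```python
from types import SimpleNamespace


def collect_text(response):
    """Join every non-empty text part of a response into one string."""
    chunks = []
    for candidate in response.candidates:
        for part in candidate.content.parts:
            # Skip parts that carry no text (or have no text attribute at all).
            text = getattr(part, "text", None)
            if text:
                chunks.append(text)
    return "\n".join(chunks)


# Stand-in with the same attribute shape as the response used above; not a real API object.
fake_response = SimpleNamespace(
    candidates=[
        SimpleNamespace(content=SimpleNamespace(parts=[
            SimpleNamespace(text="first"),
            SimpleNamespace(text=None),
            SimpleNamespace(text="second"),
        ]))
    ]
)
print(collect_text(fake_response))
```

Returning a string rather than printing makes the result easier to log, save to a file, or pass to another step.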
  6. You can also chat with your data:

Chat with your codebase:

from rich import print

from geminiplayground.core import GeminiClient
from geminiplayground.parts.git_repo_part import GitRepo
from geminiplayground.schemas import GenerateRequestParts, TextPart, GenerateRequest


def chat_with_your_code():
    """
    Collect the content parts of a GitHub repo, count the request tokens,
    and generate a response.
    """

    repo = GitRepo.from_url("https://github.com/karpathy/ng-video-lecture",
                            branch="master",
                            config={
                                "content": "code-files",  # "code-files" or "issues"
                                "exclude_dirs": ["frontend", "ui"],
                                "file_extensions": [".py"]
                            })
    repo_parts = repo.content_parts()

    # Surround the repo content with the instructions for the model.
    request_parts = GenerateRequestParts(parts=[
        TextPart(text="Use this codebase:"),
        *repo_parts,
        TextPart(
            text="Help me write a README file for this codebase."),
    ])
    request = GenerateRequest(
        contents=[
            request_parts
        ]
    )
    model = "models/gemini-1.5-pro-latest"
    gemini_client = GeminiClient()
    tokens_count = gemini_client.get_tokens_count(model, request)
    print("Tokens count: ", tokens_count)
    response = gemini_client.generate_response(model, request)

    # Print the response
    for candidate in response.candidates:
        for part in candidate.content.parts:
            if part.text:
                print(part.text)


if __name__ == '__main__':
    chat_with_your_code()
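
The `exclude_dirs` and `file_extensions` options above decide which files reach the prompt, and therefore how many tokens the request costs. How GitRepo applies them internally is not shown here, so the following is only a local approximation under a stated assumption: a file is kept when its suffix is listed and none of its parent directories is excluded. `preview_repo_files` is a hypothetical helper for previewing a checkout on disk.

```python
from pathlib import Path


def preview_repo_files(root, exclude_dirs=(), file_extensions=()):
    """List files under root that a filter like the config above might keep.

    Assumption (not confirmed by the library): a file is kept when its suffix
    is in file_extensions and no directory in its relative path is excluded.
    """
    root = Path(root)
    excluded = set(exclude_dirs)
    selected = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(root)
        # relative.parts[:-1] holds the directory components of the file path.
        if excluded.intersection(relative.parts[:-1]):
            continue
        # An empty file_extensions means "no suffix filter".
        if file_extensions and path.suffix not in file_extensions:
            continue
        selected.append(relative.as_posix())
    return selected
```

Running it against a local clone with the same option values gives a rough idea of what the request will contain before any tokens are spent.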

Chat with your videos:

from rich import print

from geminiplayground.core import GeminiClient
from geminiplayground.parts import VideoFile
from geminiplayground.schemas import GenerateRequestParts, TextPart, GenerateRequest


def chat_with_your_video():
    """
    Collect the content parts of a video, count the request tokens,
    and generate a response.
    """
    gemini_client = GeminiClient()
    model = "models/gemini-1.5-pro-latest"

    video_file_path = "./../data/BigBuckBunny_320x180.mp4"
    video_file = VideoFile(video_file_path, gemini_client=gemini_client)
    video_parts = video_file.content_parts()
    video_files = video_file.files[-4:]
    # Inspect the first few parts to see what will be sent.
    for part in video_parts[:5]:
        print(part)

    request_parts = GenerateRequestParts(parts=[
        TextPart(text="Check this video:"),
        *video_parts,
        TextPart(text="List the objects you see in the video.")
    ])
    request = GenerateRequest(
        contents=[
            request_parts
        ]
    )
    tokens_count = gemini_client.get_tokens_count(model, request)
    print("Tokens count: ", tokens_count)
    response = gemini_client.generate_response(model, request)

    # Print the response
    for candidate in response.candidates:
        for part in candidate.content.parts:
            if part.text:
                print(part.text)


if __name__ == '__main__':
    chat_with_your_video()

Chat with your images:

from rich import print

from geminiplayground.core import GeminiClient
from geminiplayground.parts import ImageFile
from geminiplayground.schemas import GenerateRequestParts, TextPart, GenerateRequest


def chat_with_your_images():
    """
    Collect the content parts of an image, count the request tokens,
    and generate a response.
    """
    gemini_client = GeminiClient()
    model = "models/gemini-1.5-pro-latest"

    image_file_path = "https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png"
    image_file = ImageFile(image_file_path, gemini_client=gemini_client)
    image_parts = image_file.content_parts()
    image_files = image_file.files
    print("Image files: ", image_files)

    request_parts = GenerateRequestParts(parts=[
        TextPart(text="Look at this image:"),
        *image_parts,
        TextPart(text="Describe what you see."),
    ])
    request = GenerateRequest(
        contents=[
            request_parts
        ]
    )
    tokens_count = gemini_client.get_tokens_count(model, request)
    print("Tokens count: ", tokens_count)
    response = gemini_client.generate_response(model, request)

    # Print the response
    for candidate in response.candidates:
        for part in candidate.content.parts:
            if part.text:
                print(part.text)


if __name__ == '__main__':
    chat_with_your_images()

These are basic examples. Explore the codebase and documentation for more advanced functionality.
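
Each example above counts tokens before generating. A small guard can turn that count into a decision, skipping the request when it is too large. The sketch below relies only on the two client methods used in the examples, `get_tokens_count` and `generate_response`. The names `generate_within_budget`, `max_tokens`, and `FakeClient` are hypothetical, and it assumes the token count can be converted with `int()`; adapt that line if the library returns a richer object.

```python
def generate_within_budget(client, model, request, max_tokens):
    """Generate a response only when the counted tokens fit within max_tokens."""
    # Assumption: the returned count is an int or convertible to one.
    tokens = int(client.get_tokens_count(model, request))
    if tokens > max_tokens:
        raise ValueError(f"Request needs {tokens} tokens, over the budget of {max_tokens}.")
    return client.generate_response(model, request)


class FakeClient:
    """Stand-in for GeminiClient so the guard can be exercised offline."""

    def __init__(self, tokens):
        self.tokens = tokens

    def get_tokens_count(self, model, request):
        return self.tokens

    def generate_response(self, model, request):
        return "generated"


print(generate_within_budget(FakeClient(10), "models/gemini-1.5-pro-latest", {}, max_tokens=100))
```

With a real client, the budget would be chosen from the context limit of the model in use; no specific limit is assumed here.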

GUI

You can also use the GUI to interact with Gemini. Remember to set the AISTUDIO_API_KEY environment variable with your API key. You can do so globally, pass it as an argument to the command, or create a .env file in the root of your project and set the AISTUDIO_API_KEY variable there.
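
If the key lives in a .env file, a quick check that the file parses as expected can save a confusing startup failure. How geminiplayground itself loads the file is not shown here, so this is only a standalone sketch for simple `KEY=VALUE` lines; `read_env_file` is a hypothetical helper, not a library function.

```python
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Ignore blank lines, comments, and lines without an '=' separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values
```

For example, `"AISTUDIO_API_KEY" in read_env_file(".env")` confirms the variable is defined before the GUI is launched.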

To run the GUI, use the following command:

geminiplayground ui

or

AISTUDIO_API_KEY=your_api_key geminiplayground ui

This will start a local server and open the GUI in your default browser.

Contributing

Contributions are welcome! Please see the CONTRIBUTING.md file for guidelines [Coming soon].

License

This codebase is licensed under the MIT License. See the LICENSE file for details.

Contributors

haruiz, jggomez
