gbaptista / gemini-ai Goto Github PK

A Ruby Gem for interacting with Gemini through Vertex AI, Generative Language API, or AI Studio, Google's generative AI services.

Home Page: https://rubygems.org/gems/gemini-ai

License: MIT License

Ruby 87.91% Clojure 12.09%

ai deepmind gemini gemini-ai google google-ai google-gemini google-vertex-ai llm nano-bots vertex vertex-ai gemini-functions gemini-tools gemini-pro

gemini-ai's Introduction

Gemini AI

A Ruby Gem for interacting with Gemini through Vertex AI, Generative Language API, or AI Studio, Google's generative AI services.

This Gem is designed to provide low-level access to Gemini, enabling people to build abstractions on top of it. If you are interested in more high-level abstractions or more user-friendly tools, you may want to consider Nano Bots 💎 🤖.

TL;DR and Quick Start

gem 'gemini-ai', '~> 3.2.0'

require 'gemini-ai'

# With an API key
client = Gemini.new(
  credentials: {
    service: 'generative-language-api',
    api_key: ENV['GOOGLE_API_KEY']
  },
  options: { model: 'gemini-pro', server_sent_events: true }
)

# With a Service Account Credentials File
client = Gemini.new(
  credentials: {
    service: 'vertex-ai-api',
    file_path: 'google-credentials.json',
    region: 'us-east4'
  },
  options: { model: 'gemini-pro', server_sent_events: true }
)

# With Application Default Credentials
client = Gemini.new(
  credentials: {
    service: 'vertex-ai-api',
    region: 'us-east4'
  },
  options: { model: 'gemini-pro', server_sent_events: true }
)

result = client.stream_generate_content({
  contents: { role: 'user', parts: { text: 'hi!' } }
})

Result:

[{ 'candidates' =>
   [{ 'content' => {
        'role' => 'model',
        'parts' => [{ 'text' => 'Hello! How may I assist you?' }]
      },
      'finishReason' => 'STOP',
      'safetyRatings' =>
      [{ 'category' => 'HARM_CATEGORY_HARASSMENT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_HATE_SPEECH', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability' => 'NEGLIGIBLE' }] }],
   'usageMetadata' => {
     'promptTokenCount' => 2,
     'candidatesTokenCount' => 8,
     'totalTokenCount' => 10
   } }]

Setup

Installing

gem install gemini-ai -v 3.2.0

gem 'gemini-ai', '~> 3.2.0'

Credentials

Option 1: API Key (Generative Language API)
Option 2: Service Account Credentials File (Vertex AI API)
Option 3: Application Default Credentials (Vertex AI API)
Required Data

⚠️ DISCLAIMER: Be careful with what you are doing, and never trust others' code related to this. These commands and instructions alter the level of access to your Google Cloud Account, and running them naively can lead to security risks as well as financial risks. People with access to your account can use it to steal data or incur charges. Run these commands at your own responsibility and due diligence; expect no warranties from the contributors of this project.

Option 1: API Key (Generative Language API)

You need a Google Cloud Project, and then you can generate an API Key through the Google Cloud Console here.

You also need to enable the Generative Language API service in your Google Cloud Console, which can be done here.

Alternatively, you can generate an API Key through Google AI Studio here. However, this approach will automatically create a project for you in your Google Cloud Account.

Option 2: Service Account Credentials File (Vertex AI API)

You need a Google Cloud Project and a Service Account to use Vertex AI API.

After creating them, you need to enable the Vertex AI API for your project by clicking Enable here: Vertex AI API.

You can create credentials for your Service Account here, where you will be able to download a JSON file named google-credentials.json that should have content similar to this:

{
  "type": "service_account",
  "project_id": "YOUR_PROJECT_ID",
  "private_key_id": "a00...",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "PROJECT_ID@PROJECT_ID.iam.gserviceaccount.com",
  "client_id": "000...",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/..."
}

You need to have the necessary policies (roles/aiplatform.user and possibly roles/ml.admin) in place to use the Vertex AI API.

You can add them by navigating to the IAM Console and clicking on the "Edit principal" (✏️ pencil icon) next to your Service Account.

Alternatively, you can add them through the gcloud CLI as follows:

gcloud projects add-iam-policy-binding PROJECT_ID \
  --member='serviceAccount:PROJECT_ID@PROJECT_ID.iam.gserviceaccount.com' \
  --role='roles/aiplatform.user'

Some people reported having trouble accessing the API, and adding the role roles/ml.admin fixed it:

gcloud projects add-iam-policy-binding PROJECT_ID \
  --member='serviceAccount:PROJECT_ID@PROJECT_ID.iam.gserviceaccount.com' \
  --role='roles/ml.admin'

If you are not using a Service Account:

gcloud projects add-iam-policy-binding PROJECT_ID \
  --member='user:[email protected]' \
  --role='roles/aiplatform.user'

gcloud projects add-iam-policy-binding PROJECT_ID \
  --member='user:[email protected]' \
  --role='roles/ml.admin'

Option 3: Application Default Credentials (Vertex AI API)

Similar to Option 2, but you don't need to download a google-credentials.json. Application Default Credentials automatically find credentials based on the application environment.

For local development, you can generate your default credentials using the gcloud CLI as follows:

gcloud auth application-default login

For more details about alternative methods and different environments, check the official documentation: Set up Application Default Credentials

Required Data

After choosing an option, you should have all the necessary data and access to use Gemini.

Option 1, for API Key:

{
  service: 'generative-language-api',
  api_key: 'GOOGLE_API_KEY'
}

Remember that hardcoding your API key in code is unsafe; it's preferable to use environment variables:

{
  service: 'generative-language-api',
  api_key: ENV['GOOGLE_API_KEY']
}

Option 2: For the Service Account, provide a google-credentials.json file and a REGION:

{
  service: 'vertex-ai-api',
  file_path: 'google-credentials.json',
  region: 'us-east4'
}

Option 3: For Application Default Credentials, omit both the api_key and the file_path:

{
  service: 'vertex-ai-api',
  region: 'us-east4'
}

As of the writing of this README, the following regions support Gemini:

Iowa (us-central1)
Las Vegas, Nevada (us-west4)
Montréal, Canada (northamerica-northeast1)
Northern Virginia (us-east4)
Oregon (us-west1)
Seoul, Korea (asia-northeast3)
Singapore (asia-southeast1)
Tokyo, Japan (asia-northeast1)

You can follow here if new regions are available: Gemini API

You might want to explicitly set a Google Cloud Project ID, which you can do as follows:

{
  service: 'vertex-ai-api',
  project_id: 'PROJECT_ID'
}

Custom Version

By default, the gem uses the v1 version of the APIs. You may want to use a different version:

# With an API key
client = Gemini.new(
  credentials: {
    service: 'generative-language-api',
    api_key: ENV['GOOGLE_API_KEY'],
    version: 'v1beta'
  },
  options: { model: 'gemini-pro', server_sent_events: true }
)

# With a Service Account Credentials File
client = Gemini.new(
  credentials: {
    service: 'vertex-ai-api',
    file_path: 'google-credentials.json',
    region: 'us-east4',
    version: 'v1beta'
  },
  options: { model: 'gemini-pro', server_sent_events: true }
)

# With Application Default Credentials
client = Gemini.new(
  credentials: {
    service: 'vertex-ai-api',
    region: 'us-east4',
    version: 'v1beta'
  },
  options: { model: 'gemini-pro', server_sent_events: true }
)

Usage

Client

Ensure that you have all the required data for authentication.

Create a new client:

require 'gemini-ai'

# With an API key
client = Gemini.new(
  credentials: {
    service: 'generative-language-api',
    api_key: ENV['GOOGLE_API_KEY']
  },
  options: { model: 'gemini-pro', server_sent_events: true }
)

# With a Service Account Credentials File
client = Gemini.new(
  credentials: {
    service: 'vertex-ai-api',
    file_path: 'google-credentials.json',
    region: 'us-east4'
  },
  options: { model: 'gemini-pro', server_sent_events: true }
)

# With Application Default Credentials
client = Gemini.new(
  credentials: {
    service: 'vertex-ai-api',
    region: 'us-east4'
  },
  options: { model: 'gemini-pro', server_sent_events: true }
)

Methods

stream_generate_content

Receiving Stream Events

Ensure that you have enabled Server-Sent Events before using blocks for streaming:

client.stream_generate_content(
  { contents: { role: 'user', parts: { text: 'hi!' } } }
) do |event, parsed, raw|
  puts event
end

Event:

{ 'candidates' =>
  [{ 'content' => {
       'role' => 'model',
       'parts' => [{ 'text' => 'Hello! How may I assist you?' }]
     },
     'finishReason' => 'STOP',
     'safetyRatings' =>
     [{ 'category' => 'HARM_CATEGORY_HARASSMENT', 'probability' => 'NEGLIGIBLE' },
      { 'category' => 'HARM_CATEGORY_HATE_SPEECH', 'probability' => 'NEGLIGIBLE' },
      { 'category' => 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability' => 'NEGLIGIBLE' },
      { 'category' => 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability' => 'NEGLIGIBLE' }] }],
  'usageMetadata' => {
    'promptTokenCount' => 2,
    'candidatesTokenCount' => 8,
    'totalTokenCount' => 10
  } }

Without Events

You can use stream_generate_content without events:

result = client.stream_generate_content(
  { contents: { role: 'user', parts: { text: 'hi!' } } }
)

In this case, the result will be an array with all the received events:

[{ 'candidates' =>
   [{ 'content' => {
        'role' => 'model',
        'parts' => [{ 'text' => 'Hello! How may I assist you?' }]
      },
      'finishReason' => 'STOP',
      'safetyRatings' =>
      [{ 'category' => 'HARM_CATEGORY_HARASSMENT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_HATE_SPEECH', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability' => 'NEGLIGIBLE' }] }],
   'usageMetadata' => {
     'promptTokenCount' => 2,
     'candidatesTokenCount' => 8,
     'totalTokenCount' => 10
   } }]

You can mix both as well:

result = client.stream_generate_content(
  { contents: { role: 'user', parts: { text: 'hi!' } } }
) do |event, parsed, raw|
  puts event
end

generate_content

result = client.generate_content(
  { contents: { role: 'user', parts: { text: 'hi!' } } }
)

Result:

{ 'candidates' =>
  [{ 'content' => { 'parts' => [{ 'text' => 'Hello! How can I assist you today?' }], 'role' => 'model' },
     'finishReason' => 'STOP',
     'index' => 0,
     'safetyRatings' =>
     [{ 'category' => 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability' => 'NEGLIGIBLE' },
      { 'category' => 'HARM_CATEGORY_HATE_SPEECH', 'probability' => 'NEGLIGIBLE' },
      { 'category' => 'HARM_CATEGORY_HARASSMENT', 'probability' => 'NEGLIGIBLE' },
      { 'category' => 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability' => 'NEGLIGIBLE' }] }],
  'promptFeedback' =>
  { 'safetyRatings' =>
    [{ 'category' => 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability' => 'NEGLIGIBLE' },
     { 'category' => 'HARM_CATEGORY_HATE_SPEECH', 'probability' => 'NEGLIGIBLE' },
     { 'category' => 'HARM_CATEGORY_HARASSMENT', 'probability' => 'NEGLIGIBLE' },
     { 'category' => 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability' => 'NEGLIGIBLE' }] } }

As of the writing of this README, only the generative-language-api service supports the generate_content method; vertex-ai-api does not.

Modes

Text

result = client.stream_generate_content({
  contents: { role: 'user', parts: { text: 'hi!' } }
})

Result:

[{ 'candidates' =>
   [{ 'content' => {
        'role' => 'model',
        'parts' => [{ 'text' => 'Hello! How may I assist you?' }]
      },
      'finishReason' => 'STOP',
      'safetyRatings' =>
      [{ 'category' => 'HARM_CATEGORY_HARASSMENT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_HATE_SPEECH', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability' => 'NEGLIGIBLE' }] }],
   'usageMetadata' => {
     'promptTokenCount' => 2,
     'candidatesTokenCount' => 8,
     'totalTokenCount' => 10
   } }]

Image

Courtesy of Unsplash

Switch to the gemini-pro-vision model:

client = Gemini.new(
  credentials: { service: 'vertex-ai-api', region: 'us-east4' },
  options: { model: 'gemini-pro-vision', server_sent_events: true }
)

Then, encode the image as Base64 and add its MIME type:

require 'base64'

result = client.stream_generate_content(
  { contents: [
    { role: 'user', parts: [
      { text: 'Please describe this image.' },
      { inline_data: {
        mime_type: 'image/jpeg',
        data: Base64.strict_encode64(File.read('piano.jpg'))
      } }
    ] }
  ] }
)

The result:

[{ 'candidates' =>
   [{ 'content' =>
      { 'role' => 'model',
        'parts' =>
        [{ 'text' =>
           ' A black and white image of an old piano. The piano is an upright model, with the keys on the right side of the image. The piano is' }] },
      'safetyRatings' =>
      [{ 'category' => 'HARM_CATEGORY_HARASSMENT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_HATE_SPEECH', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability' => 'NEGLIGIBLE' }] }] },
 { 'candidates' =>
   [{ 'content' => { 'role' => 'model', 'parts' => [{ 'text' => ' sitting on a tiled floor. There is a small round object on the top of the piano.' }] },
      'finishReason' => 'STOP',
      'safetyRatings' =>
      [{ 'category' => 'HARM_CATEGORY_HARASSMENT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_HATE_SPEECH', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability' => 'NEGLIGIBLE' }] }],
   'usageMetadata' => { 'promptTokenCount' => 263, 'candidatesTokenCount' => 50, 'totalTokenCount' => 313 } }]

Video

coffee.mp4

ALT: A white and gold cup is being filled with coffee. The coffee is dark and rich. The cup is sitting on a black surface. The background is blurred.

Courtesy of Pexels

Switch to the gemini-pro-vision model:

client = Gemini.new(
  credentials: { service: 'vertex-ai-api', region: 'us-east4' },
  options: { model: 'gemini-pro-vision', server_sent_events: true }
)

Then, encode the video as Base64 and add its MIME type:

require 'base64'

result = client.stream_generate_content(
  { contents: [
    { role: 'user', parts: [
      { text: 'Please describe this video.' },
      { inline_data: {
        mime_type: 'video/mp4',
        data: Base64.strict_encode64(File.read('coffee.mp4'))
      } }
    ] }
  ] }
)

The result:

[{"candidates"=>
   [{"content"=>
      {"role"=>"model",
       "parts"=>
        [{"text"=>
           " A white and gold cup is being filled with coffee. The coffee is dark and rich. The cup is sitting on a black surface. The background is blurred"}]},
     "safetyRatings"=>
      [{"category"=>"HARM_CATEGORY_HARASSMENT", "probability"=>"NEGLIGIBLE"},
       {"category"=>"HARM_CATEGORY_HATE_SPEECH", "probability"=>"NEGLIGIBLE"},
       {"category"=>"HARM_CATEGORY_SEXUALLY_EXPLICIT", "probability"=>"NEGLIGIBLE"},
       {"category"=>"HARM_CATEGORY_DANGEROUS_CONTENT", "probability"=>"NEGLIGIBLE"}]}],
  "usageMetadata"=>{"promptTokenCount"=>1037, "candidatesTokenCount"=>31, "totalTokenCount"=>1068}},
 {"candidates"=>
   [{"content"=>{"role"=>"model", "parts"=>[{"text"=>"."}]},
     "finishReason"=>"STOP",
     "safetyRatings"=>
      [{"category"=>"HARM_CATEGORY_HARASSMENT", "probability"=>"NEGLIGIBLE"},
       {"category"=>"HARM_CATEGORY_HATE_SPEECH", "probability"=>"NEGLIGIBLE"},
       {"category"=>"HARM_CATEGORY_SEXUALLY_EXPLICIT", "probability"=>"NEGLIGIBLE"},
       {"category"=>"HARM_CATEGORY_DANGEROUS_CONTENT", "probability"=>"NEGLIGIBLE"}]}],
  "usageMetadata"=>{"promptTokenCount"=>1037, "candidatesTokenCount"=>32, "totalTokenCount"=>1069}}]

Streaming vs. Server-Sent Events (SSE)

Server-Sent Events (SSE) is a technology that allows certain endpoints to offer streaming capabilities, such as creating the impression that "the model is typing along with you," rather than delivering the entire answer all at once.

You can set up the client to use Server-Sent Events (SSE) for all supported endpoints:

client = Gemini.new(
  credentials: { ... },
  options: { model: 'gemini-pro', server_sent_events: true }
)

Or, you can decide on a request basis:

client.stream_generate_content(
  { contents: { role: 'user', parts: { text: 'hi!' } } },
  server_sent_events: true
)

With Server-Sent Events (SSE) enabled, you can use a block to receive partial results via events. This feature is particularly useful for methods that offer streaming capabilities, such as stream_generate_content:

client.stream_generate_content(
  { contents: { role: 'user', parts: { text: 'hi!' } } }
) do |event, parsed, raw|
  puts event
end

Event:

{ 'candidates' =>
  [{ 'content' => {
       'role' => 'model',
       'parts' => [{ 'text' => 'Hello! How may I assist you?' }]
     },
     'finishReason' => 'STOP',
     'safetyRatings' =>
     [{ 'category' => 'HARM_CATEGORY_HARASSMENT', 'probability' => 'NEGLIGIBLE' },
      { 'category' => 'HARM_CATEGORY_HATE_SPEECH', 'probability' => 'NEGLIGIBLE' },
      { 'category' => 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability' => 'NEGLIGIBLE' },
      { 'category' => 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability' => 'NEGLIGIBLE' }] }],
  'usageMetadata' => {
    'promptTokenCount' => 2,
    'candidatesTokenCount' => 8,
    'totalTokenCount' => 10
  } }

Even though streaming methods utilize Server-Sent Events (SSE), using this feature doesn't necessarily mean streaming data. For example, when generate_content is called with SSE enabled, you will receive all the data at once in a single event, rather than through multiple partial events. This occurs because generate_content isn't designed for streaming, even though it is capable of utilizing Server-Sent Events.

Server-Sent Events (SSE) Hang

Method calls will hang until the server-sent events finish, so even without providing a block, you can obtain the final results of the received events:

result = client.stream_generate_content(
  { contents: { role: 'user', parts: { text: 'hi!' } } },
  server_sent_events: true
)

Result:

[{ 'candidates' =>
   [{ 'content' => {
        'role' => 'model',
        'parts' => [{ 'text' => 'Hello! How may I assist you?' }]
      },
      'finishReason' => 'STOP',
      'safetyRatings' =>
      [{ 'category' => 'HARM_CATEGORY_HARASSMENT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_HATE_SPEECH', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability' => 'NEGLIGIBLE' }] }],
   'usageMetadata' => {
     'promptTokenCount' => 2,
     'candidatesTokenCount' => 8,
     'totalTokenCount' => 10
   } }]

Non-Streaming

Depending on the service, you can use the generate_content method, which does not stream the answer.

You can also use methods designed for streaming without necessarily processing partial events; instead, you can wait for the result of all received events:

result = client.stream_generate_content({
  contents: { role: 'user', parts: { text: 'hi!' } },
  server_sent_events: false
})

Result:

[{ 'candidates' =>
   [{ 'content' => {
        'role' => 'model',
        'parts' => [{ 'text' => 'Hello! How may I assist you?' }]
      },
      'finishReason' => 'STOP',
      'safetyRatings' =>
      [{ 'category' => 'HARM_CATEGORY_HARASSMENT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_HATE_SPEECH', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability' => 'NEGLIGIBLE' }] }],
   'usageMetadata' => {
     'promptTokenCount' => 2,
     'candidatesTokenCount' => 8,
     'totalTokenCount' => 10
   } }]

Back-and-Forth Conversations

To maintain a back-and-forth conversation, you need to append the received responses and build a history for your requests:

result = client.stream_generate_content(
  { contents: [
    { role: 'user', parts: { text: 'Hi! My name is Purple.' } },
    { role: 'model', parts: { text: "Hello Purple! It's nice to meet you." } },
    { role: 'user', parts: { text: "What's my name?" } }
  ] }
)

Result:

[{ 'candidates' =>
   [{ 'content' =>
      { 'role' => 'model',
        'parts' => [
          { 'text' => "Purple.\n\nYou told me your name was Purple in your first message to me.\n\nIs there anything" }
        ] },
      'safetyRatings' =>
      [{ 'category' => 'HARM_CATEGORY_HARASSMENT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_HATE_SPEECH', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability' => 'NEGLIGIBLE' }] }] },
 { 'candidates' =>
   [{ 'content' => { 'role' => 'model', 'parts' => [{ 'text' => ' else I can help you with today, Purple?' }] },
      'finishReason' => 'STOP',
      'safetyRatings' =>
      [{ 'category' => 'HARM_CATEGORY_HARASSMENT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_HATE_SPEECH', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability' => 'NEGLIGIBLE' }] }],
   'usageMetadata' => {
     'promptTokenCount' => 24,
     'candidatesTokenCount' => 31,
     'totalTokenCount' => 55
   } }]

Tools (Functions) Calling

As of the writing of this README, only the vertex-ai-api service and the gemini-pro model supports tools (functions) calls.

You can provide specifications for tools (functions) using JSON Schema to generate potential calls to them:

input = {
  tools: {
    function_declarations: [
      {
        name: 'date_and_time',
        description: 'Returns the current date and time in the ISO 8601 format for a given timezone.',
        parameters: {
          type: 'object',
          properties: {
            timezone: {
              type: 'string',
              description: 'A string represents the timezone to be used for providing a datetime, following the IANA (Internet Assigned Numbers Authority) Time Zone Database. Examples include "Asia/Tokyo" and "Europe/Paris". If not provided, the default timezone is the user\'s current timezone.'
            }
          }
        }
      }
    ]
  },
  contents: [
    { role: 'user', parts: { text: 'What time is it?' } }
  ]
}

result = client.stream_generate_content(input)

Which may return a request to perform a call:

[{ 'candidates' =>
   [{ 'content' => {
        'role' => 'model',
        'parts' => [{ 'functionCall' => {
          'name' => 'date_and_time',
          'args' => { 'timezone' => 'local' }
        } }]
      },
      'finishReason' => 'STOP',
      'safetyRatings' =>
      [{ 'category' => 'HARM_CATEGORY_HARASSMENT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_HATE_SPEECH', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability' => 'NEGLIGIBLE' }] }],
   'usageMetadata' => { 'promptTokenCount' => 5, 'totalTokenCount' => 5 } }]

Based on these results, you can perform the requested calls and provide their outputs:

gem 'tzinfo', '~> 2.0', '>= 2.0.6'

require 'tzinfo'
require 'time'

function_calls = result.dig(0, 'candidates', 0, 'content', 'parts').filter do |part|
  part.key?('functionCall')
end

function_parts = []

function_calls.each do |function_call|
  next unless function_call['functionCall']['name'] == 'date_and_time'

  timezone = function_call.dig('functionCall', 'args', 'timezone')

  time = if !timezone.nil? && timezone != '' && timezone.downcase != 'local'
           TZInfo::Timezone.get(timezone).now
         else
           Time.now
         end

  function_output = time.iso8601

  function_parts << {
    functionResponse: {
      name: function_call['functionCall']['name'],
      response: {
        name: function_call['functionCall']['name'],
        content: function_output
      }
    }
  }
end

input[:contents] << result.dig(0, 'candidates', 0, 'content')
input[:contents] << { role: 'function', parts: function_parts }

This will be equivalent to the following final input:

{ tools: { function_declarations: [
  { name: 'date_and_time',
    description: 'Returns the current date and time in the ISO 8601 format for a given timezone.',
    parameters: {
      type: 'object',
      properties: {
        timezone: {
          type: 'string',
          description: "A string represents the timezone to be used for providing a datetime, following the IANA (Internet Assigned Numbers Authority) Time Zone Database. Examples include \"Asia/Tokyo\" and \"Europe/Paris\". If not provided, the default timezone is the user's current timezone."
        }
      }
    } }
] },
  contents: [
    { role: 'user', parts: { text: 'What time is it?' } },
    { role: 'model',
      parts: [
        { functionCall: { name: 'date_and_time', args: { timezone: 'local' } } }
      ] },
    { role: 'function',
      parts: [{ functionResponse: {
        name: 'date_and_time',
        response: {
          name: 'date_and_time',
          content: '2023-12-13T21:15:11-03:00'
        }
      } }] }
  ] }

With the input properly arranged, you can make another request:

result = client.stream_generate_content(input)

Which will result in:

[{ 'candidates' =>
   [{ 'content' => { 'role' => 'model', 'parts' => [{ 'text' => 'It is 21:15.' }] },
      'finishReason' => 'STOP',
      'safetyRatings' =>
      [{ 'category' => 'HARM_CATEGORY_HARASSMENT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_HATE_SPEECH', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability' => 'NEGLIGIBLE' },
       { 'category' => 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability' => 'NEGLIGIBLE' }] }],
   'usageMetadata' => { 'promptTokenCount' => 5, 'candidatesTokenCount' => 9, 'totalTokenCount' => 14 } }]

New Functionalities and APIs

Google may launch a new endpoint that we haven't covered in the Gem yet. If that's the case, you may still be able to use it through the request method. For example, stream_generate_content is just a wrapper for google/models/gemini-pro:streamGenerateContent, which you can call directly like this:

result = client.request(
  'streamGenerateContent',
  { contents: { role: 'user', parts: { text: 'hi!' } } }
)

Request Options

Adapter

To enable streaming, the gem uses Faraday with the Typhoeus adapter by default.

You can use a different adapter if you want:

require 'faraday/net_http'

client = Gemini.new(
  credentials: { service: 'vertex-ai-api', region: 'us-east4' },
  options: {
    model: 'gemini-pro',
    connection: { adapter: :net_http }
  }
)

Timeout

You can set the maximum number of seconds to wait for the request to complete with the timeout option:

client = Gemini.new(
  credentials: { service: 'vertex-ai-api', region: 'us-east4' },
  options: {
    model: 'gemini-pro',
    connection: { request: { timeout: 5 } }
  }
)

You can also have more fine-grained control over Faraday's Request Options if you prefer:

client = Gemini.new(
  credentials: { service: 'vertex-ai-api', region: 'us-east4' },
  options: {
    model: 'gemini-pro',
    connection: {
      request: {
        timeout: 5,
        open_timeout: 5,
        read_timeout: 5,
        write_timeout: 5
      }
    }
  }
)

Error Handling

Rescuing

require 'gemini-ai'

begin
  client.stream_generate_content({
    contents: { role: 'user', parts: { text: 'hi!' } }
  })
rescue Gemini::Errors::GeminiError => error
  puts error.class # Gemini::Errors::RequestError
  puts error.message # 'the server responded with status 500'

  puts error.payload
  # { contents: [{ role: 'user', parts: { text: 'hi!' } }],
  #   generationConfig: { candidateCount: 1 },
  #   ...
  # }

  puts error.request
  # #<Faraday::ServerError response={:status=>500, :headers...
end

For Short

require 'gemini-ai/errors'

begin
  client.stream_generate_content({
    contents: { role: 'user', parts: { text: 'hi!' } }
  })
rescue GeminiError => error
  puts error.class # Gemini::Errors::RequestError
end

Errors

GeminiError

MissingProjectIdError
UnsupportedServiceError
BlockWithoutServerSentEventsError

RequestError

Development

bundle
rubocop -A

bundle exec ruby spec/tasks/run-client.rb

Purpose

This Gem is designed to provide low-level access to Gemini, enabling people to build abstractions on top of it. If you are interested in more high-level abstractions or more user-friendly tools, you may want to consider Nano Bots 💎 🤖.

Publish to RubyGems

gem build gemini-ai.gemspec

gem signin

gem push gemini-ai-3.2.0.gem

Updating the README

Install Babashka:

curl -s https://raw.githubusercontent.com/babashka/babashka/master/install | sudo bash

Update the template.md file and then:

bb tasks/generate-readme.clj

Trick for automatically updating the README.md when template.md changes:

sudo pacman -S inotify-tools # Arch / Manjaro
sudo apt-get install inotify-tools # Debian / Ubuntu / Raspberry Pi OS
sudo dnf install inotify-tools # Fedora / CentOS / RHEL

while inotifywait -e modify template.md; do bb tasks/generate-readme.clj; done

Trick for Markdown Live Preview:

pip install -U markdown_live_preview

mlp README.md -p 8076

Resources and References

These resources and references may be useful throughout your learning process.

Disclaimer

This is not an official Google project, nor is it affiliated with Google in any way.

This software is distributed under the MIT License. This license includes a disclaimer of warranty. Moreover, the authors assume no responsibility for any damage or costs that may result from using this project. Use the Gemini AI Ruby Gem at your own risk.

gemini-ai's People

Contributors

Stargazers

Watchers

Forkers

niciliketo hitech777 whuhxb cassimahmedattia patentfield shimojik nesichen vetsuccess bouchard hshdhshehhs ay0tgourmetart obi-a jamesiarmes surendharreddy

gemini-ai's Issues

JFYI - Your code also works with Ultra and Embeddings!

I've tried to substitute 'gemini-pro' with other strings provided by the Gemini Vertex AI playground code, and JFYI they both works:

GeminiModel = "gemini-1.0-pro-001" # This works as 'gemini-pro'
GeminiModel = "gemini-1.0-ultra-001" # this also works! Yuppie!

JFYI - Looking fwd to speaking to you in person (Laurencio is arranging a chat).

Riccardo

Ways to pass temperature or max tokens?

Thanks for this project!

Is there a way to pass temperature or max tokens? Ideally I'd tell it to respond and keep it under X length - I can sometimes get it to work passing that in the request (under 200 characters) but sometimes you'll ask it something and it'll reply with ridiculously long answer.

I tried looking at the rest API but can't make head or tails of it - any tips on finding useful docs for google apis? They all seem to be automatic dumps without any actual info. https://cloud.google.com/vertex-ai/docs/reference/rest

Streaming is not working

Hey @gbaptista,
Good job on the integration with Gemini models! I was trying this gem out for a project of mine and I couldn't get streaming to actually work. I tried connecting to both vertex-ai-api and generative-language-api. As well as passing server_sent_events: true option to both client initializer and stream_generate_content method.

Initializing a client

client = Gemini.new(
  credentials: {
    service: 'vertex-ai-api',
    file_path: 'google-credentials.json',
    region: 'us-east4'
  },
  options: { model: 'gemini-pro', server_sent_events: true }
)

Running a request

client.stream_generate_content({ contents: { role: 'user', parts: { text: 'generate a short story...' } } }, server_sent_events: true) do |event|
  puts event
end

It just spits out all the puts and return results all at once. I thought there was some output buffering happening so that it outputs everything together, but the block gets called as data comes in, so I tried this:

client.stream_generate_content({ contents: { role: 'user', parts: { text: 'generate a short story...' } } }, server_sent_events: true) do |event|
  puts Time.now
end

and the output was:

2024-01-22 11:56:19.639536 +0700
2024-01-22 11:56:19.639734 +0700
2024-01-22 11:56:19.640454 +0700
2024-01-22 11:56:19.640596 +0700
2024-01-22 11:56:19.640801 +0700
2024-01-22 11:56:19.640929 +0700
2024-01-22 11:56:19.641091 +0700
2024-01-22 11:56:19.641239 +0700
2024-01-22 11:56:19.641432 +0700
2024-01-22 11:56:19.641593 +0700

So, even though the request took several seconds as the model generated a longish story all the blocks were executed in the same second when the final result was returned from stream_generate_content method.

Am I missing something that's needed to make streaming work? Or is it the way it works - you get response chunked, but chunks arrive at the same time?

support for Application default credentials

Would you consider adding support for this?
I have proposed a PR here #1

Support for system instructions

Does it support giving system instructions while creating the model?

Example in python:

model = genai.GenerativeModel(model_name="gemini-1.5-pro-latest",
                              generation_config=generation_config,
                              system_instruction=system_instruction,
                              safety_settings=safety_settings)

Using Safety Settings

Hi, @gbaptista.

Is it possible to use Safety Settings in the gem? I'm attempting to analyze certain content, but the process concludes due to a high HARM_CATEGORY_DANGEROUS_CONTENT reason. According to the Gemini API documentation, Safety Settings can be applied in this context, but I've been unsuccessful in make it work with the gem. Specifically, I'm using gemini-pro-vision to scan an image of a pesticide.

safetySettings: {
HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
}

I will appreciate your help with this. Thanks.

My api result is an array with the response broken up over multiple candidates

I'm wondering if I'm missing a configuration. My response is broken up into numerous parts. I know I could combine them, but I'm wondering why this result is different than what I get when I run a curl command to hit the API.

I run this:

    client = Gemini.new(
      credentials: {
        service: "generative-language-api",
        api_key: ENV["GEMINI_API_KEY"],
      },
      options: { model: "gemini-pro", stream: false },
    )

    result = client.stream_generate_content({
      contents: { role: "user", parts: { text: "Write an essay on the history of Canada." } },
    })

and I get this response:

{
  "candidates"=>[
    {
      "content"=>{
        "parts"=>[
          {
            "text"=>"The history of Canada is a rich and complex tapestry of Indigenous civilizations, European exploration"
          }
        ],
        "role"=>"model"
      },
      "finishReason"=>"STOP",
      "index"=>0,
      "safetyRatings"=>[
        {
          "category"=>"HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_HATE_SPEECH",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_HARASSMENT",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability"=>"NEGLIGIBLE"
        }
      ]
    }
  ],
  "promptFeedback"=>{
    "safetyRatings"=>[
      {
        "category"=>"HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "probability"=>"NEGLIGIBLE"
      },
      {
        "category"=>"HARM_CATEGORY_HATE_SPEECH",
        "probability"=>"NEGLIGIBLE"
      },
      {
        "category"=>"HARM_CATEGORY_HARASSMENT",
        "probability"=>"NEGLIGIBLE"
      },
      {
        "category"=>"HARM_CATEGORY_DANGEROUS_CONTENT",
        "probability"=>"NEGLIGIBLE"
      }
    ]
  }
}{
  "candidates"=>[
    {
      "content"=>{
        "parts"=>[
          {
            "text"=>", colonization, confederation, and nation-building. Spanning thousands of years, it is a story of diverse peoples, cultures, and events that have shaped"
          }
        ],
        "role"=>"model"
      },
      "finishReason"=>"STOP",
      "index"=>0,
      "safetyRatings"=>[
        {
          "category"=>"HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_HATE_SPEECH",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_HARASSMENT",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability"=>"NEGLIGIBLE"
        }
      ]
    }
  ]
} ... etc ...
{
  "candidates"=>[
    {
      "content"=>{
        "parts"=>[
          {
            "text"=>"th century, Canada played a major role in both World Wars, and its contributions helped to establish the country as a respected member of the international community. After the Second World War, Canada experienced a period of rapid economic growth and social change, leading to the Quiet Revolution in Quebec and the rise of a more diverse and multicultural society.\n\nToday, Canada is a modern, democratic, and prosperous country with a rich history and culture. It is a nation built on the principles of peace, order, and good government, and it continues to play an active role in global affairs, promoting peace, security, and human rights around the world."
          }
        ],
        "role"=>"model"
      },
      "finishReason"=>"STOP",
      "index"=>0,
      "safetyRatings"=>[
        {
          "category"=>"HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_HATE_SPEECH",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_HARASSMENT",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability"=>"NEGLIGIBLE"
        }
      ]
    }
  ]
}

Recommend Projects

React

A declarative, efficient, and flexible JavaScript library for building user interfaces.
Vue.js

🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
Typescript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
TensorFlow

An Open Source Machine Learning Framework for Everyone
Django

The Web framework for perfectionists with deadlines.
Laravel

A PHP framework for web artisans
D3

Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

javascript

JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
web

Some thing interesting about web. New door for the world.
server

A server is a program made to process requests and deliver data to clients.
Machine learning

Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Visualization

Some thing interesting about visualization, use data art
Game

Some thing interesting about game, make everyone happy.

Recommend Org

Facebook

We are working to build community through open source technology. NB: members must have two-factor auth.
Microsoft

Open source projects and samples from Microsoft.
Google

Google ❤️ Open Source for everyone.
Alibaba

Alibaba Open Source for everyone
D3

Data-Driven Documents codes.
Tencent

China tencent open source team.