
Comments (157)

schnerd commented on August 17, 2024

@gfortaine we actually use @microsoft/fetch-event-source for the playground to do streaming with POST 👍

Thank you all for sharing your solutions here! I agree that @smervs solution currently looks like the best option available for the openai-node package. Here's a more complete example with proper error handling and no extra dependencies:

try {
    const res = await openai.createCompletion({
        model: "text-davinci-002",
        prompt: "It was the best of times",
        max_tokens: 100,
        temperature: 0,
        stream: true,
    }, { responseType: 'stream' });
    
    res.data.on('data', data => {
        const lines = data.toString().split('\n').filter(line => line.trim() !== '');
        for (const line of lines) {
            const message = line.replace(/^data: /, '');
            if (message === '[DONE]') {
                return; // Stream finished
            }
            try {
                const parsed = JSON.parse(message);
                console.log(parsed.choices[0].text);
            } catch(error) {
                console.error('Could not JSON parse stream message', message, error);
            }
        }
    });
} catch (error) {
    if (error.response?.status) {
        console.error(error.response.status, error.message);
        error.response.data.on('data', data => {
            const message = data.toString();
            try {
                const parsed = JSON.parse(message);
                console.error('An error occurred during OpenAI request: ', parsed);
            } catch(error) {
                console.error('An error occurred during OpenAI request: ', message);
            }
        });
    } else {
        console.error('An error occurred during OpenAI request', error);
    }
}

This could probably be refactored into a streamCompletion helper function (that uses either callbacks or es6 generators to emit new messages).
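
For illustration, a minimal sketch of such a helper written as an async generator (the streamCompletion name is just a suggestion; @gfortaine posts a fuller version further down this thread):

// Sketch: wrap the 'data' events above in an async generator that yields raw messages.
async function* streamCompletion(stream) {
    for await (const chunk of stream) {
        const lines = chunk.toString().split('\n').filter(line => line.trim() !== '');
        for (const line of lines) {
            const message = line.replace(/^data: /, '');
            if (message === '[DONE]') return; // Stream finished
            yield message;
        }
    }
}

// Usage with the response from the example above:
// for await (const message of streamCompletion(res.data)) {
//     const parsed = JSON.parse(message);
//     console.log(parsed.choices[0].text);
// }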

Apologies there's not an easier way to do this within the SDK itself – the team will continue evaluating how to get this added natively, despite the lack of support in the current sdk generator tool we're using.


smervs commented on August 17, 2024

You can use the axios stream response type, but you still need to parse the returned data.

const res = await openai.createCompletion({
  model: "text-davinci-002",
  prompt: "Say this is a test",
  max_tokens: 6,
  temperature: 0,
  stream: true,
}, { responseType: 'stream' });

res.on('data', console.log)


justinmahar commented on August 17, 2024

📦 Client and server side streaming solution via npm

Hey everyone! After some tinkering, I've created a working client and server side solution for this. The GitHub project is here, and you can try the client/browser demo here. (Demo is in React but solution is framework agnostic)

You can now drop in support for streaming chat completions in both the server (Node.js) and client (browser) via the npm package openai-ext. This solution was inspired by everyone's work above, especially @YoseptF and @edelauna. Thanks everyone for working on this together.

This solution supports stopping completions, too.

Full usage examples below.



To install via npm:

npm i openai-ext@latest

Browser / Client

👁️ View live demo

Use the following solution in a browser environment:

import { OpenAIExt } from "openai-ext";

// Configure the stream (use type ClientStreamChatCompletionConfig for TypeScript users)
const streamConfig = {
  apiKey: `123abcXYZasdf`, // Your API key
  handler: {
    // Content contains the string draft, which may be partial. When isFinal is true, the completion is done.
    onContent(content, isFinal, xhr) {
      console.log(content, "isFinal?", isFinal);
    },
    onDone(xhr) {
      console.log("Done!");
    },
    onError(error, status, xhr) {
      console.error(error);
    },
  },
};

// Make the call and store a reference to the XMLHttpRequest
const xhr = OpenAIExt.streamClientChatCompletion(
  {
    model: "gpt-3.5-turbo",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Tell me a funny joke." },
    ],
  },
  streamConfig
);
// If you'd like to stop the completion, call xhr.abort(). The onDone() handler will be called.
xhr.abort();

Node.js / Server

Use the following solution in a Node.js or server environment:

import { Configuration, OpenAIApi } from 'openai';
import { OpenAIExt } from "openai-ext";

const apiKey = `123abcXYZasdf`; // Your API key
const configuration = new Configuration({ apiKey });
const openai = new OpenAIApi(configuration);

// Configure the stream (use type ServerStreamChatCompletionConfig for TypeScript users)
const streamConfig = {
  openai: openai,
  handler: {
    // Content contains the string draft, which may be partial. When isFinal is true, the completion is done.
    onContent(content, isFinal, stream) {
      console.log(content, "isFinal?", isFinal);
    },
    onDone(stream) {
      console.log('Done!');
    },
    onError(error, stream) {
      console.error(error);
    },
  },
};

const axiosConfig = {
  // ...
};

// Make the call to stream the completion
OpenAIExt.streamServerChatCompletion(
  {
    model: 'gpt-3.5-turbo',
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'Tell me a funny joke.' },
    ],
  },
  streamConfig,
  axiosConfig
);

If you'd like to stop the completion, call stream.destroy(). The onDone() handler will be called.

const response = await OpenAIExt.streamServerChatCompletion(...);
const stream = response.data;
stream.destroy();

You can also stop completion using an Axios cancellation in the Axios config.
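
For example, a sketch of what that could look like, assuming the axios version in use accepts an AbortController signal (axios >= 0.22) and that openai-ext passes the config through to axios unchanged:

// Sketch: cancel a streamed completion through the Axios config.
const controller = new AbortController();

const axiosConfig = {
  signal: controller.signal,
};

OpenAIExt.streamServerChatCompletion(
  {
    model: 'gpt-3.5-turbo',
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'Tell me a funny joke.' },
    ],
  },
  streamConfig,
  axiosConfig
);

// Later, e.g. when the user clicks "Stop":
controller.abort();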


Let me know if you have any suggestions. PRs always welcome! I'll be adding a live demo to the project's Storybook site soon.

Update: I've added support for Node.js/server chat completion streaming. This library now supports both server and client. Woohoo 😎

Update 2: Live React demo now available -- view here


Awendel commented on August 17, 2024

I solved it using the inbuilt node http / https module:

const prompt = "Sample prompt. What's 2+2?"

const req = https.request({
	hostname:"api.openai.com",
	port:443,
	path:"/v1/completions",
	method:"POST",
	headers:{
		"Content-Type":"application/json",
		"Authorization":"Bearer "+ KEY_API  
	}
}, function(res){
	res.on('data', (chunk) => {
		console.log("BODY: "+chunk);
	});
	res.on('end', () => {
		console.log('No more data in response.');
	});
})

const body = JSON.stringify({
	model:"text-davinci-003",
	prompt:prompt,
	temperature:0.6,
	max_tokens:512,
	top_p:1.0,
	frequency_penalty:0.5,
	presence_penalty:0.7,
	stream:true
})

req.on('error', (e) => {
	console.error("problem with request:"+e.message);
		});

req.write(body)

req.end()


edelauna commented on August 17, 2024

For server-side typescript can try:

const response = await openai.createChatCompletion({
    model: "gpt-3.5-turbo",
    messages: messages,
    stream: true,
}, { responseType: 'stream' });

const stream = response.data as unknown as IncomingMessage

stream.on('data', (chunk: Buffer) => {
   // Messages in the event stream are separated by a pair of newline characters.
   const payloads = chunk.toString().split("\n\n")
   for (const payload of payloads) {
       if (payload.includes('[DONE]')) return;
       if (payload.startsWith("data:")) {
           const data = payload.replaceAll(/(\n)?^data:\s*/g, ''); // in case there's multiline data event
           try {
               const delta = JSON.parse(data.trim())
               console.log(delta.choices[0].delta?.content)
           } catch (error) {
               console.log(`Error with JSON.parse and ${payload}.\n${error}`)
           }
       }
   }
})

stream.on('end', () => console.log('Stream done'))
stream.on('error', (e: Error) => console.error(e))


smervs commented on August 17, 2024

@smervs your code is working for me, but it logs as

<Buffer 64 61 74 61 3a 20 7b 22 69 64 22 3a 20 22 63 6d 70 6c 2d 36 4a 6e 56 35 4d 70 4d 41 44 4f 41 61 56 74 50 64 30 56 50 72 45 42 4f 62 34 48 54 6c 22 2c ... 155 more bytes>

Do you know how to parse this response?

here

res.data.on('data', data => console.log(data.toString()))


brianfoody commented on August 17, 2024

This format still waits and gives you the entire response at the end though no? Is there not a way to get the results as they stream back as per the OpenAI frontend?


gfortaine commented on August 17, 2024

@gfortaine we actually use @microsoft/fetch-event-source for the playground to do streaming with POST 👍

Thank you all for sharing your solutions here! I agree that @smervs solution currently looks like the best option available for the openai-node package. Here's a more complete example with proper error handling and no extra dependencies:

try {
    const res = await openai.createCompletion({
        model: "text-davinci-002",
        prompt: "It was the best of times",
        max_tokens: 100,
        temperature: 0,
        stream: true,
    }, { responseType: 'stream' });
    
    res.data.on('data', data => {
        const lines = data.toString().split('\n').filter(line => line.trim() !== '');
        for (const line of lines) {
            const message = line.replace(/^data: /, '');
            if (message === '[DONE]') {
                return; // Stream finished
            }
            try {
                const parsed = JSON.parse(message);
                console.log(parsed.choices[0].text);
            } catch(error) {
                console.error('Could not JSON parse stream message', message, error);
            }
        }
    });
} catch (error) {
    if (error.response?.status) {
        console.error(error.response.status, error.message);
        error.response.data.on('data', data => {
            const message = data.toString();
            try {
                const parsed = JSON.parse(message);
                console.error('An error occurred during OpenAI request: ', parsed);
            } catch(error) {
                console.error('An error occurred during OpenAI request: ', message);
            }
        });
    } else {
        console.error('An error occurred during OpenAI request', error);
    }
}

This could probably be refactored into a streamCompletion helper function (that uses either callbacks or es6 generators to emit new messages).

Apologies there's not an easier way to do this within the SDK itself – the team will continue evaluating how to get this added natively, despite the lack of support in the current sdk generator tool we're using.

@schnerd Here it is (streamCompletion helper function code inspired by this snippet, courtesy of @rauschma) 👍 :

// https://2ality.com/2018/04/async-iter-nodejs.html#generator-%231%3A-from-chunks-to-lines
async function* chunksToLines(chunksAsync) {
  let previous = "";
  for await (const chunk of chunksAsync) {
    const bufferChunk = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
    previous += bufferChunk;
    let eolIndex;
    while ((eolIndex = previous.indexOf("\n")) >= 0) {
      // line includes the EOL
      const line = previous.slice(0, eolIndex + 1).trimEnd();
      if (line === "data: [DONE]") break;
      if (line.startsWith("data: ")) yield line;
      previous = previous.slice(eolIndex + 1);
    }
  }
}

async function* linesToMessages(linesAsync) {
  for await (const line of linesAsync) {
    const message = line.substring("data: ".length);

    yield message;
  }
}

async function* streamCompletion(data) {
  yield* linesToMessages(chunksToLines(data));
}

try {
  const completion = await openai.createCompletion(
    {
      model: "text-davinci-003",
      max_tokens: 100,
      prompt: "It was the best of times",
      stream: true,
    },
    { responseType: "stream" }
  );

  for await (const message of streamCompletion(completion.data)) {
    try {
      const parsed = JSON.parse(message);
      const { text } = parsed.choices[0];

      process.stdout.write(text);
    } catch (error) {
      console.error("Could not JSON parse stream message", message, error);
    }
  }

  process.stdout.write("\n");
} catch (error) {
  if (error.response?.status) {
    console.error(error.response.status, error.message);

    for await (const data of error.response.data) {
      const message = data.toString();

      try {
        const parsed = JSON.parse(message);

        console.error("An error occurred during OpenAI request: ", parsed);
      } catch (error) {
        console.error("An error occurred during OpenAI request: ", message);
      }
    }
  } else {
    console.error("An error occurred during OpenAI request", error);
  }
}


danneu commented on August 17, 2024

There are examples above that show how to consume the token stream with events and callbacks, but a potentially simpler alternative is to use an async iterator that yields tokens, even if you ultimately want to just push each token into a stream or callback anyways.

Usage:

const messages = [
    { role: 'user', content: 'what is the meaning of life?' }
]

let answer = ''
for await (const token of streamChatCompletion(messages)) {
    answer += token
    // do something async here if you want
}
console.log('answer finished:', answer)

The implementation is easy because an Axios response set to streaming already lets you consume its chunks with an async iterator:

const { OpenAIApi } = require('openai')
const openai = new OpenAIApi(...)

async function* streamChatCompletion(messages) {
    const response = await openai.createChatCompletion(
        {
            model: 'gpt-3.5-turbo',
            messages,
            stream: true,
        },
        {
            responseType: 'stream',
        },
    )

    for await (const chunk of response.data) {
        const lines = chunk
            .toString('utf8')
            .split('\n')
            .filter((line) => line.trim().startsWith('data: '))

        for (const line of lines) {
            const message = line.replace(/^data: /, '')
            if (message === '[DONE]') {
                return
            }

            const json = JSON.parse(message)
            const token = json.choices[0].delta.content
            if (token) {
                yield token
            }
        }
    }
}

You can see this impl in my Telegram bot: https://github.com/danneu/telegram-chatgpt-bot/blob/24b76f880094b87a5c0a9a42c3571bbecfb12caa/openai.ts#L25

Or, instead of returning an async iterator, you just replace yield token with stream.push(token) or onToken(token) or whatever makes most sense for your app (remembering to change function* back to function).
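
For instance, a sketch of the callback variant described above (onToken is a caller-supplied function, not part of the openai package):

// Same as streamChatCompletion above, but pushing tokens into a callback
// instead of yielding them (note: function, not function*)
async function streamChatCompletionWithCallback(messages, onToken) {
    const response = await openai.createChatCompletion(
        {
            model: 'gpt-3.5-turbo',
            messages,
            stream: true,
        },
        {
            responseType: 'stream',
        },
    )

    for await (const chunk of response.data) {
        const lines = chunk
            .toString('utf8')
            .split('\n')
            .filter((line) => line.trim().startsWith('data: '))

        for (const line of lines) {
            const message = line.replace(/^data: /, '')
            if (message === '[DONE]') {
                return
            }

            const token = JSON.parse(message).choices[0].delta.content
            if (token) {
                onToken(token) // was: yield token
            }
        }
    }
}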


smervs commented on August 17, 2024

Thanks! @smervs currently getting: Property 'on' does not exist on type 'AxiosResponse<CreateCompletionResponse, any>' when trying though - have you had any luck?

can you try this?

res.data.on('data', console.log)


gfortaine commented on August 17, 2024

Here is a fetch-based client fully generated from the SDK auto-generation tool 🎉 cc @schnerd @santimirandarp @blakeross @gtokman @dan-kwiat : #45 (comment)

(Bonus: it is wrapped by @vercel/fetch to provide retries (429, network errors, ...) & DNS caching)

import { createConfiguration, OpenAIApi } from "@fortaine/openai";
import { streamCompletion } from "@fortaine/openai/stream";

import dotenv from "dotenv-flow";
dotenv.config({
  node_env: process.env.APP_ENV || process.env.NODE_ENV || "development",
  silent: true,
});

const configurationOpts = {
  authMethods: {
    apiKeyAuth: {
      accessToken: process.env.OPENAI_API_KEY,
    },
  },
};

const configuration = createConfiguration(configurationOpts);

const openai = new OpenAIApi(configuration);

try {
  const completion = await openai.createCompletion({
    model: "text-davinci-003",
    prompt: "1,2,3,",
    max_tokens: 193,
    temperature: 0,
    stream: true,
  });

  for await (const message of streamCompletion(completion)) {
    try {
      const parsed = JSON.parse(message);
      const { text } = parsed.choices[0];

      process.stdout.write(text);
    } catch (error) {
      console.error("Could not JSON parse stream message", message, error);
    }
  }
  process.stdout.write("\n");
} catch (error) {
  if (error.code) {
    try {
      const parsed = JSON.parse(error.body);
      console.error("An error occurred during OpenAI request: ", parsed);
    } catch (error) {
      console.error("An error occurred during OpenAI request: ", error);
    }
  } else {
    console.error("An error occurred during OpenAI request", error);
  }
}


gfortaine commented on August 17, 2024

@schnerd Please find a PR : #45, as well as an updated example. Comments are welcome 👍 :

http://www.github.com/gfortaine/fortbot

import { Configuration, OpenAIApi } from "@fortaine/openai";
import { streamCompletion } from "@fortaine/openai/stream";

const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(configuration);

try {
  const completion = await openai.createCompletion(
    {
      model: "text-davinci-003",
      max_tokens: 100,
      prompt: "It was the best of times",
      stream: true,
    },
    { responseType: "stream" }
  );

  for await (const message of streamCompletion(completion.data)) {
    try {
      const parsed = JSON.parse(message);
      const { text } = parsed.choices[0];

      process.stdout.write(text);
    } catch (error) {
      console.error("Could not JSON parse stream message", message, error);
    }
  }

  process.stdout.write("\n");
} catch (error) {
  if (error.response?.status) {
    console.error(error.response.status, error.message);

    for await (const data of error.response.data) {
      const message = data.toString();

      try {
        const parsed = JSON.parse(message);

        console.error("An error occurred during OpenAI request: ", parsed);
      } catch (error) {
        console.error("An error occurred during OpenAI request: ", message);
      }
    }
  } else {
    console.error("An error occurred during OpenAI request", error);
  }
}


blakeross commented on August 17, 2024

@gfortaine This solution works great with next.js API endpoints running on localhost. But once you deploy to Vercel, streaming responses via serverless functions are prohibited by AWS Lambda. You can get around this limitation by switching to next.js' experimental new Edge runtime, but then as far as I can tell that doesn't work with axios... which your solution relies on. So I still haven't found a way to actually stream openAI responses via next.js in production. Any ideas?


FaceMr commented on August 17, 2024

Java OkHttpClient:

BufferedSource source = response.body().source();
Buffer buffer = new Buffer();
StringBuilder result = new StringBuilder();
while (!source.exhausted()) {
    long count = response.body().source().read(buffer, 8192);
    // handle data in buffer.
    String r = buffer.readUtf8();
    log.info("result:" + r);
    result.append(r);
    buffer.clear();
}

Example result (there is a lot of data like this):
data: {"id": "cmpl-xxxx", "object": "text_completion", "created": 1672230176, "choices": [{"text": "\u672f", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}
data: {"id": "cmpl-xxxx", "object": "text_completion", "created": 1672230176, "choices": [{"text": "\uff1a", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}


Awendel commented on August 17, 2024

Yes, I also found this strange: sometimes the OpenAI API returns multiple segments of
data: {}
that are not comma separated and are hence hard to parse as JSON.
What I did:
string-replace every "data: {" with ", {" except for the first occurrence (there, just use "{").

Then it can be parsed via JSON.parse, and one can extract all the text parts via .choices[0].text.
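
For illustration, a minimal sketch of that replace-then-parse approach (extractTexts is a made-up helper name; chunk is assumed to be the raw SSE text of one or more data: {...} completion segments):

// Sketch: turn a raw chunk containing several "data: {...}" segments
// into an array of completion texts by rewriting it into one JSON array.
function extractTexts(chunk) {
  const joined = "[" + chunk
    .replace(/data:\s*\[DONE\]/g, "") // drop the terminal [DONE] marker, if present
    .trim()
    .replace(/data:\s*{/g, ",{")      // every "data: {" becomes ",{"
    .replace(/^,/, "") + "]";         // the first segment keeps a plain "{"
  try {
    return JSON.parse(joined).map((obj) => obj.choices[0].text);
  } catch (error) {
    console.error("Could not parse chunk", chunk, error);
    return [];
  }
}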


commented on August 17, 2024

In my use case, streams are more useful for the request data though, so that you can concatenate the results from different requests.

There is no dependency here apart from dotenv.

This is for the response anyway. It uses fetch, which is now built into Node v19 (and previous versions behind experimental flags).

See code
import * as dotenv from 'dotenv';

// I just used a story as a string with backticks
import { text } from './string.mjs';
dotenv.config();

const apiUrl = 'https://api.openai.com/v1/completions';
const apiKey = process.env.OPENAI_API_KEY;

const fetchOptions = {
  method: 'POST',
  headers: {
    Accept: 'application/json',
    'Content-Type': 'application/json',
    Authorization: `Bearer ${apiKey}`,
  },
  body: JSON.stringify({
    model: 'text-davinci-003',
    //queues the model to return a summary, works fine.
    prompt: `Full Text: ${text}
         Summary:`,
    temperature: 0,
    max_tokens: 1000,
    presence_penalty: 0.0,
    stream: true,
    //    stop: ['\n'],
  }),
};

fetch(apiUrl, fetchOptions).then(async (response) => {
  const r = response.body;
  if (!r) throw new Error('No response body');
   
  const d = new TextDecoder('utf8');
  const reader = await r.getReader();
  let fullText = ''
  while (true) {
    const { value, done } = await reader.read();
    if (done) {
      console.log('done');
      break;
    } else {
      const decodedString = d.decode(value);
      console.log(decodedString);
      try {
        //fixes string not json-parseable otherwise
        fullText += JSON.parse(decodedString.slice(6)).choices[0].text;
      } catch (e) {
        // the last line is data: [DONE] which is not parseable either, so we catch that.
        console.log(
          e, '\n\n\n\n',
          'But parsed string is below\n\n\n\n',
        );
        console.log(fullText);
      }
    }
  }
});

Also simplest code without any library:

See code
/* eslint-disable camelcase */
import * as dotenv from 'dotenv';

import { text } from './string.mjs';

//populates `process.env` with .env variables
dotenv.config();

const apiUrl = 'https://api.openai.com/v1/completions';
const apiKey = process.env.OPENAI_API_KEY;

const fetchOptions = {
  method: 'POST',
  headers: {
    Accept: 'application/json',
    'Content-Type': 'application/json',
    Authorization: `Bearer ${apiKey}`,
  },
  body: JSON.stringify({
    model: 'text-davinci-003',
    prompt: `Full Text: ${text}
         Summary:`,
    temperature: 0,
    max_tokens: 1000,
    presence_penalty: 0.0,
    //    stream: true,
    //    stop: ['\n'],
  }),
};

fetch(apiUrl, fetchOptions).then(async (response) => {
  const r = await response.json();
  console.log(r);
});


lennartle commented on August 17, 2024

My take on it, with punctuation detection to prevent response over-spamming

const openAiCompletion = async (messages, onText) => {
    try {
        const response = await fetch('https://api.openai.com/v1/chat/completions', {
            method: 'POST',
            headers: {
                'Authorization': `Bearer ${OPENAI_TOKEN}`,
                'Content-Type': 'application/json',
            },
            body: JSON.stringify({
                messages,
                model: "gpt-3.5-turbo",
                max_tokens: 2048,
                stream: true
            })
        });

        const decoder = new TextDecoder('utf8');
        const reader = response.body.getReader();

        let fullText = ''
        let lastFire = 0

        async function read() {
            const { value, done } = await reader.read();

            if (done) return onText(fullText)

            const delta = decoder.decode(value).match(/"delta":\s*({.*?"content":\s*".*?"})/)?.[1]

            if (delta) {
                const content = JSON.parse(delta).content

                fullText += content

                //Detects punctuation, if yes, fires onText once per .5 sec
                if (/[\p{P}\p{S}]/u.test(content)) {
                    const now = Date.now();

                    if (now - lastFire > 500) {
                        lastFire = now
                        onText(fullText)
                    }
                }
            }

            await read()

        }

        await read()

        return fullText
    } catch (error) {
        return error;
    }
}

use it like this

const aiResponse = await openAiCompletion(prompt, (text) => { 
//update UI or whatever
})
        
//do something with full response
console.log(aiResponse)


YoseptF commented on August 17, 2024

I've no idea why none of the answers posted before worked for me; what ended up working is something similar to what #18 (comment) said.

So in case someone else is still looking for other options, here's what ended up working for me:

 const chat= async (
    prompt: string,
    previousChats: IpreviousChats[],
  ) => {
    const apiKey = window.localStorage.getItem(LOCAL_STORAGE_KEY);
    const url = "https://api.openai.com/v1/chat/completions";

    const xhr = new XMLHttpRequest();
    xhr.open("POST", url);
    xhr.setRequestHeader("Content-Type", "application/json");
    xhr.setRequestHeader("Authorization", "Bearer " + apiKey);

    xhr.onprogress = function(event) {
      console.log("Received " + event.loaded + " bytes of data.");
      console.log("Data: " + xhr.responseText);
      const newUpdates = xhr.responseText
      .replace("data: [DONE]", "")
      .trim()
      .split('data: ')
      .filter(Boolean)

      
      const newUpdatesParsed = newUpdates.map((update) => {
        const parsed = JSON.parse(update);
        return parsed.choices[0].delta?.content || '';
      }
      );

      const newUpdatesJoined = newUpdatesParsed.join('')
      console.log('current message so far',newUpdatesJoined);
    };

    xhr.onreadystatechange = function() {
      if (xhr.readyState === 4) {
        if (xhr.status === 200) {
          console.log("Response complete.");
          console.log("Final data: " + xhr.responseText);
        } else {
          console.error("Request failed with status " + xhr.status);
        }
      }
    };

    const data = JSON.stringify({
      model: currentChatModelRef.current,
      messages: [
        ...previousChats,
        {
          role: "user",
          content: prompt,
        }],
      temperature: 0.5,
      frequency_penalty: 0,
      presence_penalty: 0,
      stream: true,
    });

    xhr.send(data);
  }

(demo video: 2023-03-30.15-11-23.mp4)

good luck everyone :DDD


josephrocca commented on August 17, 2024

I'm using code that's roughly like this:

let response = await fetch("https://api.openai.com/v1/engines/davinci/completions",
  {
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer " + OPENAI_KEY,
    },
    method: "POST",
    body: JSON.stringify({
      prompt: selected,
      temperature: 0.75,
      top_p: 0.95,
      max_tokens: 10,
      stream: true,
      stop: ["\n\n"],
    }),
  }
);

const reader = response.body?.pipeThrough(new TextDecoderStream()).getReader();

while (true) {
  const res = await reader?.read();
  if (res?.done) break;
  console.log(res?.value);
}

Look at the console.log output to see the format - you have to trim the data: part off, and then JSON.parse() it.
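
As a rough sketch, the read loop above could do that trimming and parsing like this (field names as in the completions responses shown elsewhere in this thread):

while (true) {
  const res = await reader?.read();
  if (res?.done) break;

  // A chunk may contain several "data: {...}" lines.
  const lines = (res?.value ?? "").split("\n");
  for (const line of lines) {
    const message = line.replace(/^data: /, "").trim();
    if (!message || message === "[DONE]") continue;
    try {
      const parsed = JSON.parse(message);
      console.log(parsed.choices[0].text);
    } catch (error) {
      console.error("Could not parse line", line, error);
    }
  }
}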


Munkyfoot commented on August 17, 2024

Here's a quick and dirty workaround.

Edit: If you are using NextJS, a better solution can be found here https://vercel.com/blog/gpt-3-app-next-js-vercel-edge-functions.

Server-Side:

// Import the Readable stream module
import { Readable } from "stream";

// Set the response headers
res.setHeader("Content-Type", "text/event-stream");
res.setHeader("Cache-Control", "no-cache");
res.setHeader("Connection", "keep-alive");

// Generate the response using the OpenAI API
const response = await openai.createCompletion({
    prompt: "It was the best of times",
    stream: true,
    ...
}, { responseType: 'stream' });

// Convert the response to a Readable stream (this is a temporary workaround)
const stream = response.data as any as Readable;

// Process the data stream
let streamHead = true; // Flag to indicate whether a message begins the stream or is a continuation
stream.on("data", (chunk) => {
    try {
        // Parse the chunk as a JSON object
        const data = JSON.parse(chunk.toString().trim().replace("data: ", ""));
        console.log(data);
        // Write the text from the response to the output stream
        res.write(JSON.stringify({text: data.choices[0].text, streamHead: streamHead}));
        streamHead = false;
        // Send immediately to allow chunks to be sent as they arrive
        res.flush();
    } catch (error) {
        // End the stream but do not send the error, as this is likely the DONE message from createCompletion
        console.error(error);
        res.end();
    }
});

// Send the end of the stream on stream end
stream.on("end", () => {
    res.end();
});

// If an error is received from the completion stream, send an error message and end the response stream
stream.on("error", (error) => {
    console.error(error);
    res.end(JSON.stringify({ error: true, message: "Error generating response." }));
});

Client-Side:

// Query your endpoint
const res = await fetch('/yourapi/', {...})
// Create a reader for the response body
const reader = res.body.getReader();
// Create a decoder for UTF-8 encoded text
const decoder = new TextDecoder("utf-8");
let result = "";
// Function to read chunks of the response body
const readChunk = async () => {
    return reader.read().then(({ value, done }) => {
        if (!done) {
            const dataString = decoder.decode(value);
            const data = JSON.parse(dataString);
            console.log(data);

            if (data.error) {
                console.error("Error while generating content: " + data.message);
            } else {
                result = data.streamHead ? data.text : result + data.text;
                return readChunk();
            }
        } else {
            console.log("done");
        }
    });
};

await readChunk();

The result variable is updated as the content arrives.


willguest commented on August 17, 2024

I finally managed to get the stream working on Chrome, but not Firefox. The route I picked yields tokens with the async iterator and writes them as the body of the response, which is then picked up as a readable stream.

index.js

setResult('');
  const response = await fetch("/api/generate", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ 
      querySystem: islandSystemInput, 
      queryUser: islandUserInput 
    }),
  });

  const reader = response.body?.pipeThrough(new TextDecoderStream()).getReader();
  
  while (true) {
    const res = await reader?.read();
    if (res?.value?.toString() !== undefined){
      setResult(result => result + res?.value);
    }
    if (res?.done) break; 
  }
}

generate.js

async function* streamChatCompletion(messages) {
  const completion = await openai.createChatCompletion(
      {
          model: 'gpt-4-0314',
          messages: messages,
          max_tokens: 10,
          stream: true,
          stop: ["\n\n"],
      },
      {
          responseType: 'stream',
      },
  )

  for await (const chunk of completion.data) {
      const lines = chunk
          .toString('utf8')
          .split('\n')
          .filter((line) => line.trim().startsWith('data: '))

      for (const line of lines) {
          const message = line.replace(/^data: /, '')
          if (message === '[DONE]') {
              return
          }
          const json = JSON.parse(message)
          const token = json.choices[0].delta.content
          if (token) {
            yield token;
          }
      }
  }
}

my generate.js is mostly vanilla otherwise, just calling the above function with a for await loop
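
A minimal sketch of what that could look like, assuming a Next.js-style API route (the author's actual handler isn't shown; the querySystem/queryUser fields below just mirror the client code above):

// pages/api/generate.js (sketch)
export default async function handler(req, res) {
  const { querySystem, queryUser } = req.body;
  const messages = [
    { role: 'system', content: querySystem },
    { role: 'user', content: queryUser },
  ];

  res.writeHead(200, { 'Content-Type': 'text/plain; charset=utf-8' });

  // Write each token as it arrives so the client can consume the body as a stream.
  for await (const token of streamChatCompletion(messages)) {
    res.write(token);
  }
  res.end();
}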

I hope this is helpful for someone.


yairhaimo commented on August 17, 2024

I created an MIT library that helps you compose and run AI pipelines (including GPT streaming). It supports Vercel Edge Functions out of the box.
Check it out at https://client.aigur.dev


zachkirsch commented on August 17, 2024

Here's another SDK that supports streaming but in a more first class way https://github.com/fern-openai/openai-node


It also has callbacks for onError and onFinish


If you have feedback, please file an issue on that repo! You can make a PR as a proof of concept, but the SDK is autogenerated by Fern so any code changes will have to go in the generator.


dan-kwiat commented on August 17, 2024

@gtokman @blakeross may be useful: https://github.com/dan-kwiat/openai-edge


fracergu commented on August 17, 2024

Here's the custom hook I'm using with React, in case you find it useful.
setInputMessages receives the list of messages from which we expect a completion and triggers the fetch, partialText returns the partial text of the response as it is received in real time, and fullText returns the response once it is complete. The types I use are at the beginning of the file. It still lacks error handling, as it is still under development.
Apologies if there is any bad practice or incorrectness, as I'm fairly new to React.

import { useState, useEffect } from 'react'

enum Role {
  ASSISTANT = 'assistant',
  USER = 'user',
}

type Message = {
  role: Role
  content: string
}

const API_URL = 'https://api.openai.com/v1/chat/completions'
const OPENAI_API_KEY = import.meta.env.VITE_OPENAI_API_KEY
const OPENAI_CHAT_MODEL = 'gpt-3.5-turbo'

const utf8Decoder = new TextDecoder('utf-8')

const decodeResponse = (response?: Uint8Array) => {
  if (!response) {
    return ''
  }

  const pattern = /"delta":\s*({.*?"content":\s*".*?"})/g
  const decodedText = utf8Decoder.decode(response)
  const matches: string[] = []

  let match
  while ((match = pattern.exec(decodedText)) !== null) {
    matches.push(JSON.parse(match[1]).content)
  }
  return matches.join('')
}

export const useStreamCompletion = () => {
  const [partialText, setPartialText] = useState('')
  const [fullText, setFullText] = useState('')
  const [inputMessages, setInputMessages] = useState<Message[]>([])
  const abortController = new AbortController()

  useEffect(() => {
    if (!inputMessages.length) return

    const onText = (text: string) => {
      setPartialText(text)
    }

    const fetchData = async () => {
      try {
        const response = await fetch(API_URL, {
          method: 'POST',
          headers: {
            Authorization: `Bearer ${OPENAI_API_KEY}`,
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            messages: inputMessages,
            model: OPENAI_CHAT_MODEL,
            stream: true,
          }),
          signal: abortController.signal, // assign the abort controller signal to the fetch request
        })

        if (!response.ok) {
          const error = await response.json()
          throw new Error(error.error)
        }

        if (!response.body) throw new Error('No response body')

        const reader = response.body.getReader()

        let fullText = ''

        async function read() {
          const { value, done } = await reader.read()

          if (done) return onText(fullText)

          const delta = decodeResponse(value)

          if (delta) {
            fullText += delta
            onText(fullText.trim())
          }

          await read()
        }

        await read()

        setFullText(fullText)
      } catch (error) {
        console.error(error)
      }
    }

    fetchData()

    return () => {
      abortController.abort()
    }
  }, [inputMessages])

  return { partialText, fullText, setInputMessages }
}


raphaelrk commented on August 17, 2024

This post really blew up 😆 Here's the code I actually wound up using for this; it works on both frontend and backend, but is not yet updated for chat. It's okay code. Rewriting it with generators could be nice.

Usage:

await streamOne(modelName, prompt, onToken, onDone, onError, options);

Source

import { Configuration, OpenAIApi } from 'openai';
const OPENAI_API_KEY = 'sk-...';
const configuration = new Configuration({ apiKey: OPENAI_API_KEY });
const api = new OpenAIApi(configuration);

type OtherOptions = {
  maxTokens?: number;
  temp?: number;
  n?: number;
  stop?: string | string[];
}

type CompleteResponse = {
  responses: string[];
  cost: number;
  tokenUsage: number;
  promptTokens: number;
  completionTokens: number;
};

type CompleteOneResponse = {
  response: string;
  cost: number;
  tokenUsage: number;
  promptTokens: number;
  completionTokens: number;
};

export type OpenAIModel =
  'ada' |
  'babbage' |
  'curie' |
  'davinci' |
  'text-ada-001' |
  'text-babbage-001' |
  'text-curie-001' |
  'text-davinci-001' |
  'text-davinci-002' |
  'text-davinci-003' |
  'code-cushman-001' |
  'code-davinci-002';

export const modelToPrice: Record<OpenAIModel, number> = {
  'ada': 0.0004,
  'babbage': 0.0005,
  'curie': 0.002,
  'davinci': 0.02,
  'text-ada-001': 0.0004,
  'text-babbage-001': 0.0005,
  'text-curie-001': 0.002,
  'text-davinci-001': 0.02,
  'text-davinci-002': 0.02,
  'text-davinci-003': 0.02,
  'code-cushman-001': 0.0,
  'code-davinci-002': 0.0,
};

export async function complete(model: OpenAIModel, prompt: string, otherOptions?: OtherOptions): Promise<CompleteResponse> {
  const price = modelToPrice[model];
  if (price === undefined) throw new Error('Unknown model: ' + model);

  let max_tokens = otherOptions?.maxTokens || 80;
  let temperature = otherOptions?.temp || 0.3;
  let n = otherOptions?.n || 1;
  let best_of = n;
  let stop = otherOptions?.stop || '\n';
  let frequency_penalty = 0.2;
  let presence_penalty = 0.1;

  const res = await api.createCompletion({ model, prompt, max_tokens, temperature, n, best_of, stop, frequency_penalty, presence_penalty, stream: false });

  const tokenUsage = res.data.usage.total_tokens;
  const cost = tokenUsage / 1000 * price;
  const responses = res.data.choices.map(choice => choice.text);
  const promptTokens = res.data.usage.prompt_tokens;
  const completionTokens = res.data.usage.completion_tokens;

  return { responses, cost, tokenUsage, promptTokens, completionTokens };
}

export async function completeOne(model: OpenAIModel, prompt: string, otherOptions?: OtherOptions): Promise<CompleteOneResponse> {
  const res = await complete(model, prompt, otherOptions);
  return { response: res.responses[0], cost: res.cost, tokenUsage: res.tokenUsage, promptTokens: res.promptTokens, completionTokens: res.completionTokens };
}

type OnToken = (token: string) => any;
type OnDone = () => any;
type OnError = (msg: string) => any;

// TODO: estimate prompt tokens / cost including prompt tokens
export async function streamOne(model: OpenAIModel, prompt: string, onToken: OnToken, onDone: OnDone, onError: OnError, otherOptions?: OtherOptions): Promise<void> {
  // verify model and get price
  const price = modelToPrice[model];
  if (price === undefined) throw new Error('Unknown model: ' + model);

  // set options
  let max_tokens = otherOptions?.maxTokens || 80;
  let temperature = otherOptions?.temp || 0.3;
  let n = otherOptions?.n || 1;
  let best_of = n;
  let stop = otherOptions?.stop || '\n';
  let frequency_penalty = 0.2;
  let presence_penalty = 0.1;

  // create stream
  const fetchPromise = fetch(`https://api.openai.com/v1/completions`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${OPENAI_API_KEY}`,
    },
    body: JSON.stringify({ model, prompt, max_tokens, temperature, n, best_of, stop, frequency_penalty, presence_penalty, stream: true }),
  });
  const response = await fetchPromise;
  const reader = response.body.getReader();
  const decoder = new TextDecoder();

  // keep track of tokens
  let concat = '';
  let completionTokenCount = 0;

  // read stream
  let gotReaderDone = false;
  let gotDoneMessage = false;
  while (true) {
    // get next chunk
    const { done, value } = await reader.read();
    if (done) {
      gotReaderDone = true;
      break;
    }
    const text = decoder.decode(value);

    // split chunk into lines
    // todo: there's probs a better way to do this
    const lines = text.split('\n').filter(line => line.trim() !== '');
    for (const line of lines) {
      // remove the data: prefix
      const lineMessage = line.replace(/^data: /, '');

      // if we got the done message, stop
      if (lineMessage === '[DONE]') {
        gotDoneMessage = true;
        break; // return;
      }

      // try to parse the line as JSON, and if it works, get the token and call the callback
      try {
        const parsed = JSON.parse(lineMessage);
        const token = parsed.choices[0].text;
        concat += token;
        completionTokenCount++;
        onToken(token);
      } catch (error) {
        // todo: handle error better -- retry? inform caller?
        console.error(`Could not JSON parse stream message`, { text, lines, line, lineMessage, error });

        try {
          let errorInfo = JSON.parse(text);
          console.error(`Error info`, errorInfo);
          if (errorInfo.message) return onError(errorInfo.message);
          if (errorInfo.error.message) return onError(errorInfo.error.message);
        } catch (error) {
          // ignore
        }
      }
    }
    if (gotDoneMessage) break;
  }

  const cost = completionTokenCount / 1000 * price;
  console.log(`Streamed ${completionTokenCount} tokens. $${cost}`);
  console.log('Final text:', concat);
  onDone();
}


ponytojas commented on August 17, 2024

My take on it, with punctuation detection to prevent response over-spamming

const openAiCompletion = async (messages, onText) => {
    try {
        const response = await fetch('https://api.openai.com/v1/chat/completions', {
            method: 'POST',
            headers: {
                'Authorization': `Bearer ${OPENAI_TOKEN}`,
                'Content-Type': 'application/json',
            },
            body: JSON.stringify({
                messages,
                model: "gpt-3.5-turbo",
                max_tokens: 2048,
                stream: true
            })
        });

        const decoder = new TextDecoder('utf8');
        const reader = response.body.getReader();

        let fullText = ''
        let lastFire = 0

        async function read() {
            const { value, done } = await reader.read();

            if (done) return onText(fullText)

            const delta = decoder.decode(value).match(/"delta":\s*({.*?"content":\s*".*?"})/)?.[1]

            if (delta) {
                const content = JSON.parse(delta).content

                fullText += content

                //Detects punctuation, if yes, fires onText once per .5 sec
                if (/[\p{P}\p{S}]/u.test(content)) {
                    const now = Date.now();

                    if (now - lastFire > 500) {
                        lastFire = now
                        onText(fullText)
                    }
                }
            }

            await read()

        }

        await read()

        return fullText
    } catch (error) {
        return error;
    }
}

use it like this

const aiResponse = await openAiCompletion(prompt, (text) => { 
//update UI or whatever
})
        
//do something with full response
console.log(aiResponse)

I have found some errors when executing that code due to the regex, because sometimes a single chunk contains multiple deltas.

I'll leave my solution here in case it could be useful for someone.

  __cleanResponse = (res) => {
          const deltas = [];
          const splitted = res.split('\n');
          let finished = false;
          for (let i = 0; i < splitted.length; i += 1) {
              try {
                  let test = splitted[i];
                  if (test === 'data: [DONE]' || test.indexOf('data: [DONE]') >= 0) {
                      finished = true;
                      continue;
                  }
                  if (test.length === 0) continue;
                  if (test.startsWith('data:')) test = test.slice(5);
                  test = JSON.parse(test);
                  if (test.choices[0].delta.content) deltas.push(test.choices[0].delta.content);
              } catch (err) {
                  console.error("Error during message clean: " + err);
                  continue;
              }
          }
          return { deltas, finished };
      }

    __readResponse = async (reader, previousText = '', cbStream = () => { }) => {
        let fullText = previousText;
        const decoder = new TextDecoder('utf8');

        const { value, done } = await reader.read();

        if (done) return cbStream(fullText);
        const { deltas, finished } = this.__cleanResponse(decoder.decode(value));
        let inserted = false;
        if (deltas.length > 0) {
            for (let deltaIndex = 0; deltaIndex < deltas.length; deltaIndex += 1) {
                if (!inserted) inserted = true;
                fullText += deltas[deltaIndex];
            }
            if (inserted) cbStream(fullText);
        }
        if (!finished) fullText = await this.__readResponse(reader, fullText, cbStream); // the recursive call already includes fullText, so don't append it again
        return fullText;
    }

  function ask() {
    let fullText = '';
    fetch('https://api.openai.com/v1/chat/completions', {
                method: 'POST',
                headers: {
                    Authorization: "Bearer " + this.apiKey,
                    'Content-Type': 'application/json',
                },
                body: JSON.stringify({
                    messages,
                    model: 'gpt-3.5-turbo',
                    stream: true,
                }),
            }).then(async (res) => {
                const reader = res.body.getReader();
                fullText += await this.__readResponse(reader, fullText, cbStream);
            }).catch((err) => {
                error = true;
                console.error(err);
            });
  }


FurriousFox commented on August 17, 2024

I'm not sure if this is useful for you guys, but here's my modified version of eventsource with added support for setting the method and payload/body, which should be sufficient for a fully featured SSE connection to OpenAI: https://gist.github.com/FurriousFox/f43eaf9645302e51ab01cf0b1853aa4e

something like this should then work

const EventSource = require("./eventsource.js");

let es = new EventSource("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
        "Content-Type": "application/json",
        "Authorization": `Bearer ${api_key}`
    },
    payload: JSON.stringify({
        "model": "gpt-3.5-turbo",
        "messages": [{
            "role": "system",
            "content": instruction
        }, {
            "role": "user",
            "content": prompt
        }],
        stream: true,
    }),
});
es.onmessage = (e) => {
    if (e.data == "[DONE]") {
        es.close();
    } else {
        let delta = JSON.parse(e.data).choices[0].delta.content;
        if (delta) {
            console.log(delta);
        }
    }
};


MatchuPitchu commented on August 17, 2024

I've developed a custom hook in React with TypeScript to stream data from the OpenAI API. This solution is based on several approaches listed here in the comments.
https://github.com/MatchuPitchu/open-ai

The custom hook allows streaming API responses, as well as canceling the response stream and resetting the messages context. Additionally, the project supports code syntax highlighting and the ability to copy code snippets to the clipboard, as well as displaying additional meta data for each response.

import { useCallback, useState } from 'react';
import type { DeepRequired } from '@/utils/type-helpers';

export type GPT35 = 'gpt-3.5-turbo' | 'gpt-3.5-turbo-0301';
export type GPT4 = 'gpt-4' | 'gpt-4-0314' | 'gpt-4-32k' | 'gpt-4-32k-0314';
export type Model = GPT35 | GPT4;

export type ChatRole = 'user' | 'assistant' | 'system' | '';

export type ChatCompletionResponseMessage = {
  content: string; // content of the completion
  role: ChatRole; // role of the person/AI in the message
};

export type ChatMessageToken = ChatCompletionResponseMessage & {
  timestamp: number;
};

export type ChatMessageParams = ChatCompletionResponseMessage & {
  timestamp?: number; // timestamp of completed request
  meta?: {
    loading?: boolean; // completion state
    responseTime?: string; // total elapsed time between completion start and end
    chunks?: ChatMessageToken[]; // returned chunks of completion stream
  };
};

export type ChatMessage = DeepRequired<ChatMessageParams>;

export type ChatCompletionChunk = {
  id: string;
  object: string;
  created: number;
  model: Model;
  choices: {
    delta: Partial<ChatCompletionResponseMessage>;
    index: number;
    finish_reason: string | null;
  }[];
};

type RequestOptions = {
  headers: Record<string, string>;
  method: 'POST';
  body: string;
  signal: AbortSignal;
};

export type OpenAIStreamingProps = {
  apiKey: string;
  model: Model;
};

const OPENAI_COMPLETIONS_URL = 'https://api.openai.com/v1/chat/completions';
const MILLISECONDS_PER_SECOND = 1000;

const updateLastItem = <T>(currentItems: T[], updatedLastItem: T) => {
  const newItems = currentItems.slice(0, -1);
  newItems.push(updatedLastItem);
  return newItems;
};

// transform chat message structure with metadata to a limited shape that OpenAI API expects
const getOpenAIRequestMessage = ({ content, role }: ChatMessage): ChatCompletionResponseMessage => ({
  content,
  role
});

const getOpenAIRequestOptions = (
  apiKey: string,
  model: Model,
  messages: ChatMessage[],
  signal: AbortSignal
): RequestOptions => ({
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${apiKey}`
  },
  method: 'POST',
  body: JSON.stringify({
    model,
    messages: messages.map(getOpenAIRequestMessage),
    // TODO: define value: max_tokens: 100,
    stream: true
  }),
  signal
});

// transform chat message into a chat message with metadata
const createChatMessage = ({ content, role, meta }: ChatMessageParams): ChatMessage => ({
  content,
  role,
  timestamp: Date.now(),
  meta: {
    loading: false,
    responseTime: '',
    chunks: [],
    ...meta
  }
});

export const useOpenAIChatStream = ({ model, apiKey }: OpenAIStreamingProps) => {
  const [messages, setMessages] = useState<ChatMessage[]>([]);
  const [controller, setController] = useState<AbortController | null>(null);
  const [isLoading, setIsLoading] = useState<boolean>(false);

  const resetMessages = () => setMessages([]);

  const abortStream = () => {
    // abort fetch request by calling abort() on the AbortController instance
    if (!controller) return;
    controller.abort();
    setController(null);
  };

  const closeStream = (startTimestamp: number) => {
    // determine the final timestamp, and calculate the number of seconds the full request took.
    const endTimestamp = Date.now();
    const differenceInSeconds = (endTimestamp - startTimestamp) / MILLISECONDS_PER_SECOND;
    const formattedDiff = `${differenceInSeconds.toFixed(2)}s`;

    // update last entry of message list with the final details
    setMessages((prevMessages) => {
      const lastMessage = prevMessages.at(-1);
      if (!lastMessage) return [];

      const updatedLastMessage = {
        ...lastMessage,
        timestamp: endTimestamp,
        meta: {
          ...lastMessage.meta,
          loading: false,
          responseTime: formattedDiff
        }
      };

      return updateLastItem(prevMessages, updatedLastMessage);
    });
  };

  const submitPrompt = useCallback(
    async (newPrompt: ChatMessageParams[]) => {
      // a) no new request if last stream is loading
      // b) no request if empty string as prompt
      if (isLoading || !newPrompt[0].content) return;

      setIsLoading(true);

      const startTimestamp = Date.now();
      const chatMessages: ChatMessage[] = [...messages, ...newPrompt.map(createChatMessage)];

      const newController = new AbortController();
      const signal = newController.signal;
      setController(newController);

      try {
        const response = await fetch(
          OPENAI_COMPLETIONS_URL,
          getOpenAIRequestOptions(apiKey, model, chatMessages, signal)
        );

        if (!response.body) return;
        // read response as data stream
        const reader = response.body.getReader();
        const decoder = new TextDecoder('utf-8');

        // placeholder for next message that will be returned from API
        const placeholderMessage = createChatMessage({ content: '', role: '', meta: { loading: true } });
        let currentMessages = [...chatMessages, placeholderMessage];

        // eslint-disable-next-line no-constant-condition
        while (true) {
          const { done, value } = await reader.read();
          if (done) {
            closeStream(startTimestamp);
            break;
          }
          // parse chunk of data
          const chunk = decoder.decode(value);
          const lines = chunk.split(/(\n){2}/);

          const parsedLines: ChatCompletionChunk[] = lines
            .map((line) => line.replace(/(\n)?^data:\s*/, '').trim()) // remove 'data:' prefix
            .filter((line) => line !== '' && line !== '[DONE]') // remove empty lines and "[DONE]"
            .map((line) => JSON.parse(line)); // parse JSON string

          for (const parsedLine of parsedLines) {
            let chunkContent: string = parsedLine.choices[0].delta.content ?? '';
            chunkContent = chunkContent.replace(/^`\s*/, '`'); // avoid empty line after single backtick
            const chunkRole: ChatRole = parsedLine.choices[0].delta.role ?? '';

            // update last message entry in list with the most recent chunk
            const lastMessage = currentMessages.at(-1);
            if (!lastMessage) return;

            const updatedLastMessage = {
              content: `${lastMessage.content}${chunkContent}`,
              role: `${lastMessage.role}${chunkRole}` as ChatRole,
              timestamp: 0,
              meta: {
                ...lastMessage.meta,
                chunks: [
                  ...lastMessage.meta.chunks,
                  {
                    content: chunkContent,
                    role: chunkRole,
                    timestamp: Date.now()
                  }
                ]
              }
            };

            currentMessages = updateLastItem(currentMessages, updatedLastMessage);
            setMessages(currentMessages);
          }
        }
      } catch (error) {
        if (signal.aborted) {
          console.error(`Request aborted`, error);
        } else {
          console.error(`Error during chat response streaming`, error);
        }
      } finally {
        setController(null); // reset AbortController
        setIsLoading(false);
      }
    },
    [apiKey, isLoading, messages, model]
  );

  return { messages, submitPrompt, resetMessages, isLoading, abortStream };
};


schnerd commented on August 17, 2024

Unfortunately streaming is not currently supported by this library 😢

I'm not sure if the SDK auto-generation tool we use (openai-generator) is able to support event streams. Will have to do more research.

The python openai package does support it: https://pypi.org/project/openai/

If anyone knows of a good way to consume server-sent events in Node (that also supports POST requests), please share!


LasseSander commented on August 17, 2024

Thanks! @smervs currently getting: Property 'on' does not exist on type 'AxiosResponse<CreateCompletionResponse, any>' when trying though - have you had any luck?

from openai-node.

gfortaine avatar gfortaine commented on August 17, 2024 2

Many thanks for this very insightful discussion 👍

As a side note, it looks like one could consume Server-Sent Events in Node while also supporting POST requests (even if that is not spec compliant, given that only GET requests should be allowed) cc @schnerd :

@microsoft/fetch-event-source

launchdarkly-eventsource

However, it appears that we would lose all the benefits of the SDK auto-generation tool. Moreover, it seems that the only TS generator supporting streams at the time of writing is the axios one (typescript-fetch doesn't expose a method to consume the body as a stream).

Hence, @smervs' answer is perfectly valid and should be the accepted one. However, we could enhance it, especially regarding the parser, because a few options exist. For example, if we take the one from a customized @microsoft/fetch-event-source (note: the package has been specially retrofitted for this purpose by exporting ./parse), here is the result:

http://www.github.com/gfortaine/fortbot

import { Configuration, OpenAIApi } from "openai";
import * as parse from "@fortaine/fetch-event-source/parse";

const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(configuration);

const prompt = "Hello world";
// https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them
const max_tokens = 4097 - prompt.length;

const completion = await openai.createCompletion(
  {
    model: "text-davinci-003",
    max_tokens,
    prompt,
    stream: true,
  },
  { responseType: "stream" }
);

completion.data.on(
  "data",
  parse.getLines(
    parse.getMessages((event) => {
      const { data } = event;

      // https://beta.openai.com/docs/api-reference/completions/create#completions/create-stream
      if (data === "[DONE]") {
        process.stdout.write("\n");
        return;
      }

      const { text } = JSON.parse(data).choices[0];
      process.stdout.write(text);
    })
  )
);

from openai-node.

shawnswed avatar shawnswed commented on August 17, 2024 2

Hi. Thanks for the great code, @schnerd. It works great in straight Node.js, but in React it throws a 'res.data.on is not a function' error. Maybe something to do with Webpack. Any insight would be appreciated. Thanks again.

from openai-node.

shawnswed avatar shawnswed commented on August 17, 2024 2

Hi everyone. @smervs' solution works great with straight Node.js, but in React it throws a 'res.data.on() is not a function' error. Maybe something to do with Webpack. Any insight would be appreciated. Thanks again.

from openai-node.

Munkyfoot avatar Munkyfoot commented on August 17, 2024 2

Here is a much better solution than the hacky one I posted earlier: https://vercel.com/blog/gpt-3-app-next-js-vercel-edge-functions

I found that you need to modify the return new Response(stream) line to return new Response(stream, { headers: { 'Content-Type': 'text/event-stream' } }) in the code from the Edge example to get it to stream the text as it comes in.
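
For anyone who wants to see that change in context, here is a minimal sketch of an Edge route that simply proxies the raw event stream (the route shape, model, and parameters are illustrative, and unlike the Vercel example it skips parsing the SSE into plain text):

export const config = { runtime: 'edge' };

export default async function handler(req) {
  const { prompt } = await req.json();

  // Forward the request to OpenAI with stream: true
  const upstream = await fetch('https://api.openai.com/v1/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: 'text-davinci-003',
      prompt,
      max_tokens: 200,
      stream: true,
    }),
  });

  // Pass the SSE bytes straight through; the Content-Type header is the part
  // mentioned above that makes the text stream in as it arrives.
  return new Response(upstream.body, {
    headers: { 'Content-Type': 'text/event-stream' },
  });
}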

from openai-node.

leducgiachoang avatar leducgiachoang commented on August 17, 2024 2

I created an MIT library that helps you compose and run AI pipelines (including GPT streaming). It supports Vercel Edge Functions out of the box. Check it out at https://client.aigur.dev

@yairhaimo How would I integrate this into a plain Vue.js application using JavaScript?

from openai-node.

drorm avatar drorm commented on August 17, 2024 2

I released https://github.com/drorm/gish which has a full example of streaming. If you just want to see it in action, just click on the screencast in the README.
The code is at https://github.com/drorm/gish/blob/c88d39fcc97150d3107cf84eca3051fa3b18cd14/src/LLM.ts#L103 , based on a lot of the info in here. Thank you. My license is MIT, so steal at your pleasure :-).

from openai-node.

arthcmr avatar arthcmr commented on August 17, 2024 2

Nice work @justinmahar your solution is super easy to use and works great with Next.js and React apps 💯

from openai-node.

atonamy avatar atonamy commented on August 17, 2024 2

I finally managed to get the stream working on Chrome, but not Firefox. The route I picked yields tokens with the async iterator and writes them as the body of the response, and that is picked up as a readable stream.

index.js

setResult('');
  const response = await fetch("/api/generate", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ 
      querySystem: islandSystemInput, 
      queryUser: islandUserInput 
    }),
  });

  const reader = response.body?.pipeThrough(new TextDecoderStream()).getReader();
  
  while (true) {
    const res = await reader?.read();
    if (res?.value?.toString() !== undefined){
      setResult(result => result + res?.value);
    }
    if (res?.done) break; 
  }
}

generate.js

async function* streamChatCompletion(messages) {
  const completion = await openai.createChatCompletion(
      {
          model: 'gpt-4-0314',
          messages: messages,
          max_tokens: 10,
          stream: true,
          stop: ["\n\n"],
      },
      {
          responseType: 'stream',
      },
  )

  for await (const chunk of completion.data) {
      const lines = chunk
          .toString('utf8')
          .split('\n')
          .filter((line) => line.trim().startsWith('data: '))

      for (const line of lines) {
          const message = line.replace(/^data: /, '')
          if (message === '[DONE]') {
              return
          }
          const json = JSON.parse(message)
          const token = json.choices[0].delta.content
          if (token) {
            yield token;
          }
      }
  }
}

my generate.js is mostly vanilla otherwise, just calling the above function with a for await loop

I hope this is helpful for someone.

It will work in Firefox if you set a Content-Type header on the response of the /api/generate endpoint; this is a known issue that still hasn't been fixed.
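
For example, a sketch of how the /api/generate handler above could set that header before writing tokens out (the handler shape is illustrative; streamChatCompletion is the generator from the snippet above):

export default async function handler(req, res) {
  const { querySystem, queryUser } = req.body;

  // The explicit Content-Type is the fix mentioned above: without it,
  // Firefox buffers the whole response instead of streaming it.
  res.setHeader('Content-Type', 'text/event-stream; charset=utf-8');
  res.setHeader('Cache-Control', 'no-cache, no-transform');

  for await (const token of streamChatCompletion([
    { role: 'system', content: querySystem },
    { role: 'user', content: queryUser },
  ])) {
    res.write(token);
  }
  res.end();
}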

from openai-node.

zachariahtimothy avatar zachariahtimothy commented on August 17, 2024 2

Inspired by @YoseptF I was able to accomplish the same using just the openai SDK. This utilized Typescript, React, and Zustand.

type ConversationMessage = CreateChatCompletionRequest["messages"][0] & {
  id?: string;
};

export interface ChatCompletionSlice {
  conversationMessages: ConversationMessage[];
  createChatCompletion: (
    request: Omit<CreateChatCompletionRequest, "model">
  ) => Promise<void>;
  resetConversation: () => void;
}

export const createChatCompletionSlice: StateCreator<
  MainSlice & ChatCompletionSlice,
  [],
  [],
  ChatCompletionSlice
> = (set, get) => ({
  conversationMessages: [],
  createChatCompletion: async (request) => {
    const { messages: requestMessages, stream, ...restRequest } = request;
    // Add users message in
    set({
      conversationMessages: get().conversationMessages.concat(requestMessages),
    });
    const messages = get().conversationMessages.map(
      ({ id, ...restMessage }) => restMessage
    );
    const requestData: CreateChatCompletionRequest = {
      messages,
      stream,
      ...restRequest,
      model: get().selectedModelId,
    };

    if (stream) {
      openAi.createChatCompletion(requestData, {
        onDownloadProgress(event: ProgressEvent) {
          const target = event.target as XMLHttpRequest;
          const newUpdates = target.responseText
            .replace("data: [DONE]", "")
            .trim()
            .split("data: ")
            .filter(Boolean);
          let id = "";
          const newUpdatesParsed: string[] = newUpdates.map((update) => {
            const parsed = JSON.parse(update);
            id = parsed.id;
            return parsed.choices[0].delta?.content || "";
          });
          const newUpdatesJoined = newUpdatesParsed.join("");
          const existingMessages = get().conversationMessages.map((x) => x);
          const existingMessageIndex = existingMessages.findLastIndex(
            (x) => x.role === "assistant" && x.id === id
          );

          if (existingMessageIndex !== -1) {
            existingMessages[existingMessageIndex].content = newUpdatesJoined;
            set({
              conversationMessages: existingMessages,
            });
          } else {
            set({
              conversationMessages: existingMessages.concat([
                {
                  role: "assistant",
                  content: newUpdatesJoined,
                  id,
                },
              ]),
            });
          }
        },
      });
    } else {
      const response = await openAi.createChatCompletion(requestData);

      if (response.data) {
        const newMessages: CreateChatCompletionRequest["messages"] =
          response.data.choices
            .filter((x) => x.message !== undefined)
            .map((x) => ({
              role: x.message!.role,
              content: x.message!.content,
            }));
        set({
          conversationMessages: get().conversationMessages.concat(newMessages),
        });
      }
    }
  },
  resetConversation: () => {
    set({ conversationMessages: [] });
  },
});

from openai-node.

mattgabor avatar mattgabor commented on August 17, 2024 1

@smervs your code is working for me, but it logs as

<Buffer 64 61 74 61 3a 20 7b 22 69 64 22 3a 20 22 63 6d 70 6c 2d 36 4a 6e 56 35 4d 70 4d 41 44 4f 41 61 56 74 50 64 30 56 50 72 45 42 4f 62 34 48 54 6c 22 2c ... 155 more bytes>

Do you know how to parse this response?

from openai-node.

blakeross avatar blakeross commented on August 17, 2024 1

@gfortaine I've got it working using fetch directly instead of the openAI lib, but I believe there's a bug with chunksToLine. It appears to assume that chunks will be >= 1 line, but a chunk can actually be part of a line. @rauschma's original implementation addresses this.
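
For reference, a minimal sketch of the buffering that fixes this (the name chunksToLines and the generator shape are illustrative, in the spirit of @rauschma's version): carry any trailing partial line over to the next chunk instead of assuming chunks end on line boundaries.

async function* chunksToLines(chunksAsync) {
  let buffer = '';
  for await (const chunk of chunksAsync) {
    buffer += chunk.toString('utf8');
    const lines = buffer.split('\n');
    // The last element may be an incomplete line; keep it for the next chunk.
    buffer = lines.pop();
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed.length > 0) yield trimmed;
    }
  }
  if (buffer.trim().length > 0) yield buffer.trim();
}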

from openai-node.

xithalius avatar xithalius commented on August 17, 2024 1

@shawnswed You can use onDownloadProgress which you can pass to the createCompletion options. You can use @schnerd 's snippet on progressEvent.currentTarget.response.
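
Roughly, a sketch of that approach (assuming axios 0.x in the browser, where the progress event's currentTarget is the underlying XMLHttpRequest; model and prompt are illustrative):

const res = await openai.createCompletion(
  {
    model: 'text-davinci-003',
    prompt: 'It was the best of times',
    max_tokens: 100,
    stream: true,
  },
  {
    onDownloadProgress: (progressEvent) => {
      // .response holds every "data: ..." line received so far,
      // so re-run the parsing from @schnerd's snippet over it each time.
      const text = progressEvent.currentTarget.response;
      const lines = text.split('\n').filter((line) => line.trim() !== '');
      for (const line of lines) {
        const message = line.replace(/^data: /, '');
        if (message === '[DONE]') continue;
        const parsed = JSON.parse(message);
        console.log(parsed.choices[0].text);
      }
      // Note: the whole response so far is re-parsed on every event,
      // so in practice you'd track what you've already consumed.
    },
  }
);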

from openai-node.

xithalius avatar xithalius commented on August 17, 2024 1

Please move the conversation about the Aigur Client elsewhere.

Note for those using the new ChatGPT model in combination with streaming. The snippet from @schnerd has to be updated from parsed.choices[0].text to parsed.choices[0].delta.content. The typings of CreateChatCompletionResponseChoicesInner do not match when using streaming.

from openai-node.

fracergu avatar fracergu commented on August 17, 2024 1

Related to @ponytojas' solution: I have found that most of the time the first response contains several token blocks, so when the regular expression is applied it only takes the first one and discards the others.

debug image

I've added this function for decoding:

const utf8Decoder = new TextDecoder('utf-8')

const decodeResponse = (response?: Uint8Array) => {
  if (!response) {
    return ''
  }

  const pattern = /"content"\s*:\s*"([^"]*)"/g
  const decodedText = utf8Decoder.decode(response)
  const matches: string[] = []

  let match
  while ((match = pattern.exec(decodedText)) !== null) {
    matches.push(match[1])
  }

  return matches.join('')
}

And used it in the read() function of that approach, also removing the JSON.parse():

  ...
  async function read() {
    const { value, done } = await reader.read()

    if (done) return onText(fullText)

    const delta = decodeResponse(value)

    if (delta) {
      fullText += delta

      //Detects punctuation, if yes, fires onText once per .5 sec
      if (/[\p{P}\p{S}]/u.test(delta)) {
        const now = Date.now()

        if (now - lastFire > 500) {
          lastFire = now
          onText(fullText)
        }
      }
    }

    await read()
  }
  ...

Now I'm getting all the tokens:

debug pic 2

I hope you find it helpful

from openai-node.

shezhangzhang avatar shezhangzhang commented on August 17, 2024 1

@darknoon Yes, it works. But I created an issue with it, because it didn't handle the error chunk.
Nutlope/twitterbio#25

from openai-node.

Yafaa avatar Yafaa commented on August 17, 2024 1

The example with openai-ext works fine, but with this https://github.com/openai/openai-node/issues/18#issuecomment-1369996933 and #18 (comment) it throws TypeError: completion.data.on is not a function.

from openai-node.

MentalGear avatar MentalGear commented on August 17, 2024 1

OpenAI Streams Library

There's a new node.js lib specific for streams, which should render this problem solved. (Make sure to check them out, and maybe make a PR regarding whisper.)

https://github.com/SpellcraftAI/openai-streams

from openai-node.

keraf avatar keraf commented on August 17, 2024

If anyone knows of a good way to consume server-sent events in Node (that also supports POST requests), please share!

This can be done with the request method of Node's https API. You can create a request with the options you want (such as POST as a method) and then read the streamed data using the data event on the response. You can also use the close event to know when the request has finished.
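
A minimal sketch of that approach (endpoint and payload as in the examples above; error handling kept short):

const https = require('https');

const body = JSON.stringify({
  model: 'text-davinci-003',
  prompt: 'It was the best of times',
  max_tokens: 100,
  stream: true,
});

const req = https.request(
  {
    hostname: 'api.openai.com',
    path: '/v1/completions',
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      'Content-Length': Buffer.byteLength(body),
    },
  },
  (res) => {
    // Each 'data' event carries one or more "data: {...}" lines (or "data: [DONE]").
    res.on('data', (chunk) => process.stdout.write(chunk.toString('utf8')));
    res.on('close', () => console.log('\n[stream closed]'));
  }
);

req.on('error', (err) => console.error(err));
req.write(body);
req.end();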

from openai-node.

schnerd avatar schnerd commented on August 17, 2024

Thanks @keraf, we'll try to look into getting this working soon.

from openai-node.

Awendel avatar Awendel commented on August 17, 2024

I second this; the streaming experience is currently not good and only seems to return all chunks in bulk instead of as they come in.

This is especially problematic with large responses, where it takes a long time to finish - a much better user experience would be to show early tokens as they come in - really just being able to match Playground UX.

A pure HTTP example using request / curl would also be fine for now; I would be happy to create a higher-level utility function once I see a working example.

from openai-node.

gtokman avatar gtokman commented on August 17, 2024

@blakeross do you have any sample code on how you got it to work with next.js and vercel? Wouldn't the lambda finish if you sent a response back to the client?

from openai-node.

blakeross avatar blakeross commented on August 17, 2024

@gtokman it works if you use Vercel's new Edge runtime functions

from openai-node.

DerBasler avatar DerBasler commented on August 17, 2024

@shawnswed I am facing the same issue:
Property 'on' does not exist on type 'CreateCompletionResponse'
🤔 I assume that we are all using "openai": "^3.1.0".
I saw the PR from @gfortaine (#45), so hopefully it will be merged soon.
In the meantime I will try to trick TS into ignoring the type and see if it works anyway. I hope I remember to update you ^^

from openai-node.

shawnswed avatar shawnswed commented on August 17, 2024

Thanks, DerBasler. Please keep me in the loop.

from openai-node.

microsoftbuild avatar microsoftbuild commented on August 17, 2024

Thanks for the neat implementation @schnerd

I am using this with the listFineTuneEvents() and getting similar error as reported by @DerBasler :
Property 'on' does not exist on type 'ListFineTuneEventsResponse'.

Currently on "openai": "^3.1.0"

from openai-node.

smadikanti avatar smadikanti commented on August 17, 2024

Running into: The provided value 'stream' is not a valid enum value of type XMLHttpRequestResponseType.

from openai-node.

MarkoTelek avatar MarkoTelek commented on August 17, 2024

Here's a quick and dirty workaround.

Edit: If you are using NextJS, a better solution can be found here https://vercel.com/blog/gpt-3-app-next-js-vercel-edge-functions.

Server-Side:

// Import the Readable stream module
import { Readable } from "stream";

// Set the response headers
res.setHeader("Content-Type", "text/event-stream");
res.setHeader("Cache-Control", "no-cache");
res.setHeader("Connection", "keep-alive");

// Generate the response using the OpenAI API
const response = await openai.createCompletion({
    prompt: "It was the best of times",
    stream: true,
    ...
}, { responseType: 'stream' });

// Convert the response to a Readable stream (this is a temporary workaround)
const stream = response.data as any as Readable;

// Process the data stream
let streamHead = true; // Flag to indicate whether a message begins the stream or is a continuation
stream.on("data", (chunk) => {
    try {
        // Parse the chunk as a JSON object
        const data = JSON.parse(chunk.toString().trim().replace("data: ", ""));
        console.log(data);
        // Write the text from the response to the output stream
        res.write(JSON.stringify({text: data.choices[0].text, streamHead: streamHead}));
        streamHead = false;
        // Send immediately to allow chunks to be sent as they arrive
        res.flush();
    } catch (error) {
        // End the stream but do not send the error, as this is likely the DONE message from createCompletion
        console.error(error);
        res.end();
    }
});

// Send the end of the stream on stream end
stream.on("end", () => {
    res.end();
});

// If an error is received from the completion stream, send an error message and end the response stream
stream.on("error", (error) => {
    console.error(error);
    res.end(JSON.stringify({ error: true, message: "Error generating response." }));
});

Client-Side:

// Query your endpoint
const res = await fetch('/yourapi/', {...})
// Create a reader for the response body
const reader = res.body.getReader();
// Create a decoder for UTF-8 encoded text
const decoder = new TextDecoder("utf-8");
let result = "";
// Function to read chunks of the response body
const readChunk = async () => {
    return reader.read().then(({ value, done }) => {
        if (!done) {
            const dataString = decoder.decode(value);
            const data = JSON.parse(dataString);
            console.log(data);

            if (data.error) {
                console.error("Error while generating content: " + data.message);
            } else {
                result = data.streamHead ? data.text : result + data.text;
                return readChunk();
            }
        } else {
            console.log("done");
        }
    });
};

await readChunk();

The result variable is updated as the content arrives.

Hey. How do I get the flush function on the NextApiResponse? It works when I run it on my machine, but when I deploy to Vercel, it throws an error because there's no flush function on the NextApiResponse.

from openai-node.

Munkyfoot avatar Munkyfoot commented on August 17, 2024

Hey. How do I get the flush function on the NextApiResponse? It works when I run it on my machine, but when I deploy to Vercel, it throws an error because there's no flush function on the NextApiResponse.

I highly recommend using this implementation instead. https://vercel.com/blog/gpt-3-app-next-js-vercel-edge-functions

There's a version for a standard Next api route and one that uses Edge. If your project allows, I highly recommend using the Edge version as it will save on resources and comes with some other advantages.

from openai-node.

MarkoTelek avatar MarkoTelek commented on August 17, 2024

Hey. How do I get the flush function on the NextApiResponse? It works when I run it on my machine, but when I deploy to Vercel, it throws an error because there's no flush function on the NextApiResponse.

I highly recommend using this implementation instead. https://vercel.com/blog/gpt-3-app-next-js-vercel-edge-functions

There's a version for a standard Next api route and one that uses Edge. If your project allows, I highly recommend using the Edge version as it will save on resources and comes with some other advantages.

I've been trying that one as well. But it seems like these edge functions run on the client side? I can't get them to run Prisma or a database, nor do I see a way to use useSession with them. I couldn't even get LevelDB to run. How do I distinguish between users calling the API without a database in an edge function, or am I missing something? I remember reading somewhere that a lot of npm packages are not supported in edge functions, but I wasn't able to find the source for it.

from openai-node.

maceip avatar maceip commented on August 17, 2024

But it seems like these edge functions run on client side? Because I can't get it to run prisma/database, nor do I see a way to use useSession with it.

use https://www.npmjs.com/package/@auth/core, the web compatible version of nextauth (useSession is not included)

Edge functions are technically server-side but don't support most Node APIs.

from openai-node.

yairhaimo avatar yairhaimo commented on August 17, 2024

@yairhaimo How would I integrate this into a plain Vue.js application using JavaScript?

I'm not really familiar with Vue, so I can't give exact instructions, but if you want to run the Pipelines on the frontend you just need to compose them (check out the "getting started" section in the docs) and call them using the invoke method (also in "getting started"). If you want to move it to the server side (invoking from the client but running on the server), it depends on which server and infrastructure you're using.

Just be aware that if you're running it on the client side, your OpenAI key will be exposed and might be abused by people.

from openai-node.

shitianfang avatar shitianfang commented on August 17, 2024

I created an MIT library that helps you compose and run AI pipelines (including GPT streaming). It supports Vercel Edge Functions out of the box. Check it out at https://client.aigur.dev

@yairhaimo Will there be an execution time limit on the Vercel free plan, such as 10 seconds, or does streaming get around that limit?

from openai-node.

yairhaimo avatar yairhaimo commented on August 17, 2024

I created an MIT library that helps you compose and run AI pipelines (including GPT streaming). It supports Vercel Edge Functions out of the box. Check it out at https://client.aigur.dev

@yairhaimo Will there be an execution time limit on the Vercel free plan, such as 10 seconds, or does streaming get around that limit?

Right now the Vercel helper function returns the response only when the Pipeline finishes executing, so the 10 second limit on the Hobby plan might be an issue. I'll change it to return the response right away if it's a streaming Pipeline, so the 10 second timeout won't be an issue.
Meanwhile you can do that yourself if you don't use the Vercel helper and don't want to wait for a fix.

from openai-node.

revmischa avatar revmischa commented on August 17, 2024

I just published a really basic completion streamer module: https://www.npmjs.com/package/openai-stream-mini
Has no dependencies, uses built-in node 18 or browser fetch

Give it a try

from openai-node.

raphaelrk avatar raphaelrk commented on August 17, 2024

Also for token estimation I've been using:

tokenizer.ts

// Name:
//   tokenizer.ts
//
// Description:
//   Tokenizes a string into a list of tokens
//   Uses the python tokenizer.py script via the python_util.ts wrapper
//   Supports string and string[] inputs
//
// Example usage:
//   import { tokenize } from './tokenizer';
//
//   let result1 = await tokenize('hello world');
//   console.log(result1); // ['hello', ' world']
//
//   let result2 = await tokenize(['hello world', 'goodbye world']);
//   console.log(result2); // [['hello', ' world'], ['good', 'bye', ' world']]

import { PythonShellInstance } from './python_util';

const tokenizerPyshell = new PythonShellInstance('tokenizer.py');
const chatGPTtokenizerPyshell = new PythonShellInstance('chat_gpt_tokenizer.py');
export type TokenizerEnum = 'cl100k_base' | 'chat_gpt';

export async function tokenizeOne(msg: string, tokenizer: TokenizerEnum = "cl100k_base"): Promise<string[]> {
  let pyshell = tokenizer === 'cl100k_base' ? tokenizerPyshell : chatGPTtokenizerPyshell;
  const { tokens } = await pyshell.sendJsonToPythonAndParseOne<{ tokens: string[] }>({ input: msg });
  return tokens;
}

export async function tokenizeMany(msgs: string[], tokenizer: TokenizerEnum ="cl100k_base"): Promise<string[][]> {
  let results : string[][] = [];
  for (let msg of msgs) {
    results.push(await tokenizeOne(msg, tokenizer));
  }
  return results;
}

export async function tokenize<T extends string | string[]>(msg: T, tokenizer: TokenizerEnum ="cl100k_base"): Promise<T extends string ? string[] : string[][]> {
  if (typeof msg === 'string') {
    return await tokenizeOne(msg, tokenizer) as T extends string ? string[] : string[][];
  } else {
    return await tokenizeMany(msg, tokenizer) as T extends string ? string[] : string[][];
  }
}

python_util.ts (requires python-shell -- npm i python-shell)

// Name:
//   python_util.ts
//
// Description:
//   Utility functions for interacting with python scripts
//   Wraps the python-shell library
//   Given a python script, creates a singleton python shell
//   Sends messages to the python shell and waits for a response
//   Parses the response as JSON
//
// Example usage:
//   import { PythonShellInstance } from './python_util';
//   let pyshell = new PythonShellInstance('tokenizer.py');
//   let result = await pyshell.sendToPythonAndParseOne('hello world');
//   console.log(result);
//
// Note:
//   at some point need to automate this,
//   but be sure to run these commands in the ./py directory:
//     source venv/bin/activate.fish
//     pip install -r requirements.txt

import { PythonShell } from 'python-shell';

export class PythonShellInstance {
  private pyshell: PythonShell;
  private pyshellResolveQueue: ((msg: string) => void)[] = [];

  constructor(scriptName: string) {
    this.pyshell = new PythonShell(scriptName, {
      mode: 'text',
      pythonPath: './py/venv/bin/python',
      pythonOptions: ['-u'],
      scriptPath: './py',
    });

    // when we get a message from python, run the corresponding callback
    this.pyshell.on('message', msg => {
      console.log("Got message from python:", { msg });
      let resolve = this.pyshellResolveQueue.shift();
      if (resolve) resolve(msg);
    });

    // log other events:
    // - close
    // - stderr
    // - pythonError
    // - error (error spawning, killing, or messaging the process)
    this.pyshell.on('close', () => console.log("Python shell closed"));
    this.pyshell.on('stderr', err => console.log("Got stderr from python:", { err }));
    this.pyshell.on('pythonError', err => console.log("Got stderr from python:", { err }));
    this.pyshell.on('error', err => console.log("Got error from python:", { err }));
  }

  // function to send a json message to python and wait for a response
  async sendJsonToPythonAndParseOne<T>(json: object): Promise<T> {
    console.log("Sending json to python:", { json });
    return new Promise(resolve => {
      this.pyshellResolveQueue.push(msg => resolve(JSON.parse(msg)));
      this.pyshell.send(JSON.stringify(json));
    });
  }

  // function to send a string message to python and wait for a response
  async sendTextToPythonAndParseOne<T>(msg: string): Promise<T> {
    console.log("Sending message to python:", { msg });
    return new Promise(resolve => {
      this.pyshellResolveQueue.push(msg => resolve(JSON.parse(msg)));
      this.pyshell.send(msg);
    });
  }
}

requirements.txt

tiktoken==0.2.0

tokenizer.py

#######################################################
# Name:
#   tokenizer.py
#
# Description:
#   tokenizer.py reads a string from stdin, encodes it, and prints the tokens to stdout
#   the input string is expected to be a JSON object with a single key "input"
#   the output is a JSON object with a single key "tokens"
#   the value of "tokens" is an array of strings, each string is a token
#
# Example:
#   $ echo '{"input": "hello world"}' | python tokenizer.py
#   {"tokens": ["hello", " world"]}
#
#######################################################

# import openai's tiktoken library and json
import tiktoken
import json

# initialize the tokenizer
enc = tiktoken.get_encoding("cl100k_base")

# loop: read from stdin, tokenize, print to stdout
while True:
    # try to read a line from stdin
    try:
        raw_line = input()
    except EOFError:
        break

    # parse the line as JSON
    line_json = json.loads(raw_line)

    # get the input string
    input_str = line_json["input"]

    # tokenize the input string
    tokens = enc.encode(input_str)

    # convert the tokens to a list of strings
    tokens_str = [enc.decode([t]) for t in tokens]

    # create dict with the tokens, and print it as JSON
    output_json = {"tokens": tokens_str}
    print(json.dumps(output_json))

chat_gpt_tokenizer.py

#######################################################
# Name:
#   chat_gpt_tokenizer.py
#
# Description:
#   tokenizer.py reads a string from stdin, encodes it, and prints the tokens to stdout
#   the input string is expected to be a JSON object with a single key "input"
#   the output is a JSON object with a single key "tokens"
#   the value of "tokens" is an array of strings, each string is a token
#
# Example:
#   $ echo '{"input": "hello world"}' | python tokenizer.py
#   {"tokens": ["hello", " world"]}
#
#######################################################

# import openai's tiktoken library and json
import tiktoken
import json

# initialize the tokenizer
cl100k_base = tiktoken.get_encoding("cl100k_base")
enc = tiktoken.Encoding(
    name="chat-davinci-003",
    pat_str=cl100k_base._pat_str,
    mergeable_ranks=cl100k_base._mergeable_ranks,
    special_tokens={
        **cl100k_base._special_tokens,
        "<|im_start|>": 100264,
        "<|im_end|>": 100265,
        "<|im_sep|>": 100266,
    }
)

# test
tokens = enc.encode(
    "<|im_start|>user\nHello<|im_end|><|im_start|>assistant",
    allowed_special={"<|im_start|>", "<|im_end|>"},
)
assert len(tokens) == 7
assert tokens == [100264, 882, 198, 9906, 100265, 100264, 78191]

# loop: read from stdin, tokenize, print to stdout
while True:
    # try to read a line from stdin
    try:
        raw_line = input()
    except EOFError:
        break

    # parse the line as JSON
    line_json = json.loads(raw_line)

    # get the input string
    input_str = line_json["input"]

    # tokenize the input string
    tokens = enc.encode(
        input_str,
        allowed_special={"<|im_start|>", "<|im_end|>"},
    )

    # convert the tokens to a list of strings
    tokens_str = [enc.decode([t]) for t in tokens]

    # create dict with the tokens, and print it as JSON
    output_json = {"tokens": tokens_str}
    print(json.dumps(output_json))

from openai-node.

syonfox avatar syonfox commented on August 17, 2024

Sick, @raphaelrk, thanks, this is what I was looking for. Check out https://github.com/syonfox/GPT-3-Encoder/tree/GPToken for a JS/TS implementation of a GPT token encoder. One todo item is to validate it against the reference Python implementation, but it works for estimation in the browser :)

Feel free to open an issue there. Eventually the project will surpass that repo, but I want the demo_app to be a decent base.

I am putting together a more proper demo.

One open question is whether anyone knows how n / best_of work with streaming.
I have found that logprobs: 3 does work, but the others don't seem to have much of an effect.

Also, it seems that different models have slight variations in how the response is constructed.

Is this expected, or do all the models use a common, standardized interface for the completions API?

Give me the code! Node 18 fetch with readable streams is needed. I couldn't find a proper polyfill (it needs readable stream support), but using a newer Node was easiest.

First of all, thanks for the great starting point. I think this small lib might be a good base for some other stuff; just rip out the complexity you don't need. It still needs a little refining, but I can update it here, or keep an eye on the demo_app.

nano streamOne.js

// https://github.com/openai/openai-node/issues/18#issuecomment-1463774674
// import { Configuration, OpenAIApi } from 'openai';
// const OPENAI_API_KEY = 'sk-...';
// const configuration = new Configuration({ apiKey: OPENAI_API_KEY });
// const api = new OpenAIApi(configuration);
// type OtherOptions = {
//   maxTokens?: number;
//   temp?: number;
//   n?: number;
//   stop?: string | string[];
// }

const modelToPrice/*: Record<OpenAIModel, number>*/ = {
    'ada': 0.0004,
    'babbage': 0.0005,
    'curie': 0.002,
    'davinci': 0.02,
    'text-ada-001': 0.0004,
    'text-babbage-001': 0.0005,
    'text-curie-001': 0.002,
    'text-davinci-001': 0.02,
    'text-davinci-002': 0.02,
    'text-davinci-003': 0.02,
    'code-cushman-001': 0.0,
    'code-davinci-002': 0.0,
};
const gptoken = require("gptoken");

class TextCompletion {
    constructor(prompt, config) {

        this.done = false;
        streamOne(); // todo this class might be a way of tracking response and streaming it to the user's browser by listening and creating socket events or something.
//just for inspiration not used :)
    }


}

class Token {

    /**
     * Represents all we can know about a token from a streamed response
     * {
     *   "text": "\n",
     *   "index": 0,
     *   "logprobs": {
     *     "tokens": [
     *       "\n"
     *     ],
     *     "token_logprobs": [
     *       -0.4092595
     *     ],
     *     "top_logprobs": null,
     *     "text_offset": [
     *       68
     *     ]
     *   },
     *   "finish_reason": null
     * }
     * @param choice
     */
    constructor(choice) {
        this.text = choice.text;
        this.token = gptoken.encode(this.text);
        this.choice = choice;
        this.log_index = choice.index;
        this.logprobs = choice.logprobs;
        
        this.text_offset = this.logprobs.text_offset[this.log_index];
        this.prob = this.logprobs.token_logprobs[this.log_index];
        
    }
}

function estamateTokens(prompt, response_limit=0) {
    return gptoken.countTokens(prompt) + response_limit
}

async function streamOne(model, prompt, onToken, onDone, onError, otherOptions) {

    // verify model and get price
    const price = modelToPrice[model];
    if (price === undefined) throw new Error('Unknown model: ' + model);


    // set options
    otherOptions = otherOptions || {};
    let max_tokens = otherOptions.maxTokens || 250;
    let temperature = otherOptions.temp || 0.5;// 0 to 1
    let top_p = otherOptions.topP || 0.8 // 0 to 1
    let n = otherOptions.n || 1; // not sure this working with stream
    let logprobs = otherOptions.logProbs || 0; // 0 - 5
    let best_of = otherOptions.bestOf || 1; // not sure this working with stream
    let stop = otherOptions.stop || null; //string or array  Optional  Defaults to null  Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
    let frequency_penalty = 0.2;
    let presence_penalty = 0.1;
    let endpoint = otherOptions.endpoint || "/v1/completions"


    // create stream
    const fetchPromise = fetch(`https://api.openai.com${endpoint}`, {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json',
            'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
        },
        body: JSON.stringify({
            model,
            prompt,
            max_tokens,
            temperature,
            n,
            best_of,
            logprobs,
            stop,
            frequency_penalty,
            presence_penalty,
            stream: true
        }),
    }).catch(e => {
        //catch network error/ fetch bugs
        console.error("Fetch Error: What is wrong? ", e);
        onError(e);
    });


    // handle response
    const response = await fetchPromise;
    const reader = response.body.getReader();
    const decoder = new TextDecoder();


    // keep track of tokens
    // let full_text = prompt; // in c we could make this a buffer of size prompt + limit
    // and insert tokend by index for max performance.
    let tokens = []
    let concat = '';

    let completionTokenCount = 0;

    // read stream
    let gotReaderDone = false;
    let gotDoneMessage = false;
    console.debug("ttfb... openai-processing-time: ", response.headers.get("openai-processing-ms"))

    while (true) {
        // get next chunk
        const {done, value} = await reader.read();
        if (done) {
            gotReaderDone = true;
            break;
        }
        const text = decoder.decode(value);
        // console.log(text);
        // split chunk into lines
        // todo: there's probs a better way to do this idk seems ok.
        const lines = text.split('\n').filter(line => line.trim() !== '');
        // console.log("Number of tokens in chuck: ", lines.length);
        for (const line of lines) {
            // remove the data: prefix
            const lineMessage = line.replace(/^data: /, '');

            // if we got the done message, stop
            if (lineMessage === '[DONE]') {
                gotDoneMessage = true;
                break; // return;
            }

            // try to parse the line as JSON, and if it works, get the token and call the callback
            try {
                const parsed = JSON.parse(lineMessage);
                const choice = parsed.choices[0];
                
                //completion.addToken(choice);

                let myToken = new Token(choice);

                completionTokenCount++; // tokens.length??
                tokens.push(myToken);

                // const logprobs = choice.logprobs;
                // const token = choice.text;
                // let i = choice.index;
                // let prob = logprobs.token_logprobs[i];
                // let text_offset = logprobs.text_offset[i];
                // let text = logprobs.tokens[i];

                // let top_p = logprobs.top_logprobs;
                concat += myToken.text;
                onToken(myToken.text, myToken, parsed, response);
            } catch (error) {
                // todo: handle error better -- retry? inform caller?
                console.error(`Could not JSON parse stream message`, {text, lines, line, lineMessage, error});
                try {
                    let errorInfo = JSON.parse(text);
                    console.error(`Error info`, errorInfo);
                    if (errorInfo.message) return onError(errorInfo.message);
                    if (errorInfo.error.message) return onError(errorInfo.error.message);
                } catch (error) {
                    console.error("Failed to parse error response from stream.")
                    // ignore if we cant read the error
                }
            }
        }
        if (gotDoneMessage) break;
    }
    const fullCount = estamateTokens(prompt+concat)
    const cost = completionTokenCount / 1000 * price;
    const fullCost = fullCount / 1000 * price;

    console.log(` Streamed ${completionTokenCount} tokens. $${cost} ... full cost with prompt(${fullCount}): $${fullCost}`);
    console.log('Final text:', prompt + concat);

    console.log("Full req+res token count: ", );

    // let promptCount =     gptoken.countTokens(prompt)

    onDone(prompt, concat, tokens, fullCost,  response);
}

streamOne.modelToPrice = modelToPrice;


/**
 * A test function to ask gpt for a quote / example usage
 * @returns {Promise<void>}
 */
streamOne.test = async function () {
    function onToken(text, token, parsed, response) {
        console.log("Got Token: ", text, token.prob, token.text_offset )
    }

    function onDone(prompt, data, tokens, cost, response) {
        console.log("Got Data: ", data)
        let fulltext = prompt + data;
        console.log(fulltext)

        console.log("full text token count: ", gptoken.countTokens(fulltext))
    }

    function onError(err) {
        console.error("Got Error:)");
    }

    let data = await streamOne("text-davinci-002",
        "What is the inspirational quote of the day in one sentence? \n QOTD: ",
        onToken, onDone, onError, {stop: ['.'], logProbs: 3});

    //You can't be a real country unless you have a beer and an airline- it helps if you have some kind of a football team, or some nuclear weapons, but at the very least you need a beer

    console.log("Data Resolved:", data);
}
module.exports = streamOne;

/**
 *
 * Hola Mi Amie,
 *
 * Today I will tell you all you need to know about text completion streaming
 *
 * Each token is returned one at a time. You theoretically have the option to select a different word:
 * to do this, turn on logprobs (3-5 max), and then you can offer editing on a per-word basis after completion.
 *
 * Each chunk is returned as data: {... choices: [{text, index, logprobs}]}
 * Each choice looks like the example in the Token class above; the info we want is:
 * text: choice[0].text
 * prob: choice[0].logprobs.token_logprobs[0]
 * full_text_offset: logprobs.text_offset[0]
 *
 */

nano demo.js

const gptoken  = require('gptoken');

require('dotenv').config();


const streamOne  = require('./streamOne');

const {encode, decode, countTokens, tokenStats} = gptoken;

const aiia = require("./aiia");
// import aiia from "./aiia"
streamOne.test();

const str = 'This is an example sentence to try encoding out on!'
const encoded = encode(str)
console.log('Encoded this string looks like: ', encoded)

console.log('We can look at each token and what it represents')
for (let token of encoded) {
    console.log({token, string: decode([token])})
}

//example count tokens usage
if (countTokens(str) > 5) {
    console.log("String is over five tokens, inconcevable");
}

nano .env

OPENAI_API_KEY="sk-...."
OPENAI_ENDPOINT="https://api.openai.com/"

Running it

mkdir app
cd app
npm init
npm install gptoken dotenv
nano streamOne.js
nano demo.js
nano .env


nvm install v18.15.0 # or find a legit good polyfill for fetch :) .. and lmk
node demo.js
Sample output
Encoded this string looks like:  [
  1212,   318, 281,
  1672,  6827, 284,
  1949, 21004, 503,
   319,     0
]
We can look at each token and what it represents
{ token: 1212, string: 'This' }
{ token: 318, string: ' is' }
{ token: 281, string: ' an' }
{ token: 1672, string: ' example' }
{ token: 6827, string: ' sentence' }
{ token: 284, string: ' to' }
{ token: 1949, string: ' try' }
{ token: 21004, string: ' encoding' }
{ token: 503, string: ' out' }
{ token: 319, string: ' on' }
{ token: 0, string: '!' }
String is over five tokens, inconcevable
String Token Stats:  {
  count: 6,
  unique: 5,
  frequency: { '275': 1, '1031': 1, '2318': 2, '21943': 1, '22944': 1 },
  positions: {
    '275': [ 4 ],
    '1031': [ 5 ],
    '2318': [ 2, 3 ],
    '21943': [ 0 ],
    '22944': [ 1 ]
  },
  tokens: [ 21943, 22944, 2318, 2318, 275, 1031 ]
}
We can decode it back into:
 This is an example sentence to try encoding out on!
Fetching... get :  https://api.openai.com/v1/models
Server listening on port 3000
ttfb... openai-processing-time:  260
Got Token:  
 -0.41367477 68
Got Token:  
 -0.14588133 69
Got Token:  " -0.37068415 70
Got Token:  You -1.591554 71
Got Token:   can -0.598935 74
Got Token:  't -0.011316275 78
Got Token:   be -0.4461555 80
Got Token:   a -0.00020513259 83
Got Token:   real -0.0061629214 85
Got Token:   country -0.0022323753 90
Fine tuning Models:  [
  'cushman:2020-05-03',
  'if-davinci:3.0.0',
  'davinci-if:3.0.0',
  'davinci-instruct-beta:2.0.0'
]
Got Token:   unless -0.000015567284 98
Got Token:   you -3.076318e-7 105
Got Token:   have -0.0000020966954 109
Got Token:   a -0.000014736571 114
Got Token:   beer -0.00026218753 116
Got Token:   and -0.000015805701 121
Got Token:   an -0.00003583558 125
Got Token:   airline -0.00001855031 128
Got Token:  - -0.009223087 136
Got Token:   it -0.015826264 137
Got Token:   helps -0.00012548709 140
Got Token:   if -0.0000034089344 146
Got Token:   you -0.000016282536 149
Got Token:   have -0.000059085025 153
Got Token:   some -0.00004144026 158
Got Token:   kind -0.00048095893 163
Got Token:   of -4.277735e-7 168
Got Token:   a -0.06332793 171
Got Token:   football -0.00010092916 173
Got Token:   team -0.000097351025 182
Got Token:  , -0.0008433579 187
Got Token:   or -0.000088051806 188
Got Token:   some -0.00004871012 191
Got Token:   nuclear -0.0014420545 196
Got Token:   weapons -0.00001306671 204
Got Token:  , -0.000055269407 212
Got Token:   but -0.000027133337 213
Got Token:   at 0 217
Got Token:   the -0.0000061516675 220
Got Token:   very -0.0000071062755 224
Got Token:   least -0.000009251094 229
Got Token:   you -0.000012468796 235
Got Token:   need -0.0000029311614 239
Got Token:   a -0.000009606849 244
Got Token:   beer -0.000016642034 246
 Streamed 45 tokens. $0.0009 ... full cost with prompt(64): $0.00128
Final text: What is the inspirational quote of the day in one sentence? 
 QOTD: 

"You can't be a real country unless you have a beer and an airline- it helps if you have some kind of a football team, or some nuclear weapons, but at the very least you need a beer
Full req+res token count: 
Got Data:  

"You can't be a real country unless you have a beer and an airline- it helps if you have some kind of a football team, or some nuclear weapons, but at the very least you need a beer

What is the inspirational quote of the day in one sentence? 
 QOTD: 

"You can't be a real country unless you have a beer and an airline- it helps if you have some kind of a football team, or some nuclear weapons, but at the very least you need a beer
full text token count:  64
Data Resolved: undefined

from openai-node.

george-i avatar george-i commented on August 17, 2024

kudos to @ponytojas for the regex
const delta = decoder.decode(value).match(/"delta":\s*({.*?"content":\s*".*?"})/)?.[1]

from openai-node.

tsenguunchik avatar tsenguunchik commented on August 17, 2024

@fracergu Great solution, but the regex doesn't work when the content has a double quote in it (an escaped character), e.g. "content": "\"". It just shows \ which is wrong; it should show ".

from openai-node.

fracergu avatar fracergu commented on August 17, 2024

@tsenguunchik Yes, I published too fast and I'm struggling with exactly that right now. I didn't see the problem until I started requesting code and it started failing on double quotes. I will update here when I get a solution.

EDIT:
Problem solved, I recovered the original regex and parsing and now it works pretty well

const decodeResponse = (response?: Uint8Array) => {
  if (!response) {
    return ''
  }

  const pattern = /"delta":\s*({.*?"content":\s*".*?"})/g
  const decodedText = utf8Decoder.decode(response)
  const matches: string[] = []

  let match
  while ((match = pattern.exec(decodedText)) !== null) {
    matches.push(JSON.parse(match[1]).content)
  }
  return matches.join('')
}

Also is not losing the few first tokens that come "in pack"

debug 3

from openai-node.

darknoon avatar darknoon commented on August 17, 2024

I found this code from Hassan at Vercel helpful for streaming the OpenAI api in an edge function

from openai-node.

shezhangzhang avatar shezhangzhang commented on August 17, 2024

Does anybody know why the chunks have been split when I deployed it to vercel edge function?

image

from openai-node.

UncaughtCursor avatar UncaughtCursor commented on August 17, 2024

@shezhangzhang This chunk-split issue has only happened to me once so far in an hour of testing. I'm running a Node.js instance on localhost right now, not on Vercel. Perhaps it's a rare occurrence, but an occurrence we have to account for nonetheless.

from openai-node.

shezhangzhang avatar shezhangzhang commented on August 17, 2024

@shezhangzhang This chunk-split issue has only happened to me once so far in an hour of testing. I'm running a Node.js instance on localhost right now, not on Vercel. Perhaps it's a rare occurrence, but an occurrence we have to account for nonetheless.

Yup, I haven't figured it out. When you deploy it on Vercel, the chunk-split issue can happen with every request. I don't know the reason 🥲

from openai-node.

KaleRakker avatar KaleRakker commented on August 17, 2024

For me, this doesn't solve the issue with the unescaped " character. Any suggestions?

EDIT:

this happens with the following input.
data: {"id":"chatcmpl-6wbcJjX6ttujYAT6rFv8eZMxS2daD","object":"chat.completion.chunk","created":1679425643,"model":"gpt-3.5-turbo-0301","choices":[{"delta":{"content":"\"}"},"index":0,"finish_reason":null}]}

Screenshot 2023-03-21 at 20 05 05


from openai-node.

george-i avatar george-i commented on August 17, 2024

I have a problem with encoding when making requests.

For this text:

Remove citation from this text: Gardening can be defined as an activity in a garden setting to grow, cultivate, and look after plants (e.g., flowers, vegetables) for non-commercial use (Gillard, 2001, p. 832; Kingsley et al., 2021). There is some evidence to suggest gardening is a moderately intense physical activity ranging from low-to moderate-intensity for older age groups (>63 years; Park et al., 2008; Park et al., 2011) to moderate-to high-intensity in younger adults (>20 years; Park et al., 2014)

The request fails before sending, around this part of the text:

(>20 years

I tried with encodeURIComponent and it works fine, but in return the API says, among other things:

just a reminder to please use proper formatting

Furthermore, sometimes the response is itself encoded.

from openai-node.

syonfox avatar syonfox commented on August 17, 2024

from openai-node.

Christopher-Hayes avatar Christopher-Hayes commented on August 17, 2024

@danneu great solution, thank you for sharing.

from openai-node.

justinmahar avatar justinmahar commented on August 17, 2024
    for await (const chunk of response.data) {

@danneu @Christopher-Hayes VSCode complains with Type 'CreateChatCompletionResponse' must have a '[Symbol.asyncIterator]()' method that returns an async iterator. on this line -- is there something else you guys are doing to ensure you're getting an async iterator? Hacking it with an any typecast doesn't magically fix things either.

from openai-node.

Christopher-Hayes avatar Christopher-Hayes commented on August 17, 2024
    for await (const chunk of response.data) {

@danneu @Christopher-Hayes VSCode complains with Type 'CreateChatCompletionResponse' must have a '[Symbol.asyncIterator]()' method that returns an async iterator. on this line -- is there something else you guys are doing to ensure you're getting an async iterator? Hacking it with an any typecast doesn't magically fix things either.

Yeah, idk why TS was giving that error. But I just cast to unknown and then to the async iterator type it wants.

I used the code here if you want an example implementation: https://github.com/Christopher-Hayes/vscode-chatgpt-reborn/blob/3c191f34b52e1473a171531da42177857f15304c/src/api-provider.ts#L42

from openai-node.

lassegit avatar lassegit commented on August 17, 2024

Does there exist a way to safely JSON.parse the response and ensure that you get all the content provided by OpenAI? The methods above all fail in one way or another, which is particularly problematic when generating code snippets, but also markdown.

from openai-node.

lassegit avatar lassegit commented on August 17, 2024

@josephrocca Still sometimes getting the JSON.parse error in production using Vercel Edge Runtime.

from openai-node.

devilyouwei avatar devilyouwei commented on August 17, 2024

This is my approach:

Create util.ts

// import what you need in util.ts
import {
    OpenAIApi,
    Configuration,
    ChatCompletionRequestMessage,
    CreateChatCompletionResponse,
} from 'openai'
import { ResponseType } from 'axios' // ResponseType is used for the 'stream' | 'json' switch below

// config openAI sdk
const openai = new OpenAIApi(
    new Configuration({
        apiKey: process.env.OPENAI_API_KEY,
        basePath: process.env.OPENAI_PROXY
    })
)

// write a util function, chat
export default {
    async chat(messages: ChatCompletionRequestMessage[], stream: boolean = false) {
        const responseType: ResponseType = stream ? 'stream' : 'json'
        return (
            await openai.createChatCompletion(
                {
                    model: 'gpt-3.5-turbo',
                    messages,
                    stream
                },
                { responseType }
            )
        ).data
    }
}

Install json detector

yarn add @stdlib/assert-is-json

Use util.ts in a controller or service layer

import openai from './openai'
import isJSON from '@stdlib/assert-is-json'
import { IncomingMessage } from 'http'
import { ChatCompletionRequestMessageRoleEnum } from 'openai'

async chatStream(content: string, callback: CreateChatCompletionStreamResponseCallback) {
        const role = ChatCompletionRequestMessageRoleEnum.User
        // Transfer to IncomingMessage type, this is a Stream type
        const res = ((await openai.chat([{ role, content }], true)) as any) as IncomingMessage
        let tmp = '' // cache, store temporary string data
        res.on('data', (data: Buffer) => {
            // buffer to utf8 string, then split to string data array
            const message = data
                .toString('utf8')
                .split('\n')
                .filter(m => m.length > 0)
            for (const item of message) {
                // remove the first head word: 'data: '
                tmp += item.replace(/^data: /, '')
                // only when tmp string is a json, you transfer to object and callback it
                if (isJSON(tmp)) {
                    const data: CreateChatCompletionStreamResponse = JSON.parse(tmp)
                    tmp = ''
                    callback(data)
                }
            }
        })
    }

Interface file for openAI, Interface.ts

interface CreateChatCompletionStreamResponse {
    id: string
    object: string
    created: number
    model: string
    choices: Array<CreateChatCompletionStreamResponseChoicesInner>
}

interface CreateChatCompletionStreamResponseChoicesInner {
    delta: { role?: string; content?: string }
    index: number
    finish_reason: string
}

type CreateChatCompletionStreamResponseCallback = (response: CreateChatCompletionStreamResponse) => void

from openai-node.

wrsulliv avatar wrsulliv commented on August 17, 2024

Web Client Streaming

I was trying to access the OpenAI API from a web-client, and none of the solutions worked except @YoseptF - #18 (comment)

I believe it has to do with the client-side Axios implementation, but I may be wrong. Either way, thanks @YoseptF !

from openai-node.

frankgreco avatar frankgreco commented on August 17, 2024

Here's my contribution 👇🏼

// Initiate the stream request.
const response = await fetch(/* <url> */, {
  method: 'POST',
  headers: { /* <headers> */ },
  body: JSON.stringify({
    ...{ stream: true },
    ...restOfBody
  })
});

if (!response?.body) {
  throw new Error('The response from OpenAI does not contain a body.');
}

// OpenAI responses seem to always begin with two newlines. We'll ignore those.
let noise = true;

// @ts-ignore
for await (const message of response.body) {
  for (const chunk of message.toString().split('\n\n')) {

    // https://github.com/openai/openai-node/issues/18#issuecomment-1369996933
    // https://github.com/openai/openai-node/issues/18#issuecomment-1493132878
    const msg: string = chunk.replace(/^data: /, '')

    // The stream is done?
    // https://platform.openai.com/docs/api-reference/chat/create
    if (msg === '[DONE]' || msg.length === 0) {
      continue;
    }

    let parsed: any;
    try {
      parsed = JSON.parse(msg.trim());
    } catch (e) {
      throw new Error(`Could not parse OpenAI response (${msg}).`);
    }

    let choice: string;
    try {
      choice = parsed?.choices?.[0]?.text;
    } catch (e) {
      throw new Error(`Could not dereference OpenAI message format (${parsed}).`);
    }

    if (noise && choice === '\n') {
      continue;
    }

    noise = false;
    // your final message will be here.
  }
}

NOTE: Sending { stop: ['\n\n'] } as part of the request did not work for me.

from openai-node.

justinsteven avatar justinsteven commented on August 17, 2024

Using a pattern similar to that shown in #18 (comment) in a client-side React app, I'm getting stream.on is not a function.

I assume this is because Axios appears not to support { responseType: 'stream' } in the browser, only in server-side Node.

See https://stackoverflow.com/a/60117409
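For what it's worth, a browser-side alternative that avoids axios entirely (a rough sketch using the native fetch ReadableStream API; the inlined API key is a placeholder for illustration only and should really live behind a backend) could look like:

// Sketch: stream chat completions in the browser with fetch + ReadableStream.
const OPENAI_API_KEY = 'sk-...'; // placeholder; never ship a real key to the browser

const response = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
        model: 'gpt-3.5-turbo',
        messages: [{ role: 'user', content: 'Say this is a test' }],
        stream: true,
    }),
});

const reader = response.body!.getReader();
const decoder = new TextDecoder('utf-8');
let done = false;
while (!done) {
    const { value, done: readerDone } = await reader.read();
    done = readerDone;
    if (!value) continue;
    for (const line of decoder.decode(value, { stream: true }).split('\n')) {
        const message = line.replace(/^data: /, '').trim();
        if (!message || message === '[DONE]') continue;
        try {
            const delta = JSON.parse(message).choices[0]?.delta?.content;
            if (delta) console.log(delta);
        } catch {
            // a chunk can end mid-message; buffering as in the earlier sketch avoids this
        }
    }
}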

from openai-node.

shreypjain avatar shreypjain commented on August 17, 2024

Yeah, @justinsteven, I'm running into the same issue using client-side React to build something simple. Here is the best workaround I could find, though it has an issue I'll show in a second:

I'd recommend wiring this to a button click in React and updating your state from it:

// Assumes surrounding component state/helpers: newMessages, getOpenAISK(),
// laggingLatestMessage, setLatestMessage, setIsGenerating.
await axios.post(
        "https://api.openai.com/v1/chat/completions",
        {
          messages: newMessages,
          stream: true,
          model: "gpt-3.5-turbo",
        },
        {
          headers: {
            Authorization: "Bearer " + getOpenAISK(),
          },
          onDownloadProgress: (event) => {
            const payload = event.currentTarget.response;

            const result = payload
              .replace(/data:\s*/g, "")
              .replace(/[\r\n\t]/g, "")
              .split("}{")
              .join("},{");
            const cleanedJsonString = `[${result}]`;

            if (payload.includes("[DONE]")) return;

            try {
              const parsedJson: [] = JSON.parse(cleanedJsonString);
              // console.log(JSON.stringify(parsedJson, null, 2));

              let newContent: string = ""
              parsedJson.forEach((item: any) => {
                if (
                  item.choices &&
                  item.choices.length > 0 &&
                  item.choices[0].delta &&
                  item.choices[0].delta.content
                ) {
                  newContent += item.choices[0].delta.content;
                }
              });

              if (newContent !== laggingLatestMessage) {
                const extraContent = newContent.slice(
                  laggingLatestMessage.length
                );
                laggingLatestMessage += extraContent;
                setLatestMessage(laggingLatestMessage);
              }
            } catch (e) {
              setIsGenerating(false);
              console.log("error parsing json", e);
            }
          },
          responseType: "stream",
        }
      );

Unfortunately, the output sometimes comes out garbled like this, but I believe this is the best workaround when using an axios stream in client-side React.

(screenshot of garbled streamed output omitted)
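One way to make the parsing less fragile (a sketch, not from the original comment) is to rebuild the message from the accumulated SSE text line by line on each progress event instead of regex-joining the payload; call it from onDownloadProgress with event.currentTarget.response and pass the result to setLatestMessage:

// Sketch: parse the accumulated XHR response text as SSE lines.
function parseAccumulatedSse(payload: string): string {
    let content = '';
    for (const line of payload.split('\n')) {
        const message = line.replace(/^data: /, '').trim();
        if (!message || message === '[DONE]') continue;
        try {
            content += JSON.parse(message).choices[0]?.delta?.content ?? '';
        } catch {
            // the last line may still be incomplete; it will parse on the next progress event
        }
    }
    return content;
}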

from openai-node.

jaankoppe avatar jaankoppe commented on August 17, 2024

How can I implement this so that the OpenAI API request is made from the backend? I do not want to expose the API key to the client side.

from openai-node.

shreypjain avatar shreypjain commented on August 17, 2024

Hey @jaankoppe, you unfortunately wouldn't be able to use this server-side (it's more of a client-side solution to play around with a local ChatGPT bot). I would recommend using the solutions above, as they should all work with axios and server-side chat completions if needed. Use this solution as a reference:

#18 (comment)
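To keep the key server-side, a minimal relay along those lines (a sketch using Express; the route name, port, and error handling are placeholders, not from the thread) might look like:

// Sketch: keep the API key on the server and relay OpenAI's SSE bytes to the browser.
import express from 'express';
import { Configuration, OpenAIApi } from 'openai';

const app = express();
app.use(express.json());

const openai = new OpenAIApi(new Configuration({ apiKey: process.env.OPENAI_API_KEY }));

app.post('/api/chat', async (req, res) => {
    const completion = await openai.createChatCompletion(
        { model: 'gpt-3.5-turbo', messages: req.body.messages, stream: true },
        { responseType: 'stream' }
    );
    res.setHeader('Content-Type', 'text/event-stream');
    res.setHeader('Cache-Control', 'no-cache');
    res.setHeader('Connection', 'keep-alive');
    // Forward the stream unchanged; the client parses the SSE lines as in the examples above.
    (completion.data as any).pipe(res);
});

app.listen(3000);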

from openai-node.

justinmahar avatar justinmahar commented on August 17, 2024

@jaankoppe @shreypjain I've updated the solution to support both server and client streaming. Give it a shot and let us know how it works for you - #18 (comment)

from openai-node.

Yafaa avatar Yafaa commented on August 17, 2024

I am running those examples but keep getting: TypeError: completion.data.on is not a function
I tried updating my Node version to the latest, but got the same error.

from openai-node.

justinmahar avatar justinmahar commented on August 17, 2024

@Yafaa Which environment are you in (node.js or browser)? Are you using the correct call for the environment?

The latest version will throw an error when used in the wrong environment. Try it out -- npm i openai-ext@latest

from openai-node.

lgh06 avatar lgh06 commented on August 17, 2024

I used https://github.com/PawanOsman/ChatGPT/blob/b705a2511b71cf2a6077db76a7048ddbca1ecbb1/routes.js#L178 solution on Next.js API side, it worked.

UPDATE: On the browser frontend, Joseph's solution worked for me.

MDN approach did not work: I am still working on the React side, following this MDN guide and this.

Higher-level wrappers fall short here; Node.js and browser native functions are the winners for dealing with OpenAI's streaming scenario.

from openai-node.

fukemy avatar fukemy commented on August 17, 2024
streamChatCompletions

Hi, I got this error:

Invalid attempt to iterate non-iterable instance.
In order to be iterable, non-array objects must have a [Symbol.iterator]() method

Can you help?

from openai-node.

rodrigoGA avatar rodrigoGA commented on August 17, 2024

(Quoting @schnerd's earlier comment in full, which recommends @microsoft/fetch-event-source for the playground and shows the streamCompletion error-handling example.)

Thank you @schnerd. I don't quite understand the limitations of this solution. It uses @microsoft/fetch-event-source which, as far as I understand, encodes the information in the URL and is not doing a POST.
The library page mentions a limit of 2000 characters in most browsers, but it's not clear to me whether this also affects Node.js.
I would appreciate any clarification on the limitations of this solution.

from openai-node.

juzarantri avatar juzarantri commented on August 17, 2024

@rodrigoGA Can you tell me which npm package to use for openai?

from openai-node.

ckarsan avatar ckarsan commented on August 17, 2024

@justinmahar This is a great solution, thanks. I'm testing it on Node; however, it seems to strip out the triple backticks ``` when code is generated in the AI response. Is there any way to keep those in, by any chance? I use them to format the code. Thanks!

from openai-node.

juzarantri avatar juzarantri commented on August 17, 2024

@ckarsan what do we need to pass in
const axiosConfig = {
// ...
};
?

from openai-node.

ckarsan avatar ckarsan commented on August 17, 2024

@juzarantri
I think it's just optional parameters; I left it out entirely.

from openai-node.

juzarantri avatar juzarantri commented on August 17, 2024

okay bro

from openai-node.
