
yakgpt's Introduction

YakGPT

A simple, locally running ChatGPT UI that makes your text generation faster and chatting even more engaging!

Features

  • GPT-3.5 & GPT-4 via the OpenAI API
  • Speech-to-Text via Azure & OpenAI Whisper
  • Text-to-Speech via Azure & Eleven Labs
  • Runs locally in your browser – no need to install any applications
  • Faster than the official UI – connect directly to the API
  • Easy mic integration – no more typing!
  • Use your own API key – ensure your data privacy and security
  • Data submitted via the API is not used for training and is stored for only 30 days
  • All state stored locally in localStorage – no analytics or external service calls
  • Access it at https://yakgpt.vercel.app or run it locally!

Note that GPT-4 API access is needed to use GPT-4; GPT-3.5 is enabled for all users.

Screenshots

[Screenshots: Mobile · Voice Mode · Light Theme · Dark Theme]

🚀 Getting Started

Visit YakGPT to try it out without installing, or follow these steps to run it locally:

Prerequisites

You'll need the following tools installed on your computer to run YakGPT locally.

  • Git
  • Yarn (or npm or pnpm)
  • Any modern web browser like Google Chrome, Mozilla Firefox, or Microsoft Edge

Installation

  1. Clone the repository:
$ git clone https://github.com/yakGPT/YakGPT.git
  2. Install dependencies, build the bundle, and run the server:
$ yarn
$ yarn build
$ yarn start

Then navigate to http://localhost:3000

Congratulations! 🎉 You are now running YakGPT locally on your machine.

🔑 API Key Configuration

To use YakGPT, you'll need an OpenAI API key. The app should prompt you to enter your key.

Add to .env.local (⚠️ Local use only)

If you want the keys to persist across app builds, you can add them to .env.local. Note that NEXT_PUBLIC_ variables are embedded into the client bundle at build time, which is why this is for local use only.

$ echo "NEXT_PUBLIC_OPENAI_API_KEY=<your-open-ai-key-here>" > .env.local
$ echo "NEXT_PUBLIC_11LABS_API_KEY=<your-eleven-labs-key-here>" >> .env.local

🐳 Docker

To use the pre-built Docker image from Docker Hub (only for amd64), run:

$ docker run -it -p 3000:3000 yakgpt/yakgpt:latest

To build the Docker image yourself (such as if you're on arm64), run:

$ docker build -t yakgpt:latest .
$ docker run -it -p 3000:3000 yakgpt:latest

🎤 Microphone Integration

YakGPT makes chatting a breeze with its microphone integration! Activate your microphone using your browser's permissions, and YakGPT will automatically convert your speech into text.

You can also toggle the mic integration as needed by clicking on the microphone icon in the app.

Remember to use a supported web browser and ensure your microphone is functioning properly.
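For the curious, browser speech-to-text generally follows the pattern below. This is a minimal sketch, not YakGPT's actual code; the `whisper-1` model name and the transcription endpoint are OpenAI's public API, while everything else (timing, format) is illustrative.

// Sketch: record a short clip in the browser and transcribe it with OpenAI Whisper.
async function recordAndTranscribe(apiKey: string): Promise<string> {
  // Ask the browser for mic access (this triggers the permission prompt).
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  const chunks: Blob[] = [];
  recorder.ondataavailable = (e) => chunks.push(e.data);

  // Record for 5 seconds, then stop (a real app stops on a user action).
  recorder.start();
  await new Promise((resolve) => setTimeout(resolve, 5000));
  recorder.stop();
  await new Promise((resolve) => (recorder.onstop = resolve));
  stream.getTracks().forEach((t) => t.stop()); // release the mic so the in-use indicator turns off

  // Send the audio to OpenAI's transcription endpoint.
  const form = new FormData();
  form.append("file", new Blob(chunks, { type: recorder.mimeType }), "audio.webm");
  form.append("model", "whisper-1");
  const res = await fetch("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });
  const { text } = await res.json();
  return text;
}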

🛡️ Data Privacy and Security

YakGPT ensures your data privacy and security by letting you use your own API key. Your conversations take place directly between your browser and OpenAI's API, with no intermediary servers.
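Concretely, "no intermediary servers" means the browser calls OpenAI's chat completions endpoint itself, roughly like the sketch below (the general pattern, not the app's exact code):

// Sketch: calling OpenAI directly from the browser with a user-supplied key.
// The key never touches any server other than OpenAI's.
async function chat(apiKey: string, userMessage: string): Promise<string> {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "gpt-3.5-turbo",
      messages: [{ role: "user", content: userMessage }],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}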

📃 License

This project is licensed under the MIT License – see the LICENSE file for details.

🙌 Acknowledgments

  • OpenAI for building such amazing models and making them cheap as chips.
  • Mantine UI – just an all-around amazing UI library.
  • opus-media-recorder – A real requirement for me was being able to walk and talk. OpenAI's Whisper API is unable to accept the audio generated by Safari, so I fell back to WAV recording, which, due to the lack of compression, makes things incredibly slow on mobile networks. opus-media-recorder saved my butt by allowing cross-platform compressed audio recording via web worker magic. 🤗

Got feedback, questions or ideas? Feel free to submit an issue!

yakgpt's People

Contributors

clueed, dev-msp, francescor, gbro3n, luoxi, mbotsu, pabl-o-ce, paulchiu, poteat, rannmann, rbalsick, rikhuijzer, strlns, thelinuxkid, yakgpt


yakgpt's Issues

Feature Request: Add support for custom API endpoints

Besides the official OpenAI API, there are also other providers like Azure OpenAI. In some regions where the official API is not accessible, users heavily rely on these third-party APIs. Adding support for custom OpenAI keys and URLs would play a crucial role in such scenarios.

There are only a handful of projects that currently offer this feature; one such example is BetterChatGPT.
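A hypothetical sketch of what such support could look like (the environment variable name is invented for illustration; YakGPT does not currently read it):

// Hypothetical: derive the chat endpoint from a configurable base URL,
// falling back to the official API when no override is set.
const baseUrl =
  process.env.NEXT_PUBLIC_OPENAI_BASE_URL ?? "https://api.openai.com";

export function chatCompletionsUrl(): string {
  return `${baseUrl}/v1/chat/completions`;
}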

I apologize for any confusion caused in the previous issue. I mistakenly referred to Azure's Speech API as its OpenAI API.

Error when editing API key

When I view the edit modal, I get this:

[screenshot of the edit modal]

If I hit Save without making any changes, it gives an error because the key isn't properly formatted (since it's a redacted version of what is saved). Maybe we can disable the Save button unless the field has changed?
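One way the suggested fix could look in React (a sketch only; the component and prop names are hypothetical, not YakGPT's actual code):

import { useState } from "react";

// Hypothetical sketch: only enable Save once the field differs from the
// stored value, so a redacted key can't be accidentally re-saved.
function ApiKeyField({ savedKey, onSave }: { savedKey: string; onSave: (key: string) => void }) {
  const [value, setValue] = useState(savedKey);
  const dirty = value !== savedKey;
  return (
    <>
      <input value={value} onChange={(e) => setValue(e.target.value)} />
      <button disabled={!dirty} onClick={() => onSave(value)}>Save</button>
    </>
  );
}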

Better support for custom prompts

Is it possible to work on a better way to add custom prompts?

For example, instead of having avatar images, provide an input or text area where you can put the custom prompts you want to test?

Install instructions in README fail on Windows 11

This tool looks amazing, thanks for all the work you've put in. Just to let you know, the install instructions might be failing for some people. I suspect it's just that I don't have the dependencies set up on my machine. I don't know much about yarn / npm as it's not an ecosystem I work with, so I haven't attempted to debug.

Yarn works.

yarn build results in:


info  - Collecting page data
info  - Creating an optimized production build ...TypeError: Cannot read properties of null (reading 'useRef')
    at exports.useRef (C:\...\yakGPT\node_modules\react\cjs\react.production.min.js:25:337)
> Build error occurred
Error: Export encountered errors on following paths:
        /
        /_error: /404
        /_error: /500
    at c:\...\yakGPT\node_modules\next\dist\export\index.js:425:19
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Span.traceAsyncFn (c:\...\yakGPT\node_modules\next\dist\trace\trace.js:79:20)
    at async c:\...\yakGPT\node_modules\next\dist\build\index.js:1422:21
    at async Span.traceAsyncFn (c:\...\yakGPT\node_modules\next\dist\trace\trace.js:79:20)
    at async c:\...\yakGPT\node_modules\next\dist\build\index.js:1280:17
    at async Span.traceAsyncFn (c:\...\yakGPT\node_modules\next\dist\trace\trace.js:79:20)
    at async Object.build [as default] (c:\...\yakGPT\node_modules\next\dist\build\index.js:73:29)
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

Conflicting dependencies with `npm install`

After running npm install, I got the following error messages about conflicting dependencies:

npm ERR! code ERESOLVE
npm ERR! ERESOLVE could not resolve
npm ERR!
npm ERR! While resolving: [email protected]
npm ERR! Found: @mantine/core@6.0.4
npm ERR! node_modules/@mantine/core
npm ERR!   @mantine/core@"^6.0.2" from the root project
npm ERR!   peer @mantine/core@"6.0.4" from @mantine/carousel@6.0.4
npm ERR!   node_modules/@mantine/carousel
npm ERR!     @mantine/carousel@"^6.0.4" from the root project
npm ERR!
npm ERR! Could not resolve dependency:
npm ERR! @mantine/notifications@"^6.0.2" from the root project
npm ERR!
npm ERR! Conflicting peer dependency: @mantine/core@6.0.2
npm ERR! node_modules/@mantine/core
npm ERR!   peer @mantine/core@"6.0.2" from @mantine/notifications@6.0.2
npm ERR!   node_modules/@mantine/notifications
npm ERR!     @mantine/notifications@"^6.0.2" from the root project
npm ERR!
npm ERR! Fix the upstream dependency conflict, or retry
npm ERR! this command with --force or --legacy-peer-deps
npm ERR! to accept an incorrect (and potentially broken) dependency resolution.
npm ERR!
npm ERR!
npm ERR! For a full report see:
npm ERR! C:\Users\me\AppData\Local\npm-cache\_logs\2023-03-31T07_02_31_381Z-eresolve-report.txt

npm ERR! A complete log of this run can be found in:
npm ERR!     C:\Users\me\AppData\Local\npm-cache\_logs\2023-03-31T07_02_31_381Z-debug-0.log

I am using Node.js v18.15.0 with npm v9.5.0.

Support for local LlamaIndex indexes and text-based settings for how the AI should behave

Excellent, thank you for creating the UI.
Personally, I thought it would be very useful to have a tab on the settings screen that allows selecting an index created with LlamaIndex, with the content of the selected index reflected in the conversation.
So, create a script called /YakGPT/pages/api/llama_index.ts and paste in the following contents.

import { NextApiRequest, NextApiResponse } from "next";
// NOTE: "llama_index" follows the Python LlamaIndex API as sketched by the
// issue author; there is no official npm package with these exports.
import { GPTTreeIndex, SimpleDirectoryReader } from "llama_index";

async function handler(req: NextApiRequest, res: NextApiResponse) {
  const { query, indexName } = req.body;

  // Build an index with LlamaIndex
  const documents = new SimpleDirectoryReader("data").load_data();
  const index = GPTTreeIndex.from_documents(documents);

  // Define a custom prompt
  function custom_prompt(query: string, text_name: string, chapter: string, page_number: string) {
    return (
      "You always tell the questioner the truth in a very clear and polite manner, " +
      "but you also believe in making the questioner think by posing a question when it is generally better to do so, or when there is no instruction not to. " +
      "In doing so, you stay close to the questioner by summarizing lengthy information. You also always try to clarify page sources and provide clear answers." +
      `\n\n${query}\nPage source: ${text_name}, chapter - ${chapter}, page - ${page_number}`
    );
  }

  // Set the custom prompt as the index's default prompt
  index.set_default_prompt(custom_prompt);

  // Answer the question using the index
  const answer = await index.query(query, { mode: "default" });
  res.status(200).json({ answer });
}

export default handler;

Next, I tried to implement it by adding the following code to /YakGPT/components/SettingsModal.tsx, but it was too difficult for me.

<Tabs.Panel value="index"> {/* New tab */}
  <Box style={{ paddingTop: "20px" }}>
    <Group position="apart" style={{ alignItems: "center" }}>
      Select an index:
      <Select
        data={[
          { value: "index1", label: "Index 1" },
          { value: "index2", label: "Index 2" },
          // Add other index options here
        ]}
        value={selectedIndex}
        onChange={(value) => handleIndexChange(value)}
        menuPortalTarget={document.body}
        styles={{
          menuPortal: (base: any) => ({ ...base, zIndex: 9999 }),
        }}
        transition="skew-up"
        transitionDuration={200}
      />
    </Group>
  </Box>
</Tabs.Panel>

If you don't mind, it would be helpful if you could implement this feature so that the prompts for how the AI should behave can be customized from text in the settings.

Is YakGPT a little more expensive in some way?

When I compared different GitHub projects that use an OpenAI API key, I found that they are not the same in terms of GPT-4 usage costs.

It seems that YakGPT costs a little more, basically around $0.09 per question, and is relatively stable. A simpler program fluctuates between $0.04 and $0.09.

I'm curious: their questions and answers are close in word count, so what accounts for the different API costs across different open-source projects?

Microphone initialization fails in voice-to-text functionality on Windows

Issue description:
The voice-to-text functionality in the application is experiencing a complete failure to initialize the microphone interface on my Windows 10 Pro computer using Google Chrome. Instead of displaying the large microphone icon in the text input area as expected, a spinning circle appears and persists indefinitely. I have waited up to a minute without success. Initially, the feature worked for about ten uses, but stopped functioning after I refreshed the page or performed a similar action. Interestingly, the functionality still works correctly on my Android device. I restarted my computer and still have the issue.

Steps to reproduce the issue:

  1. Open the application (https://yakgpt.vercel.app/) in Google Chrome on a Windows 10 Pro computer. Ensure both Chrome and Windows are up to date.
  2. Click on the microphone icon to initiate the voice-to-text functionality.
  3. Observe the spinning circle appearing in the text input area, which persists indefinitely instead of displaying the large microphone icon.
Expected behavior:
The large microphone icon should appear almost instantly after clicking on the initial microphone icon to initiate the voice-to-text functionality.

Observed behavior:
The large microphone icon fails to appear, and the spinning circle continues seemingly forever, rendering the voice-to-text feature non-functional on Windows. However, it still works on Android devices.

Additional information:
I have already checked the application and Whisper API documentation (I skimmed it), as well as the README.txt file, for potential solutions without finding any relevant information. Any guidance or assistance in resolving this problem would be highly appreciated.

Thank you for your attention.

Yarn build fails on macOS with no helpful error message

yarn build

yarn run v1.22.10
$ next build
/opt/homebrew/Cellar/node/19.8.0/bin/node[79944]: ../src/module_wrap.cc:599:MaybeLocal<v8::Promise> node::loader::ImportModuleDynamically(Local<v8::Context>, Local<v8::Data>, Local<v8::Value>, Local<v8::String>, Local<v8::FixedArray>): Assertion `(it) != (env->id_to_function_map.end())' failed.
 1: 0x100262100 node::Abort() [/opt/homebrew/Cellar/node/19.8.0/bin/node]
 2: 0x1002620e4 node::Abort() [/opt/homebrew/Cellar/node/19.8.0/bin/node]
 3: 0x10022583c node::loader::ImportModuleDynamically(v8::Local<v8::Context>, v8::Local<v8::Data>, v8::Local<v8::Value>, v8::Local<v8::String>, v8::Local<v8::FixedArray>) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
 4: 0x1004f9d0c v8::internal::Isolate::RunHostImportModuleDynamicallyCallback(v8::internal::MaybeHandle<v8::internal::Script>, v8::internal::Handle<v8::internal::Object>, v8::internal::MaybeHandle<v8::internal::Object>) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
 5: 0x10089b5dc v8::internal::Runtime_DynamicImportCall(int, unsigned long*, v8::internal::Isolate*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
 6: 0x1000b7524 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvInRegister_NoBuiltinExit [/opt/homebrew/Cellar/node/19.8.0/bin/node]
 7: 0x10015c49c Builtins_CallRuntimeHandler [/opt/homebrew/Cellar/node/19.8.0/bin/node]
 8: 0x100034064 Builtins_InterpreterEntryTrampoline [/opt/homebrew/Cellar/node/19.8.0/bin/node]
 9: 0x10006a8b4 Builtins_AsyncFunctionAwaitResolveClosure [/opt/homebrew/Cellar/node/19.8.0/bin/node]
10: 0x10010ae38 Builtins_PromiseFulfillReactionJob [/opt/homebrew/Cellar/node/19.8.0/bin/node]
11: 0x10005a834 Builtins_RunMicrotasks [/opt/homebrew/Cellar/node/19.8.0/bin/node]
12: 0x1000323c4 Builtins_JSRunMicrotasksEntry [/opt/homebrew/Cellar/node/19.8.0/bin/node]
13: 0x1004dec2c v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
14: 0x1004df374 v8::internal::(anonymous namespace)::InvokeWithTryCatch(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
15: 0x100501e48 v8::internal::MicrotaskQueue::RunMicrotasks(v8::internal::Isolate*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
16: 0x100501c78 v8::internal::MicrotaskQueue::PerformCheckpointInternal(v8::Isolate*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
17: 0x100194ae8 node::InternalCallbackScope::Close() [/opt/homebrew/Cellar/node/19.8.0/bin/node]
18: 0x100194ff4 node::InternalMakeCallback(node::Environment*, v8::Local<v8::Object>, v8::Local<v8::Object>, v8::Local<v8::Function>, int, v8::Local<v8::Value>*, node::async_context) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
19: 0x1001b07a8 node::AsyncWrap::MakeCallback(v8::Local<v8::Function>, int, v8::Local<v8::Value>*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
20: 0x1002692ec node::fs::FSReqCallback::Resolve(v8::Local<v8::Value>) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
21: 0x10026acec node::fs::AfterStat(uv_fs_s*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
22: 0x10025c438 node::MakeLibuvRequestCallback<uv_fs_s, void (*)(uv_fs_s*)>::Wrapper(uv_fs_s*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
23: 0x102c4eff0 uv__work_done [/opt/homebrew/Cellar/libuv/1.44.2/lib/libuv.1.dylib]
24: 0x102c523c4 uv__async_io [/opt/homebrew/Cellar/libuv/1.44.2/lib/libuv.1.dylib]
25: 0x102c621e0 uv__io_poll [/opt/homebrew/Cellar/libuv/1.44.2/lib/libuv.1.dylib]
26: 0x102c527bc uv_run [/opt/homebrew/Cellar/libuv/1.44.2/lib/libuv.1.dylib]
27: 0x1001958a8 node::SpinEventLoopInternal(node::Environment*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
28: 0x1002a41ec node::NodeMainInstance::Run(node::ExitCode*, node::Environment*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
29: 0x1002a3f6c node::NodeMainInstance::Run() [/opt/homebrew/Cellar/node/19.8.0/bin/node]
30: 0x10022a830 node::LoadSnapshotDataAndRun(node::SnapshotData const**, node::InitializationResultImpl const*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
31: 0x10022a9b4 node::Start(int, char**) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
32: 0x1a6203f28 start [/usr/lib/dyld]
error Command failed with signal "SIGABRT".
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

This tool is not hands free?

Is it correct that I have to manually click a button to submit the voice prompt to the system? That doesn't sound hands-free to me.

Install instructions could use some clarity

I'm primarily a Windows admin, but love using YakGPT in my browser so I figured it was time to self-host.

I fired up a Mint XFCE VM and got git and yarn installed.
I cloned the repo...
and I'm lost. yarn throws an error saying there are no scenarios.

ENV

ADD ELEVEN LABS KEY TO ENV

Questions: Why .yarn not in .gitignore, production .env files, yarn version

Sorry if these are dumb questions.
I am not used to Yarn and haven't used Vercel so far either.

Thank you for providing this nice frontend! 👍

As for my questions:

Why is .yarn not in .gitignore?
Same for production .env files?

AFAIK, this should not be needed for deployment workflows, especially not on Vercel – but I might be wrong as I don't have any experience with Vercel.

Usually, it should be possible to build during deployment without committing the packages, and to handle build artifacts as well as production .env files during deployment, isolated from the main git repository.

Here, packages aren't committed, but they aren't ignored either?

Or is my configuration strange?

I created a branch in my fork which

  • updates lockfile after installing with newer yarn version
  • adds .yarn to .gitignore
  • adds any and all .env files to .gitignore, not only .env.local

I also added an .nvmrc, but that's probably not a useful change for most people. On my machine, yarn runs via nvm and is only installed in my environment for Node 18.

I did not create a PR because I don't know which changes would break existing workflows or are unwanted. The NVM thing is probably not relevant for most people.

But the things listed as bullet points should, I think, be sensible.

Maybe someone could check if there is a yarn lockfile compatible with both ARM and x64?

Suggestion: Improve TTS responsiveness by sending sentences as they generate (segmentation)

To enhance the user experience and reduce pauses in the TTS playback, consider the following improvements:

  1. Begin TTS playback as soon as the first sentence is generated. Implement a buffer or threshold (e.g., 1 second) to determine when to send the next sentence (or group of sentences ready to send) to the TTS engine.

  2. Analyze punctuation marks and linguistic cues to dynamically group sentences before sending them to the TTS engine. If the second sentence is shorter than the buffer duration or threshold, wait until the next sentence is ready and send them together. Continue this process until all text has been sent to the TTS engine.

  3. Handle edge cases, such as reaching the TTS engine's maximum character limit mid-sentence. In these cases, split the text at the nearest natural break (e.g., comma or period) and send the remaining text in the next TTS request.

This approach will minimize the waiting time for text generation, especially noticeable with GPT-4, and aims to provide a seamless TTS experience without pauses.
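A rough sketch of the grouping logic described above, using character counts as a stand-in for the time-based buffer (illustrative only, not existing YakGPT code; the thresholds are parameters and the sentence split is deliberately naive):

// Sketch of the proposed segmentation: split streamed text into sentences,
// group them until a minimum size is reached, and never exceed the TTS
// engine's per-request character limit (e.g. 2,500 for ElevenLabs non-subscribers).
function segmentForTts(text: string, minChars = 60, maxChars = 2500): string[] {
  // Naive sentence split on ., !, ? followed by optional whitespace.
  const sentences = text.match(/[^.!?]+[.!?]+\s*|[^.!?]+$/g) ?? [];
  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    if (current.length + sentence.length > maxChars) {
      // Flush before exceeding the engine's limit.
      if (current) chunks.push(current.trim());
      current = "";
      if (sentence.length > maxChars) {
        // Split an overlong sentence at the nearest comma as a natural break
        // (a real implementation would loop until the remainder fits).
        const cut = sentence.lastIndexOf(",", maxChars);
        const at = cut > 0 ? cut + 1 : maxChars;
        chunks.push(sentence.slice(0, at).trim());
        current = sentence.slice(at);
        continue;
      }
    }
    current += sentence;
    // Once the buffer is comfortably sized, send it to the TTS engine.
    if (current.length >= minChars) {
      chunks.push(current.trim());
      current = "";
    }
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}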

Please also update the play icon at the bottom left to play all the audio files in sequence.

Remember to consider the character limits of different TTS engines:

  • ElevenLabs: 2,500 characters per request for non-subscribers, 5,000 characters for subscribers
  • Azure (future?): ???

By incorporating these changes, the TTS experience will be smoother and more enjoyable for users. Thoughts?

Dear_expert

Can you add login and registration buttons to this website for people who don't have an API key?
If they register, then they can use it.

nvm 14 fails

Steps to reproduce:

  • nvm use 14.18.0
  • yarn
  • yarn build

Fails to compile due to a ReferenceError.

Updating to Node 16 (via nvm) fixes the issue.

GPT-4 support?

Great project!

I got the impression from Hacker News that this would work out of the box for GPT-4 (I have access to it via the OpenAI website). However, I'm not seeing GPT-4 as one of the models in the dropdown – is this intended?

Cheers

Suggestions

  1. Have the auto classify/summarise conversation request as an optional tickbox – personally, I'd rather not waste credits on it being on (and would prefer to edit the convo title myself).

  2. Have the spoken text captured from the mic go into the text field – not auto-submit – so it can be adjusted in case of any transcription errors or reconsiderations after speaking, rather than trying to re-speak the whole spiel again.

  3. Maybe have the max tokens also visible in the main area, so they can be adjusted more quickly. Maybe add an advanced/basic UI toggle.

  4. With API access, we have the system, assistant and user roles – being able to build 'fake' prior context out of those roles would be useful, as is done in the playground (https://platform.openai.com/playground?mode=chat).

Good stuff tho!

Internationalization

If i18n were implemented, I would be able to share the site with my family, who don't know English.

Bug: BGCard images should have an object-fit inline style

On some viewports, the carousel with images representing characters / preludes has distorted images.

As far as I can see, it's a matter of adding
style={{ objectFit: "cover" }}

or

style={{ objectFit: "contain" }}

to this usage of the Mantine Image component.

There was a direct objectFit property in a previous version of Mantine UI, but it is deprecated.
Of course, you could also add this rule to a style defined with useStyles.
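For reference, the suggested fix applied to a Mantine Image might look like the following sketch (the component name and surrounding props are placeholders, not the project's actual BGCard code):

import { Image } from "@mantine/core";

// Sketch of the suggested fix: constrain the card image with object-fit
// so it no longer distorts on narrow viewports.
export function BGCardImage({ src }: { src: string }) {
  return <Image src={src} alt="" style={{ objectFit: "cover" }} />;
}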

[Dependencies] Does it need to have ~465 dependencies?

Very cool project, but as the title says, I wonder whether it is good practice to have such a long list of dependencies.

Having a large number of dependencies in a Node.js project can have several potential drawbacks:

  • Increased complexity: Each dependency may have its own set of dependencies, leading to a complex web of dependencies that can be difficult to manage and debug. This can make it challenging to understand how different parts of the codebase interact with each other.

  • Slower performance: Each dependency must be loaded and initialized at runtime, which can slow down the performance of the application. This can be particularly problematic on low-power devices or in applications that require high levels of performance.

  • Increased security risks: Every dependency represents a potential security risk, as each one introduces new code that may contain vulnerabilities. This can be especially concerning if the dependencies are not well-maintained or are outdated.

  • Compatibility issues: Dependencies may not always be compatible with each other or with the version of Node.js being used. This can result in difficult-to-diagnose issues that can be time-consuming to resolve.

  • Maintenance burden: Each dependency must be updated and maintained over time to ensure that the project remains secure and functional. This can be a significant burden, especially if the dependencies are poorly documented or have complex APIs.

While having a large number of dependencies is not necessarily a bad thing in and of itself, it's important to carefully evaluate each one to ensure that it is necessary and that its benefits outweigh its potential drawbacks.

Also, I think you forgot the yarn next telemetry disable step in the README -> Installation.

Variables

What are the Docker environment variables to make the API key and chats persistent?

Export/Import conversations ?

Love the app, thank you!

It'd be super helpful to have the ability to back up and import all and/or selected conversations.

Feature Request: Add support for custom APIs

Apart from the official OpenAI API, there are other API providers, such as Azure OpenAI. In some regions where the official OpenAI API is unavailable, users heavily rely on these third-party APIs. Adding support for custom OpenAI keys and domains would play a crucial role in such situations.

There are only a few projects that currently provide this feature, one such example is BetterChatGPT.

TTS support (via ElevenLabs?)

It would be great if you could optionally plug in your ElevenLabs API key and each chat response would be read aloud to you. If you're interested, I can fork and have a go at implementing this.

This model's maximum context length is 4097 tokens.

Full error message: "This model's maximum context length is 4097 tokens. However, you requested 4245 tokens (2980 in the messages, 1265 in the completion). Please reduce the length of the messages or completion."

How do I fix this error, given the "Unlimited" option suggested here? Thanks for the great tool, by the way.
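For what it's worth, the numbers in the error message imply the fix: the messages already use 2,980 tokens, so the completion can be at most 4097 − 2980 = 1117 tokens. Lowering the max-tokens setting to 1117 or less (rather than "Unlimited"), or trimming the conversation history, should avoid the error.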

Close mic when not in use

Presently, once the mic has been used, the mic for this app keeps listening, which shows macOS's orange mic-in-use indicator dot.

Even if YakGPT isn't misusing the data, I can't use the orange indicator to know whether my other apps are using the mic.

Feature Request: Option to delete individual messages in YakGPT GUI

In certain situations, an AI-generated response may be irrelevant or unhelpful to the ongoing conversation. To maintain the context and improve user experience, it would be beneficial to allow users to delete specific messages from the chat history.

Suggestion: Implement a delete button (e.g., an 'X' icon) next to each question or answer in the YakGPT GUI, enabling users to remove individual messages from the conversation easily. This icon can appear next to the 'regenerate' and 'edit' icons respectively.

Feature Request: Add chat export feature

I am not sure how chat is being stored at the moment, but I think the ability to export chat is always a useful feature. As a bare minimum, it would be nice to be able to get a copy-and-pasteable JSON format of chat.
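Since the README notes that all state lives in localStorage, an export could be as simple as the following sketch (the "chats" storage key is an assumption for illustration, not necessarily the app's actual key):

// Hypothetical sketch: dump stored chats to a downloadable JSON file.
// "chats" is an assumed localStorage key, used here only for illustration.
function exportChats(): void {
  const raw = localStorage.getItem("chats") ?? "[]";
  const blob = new Blob([raw], { type: "application/json" });
  const url = URL.createObjectURL(blob);
  const a = document.createElement("a");
  a.href = url;
  a.download = "yakgpt-chats.json";
  a.click();
  URL.revokeObjectURL(url);
}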

Could not read the whole reply when it's long

This is a great project that gives GPT the ability to speak.

But I ran into an issue: when GPT's reply is long (around 200 words), it does not read the final part of the reply.

Has anyone run into a similar issue?

How to reproduce:

Ask it a question like "how to build a repo on GitHub"; you will get a step-by-step guide, and the voice keeps going until step 4 or 5 but never reads to the last word.

[BUG] Microphone input doesn't work

start [_app-eda8b4ac237e4a89.js:50023:251379](http://localhost:3000/_next/static/chunks/pages/_app-eda8b4ac237e4a89.js)
Starting recording... 
Object { _stream: MediaStream, _state: "inactive", _mimeType: "audio/webm", _audioBitsPerSecond: undefined, workerState: "inactive", _wasmPath: "https://cdn.jsdelivr.net/npm/opus-media-recorder@latest/WebMOpusEncoder.wasm", _workerFactory: _workerFactory(e)
, worker: Worker }
[_app-eda8b4ac237e4a89.js:50023:252208](http://localhost:3000/_next/static/chunks/pages/_app-eda8b4ac237e4a89.js)
rendered with audioState recording [_app-eda8b4ac237e4a89.js:50024:19852](http://localhost:3000/_next/static/chunks/pages/_app-eda8b4ac237e4a89.js)
Assertion failed: channel_count > 0 && channel_count <= 2, at: /build/src/ContainerInterface.cpp,22,init [2ad46c9d-0d7c-4935-8c20-88f8ad7846bb:1:21416](blob:http://localhost:3000/2ad46c9d-0d7c-4935-8c20-88f8ad7846bb)
Assertion failed: channel_count > 0 && channel_count <= 2, at: /build/src/ContainerInterface.cpp,22,init [2ad46c9d-0d7c-4935-8c20-88f8ad7846bb:1:21425](blob:http://localhost:3000/2ad46c9d-0d7c-4935-8c20-88f8ad7846bb)
RuntimeError: abort(Assertion failed: channel_count > 0 && channel_count <= 2, at: /build/src/ContainerInterface.cpp,22,init). Build with -s ASSERTIONS=1 for more info. [2ad46c9d-0d7c-4935-8c20-88f8ad7846bb:1:21495](blob:http://localhost:3000/2ad46c9d-0d7c-4935-8c20-88f8ad7846bb)
Stopping recording... submit= true [_app-eda8b4ac237e4a89.js:50023:252308](http://localhost:3000/_next/static/chunks/pages/_app-eda8b4ac237e4a89.js)
rendered with audioState transcribing

I'm using Firefox 112, on Arch Linux.

RuntimeError: abort(Assertion failed: channel_count > 0 && channel_count <= 2, at: /build/src/ContainerInterface.cpp,22,init
