
yakgpt's Introduction

YakGPT

A simple, locally running ChatGPT UI that makes your text generation faster and chatting even more engaging!

Features

  • GPT-3.5 & GPT-4 via the OpenAI API
  • Speech-to-Text via Azure & OpenAI Whisper
  • Text-to-Speech via Azure & Eleven Labs
  • Runs locally in your browser – no need to install any applications
  • Faster than the official UI – connect directly to the API
  • Easy mic integration – no more typing!
  • Use your own API key – ensure your data privacy and security
  • Data submitted via the API is not used for training and is stored for only 30 days
  • All state stored locally in localStorage – no analytics or external service calls
  • Access it at https://yakgpt.vercel.app or run it locally!

Note that GPT-4 API access is needed to use GPT-4; GPT-3.5 is enabled for all users.

Screenshots

[Screenshots: Mobile · Voice Mode · Light Theme · Dark Theme]

🚀 Getting Started

Visit YakGPT to try it out without installing, or follow these steps to run it locally:

Prerequisites

You'll need the following tools installed on your computer to run YakGPT locally.

  • Git
  • Yarn (or npm or pnpm)
  • Any modern web browser like Google Chrome, Mozilla Firefox, or Microsoft Edge

Installation

  1. Clone the repository:
$ git clone https://github.com/yakGPT/YakGPT.git
  2. Install dependencies, build the bundle, and run the server:
$ yarn
$ yarn build
$ yarn start

Then navigate to http://localhost:3000

Congratulations! 🎉 You are now running YakGPT locally on your machine.

🔑 API Key Configuration

To use YakGPT, you'll need an OpenAI API key. The app should prompt you to enter your key.

Add to .env.local (⚠️ Local use only)

If you want the keys to persist across app builds, you can add them to .env.local. Note that NEXT_PUBLIC_ variables are embedded into the client bundle at build time, which is why this is for local use only.

$ echo "NEXT_PUBLIC_OPENAI_API_KEY=<your-open-ai-key-here>" > .env.local
$ echo "NEXT_PUBLIC_11LABS_API_KEY=<your-eleven-labs-key-here>" >> .env.local

🐳 Docker

To use the pre-built Docker image from Docker Hub (only for amd64), run:

$ docker run -it -p 3000:3000 yakgpt/yakgpt:latest

To build the Docker image yourself (such as if you're on arm64), run:

$ docker build -t yakgpt:latest .
$ docker run -it -p 3000:3000 yakgpt:latest

🎤 Microphone Integration

YakGPT makes chatting a breeze with its microphone integration! Activate your microphone using your browser's permissions, and YakGPT will automatically convert your speech into text.

You can also toggle the mic integration as needed by clicking on the microphone icon in the app.

Remember to use a supported web browser and ensure your microphone is functioning properly.
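For the curious, browser speech-to-text generally follows the pattern below. This is a minimal sketch, not YakGPT's actual code; the `whisper-1` model name and the transcription endpoint are OpenAI's public API, while everything else (timing, format) is illustrative.

// Sketch: record a short clip in the browser and transcribe it with OpenAI Whisper.
async function recordAndTranscribe(apiKey: string): Promise<string> {
  // Ask the browser for mic access (this triggers the permission prompt).
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  const chunks: Blob[] = [];
  recorder.ondataavailable = (e) => chunks.push(e.data);

  // Record for 5 seconds, then stop (a real app stops on a user action).
  recorder.start();
  await new Promise((resolve) => setTimeout(resolve, 5000));
  recorder.stop();
  await new Promise((resolve) => (recorder.onstop = resolve));
  stream.getTracks().forEach((t) => t.stop()); // release the mic so the in-use indicator turns off

  // Send the audio to OpenAI's transcription endpoint.
  const form = new FormData();
  form.append("file", new Blob(chunks, { type: recorder.mimeType }), "audio.webm");
  form.append("model", "whisper-1");
  const res = await fetch("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });
  const { text } = await res.json();
  return text;
}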

🛡️ Data Privacy and Security

YakGPT ensures your data privacy and security by letting you use your own API key. Your conversations take place directly between your browser and OpenAI's API, with no intermediary servers.
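Concretely, "no intermediary servers" means the browser calls OpenAI's chat completions endpoint itself, roughly like the sketch below (the general pattern, not the app's exact code):

// Sketch: calling OpenAI directly from the browser with a user-supplied key.
// The key never touches any server other than OpenAI's.
async function chat(apiKey: string, userMessage: string): Promise<string> {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "gpt-3.5-turbo",
      messages: [{ role: "user", content: userMessage }],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}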

📃 License

This project is licensed under the MIT License – see the LICENSE file for details.

🙌 Acknowledgments

  • OpenAI for building such amazing models and making them cheap as chips.
  • Mantine UI – just an all-around amazing UI library.
  • opus-media-recorder – A real requirement for me was being able to walk and talk. OpenAI's Whisper API is unable to accept the audio generated by Safari, so I fell back to WAV recording, which, due to the lack of compression, makes things incredibly slow on mobile networks. opus-media-recorder saved my butt by allowing cross-platform compressed audio recording via web worker magic. 🤗

Got feedback, questions or ideas? Feel free to submit an issue!

yakgpt's People

Contributors

clueed, dev-msp, francescor, gbro3n, luoxi, mbotsu, pabl-o-ce, paulchiu, poteat, rannmann, rbalsick, rikhuijzer, strlns, thelinuxkid, yakgpt


yakgpt's Issues

Feature Request: Add support for custom API endpoints

Besides the official OpenAI API, there are also other providers like Azure OpenAI. In some regions where the official API is not accessible, users heavily rely on these third-party APIs. Adding support for custom OpenAI keys and URLs would play a crucial role in such scenarios.

There are only a handful of projects that currently offer this feature; one such example is BetterChatGPT.
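A hypothetical sketch of what such support could look like (the environment variable name is invented for illustration; YakGPT does not currently read it):

// Hypothetical: derive the chat endpoint from a configurable base URL,
// falling back to the official API when no override is set.
const baseUrl =
  process.env.NEXT_PUBLIC_OPENAI_BASE_URL ?? "https://api.openai.com";

export function chatCompletionsUrl(): string {
  return `${baseUrl}/v1/chat/completions`;
}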

I apologize for any confusion caused in the previous issue. I mistakenly referred to Azure's Speech API as its OpenAI API.

Error when editing API key

When I view the edit modal, I get this:

[screenshot of the edit modal]

If I hit Save without making any changes, it gives an error because the key isn't properly formatted (since it's a redacted version of what is saved). Maybe we can disable the Save button unless the field has changed?
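One way the suggested fix could look in React (a sketch only; the component and prop names are hypothetical, not YakGPT's actual code):

import { useState } from "react";

// Hypothetical sketch: only enable Save once the field differs from the
// stored value, so a redacted key can't be accidentally re-saved.
function ApiKeyField({ savedKey, onSave }: { savedKey: string; onSave: (key: string) => void }) {
  const [value, setValue] = useState(savedKey);
  const dirty = value !== savedKey;
  return (
    <>
      <input value={value} onChange={(e) => setValue(e.target.value)} />
      <button disabled={!dirty} onClick={() => onSave(value)}>Save</button>
    </>
  );
}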

Better support for custom prompts

Is it possible to work on a better way to add custom prompts?

For example, instead of having avatar images, provide an input or text area where you can put the custom prompts you want to test?

Install instructions in README fail on Windows 11

This tool looks amazing, thanks for all the work you've put in. Just to let you know, the install instructions might be failing for some people. I suspect it's just that I don't have the dependencies set up on my machine. I don't know much about yarn / npm as it's not an ecosystem I work with, so I haven't attempted to debug.

Yarn works.

yarn build results in:


info  - Collecting page data
info  - Creating an optimized production build ...TypeError: Cannot read properties of null (reading 'useRef')
    at exports.useRef (C:\...\yakGPT\node_modules\react\cjs\react.production.min.js:25:337)
> Build error occurred
Error: Export encountered errors on following paths:
        /
        /_error: /404
        /_error: /500
    at c:\...\yakGPT\node_modules\next\dist\export\index.js:425:19
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Span.traceAsyncFn (c:\...\yakGPT\node_modules\next\dist\trace\trace.js:79:20)
    at async c:\...\yakGPT\node_modules\next\dist\build\index.js:1422:21
    at async Span.traceAsyncFn (c:\...\yakGPT\node_modules\next\dist\trace\trace.js:79:20)
    at async c:\...\yakGPT\node_modules\next\dist\build\index.js:1280:17
    at async Span.traceAsyncFn (c:\...\yakGPT\node_modules\next\dist\trace\trace.js:79:20)
    at async Object.build [as default] (c:\...\yakGPT\node_modules\next\dist\build\index.js:73:29)
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

Conflicting dependencies with `npm install`

After running npm install, I got the following error messages about conflicting dependencies:

npm ERR! code ERESOLVE
npm ERR! ERESOLVE could not resolve
npm ERR!
npm ERR! While resolving: [email protected]
npm ERR! Found: @mantine/core@6.0.4
npm ERR! node_modules/@mantine/core
npm ERR!   @mantine/core@"^6.0.2" from the root project
npm ERR!   peer @mantine/core@"6.0.4" from @mantine/carousel@6.0.4
npm ERR!   node_modules/@mantine/carousel
npm ERR!     @mantine/carousel@"^6.0.4" from the root project
npm ERR!
npm ERR! Could not resolve dependency:
npm ERR! @mantine/notifications@"^6.0.2" from the root project
npm ERR!
npm ERR! Conflicting peer dependency: @mantine/core@6.0.2
npm ERR! node_modules/@mantine/core
npm ERR!   peer @mantine/core@"6.0.2" from @mantine/notifications@6.0.2
npm ERR!   node_modules/@mantine/notifications
npm ERR!     @mantine/notifications@"^6.0.2" from the root project
npm ERR!
npm ERR! Fix the upstream dependency conflict, or retry
npm ERR! this command with --force or --legacy-peer-deps
npm ERR! to accept an incorrect (and potentially broken) dependency resolution.
npm ERR!
npm ERR!
npm ERR! For a full report see:
npm ERR! C:\Users\me\AppData\Local\npm-cache\_logs\2023-03-31T07_02_31_381Z-eresolve-report.txt

npm ERR! A complete log of this run can be found in:
npm ERR!     C:\Users\me\AppData\Local\npm-cache\_logs\2023-03-31T07_02_31_381Z-debug-0.log

I am using Node.js v18.15.0 with npm v9.5.0.

Support for local LlamaIndex indexes and text-based settings for how the AI should behave

Excellent, thank you for creating the UI.
Personally, I thought it would be very useful to have a tab on the settings screen that allows selecting an index created with LlamaIndex, with the content of the selected index reflected in the conversation.
So, create a script called /YakGPT/pages/api/llama_index.ts and paste in the following contents.

import { NextApiRequest, NextApiResponse } from "next";
// NOTE: "llama_index" follows the Python LlamaIndex API as sketched by the
// issue author; there is no official npm package with these exports.
import { GPTTreeIndex, SimpleDirectoryReader } from "llama_index";

async function handler(req: NextApiRequest, res: NextApiResponse) {
  const { query, indexName } = req.body;

  // Build an index with LlamaIndex
  const documents = new SimpleDirectoryReader("data").load_data();
  const index = GPTTreeIndex.from_documents(documents);

  // Define a custom prompt
  function custom_prompt(query: string, text_name: string, chapter: string, page_number: string) {
    return (
      "You always tell the questioner the truth in a very clear and polite manner, " +
      "but you also believe in making the questioner think by posing a question when it is generally better to do so, or when there is no instruction not to. " +
      "In doing so, you stay close to the questioner by summarizing lengthy information. You also always try to clarify page sources and provide clear answers." +
      `\n\n${query}\nPage source: ${text_name}, chapter - ${chapter}, page - ${page_number}`
    );
  }

  // Set the custom prompt as the index's default prompt
  index.set_default_prompt(custom_prompt);

  // Answer the question using the index
  const answer = await index.query(query, { mode: "default" });
  res.status(200).json({ answer });
}

export default handler;

Next, I tried to implement it by adding the following code to /YakGPT/components/SettingsModal.tsx, but it was too difficult for me.

<Tabs.Panel value="index"> {/* New tab */}
  <Box style={{ paddingTop: "20px" }}>
    <Group position="apart" style={{ alignItems: "center" }}>
      Select an index:
      <Select
        data={[
          { value: "index1", label: "Index 1" },
          { value: "index2", label: "Index 2" },
          // Add other index options here
        ]}
        value={selectedIndex}
        onChange={(value) => handleIndexChange(value)}
        menuPortalTarget={document.body}
        styles={{
          menuPortal: (base: any) => ({ ...base, zIndex: 9999 }),
        }}
        transition="skew-up"
        transitionDuration={200}
      />
    </Group>
  </Box>
</Tabs.Panel>

If you don't mind, it would be helpful if you could implement this feature so that the prompts for how the AI should behave can be customized from text in the settings.

Is YakGPT a little more expensive in some way?

When I compared different GitHub projects that use an OpenAI API key, I found that they are not the same in terms of GPT-4 usage costs.

It seems that YakGPT costs a little more, basically around $0.09 per question, and is relatively stable. A simpler program fluctuates between $0.04 and $0.09.

I'm curious: their questions and answers are close in word count, so what accounts for the different API costs across different open-source projects?

Microphone initialization fails in voice-to-text functionality on Windows

Issue description:
The voice-to-text functionality in the application is experiencing a complete failure to initialize the microphone interface on my Windows 10 Pro computer using Google Chrome. Instead of displaying the large microphone icon in the text input area as expected, a spinning circle appears and persists indefinitely. I have waited up to a minute without success. Initially, the feature worked for about ten uses, but stopped functioning after I refreshed the page or performed a similar action. Interestingly, the functionality still works correctly on my Android device. I restarted my computer and still have the issue.

Steps to reproduce the issue:

  1. Open the application (https://yakgpt.vercel.app/) in Google Chrome on a Windows 10 Pro computer. Ensure both Chrome and Windows are up to date.
  2. Click on the microphone icon to initiate the voice-to-text functionality.
  3. Observe the spinning circle appearing in the text input area, which persists indefinitely instead of displaying the large microphone icon.
Expected behavior:
The large microphone icon should appear almost instantly after clicking on the initial microphone icon to initiate the voice-to-text functionality.

Observed behavior:
The large microphone icon fails to appear, and the spinning circle continues seemingly forever, rendering the voice-to-text feature non-functional on Windows. However, it still works on Android devices.

Additional information:
I have already checked the application and Whisper API documentation (I skimmed it), as well as the README.txt file, for potential solutions without finding any relevant information. Any guidance or assistance in resolving this problem would be highly appreciated.

Thank you for your attention.

Yarn build fails on macOS with no helpful error message

yarn build

yarn run v1.22.10
$ next build
/opt/homebrew/Cellar/node/19.8.0/bin/node[79944]: ../src/module_wrap.cc:599:MaybeLocal<v8::Promise> node::loader::ImportModuleDynamically(Local<v8::Context>, Local<v8::Data>, Local<v8::Value>, Local<v8::String>, Local<v8::FixedArray>): Assertion `(it) != (env->id_to_function_map.end())' failed.
 1: 0x100262100 node::Abort() [/opt/homebrew/Cellar/node/19.8.0/bin/node]
 2: 0x1002620e4 node::Abort() [/opt/homebrew/Cellar/node/19.8.0/bin/node]
 3: 0x10022583c node::loader::ImportModuleDynamically(v8::Local<v8::Context>, v8::Local<v8::Data>, v8::Local<v8::Value>, v8::Local<v8::String>, v8::Local<v8::FixedArray>) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
 4: 0x1004f9d0c v8::internal::Isolate::RunHostImportModuleDynamicallyCallback(v8::internal::MaybeHandle<v8::internal::Script>, v8::internal::Handle<v8::internal::Object>, v8::internal::MaybeHandle<v8::internal::Object>) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
 5: 0x10089b5dc v8::internal::Runtime_DynamicImportCall(int, unsigned long*, v8::internal::Isolate*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
 6: 0x1000b7524 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvInRegister_NoBuiltinExit [/opt/homebrew/Cellar/node/19.8.0/bin/node]
 7: 0x10015c49c Builtins_CallRuntimeHandler [/opt/homebrew/Cellar/node/19.8.0/bin/node]
 8: 0x100034064 Builtins_InterpreterEntryTrampoline [/opt/homebrew/Cellar/node/19.8.0/bin/node]
 9: 0x10006a8b4 Builtins_AsyncFunctionAwaitResolveClosure [/opt/homebrew/Cellar/node/19.8.0/bin/node]
10: 0x10010ae38 Builtins_PromiseFulfillReactionJob [/opt/homebrew/Cellar/node/19.8.0/bin/node]
11: 0x10005a834 Builtins_RunMicrotasks [/opt/homebrew/Cellar/node/19.8.0/bin/node]
12: 0x1000323c4 Builtins_JSRunMicrotasksEntry [/opt/homebrew/Cellar/node/19.8.0/bin/node]
13: 0x1004dec2c v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
14: 0x1004df374 v8::internal::(anonymous namespace)::InvokeWithTryCatch(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
15: 0x100501e48 v8::internal::MicrotaskQueue::RunMicrotasks(v8::internal::Isolate*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
16: 0x100501c78 v8::internal::MicrotaskQueue::PerformCheckpointInternal(v8::Isolate*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
17: 0x100194ae8 node::InternalCallbackScope::Close() [/opt/homebrew/Cellar/node/19.8.0/bin/node]
18: 0x100194ff4 node::InternalMakeCallback(node::Environment*, v8::Local<v8::Object>, v8::Local<v8::Object>, v8::Local<v8::Function>, int, v8::Local<v8::Value>*, node::async_context) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
19: 0x1001b07a8 node::AsyncWrap::MakeCallback(v8::Local<v8::Function>, int, v8::Local<v8::Value>*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
20: 0x1002692ec node::fs::FSReqCallback::Resolve(v8::Local<v8::Value>) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
21: 0x10026acec node::fs::AfterStat(uv_fs_s*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
22: 0x10025c438 node::MakeLibuvRequestCallback<uv_fs_s, void (*)(uv_fs_s*)>::Wrapper(uv_fs_s*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
23: 0x102c4eff0 uv__work_done [/opt/homebrew/Cellar/libuv/1.44.2/lib/libuv.1.dylib]
24: 0x102c523c4 uv__async_io [/opt/homebrew/Cellar/libuv/1.44.2/lib/libuv.1.dylib]
25: 0x102c621e0 uv__io_poll [/opt/homebrew/Cellar/libuv/1.44.2/lib/libuv.1.dylib]
26: 0x102c527bc uv_run [/opt/homebrew/Cellar/libuv/1.44.2/lib/libuv.1.dylib]
27: 0x1001958a8 node::SpinEventLoopInternal(node::Environment*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
28: 0x1002a41ec node::NodeMainInstance::Run(node::ExitCode*, node::Environment*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
29: 0x1002a3f6c node::NodeMainInstance::Run() [/opt/homebrew/Cellar/node/19.8.0/bin/node]
30: 0x10022a830 node::LoadSnapshotDataAndRun(node::SnapshotData const**, node::InitializationResultImpl const*) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
31: 0x10022a9b4 node::Start(int, char**) [/opt/homebrew/Cellar/node/19.8.0/bin/node]
32: 0x1a6203f28 start [/usr/lib/dyld]
error Command failed with signal "SIGABRT".
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

This tool is not hands free?

Is it correct that I have to manually click a button to submit the voice prompt to the system? That doesn't sound hands-free to me.

Install instructions could use some clarity

I'm primarily a Windows admin, but love using YakGPT in my browser so I figured it was time to self-host.

I fired up a Mint XFCE VM and got git and yarn installed.
I cloned the repo...
and I'm lost. yarn throws an error saying there are no scenarios.

ENV

ADD ELEVEN LABS KEY TO ENV

Questions: Why .yarn not in .gitignore, production .env files, yarn version

Sorry if these are dumb questions.
I am not used to Yarn and haven't used Vercel so far either.

Thank you for providing this nice frontend! 👍

As for my questions:

Why is .yarn not in .gitignore?
Same for production .env files?

AFAIK, this should not be needed for deployment workflows, especially not on Vercel – but I might be wrong as I don't have any experience with Vercel.

Usually, it should be possible to build during deployment without committing the packages, and to handle build artifacts as well as production .env files during deployment, isolated from the main git repository.

Here, packages aren't committed, but they aren't ignored either?

Or is my configuration strange?

I created a branch in my fork which

  • updates lockfile after installing with newer yarn version
  • adds .yarn to .gitignore
  • adds any and all .env files to .gitignore, not only .env.local

I also added an .nvmrc, but that's probably not a useful change for most people. On my machine, yarn runs via nvm and is only installed in my environment for Node 18.

I did not create a PR because I don't know which changes would break existing workflows or are unwanted. The NVM thing is probably not relevant for most people.

But the things listed as bullet points should, I think, be sensible.

Maybe someone could check if there is a yarn lockfile compatible with both ARM and x64?

Suggestion: Improve TTS responsiveness by sending sentences as they generate (segmentation)

To enhance the user experience and reduce pauses in the TTS playback, consider the following improvements:

  1. Begin TTS playback as soon as the first sentence is generated. Implement a buffer or threshold (e.g., 1 second) to determine when to send the next sentence (or group of sentences ready to send) to the TTS engine.

  2. Analyze punctuation marks and linguistic cues to dynamically group sentences before sending them to the TTS engine. If the second sentence is shorter than the buffer duration or threshold, wait until the next sentence is ready and send them together. Continue this process until all text has been sent to the TTS engine.

  3. Handle edge cases, such as reaching the TTS engine's maximum character limit mid-sentence. In these cases, split the text at the nearest natural break (e.g., comma or period) and send the remaining text in the next TTS request.

This approach will minimize the waiting time for text generation, especially noticeable with GPT-4, and aims to provide a seamless TTS experience without pauses.
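A rough sketch of the grouping logic described above, using character counts as a stand-in for the time-based buffer (illustrative only, not existing YakGPT code; the thresholds are parameters and the sentence split is deliberately naive):

// Sketch of the proposed segmentation: split streamed text into sentences,
// group them until a minimum size is reached, and never exceed the TTS
// engine's per-request character limit (e.g. 2,500 for ElevenLabs non-subscribers).
function segmentForTts(text: string, minChars = 60, maxChars = 2500): string[] {
  // Naive sentence split on ., !, ? followed by optional whitespace.
  const sentences = text.match(/[^.!?]+[.!?]+\s*|[^.!?]+$/g) ?? [];
  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    if (current.length + sentence.length > maxChars) {
      // Flush before exceeding the engine's limit.
      if (current) chunks.push(current.trim());
      current = "";
      if (sentence.length > maxChars) {
        // Split an overlong sentence at the nearest comma as a natural break
        // (a real implementation would loop until the remainder fits).
        const cut = sentence.lastIndexOf(",", maxChars);
        const at = cut > 0 ? cut + 1 : maxChars;
        chunks.push(sentence.slice(0, at).trim());
        current = sentence.slice(at);
        continue;
      }
    }
    current += sentence;
    // Once the buffer is comfortably sized, send it to the TTS engine.
    if (current.length >= minChars) {
      chunks.push(current.trim());
      current = "";
    }
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}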

Please also update the play icon at the bottom left to play all the audio files in sequence.

Remember to consider the character limits of different TTS engines:

  • ElevenLabs: 2,500 characters per request for non-subscribers, 5,000 characters for subscribers
  • Azure (future?): ???

By incorporating these changes, the TTS experience will be smoother and more enjoyable for users. Thoughts?

Dear_expert

Can you add login and registration buttons to this website for people who don't have an API key?
If they register, then they can use it.

nvm 14 fails

Steps to reproduce:

  • nvm use 14.18.0
  • yarn
  • yarn build

Fails to compile due to a ReferenceError.

Updating to Node 16 (via nvm) fixes the issue.

GPT-4 support?

Great project!

I got the impression from Hacker News that this would work out of the box for GPT-4 (I have access to it via the OpenAI website). However, I'm not seeing GPT-4 as one of the models in the dropdown – is this intended?

Cheers

Suggestions

  1. Have the auto classify/summarise conversation request as an optional tickbox – personally, I'd rather not waste credits on it being on (and would prefer to edit the convo title myself).

  2. Have the spoken text captured from the mic go into the text field – not auto-submit – so it can be adjusted in case of any transcription errors or reconsiderations after speaking, rather than trying to re-speak the whole spiel again.

  3. Maybe have the max tokens also visible in the main area, so they can be adjusted more quickly. Maybe add an advanced/basic UI toggle.

  4. With API access, we have the system, assistant and user roles – being able to build 'fake' prior context out of those roles would be useful, as is done in the playground (https://platform.openai.com/playground?mode=chat).

Good stuff tho!

Internationalization

If i18n were implemented, I would be able to share the site with my family, who don't know English.

Bug: BGCard images should have an object-fit inline style

On some viewports, the carousel with images representing characters / preludes has distorted images.

As far as I can see, it's a matter of adding
style={{ objectFit: "cover" }}

or

style={{ objectFit: "contain" }}

to this usage of the Mantine Image component.

There was a direct objectFit property in a previous version of Mantine UI, but it is deprecated.
Of course, you could also add this rule to a style defined with useStyles.
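For reference, the suggested fix applied to a Mantine Image might look like the following sketch (the component name and surrounding props are placeholders, not the project's actual BGCard code):

import { Image } from "@mantine/core";

// Sketch of the suggested fix: constrain the card image with object-fit
// so it no longer distorts on narrow viewports.
export function BGCardImage({ src }: { src: string }) {
  return <Image src={src} alt="" style={{ objectFit: "cover" }} />;
}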

[Dependencies] Does it need to have ~465 dependencies?

Very cool project, but as the title says, I wonder whether it is good practice to have such a long list of dependencies.

Having a large number of dependencies in a Node.js project can have several potential drawbacks:

  • Increased complexity: Each dependency may have its own set of dependencies, leading to a complex web of dependencies that can be difficult to manage and debug. This can make it challenging to understand how different parts of the codebase interact with each other.

  • Slower performance: Each dependency must be loaded and initialized at runtime, which can slow down the performance of the application. This can be particularly problematic on low-power devices or in applications that require high levels of performance.

  • Increased security risks: Every dependency represents a potential security risk, as each one introduces new code that may contain vulnerabilities. This can be especially concerning if the dependencies are not well-maintained or are outdated.

  • Compatibility issues: Dependencies may not always be compatible with each other or with the version of Node.js being used. This can result in difficult-to-diagnose issues that can be time-consuming to resolve.

  • Maintenance burden: Each dependency must be updated and maintained over time to ensure that the project remains secure and functional. This can be a significant burden, especially if the dependencies are poorly documented or have complex APIs.

While having a large number of dependencies is not necessarily a bad thing in and of itself, it's important to carefully evaluate each one to ensure that it is necessary and that its benefits outweigh its potential drawbacks.

Also, I think you forgot the yarn next telemetry disable step in the README -> Installation.

Variables

What are the Docker environment variables to make the API key and chats persistent?

Export/Import conversations ?

Love the app, thank you!

It'd be super helpful to have the ability to back up and import all and/or selected conversations.

Feature Request: Add support for custom APIs

Apart from the official OpenAI API, there are other API providers, such as Azure OpenAI. In some regions where the official OpenAI API is unavailable, users heavily rely on these third-party APIs. Adding support for custom OpenAI keys and domains would play a crucial role in such situations.

There are only a few projects that currently provide this feature, one such example is BetterChatGPT.

TTS support (via ElevenLabs?)

It would be great if you could optionally plug in your ElevenLabs API key and each chat response would be read aloud to you. If you're interested, I can fork and have a go at implementing this.

This model's maximum context length is 4097 tokens.

Full error message: "This model's maximum context length is 4097 tokens. However, you requested 4245 tokens (2980 in the messages, 1265 in the completion). Please reduce the length of the messages or completion."

How do I fix this error, given the "Unlimited" option suggested here? Thanks for the great tool, by the way.
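For what it's worth, the numbers in the error message imply the fix: the messages already use 2,980 tokens, so the completion can be at most 4097 − 2980 = 1117 tokens. Lowering the max-tokens setting to 1117 or less (rather than "Unlimited"), or trimming the conversation history, should avoid the error.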

Close mic when not in use

Presently, once the mic has been used, the mic for this app keeps listening, which shows macOS's orange mic-in-use indicator dot.

Even if YakGPT isn't misusing the data, I can't use the orange indicator to know whether my other apps are using the mic.

Feature Request: Option to delete individual messages in YakGPT GUI

In certain situations, an AI-generated response may be irrelevant or unhelpful to the ongoing conversation. To maintain the context and improve user experience, it would be beneficial to allow users to delete specific messages from the chat history.

Suggestion: Implement a delete button (e.g., an 'X' icon) next to each question or answer in the YakGPT GUI, enabling users to remove individual messages from the conversation easily. This icon can appear next to the 'regenerate' and 'edit' icons respectively.

Feature Request: Add chat export feature

I am not sure how chat is being stored at the moment, but I think the ability to export chat is always a useful feature. As a bare minimum, it would be nice to be able to get a copy-and-pasteable JSON format of chat.
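Since the README notes that all state lives in localStorage, an export could be as simple as the following sketch (the "chats" storage key is an assumption for illustration, not necessarily the app's actual key):

// Hypothetical sketch: dump stored chats to a downloadable JSON file.
// "chats" is an assumed localStorage key, used here only for illustration.
function exportChats(): void {
  const raw = localStorage.getItem("chats") ?? "[]";
  const blob = new Blob([raw], { type: "application/json" });
  const url = URL.createObjectURL(blob);
  const a = document.createElement("a");
  a.href = url;
  a.download = "yakgpt-chats.json";
  a.click();
  URL.revokeObjectURL(url);
}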

Could not read the whole reply when it's long

This is a great project that gives GPT the ability to speak.

But I ran into an issue: when GPT's reply is long (around 200 words), it does not read the final part of the reply.

Has anyone run into a similar issue?

How to reproduce:

Ask it a question like "how to build a repo on GitHub"; you will get a step-by-step guide, and the voice keeps going until step 4 or 5 but never reads to the last word.

[BUG] Microphone input doesn't work

start [_app-eda8b4ac237e4a89.js:50023:251379](http://localhost:3000/_next/static/chunks/pages/_app-eda8b4ac237e4a89.js)
Starting recording... 
Object { _stream: MediaStream, _state: "inactive", _mimeType: "audio/webm", _audioBitsPerSecond: undefined, workerState: "inactive", _wasmPath: "https://cdn.jsdelivr.net/npm/opus-media-recorder@latest/WebMOpusEncoder.wasm", _workerFactory: _workerFactory(e)
, worker: Worker }
[_app-eda8b4ac237e4a89.js:50023:252208](http://localhost:3000/_next/static/chunks/pages/_app-eda8b4ac237e4a89.js)
rendered with audioState recording [_app-eda8b4ac237e4a89.js:50024:19852](http://localhost:3000/_next/static/chunks/pages/_app-eda8b4ac237e4a89.js)
Assertion failed: channel_count > 0 && channel_count <= 2, at: /build/src/ContainerInterface.cpp,22,init [2ad46c9d-0d7c-4935-8c20-88f8ad7846bb:1:21416](blob:http://localhost:3000/2ad46c9d-0d7c-4935-8c20-88f8ad7846bb)
Assertion failed: channel_count > 0 && channel_count <= 2, at: /build/src/ContainerInterface.cpp,22,init [2ad46c9d-0d7c-4935-8c20-88f8ad7846bb:1:21425](blob:http://localhost:3000/2ad46c9d-0d7c-4935-8c20-88f8ad7846bb)
RuntimeError: abort(Assertion failed: channel_count > 0 && channel_count <= 2, at: /build/src/ContainerInterface.cpp,22,init). Build with -s ASSERTIONS=1 for more info. [2ad46c9d-0d7c-4935-8c20-88f8ad7846bb:1:21495](blob:http://localhost:3000/2ad46c9d-0d7c-4935-8c20-88f8ad7846bb)
Stopping recording... submit= true [_app-eda8b4ac237e4a89.js:50023:252308](http://localhost:3000/_next/static/chunks/pages/_app-eda8b4ac237e4a89.js)
rendered with audioState transcribing

I'm using Firefox 112, on Arch Linux.

RuntimeError: abort(Assertion failed: channel_count > 0 && channel_count <= 2, at: /build/src/ContainerInterface.cpp,22,init
