
gustavostz / whisper-clip

67 stars · 2 watchers · 7 forks · 2.58 MB

WhisperClip simplifies your life by automatically transcribing audio recordings and saving the text directly to your clipboard. With just a click of a button, you can effortlessly convert spoken words into written text, ready to be pasted wherever you need it. This application harnesses the power of OpenAI’s Whisper for free.

Home Page: https://openai.com/research/whisper

License: MIT License

Python 100.00%
audio-processing audio-transcription clipboard openai productivity productivity-tools python speech-recognition speech-to-text whisper

whisper-clip's Introduction

WhisperClip: One-Click Audio Transcription

[Demo GIF: Example using WhisperClip]

WhisperClip simplifies your life by automatically transcribing audio recordings and saving the text directly to your clipboard. With just a click of a button, you can effortlessly convert spoken words into written text, ready to be pasted wherever you need it. This application harnesses the power of OpenAI's Whisper for free, making transcription more accessible and convenient.

Table of Contents

  • Features
  • Installation
  • Choosing the Right Model
  • Usage
  • Configuration
  • Feedback
  • Acknowledgments

Features

  • Record audio with a simple click.
  • Automatically transcribe audio using Whisper (free).
  • Option to save transcriptions directly to the clipboard.

Installation

Prerequisites

  • Python 3.8 or higher
  • A CUDA-capable GPU is highly recommended for better performance, but it is not required; WhisperClip can also run on the CPU.

Setting Up the Environment

  1. Clone the repository:

    git clone https://github.com/gustavostz/whisper-clip.git
    cd whisper-clip
    
  2. Install PyTorch if you don't have it already. Refer to PyTorch's website for installation instructions (see the example commands just after this list).

  3. Install the required dependencies:

    pip install -r requirements.txt
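
For reference, PyTorch's install selector produces commands like the ones below. These are examples only; the exact command depends on your OS, package manager, and CUDA version, so prefer whatever command PyTorch's website generates for your setup:

    # CPU-only build
    pip install torch

    # Example CUDA build; check PyTorch's website for the index URL matching your CUDA version
    pip install torch --index-url https://download.pytorch.org/whl/cu121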
    

Choosing the Right Model

Based on your GPU's VRAM, choose the appropriate Whisper model for optimal performance. Below is a table of available models with their required VRAM and relative speed:

Size      Required VRAM    Relative speed
tiny      ~1 GB            ~32x
base      ~1 GB            ~16x
small     ~2 GB            ~6x
medium    ~5 GB            ~2x
large     ~10 GB           1x

For English-only applications, .en models (e.g., tiny.en, base.en) tend to perform better.

To change the model, modify the model_name variable in config.json to the desired model name.
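
WhisperClip uses OpenAI's Whisper library for the actual transcription, and the model name selects which checkpoint is downloaded and loaded. As a rough illustration of what that name controls (not WhisperClip's own code), loading and using a model with the whisper package looks like this:

    import whisper

    # The name corresponds to the model_name value in config.json,
    # e.g. "tiny", "base.en", "small", "medium", or "large".
    model = whisper.load_model("base.en")

    # Transcribe an audio file; result["text"] holds the transcription.
    result = model.transcribe("recording.wav")
    print(result["text"])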

Usage

Run the application:

    python main.py

  • Click the microphone button to start and stop recording.
  • If "Save to Clipboard" is checked, the transcription will be copied to your clipboard automatically.
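
To make the overall flow concrete, here is a minimal, self-contained sketch of the record → transcribe → clipboard pipeline. It is not WhisperClip's actual implementation; it assumes the sounddevice, soundfile, openai-whisper, and pyperclip packages are installed and uses a fixed-length recording for simplicity:

    import sounddevice as sd
    import soundfile as sf
    import whisper
    import pyperclip

    SAMPLE_RATE = 16000      # 16 kHz mono matches Whisper's native sample rate
    DURATION_SECONDS = 5     # fixed-length clip, just for this sketch

    # Record a short clip from the default input device.
    audio = sd.rec(int(DURATION_SECONDS * SAMPLE_RATE), samplerate=SAMPLE_RATE, channels=1)
    sd.wait()

    # Write a temporary WAV file that Whisper can read.
    sf.write("recording.wav", audio, SAMPLE_RATE)

    # Transcribe and copy the result to the clipboard.
    model = whisper.load_model("base.en")
    text = model.transcribe("recording.wav")["text"].strip()
    pyperclip.copy(text)
    print("Copied to clipboard:", text)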

Configuration

  • The default shortcut for toggling recording is Alt+Shift+R. You can modify this in the config.json file.
  • You can also change the Whisper model used for transcription in the config.json file.
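
The exact contents of config.json aren't shown here, so the snippet below is a hypothetical sketch of how such a file might be read and how a shortcut like Alt+Shift+R could be bound as a global hotkey with the third-party keyboard package (WhisperClip's real key names and hotkey mechanism may differ):

    import json
    import keyboard  # third-party package; used here only for illustration

    # Hypothetical config.json:
    # {
    #     "model_name": "base.en",
    #     "shortcut": "alt+shift+r"
    # }
    with open("config.json", "r", encoding="utf-8") as f:
        config = json.load(f)

    def toggle_recording():
        print("Recording toggled")  # placeholder for the real start/stop logic

    # Register the configured shortcut as a global hotkey and keep the script alive.
    keyboard.add_hotkey(config.get("shortcut", "alt+shift+r"), toggle_recording)
    keyboard.wait()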

Feedback

If there's interest in a more user-friendly, executable version of WhisperClip, I'd be happy to consider creating one. Your feedback and suggestions are welcome! Just let me know through the GitHub issues.

Acknowledgments

This project uses OpenAI's Whisper for audio transcription.

whisper-clip's People

Contributors

gustavostz


whisper-clip's Issues

Unable to run normally after clicking the button

I followed the README steps before doing anything else. After running python main.py, the UI with the microphone button pops up, but when I click the microphone I get the following errors:

    python main.py
    Exception in thread Thread-6 (record_audio):
    Traceback (most recent call last):
      File "C:\Users\qaz21\AppData\Local\Programs\Python\Python312\Lib\threading.py", line 1073, in _bootstrap_inner
        self.run()
      File "C:\Users\qaz21\AppData\Local\Programs\Python\Python312\Lib\threading.py", line 1010, in run
        self._target(*self._args, **self._kwargs)
      File "C:\Codes\whisper-clip\audio_recorder.py", line 96, in record_audio
        with sd.InputStream(callback=self.audio_callback):
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\qaz21\AppData\Local\Programs\Python\Python312\Lib\site-packages\sounddevice.py", line 1421, in __init__
        _StreamBase.__init__(self, kind='input', wrap_callback='array',
      File "C:\Users\qaz21\AppData\Local\Programs\Python\Python312\Lib\site-packages\sounddevice.py", line 817, in __init__
        _get_stream_parameters(kind, device, channels, dtype, latency,
      File "C:\Users\qaz21\AppData\Local\Programs\Python\Python312\Lib\site-packages\sounddevice.py", line 2660, in _get_stream_parameters
        info = query_devices(device)
               ^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\qaz21\AppData\Local\Programs\Python\Python312\Lib\site-packages\sounddevice.py", line 569, in query_devices
        raise PortAudioError(f'Error querying device {device}')
    sounddevice.PortAudioError: Error querying device -1
    Exception in Tkinter callback
    Traceback (most recent call last):
      File "C:\Users\qaz21\AppData\Local\Programs\Python\Python312\Lib\tkinter\__init__.py", line 1967, in __call__
        return self.func(*args)
               ^^^^^^^^^^^^^^^^
      File "C:\Codes\whisper-clip\audio_recorder.py", line 54, in toggle_recording
        self.stop_recording()
      File "C:\Codes\whisper-clip\audio_recorder.py", line 69, in stop_recording
        audio_data = np.concatenate(self.recordings)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ValueError: need at least one array to concatenate

The same pair of tracebacks (Thread-7) repeats when the microphone is clicked a second time.
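
The PortAudioError: Error querying device -1 above typically means PortAudio could not find a usable default input device (for example, no microphone is enabled, or microphone access is blocked by the operating system). A minimal sketch for checking which input devices sounddevice can see, independent of WhisperClip's own code:

    import sounddevice as sd

    # List every audio device PortAudio can see, with its index.
    print(sd.query_devices())

    # The current default (input, output) device indices; -1 here would match the error above.
    print("Default devices:", sd.default.device)

    # Verify that a specific input device (index 1 here, chosen only for illustration) is usable.
    sd.check_input_settings(device=1)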

Create a Video Demonstration of WhisperClip Usage

Description

We are looking for a contributor to create a video demonstration showing how to use WhisperClip. The video should cover the installation process, basic usage, and highlight the key features of the application.

Requirements

  • The video should be clear and easy to understand.
  • It should cover the following:
    • Cloning the repository and setting up the environment.
    • Running the application.
    • Starting and stopping audio recording.
    • Enabling the "Save to Clipboard" option and demonstrating its use.
    • Changing the default shortcut and Whisper model in the config.json file (optional).
  • The video can be of any length, as long as it effectively demonstrates how to use WhisperClip.

Reward

We value your contribution! If your video is selected, we will feature it in the README of the WhisperClip GitHub repository. This could be a great way to promote your channel or video, as it will be visible to everyone who visits the project.

Submission

Please submit your video by adding a comment to this issue with a link to the video (e.g., YouTube, Vimeo). We look forward to seeing your submissions!
