
olney1 / chatgpt-openai-smart-speaker


This AI Smart Speaker uses speech recognition and text-to-speech to enable voice-driven conversations and vision capabilities with OpenAI and Agents. The user speaks a prompt into the microphone, and the program sends the prompt to OpenAI to generate a response. The response is then converted to an audio file and played back to the user.

License: MIT License

Python 100.00%
agents ai artificial-intelligence chatgpt gpt-4 langchain langsmith openai smarthome smartspeaker speech-recognition speech-to-text text-to-speech vision

chatgpt-openai-smart-speaker's People

Contributors

benjam23, olney1


chatgpt-openai-smart-speaker's Issues

ReSpeaker 4mic Array for Raspberry Pi is not detected

Install another driver for the ReSpeaker

If you are using the ReSpeaker, you may find that the Raspberry Pi does not recognise the microphone.

You can check whether this is the case with the command below:

arecord -L

If the correct driver is installed, you should see output like this:

pi@raspberrypi:~ $ arecord -L
null
    Discard all samples (playback) or generate zero samples (capture)
jack
    JACK Audio Connection Kit
pulse
    PulseAudio Sound Server
default
playback
ac108
sysdefault:CARD=seeed4micvoicec
    seeed-4mic-voicecard,
    Default Audio Device
dmix:CARD=seeed4micvoicec,DEV=0
    seeed-4mic-voicecard,
    Direct sample mixing device
dsnoop:CARD=seeed4micvoicec,DEV=0
    seeed-4mic-voicecard,
    Direct sample snooping device
hw:CARD=seeed4micvoicec,DEV=0
    seeed-4mic-voicecard,
    Direct hardware device without any conversions
plughw:CARD=seeed4micvoicec,DEV=0
    seeed-4mic-voicecard,
    Hardware device with all software conversions
usbstream:CARD=seeed4micvoicec
    seeed-4mic-voicecard
    USB Stream Output
usbstream:CARD=ALSA
    bcm2835 ALSA
    USB Stream Output

If the output on your Raspberry Pi does not include CARD=seeed4micvoicec, the microphone has not been detected.

In that case, follow the steps below:

Step 1

If you have already installed the wrong driver, I recommend starting over with a fresh installation of Raspberry Pi OS (64GB recommended).
Once that is finished, log in to your Raspberry Pi and get the Seeed voice card source code:

sudo apt-get update
git clone https://github.com/HinTak/seeed-voicecard.git
cd seeed-voicecard
sudo ./install.sh
sudo reboot

Step 2

Select audio output on Raspberry Pi:

sudo raspi-config
# Select 1 System options
# Select S2 Audio
# Select your preferred audio output device (USB if your speaker is connected with a USB cable)
# Select Finish

Step 3

Check that the sound card name looks like this:

pi@raspberrypi:~ $ arecord -L
null
    Discard all samples (playback) or generate zero samples (capture)
jack
    JACK Audio Connection Kit
pulse
    PulseAudio Sound Server
default
playback
ac108
sysdefault:CARD=seeed4micvoicec
    seeed-4mic-voicecard,
    Default Audio Device
dmix:CARD=seeed4micvoicec,DEV=0
    seeed-4mic-voicecard,
    Direct sample mixing device
dsnoop:CARD=seeed4micvoicec,DEV=0
    seeed-4mic-voicecard,
    Direct sample snooping device
hw:CARD=seeed4micvoicec,DEV=0
    seeed-4mic-voicecard,
    Direct hardware device without any conversions
plughw:CARD=seeed4micvoicec,DEV=0
    seeed-4mic-voicecard,
    Hardware device with all software conversions
usbstream:CARD=seeed4micvoicec
    seeed-4mic-voicecard
    USB Stream Output
usbstream:CARD=ALSA
    bcm2835 ALSA
    USB Stream Output

If you get the same results, congrats, you have now finished setting up the ReSpeaker!
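The check in Step 3 can also be scripted. The following is a minimal sketch, assuming arecord is on the PATH; the helper names card_detected and check_respeaker are hypothetical, and the card name is simply the one shown in the output above.

```python
import subprocess

# Card name as it appears in the `arecord -L` output above.
CARD_NAME = "seeed4micvoicec"


def card_detected(arecord_output, card=CARD_NAME):
    """Return True if any line of `arecord -L` output mentions CARD=<card>."""
    return any(f"CARD={card}" in line for line in arecord_output.splitlines())


def check_respeaker():
    """Run `arecord -L` and report whether the ReSpeaker card is listed."""
    result = subprocess.run(
        ["arecord", "-L"], capture_output=True, text=True, check=True
    )
    return card_detected(result.stdout)
```

Calling check_respeaker() on the Raspberry Pi should return True once the driver is installed correctly, and False when the card is missing from the list.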

If you want to test it, you can go through the code that the ReSpeaker maker published on their website:
https://wiki.seeedstudio.com/ReSpeaker_4_Mic_Array_for_Raspberry_Pi/

Error: does not work on a Windows 10 laptop

D:\New folder\ChatGPT-OpenAI-Smart-Speaker-main>python smart_speaker.py
Say something!
result2:
{ 'alternative': [ { 'confidence': 0.79627585,
'transcript': 'same time captain'},
{'transcript': 'same time come on'},
{'transcript': 'same time'},
{'transcript': 'same time okay'},
{'transcript': 'same time cap on'}],
'final': True}
Google Speech Recognition thinks you said same time captain
result2:
{ 'alternative': [ { 'confidence': 0.79627585,
'transcript': 'same time captain'},
{'transcript': 'same time come on'},
{'transcript': 'same time'},
{'transcript': 'same time okay'},
{'transcript': 'same time cap on'}],
'final': True}
This is the prompt being sent to OpenAIsame time captain
james cook was born in marton-in-cleveland, england

1728
'mpg321' is not recognized as an internal or external command,
operable program or batch file.

Error 263 for command:
    open response.mp3
The specified device is not open or is not recognized by MCI.

Error 263 for command:
    close response.mp3
The specified device is not open or is not recognized by MCI.

Failed to close the file: response.mp3
Traceback (most recent call last):
  File "D:\New folder\ChatGPT-OpenAI-Smart-Speaker-main\smart_speaker.py", line 68, in <module>
    main()
  File "D:\New folder\ChatGPT-OpenAI-Smart-Speaker-main\smart_speaker.py", line 65, in main
    play_audio_file()
  File "D:\New folder\ChatGPT-OpenAI-Smart-Speaker-main\smart_speaker.py", line 55, in play_audio_file
    playsound("response.mp3", block=False) # There’s an optional second argument, block, which is set to True by default. Setting it to False makes the function run asynchronously.
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thucn\AppData\Local\Programs\Python\Python311\Lib\site-packages\playsound.py", line 72, in _playsoundWin
    winCommand(u'open {}'.format(sound))
  File "C:\Users\thucn\AppData\Local\Programs\Python\Python311\Lib\site-packages\playsound.py", line 64, in winCommand
    raise PlaysoundException(exceptionMessage)
playsound.PlaysoundException:
Error 263 for command:
open response.mp3
The specified device is not open or is not recognized by MCI.
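
The log above shows two separate failures: mpg321 is not available on this Windows machine, and playsound then fails to open response.mp3 through MCI. As a minimal sketch, not the project's own fix, playback could first check which command-line player is actually installed before calling it. The candidate list (mpg321, mpg123, ffplay) and the helper names find_player and play_audio_file are assumptions chosen for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def find_player(candidates=("mpg321", "mpg123", "ffplay"), which=shutil.which):
    """Return the first candidate player found on PATH, or None if none exist."""
    for name in candidates:
        if which(name):
            return name
    return None


def play_audio_file(path="response.mp3"):
    """Play an MP3 with whichever command-line player is installed."""
    mp3 = Path(path).resolve()  # absolute path, independent of the working directory
    if not mp3.is_file():
        raise FileNotFoundError(f"Audio file not found: {mp3}")
    player = find_player()
    if player is None:
        raise RuntimeError("No command-line MP3 player found on PATH")
    args = [player, str(mp3)]
    if player == "ffplay":
        # ffplay opens a window and keeps running unless told otherwise.
        args = [player, "-nodisp", "-autoexit", str(mp3)]
    subprocess.run(args, check=True)
```

With this guard, a missing player produces one clear error message instead of a shell error followed by a playsound traceback.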

PyAudio IOError: No Default Input Device Available

Hi, I installed all the components on an AWS EC2 instance (Ubuntu 20) and got this error message when I ran "python smart_speaker.py":

ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5701:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM default
ALSA lib confmisc.c:855:(parse_card) cannot find card '0'
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_card_id returned error: No such file or directory
ALSA lib confmisc.c:422:(snd_func_concat) error evaluating strings
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1334:(snd_func_refer) error evaluating name
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5701:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM dmix
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
Traceback (most recent call last):
  File "/root/ChatGPT-OpenAI-Smart-Speaker/smart_speaker.py", line 68, in <module>
    main()
  File "/root/ChatGPT-OpenAI-Smart-Speaker/smart_speaker.py", line 59, in main
    prompt = recognize_speech()
  File "/root/ChatGPT-OpenAI-Smart-Speaker/smart_speaker.py", line 16, in recognize_speech
    with sr.Microphone() as source:
  File "/usr/local/lib/python3.10/dist-packages/speech_recognition/__init__.py", line 99, in __init__
    device_info = audio.get_device_info_by_index(device_index) if device_index is not None else audio.get_default_input_device_info()
  File "/usr/lib/python3/dist-packages/pyaudio.py", line 949, in get_default_input_device_info
    device_index = pa.get_default_input_device

Any idea how I can fix this?
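
One likely cause, offered as an assumption rather than a confirmed diagnosis: a cloud instance such as EC2 normally has no sound card, so ALSA cannot find card 0 and PyAudio has no default input device to give to sr.Microphone(). A minimal sketch for confirming this is to list the capture-capable devices PyAudio can see before opening the microphone. The helper names input_devices and list_input_devices are hypothetical.

```python
def input_devices(device_infos):
    """Keep only devices that can capture audio (at least one input channel)."""
    return [d for d in device_infos if d.get("maxInputChannels", 0) > 0]


def list_input_devices():
    """Query PortAudio through PyAudio and return the capture-capable devices."""
    # Imported lazily so the filter above can be used without PyAudio installed.
    import pyaudio

    audio = pyaudio.PyAudio()
    try:
        infos = [
            audio.get_device_info_by_index(i)
            for i in range(audio.get_device_count())
        ]
    finally:
        audio.terminate()
    return input_devices(infos)
```

If list_input_devices() returns an empty list, there is no microphone for the script to open on that machine, and it would need to run on hardware with an audio input instead.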

Sound not playing with play_audio_file()

Redefine play_audio_file() in smart_speaker.py using pydub

The original code looks like this:

def play_audio_file():
    # play the audio file and wake speaking LEDs
    pixels.speak()
    # os.system("mpg321 response.mp3")
    playsound("response.mp3", block=False) # There’s an optional second argument, block, which is set to True by default. Setting it to False makes the function run asynchronously.

However, we can fix this simply by using the pydub package instead of playsound.

This is the code that works for me:

from pydub import AudioSegment
from pydub.playback import play

def play_audio_file():
    song = AudioSegment.from_mp3("response.mp3")
    play(song)
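
Note that the pydub version above no longer calls pixels.speak(), so the speaking LEDs stay off. A possible variant, sketched under the assumption that pydub and ffmpeg (which pydub needs to decode MP3) are installed, keeps the LED call as an optional hook. The on_start parameter is hypothetical and not part of smart_speaker.py.

```python
from pathlib import Path


def play_audio_file(path="response.mp3", on_start=None):
    """Play an MP3 with pydub, optionally running a hook just before playback."""
    mp3 = Path(path)
    if not mp3.is_file():
        raise FileNotFoundError(f"Audio file not found: {mp3}")

    # Imported lazily so a missing file is reported before pydub is needed.
    from pydub import AudioSegment
    from pydub.playback import play

    if on_start is not None:
        on_start()  # for example, pixels.speak from smart_speaker.py
    play(AudioSegment.from_mp3(str(mp3)))
```

In smart_speaker.py this would be called as play_audio_file(on_start=pixels.speak). Unlike playsound with block=False, pydub's play() returns only after playback has finished.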

Testing Microphone and Speaker

Record sound with Python

Using the code provided on the ReSpeaker website generates some errors.

To run the following examples, clone the https://github.com/respeaker/4mics_hat.git repository to your Raspberry Pi:

git clone https://github.com/respeaker/4mics_hat.git

All the Python scripts mentioned in the examples below can be found inside this repository. To install the necessary dependencies, run the following (the pip3 command must be run from the 4mics_hat repository folder):

sudo apt-get install portaudio19-dev libatlas-base-dev
cd 4mics_hat/ # Do not forget to change to the correct directory
pip3 install -r requirements.txt

We use the PyAudio Python library to record sound. First, run the following script to find the device index of the ReSpeaker:

python3 recording_examples/get_device_index.py

You will see the device ID, as shown below.

Input Device id  2  -  seeed-4mic-voicecard: - (hw:1,0)

To record sound, open the recording_examples/record.py file with nano or another text editor and change RESPEAKER_INDEX = 2 to the index number of the ReSpeaker on your system. Then run the record.py script to make a recording:

python3 recording_examples/record.py
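
For reference, a recording script of this kind can be sketched as below. This is not the contents of record.py from the 4mics_hat repository; it is a minimal illustration using PyAudio and the standard wave module. The function names and the default sample rate, channel count, duration, and chunk size are all assumptions.

```python
import wave


def write_wav(path, frames, channels, sample_width, rate):
    """Write raw PCM chunks to a WAV file."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)  # bytes per sample
        wf.setframerate(rate)
        wf.writeframes(b"".join(frames))


def record(path="output.wav", device_index=2, seconds=5,
           rate=16000, channels=1, chunk=1024):
    """Record from the given input device and save the result as a WAV file."""
    import pyaudio  # imported lazily; needs PortAudio and a real input device

    audio = pyaudio.PyAudio()
    try:
        stream = audio.open(format=pyaudio.paInt16, channels=channels,
                            rate=rate, input=True,
                            input_device_index=device_index,
                            frames_per_buffer=chunk)
        try:
            # Read enough fixed-size chunks to cover the requested duration.
            frames = [stream.read(chunk)
                      for _ in range(int(rate / chunk * seconds))]
        finally:
            stream.stop_stream()
            stream.close()
        sample_width = audio.get_sample_size(pyaudio.paInt16)
    finally:
        audio.terminate()
    write_wav(path, frames, channels, sample_width, rate)
```

The device_index argument plays the same role as RESPEAKER_INDEX: it should match the index printed by get_device_index.py on your system.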

To play the recorded samples you can use aplay:

aplay output.wav # Recorded voice will be saved in the output.wav file

If the sound does not play, see the other issue I published to check whether both the microphone and the speaker are connected properly: #4
