
node-tjbotlib's Introduction

TJBot Library

Node.js library that encapsulates TJBot's capabilities: seeing, listening, speaking, shining, and waving.

This library can be used to create your own recipes for TJBot.

Some of TJBot's capabilities require IBM Cloud services. For example, seeing is powered by the IBM Watson Visual Recognition service. Speaking and listening are powered by the IBM Watson Text to Speech and IBM Watson Speech to Text services.

To use these services, you will need to sign up for a free IBM Cloud account, create instances of the services you need, and download the authentication credentials.

Usage

  1. Install the library using npm.
$ npm install --save tjbot

💡 Note: The TJBot library was developed for use on Raspberry Pi. It may be possible to develop and test portions of this library on other Linux-based systems (e.g. Ubuntu), but this usage is not officially supported.

  2. Import the TJBot library.

TJBot is packaged as both an ES6 and a CommonJS module (explained in this guide), which means you may import it using either the ES6 import statement or the CommonJS require method.

For ES6, import TJBot as follows:

import TJBot from 'tjbot';

For CommonJS, import TJBot as follows:

const TJBot = require('tjbot').default;

💡 Note: For CommonJS, the TJBot class is exported under a .default reference.

  3. Instantiate the TJBot object.
const tj = new TJBot();
tj.initialize([TJBot.HARDWARE.LED_NEOPIXEL, TJBot.HARDWARE.SERVO, TJBot.HARDWARE.MICROPHONE, TJBot.HARDWARE.SPEAKER]);

This code will configure your TJBot with an LED (Neopixel), servo, microphone, and speaker. The default configuration of TJBot uses English as the main language with a male voice. Here is an example of a TJBot that speaks with a female voice in Japanese:

const tj = new TJBot({ 
    robot: { 
        gender: TJBot.GENDERS.FEMALE 
    }, 
    speak: { 
        language: TJBot.LANGUAGES.SPEAK.JAPANESE 
    }
});

IBM Watson Credentials

If you are using IBM Watson services, store your authentication credentials in a file named ibm-credentials.env. Credentials may be downloaded from the page for your service instance, in the section named "Credentials."

If you are using multiple IBM Watson services, you may combine all of the credentials together in a single file.

The file ibm-credentials.sample.env shows a sample of how credentials are stored.
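For orientation, credentials files downloaded from IBM Cloud follow the IBM SDK's `SERVICE_NAME_PROPERTY` naming convention. The snippet below is an illustrative sketch only — the placeholder values and the us-south endpoints are examples, not your real credentials; use the values downloaded from your own service instances:

```
# ibm-credentials.env (illustrative placeholders, two services combined)
SPEECH_TO_TEXT_APIKEY=your-stt-apikey
SPEECH_TO_TEXT_URL=https://api.us-south.speech-to-text.watson.cloud.ibm.com
TEXT_TO_SPEECH_APIKEY=your-tts-apikey
TEXT_TO_SPEECH_URL=https://api.us-south.text-to-speech.watson.cloud.ibm.com
```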

💡 Note: You may also specify the path to the credentials file in the TJBot constructor using the credentialsFile argument. For example, const tj = new TJBot({}, "/home/pi/my-credentials.env").

Hardware Configuration

The entire list of hardware devices supported by TJBot is defined in TJBot.HARDWARE and includes CAMERA, LED_NEOPIXEL, LED_COMMON_ANODE, MICROPHONE, SERVO, and SPEAKER. Each of these hardware devices may be configured by passing in configuration options to the TJBot constructor as follows.

const configuration = {
    log: {
        level: 'info', // valid levels are 'error', 'warn', 'info', 'verbose', 'debug', 'silly'
    },
    robot: {
        gender: TJBot.GENDERS.MALE, // see TJBot.GENDERS
    },
    converse: {
        assistantId: undefined, // placeholder for Watson Assistant's assistantId
    },
    listen: {
        microphoneDeviceId: 'plughw:1,0', // plugged-in USB card 1, device 0; see 'arecord -l' for a list of recording devices
        inactivityTimeout: -1, // -1 to never time out or break the connection; set this to a value in seconds (e.g. 120) to end the connection after 120 seconds of silence
        backgroundAudioSuppression: 0.4, // should be in the range [0.0, 1.0] indicating how much audio suppression to perform
        language: TJBot.LANGUAGES.LISTEN.ENGLISH_US, // see TJBot.LANGUAGES.LISTEN
    },
    wave: {
        servoPin: 7, // default pin is GPIO 7 (physical pin 26)
    },
    speak: {
        language: TJBot.LANGUAGES.SPEAK.ENGLISH_US, // see TJBot.LANGUAGES.SPEAK
        voice: undefined, // use a specific voice; if undefined, a voice is chosen based on robot.gender and speak.language
        speakerDeviceId: 'plughw:0,0', // plugged-in USB card 0, device 0; see 'aplay -l' for a list of playback devices
    },
    see: {
        confidenceThreshold: 0.6,
        camera: {
            height: 720,
            width: 960,
            verticalFlip: false, // flips the image vertically, may need to set to 'true' if the camera is installed upside-down
            horizontalFlip: false, // flips the image horizontally, should not need to be overridden
        },
        language: TJBot.LANGUAGES.SEE.ENGLISH_US,
    },
    shine: {
        // see https://pinout.xyz for a pin diagram
        neopixel: {
            gpioPin: 18, // default pin is GPIO 18 (physical pin 12)
            grbFormat: false // if false, the RGB color format will be used for the LED; if true, the GRB format will be used
        },
        commonAnode: {
            redPin: 19, // default red pin is GPIO 19 (physical pin 35)
            greenPin: 13, // default green pin is GPIO 13 (physical pin 33)
            bluePin: 12 // default blue pin is GPIO 12 (physical pin 32)
        }
    }
};
const tj = new TJBot(configuration);

Capabilities

TJBot has a number of capabilities that you can use to bring it to life. Capabilities are combinations of hardware and Watson services that enable TJBot's functionality. For example, "listening" is a combination of having a microphone and the speech_to_text service. Internally, the _assertCapability() method checks that your TJBot is configured with the right hardware and services before it performs an action that depends on a capability. Thus, the method used to make TJBot listen, tj.listen(), first checks that your TJBot has been configured with a microphone and the speech_to_text service.
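The capability-assertion pattern can be sketched in a few lines of plain Node.js. This is a hypothetical illustration, not the actual tjbotlib source — the requirement table below covers only three capabilities, and the function and object names are invented for the example:

```javascript
// Hypothetical sketch of an _assertCapability()-style check:
// each capability maps to the hardware and services it requires.
const CAPABILITY_REQUIREMENTS = {
    listen: { hardware: ['microphone'], services: ['speech_to_text'] },
    speak: { hardware: ['speaker'], services: ['text_to_speech'] },
    wave: { hardware: ['servo'], services: [] },
};

function assertCapability(capability, hardware, services) {
    const req = CAPABILITY_REQUIREMENTS[capability];
    if (req === undefined) {
        throw new Error(`unknown capability: ${capability}`);
    }
    for (const h of req.hardware) {
        if (!hardware.includes(h)) {
            throw new Error(`TJBot is not configured with a ${h}, which is required to ${capability}`);
        }
    }
    for (const s of req.services) {
        if (!services.includes(s)) {
            throw new Error(`TJBot is not configured with the ${s} service, which is required to ${capability}`);
        }
    }
}

// a bot configured with a microphone and speech_to_text may listen
assertCapability('listen', ['microphone'], ['speech_to_text']);
```

Throwing early like this gives the user a clear configuration error instead of a failure deep inside a hardware or API call.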

The full list of capabilities can be accessed programmatically via TJBot.CAPABILITIES, the full list of hardware components via TJBot.HARDWARE, and the full list of Watson services via TJBot.SERVICES.

TJBot API

Please see the API docs for documentation of the TJBot API.

💡 Please see the Migration Guide for guidance on migrating your code to the latest version of the TJBot API.

Tests

TJBotLib uses the Jest framework for basic testing of the library. These tests may be run from the tjbotlib directory using npm:

npm test

The tests run by this command cover only basic functionality of the library. A separate set of tests (see below) covers hardware-specific behaviors. These tests also do not cover functionality provided by Watson services.

A suite of hardware tests exists in the main TJBot repository in the tests directory.

Contributing

We encourage you to make enhancements to this library and contribute them back to us via a pull request.

License

This project uses the Apache License Version 2.0 software license.

node-tjbotlib's People

Contributors

caskale, hiqdare, jimbotel, jweisz, leandrodvd, marmotae, victordibia


node-tjbotlib's Issues

Speech Queue

Deliberate on how TJBot handles speech and queuing.
Currently, any request to speak is discarded if the bot is already speaking. The upside is that the bot doesn't start saying things that are no longer useful or important once the user has moved on from that context.
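The queueing alternative can be sketched in plain Node.js. This is a hypothetical design sketch, not tjbotlib code: each speak request is chained onto a promise tail, so requests play in order instead of being discarded.

```javascript
// Hypothetical speech queue sketch: chain each request onto a
// promise tail so nothing is dropped while the bot is speaking.
class SpeechQueue {
    constructor(speakFn) {
        this.speakFn = speakFn; // async function that resolves when playback ends
        this.tail = Promise.resolve();
    }

    enqueue(text) {
        // each request waits for everything queued before it
        this.tail = this.tail.then(() => this.speakFn(text));
        return this.tail;
    }
}

// demo with a stand-in speak function that records playback order
const played = [];
const queue = new SpeechQueue(async (text) => { played.push(text); });
queue.enqueue('hello');
queue.enqueue('world');
```

A design like this would also make it easy to add the "discard stale requests" behavior back as a policy (e.g. cap the queue length) rather than hard-coding it.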

Allow for custom URLs to pass to the SDK in tjbot library

The TJBot library hard-codes the URL https://stream.watsonplatform.net/speech-to-text/api. In cases where clients are operating services outside the US region, it is necessary to change the URL to reach the service. For example, Speech to Text services created in the Sydney region need to use the endpoint https://gateway-syd.watsonplatform.net/speech-to-text/api. Please enable custom URLs to be passed to the SDK through this library.
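One way the fix could be threaded through is sketched below. This is a hypothetical illustration, not library code — the speechToTextUrl option name is invented for the example; only the two endpoint URLs come from this issue:

```javascript
// Hypothetical sketch: resolve the STT endpoint from configuration,
// falling back to the currently hard-coded default.
const DEFAULT_STT_URL = 'https://stream.watsonplatform.net/speech-to-text/api';

function resolveSttUrl(config) {
    // config.speechToTextUrl is an illustrative option name
    return (config && config.speechToTextUrl) || DEFAULT_STT_URL;
}

// a Sydney-region caller would override the default:
resolveSttUrl({ speechToTextUrl: 'https://gateway-syd.watsonplatform.net/speech-to-text/api' });
```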

Redesign "capabilities"

We might want to rework the whole "capabilities" part of tjbotlib to automatically detect which hardware is connected, rather than having it be user-specified. That said, we might not be able to do this in all cases (e.g. is there a way to tell if an LED or servo is connected to the GPIO pins? How do we detect a microphone? A speaker?)

Add Support for Cathode LEDs

We support common anode and NeoPixel LEDs, but not common cathode LEDs. Adding cathode support would be helpful, since NeoPixel LEDs are currently more difficult to work with than cathode LEDs.

Support keyword synonyms for the robot name

Due to transcription errors (e.g. "watson" transcribed as "whats on"), support a list of synonyms that are frequently transcribed for a given keyword.
Extra credit: support local keyword detection (e.g. snowboy)
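A minimal matching sketch, assuming a hypothetical synonym list (the mis-transcriptions below are illustrative; only "whats on" comes from this issue):

```javascript
// Hypothetical sketch: treat common mis-transcriptions of the
// robot name as keyword hits when scanning a transcript.
const ROBOT_NAME = 'watson';
const SYNONYMS = ['whats on', 'what son', 'watts on']; // illustrative list

function containsRobotName(transcript) {
    const t = transcript.toLowerCase();
    return [ROBOT_NAME, ...SYNONYMS].some((k) => t.includes(k));
}
```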

First contribution: refactoring and improvements

I forked it and changed some items, and I will submit a pull request. If my changes are going in the right direction, I will continue maintaining and improving it. I decided to change these points to make this lib more flexible to grow.

Changelog:

  • TJBot now has two setup parameters: credentials and config. To use the same logic, config.js and the samples were refactored to use these two objects and override what is necessary.
  • Moved TJBot.prototype.defaultConfiguration to config.js, making it easy to access the list of parameters and set up TJ.
  • Now config.js can hold all service parameters. If you don't want to use a service, just leave user and password blank.
  • For development purposes (to minimize the Raspberry Pi dependency), the conversation service was tested on a Mac using SOX in place of ALSA (it works, but the speaker player throws an exception). So tests were moved to Ubuntu using ALSA; it works fine with small adjustments. Readme updated.
  • Added config.microphone for microphone hardware setup (card/device). It is necessary in some situations.
  • Auto-choose voice based on language and gender: instead of setting a specific voice manually, during initialization TJ fetches the available TTS voices and chooses one based on the preferred language and gender.

Discussion on what to do with IBM cloud services in tjbotlib

Given the churn we've seen in the IBM cloud sdk and now the incorporation of foundation model support into ibm-watson-machine-learning (with a corresponding WML service required in IBM cloud), does it make sense to:

  • continue providing wrappers inside tjbotlib around the Watson APIs? or
  • remove all Watson API stuff from tjbotlib and push it into the recipes, leaving tjbotlib to only abstract the hardware interface (e.g. led, mic, speaker, camera, servo)

How do I keep subsequent tj commands from running into each other?

I am having TJBot speak two sentences and then listen for a user response.
The issue I am having is that TJBot speaks the first sentence and then only one or two words of the second sentence before it is cut off. I observed that the code continues without waiting for the text-to-speech audio to finish playing.
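If speak() returns a promise that resolves when playback finishes (as it does in recent promise-based versions of the library), awaiting each call serializes the audio. The sketch below uses a stand-in async speak function rather than real hardware, so the overall pattern is the point, not the stub:

```javascript
// Stand-in for an async speak() that resolves when playback finishes.
function speak(text) {
    return new Promise((resolve) => setTimeout(() => resolve(text), 10));
}

async function saySentences(log) {
    // awaiting each call means the second sentence cannot start
    // until the first has finished playing
    log.push(await speak('First sentence.'));
    log.push(await speak('Second sentence.'));
    return log;
}
```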

Drag and drop interface for TJBot

Recipes/modules that tie in well with the tjbot lib but allow drag-and-drop programming for TJBot. Good candidates include integration with Node-RED for the mid-level skill tier, and integration with ScratchX for the beginner/entry level.
Some initial Node-RED work has already been done here.

possible tj.listen() issue

This might be a generic issue with using Node.js or a problem with tj.listen(). Here's my scenario. I have a conversation feedback loop: listen(), converse(), speak(), and repeat.
This is all handled through a callback in the listen command.
However, I need to briefly pull out of this feedback loop to play a mini-game with tjbot where I orchestrate all of the commands (listen, speak, wave, and shine) and then return to this feedback loop when completed. What I am finding is that when I stop listening and start listening from another function, the original feedback loop starts processing the utterance instead of the function I had just called.

For example:

// main feedback loop
tj.listen(function(msg) {
    tj.pauseListening();
    tj.converse(function(resp) {
        if (resp == "play game") {
            playGameA();
        }
        tj.resumeListening();
    });
});

function playGameA() {
    tj.stopListening();
    tj.speak("What is your favorite color?");
    tj.listen(function(msg) { // this command is being picked up by the main feedback loop above
        if (msg == "red") {
            doSomething();
        }
    });
}

tjbot listen doesn't resume after hitting the 100MB limit

When I get the error "Payload exceeds the 104857600 bytes limit", I believe it is expected that tjbot will automatically restart the mic and STT connection and keep listening. This is not happening, though.

I found a possible solution: adding self.stopListening(); to the error handler of _micTextStream inside the listen method.

This way, I believe the process running the microphone recording will stop and all the resources become free for a mic restart (the code for the mic restart seems to be already in place).

I'll try to send a pull request with this proposed fix.

Left handed TJ

Hi there. I've recently assembled a TJBot, and after running some recipes that use the arm, I found that no matter how I place the servo, the movements are wrong.
By swapping the default servo positions I got it right. The defaults are:

TJBot.prototype._SERVO_ARM_BACK = 500;
TJBot.prototype._SERVO_ARM_UP = 1400;
TJBot.prototype._SERVO_ARM_DOWN = 2300;

Swapping BACK and DOWN solves the problem, since the arm is on the left side of the bot.
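The swap can be expressed as a small helper. This is a sketch, not library code — the pulse-width values come from the defaults quoted in this issue, and the function name is invented:

```javascript
// Default right-handed pulse widths, from the issue above.
const SERVO_DEFAULTS = { back: 500, up: 1400, down: 2300 };

// Mirror the arm for a left-handed build: BACK and DOWN trade
// places, UP stays the same.
function leftHandedPositions(p) {
    return { back: p.down, up: p.up, down: p.back };
}
```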

Event Logging

Allow an application to provide a function that is called whenever an event occurs within the tjbot library. Events in this case can be "listen", "see", "wave", "speak", "conversation turn", etc.

An application can then choose to use this, for example to visualize the underlying process the bot is engaging in.
See here for an example of such a visualization.

Support sound prompts and prosody

Support prompts that indicate the state of TJBot (e.g. beeps that indicate the bot is listening or has stopped listening).

Prosody: allow customization of the way TJBot speaks, e.g. expressing surprise, a higher pitch, etc.
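Watson Text to Speech accepts a subset of SSML in the synthesized text, including the prosody element, so this could potentially be layered on without library changes. A hedged sketch — the helper name is invented, and whether markup is passed through unmodified depends on how the library forwards text to the service:

```javascript
// Hypothetical helper: wrap text in SSML prosody markup that a
// Watson TTS-backed speak call could synthesize.
function withProsody(text, pitch, rate) {
    return `<prosody pitch="${pitch}" rate="${rate}">${text}</prosody>`;
}

// e.g. an excited, higher-pitched utterance
const excited = withProsody('I can see you!', '+20%', 'fast');
```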

apikey services

Update the library to use an apikey instead of username/password.
I already have this solved, as well as the integration with the Node-RED node; just give me permission to submit my new branch and the pull request :)

Return "result" in SHINE and WAVE methods

If you check out the shine and wave methods in the Node.js library, they do not return a result. Therefore, the nodes don't output anything.

https://github.com/ibmtjbot/tjbotlib/blob/master/lib/tjbot.js#L905
https://github.com/ibmtjbot/tjbotlib/blob/master/lib/tjbot.js#L1369

Transposing them into Node-RED (as done by Jeancarl Bisson) means that these nodes have no output and so cannot be chained in sequence. If I have to WAVE or SHINE to draw a person's attention before SEEing something, the only way is to introduce delay nodes and manage parallel flows... not a comfortable solution.

Is it possible to "fix" the TJBot library by introducing a return value?
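One possible shape for the fix, sketched outside the library: have shine()/wave() resolve with a small result object so downstream Node-RED nodes can chain on completion. The result fields below are illustrative, and the hardware write is replaced by a no-op:

```javascript
// Hypothetical sketch: a shine() that resolves with a result
// object instead of returning nothing.
function shine(color) {
    // ...the hardware LED write would happen here...
    return Promise.resolve({ action: 'shine', color: color, success: true });
}
```

A Node-RED node wrapping this could then emit the resolved object as its output message, letting SHINE → SEE flows chain without delay nodes.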

Mobile App for Wifi Setup and Remote Control

An app that enables users to connect to and set up their TJBot.

  • Launch Bluetooth and listen for incoming connections to set up TJBot

    • Connect with a mobile app via Bluetooth
    • Provide wifi credentials to the Pi via the mobile app
    • Send the Pi's IP address to the mobile app
  • Allow controlling the bot from the mobile app

    • Setting up Watson credentials
    • Hardware control ... wave, change LED color
    • Capability control ... e.g. play music, text to speech

tjbot.js is looking for watson modules in the wrong location

The file node_modules/tjbot/lib/tjbot.js is trying to include Watson modules from:
import AssistantV2 from 'ibm-watson/assistant/v2.js'. It appears that a directory (dist) is missing from the path. I think it should be: import AssistantV2 from 'ibm-watson/dist/assistant/v2.js'. This holds true for all of the Watson imports (6).

Feature: TJBot Startup Script

A bootstrap script that prepares a stock rpi with the stuff it needs to β€œbecome” tjbot β€” e.g. installing / configuring alsa, blacklisting certain kernel modules, checking out tjbot code, performing hardware tests to make sure the user hooked up everything on the right pins etc.

  • Install/configure ALSA

  • Install/configure latest Node.js and npm

  • Blacklist certain kernel modules (sound/LED interference problem)

  • Check out TJBot code

  • Perform hardware tests: servo, LED, mic, speaker

  • Also check the Raspberry Pi version; RPi v2 needs some additional installation steps

Upgrade nodejs, npm, node-gyp, and gcc:
https://github.com/audstanley/NodeJs-Raspberry-Pi
nodejs/node-gyp#809
https://community.thinger.io/t/starting-with-the-raspberry-pi/36

Integration

Is it possible to integrate and run two or more modules together at once (Tone Analyzer & TTS/STT)?

How to specify a custom classifier with tj.see or tj.recognizeObjectsInPhoto

I've trained a custom classifier in my Watson Visual Recognition service, but I am not sure of the best way to have my tjbot code pick up this classifier when using tj.see or tj.recognizeObjectsInPhoto. Both of those use the "generic" classifiers and not the custom model I have trained in my Visual Recognition service.

Any help, tips?

Update versions of pigpio and rpi-ws281x-native?

I got my tjbot as a gift last year, but didn't get around to playing with him until now. My 11-year-old nephew constructed and wired him without assistance. What an awesome experience for him. I, however, had a little more difficulty due to my late start. :-)

It would appear that the version of node installed with Raspbian Buster won't properly build the pigpio or rpi-ws281x-native modules. In other recipes I tried, I was able to update package.json to reference the latest versions of these, but I can't do that in the tjbot bootstrap module. I was able to finagle it by building them separately and copying them into the node_modules folder, but that defeats the point of bootstrapping.

Just wanted to make this recommendation on behalf of any new tjbot users that might come along. I appreciate the work you guys have put into this. It's really cool and a great learning tool for him and me.

Thanks a ton!
