This summer, I was pleased to be selected for Google Summer of Code '21 with the organization Rocket.Chat. My project was to introduce speech-to-text capabilities into Rocket.Chat as a standalone Rocket.Chat App.
Rocket.Chat is a modern team chat and collaboration platform written in full-stack JavaScript, offering a full-featured team chat experience on modern browsers, comparable to Slack and Microsoft Teams. Mobile and desktop clients run on iOS, Android, macOS, Windows, and Linux. It is one of the largest active open source (permissive MIT license) Node.js communications platform communities on GitHub, with more than 12 million users worldwide.
Rocket.Chat users can send audio files or live audio messages to a room or a chat. My project introduces a new native Rocket.Chat app that uses the capabilities of Rocket.Chat.Apps-Engine and lets the user not only transcribe audio files but also save the transcript as metadata. The whole functionality is packaged into a Rocket.Chat App, which can be configured to use the transcription provider of the user's choice.
The deliverables of the project are as follows:
- Transcribe audio files on the server side and store the transcript metadata on demand.
- Create an efficient API to queue and get transcripts.
- Design and develop an interactive app interface.
  - Utilised the Rocket.Chat.Fuselage monorepo for creating app components.
- Provide the user an option to choose from a variety of API providers for transcription. Currently supported providers are:
  - Assembly AI
  - Microsoft Cognitive Services
  - Rev AI
- Write flexible code so that it's easy to add support for a new provider in the future.
All of the listed deliverables were completed within the GSoC period. YAY!!
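The last deliverable, making it easy to add a new provider, can be pictured with a small TypeScript sketch. This is not the app's actual code: `TranscriptionProvider`, `QueueResult`, `ProviderRegistry`, and `EchoProvider` are hypothetical names chosen for illustration, and the two methods simply mirror the "queue/get transcripts" deliverable above. The idea is that each provider implements one shared interface, so supporting another service only means writing one more class.

```typescript
// Illustrative sketch only. All names here are hypothetical and are NOT
// taken from the app's source code.

// What a provider hands back after accepting an audio file for transcription.
interface QueueResult {
    jobId: string;
}

// The contract every transcription provider would implement.
interface TranscriptionProvider {
    readonly name: string;
    // Submit an audio file (by URL) for transcription.
    queueAudio(audioUrl: string): Promise<QueueResult>;
    // Fetch the transcript for a job, or undefined if it is not known.
    getTranscript(jobId: string): Promise<string | undefined>;
}

// Maps the provider name chosen in the app settings to an implementation.
class ProviderRegistry {
    private providers = new Map<string, TranscriptionProvider>();

    register(provider: TranscriptionProvider): void {
        this.providers.set(provider.name.toLowerCase(), provider);
    }

    resolve(name: string): TranscriptionProvider {
        const provider = this.providers.get(name.toLowerCase());
        if (!provider) {
            throw new Error(`Unknown transcription provider: ${name}`);
        }
        return provider;
    }
}

// In-memory stand-in for a real provider, so the sketch runs without
// any network access. A real implementation would call the provider's API.
class EchoProvider implements TranscriptionProvider {
    readonly name = "echo";
    private jobs = new Map<string, string>();
    private counter = 0;

    async queueAudio(audioUrl: string): Promise<QueueResult> {
        this.counter += 1;
        const jobId = `job-${this.counter}`;
        this.jobs.set(jobId, `transcript of ${audioUrl}`);
        return { jobId };
    }

    async getTranscript(jobId: string): Promise<string | undefined> {
        return this.jobs.get(jobId);
    }
}
```

With a layout like this, the rest of the app would only talk to `registry.resolve(settingValue)`, and adding a provider would not touch the calling code.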
The work can be found here: App on GitHub. Please refer to the Readme.md for setup.
This is a standalone app built on Rocket.Chat.Apps-Engine and Rocket.Chat.Apps-Cli, which can be installed for free from the Rocket.Chat Apps marketplace. It is not a part of the core Rocket.Chat but lives inside the codebase. Since I was given maintainer rights, I didn't need to create PRs, so here are the commits that sum up the project.
Commit Link | Description | Status |
---|---|---|
Setup | [New] Instantiated a boilerplate app using Rocket.Chat CLI | Approved |
Webhook API & User settings | [New] Added user settings and created Webhook for provider | Approved |
Message helpers | [New] Helper functions for creating and sending status messages | Approved |
Provider Interface | [New] A new interface that can be implemented whenever an API provider is to be added | Approved |
JWT-implementation | [New] Custom implementation of JWT using crypto module for file security | Approved |
Provider | [Add] Added AssemblyAI support | Approved |
Provider | [Add] Added Microsoft Cognitive Services support | Approved |
Provider | [Add] Added support for Rev AI | Approved |
Test Mode | [New,Improve] Added a new test mode setting and improved error handling | Approved |
Slash command | [Refactor] Refactored slash commands | Approved |
Bug Fixes | [Improve,Refactor] Added error handling layers and comments for readability | Approved |
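The JWT commit in the table mentions a custom implementation built on the crypto module. As a rough idea of what such an implementation can look like, here is a minimal HS256 sketch using Node's built-in `crypto`. It is an assumption-laden illustration, not the app's code: the function names `signToken` and `verifyToken`, the `fileId` claim used in the note below, and the choice of HS256 are all hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "crypto";

// Illustrative sketch only. Function names and the HS256 choice are
// assumptions, not taken from the app's source code.

// Encode bytes as base64url (base64 without padding, URL-safe alphabet).
function toBase64Url(data: Buffer): string {
    return data.toString("base64")
        .replace(/=+$/, "")
        .replace(/\+/g, "-")
        .replace(/\//g, "_");
}

// Decode a base64url string back into bytes.
function fromBase64Url(text: string): Buffer {
    return Buffer.from(text.replace(/-/g, "+").replace(/_/g, "/"), "base64");
}

// Build a signed token of the form header.payload.signature.
function signToken(payload: Record<string, unknown>, secret: string): string {
    const header = toBase64Url(Buffer.from(JSON.stringify({ alg: "HS256", typ: "JWT" }), "utf8"));
    const body = toBase64Url(Buffer.from(JSON.stringify(payload), "utf8"));
    const signature = createHmac("sha256", secret).update(`${header}.${body}`).digest();
    return `${header}.${body}.${toBase64Url(signature)}`;
}

// Return the payload if the signature is valid and the token has not
// expired, otherwise undefined. The signature is always recomputed with
// HMAC-SHA256, whatever the token header claims.
function verifyToken(
    token: string,
    secret: string,
    nowSeconds: number = Math.floor(Date.now() / 1000),
): Record<string, unknown> | undefined {
    const parts = token.split(".");
    if (parts.length !== 3) {
        return undefined;
    }
    const [header, body, signature] = parts;
    const expected = createHmac("sha256", secret).update(`${header}.${body}`).digest();
    const actual = fromBase64Url(signature);
    // timingSafeEqual requires equal lengths, so check that first.
    if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
        return undefined;
    }
    const payload = JSON.parse(fromBase64Url(body).toString("utf8")) as Record<string, unknown>;
    const exp = payload.exp;
    // "exp" is seconds since the epoch; reject tokens at or past expiry.
    if (typeof exp === "number" && nowSeconds >= exp) {
        return undefined;
    }
    return payload;
}
```

One way such a token could serve "file security" is a short-lived token like `signToken({ fileId, exp }, secret)` that the app checks with `verifyToken` before serving an audio file, though how the app actually uses its tokens is not stated here.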
Apart from these contributions, I have contributed to other Rocket.Chat projects. They can be summarized as:
Project | Reference |
---|---|
Rocket.Chat - main repo | Pull requests |
Rocket.Chat.Fuselage | Pull requests |
I would like to thank my mentors Marcelo Schmidt & Douglas Gubert for helping me reach this milestone and for giving me this opportunity.
I have not only learned about the Open Source culture and how to write good code but also learned how the industry works and the import
I would also like to thank Priya Bihani for her constant support.
Student | Kartik Gupta |
---|---|
Organization | Rocket.Chat |
Project | Speech To Text |
GitHub | Kartik18g |
Kartik Gupta | |
Gkaartik | |
[email protected] |