Common Voice is part of Mozilla's initiative to help teach machines how real people speak.

Home Page: https://commonvoice.mozilla.org/

License: Mozilla Public License 2.0

JavaScript 0.88% HTML 0.29% TypeScript 88.76% CSS 9.84% Shell 0.17% Dockerfile 0.06%

open-data voice crowdsourcing internet-freedom

common-voice's Introduction

Common Voice

This is the web app for Mozilla Common Voice, a platform for collecting speech donations in order to create public domain datasets for training voice recognition-related tools.

Upcoming releases

Type	Release Cadence	More info
Platform code & sentences	Monthly, or as needed	Release notes
Dataset	Quarterly	Dataset metadata

How to contribute

🎉 First off, thanks for taking the time to contribute! This project would not be possible without people like you. 🎉

There are many ways to get involved with Common Voice - you don't have to know how to code to contribute!

To add or correct the translation of the web interface, please use the Mozilla localization platform Pontoon. Please note, we do not accept any direct pull requests for changing localization content.
For information on how to add or edit sentences to Common Voice, see SENTENCES.md
For instructions on setting up a local development environment, see DEVELOPMENT.md
For information on how to add a new language to Common Voice, see LANGUAGE.md
For information on how to get in contact with existing language communities, see COMMUNITIES.md

For more general guidance on building your own language community using Mozilla voice tools, please refer to the Mozilla Voice Community Playbook.

Discussion

For general discussion (feedback, ideas, random musings), head to our Discourse Category.

For bug reports or specific feature requests, please use the GitHub issue tracker.

For live chat, join us on Matrix.

Licensing and content source

This repository is released under MPL (Mozilla Public License) 2.0.

Most of the sentence text in /server/data either comes directly from user submissions in our Sentence Collector or is scraped from Wikipedia using our extractor tool. It is released under a CC0 public domain Creative Commons license.

Any files that follow the pattern europarl-VERSION-LANG.txt (such as europarl-v7-de.txt) were extracted with our thanks from the Europarl Corpus, which features transcripts from proceedings in the European parliament.

Citation

If you use the data in a published academic work, we would appreciate it if you cited the following article:

Ardila, R., Branson, M., Davis, K., Henretty, M., Kohler, M., Meyer, J., Morais, R., Saunders, L., Tyers, F. M. and Weber, G. (2020) "Common Voice: A Massively-Multilingual Speech Corpus". Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020). pp. 4211-4215

The BibTeX entry is:

@inproceedings{commonvoice:2020,
  author = {Ardila, R. and Branson, M. and Davis, K. and Henretty, M. and Kohler, M. and Meyer, J. and Morais, R. and Saunders, L. and Tyers, F. M. and Weber, G.},
  title = {Common Voice: A Massively-Multilingual Speech Corpus},
  booktitle = {Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020)},
  pages = {4211--4215},
  year = 2020
}

Cross Browser Testing

This project is tested with BrowserStack.

common-voice's Issues

Need final decision on website URL

Ideally, it would be voice.mozilla.org. But we need to figure out how to get this.

Finalize SOW for Softvision for QA

Figure out the scope of testing, write it into the SOW, and send it to Softvision for a quote.

[Bug] Cannot listen to m4a on desktop, cannot listen to ogg on iphone

We'll need to do client detection and server conversion.

Perhaps one thing we could do is add an extension to the URL, such as /upload/random.m4a, to give the server a hint about what to serve.

Client-side audio conversion would be better, though. We need to see which is easier: converting ogg -> m4a on WebKit, the other way around on everything else, or neither.
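As a rough sketch of the client-detection idea, the function below picks a format the client reports it can play and appends it to the clip URL as a hint for the server. All names here (pickFormat, clipUrl, the MIME table) are hypothetical and not taken from the project's code; the capability check is passed in so the logic can run outside a browser.

```typescript
// Hypothetical sketch, not the project's actual code.
type AudioFormat = "ogg" | "m4a";

// MIME type probed for each format (an assumption; real clips may
// need codec parameters for a reliable answer).
const MIME_TYPES: Record<AudioFormat, string> = {
  ogg: "audio/ogg",
  m4a: "audio/mp4",
};

// `canPlay` has the same shape as HTMLMediaElement.canPlayType, which
// returns "", "maybe", or "probably". Formats are tried in preference order.
function pickFormat(
  canPlay: (mime: string) => string,
  preference: AudioFormat[] = ["ogg", "m4a"]
): AudioFormat | null {
  for (const format of preference) {
    if (canPlay(MIME_TYPES[format]) !== "") {
      return format;
    }
  }
  return null; // the client reports support for neither format
}

// Append the extension so the server knows which encoding to serve.
function clipUrl(basePath: string, format: AudioFormat): string {
  return `${basePath}.${format}`;
}
```

In a browser, the check could be supplied as `(mime) => new Audio().canPlayType(mime)`. The server would still need to convert or store both encodings, which this sketch does not cover.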

JuneActivation release

Investigate iOS wrapper app for mobile website

Is it possible for us to publish a "wrapper" app on the app store, and then change the content later? If so, let's build it!

[iOS App] Don't take up full screen

Right now the app displays underneath the system statusbar (which is transparent). This looks a little weird. Can we make it so that the app displays inside a frame and not underneath the statusbar?

[iOS] We need a splash screen

Basic flow, Wire Frames

Landing Page (main)

Show overall progress towards goal
Big Donate now button

Leaderboard

show top 3
show where you are
show bottom 2

Rewards Page

List any rewards the user has (with unclaimed ones at the top)
Ability to add name for leaderboard, new sentence, login

Contributors

Dynamic list of all users who reached level 2

Voice Donation Page

experience points (voice-coins), perhaps count of how many submissions from your user
Next reward widget - a

Profile Page
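
The Leaderboard view in the wireframe above (top 3, the current user's position, bottom 2) could be assembled roughly as follows. This is only a sketch: the Entry and Row shapes, the points field, and identifying the user by name are all assumptions made for illustration.

```typescript
// Hypothetical sketch, not the project's actual code.
interface Entry {
  name: string;
  points: number;
}

interface Row extends Entry {
  rank: number;   // 1-based position after sorting by points
  isYou: boolean; // true for the current user's row
}

// Returns the top 3 rows, the current user's row, and the bottom 2 rows,
// in rank order and without duplicates.
function leaderboardRows(entries: Entry[], you: string): Row[] {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = entries.slice().sort((a, b) => b.points - a.points);
  const ranked: Row[] = sorted.map((entry, index) => ({
    name: entry.name,
    points: entry.points,
    rank: index + 1,
    isYou: entry.name === you,
  }));
  return ranked.filter(
    (row, index) => index < 3 || index >= ranked.length - 2 || row.isYou
  );
}
```

Filtering a single ranked list keeps the rows ordered and avoids showing the same person twice when the user is already in the top 3 or bottom 2.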

Decision on Hosting new site

Whose budget?
What hosting company?
Security review?

Animation for contributing a single voice

It is important that the submission functionality has a pleasant effect, since we want people to do this often.

Document data collection overall and review with Legal

[iOS] need to show something when offline

Right now we just show a blank screen. I have a feeling this will be a showstopper for the Apple review.

Make final decisions on the specific "stickiness/gamification" functionality

We need a final decision and set of issues/user stories generated on what specific stickiness/gamification functionality we are building for the June Activation release.

@kdavis-mozilla is the final decision-maker

The rest of us will build out options that he can sign off on.

Use usertesting.com to get some rapid feedback on stickiness options

Quickly mock-up (not code) and get feedback on stickiness/gamification options before we make a final choice.

Depends on building out a long-list of options.

[iOS] need to show something while it is loading

Need video or animation to help explain Voice Dataset concept

[Listen Screen] Flow

There are several issues here, and I'm not sure whether I should break them up into separate issues.

The general flow should be the user only needs to press on Yes / No to move on to the next sentence. No separate Submit button.
Instead of the browser sound controls, we should have a Play button, which becomes Repeat after the first time it is pressed.
After submitting, the next sentence should be loaded and the audio played automatically. The goal is to have as few user interactions as possible.
A user should not hear the same (sentence, audio) pair more than once.
There seem to be several types of recorded audio samples:
- Accurate readout of the sentence
- Medium quality, perhaps a word or two are missing
- Nothing was recorded at all
- Background noise
- Something else was recorded entirely, perhaps because the user accidentally pressed the record button and left it on, posing a privacy concern.
- Profanity, or vulgar content that should be flagged.
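
Two of the rules above, never replaying a (sentence, audio) pair and switching the button from Play to Repeat, can be sketched as small pure functions. The Clip shape and all function names are assumptions for illustration only.

```typescript
// Hypothetical sketch, not the project's actual code.
interface Clip {
  sentenceId: string;
  audioId: string;
}

// One key per (sentence, audio) pair.
function pairKey(clip: Clip): string {
  return `${clip.sentenceId}:${clip.audioId}`;
}

// Returns the first clip in the queue whose pair the user has not heard,
// and records it in `heard`. Returns null when nothing new is left.
function nextClip(queue: Clip[], heard: string[]): Clip | null {
  for (const clip of queue) {
    const key = pairKey(clip);
    if (heard.indexOf(key) === -1) {
      heard.push(key);
      return clip;
    }
  }
  return null;
}

// The button reads "Play" until the clip has been played once.
function playButtonLabel(timesPlayed: number): "Play" | "Repeat" {
  return timesPlayed === 0 ? "Play" : "Repeat";
}
```

A Yes / No handler would submit the vote and then call nextClip again, so the user never needs a separate Submit button. Persisting `heard` across sessions is left open here.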

Legal review of technical architecture separating voice data collection and user info

Dependent on the draft of the architecture in issue #13

[iOS] cannot listen to audio files on iPhone

Getting an error from the audio element.

We need a larger corpus of text input

Right now, we only have 3K sentences or so.

Possible new sources are:

Wikimedia text
BYU Corpus
Leipzig Corpora

Text for initial iOS app and website

Home page:
Build the world's most diverse set of voice data that researchers and others can use for free to create better voice technologies for the Internet.

[maybe button here?]
Click 'Record' to start donating your voice.

Your voice donations will be made available for researchers and others to use under a Creative Commons license [https://creativecommons.org/publicdomain/zero/1.0/]. Your name or any other identifying information will not be associated with this voice data.

This project is governed by Mozilla's Privacy Policy [https://mozilla.org/privacy/websites/]

About page:
Project Common Voice is brought to you by Mozilla, the proudly non-profit champions of the Internet.

Today's technologies that allow learning from data are freely available for anyone to use, and are resulting in a wave of innovation online. However, voice technologies (for example, speech recognition) are not seeing the same innovation because little data is freely available to train machine learning technologies. The data that is available is from a set of speakers with limited diversity of accents and languages.

Our aim with Project Common Voice is to enable "voice donors" to build the world's largest and most diverse set of voice data that is freely available for anyone to use. Our vision is that researchers and others will be able to use this data to increase innovation in voice related technologies. This will help everyone have access to a new wave of voice technologies, and ensure that people aren't locked-in to using services from a small number of Internet giants.

First time use incorrectly initializes User

If one goes to the website without ever previously visiting the site, one hits the error

Uncaught TypeError: Cannot read property 'userId' of null
    at User.Component.setState (bundle.js:404)
    at User.restore (bundle.js:487)
    at new User (bundle.js:475)
    at new App (bundle.js:1430)
    at HTMLDocument.<anonymous> (bundle.js:301)

due to restore() incorrectly initializing userId on first time use.
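
The source of restore() is not shown here, so the following is only a guess at the shape of a fix: treat missing or malformed stored state as a first visit and create a fresh user instead of reading userId from null. The storage key, the KeyValueStore interface, and the generateId callback are assumptions made for illustration.

```typescript
// Hypothetical sketch, not the project's actual code.
interface UserState {
  userId: string;
}

// Minimal stand-in for window.localStorage so the sketch runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const USER_KEY = "userState"; // assumed storage key

function restoreUser(store: KeyValueStore, generateId: () => string): UserState {
  const raw = store.getItem(USER_KEY);
  if (raw !== null) {
    try {
      const parsed = JSON.parse(raw);
      // Guard against a stored "null" or an object without a usable userId.
      if (parsed && typeof parsed.userId === "string") {
        return { userId: parsed.userId };
      }
    } catch (err) {
      // Corrupt JSON: fall through and start fresh.
    }
  }
  // First visit (or unusable data): create and persist a new user.
  const fresh: UserState = { userId: generateId() };
  store.setItem(USER_KEY, JSON.stringify(fresh));
  return fresh;
}
```

In the browser, `window.localStorage` satisfies KeyValueStore as written.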

[iOS] API to open app permissions page on Settings

Add voice verification functionality

Users can help us validate voice input. Right now we don't have this scoped, so we would need to do that if this becomes a priority.

This was originally brought up in issue #3.

Legal review of experience we will be submitting to Apple for review

Decision on data architecture/flow

flow-chart to describe data structures and relationships
legal review per issue #12

[iOS] Add a debug box so we can see console logs for ios app

Hire a UX/UI person

Need a UX/UI contractor for roughly 1-2 weeks for interaction and visual design of the responsive website layout.

Add metrics tracking

Bounce rate at various entry points
First time experience funnel

We need a mysql table for tracking user info

We know we at least need the following fields:

email address
userid (generated in the client, separate from collection id)
accent
year of birth
gender
display name
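
On the application side, the fields above might map to a record type like the one below. Which fields are optional, the year bounds, and the validation rules are all assumptions; the actual table definition is still to be decided in this issue.

```typescript
// Hypothetical sketch, not the project's actual schema.
interface UserRecord {
  email: string;
  userId: string;             // generated in the client, separate from collection id
  accent: string | null;      // assumed optional
  yearOfBirth: number | null; // assumed optional
  gender: string | null;      // assumed optional
  displayName: string | null; // assumed optional
}

// Returns a list of problems; an empty list means the record looks usable.
function validateUser(user: UserRecord, currentYear: number): string[] {
  const problems: string[] = [];
  // Require an "@" that is not the first character (a deliberately loose check).
  if (user.email.indexOf("@") < 1) {
    problems.push("email looks invalid");
  }
  if (user.userId === "") {
    problems.push("userId is empty");
  }
  const year = user.yearOfBirth;
  if (
    year !== null &&
    (Math.floor(year) !== year || year < 1900 || year > currentYear)
  ) {
    problems.push("yearOfBirth is out of range");
  }
  return problems;
}
```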

Figure out who is submitting for review to Apple

Finalize decision on license type for the data sets that will be made public

Decision based on advice from legal on the license type for the data sets that will be made public.

[iOS] We need a png to be displayed when the user is offline

Figure out approach to security review of products

Reach out to Selena D

Identifying any legal issues in the current iOS app experience

iOS build to bsmith for legal review and identifying issues - @andrenatal
feedback from bsmith --> leads to issues to incorporate

[iOS app] Disable overscroll

Right now you can drag the content of the webview up and down due to Safari's overscroll ability. Can we disable this in the app?

Trademark clearance search for Project Common Voice

[iOS] Uploading audio is not working

[iOS] uploading is sending "undefined" in place of userid

which is bad because we use userid in the audio file path
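
Whatever the root cause on iOS turns out to be, the server could refuse to build a file path from a missing userid. The sketch below assumes a path layout of upload/USERID/CLIPNAME and an allowed character set for ids; both are illustrative guesses, not the project's actual scheme.

```typescript
// Hypothetical sketch, not the project's actual code.
const SAFE_ID = /^[A-Za-z0-9_-]+$/; // assumed id alphabet

function clipPath(userId: unknown, clipName: string): string {
  if (
    typeof userId !== "string" ||
    !SAFE_ID.test(userId) ||
    userId === "undefined" || // a stringified missing value, as seen in this bug
    userId === "null"
  ) {
    throw new Error("refusing to store clip: missing or invalid userid");
  }
  return `upload/${userId}/${clipName}`;
}
```

Rejecting the literal strings "undefined" and "null" catches the case where the client concatenated a missing value into the request before sending it.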

[Web] If the user denies the permission to record, give instructions for granting it in Settings

Legal review of text on the web app

Finalize production server environment with IT DevOps

Need a production environment on the IT-managed AWS account. Ensure that it is set up so that we can deploy to production there.

[First Time Experience] Need User Flow for FTE

Here's a proposal for a simple first time experience that introduces all the features. Obviously this is still open to suggestions and/or complete rewrites.

User starts at Homepage:
(Text needs to be short and clear, and have call to action.)

Welcome to Voice Commons!
Our goal is to create a Public Domain (CC0) database of voices in every language with any accent. We believe this data can empower universities, researchers, businesses, non-profits (like ourselves), or anyone who is interested in Voice Recognition technology to be creators in this emerging space. You can learn more about Common Voice through our Mission Statement.

Take our guided tour:
Button text could be Lend your Voice or Donate or Try it out

(button takes you to donate screen)

Donate Screen

insert brief description of why we need "labeled" voice clips.

(perhaps we can call out that your data won't be uploaded yet)

User clicks Record, records their voice, and clicks Submit.

(here we could have some sort of nice indication they just contributed)
(submitting their voice clip takes them to the listening screen.)

Listen Screen

Explain that we need to make sure that what everyone says is correct
"Play this sound clip, and tell us if what they said matches the sentence"

(here users could listen to their own clip, or we could have a pre-canned example)

User selects yes or no:
Yes || no

(Whether they say yes or no, move on to registration)

Don't worry, what you just recorded has not been added to our database yet. First, we would like you to take a look at our Privacy Policy.
tl;dr When we publish the voice database, we will strip all personally identifying information about you. But while using this site, you may share as much or as little personal information as you'd like with other users.

To use all the features of this site, you need to register. Again, your email will not be tied to any published voice data.
Registration Form
(perhaps make it clear they can still donate as Guest)