
frequensee's Introduction

Where language analysis meets cultural insights


Why

Many of the "most common words" lists I have seen for language learners are based largely on works such as the Brown Corpus, compiled in the 1960s at Brown University in Providence, Rhode Island. But English has changed a lot since then, and much of the world has its own culturally or regionally distinct flavor of English, which makes most lists of common words not only antiquated but sometimes useless or even wasteful.

For example, one list of the supposedly 800 most common words in Armenian translates the English word "parcel" as "ծանրոց", a word rarely used in daily life. A similar list for Hindi translates the English word "carriage" as "गाडी", but this word really means "vehicle"; in daily life in India, a person who means "carriage" would probably say "rickshaw" or simply use the English word "car".

Lists like this develop because most "common words" lists start in the learner's own language instead of the target language. Translators are then forced to approximate each concept in the target language as best they can. Not only is this a very subjective process, but a word being common in the learner's language doesn't mean the same concept is common in the target language.

The bigger and more real challenge for a non-native language learner is that the vocabulary used at home (informal, usually intimate and familial, even colloquial, like "Hey hun, pass the butter!") is often vastly different from the vocabulary used at the workplace (more formal and professional, like "John, the web server just went down again, could you reboot it?"), which is in turn very different from the vocabulary you hear on the news ("Man caught smuggling drugs through airport security"), on TV ("Set phase pistols to stun!"), or in books ("Oh, Mr. Darcy!"). What I mean is that language is extremely context sensitive. The 100 most common words used at home might not get you very far at work or in a public setting. You may not care about learning the names of all the body parts unless you plan on working in the medical field, for example, and you may never hear important workplace vocabulary while listening to the news. So why not target your vocabulary learning to the specific setting that YOU will most likely find yourself in today?

Even more to the point, imagine the most practical list of words, one perfectly suited to you. That list will certainly change over time as you become more fluent and as native speakers start expecting more vocabulary from you. In other words, even the most modern generic list of 1000 words will almost certainly contain only a fraction of the words you really need long term, and yet still be filled with words you may never realistically need.

The purpose of this project is to help you find the vocabulary that you need to work on today. The more specific the text you collect and enter is to the setting you're interested in, be it household, professional, technical, or public forum, the more relevant the statistics will be to the words you need to learn. This lets you adapt your vocabulary to your current context and subject of interest.

How to run it

Download the code and open it in Visual Studio Community Edition or higher, then run it. A small UI lets you paste text into the program, which then produces a word frequency chart and a letter frequency chart.
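To make the output concrete, here is a minimal sketch of the kind of counting described above. It is written in Python purely for illustration and is not the project's actual code; the function names `word_frequencies` and `letter_frequencies` are hypothetical, and the tokenizer is deliberately crude.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word occurs, ignoring case.

    Assumption: a "word" is any run of Unicode letters, digits, or
    underscores. This splits contractions such as "don't" in two.
    """
    words = re.findall(r"\w+", text.lower())
    return Counter(words)


def letter_frequencies(text):
    """Count how often each letter occurs, ignoring case and non-letters."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


sample = "Hey hun, pass the butter! Pass the bread, too."

# Print the three most frequent words and letters in the sample text.
print(word_frequencies(sample).most_common(3))
print(letter_frequencies(sample).most_common(3))
```

The `\w+` pattern handles scripts such as Latin and Armenian, but scripts that rely on combining vowel signs, such as Devanagari, would likely need a tokenizer that keeps those marks attached to their words.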

frequensee's People

Contributors

jakemclelland


frequensee's Issues

dictionary service

Let the user save a configurable dictionary source, such as http://www.nayiri.com/search?l=en&dt=HY_EN&query={word}.

Load the definitions in the background and present them on hover.

Or maybe use Google Translate:
https://translate.google.com/#hy/en/{word}
(look for span id="result_box")

To accommodate other languages in the future, we need an interface that includes the specified language in the URL routing, a method to send the request, and a method to parse the results. Each language should be modular and configurable, so that the next language can be set up from the front end without any code changes, if possible.
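One possible shape for such an interface is sketched below in Python, for illustration only. The class `DictionarySource`, its fields, and the helper `_TextById` are hypothetical names, not part of the project. The sketch assumes each source is described by a URL template containing `{word}` and, optionally, the `id` of the HTML element that holds the result.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import quote
from urllib.request import urlopen

# Elements that never have a closing tag; skipped when tracking nesting.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class _TextById(HTMLParser):
    """Collects the text inside the first element with a given id."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.depth = 0      # nesting depth inside the target element
        self.done = False   # set once the target element has closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif dict(attrs).get("id") == self.element_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


@dataclass
class DictionarySource:
    name: str
    url_template: str                 # must contain the placeholder {word}
    result_id: Optional[str] = None   # id of the element holding the result

    def build_url(self, word):
        """Fill the placeholder with the percent-encoded word."""
        return self.url_template.replace("{word}", quote(word))

    def parse(self, html):
        """Extract the result text, or return the page as-is if no id is set."""
        if self.result_id is None:
            return html
        parser = _TextById(self.result_id)
        parser.feed(html)
        return "".join(parser.chunks).strip()

    def lookup(self, word):
        """Send the request and parse the response (performs network I/O)."""
        with urlopen(self.build_url(word)) as response:
            return self.parse(response.read().decode("utf-8", errors="replace"))


# A source configured purely from data, using the URL given in this issue.
nayiri = DictionarySource(
    name="Nayiri (hy to en)",
    url_template="http://www.nayiri.com/search?l=en&dt=HY_EN&query={word}",
)
print(nayiri.build_url("ծանրոց"))
```

Because a source is just data, new languages could be stored as settings and added from the front end. Note that the Google Translate URL above places the word after `#`, which is never sent to the server, so a plain HTTP request like `lookup` may not return the translated text for that source.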
