sonalgan / bhasha

Bhasha: a deep learning web app for multilingual text detection. It detects 10 Indic languages and English from text. The model was trained on Azure ML Studio and is deployed on Heroku using Docker. It achieves over 80% accuracy and is built with TensorFlow, Keras, Flask, and Docker.

License: MIT License

Languages: Python 39.42%, HTML 13.76%, CSS 45.69%, Dockerfile 1.13%
Topics: deeplearning, language-detection, neural-network, nlp, azure, docker

Bhasha Web App: Indic Languages Detection from Text

Bhasha Web App is a deep learning-based web application that detects Indian languages in a given text. The model predicts the language of the input text with over 80% accuracy. The training and testing data were sourced from the MultiIndicMT dataset, which covers 10 major Indic languages (Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, and Telugu) along with English.

Features and Development

  • Data Encoding: The web app maps characters from the different Indic scripts to ISCII (Indian Script Code for Information Interchange), a standardized encoding that gives all scripts a uniform representation.

  • Model Development: The initial model's accuracy and loss were analyzed, and the model was then improved with various regularization techniques and repeated testing.

  • Hyperparameter Tuning: The final model's hyperparameters were tuned with Azure HyperDrive to further improve its performance.
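
To illustrate the Data Encoding step, here is a minimal sketch of the idea behind a uniform ISCII-style representation. It is not the Bhasha encoder, whose details are not given here. It relies on the fact that Unicode lays out the major Indic script blocks on the ISCII model, 128 code points per block, so the same letter sits at the same offset in each block. The names INDIC_BLOCKS and to_common_offsets are hypothetical, and the offsets are block-relative positions, not actual ISCII byte values.

```python
# Hypothetical sketch, not the project's actual encoder.
# Unicode Indic blocks follow the ISCII layout: 128 code points per script,
# with corresponding letters at the same offset inside each block.
INDIC_BLOCKS = [
    ("Devanagari", 0x0900),
    ("Bengali", 0x0980),
    ("Gurmukhi", 0x0A00),
    ("Gujarati", 0x0A80),
    ("Oriya", 0x0B00),
    ("Tamil", 0x0B80),
    ("Telugu", 0x0C00),
    ("Kannada", 0x0C80),
    ("Malayalam", 0x0D00),
]
BLOCK_SIZE = 0x80


def to_common_offsets(text):
    """Map each character to (script, offset within its script block).

    Characters outside the listed Indic blocks are returned as
    ("Other", code_point) so that no input is silently dropped.
    """
    encoded = []
    for ch in text:
        cp = ord(ch)
        for script, start in INDIC_BLOCKS:
            if start <= cp < start + BLOCK_SIZE:
                encoded.append((script, cp - start))
                break
        else:
            # No block matched (Latin letters, digits, punctuation, ...).
            encoded.append(("Other", cp))
    return encoded
```

With this mapping, the letter KA in Devanagari (क, U+0915) and in Bengali (ক, U+0995) both land on offset 21, which is the kind of uniform representation the encoding step aims for. Keeping the script name alongside the offset preserves the signal that separates, for example, Tamil from Telugu.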

Usage

  1. Web Interface: Users can enter text that mixes Indic languages, even in an unknown combination. The web app displays the percentage distribution of the Indic languages detected in the input, so users can identify the dominant languages.

  2. Dockerized Deployment: The web app is deployed using Docker containerization technology. This ensures a consistent and reliable deployment process that can be easily replicated across different environments.
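
The percentage distribution shown by the web interface can be sketched as follows. This is a minimal illustration that assumes the classifier emits one raw score (logit) per language. How Bhasha actually computes its percentages is not described here, and the names LANGUAGES, softmax, and language_percentages are hypothetical.

```python
import math

# The 11 classes named in this README; the order is an assumption.
LANGUAGES = [
    "Bengali", "Gujarati", "Hindi", "Kannada", "Malayalam", "Marathi",
    "Oriya", "Punjabi", "Tamil", "Telugu", "English",
]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def language_percentages(logits, labels=LANGUAGES):
    """Return (language, percentage) pairs, highest percentage first."""
    if len(logits) != len(labels):
        raise ValueError("expected one score per language")
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return [(label, round(100 * p, 2)) for label, p in ranked]
```

Sorting by percentage puts the dominant language first, which matches how the interface is meant to be read. With Keras, the scores would typically come from the model's prediction call for the encoded input, and a model whose last layer is already a softmax would skip the softmax step here.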

Deployment

The Bhasha Web App is hosted on Heroku in Docker containers. This deployment approach offers scalability, flexibility, and ease of management.

Figures

Fig. 1: Encoding of diverse multilingual characters to ISCII encoding format for uniform representation.

Fig. 2: Initial model accuracy and loss, followed by the implementation of various regularization techniques.

Fig. 3 and 4: Demonstration of the web app's functionality: inputting unknown mixed Indic language text and receiving the percentage distribution of languages.

License

This project is licensed under the MIT License.

Training scripts for the deep learning model can be found in the DeepLearning repository.

Feel free to contribute and collaborate to enhance the accuracy and language coverage of the Bhasha Web App!

Contributors

sonalgan, dependabot[bot]
