Comments (13)

thomas536 avatar thomas536 commented on May 8, 2024 3

Some support was added to LibreTranslate in LibreTranslate/LibreTranslate#12

from argos-translate.

hollorol avatar hollorol commented on May 8, 2024 3

Recently I saw an article comparing language detection tools. FastText can be a viable option instead of langdetect, because it is a lot faster.

Another option, which can be quite accurate for longer texts, is n-grams. There are predetermined n-gram lists for all supported languages, and it is easy to generate new ones. The advantages of this approach are that the models are really small, the implementation is easy, and it does not need any extra library. In any case, if help is needed, I can implement these.
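To make the n-gram idea concrete, here is a minimal sketch using only the Python standard library. It ranks character trigrams by frequency and compares rankings with an "out-of-place" distance. Everything in it is an assumption for illustration: the function names, the `MISSING_PENALTY` constant, and especially the tiny `SAMPLES` texts, which stand in for real per-language n-gram lists and are far too small for practical use.

```python
from collections import Counter

MISSING_PENALTY = 300  # assumed cost for an n-gram absent from a language profile


def ngram_profile(text, n=3, top=300):
    """Return the most frequent character n-grams of `text`, most common first."""
    padded = " " + " ".join(text.lower().split()) + " "
    counts = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    return [gram for gram, _ in counts.most_common(top)]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences between two profiles; lower means more similar."""
    rank = {gram: r for r, gram in enumerate(lang_profile)}
    return sum(
        abs(r - rank[gram]) if gram in rank else MISSING_PENALTY
        for r, gram in enumerate(doc_profile)
    )


# Hypothetical toy training texts; a real detector needs much larger corpora.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and this is the way "
          "that we write things in the english language",
    "de": "der schnelle braune fuchs springt über den faulen hund und das ist "
          "die art wie wir dinge in der deutschen sprache schreiben",
    "fr": "le renard brun rapide saute par dessus le chien paresseux et voici "
          "la façon dont nous écrivons les choses en langue française",
}
PROFILES = {code: ngram_profile(text) for code, text in SAMPLES.items()}


def detect(text):
    """Return the language code whose profile is closest to `text`."""
    doc = ngram_profile(text)
    return min(PROFILES, key=lambda code: out_of_place(doc, PROFILES[code]))


print(detect("the lazy dog jumps over the quick fox"))
print(detect("der faule hund und der schnelle fuchs"))
```

As noted above, this kind of detector tends to do better on longer inputs; very short strings produce few n-grams to compare.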

PJ-Finlay avatar PJ-Finlay commented on May 8, 2024 3

This is the way to do it for core Argos Translate; the only thing I might change is using "detect" instead of "auto-detect".

pierotofy avatar pierotofy commented on May 8, 2024 2

This would be pretty useful for any automated translation mechanism!

PJ-Finlay avatar PJ-Finlay commented on May 8, 2024 2

@hollorol If you can do this with just the Python standard library, a pull request would be appreciated.

PJ-Finlay avatar PJ-Finlay commented on May 8, 2024 2

LibreTranslate already has a system for language detection, so this hasn't been a priority. My plan was to use CTranslate2 models to map input text to a language code, but I'm open to suggestions.

PJ-Finlay avatar PJ-Finlay commented on May 8, 2024 1

Interesting. I think using the same pipeline would be a good long-term solution, but this could be something to do in the meantime. One issue with using the pipeline is that as soon as we add a new language we also have to retrain the detector. This would probably also be lighter weight than a 100MB model file. The main interest in this is currently from LibreTranslate, so if someone wants to extend the Python API to use this, that would be welcome, and the API could be reimplemented in the future if it makes sense.

TechnologyClassroom avatar TechnologyClassroom commented on May 8, 2024 1

Not everyone uses LibreTranslate.

PJ-Finlay avatar PJ-Finlay commented on May 8, 2024 1

The way Argos Translate currently works, it would be a breaking change to add this, but I'm planning to add it in the next major version. It would also be possible to add language detection to the GUI (which is in a separate repo) using a third-party library like Lingua.

hollorol avatar hollorol commented on May 8, 2024

@PJ-Finlay, I'll do it only for the CLI, because I don't use the GUI part of the program, but I guess adapting it to the GUI afterwards will be easy.

PJ-Finlay avatar PJ-Finlay commented on May 8, 2024

That sounds good. It should probably be its own file/module that can be integrated into the CLI.

TechnologyClassroom avatar TechnologyClassroom commented on May 8, 2024

Lingua might be useful for this. Lingua is written in Python, works with short strings, works offline, and is licensed under Apache-2.0.

TechnologyClassroom avatar TechnologyClassroom commented on May 8, 2024

I could see it being used like a special input that would trigger the language detection. Syntax could be something like this:

echo "Text to translate" | argos-translate --from-lang auto-detect --to-lang en
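A rough sketch of how such a special input value could be handled, using only `argparse` from the standard library. This is not the actual Argos Translate CLI code: the `AUTO_DETECT` constant, `build_parser`, `resolve_source_language`, and the `detector` callback are all hypothetical names, and the detector passed in below is a stand-in that always answers "es".

```python
import argparse

AUTO_DETECT = "auto-detect"  # hypothetical sentinel taken from the syntax above


def build_parser():
    """Build a parser that mirrors the two flags used in the example command."""
    parser = argparse.ArgumentParser(prog="argos-translate")
    parser.add_argument("--from-lang", required=True,
                        help=f"source language code, or '{AUTO_DETECT}'")
    parser.add_argument("--to-lang", required=True,
                        help="target language code")
    return parser


def resolve_source_language(from_lang, text, detector):
    """Return `from_lang` unchanged unless it is the sentinel; then ask `detector`."""
    if from_lang == AUTO_DETECT:
        return detector(text)
    return from_lang


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--from-lang", "auto-detect", "--to-lang", "en"]
    )
    # Stand-in detector (assumption): a real one would inspect the text.
    source = resolve_source_language(args.from_lang, "Hola mundo", lambda t: "es")
    print(source, "->", args.to_lang)
```

Keeping the detector behind a plain callable means any detection approach discussed in this thread could be plugged in without changing the argument handling.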
