
Comments (4)

glample commented on July 20, 2024

This is not possible for now, but it is something we plan to do at some point. Applying the rotation matrix to every word embedding and n-gram embedding in the fastText binary (i.e. the entire input matrix) should do the trick. I don't know whether the fastText API allows it. The API appears to have a "save_model" function, but I don't know whether the components of a model can be modified easily.

Unfortunately I won't have time to look at this anytime soon, but if you want to try, I can help you integrate it into MUSE.
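As a rough illustration of the rotation idea above, here is a minimal NumPy sketch. It assumes the learned mapping W is applied as x -> W x, so that a matrix whose rows are vectors is mapped with a right-multiplication by the transpose of W. The function name and the toy arrays are hypothetical, and the sketch does not touch the fastText API at all; loading the input matrix from a binary and writing it back are left open, as in the comment above.

```python
import numpy as np


def rotate_input_matrix(input_matrix, mapping):
    """Apply a linear mapping W to every row of an embedding matrix.

    input_matrix: array of shape (n_rows, dim); rows stand in for word
                  and n-gram vectors (hypothetical layout).
    mapping:      array of shape (dim, dim); each row x is sent to W @ x.
    """
    input_matrix = np.asarray(input_matrix, dtype=np.float32)
    mapping = np.asarray(mapping, dtype=np.float32)
    dim = input_matrix.shape[1]
    if mapping.shape != (dim, dim):
        raise ValueError(f"mapping must be ({dim}, {dim}), got {mapping.shape}")
    # Row-wise x -> W @ x is the same as X @ W.T for the whole matrix.
    return input_matrix @ mapping.T


# Toy data only: 5 fake vectors of dimension 4 and a random orthogonal matrix.
rng = np.random.default_rng(0)
toy_input = rng.standard_normal((5, 4)).astype(np.float32)
q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
rotated = rotate_input_matrix(toy_input, q)
```

Rotating the whole input matrix should be enough for out-of-vocabulary words because a fastText word vector is an average of rows of that matrix, and a linear map commutes with averaging.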

from muse.

glample commented on July 20, 2024

Hi,

There was indeed a small bug, which appeared when fastText was trained with n-grams (as opposed to without n-grams, as in a word2vec setting). This commit was designed to fix it:
6e0b460

Are you using the latest version? If so, check that the dimension of your embeddings is 300. If that is not the problem, can you print and paste here the values of (len(words), params.emb_dim) and embeddings.size()?
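The comparison asked for above can be sketched as a small helper. This is only an illustration: the function name is hypothetical, and the word list and shapes below are toy values rather than anything read from a real embedding file.

```python
def check_embedding_shape(words, emb_dim, actual_shape):
    """Compare the expected (vocabulary size, dimension) with the loaded shape."""
    expected = (len(words), emb_dim)
    actual = tuple(actual_shape)
    if actual != expected:
        return f"mismatch: expected {expected}, got {actual}"
    return f"ok: {expected}"


# Toy values standing in for words, params.emb_dim and embeddings.size().
toy_words = ["the", "cat", "sat"]
report_ok = check_embedding_shape(toy_words, 300, (3, 300))
report_bad = check_embedding_shape(toy_words, 300, (3, 100))
```

With the real objects, the call would presumably be check_embedding_shape(words, params.emb_dim, embeddings.size()), since a torch.Size converts to a plain tuple.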


crypotex commented on July 20, 2024

Yes, thank you, it trains now. I want to get the OOV words translated (there is issue #13, but I cannot get the embeddings described there).

Is there a way to get binaries out as well? The output is written with torch.save(), but I was wondering whether it is possible to get binaries in the format that fastText (the Python library or the gensim version) uses. Is there a way to convert the output into the bin format?


crypotex commented on July 20, 2024

I do not have time right now either, but maybe I will over the summer.
