
Comments (5)

yumeng5 commented on May 20, 2024

Hi Sachin,

Thanks for letting me know! I'm glad it solved the issue.

Regarding a Riemannian optimization implementation: I'm not aware of any existing PyTorch projects for the spherical space, but there are some for the hyperbolic space. For example, the Poincare embedding codebase includes a PyTorch implementation of Riemannian optimization in the Poincare space. You could look specifically at its Poincare manifold implementation, where the Riemannian gradient is computed, and at its RSGD implementation. Although the update formula is different for the spherical space, I think that code would serve as a good reference and template.
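To make the spherical case concrete, here is a minimal sketch of one common Riemannian SGD recipe on the unit sphere: project the Euclidean gradient onto the tangent space at the current point, take a step, then retract back onto the sphere by renormalizing. This is a generic textbook-style update written in plain Python, not necessarily the exact update rule used in this repository, and the helper names (`riemannian_grad`, `rsgd_step`) are made up for illustration.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale v to unit length (retraction onto the unit sphere)."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def riemannian_grad(x, euclid_grad):
    """Project the Euclidean gradient onto the tangent space at x.

    Assumes x has unit norm; the projection is g - (x . g) x.
    """
    radial = dot(x, euclid_grad)
    return [g - radial * xi for g, xi in zip(euclid_grad, x)]


def rsgd_step(x, euclid_grad, lr):
    """One hypothetical RSGD step: tangent-space step, then renormalize."""
    rg = riemannian_grad(x, euclid_grad)
    stepped = [xi - lr * gi for xi, gi in zip(x, rg)]
    return normalize(stepped)


# Toy problem: minimize f(x) = -x . target over the unit sphere,
# i.e. maximize the cosine similarity between x and target.
target = normalize([1.0, 2.0, 2.0])
x = normalize([1.0, 0.0, 0.0])
for _ in range(200):
    grad = [-t for t in target]  # Euclidean gradient of f at x
    x = rsgd_step(x, grad, lr=0.1)

print(dot(x, target))  # cosine similarity after optimization
```

In PyTorch, the same two operations (tangent projection and renormalization) would typically be applied to the parameter tensor and its `.grad` inside a custom optimizer step.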

Please let me know if you have any other questions!

Best,
Yu

from spherical-text-embedding.

yumeng5 commented on May 20, 2024

Hi,

Thanks for reporting this issue. I haven't tried running the code on a corpus with more than 4B tokens, so I can't estimate how much memory it would need (I apologize for not being able to try it right now since I'm attending a conference). However, if the crash were caused by running out of memory, you should have received a memory allocation error rather than a segmentation fault.

My current best guess is that you have too many documents/paragraphs in the corpus.

const int corpus_max_size = 40000000; // Maximum 40M documents in the corpus

As shown in the line of code above, the maximum number of documents allowed is hard-coded. If your corpus has more documents than this number, the code will run into a segmentation fault. To fix this, change the value to a number larger than the number of lines in your corpus file (each line is one document/paragraph). Please give it a try and see if this solves your issue.
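Before editing the constant, it may help to check how many documents the corpus actually contains. The following is a small sketch, assuming one document per line as described above; `count_documents` and `limit_is_sufficient` are hypothetical helper names, and `CORPUS_MAX_SIZE` simply mirrors the default value of `corpus_max_size` quoted above.

```python
import os
import tempfile

CORPUS_MAX_SIZE = 40_000_000  # mirrors the hard-coded default quoted above


def count_documents(path, chunk_size=1 << 20):
    """Count lines in a file by reading it in binary chunks.

    A final line without a trailing newline still counts as a document.
    """
    count = 0
    last = b"\n"  # treat an empty file as having zero documents
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            count += chunk.count(b"\n")
            last = chunk[-1:]
    if last != b"\n":
        count += 1
    return count


def limit_is_sufficient(n_docs, limit=CORPUS_MAX_SIZE):
    """The limit should be strictly larger than the number of documents."""
    return n_docs < limit


# Demo on a tiny temporary corpus; replace demo_path with your corpus file.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"first doc\nsecond doc\nthird doc")
    demo_path = tmp.name

n_docs = count_documents(demo_path)
os.remove(demo_path)
print(n_docs, limit_is_sufficient(n_docs))
```

If `limit_is_sufficient` returns `False` for your corpus, raise `corpus_max_size` above the reported document count and recompile.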

Please let me know if you still encounter any errors or have other questions!

Best,
Yu


daskol commented on May 20, 2024

@Sachin19 See related issue #6.


yumeng5 commented on May 20, 2024

Hi @Sachin19,

Thanks again for posting this issue. I was wondering if you got a chance to try my suggestions and could provide any update on this issue?

Thanks,
Yu


Sachin19 commented on May 20, 2024

Hi Yu,

Thank you so much for your suggestion. Line 18 was exactly the problem I was facing, and increasing the maximum number of documents resolved it.

I was also wondering if you could point me to resources on how to implement Riemannian optimization in a package like PyTorch.

Thanks,
Sachin

