Comments (5)
Hi Sachin,
Thanks for letting me know! I'm glad it solved the issue.
Regarding an implementation of Riemannian optimization, I'm not aware of any existing PyTorch projects for the spherical space, but there are some for the hyperbolic space. For example, the Poincare embedding codebase has a PyTorch implementation of Riemannian optimization in the Poincare space. You could look specifically at the Poincare manifold implementation, where the Riemannian gradient is implemented, as well as at the RSGD implementation. Although the optimization formula will be different for the spherical space, I think that code can serve as a good reference and template.
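To make the spherical case a bit more concrete, here is a minimal sketch of one generic Riemannian SGD step on the unit sphere, written in plain Python with no dependencies. It assumes the textbook recipe: project the Euclidean gradient onto the tangent space at the current point, take a step, then retract onto the sphere by renormalizing. The function names (`riemannian_grad`, `rsgd_step`) are hypothetical, and this is not necessarily the exact update rule used in this repository or in the paper.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(x):
    """Scale x to unit length so that it lies on the unit sphere."""
    norm = math.sqrt(dot(x, x))
    return [a / norm for a in x]


def riemannian_grad(x, egrad):
    """Project the Euclidean gradient onto the tangent space at x.

    For a unit vector x the projection is egrad - <x, egrad> * x.
    """
    coef = dot(x, egrad)
    return [g - coef * a for g, a in zip(egrad, x)]


def rsgd_step(x, egrad, lr):
    """One descent step: move along the negative Riemannian gradient,
    then retract onto the sphere by renormalizing (an assumed, simple
    retraction; the exponential map is another option)."""
    rgrad = riemannian_grad(x, egrad)
    stepped = [a - lr * g for a, g in zip(x, rgrad)]
    return normalize(stepped)


# Toy usage: minimize f(x) = -<x, target> over the unit sphere.
# The Euclidean gradient of f is simply -target.
target = normalize([1.0, 1.0, 1.0])
x = [1.0, 0.0, 0.0]
for _ in range(200):
    x = rsgd_step(x, [-t for t in target], lr=0.1)
```

In PyTorch, the same two operations (tangent-space projection and renormalization) could be applied to a parameter's `.grad` inside a custom optimizer step, much as the Poincare code does with its own metric.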
Please let me know if you have any other questions!
Best,
Yu
from spherical-text-embedding.
Hi,
Thanks for letting me know about the issue. I haven't tried running the code on a corpus with more than 4B tokens, so I can't say how much memory it will need (I apologize for not being able to try it right now, since I'm attending a conference). However, if the problem were caused by running out of memory, you should have received a memory allocation error instead of a segmentation fault.
My current best guess is that you have too many documents/paragraphs in the corpus.
Spherical-Text-Embedding/src/jose.c, line 18 in b0f8820
The line of code referenced above hard-codes the maximum number of documents allowed. If your corpus has more documents than this number, the code will run into a segmentation fault. To solve this, change the value to a number larger than the number of lines in your corpus file (each line is one document/paragraph). Could you give it a try and see whether this solves your issue?
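As a quick way to check this before recompiling, the sketch below counts the lines in a corpus file and compares the count against a limit. The helper names and the `max_docs` parameter are hypothetical; pass in whatever value is hard-coded on the line referenced above in your copy of `jose.c`.

```python
def count_documents(corpus_path):
    """Count lines in the corpus file (one document/paragraph per line)."""
    count = 0
    # Binary mode avoids decoding errors on very large or mixed-encoding files.
    with open(corpus_path, "rb") as f:
        for _ in f:
            count += 1
    return count


def fits_limit(corpus_path, max_docs):
    """Return True if the corpus has strictly fewer documents than max_docs."""
    return count_documents(corpus_path) < max_docs
```

If `fits_limit` returns `False`, the hard-coded value needs to be raised above the reported line count.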
Please let me know if you still encounter any errors or have other questions!
Best,
Yu
@Sachin19 See related issue #6.
Hi @Sachin19,
Thanks again for posting this issue. I was wondering if you got a chance to try my suggestions and could provide any update on this issue?
Thanks,
Yu
Hi Yu,
Thank you so much for your suggestion. Line 18 was exactly the problem I was facing, and increasing the maximum number of documents resolved it.
I was also wondering if you could point me to resources on how to implement Riemannian optimization in a package like PyTorch.
Thanks,
Sachin