Hello! When I first tried your ETM package in R using the Belgian parliament data, it

Error in seq_len(nrow(x)) : argument must be coercible to non-negative integer

bnosac commented on June 15, 2024

Error in seq_len(nrow(x)) : argument must be coercible to non-negative integer

Comments (4)

jwijffels commented on June 15, 2024

The code seq_len(nrow(x)) is only used when splitting the data into a train/test set in https://github.com/bnosac/ETM/blob/master/R/ETM.R#L398, namely at https://github.com/bnosac/ETM/blob/master/R/ETM.R#L475.
The error indicates that your dtm argument has no data.

Did you check what you were passing to the function calls?
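
One rough way to run that check is the base R sketch below, which verifies that an object has a two-dimensional shape and at least one row and one column before it is passed to model$fit(). The helper name check_dtm and the toy matrix are hypothetical and are not part of the ETM package; dim() is used because it also works on sparse matrices.

```r
# Hypothetical helper (not part of the ETM package): basic sanity checks on a
# document-term matrix before passing it on to model$fit().
check_dtm <- function(dtm) {
  d <- dim(dtm)
  # Plain vectors and NULL have no dim(), so nrow() would return NULL for them
  if (is.null(d) || length(d) != 2L) {
    stop("dtm is not 2-dimensional, so nrow(dtm) is not a usable row count")
  }
  if (d[1] == 0L || d[2] == 0L) {
    stop("dtm is empty: ", d[1], " documents x ", d[2], " terms")
  }
  invisible(d)
}

# Toy stand-in for a real dtm: 3 documents x 4 terms
toy_dtm <- matrix(c(1, 0, 2, 0,
                    0, 1, 0, 3,
                    2, 2, 0, 0),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(paste0("doc", 1:3), c("a", "b", "c", "d")))

check_dtm(toy_dtm)   # passes silently and returns the dimensions invisibly
print(dim(toy_dtm))
```

If the check stops with an error, the problem lies in how the dtm was built rather than in the model fitting itself.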

tinltan commented on June 15, 2024

> code seq_len(nrow(x)) is only used when splitting the data in a train/test set in https://github.com/bnosac/ETM/blob/master/R/ETM.R#L398 namely at https://github.com/bnosac/ETM/blob/master/R/ETM.R#L475 The error indicates your dtm argument has no data.
>
> Did you check on what you were passing on to the function calls?

Thank you, I will work on your query and input above.

Earlier, though, I switched to a slightly larger dataset, which resulted in these dimensions:

dim(dtm)
[1] 190 31
dim(embeddings)
[1] 31 25

After running this command: loss <- model$fit(data = dtm, optimizer = optimizer, epoch = 20, batch_size = 1000), the earlier error no longer appeared. But I got this new error instead:

Error in Tensor_slice_put(tensor$ptr, environment(), value, mask = .d) :
rhs must be a torch_tensor or scalar value.

I will review the R code as well...

jwijffels commented on June 15, 2024

Check your input data for dtm and embeddings. Make sure there are no NA values in embeddings due to a mismatch between the embedding matrix and the document term matrix.
Think twice before applying this model to merely 190 text records, which is just not what this model is built for.
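
A mismatch of this kind can be looked for with a few lines of base R. The sketch below assumes that the terms are the column names of dtm and the row names of embeddings, which is consistent with the 190 x 31 and 31 x 25 dimensions reported above; the helper check_alignment and the toy matrices are hypothetical and not part of the ETM package.

```r
# Hypothetical helper: compare the vocabulary of a dtm (column names) with the
# vocabulary of an embedding matrix (row names) and count NA values.
check_alignment <- function(dtm, embeddings) {
  terms <- colnames(dtm)
  vocab <- rownames(embeddings)
  list(
    terms_without_embedding = setdiff(terms, vocab),
    embeddings_without_term = setdiff(vocab, terms),
    same_order              = identical(terms, vocab),
    n_na                    = sum(is.na(embeddings))
  )
}

# Toy data: "cherry" occurs in the dtm but has no embedding
toy_dtm <- matrix(1, nrow = 2, ncol = 3,
                  dimnames = list(c("doc1", "doc2"),
                                  c("apple", "banana", "cherry")))
toy_emb <- matrix(c(0.1, 0.2,
                    0.3, 0.4),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("apple", "banana"), c("dim1", "dim2")))

report <- check_alignment(toy_dtm, toy_emb)
print(report$terms_without_embedding)

# Aligning with match() does not fail for the unmatched term; it produces
# a row of NA values instead
aligned <- toy_emb[match(colnames(toy_dtm), rownames(toy_emb)), , drop = FALSE]
print(anyNA(aligned))

# One possible fix (an assumption, not the package's prescribed workflow):
# keep only the terms present in both objects, in the same order
common <- intersect(colnames(toy_dtm), rownames(toy_emb))
dtm_ok <- toy_dtm[, common, drop = FALSE]
emb_ok <- toy_emb[common, , drop = FALSE]
```

Dropping unmatched terms shrinks the vocabulary, so on a small corpus it is worth checking how many terms remain before fitting.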

tinltan commented on June 15, 2024

> Check on your input data of dtm and embeddings. Make sure there are no NA values in embeddings due to mismatch between embedding matrix and document term matrix
>
> Think twice before applying this model on merely 190 text records which is just not what this model is built for

I tried the algorithm on the 20 newsgroups dataset, and it worked smoothly! (Just had a ggrepel warning saying "10 unlabeled data points (too many overlaps). Consider increasing max.overlaps.")

I will look for a larger dataset than the one I'm using. I will also look into the NA values in embeddings for the previous dataset. These may be the sources of the original error I had been encountering.

I will also try the other suggested plots in the Python repo. Thank you very much for your great help! 👍 👍 👍
