hne's People

Contributors

xiaoyuxin1002, yangji9181

hne's Issues

Segmentation Fault [TransE]

Hi,
I tried to run TransE on two different datasets. Both times it threw a segmentation fault.
Program terminated with signal 11, Segmentation fault.
#0 _int_malloc (av=av@entry=0x7fda5cf9b760 <main_arena>, bytes=bytes@entry=800000) at malloc.c:3780
3780 set_head(remainder, remainder_size | PREV_INUSE);

It works on PubMed and on my other dataset (133k triplets).
Any pointers on how to solve this problem?

PS: The system has around 700 GB of RAM.

Error in MAGNN

My dgl version is 0.5.2.
In MAGNN utils.py, line 206:
g.from_networkx(ng)
raises an error:
raise DGLError('DGLGraph.from_networkx is deprecated. Please call the following\n\n'
dgl._ffi.base.DGLError: DGLGraph.from_networkx is deprecated. Please call the following
dgl.from_networkx(nx_graph, node_attrs, edge_attrs)
, which creates a new DGLGraph from the networkx graph.

Question about R-GCN training

In model.eval(), the code iterates through every batch to get the node embeddings.
main.py line 112: for batch_num in range(batch_total)

I am wondering why, in model.train(), you don't iterate through all batches in each epoch?

Cannot assign node feature "h" on device cuda:0 to a graph on device cpu.

R-GCN Model
dataset="PubMed"
python 3.7 pytorch 1.7.0 dgl-cu102 0.5.2

Traceback (most recent call last):
File "src/main.py", line 185, in <module>
main(args)
File "src/main.py", line 90, in main
embed, pred = model(g, node_id, edge_type, edge_norm)
File "/home/lqd/software/anaconda3/envs/HNE/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/lqd/code/HNE-master/Model/R-GCN/src/model.py", line 117, in forward
output = self.rgcn.forward(g, h, r, norm)
File "/home/lqd/code/HNE-master/Model/R-GCN/src/model.py", line 52, in forward
h = layer(g, h, r, norm)
File "/home/lqd/software/anaconda3/envs/HNE/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/lqd/software/anaconda3/envs/HNE/lib/python3.7/site-packages/dgl/nn/pytorch/conv/relgraphconv.py", line 274, in forward
g.srcdata['h'] = feat
File "/home/lqd/software/anaconda3/envs/HNE/lib/python3.7/site-packages/dgl/view.py", line 81, in __setitem__
self._graph._set_n_repr(self._ntid, self._nodes, {key : val})
File "/home/lqd/software/anaconda3/envs/HNE/lib/python3.7/site-packages/dgl/heterograph.py", line 3811, in _set_n_repr
' same device.'.format(key, F.context(val), self.device))
dgl._ffi.base.DGLError: Cannot assign node feature "h" on device cuda:0 to a graph on device cpu. Call DGLGraph.to() to copy the graph to the same device.

Data missing

The file "path.dat" is missing from all datasets, but it is required by MAGNN/utils.py.

dgl version conflict

When I run the MAGNN model, it works with dgl version 0.3, but this version is not suitable for the R-GCN model. I see that this awesome repo contains 13 algorithms, each with its own original Git repo, so some packages may conflict. If anyone has successfully run all the algorithms, it would be very helpful to share the versions of the important packages, such as dgl.

Dimension out of range (expected to be in range of [-1, 0], but got -2) [HGT]

pytorch 1.4.0
Dimension out of range (expected to be in range of [-1, 0], but got -2)

dataset : freebase
model: HGT

Maybe the data format is wrong.

Traceback (most recent call last):
File "src/main.py", line 137, in <module>
node_rep, _ = model.forward(node_feature.to(device), node_type.to(device), edge_time.to(device), edge_type.to(device), edge_index.to(device))
File "/home/bopa/project/HNE/Model/HGT/src/model.py", line 170, in forward
meta_xs = gc(meta_xs, node_type, edge_index, edge_type, edge_time)
File "/home/bopa/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/bopa/project/HNE/Model/HGT/src/model.py", line 49, in forward
return self.propagate(edge_index, node_inp=node_inp, node_type=node_type, edge_type=edge_type, edge_time=edge_time)
File "/home/bopa/.local/lib/python3.6/site-packages/torch_geometric/nn/conv/message_passing.py", line 233, in propagate
kwargs)
File "/home/bopa/.local/lib/python3.6/site-packages/torch_geometric/nn/conv/message_passing.py", line 156, in collect
self.set_size(size, dim, data)
File "/home/bopa/.local/lib/python3.6/site-packages/torch_geometric/nn/conv/message_passing.py", line 119, in set_size
elif the_size != src.size(self.node_dim):
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got -2)

What does the meta='1, 2, 4, 8' mean in HAN run.sh file?

Hi, could you give a few more examples or explanations of what meta means in run.sh? I noticed that you wrote a comment saying
"# Choose the meta-paths used for training. Suppose the targeting node type is 1 and link type 1 is between node type 0 and 1, then meta="1" means that we use meta-paths "101"." Do the meta-paths all end with the targeting node type 1? So do 1, 2, 4, 8 actually mean four meta-paths: 101, 201, 401, and 801? Thank you very much!
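
One reading of the quoted comment is that each value in meta is a link type ID rather than a node type, and that the meta-path starts and ends at the targeting node type, passing through the node type at the other end of that link. A minimal sketch of that reading follows; the link_types table and the function name meta_to_metapath are hypothetical and are not taken from the repository.

```python
def meta_to_metapath(meta, target_type, link_types):
    """Expand a comma-separated list of link type IDs into meta-path strings.

    link_types maps link_type -> (head_node_type, tail_node_type).
    Assumption: every listed link type touches the targeting node type.
    """
    paths = []
    for token in meta.split(","):
        link_type = int(token.strip())
        head, tail = link_types[link_type]
        if target_type not in (head, tail):
            raise ValueError(
                f"link type {link_type} does not touch node type {target_type}"
            )
        # The middle node type is the endpoint that is not the target.
        other = tail if head == target_type else head
        paths.append(f"{target_type}{other}{target_type}")
    return paths


# Hypothetical schema matching the quoted comment:
# link type 1 connects node types 0 and 1, and the target node type is 1.
print(meta_to_metapath("1", target_type=1, link_types={1: (0, 1)}))
```

Under this reading, meta="1, 2, 4, 8" would select four link types, and the middle digit of each meta-path would depend on which node types those links connect, so the paths would not necessarily be 101, 201, 401, and 801.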

DistMult Streambatcher

Hi,
I followed your steps for running DistMult and got this error:
File "src/main.py", line 182, in <module>
main(args)
File "src/main.py", line 83, in main
train_batcher = StreamBatcher(args.data, 'train', args.batch_size, randomize=True, keys=input_keys, loader_threads=args.loader_threads)
File ".....HNE/Model/DistMult/src/spodernet/spodernet/preprocessing/batching.py", line 217, in __init__
log.error('Path {0} does not exists! Have you forgotten to preprocess your dataset?', config_path)
File "....HNE/Model/DistMult/src/spodernet/spodernet/utils/logger.py", line 106, in error
raise Exception(message.format(*args))
Exception: Path ...../.data/PubMed/train/hdf5_config.pkl does not exists! Have you forgotten to preprocess your dataset?
Exception ignored in: <bound method StreamBatcher.__del__ of <spodernet.preprocessing.batching.StreamBatcher object at 0x7f4869e3b5c0>>
Traceback (most recent call last):
File ".....HNE/Model/DistMult/src/spodernet/spodernet/preprocessing/batching.py", line 268, in __del__
for worker in self.loaders:
AttributeError: 'StreamBatcher' object has no attribute 'loaders'
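
The exception says the batcher looks for hdf5_config.pkl under a per-dataset, per-split directory. A small pre-flight check, sketched below, can confirm whether the preprocessing output exists before training starts. The function name and the data_root parameter are hypothetical, and the layout data_root/dataset/split/hdf5_config.pkl is inferred only from the path shown in the error message.

```python
from pathlib import Path


def missing_preprocessed_splits(data_root, dataset, splits=("train",)):
    """Return the splits whose hdf5_config.pkl file is absent.

    Assumed layout (taken from the error message, not from the library docs):
        <data_root>/<dataset>/<split>/hdf5_config.pkl
    """
    root = Path(data_root).expanduser()
    return [
        split
        for split in splits
        if not (root / dataset / split / "hdf5_config.pkl").is_file()
    ]
```

Calling missing_preprocessed_splits with the .data directory from the error message and "PubMed" would return ["train"] in the situation reported above, which suggests the preprocessing step has not yet written its output for that dataset.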

R-GCN RuntimeError: Found dtype Long but expected Float

Dataset: Yelp

Supervised = 'True'

Traceback (most recent call last):
File "src/main.py", line 188, in <module>
main(args)
File "src/main.py", line 94, in main
if args.supervised=='True': loss = model.get_supervised_loss(pred, matched_labels, matched_index, multi)
File "/home/lqd/code/HNE-master/Model/R-GCN/src/model.py", line 142, in get_supervised_loss
predict_loss = F.binary_cross_entropy(torch.sigmoid(embed[matched_index]), matched_labels)
File "/home/lqd/software/anaconda3/envs/HNE/lib/python3.7/site-packages/torch/nn/functional.py", line 2526, in binary_cross_entropy
input, target, weight, reduction_enum)
RuntimeError: Found dtype Long but expected Float

HAN kill!

Hello,
Thanks for this valuable work.
I am trying to run HAN on Colab.
I installed PyTorch
and, after running transform.sh, tried to execute run.sh in the HAN folder,
but I see this message in the orange box:
[image]
Could you please guide me on this issue?
Thank you.

DistMult running error

Sorry, since issue #9 is closed, I am reopening it here.

I installed spodernet. However,

>>> import spodernet
>>> import spodernet.utils
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'spodernet.utils'

As you can see, spodernet itself is imported normally, but spodernet.utils is not found.
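
When a package imports but one of its submodules does not, it can help to check which copy of the package Python is actually loading, since a different distribution with the same name, or an install that left out subpackages, would produce this symptom. The helper below is a sketch that uses only the standard library; describe_module is a hypothetical name.

```python
import importlib.util


def describe_module(name):
    """Return where Python would load `name` from, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is itself missing.
        return None
    if spec is None:
        return None
    # Namespace packages have no single origin file, so fall back to their search paths.
    return spec.origin or list(spec.submodule_search_locations or [])


for name in ("spodernet", "spodernet.utils"):
    print(name, "->", describe_module(name))
```

If the first line prints a path and the second prints None, the copy of spodernet at that path has no utils subpackage, which would be consistent with the traceback.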

Negative samples

Hi,
DistMult uses negative sampling for training.
I couldn't find a negative sampling parameter in the algorithm.
How did you generate the negative samples, and how many negative samples are there per positive triplet?
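
For reference, one common way to build negatives for triplet models is to corrupt the tail (or head) of each positive triplet with a randomly chosen entity. The sketch below illustrates that general technique only; it is not taken from this repository and may differ from what the DistMult code here does. The function name and all parameters are hypothetical.

```python
import random


def corrupt_tails(triplets, num_entities, negatives_per_positive=1, seed=0):
    """Create negatives by replacing the tail of each (head, relation, tail) tuple.

    Candidates that already appear as positives are skipped.
    """
    rng = random.Random(seed)
    known = set(triplets)  # triplets must be hashable tuples
    negatives = []
    for head, relation, _tail in triplets:
        made, attempts = 0, 0
        # Cap the attempts so a (head, relation) pair linked to every entity cannot loop forever.
        while made < negatives_per_positive and attempts < 100 * negatives_per_positive:
            attempts += 1
            candidate = (head, relation, rng.randrange(num_entities))
            if candidate not in known:
                negatives.append(candidate)
                made += 1
    return negatives


positives = [(0, 0, 1), (1, 0, 2)]
print(corrupt_tails(positives, num_entities=5, negatives_per_positive=2))
```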

Embedding of Relations

Hi,
I ran your code for DistMult on the PubMed dataset.
It saves the embeddings of entities, but where does it save the embeddings of relations?
I want to use them in the scoring function to calculate MRR and Hits.
Thanks.
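
For context, DistMult scores a triplet as the sum over dimensions of the element-wise product of the head, relation, and tail embeddings, so ranking metrics need the relation vectors as well as the entity vectors. The sketch below shows that scoring function together with MRR and Hits@k on toy values; all embeddings and function names are made up for illustration and do not reflect the repository's output format.

```python
def distmult_score(head, relation, tail):
    """DistMult score: sum_i head[i] * relation[i] * tail[i]."""
    return sum(h * r * t for h, r, t in zip(head, relation, tail))


def rank_of_true_tail(head, relation, entities, true_tail):
    """1-based rank of the true tail among all entities (unfiltered setting)."""
    true_score = distmult_score(head, relation, entities[true_tail])
    better = sum(1 for e in entities if distmult_score(head, relation, e) > true_score)
    return better + 1


def mrr(ranks):
    """Mean reciprocal rank."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test triplets whose true tail is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy 2-dimensional embeddings, purely illustrative.
entities = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
relation = [1.0, 2.0]
head = entities[2]
ranks = [rank_of_true_tail(head, relation, entities, t) for t in (1, 2)]
print(ranks, mrr(ranks), hits_at_k(ranks, 1))
```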

DistMult running error

Traceback (most recent call last):
  File "src/main.py", line 13, in <module>
    from spodernet.hooks import LossHook, ETAHook
  File "/home/usr/anaconda3/lib/python3.7/site-packages/spodernet/hooks.py", line 6, in <module>
    from spodernet.utils.util import Timer
ModuleNotFoundError: No module named 'spodernet.utils'

unsupervised training with HAN and MAGNN

Happy new year! I ran into a problem running HAN and MAGNN with unsupervised training on the Yelp data: the code runs out of my 16 GB of memory. ( T^T ) I am not sure whether there are other problems, but I wonder if it is related to the number of links in the data. I would also like to ask whether there is any way to make this code run, such as mini-batching.
I also tried to use my own data with MAGNN and found that the numbers of items in "batch_node_features" and "batch_targets" are not equal, so computing the unsupervised loss raises a KeyError on node_features. (Of course, PubMed runs well.) So I think something is wrong with my path.dat file, but I can't figure out what the problem is.

About config.dat in HAN

I am confused about what to put in config.dat for the HAN model. You said of config.dat: "The first line specifies the targeting node type. The second line specifies the targeting link type. The third line specifies the information related to each link type, e.g., {head_node_type}\t{tail_node_type}\t{link_type}." I don't fully understand it.

Is it possible to train HAN on weighted graph?

HAN training input file:
link.dat: Each line is formatted as {head_node_id}\t{tail_node_id}\t{link_type}

Is it possible to add edge weights as input and train HAN on a weighted graph? If yes, please advise where I should make the code modifications. Thanks.
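
The quoted link.dat format has three tab-separated fields and no weight. As a sketch of how a weight could be carried alongside it, the parser below accepts an optional fourth column and defaults to 1.0 when it is absent. The fourth column is a hypothetical extension, not something the repository is known to read, integer IDs are assumed, and the model code would still need changes to make use of the weights.

```python
def parse_link_line(line):
    """Parse '{head_node_id}\\t{tail_node_id}\\t{link_type}' with an optional weight.

    The fourth (weight) column is a hypothetical extension of the quoted format.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) not in (3, 4):
        raise ValueError(f"expected 3 or 4 tab-separated fields, got {len(fields)}")
    head, tail, link_type = (int(value) for value in fields[:3])
    weight = float(fields[3]) if len(fields) == 4 else 1.0
    return head, tail, link_type, weight


print(parse_link_line("3\t7\t1"))
print(parse_link_line("3\t7\t1\t0.25"))
```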

MAGNN training memory consumption

Hello,

First of all, thank you for your paper and your code; it is a big help for anyone interested in the HNE problem.
I have a question regarding MAGNN training. I tried fitting it to unattributed PubMed (the smallest of the 4 datasets) in an unsupervised fashion. However, I couldn't do it: after the training started, all 32 GB of RAM I have available were taken, and then the script crashed.
I didn't change the MAGNN parameters after cloning the repo, and I ran the data transform stage as indicated in the readme. My question is, am I doing something wrong? I am not entirely sure how a 118 MB dataset and a 2-layer net could do this. How much memory did you need for this task while obtaining the results for the paper?
I have seen in the closed issues that someone else has run into the same problem but there was no definite resolution there :(
Please help :)
