How could I get a better result using my own data?

Sorry to bother you. I would like to get reasonably high accuracy for link prediction on my own data, but the accuracy I get is almost zero. I find this very confusing. Could you give me some suggestions?

ImportError: No module named 'tensorflow_backend'

I'm trying to run and I get this error:

Traceback (most recent call last):
  File "C:\Users\Axel\git\RelationPrediction\code\", line 5, in <module>
    from optimization.optimize import build_tensorflow
  File "C:\Users\Axel\git\RelationPrediction\code\optimization\", line 1, in <module>
    import tensorflow_backend.algorithms as tensorflow_algorithms
ImportError: No module named 'tensorflow_backend'

I don't know which package I should install to resolve this module import error.

I was wondering how much GPU memory is needed to replicate the results in the paper. I tried all three datasets, but I run into OOM issues on every one of them.

missing valid_accuracy.txt


if I add in gcn_basis.exp


I get the following error:

FileNotFoundError: [Errno 2] No such file or directory: FB-Toutanova/valid_accuracy.txt

How can I generate that missing file?


Configuration settings for toy dataset

It would be nice to know which settings to choose for the Toy dataset:

  • internal encoder dimension?
  • regularization using basis or block-diagonal-decomposition?
  • If using basis decomposition, how many basis functions?

Ideally, it would be nice to know the specific .exp settings file to use. Thanks

How to run your code?

I'm sorry to bother you. Could you give an example of the "configuration"? I don't know how to set it.

Best approach to read new graph after training?

The input graph in subject,predicate,object form is transformed into matrices for training. Is the output graph after training also available in subject,predicate,object form? If not, what is the easiest way to inspect the new output graph and the newly added links? I noticed that .index and .meta files are created in the models folder. Do these files describe the output graph? If so, how can I read them?
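A note for readers with the same question: the .index and .meta names match what TensorFlow's checkpoint saver writes, so these files most likely hold model parameters and the graph definition rather than a list of triples. That is an assumption about this repository, not something confirmed here. One way to obtain predicted links in subject,predicate,object form is to score candidate triples with the decoder and keep the high scorers. The sketch below uses DistMult scoring (the decoder described in the R-GCN paper for link prediction) on hypothetical embeddings; every name, value, and the threshold is made up for illustration.

```python
# Hypothetical learned embeddings; in practice they would come from the trained model.
entity_embeddings = {
    "alice": [1.0, 0.0],
    "bob": [0.9, 0.1],
    "carol": [0.0, 1.0],
}
relation_embeddings = {"knows": [1.0, 0.2]}

# Triples already present in the training graph (hypothetical).
known_triples = {("alice", "knows", "bob")}


def distmult_score(subject, predicate, obj):
    """DistMult score: sum over dimensions of e_s * r_p * e_o."""
    return sum(
        s * r * o
        for s, r, o in zip(
            entity_embeddings[subject],
            relation_embeddings[predicate],
            entity_embeddings[obj],
        )
    )


def predict_new_links(threshold):
    """Return unseen (subject, predicate, object) triples scoring above threshold."""
    predictions = []
    for subject in entity_embeddings:
        for predicate in relation_embeddings:
            for obj in entity_embeddings:
                triple = (subject, predicate, obj)
                if subject == obj or triple in known_triples:
                    continue
                if distmult_score(*triple) > threshold:
                    predictions.append(triple)
    return sorted(predictions)


new_links = predict_new_links(threshold=0.5)
```

Scoring every candidate grows with the square of the number of entities, so on a real graph the candidates would usually be restricted, for example to a fixed subject and relation.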

0% GPU utilization most of the time

When I trained the R-GCN model on the 'FB-Toutanova' dataset with the 'gcn_basis.exp' settings, I monitored the GPU with nvtop. GPU utilization was at 0% most of the time during training and never exceeded 50%. Memory usage, by contrast, is very high, with frequent OOMs on large datasets. Training is also very slow, taking close to a day for 10,000 iterations. There may be considerable room for performance improvements.


How are adjacency matrices used in the code?

  1. How does basis decomposition reduce computational complexity? Are the basis functions the same throughout the code?
  2. How are adjacency matrices used in the code?
    How is the problem-specific normalization constant C calculated?
    Thanks for your reply!
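For context on these questions: the R-GCN paper describes the normalization constant as either learned or fixed in advance, giving c_{i,r} = |N_i^r| (the number of neighbors of node i under relation r) as an example. It defines basis decomposition as W_r = sum_b a_rb V_b, where the basis matrices V_b are shared by all relations and only the coefficients a_rb depend on the relation, which reduces the number of parameters. The sketch below illustrates both on hypothetical data. It is not taken from this repository's code, and the function names are made up.

```python
from collections import defaultdict

# Hypothetical triplets in (subject, relation, object) order, as integer ids.
triplets = [(0, 0, 1), (2, 0, 1), (3, 1, 1), (1, 0, 2)]

# neighbors[(i, r)] lists every node j with an edge j --r--> i.
neighbors = defaultdict(list)
for subject, relation, obj in triplets:
    neighbors[(obj, relation)].append(subject)


def normalization_constant(node, relation):
    """c_{i,r} = |N_i^r|: incoming neighbors of `node` under `relation`."""
    return len(neighbors.get((node, relation), []))


def compose_relation_weight(coefficients, bases):
    """W_r = sum_b a_rb * V_b, with the bases V_b shared across relations."""
    rows, cols = len(bases[0]), len(bases[0][0])
    return [
        [sum(a * basis[i][j] for a, basis in zip(coefficients, bases)) for j in range(cols)]
        for i in range(rows)
    ]


# Two shared 2x2 bases; each relation only stores two coefficients.
bases = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
w_r = compose_relation_weight([2.0, 3.0], bases)
```

With B bases of size d x d and R relations, this stores B*d*d + R*B weights instead of R*d*d.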

Using training triplet index as edge ID?

In the script, copied and pasted below, the first element of each entry appended to adj_list is the index of the triplet in train_triplets:

adj_list = [[] for _ in entities]
for i,triplet in enumerate(train_triplets):
    adj_list[triplet[0]].append([i, triplet[2]])
    adj_list[triplet[2]].append([i, triplet[0]])

Later on, in the sample_edge_neighborhood method, you have the following code to sample the edge:

chosen_vertex = np.random.choice(np.arange(degrees.shape[0]), p=probabilities)
chosen_adj_list = adj_list[chosen_vertex]
seen[chosen_vertex] = True


chosen_edge = np.random.choice(np.arange(chosen_adj_list.shape[0]))
chosen_edge = chosen_adj_list[chosen_edge]
edge_number = chosen_edge[0]

The chosen_adj_list is an array of shape num_neighbors x 2, where the second dimension is the pair appended in the previous code block. But here, chosen_edge[0] gives the index of the training triplet, which is not in any way related to the edge type of the triplet, right?
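One possible reading of the snippets above, offered as an assumption rather than a confirmed description of the repository's intent: the stored index i acts as an edge ID, and the relation type can be recovered afterwards by looking that index up in train_triplets. The sketch below uses a small hypothetical triplet list in (subject, relation, object) order; relation_of is a made-up helper.

```python
# Hypothetical toy data: (subject, relation, object) as integer ids.
train_triplets = [(0, 0, 1), (1, 1, 2), (2, 0, 0)]
entities = [0, 1, 2]

# Same construction as in the issue: each entry is [edge id, neighbor vertex].
adj_list = [[] for _ in entities]
for i, triplet in enumerate(train_triplets):
    adj_list[triplet[0]].append([i, triplet[2]])
    adj_list[triplet[2]].append([i, triplet[0]])


def relation_of(edge_number):
    """Recover the relation type from an edge id by indexing the triplet list."""
    return train_triplets[edge_number][1]


# Pick the second edge incident to vertex 1 (fixed here instead of sampled).
chosen_edge = adj_list[1][1]
edge_number = chosen_edge[0]
full_triplet = train_triplets[edge_number]
```

Under this reading the sampler only needs to return edge ids, because the full triplet, including its relation, can be looked up from the id.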

Embedding error

When using a dataset with more relations than constants (e.g. Nations), the following error is raised:

InvalidArgumentError (see above for traceback): indices[3] = 16 is not in [0, 16) [[Node: embedding_lookup_1 = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@Variable"], validate_indices=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Variable/read, strided_slice_1)]]

The dataset is a slightly extended Toy dataset (with 9 new relations added). The error is most likely caused by sizing the embedding lookup table by the number of entities, which in this case is lower than the number of relations. Is this fix appropriate?

file: common/
line: 28 & 29
input_shape = [max(int(encoder_settings['EntityCount']),int(encoder_settings['RelationCount'])), int(encoder_settings['CodeDimension'])]
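To illustrate the failure mode described above, here is a minimal sketch in plain Python rather than TensorFlow. It also shows one alternative to the max(...) fix, which is to keep separate entity and relation tables. The table sizes, helper names, and indices are hypothetical. Only the bound of 16 is taken from the error message, and nothing here is the repository's actual code.

```python
import random


def make_table(rows, dim, seed=0):
    """Build a rows x dim table of random floats standing in for embeddings."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(rows)]


def lookup(table, indices):
    """Gather rows by index, rejecting out-of-range indices like a strict gather."""
    for position, index in enumerate(indices):
        if not 0 <= index < len(table):
            raise IndexError(
                "indices[%d] = %d is not in [0, %d)" % (position, index, len(table))
            )
    return [table[index] for index in indices]


# Hypothetical sizes: fewer entities than relations.
entity_count, relation_count, code_dimension = 16, 21, 4
relation_indices = [0, 5, 15, 16]

# A table sized by entity count cannot serve relation index 16.
shared_table = make_table(entity_count, code_dimension)
try:
    lookup(shared_table, relation_indices)
    failed, message = False, ""
except IndexError as error:
    failed, message = True, str(error)

# Alternative to max(...): one table per index space.
entity_table = make_table(entity_count, code_dimension, seed=1)
relation_table = make_table(relation_count, code_dimension, seed=2)
relation_vectors = lookup(relation_table, relation_indices)
```

Sizing a single table with max(...) avoids the error too, but entity i and relation i would then share a row if both are looked up in that table. Whether that matters depends on how the encoder uses it.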

Does R-GCN work for entity alignment?

Does R-GCN work for entity alignment? Existing R-GCN work seems to focus only on classification and link prediction.

If so, do you have any working code for this?

Or does this link prediction model work directly for that, since “a link between two nodes exists” is very similar to “two nodes refer to the same real-world object”?

recommended hardware

I ran the relational prediction training on the toy dataset on my PC without a GPU, and it is still running after more than 12 hours. Is there a specific GPU hardware configuration you can recommend?

