openke's Introduction

OpenKE (sub-project of OpenSKL)

OpenKE is a sub-project of OpenSKL, providing an Open-source Knowledge Embedding toolkit for knowledge representation learning (KRL), with TransR and PTransE as key features to handle complex relations and relational paths in large-scale knowledge graphs.

Overview

OpenKE is an efficient PyTorch-based implementation of knowledge embedding. Underlying operations such as data preprocessing and negative sampling are implemented in C++. Each model is implemented in PyTorch with Python interfaces, which makes it convenient to run models on GPUs. OpenKE contains 4 repositories:

OpenKE-PyTorch: the repository based on PyTorch, which provides the optimized and stable framework for knowledge graph embedding models.

OpenKE-Tensorflow1.0: OpenKE implemented with TensorFlow, also providing the optimized and stable framework for knowledge graph embedding models.

TensorFlow-TransX: light and simple version of OpenKE based on TensorFlow, including TransE, TransH, TransR and TransD.

Fast-TransX: efficient lightweight C++ inferences for TransE and its extended models utilizing the framework of OpenKE, including TransH, TransR, TransD, TranSparse and PTransE.

More information (especially the embedding databases of popular knowledge graphs obtained by OpenKE and related documents) is available on our website http://openke.thunlp.org/

Models

Besides our proposed TransR and PTransE, we also support the following typical knowledge embedding models:

OpenKE (PyTorch):

  • RESCAL
  • DistMult, ComplEx, Analogy
  • TransE, TransH, TransR, TransD
  • SimplE
  • RotatE

OpenKE (Tensorflow):

  • RESCAL, HolE
  • DistMult, ComplEx, Analogy
  • TransE, TransH, TransR, TransD

TensorFlow-TransX (TensorFlow):

  • TransE, TransH, TransR, TransD

Fast-TransX (C++):

  • TransE, TransH, TransR, TransD, TranSparse, PTransE

We welcome issues and requests for model implementations and bug fixes.

Evaluation

To validate the effectiveness of this toolkit, we employ the link prediction task on large-scale knowledge graphs for evaluation.

Settings

For each test triplet, the head is removed and replaced in turn by each entity in the entity set. The models score these corrupted triplets, the scores are sorted, and we record the rank of the correct entity. The whole procedure is then repeated with the tail entity removed instead. We report the proportion of correct entities ranked in the top 10/3/1 (Hits@10, Hits@3, Hits@1), together with the mean rank (MR) and mean reciprocal rank (MRR) of the test triplets under this setting.
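These metrics follow directly from the list of ranks. A minimal sketch, assuming the ranks are 1-based and have already been computed for every test triplet; the function name summarize_ranks is hypothetical and not part of OpenKE:

```python
def summarize_ranks(ranks, ks=(10, 3, 1)):
    """Compute MR, MRR and Hits@k from 1-based ranks of the correct entities."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    n = len(ranks)
    metrics = {
        "MR": sum(ranks) / n,                    # mean rank
        "MRR": sum(1.0 / r for r in ranks) / n,  # mean reciprocal rank
    }
    for k in ks:
        # Fraction of test triplets whose correct entity is ranked in the top k.
        metrics[f"Hits@{k}"] = sum(1 for r in ranks if r <= k) / n
    return metrics


# Four illustrative ranks (made-up values, not OpenKE results).
metrics = summarize_ranks([1, 2, 4, 20])
```

Lower MR is better, while higher MRR and Hits@k are better.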

Some corrupted triplets may themselves appear in the training or validation set. Such triplets may be ranked above the test triplet, but this should not be counted as an error because both triplets are true. Hence, we remove the corrupted triplets that appear in the training, validation or test set, which ensures that the remaining corrupted triplets are not in the dataset. Under this setting we report the proportion of correct entities ranked in the top 10/3/1 (Hits@10 (filter), Hits@3 (filter), Hits@1 (filter)), together with the mean rank (MR (filter)) and mean reciprocal rank (MRR (filter)) of the test triplets.
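The difference between the raw and the filtered rank can be sketched as follows. This is an illustration under stated assumptions, not the toolkit's own code: a lower score is taken to mean a more plausible triplet, triples are stored as (head, tail, rel) tuples, and the name head_ranks is hypothetical.

```python
def head_ranks(scores, true_head, tail, rel, known_triples):
    """Return (raw_rank, filtered_rank) of the true head entity.

    scores[j] is the model score of the corrupted triplet (j, tail, rel);
    a lower score is assumed to be better.
    known_triples is a set of (head, tail, rel) tuples drawn from the
    training, validation and test sets.
    """
    target = scores[true_head]
    raw = 1
    filtered = 1
    for j, value in enumerate(scores):
        if j == true_head or value >= target:
            continue  # candidate j does not outrank the correct head
        raw += 1
        # A corrupted triplet that is itself true is not counted as an error.
        if (j, tail, rel) not in known_triples:
            filtered += 1
    return raw, filtered


# Entity 1 outranks the true head 2, but (1, 7, 0) is a known true triplet.
scores = [0.9, 0.1, 0.4, 0.2]
raw_rank, filtered_rank = head_ranks(scores, 2, 7, 0, {(1, 7, 0)})
```

The same function applies to tail prediction if the roles of head and tail are swapped.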

More details of the settings above can be found in the TransE and ComplEx papers.

For large-scale entity sets, corrupting every test triplet with the whole entity set is time-consuming. Hence, we also provide an experimental setting named "type constraint", which corrupts entities using limited entity sets determined by their relations.
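The effect of a type constraint on ranking can be sketched as below: only the candidates allowed for the relation are compared against the correct entity. This is a simplified illustration, assuming a lower score is better; constrained_rank and the example candidate set are hypothetical.

```python
def constrained_rank(scores, true_entity, allowed):
    """Rank the true entity among the allowed candidates only (lower score is better)."""
    target = scores[true_entity]
    return 1 + sum(
        1 for j in allowed
        if j != true_entity and scores[j] < target
    )


scores = [0.9, 0.1, 0.4, 0.2]
rank_all = constrained_rank(scores, 2, range(len(scores)))  # whole entity set
rank_typed = constrained_rank(scores, 2, {2, 3})            # restricted candidate set
```

Because fewer candidates are scored, the constrained rank can never be worse than the rank over the whole entity set.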

Results

We provide the hyper-parameters that allow some models to achieve state-of-the-art performance (Hits@10 (filter)) on FB15K237 and WN18RR. These scripts can be found in the folder "./examples/". The results of these models are as follows: the left two columns show the performance obtained with OpenKE, and the right two columns show the performance reported in the original papers. Overall, OpenKE can reproduce the results in the original papers.

Model               WN18RR   FB15K237   WN18RR (Paper*)   FB15K237 (Paper*)
TransE (2013)       0.512    0.476      0.501             0.486
TransH (2014)       0.507    0.490      -                 -
TransR (2015)       0.519    0.511      -                 -
TransD (2015)       0.508    0.487      -                 -
DistMult (2014)     0.479    0.419      0.49              0.419
ComplEx (2016)      0.485    0.426      0.51              0.428
ConvE (2017)        0.506    0.485      0.52              0.501
RotatE (2019)       0.549    0.479      -                 0.480
RotatE+adv (2019)   0.565    0.522      0.571             0.533

RotatE has the best performance by representing knowledge in complex space. Our proposed TransR has the second-best performance, and the real-valued representations learned by TransR can be more easily integrated with other neural network models, e.g. pre-trained language models. Please refer to our other toolkit, Knowledge-Plugin, for such integration.

Usage

Installation

  1. Install PyTorch

  2. Clone the OpenKE-PyTorch branch:

git clone -b OpenKE-PyTorch https://github.com/thunlp/OpenKE --depth 1
cd OpenKE
cd openke

  3. Compile C++ files

bash make.sh

  4. Quick Start

cd ../
cp examples/train_transe_FB15K237.py ./
python train_transe_FB15K237.py

Data Format

  • For training, datasets contain three files:

    train2id.txt: the training file. The first line is the number of training triples. Each following line has the format (e1, e2, rel), which indicates that there is a relation rel between e1 and e2. Note that train2id.txt contains ids from entity2id.txt and relation2id.txt instead of the names of the entities and relations. If you use your own datasets, please check the format of your training file; files in the wrong format may cause a segmentation fault.

    entity2id.txt: all entities and corresponding ids, one per line. The first line is the number of entities.

    relation2id.txt: all relations and corresponding ids, one per line. The first line is the number of relations.

  • For testing, datasets contain the following additional files:

    test2id.txt: testing file, the first line is the number of triples for testing. Then the following lines are all in the format (e1, e2, rel) .

    valid2id.txt: validating file, the first line is the number of triples for validating. Then the following lines are all in the format (e1, e2, rel) .

    type_constrain.txt: type constraining file, the first line is the number of relations. Then the following lines are type constraints for each relation. For example, the relation with id 1200 has 4 types of head entities, which are 3123, 1034, 58 and 5733. The relation with id 1200 has 4 types of tail entities, which are 12123, 4388, 11087 and 11088. You can get this file through n-n.py in folder benchmarks/FB15K .
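Since malformed files may cause a segmentation fault, it can help to check a triple file before training. The sketch below validates the layout described above (a count on the first line, then one (e1, e2, rel) id triple per line). It is not part of OpenKE: check_triple_file is a hypothetical helper, and it assumes ids are contiguous integers starting at 0.

```python
def check_triple_file(lines, num_entities, num_relations):
    """Validate the lines of a train2id.txt-style file.

    The first non-empty line must hold the number of triples; every
    following non-empty line must hold three integer ids in
    (e1, e2, rel) order. Returns the parsed triples, or raises
    ValueError on the first problem found.
    """
    rows = [line.split() for line in lines if line.strip()]
    if not rows or len(rows[0]) != 1:
        raise ValueError("first line must be the triple count")
    declared = int(rows[0][0])
    triples = []
    # 'row_no' counts non-empty lines, starting at 2 for the first triple.
    for row_no, row in enumerate(rows[1:], start=2):
        if len(row) != 3:
            raise ValueError(f"row {row_no}: expected 3 fields, got {len(row)}")
        e1, e2, rel = (int(x) for x in row)
        # Assumption: ids run from 0 to count - 1.
        if not (0 <= e1 < num_entities and 0 <= e2 < num_entities):
            raise ValueError(f"row {row_no}: entity id out of range")
        if not 0 <= rel < num_relations:
            raise ValueError(f"row {row_no}: relation id out of range")
        triples.append((e1, e2, rel))
    if declared != len(triples):
        raise ValueError(f"header says {declared} triples, found {len(triples)}")
    return triples


# Tiny in-memory example with 3 entities and 2 relations.
sample = ["2", "0 1 0", "1 2 1"]
triples = check_triple_file(sample, num_entities=3, num_relations=2)
```

For a real dataset, an open file object can be passed as lines, with the entity and relation counts taken from the first lines of entity2id.txt and relation2id.txt.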

Citation

If you find OpenKE useful for your research, please consider citing the following paper:

 @inproceedings{han2018openke,
   title={OpenKE: An Open Toolkit for Knowledge Embedding},
   author={Han, Xu and Cao, Shulin and Lv, Xin and Lin, Yankai and Liu, Zhiyuan and Sun, Maosong and Li, Juanzi},
   booktitle={Proceedings of EMNLP},
   year={2018}
 }

This package is mainly contributed (in chronological order) by Xu Han, Yankai Lin, Ruobing Xie, Zhiyuan Liu, Xin Lv, Shulin Cao, Weize Chen, Jingqin Yang.


About OpenSKL

The OpenSKL project aims to harness the power of both structured knowledge and natural languages via representation learning. All sub-projects of OpenSKL, under the categories of Algorithm, Resource and Application, are as follows.

  • Algorithm:
    • OpenKE
      • An effective and efficient toolkit for representing structured knowledge in large-scale knowledge graphs as embeddings, with TransR and PTransE as key features to handle complex relations and relational paths.
      • This toolkit also includes three repositories:
    • ERNIE
      • An effective and efficient toolkit for augmenting pre-trained language models with knowledge graph representations.
    • OpenNE
      • An effective and efficient toolkit for representing nodes in large-scale graphs as embeddings, with TADW as key features to incorporate text attributes of nodes.
    • OpenNRE
      • An effective and efficient toolkit for implementing neural networks for extracting structured knowledge from text, with ATT as key features to consider relation-associated text information.
      • This toolkit also includes two repositories:
  • Resource:
    • The embeddings of large-scale knowledge graphs pre-trained by OpenKE, covering three typical large-scale knowledge graphs: Wikidata, Freebase, and XLORE. The embeddings are free to use under the MIT license, and please click the following link to submit download requests.
    • OpenKE-Wikidata
      • Wikidata is a free and collaborative database, collecting structured data to provide support for Wikipedia. The original Wikidata contains 20,982,733 entities, 594 relations and 68,904,773 triplets. In particular, Wikidata-5M is the core subgraph of Wikidata, containing 5,040,986 high-frequency entities from Wikidata with their corresponding 927 relations and 24,267,796 triplets.
      • TransE version: Knowledge embeddings of Wikidata pre-trained by OpenKE.
      • TransR version of Wikidata-5M: Knowledge embeddings of Wikidata-5M pre-trained by OpenKE.
    • OpenKE-Freebase
      • Freebase was a large collaborative knowledge base consisting of data composed mainly by its community members. It was an online collection of structured data harvested from many sources. Freebase contains 86,054,151 entities, 14,824 relations and 338,586,276 triplets.
      • TransE version: Knowledge embeddings of Freebase pre-trained by OpenKE.
    • OpenKE-XLORE
      • XLORE is one of the most popular Chinese knowledge graphs developed by THUKEG. XLORE contains 10,572,209 entities, 138,581 relations and 35,954,249 triplets.
      • TransE version: Knowledge embeddings of XLORE pre-trained by OpenKE.
  • Application:
    • Knowledge-Plugin
      • An effective and efficient toolkit of plug-and-play knowledge injection for pre-trained language models. Knowledge-Plugin is general for all kinds of knowledge graph embeddings mentioned above. In the toolkit, we plug the TransR version of Wikidata-5M into BERT as an example of applications. With the TransR embedding, we enhance the knowledge ability of BERT without fine-tuning the original model, e.g., up to 8% improvement on question answering.

openke's People

Contributors

albertyang33, chenweize1998, dschaehi, erickguan, helloxcq, joker-song, mrlyk423, pushpankar, shulincao, thucsthanxu13, yuchenlin, zzy14

openke's Issues

Results of TransE

Sir, how can I get the accuracy? I want to compute precision from a confusion matrix, and my results look something like this:
transeoutput

segmentation fault

➜ OpenKE git:(master) ✗ python example.py
Input Files Path : ./data/FB15K/
Output Files Path : ./data/FB15K/
The toolkit is importing datasets.
[1] 21071 segmentation fault python example.p

Confused about the test or prediction results of models

Hi, thanks for your test case; I trained and tested successfully.
But I am confused by the test stage. Here is the core piece of code:

self.lib.getHeadBatch(self.test_h_addr, self.test_t_addr, self.test_r_addr)
res = self.test_step(self.test_h, self.test_t, self.test_r)

When I debugged, I found that res is just a list with values ranging from -1 to 0. Do they stand for the probability of the indexed entity being chosen? Is the index of the minimum value then the predicted entity id? If so, why is the range [-1, 0] rather than [0, 1]?

Ambiguous dimension.

I tried to run example.py under Python 3.5 and got this error:

Traceback (most recent call last):
  File "example.py", line 21, in <module>
    con.set_model(models.TransE)
  File "/home/deeplearning/OpenKE/config/Config.py", line 148, in set_model
    self.trainModel = self.model(config = self)
  File "/home/deeplearning/OpenKE/models/Model.py", line 63, in __init__
    self.input_def()
  File "/home/deeplearning/OpenKE/models/Model.py", line 42, in input_def
    self.batch_h = tf.placeholder(tf.int64, [config.batch_seq_size])
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/array_ops.py", line 1494, in placeholder
    shape = tensor_shape.as_shape(shape)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_shape.py", line 800, in as_shape
    return TensorShape(shape)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_shape.py", line 436, in __init__
    self._dims = [as_dimension(d) for d in dims_iter]
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_shape.py", line 436, in <listcomp>
    self._dims = [as_dimension(d) for d in dims_iter]
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_shape.py", line 378, in as_dimension
    return Dimension(value)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_shape.py", line 36, in __init__
    raise ValueError("Ambiguous dimension: %s" % value)
ValueError: Ambiguous dimension: 9662.84

This needs a fix in Config.py to force the number to be an integer.
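A fractional size like the one in the traceback is what Python 3 produces when two integers are divided with "/". The sketch below shows the difference between true division and floor division. The variable names are hypothetical and this is not the actual Config.py code; the two counts are illustrative values chosen only because their quotient matches the number in the error message.

```python
train_total = 483142  # illustrative number of training triples (assumption)
nbatches = 50         # illustrative number of batches (assumption)

# In Python 3, "/" is true division and always returns a float.
float_size = train_total / nbatches

# "//" is floor division and keeps the result an integer,
# which is what a tensor dimension requires.
batch_size = train_total // nbatches

assert isinstance(float_size, float)
assert isinstance(batch_size, int)
```

Wrapping the quotient in int() would work as well; floor division simply avoids the intermediate float.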

Segmentation fault (core dumped)

Dear THUNLP,

I installed OpenKE successfully, but it reports a segmentation fault when I run example.py.

~/OpenKE$ python example.py
Input Files Path : ./data/FB15K/
Output Files Path : ./data/FB15K/
The toolkit is importing datasets.
Segmentation fault (core dumped)

model can not be exported

Hi, I cloned the code and ran the example. It seems fine and prints the log, but I cannot get the model.vec.tf file in the res folder.
It only contains some files like this:
55

Imposing Type Constraints for Link Prediction

Hi,
I was investigating the functionality around imposing type constraints - via the type_constrain.txt file.
If I understand correctly, this has not been implemented for link prediction yet, has it? Only triple_classification appears to depend on it, and training a model for triple classification on FB15K returns a segmentation fault when type_constrain.txt is removed.
I have tried to understand the underlying implementation of type constraints in your code base, but have not been able to grasp it clearly yet. Any elaboration on this would be much appreciated.

It would be interesting if type constraints are imposed on the link prediction task as one can force certain examples to label 0 based on the type constraints, which I feel would add valuable negative examples for learning. Otherwise, with uniform sampling for negative triples across the entire unrestricted range of entities, as is done now, we often return triples that could have directly been labeled as negative.

Is this something we can implement here for link prediction? Or am I missing something?

Thanks!

Knowledge-Embedding

I downloaded the wiki embedding files, named entity2vec.bin, but how do I use this file? What is the format of the content: the entity followed by its vector, or something else? For example, when I use word2vec, I first look up the vector for a word; how do I use this embedding in the same way?
One example (dim=50):
give 0.14555487 0.05351801 0.10649003 0.45990866 0.08294252 -0.46757272
0.1927298 0.42644426 -0.44441804 0.47094241 -0.38926664 0.11177664
-0.38736385 -0.34790769 0.56327432 0.49133539 -0.10508542 0.07396693
0.4166784 -0.15814875 0.63791603 -0.52564949 0.02029056 -0.0246984
-0.1063034 -0.20384586 -0.10796665 0.0293985 0.49127114 -0.54218274
0.09538431 -0.3666929 -0.24478485 0.47898671 0.11772798 0.234074
0.16560279 0.25576589 0.17955644 0.10463575 -0.09829795 -0.24375246
0.37840015 0.18217942 0.64507526 0.52854127 0.15324134 0.73259991
0.09031539 0.0369201
What about yours?
Thank you!

Is there any text data document ?

Is there any corresponding text data document, such as (ID, entity_context), available for download so that the results can be displayed visually? The data files downloaded from the website (http://openke.thunlp.org) only have IDs, or perhaps I did not find the text files. Thanks.

Request for functionality to pull out actual predictions on test examples.

Hi,
Thanks for the great framework!

Would you consider implementing functionality for returning a handle for the actual score predictions , i.e., predicted tensor elements on the test set? It will then be straightforward to rank the retrieved results and gain insight on how the model performs, on demonstrative examples.
This is essentially the same operation as your framework already covers while computing the MRR, hits@n, etc.

At the moment, I am implementing this atop the returned trained embeddings, and then carrying out my analysis. I maintain the mappings between entities/relations and their ids; and then fixing say the head 'h' and the relation 'r', I compute the corresponding scores for all possible tail candidates 't' (subject to type constraints), using their corresponding embeddings. Thereafter, I rank to retrieve the n top scoring results. My motivation behind this exercise is to then field those examples that have the true tail at a high rank, and those examples that perform poorly in their predictions.

I am, at the moment, also investigating the more proper (elegant) way of pulling out the relevant class methods and variables from your own code from config, models, etc. But it would be wonderful if you already have tested, working functionality of the same that you could share.

Some Advice

Hi friends, I hope there can be more examples. When I run example.py with TransH, the output is nan, and I don't know why. I also hope you can post the configuration on the site, such as the TensorFlow version, and describe it clearly as OpenNE does. Thank you.

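About the Triple Classification test

I read the introduction in the README and understand that this test judges whether a triple is correct by computing the dissimilarity between its head and tail nodes.
What confuses me is how to interpret the result of this test. For example, when I run the Triple Classification test on my own data, the result is extremely high (reaching 0.99). How should I interpret this result, and what does it tell me?
I'm a beginner and would appreciate any guidance. Thanks! O(∩_∩)O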

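Question about the code implementing the model constraints

Hello, I have a question from reading the code (PyTorch version). Taking TransE as an example: the algorithm in the original paper includes an entity normalization step. Where is it implemented in the code? I could not find it. Thanks.
That is, e ← e/ ∥e∥ for each entity e ∈ E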
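For reference, the normalization step quoted above rescales each entity vector to unit L2 length. The following is a minimal pure-Python sketch of the formula only; it does not show where, or whether, OpenKE applies this step, and normalize_rows is a hypothetical name.

```python
import math


def normalize_rows(embeddings):
    """Scale every embedding vector to unit L2 norm: e <- e / ||e||."""
    normalized = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        # Leave all-zero vectors untouched to avoid dividing by zero.
        normalized.append([x / norm for x in vec] if norm > 0 else list(vec))
    return normalized


# The vector (3, 4) has norm 5, so it becomes (0.6, 0.8).
unit = normalize_rows([[3.0, 4.0], [0.0, 0.0]])
```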

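Segmentation fault when the training set is too large

Hello, I got a segmentation fault when testing my own samples on the platform. After debugging and a lot of searching, I traced the error to the statement self.lib.importTrainFiles() in config.py. From what I read online, when the imported training set is too large, a segmentation fault occurs as C returns the data to Python, but I could not find a solution despite a lot of searching. How do you handle this? My training set file is 22M in size.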

Error in indices

tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,0] = 19670950215894 is not in [0, 14951)

I just executed the example.py

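After the update, training and testing no longer seem to run on my own dataset

I don't know whether this is my own problem... but the same dataset ran fine before the update, and after the update it fails with:
Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)
I saw in earlier issues that some cases were caused by the wrong order within the triples, but I checked mine and the order is correct, and it ran normally with the previous version.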

Confused about the meaning of test results

Hi! I'm running the code following your instructions:
python example_train_transe.py
which runs perfectly. However, I'm having trouble understanding the results it gives:
overall results: left 245.622864 0.382235 0.173181 0.000000 left(filter) 83.448372 0.579167 0.359415 0.000000 right 161.250595 0.460615 0.228031 0.000000 right(filter) 59.193935 0.631359 0.408847 0.000000
I'm not very familiar with C, so it's hard for me to understand your source code, and I'm having a very hard time figuring out which number is the mean rank, which is Hits@10, and why there are so many results. Could you kindly explain it to me? Thank you very much!

Chinese datasets

Are there any Chinese datasets, for example the dataset of COAE2016 Task 3?

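Dimension of relations in the TransD model

In the TransD model, relation and entity vectors have different dimensions, but in the code the relation and entity dimensions are defined to be the same. In that case, TransD would not differ much from TransE, would it? So I would like to ask whether there is a problem with this part of the code.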

Are there any official benchmark results of the framework?

Hi,

Thanks for this wonderful framework!

I was wondering whether you have reported your official results (with hyper-parameters) on the WN18 or FB15K using this framework. I am trying to reproduce the results of TransE on the two benchmark datasets, while it seems to be hard to tune for a comparable result reported by other papers.

Thus, I was thinking that you might have some benchmark results to verify that the framework could possibly reproduce similar results.

test tail

void testTail(REAL *con) {
    INT h = testList[lastTail].h;
    INT t = testList[lastTail].t;
    INT r = testList[lastTail].r;

    REAL minimal = con[t];
    INT r_s = 0;
    INT r_filter_s = 0;
    INT r_s_constrain = 0;

    for (INT j = 0; j <= entityTotal; j++) {
        REAL value = con[j];
        if (j != t && value < minimal) {
            r_s += 1;
            if (not _find(h, j, r))
                r_filter_s += 1;
        }
    }

Could you please tell me what 'j' means? Thanks.

dataset for transr

I want to make a movie dataset for TransR, which requires triples, entities, and relations. How are these three interpreted? For example, in train2id.txt, does (0 1 0) mean that the 0s are entities and 1 is their relation, or are they just sequence numbers?

Installation question

Is it currently still not possible to install on Windows? Thanks!

Segmentation fault

Hi,

When I run TransE, both the Python version and the C version (from https://github.com/thunlp/KB2E) raise "Segmentation fault".
My environment is Python 3 + macOS (I have already modified the init files to adapt them to Python 3).
snip20171111_121
snip20171111_122

I would appreciate it if you could help me with this. Thank you so much.

confused by testing

Hello, I am a little confused by the predicted results. Could you please explain what they mean? Thanks.

Analogy implementation

Hi,

I have a question about ANALOGY's implementation in PyTorch: in line 44 we see that the softplus activation function is used. Shouldn't it be the sigmoid function, however? In the original implementation we see that a sigmoid was used.

Btw, I opened a pull request on the implementation of Analogy in TensorFlow, following the implementation in PyTorch.

Thanks for your attention and support.

May I ask what the values printed here mean?

https://github.com/thunlp/OpenKE/blob/master/base/Test.h#L110-L114

printf("overall results:\n");
printf("left %f %f %f %f \n", l_rank/ testTotal, l_tot / testTotal, l3_tot / testTotal, l1_tot / testTotal);
printf("left(filter) %f %f %f %f \n", l_filter_rank/ testTotal, l_filter_tot / testTotal,  l3_filter_tot / testTotal,  l1_filter_tot / testTotal);
printf("right %f %f %f %f \n", r_rank/ testTotal, r_tot / testTotal,r3_tot / testTotal,r1_tot / testTotal);
printf("right(filter) %f %f %f %f\n", r_filter_rank/ testTotal, r_filter_tot / testTotal,r3_filter_tot / testTotal,r1_filter_tot / testTotal);

Thanks a lot!
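The four numbers on each row can be labelled by reading the variable names in the printf calls above: a mean rank followed by three hit ratios. The sketch below parses such output under the assumption that the columns are MR, Hits@10, Hits@3 and Hits@1, and that "left" and "right" refer to head and tail prediction. This reading is inferred from the names (l_rank, l_tot, l3_tot, l1_tot) rather than confirmed here. The function name parse_overall_results is hypothetical, and the sample rows are copied from another report on this page.

```python
def parse_overall_results(text):
    """Map each 'left'/'right' row of the printed results to named metrics.

    The column order (MR, Hits@10, Hits@3, Hits@1) is an assumption
    inferred from the variable names in the printf calls.
    """
    names = ("MR", "Hits@10", "Hits@3", "Hits@1")
    results = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only rows of the form "<label> v1 v2 v3 v4".
        if len(parts) != 5 or not parts[0].startswith(("left", "right")):
            continue
        results[parts[0]] = dict(zip(names, map(float, parts[1:])))
    return results


sample = """overall results:
left 3068.608398 0.001585 0.000000 0.000000
left(filter) 2783.705811 0.006737 0.001387 0.000000
right 1587.949707 0.311274 0.085992 0.000000
right(filter) 1586.709473 0.336437 0.160491 0.000000
"""
results = parse_overall_results(sample)
```

With this reading, results["right(filter)"]["Hits@10"] would be the filtered Hits@10 for tail prediction.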

Add a license to OpenKE?

Have you considered adding a license to your code? Without an explicit license, the default copyright laws apply, meaning that you retain all rights to your source code and no one may reproduce, distribute, or create derivative works from your work (source). Since you've created an open-source project, would you consider adding an open-source license?

GitHub created choosealicense.com to help repo managers select the right license. They also created the Open Source Guide, which goes more in-depth on the pros/cons of each license for different kinds of open-source projects.

The MIT license would be a great choice because it’s short, very easy to understand, and allows anyone to do anything so long as they keep a copy of the license, including your copyright notice. You’ll be able to release the project under a different license if you ever need to.

Thanks in advance!

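Explaining the output

I ran example_train_transe.py and obtained results for the TransE and TransR algorithms, but I do not quite understand what the printed results mean. For example: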
l_filter_s: 2139
0.001585 0.006737 3068.608398 2783.705811
r_filter_s: 152
0.311274 0.336437 1587.949707 1586.709473

And at the end:
overall results:
left 3068.608398 0.001585 0.000000 0.000000
left(filter) 2783.705811 0.006737 0.001387 0.000000
right 1587.949707 0.311274 0.085992 0.000000
right(filter) 1586.709473 0.336437 0.160491 0.000000

I found the corresponding C code that prints the results, but I am still not clear on what these printed results mean.

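About the correspondence between embeddings and entities

The beginner is back with another question!
This time the question is: in the trained embedding file, how do I establish the correspondence between embeddings and entities?
Specifically, does the first 100-dimensional vector in ent_embeddings in the embedding.vec.json file correspond to the entity on the first line of entity2id.txt, or to the entity with id 1?
I would be grateful for your guidance. Thank you very much!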

Test and Evaluation

Hi friend, how do I run the evaluation in OpenKE? I did not see this part in your README. Thanks.

It seems that the valid sets are not used at all?

Hi there,

I suppose that valid set is typically used to do the early stop in the training phase. I understand that running a test on a valid set after each epoch is very time-consuming, while it is better to have that option.

Also, the other important function of valid sets, I think, is to select the hyper-parameters. However, the test phase seems to be only able to run on the test set. Are the valid sets used in this framework at all?
