Pykg2vec: Python Library for KGE Methods

Pykg2vec is a library for learning representations of entities and relations in knowledge graphs, built on top of PyTorch 1.5 (a TF2 version is also available in the tf-master branch). It brings state-of-the-art Knowledge Graph Embedding (KGE) algorithms and the building blocks of the knowledge graph embedding pipeline into a single library. We hope Pykg2vec is both practical and educational for people who want to explore the related fields.

Features:

  • State-of-the-art KGE model implementations and benchmark datasets, with support for custom datasets.
  • Automatic hyperparameter discovery.
  • Tools for inspecting the learned embeddings:
    • Export of the learned embeddings in TSV or Pandas-supported formats.
    • Interactive result inspector.
    • TSNE-based visualization and KPI summaries (mean rank, hit ratio) in various formats (CSV files, figures, LaTeX tables).

We welcome any form of contribution! Please refer to CONTRIBUTING.md for more details.

To Get Started

Before using pykg2vec, we recommend installing the following:

  • Python >= 3.7 (recommended)
  • PyTorch >= 1.5

Quick Guide for Anaconda users:

  • Set up a virtual environment: we encourage you to use Anaconda to work with pykg2vec:
(base) $ conda create --name pykg2vec python=3.7
(base) $ conda activate pykg2vec
  • Set up PyTorch: we encourage you to use PyTorch with GPU support for good training performance, although a CPU-only version also runs. The following sample commands set up PyTorch:
# if you have a GPU with CUDA 10.1 installed
(pykg2vec) $ conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
# or cpu-only
(pykg2vec) $ conda install pytorch torchvision cpuonly -c pytorch
  • Set up pykg2vec:
(pykg2vec) $ git clone https://github.com/Sujit-O/pykg2vec.git
(pykg2vec) $ cd pykg2vec
(pykg2vec) $ python setup.py install

For beginners, the following papers can be good starting points: A Review of Relational Machine Learning for Knowledge Graphs; Knowledge Graph Embedding: A Survey of Approaches and Applications; and An overview of embedding models of entities and relationships for knowledge base completion.

User Documentation

The documentation is here.

Usage Examples

With the pykg2vec command-line interface, you can:

  1. Run a single algorithm with various models and datasets (customized dataset also supported).
    # Check all tunable parameters.
    (pykg2vec) $ pykg2vec-train -h
    
    # Train TransE on FB15k benchmark dataset.
    (pykg2vec) $ pykg2vec-train -mn TransE
    
    # Train using different KGE methods.
    (pykg2vec) $ pykg2vec-train -mn [TransE|TransD|TransH|TransG|TransM|TransR|Complex|ComplexN3|
                        CP|RotatE|Analogy|DistMult|KG2E|KG2E_EL|NTN|Rescal|SLM|SME|SME_BL|HoLE|
                        ConvE|ConvKB|Proje_pointwise|MuRP|QuatE|OctonionE|InteractE|HypER]
    
    # For KGE models that use a projection-based loss function, use more processes for batch generation.
    (pykg2vec) $ pykg2vec-train -mn [ConvE|ConvKB|Proje_pointwise] -npg [the number of processes, 4 or 6]
    
    # Train TransE model using different benchmark datasets.
    (pykg2vec) $ pykg2vec-train -mn TransE -ds [fb15k|wn18|wn18_rr|yago3_10|fb15k_237|ks|nations|umls|dl50a|nell_955]
    
    # Train TransE model using your own hyperparameters.
    (pykg2vec) $ pykg2vec-train -exp True -mn TransE -ds fb15k -hpf ./examples/custom_hp.yaml
    
    # Use your own dataset
    (pykg2vec) $ pykg2vec-train -mn TransE -ds [name] -dsp [path to the custom dataset]
    
  2. Tune a single algorithm.
    # Tune TransE using the benchmark dataset.
    (pykg2vec) $ pykg2vec-tune -mn [TransE] -ds [dataset name]
    
    # Tune TransE with your own search space
    (pykg2vec) $ pykg2vec-tune -exp True -mn TransE -ds fb15k -ssf ./examples/custom_ss.yaml
    
  3. Perform Inference Tasks (more advanced).
    # Train a model and perform inference tasks.
    (pykg2vec) $ pykg2vec-infer -mn TransE
    
    # Perform inference tasks over a pretrained model.
    (pykg2vec) $ pykg2vec-infer -mn TransE -ld [path to the pretrained model]
    

* NB: On Windows, use pykg2vec-train.exe, pykg2vec-tune.exe and pykg2vec-infer.exe instead.

For more examples of using the pykg2vec APIs, please check the programming examples.
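
When several models or datasets need to be trained in sequence, the command lines shown in this section can be assembled from a script. The following is a minimal sketch that only uses flags appearing in the examples (-mn, -ds, -dsp, -exp, -hpf); the helper name build_train_command is hypothetical and is not part of pykg2vec.

```python
def build_train_command(model_name, dataset=None, dataset_path=None, hyperparams_file=None):
    """Assemble a pykg2vec-train argument list (hypothetical helper, not a pykg2vec API)."""
    cmd = ["pykg2vec-train", "-mn", model_name]
    if dataset is not None:
        cmd += ["-ds", dataset]
    if dataset_path is not None:
        cmd += ["-dsp", dataset_path]
    if hyperparams_file is not None:
        # The usage examples pair -hpf with "-exp True".
        cmd += ["-exp", "True", "-hpf", hyperparams_file]
    return cmd


# Print one command line per model; nothing is executed here.
for model in ["TransE", "TransH", "DistMult"]:
    print(" ".join(build_train_command(model, dataset="fb15k_237")))
```

Each returned list can be passed to subprocess.run(cmd, check=True) to launch the run from inside the activated environment.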

Citation

Please consider citing our paper if you find pykg2vec useful for your research.

  @article{yu2019pykg2vec,
    title={Pykg2vec: A Python Library for Knowledge Graph Embedding},
    author={Yu, Shih Yuan and Rokka Chhetri, Sujit and Canedo, Arquimedes and Goyal, Palash and Faruque, Mohammad Abdullah Al},
    journal={arXiv preprint arXiv:1906.04239},
    year={2019}
  }

pykg2vec's People

Contributors

aaksakal, arkdu, arquicanedo, baxtree, figroc, louisccc, mayo42, sujit-o

pykg2vec's Issues

infer tails and infer heads for ConvE

Hi,

Can you add the infer-tails and infer-heads functionality to the ConvE model? It is not there, although I saw it is available for TransE. I get an error when using the reshaping and batch normalisation from the test method for inference. It works fine when testing a batch, but it fails when given individual pairs such as (head, rel) or (tail, rel).

The error I get is,
cuDNN launch failure : input shape ([1,1,20,20])
[[node ConvEModel/ConvE/batch_normalization_v1/cond/FusedBatchNorm_1

Stopped running at epoch 162

Many thanks for your great library.

I have run the TransE model on my custom dataset. Everything was fine for 100 epochs, but when I ran it for 1000 epochs, training stopped at epoch 162 and did not proceed. At first I thought it was caused by low memory, but it was not: I use a GPU with 22G of RAM, and at epoch 162, 13G was free.

I have attached a screenshot of the run state.
error

cannot import name 'KnowledgeGraph'

When I run the train.py example:

Traceback (most recent call last):
  File "train.py", line 1, in <module>
    from pykg2vec.config.global_config import KnowledgeGraph
ImportError: cannot import name 'KnowledgeGraph'

deprecate warning

Describe the bug
to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead

(from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.

To Reproduce
Steps to reproduce the behavior:

  1. run "kg_pipeline.tune()"
  2. See warning

Question?

Where did you find the triplet classification accuracy of the models?

Change bayesian_optimizer.py

In the Bayesian hyperparameter tuner, the objective function should return the validation set's mean rank instead of the training loss.
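
To make the proposal concrete, here is a minimal sketch of how mean rank and hit ratio are conventionally computed from the ranks of the correct entities, and of an objective that returns the validation mean rank for a minimizing tuner. The function names and the sample ranks are hypothetical; this is not the code in bayesian_optimizer.py.

```python
def mean_rank(ranks):
    """Average rank of the correct entity (1 is best); lower is better."""
    return sum(ranks) / len(ranks)


def hits_at_k(ranks, k=10):
    """Fraction of test triples whose correct entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def objective(validation_ranks):
    """Value handed to a minimizing tuner: validation mean rank, not training loss."""
    return mean_rank(validation_ranks)


# Hypothetical ranks obtained on a validation set for one hyperparameter setting.
sample_ranks = [1, 3, 12, 50, 2]
print(objective(sample_ranks))      # 13.6
print(hits_at_k(sample_ranks, 10))  # 0.6
```

Because a lower mean rank is better, the value can be passed to a minimizer unchanged; a hit-ratio objective would have to be negated first.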

[Linux] AttributeError: type object 'Path' has no attribute 'home'

I set up a virtual environment with python3.

I installed pykg2vec with: python setup.py install

When executing the example, I get this error:

$ python pykg2vec/example/train.py
Traceback (most recent call last):
File "pykg2vec/example/train.py", line 9, in
from pykg2vec.utils.trainer import Trainer
File "", line 969, in _find_and_load
File "", line 958, in _find_and_load_unlocked
File "", line 664, in _load_unlocked
File "", line 634, in _load_backward_compatible
File "/home/arqui/code/pykg2vec/pykg2vec-env/lib/python3.5/site-packages/pykg2vec-0.0.45-py3.5.egg/pykg2vec/utils/trainer.py", line 11, in
File "", line 969, in _find_and_load
File "", line 958, in _find_and_load_unlocked
File "", line 664, in _load_unlocked
File "", line 634, in _load_backward_compatible
File "/home/arqui/code/pykg2vec/pykg2vec-env/lib/python3.5/site-packages/pykg2vec-0.0.45-py3.5.egg/pykg2vec/utils/visualization.py", line 12, in
File "/home/arqui/code/pykg2vec/pykg2vec-env/lib/python3.5/site-packages/matplotlib/pyplot.py", line 32, in
import matplotlib.colorbar
File "/home/arqui/code/pykg2vec/pykg2vec-env/lib/python3.5/site-packages/matplotlib/colorbar.py", line 32, in
import matplotlib.contour as contour
File "/home/arqui/code/pykg2vec/pykg2vec-env/lib/python3.5/site-packages/matplotlib/contour.py", line 18, in
import matplotlib.font_manager as font_manager
File "/home/arqui/code/pykg2vec/pykg2vec-env/lib/python3.5/site-packages/matplotlib/font_manager.py", line 135, in
OSXFontDirectories.append(str(Path.home() / "Library/Fonts"))
AttributeError: type object 'Path' has no attribute 'home'

typo in install instructions

In the readme in the installation section there is a typo:
pip install tensoflow should be pip install tensorflow

SME test_batch function

There is a bug when training SME. Tracing it leads to the test_batch method of the SME class, which contains only a pass statement, yet the Evaluator calls that function.

    def test_batch(self):
        pass

The screenshot of the error:

image

Export embedding table

I'm trying to get the embedding table (entity_name, embedding_representation_vector), but without success so far.

I used the sample code in the readme to embed Freebase15K and tried to pickle-dump the model and the trainer, but an error occurs: TypeError: can't pickle _thread.RLock objects. I'm new to tensorflow and haven't found a good solution for this.

Can I export the model to be used for transfer training?
How can I export the entity embedding table to json or csv?
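
As an illustration of the export step being asked about, the sketch below writes (entity name, vector) pairs to a TSV file and to a JSON file using only the standard library. It assumes the names and vectors have already been pulled out of the trained model as plain Python lists; how to obtain them from pykg2vec is not shown, and the function names, file names, and sample values are hypothetical.

```python
import csv
import json
import os
import tempfile


def export_embeddings_tsv(names, vectors, path):
    """Write one row per entity: name, then each vector component."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        for name, vec in zip(names, vectors):
            writer.writerow([name] + [f"{x:.6f}" for x in vec])


def export_embeddings_json(names, vectors, path):
    """Write a {name: [components...]} mapping."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({n: [float(x) for x in v] for n, v in zip(names, vectors)}, f)


# Hypothetical data standing in for a learned entity embedding table.
names = ["entity_a", "entity_b"]
vectors = [[0.1, 0.2], [0.3, 0.4]]

out_dir = tempfile.mkdtemp()
tsv_path = os.path.join(out_dir, "ent_embedding.tsv")
json_path = os.path.join(out_dir, "ent_embedding.json")
export_embeddings_tsv(names, vectors, tsv_path)
export_embeddings_json(names, vectors, json_path)
```

Exporting plain lists of numbers rather than pickling the model or trainer avoids serializing objects that hold locks or sessions, which is what the TypeError above complains about.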

Add link to preprint in readme

I see that the readme of this repo mentions the associated title under the Cite section: "pykg2vec: Python Knowledge Graph Embedding Library". Please also consider adding a link to the corresponding preprint on Arxiv: https://arxiv.org/abs/1906.04239

Having this link in the readme will be useful for reference purposes. Thanks.

"KG2E" object has no attribute 'ent_embeddings'

Describe the bug
"KG2E" object has no attribute 'ent_embeddings'

To Reproduce
Run train.py with kg2e as model.

Expected behavior
After training, the trainer tries to visualize kg2e embeddings. There is an error in initialization of Visualization class because kg2e has no ent/rel embeddings.

Screenshots
image

Bayesian Optimizer

Bug
Bayesian optimizer raises an error in the model parameter definition step: model.def_parameters() in trainer.build_model().

Since the error message is long I included it as a screenshot below.

To Reproduce
Run the tune_model.py under pykg2vec/example with the following command:

python tune_model.py -mn TransE

The same error occurs with TransR, KG2E, RESCAL, SME, and RotatE. The other models might have the same issue; I have not tried them.

Expected behavior
Tuning process should start by printing out the iteration number and the minimum loss.

Screenshots
image

Desktop

  • OS: Windows 10

Issue for using pykg2vec

Hi,

I have a graph based on the WordNet knowledge base. I want to convert each relation edge into a vector with the ConvE and TransE models using pykg2vec. Can anyone help me with this?

Thanks!

ConvE performance issue

Hi,
The ConvE model does not learn, even after several epochs, when trained with the given default parameters. Have you seen good performance with this model?

connection.py line 393

I got this error on my custom dataset.

iter[0] ---Train Loss: 549108.26129 ---time: 2612.50
Testing [992/391166] Triples
Inferring for Evaluation: | Done:100% Time: 0:01:38
Traceback (most recent call last):
File "/Users/canedo/anaconda3/envs/pykg2vec/lib/python3.6/multiprocessing/queues.py", line 240, in _feed
send_bytes(obj)
File "/Users/canedo/anaconda3/envs/pykg2vec/lib/python3.6/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/Users/canedo/anaconda3/envs/pykg2vec/lib/python3.6/multiprocessing/connection.py", line 393, in _send_bytes
header = struct.pack("!i", n)
struct.error: 'i' format requires -2147483648 <= number <= 2147483647
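
The last frames of the traceback show multiprocessing packing the payload length with struct.pack("!i", n), a signed 32-bit integer, so a single object sent through the queue cannot serialize to more than 2147483647 bytes. The sketch below reproduces that limit in isolation and shows one possible workaround, sending results in smaller chunks. The chunked helper is hypothetical and has not been checked against pykg2vec's evaluator.

```python
import struct

MAX_I32 = 2**31 - 1  # largest value the "!i" header can hold

# A length at the limit packs into a 4-byte header.
header = struct.pack("!i", MAX_I32)

# One byte more raises the same struct.error seen in the traceback.
try:
    struct.pack("!i", MAX_I32 + 1)
    overflowed = False
except struct.error as exc:
    overflowed = True
    print(exc)


def chunked(items, size):
    """Yield consecutive slices of at most `size` items (hypothetical helper)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Putting each chunk on the queue separately keeps every message small.
print(list(chunked([1, 2, 3, 4, 5], 2)))
```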

TransM does not save embedding.

When we use the code, we find that the TransM algorithm does not automatically save the embeddings obtained from the training. Is it missing in the code?

About Format of User Defined Dataset

I'm trying to reproduce the NTN model using pykg2vec. I noticed in issue #12 that your team has added support for user-defined datasets through a parameter, but the description is not detailed enough for me to use it. Could you explain the dataset format in more detail, please?
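
For orientation, here is a sketch of preparing such a dataset. It rests on an assumption that should be checked against the project documentation: that a custom dataset is a folder holding three tab-separated text files named [name]-train.txt, [name]-valid.txt and [name]-test.txt, with one head, relation, tail triple per line. The function name, split fractions, and sample triples are hypothetical.

```python
import os
import random
import tempfile


def write_custom_dataset(triples, name, out_dir, valid_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle triples and write train/valid/test files (assumed layout, see above)."""
    triples = list(triples)
    random.Random(seed).shuffle(triples)
    n_valid = int(len(triples) * valid_frac)
    n_test = int(len(triples) * test_frac)
    splits = {
        "valid": triples[:n_valid],
        "test": triples[n_valid:n_valid + n_test],
        "train": triples[n_valid + n_test:],
    }
    os.makedirs(out_dir, exist_ok=True)
    paths = {}
    for split, rows in splits.items():
        path = os.path.join(out_dir, f"{name}-{split}.txt")
        with open(path, "w", encoding="utf-8") as f:
            for head, relation, tail in rows:
                f.write(f"{head}\t{relation}\t{tail}\n")
        paths[split] = path
    return paths


# Hypothetical toy graph: ten triples chained by a single relation.
demo_triples = [(f"e{i}", "related_to", f"e{i + 1}") for i in range(10)]
demo_paths = write_custom_dataset(demo_triples, "mykg", tempfile.mkdtemp())
print(demo_paths)
```

Under the same assumption, the resulting folder would be passed with the flags from the usage examples: pykg2vec-train -mn NTN -ds mykg -dsp [path to the folder].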

Bug Tips

There is an error in README.md:
python train.py -m TransE # Run TransE model.
should be
python train.py -mn TransE # Run TransE model.
