
neuralkg's Introduction


An Open Source Library for Diverse Representation Learning of Knowledge Graphs

English | 中文

NeuralKG is a Python-based library for diverse representation learning of knowledge graphs, implementing Conventional KGEs, GNN-based KGEs, and Rule-based KGEs. We provide comprehensive documentation for beginners and an online website to organize an open and shared KG representation learning community.




Overview

NeuralKG is built on PyTorch Lightning. It provides a general workflow of diverse representation learning on KGs and is highly modularized, supporting three series of KGEs. It has the following features:

  • Support diverse types of methods. NeuralKG, as a library for diverse representation learning of KGs, provides implementations of three series of KGE methods, including Conventional KGEs, GNN-based KGEs, and Rule-based KGEs.

  • Support easy customization. NeuralKG contains fine-grained decoupled modules commonly used across different KGEs, including KG data preprocessing, a Sampler for negative sampling, a Monitor for hyperparameter tuning, and a Trainer covering training and model validation.

  • Long-term technical maintenance. The core team of NeuralKG offers long-term technical maintenance, and contributions from other developers via pull requests are welcome.
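To make the Conventional KGE family above concrete, here is a minimal, illustrative TransE scoring function in plain Python. This is a sketch of the idea only, not NeuralKG's implementation (which scores batched PyTorch tensors):

```python
from math import sqrt

def transe_score(head, rel, tail):
    """TransE plausibility: negative L2 distance between head + rel and tail.

    A triple (h, r, t) is considered plausible when h + r is close to t,
    so higher (less negative) scores mean more plausible triples.
    """
    return -sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, rel, tail)))

# Toy 3-dimensional embeddings: good_t satisfies head + rel = tail exactly.
head, rel = [1.0, 0.0, 2.0], [0.5, 1.0, -1.0]
good_t = [1.5, 1.0, 1.0]
bad_t = [0.0, 0.0, 0.0]

assert transe_score(head, rel, good_t) == 0.0            # perfect translation
assert transe_score(head, rel, bad_t) < transe_score(head, rel, good_t)
```

GNN-based models replace the embedding lookup with encoder outputs and rule-based models add rule-derived constraints, but the score-then-rank evaluation loop stays the same.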


Demo

Below is a demonstration of NeuralKG.


Implemented KGEs

Components | Models
KGEModel | TransE, TransH, TransR, ComplEx, DistMult, RotatE, ConvE, BoxE, CrossE, SimplE, HAKE, PairRE, DualE
GNNModel | RGCN, KBAT, CompGCN, XTransE, SEGNN
RuleModel | ComplEx-NNE+AER, RUGE, IterE
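All of the models above are trained against corrupted (negative) triples produced by the Sampler module mentioned in the Overview. As a hedged sketch of the simplest strategy, uniform corruption (the helper name `corrupt_uniform` is ours for illustration, not NeuralKG's API):

```python
import random

def corrupt_uniform(triple, num_entities, num_neg, rng=random):
    """Make negatives by replacing the head or tail with a random entity.

    Real samplers (e.g. Bernoulli sampling) also filter out corrupted
    triples that happen to be true; this sketch omits that step.
    """
    h, r, t = triple
    negatives = []
    for _ in range(num_neg):
        e = rng.randrange(num_entities)
        # Corrupt head or tail with equal probability, as in the TransE paper.
        negatives.append((e, r, t) if rng.random() < 0.5 else (h, r, e))
    return negatives

random.seed(0)
negs = corrupt_uniform((3, 1, 7), num_entities=100, num_neg=4)
# Every negative keeps the relation and changes exactly one of head/tail.
```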

Quick Start

Installation

Step 1: Create a virtual environment using Anaconda and activate it

conda create -n neuralkg python=3.8
conda activate neuralkg

Step 2: Install PyTorch and DGL appropriate for your CUDA version

Here is a sample installation for CUDA 11.1:

  • Install PyTorch
pip install torch==1.9.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html
  • Install DGL
pip install dgl-cu111 dglgo -f https://data.dgl.ai/wheels/repo.html

Step 3: Install the NeuralKG package

  • From PyPI
pip install neuralkg
  • From Source
git clone https://github.com/zjukg/NeuralKG.git
cd NeuralKG
python setup.py install

Training

# Use bash script
sh ./scripts/your-sh

# Use config
python main.py --load_config --config_path <your-config>
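For orientation, a config file passed via --config_path is a flat YAML of hyperparameters. The sketch below uses field names that appear in the sweep example later in this README (dataset_name, model_name, and so on); the exact set of supported keys may differ, so treat it as a hedged illustration rather than a canonical config:

```yaml
dataset_name: FB15K237
model_name: TransE
loss_name: Adv_Loss
train_sampler_class: UniSampler
emb_dim: 400
lr: 1e-4
train_bs: 1024
num_neg: 128
```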

Evaluation

python main.py --test_only --checkpoint_dir <your-model-path>

Hyperparameter Tuning

NeuralKG uses Weights & Biases to support various forms of hyperparameter optimization, such as grid search, random search, and Bayesian optimization. The search type and search space are specified in a "*.yaml" configuration file.

The following config file performs hyperparameter optimization of TransE on the FB15K-237 dataset using Bayesian search:

command:
  - ${env}
  - ${interpreter}
  - ${program}
  - ${args}
program: main.py
method: bayes
metric:
  goal: maximize
  name: Eval|hits@10
parameters:
  dataset_name:
    value: FB15K237
  model_name:
    value: TransE
  loss_name:
    values: [Adv_Loss, Margin_Loss]
  train_sampler_class:
    values: [UniSampler, BernSampler]
  emb_dim:
    values: [400, 600]
  lr:
    values: [1e-4, 5e-5, 1e-6]
  train_bs:
    values: [1024, 512]
  num_neg:
    values: [128, 256]
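For scale, the discrete search space defined above contains 2 x 2 x 2 x 3 x 2 x 2 = 96 combinations (Bayesian search samples from this space rather than enumerating it as grid search would). A quick way to count it:

```python
from itertools import product

# The value lists from the sweep config above; keys with a single fixed
# `value` (dataset_name, model_name) contribute a factor of 1 and are omitted.
search_space = {
    "loss_name": ["Adv_Loss", "Margin_Loss"],
    "train_sampler_class": ["UniSampler", "BernSampler"],
    "emb_dim": [400, 600],
    "lr": [1e-4, 5e-5, 1e-6],
    "train_bs": [1024, 512],
    "num_neg": [128, 256],
}

combos = list(product(*search_space.values()))
print(len(combos))  # 96
```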

Reproduced Results

Below are some model results on the FB15K-237 dataset reproduced with NeuralKG. More results are available here.

Method MRR Hit@1 Hit@3 Hit@10
TransE 0.32 0.23 0.36 0.51
TransR 0.23 0.16 0.26 0.38
TransH 0.31 0.2 0.34 0.50
DistMult 0.30 0.22 0.33 0.48
ComplEx 0.25 0.17 0.27 0.40
SimplE 0.16 0.09 0.17 0.29
ConvE 0.32 0.23 0.35 0.50
RotatE 0.33 0.23 0.37 0.53
BoxE 0.32 0.22 0.36 0.52
HAKE 0.34 0.24 0.38 0.54
PairRE 0.35 0.25 0.38 0.54
DualE 0.33 0.24 0.36 0.52
XTransE 0.29 0.19 0.31 0.45
RGCN 0.25 0.16 0.27 0.43
KBAT* 0.28 0.18 0.31 0.46
CompGCN 0.34 0.25 0.38 0.52
SEGNN 0.36 0.27 0.39 0.54
IterE 0.26 0.19 0.29 0.41

*: KBAT suffers from a label leakage issue; after correction, its results are worse than those reported in the paper. See deepakn97/relationPrediction#28 for details.
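The MRR and Hit@k columns are the standard rank-based link-prediction metrics: for each test triple, the correct entity is ranked against all candidates and the ranks are aggregated. A minimal illustrative computation (not NeuralKG's evaluation code):

```python
def mrr(ranks):
    """Mean reciprocal rank over 1-based ranks of the correct entity."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at(ranks, k):
    """Fraction of queries whose correct entity lands in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

# Toy ranks for four test triples.
ranks = [1, 2, 5, 10]
print(round(mrr(ranks), 3))  # 0.45
print(hits_at(ranks, 1))     # 0.25
print(hits_at(ranks, 10))    # 1.0
```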


Notebook Guide

We provide Colab notebooks to help users get started with our library.

Colab Notebook


Detailed Documentation

https://zjukg.github.io/NeuralKG/neuralkg.html


Citation

Please cite our paper if you use NeuralKG in your work:

@inproceedings{neuralkg,
  author    = {Wen Zhang and
               Xiangnan Chen and
               Zhen Yao and
               Mingyang Chen and
               Yushan Zhu and
               Hongtao Yu and
               Yufeng Huang and
               Yajing Xu and
               Ningyu Zhang and
               Zezhong Xu and
               Zonggang Yuan and
               Feiyu Xiong and
               Huajun Chen},
  title     = {NeuralKG: An Open Source Library for Diverse Representation Learning
               of Knowledge Graphs},
  booktitle = {{SIGIR}},
  pages     = {3323--3328},
  publisher = {{ACM}},
  year      = {2022}
}

NeuralKG Core Team

Wen Zhang, Xiangnan Chen, Zhen Yao, Mingyang Chen, Yushan Zhu, Hongtao Yu, Yufeng Huang, Zezhong Xu, Yajing Xu, Peng Ye, Yichi Zhang, Ningyu Zhang, Guozhou Zheng, Haofen Wang, Huajun Chen

neuralkg's People

Contributors

anselcmy, bighyf, chenxn2020, jinlong22, modberge, potsss, wencolani, wosigewozai-ffmm, yep96, yushanzhu, zhang-each


neuralkg's Issues

Question about the training progress display

Epoch 29: 87%|█████▏| 1262/1443 [00:20<00:02, 62.34it/s, loss=0.00156, v_num=3]
Epoch 29: 88%|█████▎| 1272/1443 [00:20<00:02, 62.47it/s, loss=0.00156, v_num=3]
Epoch 29: 89%|█████▎| 1282/1443 [00:20<00:02, 62.62it/s, loss=0.00156, v_num=3]
Epoch 29: 90%|█████▎| 1292/1443 [00:20<00:02, 62.78it/s, loss=0.00156, v_num=3]
Epoch 29: 90%|█████▍| 1302/1443 [00:20<00:02, 62.91it/s, loss=0.00156, v_num=3]
Validating: 49%|██████████████ | 133/274 [00:02<00:01, 86.90it/s]
Epoch 29: 91%|█████▍| 1312/1443 [00:20<00:02, 63.03it/s, loss=0.00156, v_num=3]
Epoch 29: 92%|█████▍| 1322/1443 [00:20<00:01, 63.16it/s, loss=0.00156, v_num=3]
Epoch 29: 92%|█████▌| 1332/1443 [00:21<00:01, 63.30it/s, loss=0.00156, v_num=3]
Epoch 29: 93%|█████▌| 1342/1443 [00:21<00:01, 63.43it/s, loss=0.00156, v_num=3]
Epoch 29: 94%|█████▌| 1352/1443 [00:21<00:01, 63.56it/s, loss=0.00156, v_num=3]
Epoch 29: 94%|█████▋| 1362/1443 [00:21<00:01, 63.69it/s, loss=0.00156, v_num=3]
Epoch 29: 95%|█████▋| 1372/1443 [00:21<00:01, 63.84it/s, loss=0.00156, v_num=3]
Epoch 29: 96%|█████▋| 1382/1443 [00:21<00:00, 63.98it/s, loss=0.00156, v_num=3]
Epoch 29: 96%|█████▊| 1392/1443 [00:21<00:00, 64.10it/s, loss=0.00156, v_num=3]
Epoch 29: 97%|█████▊| 1402/1443 [00:21<00:00, 64.24it/s, loss=0.00156, v_num=3]
Epoch 29: 98%|█████▊| 1412/1443 [00:21<00:00, 64.38it/s, loss=0.00156, v_num=3]
Epoch 29: 99%|█████▉| 1422/1443 [00:22<00:00, 64.51it/s, loss=0.00156, v_num=3]
Epoch 29: 99%|█████▉| 1432/1443 [00:22<00:00, 64.64it/s, loss=0.00156, v_num=3]
Epoch 29: 100%|█| 1443/1443 [00:22<00:00, 64.37it/s, loss=0.00156, v_num=3, Eval
Epoch 59: 81%|▊| 1170/1443 [00:18<00:04, 64.21it/s, loss=0.00152, v_num=3, Eval
Is this normal during training?

Problem running commands

Running the config command:
python main.py --load_config --config/FreeBase/<ConvE_FB15K237.yaml>
gives the error:
syntax error near unexpected token `newline'

Running the shell script:
sh ./scripts/ConvE_FB.sh
gives the error:
Can't open ./scripts/ConvE_FB.sh

The purpose of use_weight

While reading the code, I repeatedly came across code related to the args.use_weight parameter. It appears that use_weight applies some form of weighting when computing the loss. How exactly is this weighting implemented, and is there a paper that describes it?

Question about ConvE performance

When training with the bundled ConvE configuration files on the FB15K237 and WN18RR datasets, metrics such as MRR and Hits@n fall short of the results given in the documentation; for example, Hits@10 is only about 0.29, far below the documented 0.50. Is this lower performance caused by something on my side, or by an issue in the code itself?

Saving predicted triples

Hello! During validation and testing, how can I output and save all the predicted triples? Is there a direct entry point for this?

CompGCN model issue

Hello, I trained the CompGCN model as instructed, but at test time, when loading the trained checkpoint, I get the following error:
RuntimeError: Error(s) in loading state_dict for CompGCNLitModel:
unexpected key(s) in state_dict: "model.GraphCov.rel", "loss.model.GraphCov.rel".
How can this be resolved?

Demo.py

Hi, thank you for the great code! I noticed there is a demo.py in your README demonstration, but I can't find it in your code base. I would appreciate it if you could provide it.

Question about customizing the training and testing modules

To inspect and debug the model, I modified the code in src/neuralkg/lit_model/KGELitModel.py and src/neuralkg/eval_task/link_prediction.py. However, when training and testing via main.py, these modifications never seem to take effect; the behavior matches the unmodified code. What causes this, and how can I make my modified code take effect during training?

Does NeuralKG provide a TransE model identical to the original paper?

While reading the code, I noticed that the default TransE provided by NeuralKG does not exactly match the original TransE paper; for example, the loss uses a logsigmoid-based formulation rather than the original margin-based one.

How much do these deviations from the original paper affect model performance? And does NeuralKG also provide a TransE identical to the original paper?

Is there early stopping?

Using the TransE_FB.sh script you provide, I noticed that only MAX_EPOCHS=5000 is specified. Is there an early-stopping parameter to control model training?

Inquiries about code issues

Hello, I was browsing GitHub for the open-source library for diverse representation learning of knowledge graphs (NeuralKG). It demonstrates a knowledge question-answering demo on GitHub. I ran into a multithreading problem when debugging the demo code locally; the specific problem is shown in the following figure:
(screenshot failed to upload)

Local runtime environment:
Windows 11, torch 2.12.1+cu116

Question about label.shape

Can the shape of label be configured? Is it always [128, 14541]? If I want it to be [128, ], what should I do?

Trying CompGCN on pharmkg dataset

While experimenting with CompGCN, I tried to change the embedding size from 100 to 150, and this is the error it returns:
Traceback (most recent call last):
File "main.py", line 111, in <module>
main()
File "main.py", line 101, in main
trainer.fit(lit_model, datamodule=kgdata)
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 740, in fit
self._call_and_handle_interrupt(
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 777, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1199, in _run
self._dispatch()
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1279, in _dispatch
self.training_type_plugin.start_training(self)
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 202, in start_training
self._results = trainer.run_stage()
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1289, in run_stage
return self._run_train()
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1311, in _run_train
self._run_sanity_check(self.lightning_module)
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1375, in _run_sanity_check
self._evaluation_loop.run()
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 110, in advance
dl_outputs = self.epoch_loop.run(dataloader, dataloader_idx, dl_max_batches, self.num_dataloaders)
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 122, in advance
output = self._evaluation_step(batch, batch_idx, dataloader_idx)
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 217, in _evaluation_step
output = self.trainer.accelerator.validation_step(step_kwargs)
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 239, in validation_step
return self.training_type_plugin.validation_step(*step_kwargs.values())
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 219, in validation_step
return self.model.validation_step(*args, **kwargs)
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/neuralkg/lit_model/CompGCNLitModel.py", line 40, in validation_step
ranks = link_predict(batch, self.model, prediction='all')
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/neuralkg/eval_task/link_prediction.py", line 18, in link_predict
tail_ranks = tail_predict(batch, model)
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/neuralkg/eval_task/link_prediction.py", line 59, in tail_predict
pred_score = model.get_score(batch, "tail_predict")
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/neuralkg/model/GNNModel/CompGCN.py", line 127, in get_score
score = self.ConvE(head_emb, rela_emb, x)
File "/media/users/caforio/anaconda3/envs/neuralkg/lib/python3.8/site-packages/neuralkg/model/GNNModel/CompGCN.py", line 165, in ConvE
x = torch.mm(x, all_ent.transpose(1, 0)) # [batch_size, ent_num]
RuntimeError: mat1 and mat2 shapes cannot be multiplied (384x200 and 300x7241)

Here's the configuration:
DATA_DIR=dataset

MODEL_NAME=CompGCN
DATASET_NAME=pharmkg
DATA_PATH=$DATA_DIR/$DATASET_NAME
LITMODEL_NAME=CompGCNLitModel
TRAIN_SAMPLER_CLASS=CompGCNSampler
TEST_SAMPLER_CLASS=CompGCNTestSampler
MAX_EPOCHS=2000
EMB_DIM=150
LOSS_NAME=Cross_Entropy_Loss
TRAIN_BS=2048
EVAL_BS=256
NUM_NEG=1
LR=0.0001
CHECK_PER_EPOCH=50
DECODER_MODEL=ConvE
OPN=mult
NUM_WORKERS=16
GPU=0,1

Error when running a script

Hello, when running sh ./scripts/CCKS/Grail.sh I ran into the following problems (see attached screenshots). How should I resolve them?

How can I extract the embeddings?

Dear authors,
Hello!
I am a newcomer to KG, and I think this work of yours is extremely valuable!
If I want to obtain the embeddings learned by the different KGE methods, how should I modify main.py? Looking forward to your reply; many thanks!

Where is the epoch parameter?

Dear Authors,

Thanks for your work, which provides much convenience for reproducing KGE methods. I ran into a problem when reproducing the models: I cannot change the number of epochs by changing num_epoch in the config. Could you give me some suggestions?

Many thanks,
Best wishes

How can I debug and modify things during training?

Hello, I plan to add new modules to the model and to train with the DARTS framework (a gradient-based neural architecture search method). This will involve modifying the gradient-update formulas of different modules and setting different numbers of training iterations for each module. From the code, it appears I would have to subclass and modify pytorch_lightning.Trainer. Could you offer some advice here? Many thanks!

The ccks argument is not available

(screenshot from 2023-07-04 20:41:03)
When running sh scripts/CCKS/Grail.sh, I get the error main.py: error: unrecognized arguments: --ccks False. When I add the following to main.py:
parser = setup_parser()  # set up the arguments
args = parser.parse_args()
print(args.ccks)
I instead get:
Traceback (most recent call last):
File "main.py", line 188, in <module>
print(args.ccks)
AttributeError: 'Namespace' object has no attribute 'ccks'
How can this be resolved?

Running demo.py raises RuntimeError: Early stopping conditioned on metric `Eval_mrr` which is not available. Pass in or modify your `EarlyStopping` callback to use any of the following: `Train|loss`, `Eval|mrr`, `Eval|hits@10`

On Windows, I changed Eval|mrr to Eval_mrr, and running demo.py fails with the error:
RuntimeError: Early stopping conditioned on metric Eval_mrr which is not available. Pass in or modify your EarlyStopping callback to use any of the following: Train|loss, Eval|mrr, Eval|hits@10
Global seed set to 321
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

| Name | Type | Params

0 | model | TransE | 662
1 | loss | Adv_Loss | 662

660 Trainable params
2 Non-trainable params
662 Total params
0.003 Total estimated model params size (MB)
Epoch 0: 25%|██▌ | 1/4 [00:00<00:00, 83.55it/s, loss=1.12, v_num=8]Global seed set to 321
Epoch 1: 80%|████████ | 4/5 [00:00<00:00, 95.48it/s, loss=0.937, v_num=8]
Validating: 0it [00:00, ?it/s]
Validating: 0%| | 0/1 [00:00<?, ?it/s]Traceback (most recent call last):
File "H:\xxxx\20231118-NeuralKG-test\demo.py", line 121, in <module>
main(arg_path = 'config/TransE_demo_kg.yaml')
File "H:\xxxx\20231118-NeuralKG-test\demo.py", line 99, in main
trainer.fit(lit_model, datamodule=kgdata)
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 740, in fit
self._call_and_handle_interrupt(
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 685, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 777, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1199, in _run
self._dispatch()
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1279, in _dispatch
self.training_type_plugin.start_training(self)
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\plugins\training_type\training_type_plugin.py", line 202, in start_training
self._results = trainer.run_stage()
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1289, in run_stage
return self._run_train()
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1319, in _run_train
self.fit_loop.run()
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\loops\base.py", line 145, in run
self.advance(*args, **kwargs)
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\loops\fit_loop.py", line 234, in advance
self.epoch_loop.run(data_fetcher)
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\loops\base.py", line 146, in run
self.on_advance_end()
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\loops\epoch\training_epoch_loop.py", line 242, in on_advance_end
self._run_validation()
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\loops\epoch\training_epoch_loop.py", line 337, in _run_validation
self.val_loop.run()
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\loops\base.py", line 151, in run
output = self.on_run_end()
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\loops\dataloader\evaluation_loop.py", line 140, in on_run_end
self._on_evaluation_end()
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\loops\dataloader\evaluation_loop.py", line 202, in _on_evaluation_end
self.trainer.call_hook("on_validation_end", *args, **kwargs)
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1495, in call_hook
callback_fx(*args, **kwargs)
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\trainer\callback_hook.py", line 221, in on_validation_end
callback.on_validation_end(self, self.lightning_module)
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\callbacks\early_stopping.py", line 194, in on_validation_end
self._run_early_stopping_check(trainer)
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\callbacks\early_stopping.py", line 200, in _run_early_stopping_check
if trainer.fast_dev_run or not self._validate_condition_metric( # disable early_stopping with fast_dev_run
File "C:\Users\liukuan.conda\envs\neuralkg2\lib\site-packages\pytorch_lightning\callbacks\early_stopping.py", line 151, in _validate_condition_metric
raise RuntimeError(error_msg)
RuntimeError: Early stopping conditioned on metric Eval_mrr which is not available. Pass in or modify your EarlyStopping callback to use any of the following: Train|loss, Eval|mrr, Eval|hits@10
How should I resolve this? Many thanks!

Inductive Models Question

Really excited to try out this library! I was curious whether any of the built-in models are inductive, i.e., able to generate embeddings for nodes that are new to the network?

Training on large-scale KGs

Hi,

Does NeuralKG support training on large-scale KGs, such as OGB's wikikg90m dataset?

There seems to be a path error in the main function

I am trying to run the Grail script under CCKS, but a few lines in the main function appear to be wrong (see attached screenshot).
Here model_checkpoint.best_model_path is empty; following the README_CCKS workflow and running one epoch produces the error below (see attached screenshot).
Checking the logic in main, it seems this should be model_checkpoint.dirpath instead. Is my understanding correct? Thanks!

More Guidance Needed on Training Models on Own Datasets

Would it be possible for you to provide more details on how to train various models on users' own datasets?

It's not clear what has to go into the config files (e.g., what specifically should be specified for env, interpreter, program, or args, or why program appears twice in the config).

One thing that may be helpful: in the docs, you share results of the library on various datasets (https://zjukg.github.io/NeuralKG/result.html). If you could provide the command you used to run each of those pipelines, that would be great.

Also, it's not clear from the docs how one must treat the data loaders differently compared to tabular data, as the examples in the docs refer to image datasets: https://zjukg.github.io/NeuralKG/neuralkg.data.html#neuralkg.data.base_data_module.BaseDataModule.train_dataloader. What must the structure of datasets be for various models? What can be done to datasets to better prepare them for different models (e.g., encoding entities/relations, etc.)

Training with the demo_kg dataset

After changing ConvE_*.yaml to use the demo_kg dataset, the data does not seem to load correctly, and the following warnings appear:
D:\Users\lenovo\anaconda3\envs\deepKE\lib\site-packages\pytorch_lightning\utilities\data.py:122: UserWarning: DataLoader returned 0 length. Please make sure this was your intention.
rank_zero_warn(
D:\Users\lenovo\anaconda3\envs\deepKE\lib\site-packages\pytorch_lightning\utilities\data.py:153: UserWarning: Total length of CombinedLoader across ranks is zero. Please make sure this was your intention.
rank_zero_warn(
As a result, the model never trains and no parameters are saved, and the run finally fails with:
Traceback (most recent call last):
File ".\main.py", line 112, in <module>
main()
File ".\main.py", line 107, in main
lit_model.load_state_dict(torch.load(path)["state_dict"])
File "D:\Users\lenovo\anaconda3\envs\deepKE\lib\site-packages\torch\serialization.py", line 594, in load
with _open_file_like(f, 'rb') as opened_file:
File "D:\Users\lenovo\anaconda3\envs\deepKE\lib\site-packages\torch\serialization.py", line 230, in _open_file_like
return _open_file(name_or_buffer, mode)
File "D:\Users\lenovo\anaconda3\envs\deepKE\lib\site-packages\torch\serialization.py", line 211, in __init__
super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: ''

Hyperparameter optimization

Hello! Could the official documentation be made more detailed? I am very interested in the library, but getting started is a bit difficult.

Multi-process for dataloader when set num_workers larger than 0 will raise an error by RGCN model

Hi, when I set num_workers > 0, an error is raised. I think one cause of the error is DGL, but I am not sure.
Here is the error output:
Sanity Checking: 0it [00:00, ?it/s]Traceback (most recent call last):
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 38, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 650, in _fit_impl
self._run(model, ckpt_path=self.ckpt_path)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1103, in _run
results = self._run_stage()
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1182, in _run_stage
self._run_train()
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1195, in _run_train
self._run_sanity_check()
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1267, in _run_sanity_check
val_loop.run()
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
self.advance(*args, **kwargs)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 152, in advance
dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
self.advance(*args, **kwargs)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 121, in advance
batch = next(data_fetcher)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/utilities/fetching.py", line 184, in __next__
return self.fetching_function()
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/utilities/fetching.py", line 275, in fetching_function
return self.move_to_device(batch)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/utilities/fetching.py", line 294, in move_to_device
batch = self.batch_to_device(batch)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 142, in batch_to_device
batch = self.trainer._call_strategy_hook("batch_to_device", batch, dataloader_idx=dataloader_idx)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1485, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 273, in batch_to_device
return model._apply_batch_transfer_handler(batch, device=device, dataloader_idx=dataloader_idx)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/core/module.py", line 332, in _apply_batch_transfer_handler
batch = self._call_batch_hook("transfer_batch_to_device", batch, device, dataloader_idx)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/core/module.py", line 320, in _call_batch_hook
return trainer_method(hook_name, *args)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1347, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/core/hooks.py", line 632, in transfer_batch_to_device
return move_data_to_device(batch, device)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/lightning_fabric/utilities/apply_func.py", line 101, in move_data_to_device
return apply_to_collection(batch, dtype=_TransferableDataType, function=batch_to)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/lightning_utilities/core/apply_func.py", line 70, in apply_to_collection
return {k: function(v, *args, **kwargs) for k, v in data.items()}
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/lightning_utilities/core/apply_func.py", line 70, in <dictcomp>
return {k: function(v, *args, **kwargs) for k, v in data.items()}
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/lightning_fabric/utilities/apply_func.py", line 95, in batch_to
data_output = data.to(device, **kwargs)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/dgl/heterograph.py", line 5709, in to
ret._graph = self._graph.copy_to(utils.to_dgl_context(device))
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/dgl/heterograph_index.py", line 255, in copy_to
return _CAPI_DGLHeteroCopyTo(self, ctx.device_type, ctx.device_id)
File "dgl/_ffi/_cython/./function.pxi", line 295, in dgl._ffi._cy3.core.FunctionBase.__call__
File "dgl/_ffi/_cython/./function.pxi", line 227, in dgl._ffi._cy3.core.FuncCall
File "dgl/_ffi/_cython/./function.pxi", line 217, in dgl._ffi._cy3.core.FuncCall3
dgl._ffi.base.DGLError: [16:47:07] /opt/dgl/src/runtime/cuda/cuda_device_api.cc:343: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: CUDA: unspecified launch failure
Stack trace:
[bt] (0) /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/dgl/libdgl.so(+0x8b0b95) [0x7f396a58eb95]
[bt] (1) /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/dgl/libdgl.so(dgl::runtime::CUDADeviceAPI::CopyDataFromTo(void const*, unsigned long, void*, unsigned long, unsigned long, DGLContext, DGLContext, DGLDataType)+0x82) [0x7f396a590ff2]
[bt] (2) /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/dgl/libdgl.so(dgl::runtime::NDArray::CopyFromTo(DGLArray*, DGLArray*)+0x10d) [0x7f396a4074cd]
[bt] (3) /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/dgl/libdgl.so(dgl::runtime::NDArray::CopyTo(DGLContext const&) const+0x103) [0x7f396a443033]
[bt] (4) /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/dgl/libdgl.so(dgl::UnitGraph::CSR::CopyTo(DGLContext const&) const+0x1f0) [0x7f396a5619d0]
[bt] (5) /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/dgl/libdgl.so(dgl::UnitGraph::CopyTo(std::shared_ptr<dgl::BaseHeteroGraph>, DGLContext const&)+0xd1) [0x7f396a550d01]
[bt] (6) /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/dgl/libdgl.so(dgl::HeteroGraph::CopyTo(std::shared_ptr<dgl::BaseHeteroGraph>, DGLContext const&)+0xf6) [0x7f396a44f876]
[bt] (7) /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/dgl/libdgl.so(+0x7802b6) [0x7f396a45e2b6]
[bt] (8) /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/dgl/libdgl.so(DGLFuncCall+0x48) [0x7f396a3ec558]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "main.py", line 122, in <module>
main()
File "main.py", line 112, in main
trainer.fit(lit_model, datamodule=kgdata)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 608, in fit
call._call_and_handle_interrupt(
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 63, in _call_and_handle_interrupt
trainer._teardown()
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1166, in _teardown
self.strategy.teardown()
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 496, in teardown
self.lightning_module.cpu()
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/lightning_fabric/utilities/device_dtype_mixin.py", line 78, in cpu
return super().cpu()
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/nn/modules/module.py", line 967, in cpu
return self._apply(lambda t: t.cpu())
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/nn/modules/module.py", line 833, in _apply
param_applied = fn(param)
File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/nn/modules/module.py", line 967, in
return self._apply(lambda t: t.cpu())
RuntimeError: CUDA error: unspecified launch failure
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/home/test/anaconda3/envs/lrz/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/test/anaconda3/envs/lrz/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/utils/data/_utils/pin_memory.py", line 54, in _pin_memory_loop
    do_one_step()
  File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/utils/data/_utils/pin_memory.py", line 31, in do_one_step
    r = in_queue.get(timeout=MP_STATUS_CHECK_INTERVAL)
  File "/home/test/anaconda3/envs/lrz/lib/python3.8/multiprocessing/queues.py", line 116, in get
    return _ForkingPickler.loads(res)
  File "/home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 355, in rebuild_storage_fd
    fd = df.detach()
  File "/home/test/anaconda3/envs/lrz/lib/python3.8/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/home/test/anaconda3/envs/lrz/lib/python3.8/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/home/test/anaconda3/envs/lrz/lib/python3.8/multiprocessing/connection.py", line 508, in Client
    answer_challenge(c, authkey)
  File "/home/test/anaconda3/envs/lrz/lib/python3.8/multiprocessing/connection.py", line 752, in answer_challenge
    message = connection.recv_bytes(256)         # reject large message
  File "/home/test/anaconda3/envs/lrz/lib/python3.8/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/home/test/anaconda3/envs/lrz/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
    buf = self._recv(4)
  File "/home/test/anaconda3/envs/lrz/lib/python3.8/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer
terminate called after throwing an instance of 'c10::Error'
what(): CUDA error: unspecified launch failure
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Exception raised from c10_cuda_check_implementation at ../c10/cuda/CUDAException.cpp:44 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f3a2d21d617 in /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x7f3a2d1d898d in /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #2: c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, int, bool) + 0x118 (0x7f3a2d2cec38 in /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/lib/libc10_cuda.so)
frame #3: + 0x126123e (0x7f3a2e5fa23e in /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/lib/libtorch_cuda.so)
frame #4: + 0x519806 (0x7f3a97efe806 in /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: + 0x55ca7 (0x7f3a2d202ca7 in /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #6: c10::TensorImpl::~TensorImpl() + 0x1e3 (0x7f3a2d1facb3 in /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #7: c10::TensorImpl::~TensorImpl() + 0x9 (0x7f3a2d1fae49 in /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #8: + 0x7ca2c8 (0x7f3a981af2c8 in /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #9: THPVariable_subclass_dealloc(_object*) + 0x325 (0x7f3a981af675 in /home/test/anaconda3/envs/lrz/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #10: python() [0x4d39ff]
frame #11: python() [0x4e0970]
frame #12: python() [0x4f1828]
frame #13: python() [0x4f1811]
frame #14: python() [0x4f1811]
frame #15: python() [0x4f1811]
frame #16: python() [0x4f1811]
frame #17: python() [0x4f1811]
frame #18: python() [0x4f1811]
frame #19: python() [0x4f1811]
frame #20: python() [0x4f1811]
frame #21: python() [0x4f1811]
frame #22: python() [0x4f1811]
frame #23: python() [0x4f1811]
frame #24: python() [0x4f1489]
frame #25: python() [0x4f983a]
frame #26: python() [0x4f144d]
frame #27: python() [0x4c9310]

frame #33: __libc_start_main + 0xe7 (0x7f3ac90e1c87 in /lib/x86_64-linux-gnu/libc.so.6)
frame #34: python() [0x579d3d]

My system is Ubuntu 16.04 and the GPU is an A100. Thanks!
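The error text above already suggests the first debugging step: re-run with `CUDA_LAUNCH_BLOCKING=1` so kernels launch synchronously and the Python stack trace points at the call that actually failed. Prefixing the assignment scopes the variable to that single process; for the real run the command would be the training invocation (e.g. `python main.py ...`), shown here with a stand-in that just echoes the variable:

```shell
# Prefixing an env assignment applies it only to that one process.
# With CUDA_LAUNCH_BLOCKING=1, CUDA kernel launches are synchronous,
# so errors surface at the true source line instead of a later API call.
CUDA_LAUNCH_BLOCKING=1 python3 -c 'import os; print(os.environ["CUDA_LAUNCH_BLOCKING"])'
```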

Question about the link prediction mode

Is there a mode for relation prediction? The models in the KGEModel files only have modes for head/tail entity prediction; I don't see one for relation prediction.
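The library itself only ships head/tail replacement modes, but the learned embeddings can rank relations the same way. A hypothetical sketch of the idea with a TransE-style score, using NumPy stand-ins rather than NeuralKG's actual API (`predict_relation` and the toy embeddings are illustrative, not library code):

```python
import numpy as np

def predict_relation(h, t, rel_emb):
    """Rank every relation r by the TransE score -||h + r - t||_1.

    h, t:     entity embeddings, shape (d,)
    rel_emb:  relation embedding matrix, shape (num_rel, d)
    Returns relation indices sorted best-first.
    """
    scores = -np.abs(h + rel_emb - t).sum(axis=1)  # broadcasts over relations
    return np.argsort(-scores)

# Toy check: with h = 0 and t equal to relation 1's vector,
# relation 1 satisfies h + r ≈ t exactly and should rank first.
rel_emb = np.array([[1.0, 0.0], [0.0, 1.0]])
h = np.zeros(2)
t = np.array([0.0, 1.0])
print(predict_relation(h, t, rel_emb)[0])  # → 1
```

The same pattern (fix two elements of the triple, broadcast the score over all candidates for the third) carries over to other KGE scoring functions.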

Can't run main.py successfully

Hello, I have successfully installed the environment on Colab and run the demo.py example, but I cannot run main.py or the .sh files under script. Running main.py reports the following error:
[screenshot of the error message]

Also, what command is needed to run the .sh files under script in Colab? I keep failing to run them and want to check whether my command is wrong.
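The scripts invoke `python main.py` with relative paths, so they only work when launched from the repository root; in a Colab cell that would look something like `%cd NeuralKG` followed by `!bash scripts/FreeBase/NNE_FB.sh` (script path taken from another issue in this thread; adjust to the one you need). The working-directory dependence can be reproduced without the repo (the `/tmp/nk_demo` files below are purely illustrative):

```shell
# A script that runs "python3 main.py" finds the file only when the
# current directory contains it; launched from anywhere else, Python
# reports that it can't open 'main.py'.
mkdir -p /tmp/nk_demo
printf 'print("ok")\n' > /tmp/nk_demo/main.py
printf 'python3 main.py\n' > /tmp/nk_demo/run.sh
(cd /tmp/nk_demo && bash run.sh)                                   # prints: ok
(cd / && bash /tmp/nk_demo/run.sh) 2>/dev/null || echo "fails outside the script's directory"
```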

Abnormal CPU usage and disk I/O on Windows

Hello, I am running Grail.sh on the CCKS dataset on Windows. The first training epoch is very fast, but from the second epoch onward the training speed, CPU usage, and disk I/O all look abnormal. What could be the cause? Thanks!


Error when running the demo

The error message is as follows:

This demo is powered by NeuralKG 
Global seed set to 321
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Traceback (most recent call last):
  File "/home/zcy/Code/PyCharm/NeuralKG/demo.py", line 121, in <module>
    main(arg_path='config/TransE_demo_kg.yaml')
  File "/home/zcy/Code/PyCharm/NeuralKG/demo.py", line 99, in main
    trainer.fit(lit_model, datamodule=kgdata)
  File "/home/zcy/Code/PyCharm/NeuralKG/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 740, in fit
    self._call_and_handle_interrupt(
  File "/home/zcy/Code/PyCharm/NeuralKG/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/zcy/Code/PyCharm/NeuralKG/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 777, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/zcy/Code/PyCharm/NeuralKG/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1145, in _run
    self.accelerator.setup(self)
  File "/home/zcy/Code/PyCharm/NeuralKG/venv/lib/python3.8/site-packages/pytorch_lightning/accelerators/gpu.py", line 46, in setup
    return super().setup(trainer)
  File "/home/zcy/Code/PyCharm/NeuralKG/venv/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 93, in setup
    self.setup_optimizers(trainer)
  File "/home/zcy/Code/PyCharm/NeuralKG/venv/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 354, in setup_optimizers
    optimizers, lr_schedulers, optimizer_frequencies = self.training_type_plugin.init_optimizers(
  File "/home/zcy/Code/PyCharm/NeuralKG/venv/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 245, in init_optimizers
    return trainer.init_optimizers(model)
  File "/home/zcy/Code/PyCharm/NeuralKG/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/optimizers.py", line 44, in init_optimizers
    lr_schedulers = self._configure_schedulers(lr_schedulers, monitor, not pl_module.automatic_optimization)
  File "/home/zcy/Code/PyCharm/NeuralKG/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/optimizers.py", line 192, in _configure_schedulers
    raise ValueError(f'The provided lr scheduler "{scheduler}" is invalid')
ValueError: The provided lr scheduler "<torch.optim.lr_scheduler.MultiStepLR object at 0x7fc858e5fb80>" is invalid

Process finished with exit code 1

The environment was set up roughly as follows:

conda create -n neuralkg python=3.8
conda activate neuralkg
pip install torch torchvision torchaudio
pip install  dgl -f https://data.dgl.ai/wheels/cu117/repo.html
pip install  dglgo -f https://data.dgl.ai/wheels-test/repo.html
pip install neuralkg
git clone git@github.com:zjukg/NeuralKG.git
python setup.py install

I am a beginner; is there a problem with my environment setup?
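This looks like a version mismatch rather than a setup mistake: `pip install torch` pulls the newest torch, while the pytorch-lightning version this neuralkg release depends on validates schedulers against the scheduler base class it was written for (note that the installation guide above pins torch==1.9.1). Paraphrased in plain Python, the failing check behaves like the sketch below; the class names are stand-ins, not the real library classes:

```python
# Stand-ins for the scheduler base classes of two library generations.
class OldLRSchedulerBase: ...                   # what older Lightning type-checks against
class MultiStepLR(OldLRSchedulerBase): ...      # scheduler built against the old base
class NewLRSchedulerBase: ...                   # base class after an upstream rename/move
class NewMultiStepLR(NewLRSchedulerBase): ...   # same scheduler from a newer torch

def lightning_accepts(scheduler):
    # Paraphrase of the validation that raises
    # 'The provided lr scheduler "..." is invalid'.
    return isinstance(scheduler, OldLRSchedulerBase)

print(lightning_accepts(MultiStepLR()))     # True: matched versions pass
print(lightning_accepts(NewMultiStepLR()))  # False: mismatched versions are rejected
```

Installing the torch version the installation section pins (torch==1.9.1) alongside the bundled pytorch-lightning, or upgrading both libraries together, should avoid the mismatch.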

Problems on CPU-only Linux and on a Windows 7 GPU machine

On the CPU-only Linux machine, running
sh scripts/FreeBase/NNE_FB.sh
fails with:

Using backend: pytorch
Traceback (most recent call last):
  File "main.py", line 8, in <module>
    from neuralkg.utils import setup_parser
  File "/home/laicx/anaconda3/envs/neuralkg/lib/python3.8/site-packages/neuralkg/__init__.py", line 3, in <module>
    from .lit_model import *
  File "/home/laicx/anaconda3/envs/neuralkg/lib/python3.8/site-packages/neuralkg/lit_model/__init__.py", line 1, in <module>
    from .BaseLitModel import BaseLitModel
  File "/home/laicx/anaconda3/envs/neuralkg/lib/python3.8/site-packages/neuralkg/lit_model/BaseLitModel.py", line 6, in <module>
    from neuralkg import loss
  File "/home/laicx/anaconda3/envs/neuralkg/lib/python3.8/site-packages/neuralkg/loss/__init__.py", line 8, in <module>
    from .IterE_Loss import IterE_Loss
ModuleNotFoundError: No module named 'neuralkg.loss.IterE_Loss'

On the Windows 7 GPU machine, running the following in PyCharm
sh NNE_FB.sh
fails with:

D:\ProgramData\Anaconda3\envs\neuralkg38\python.exe: can't open file 'main.py': [Errno 2] No such file or directory

About wandb usage

Some of the newer models automatically log the training process to wandb, but most of the older models still do not. How should the code be modified so that older models such as DistMult can also use wandb?

lmdb.MemoryError: dataset/CCKS_train_subgraph: Cannot allocate memory

2023-06-08 03:47:27.993739: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-06-08 03:47:28.937943: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
DGL backend not selected or invalid. Assuming PyTorch for now.
Setting the default backend to "pytorch". You can change it in the ~/.dgl/config.json file or export the DGLBACKEND environment variable. Valid options are: pytorch, mxnet, tensorflow (all lowercase)
Global seed set to 321
100% 26874/26874 [00:02<00:00, 13065.71it/s]
100% 1/1 [00:00<00:00, 4744.69it/s]
Traceback (most recent call last):
  File "/content/NeuralKG/main.py", line 188, in <module>
    main()
  File "/content/NeuralKG/main.py", line 30, in main
    gen_subgraph_datasets(args)  # [head, tail, relation]
  File "/content/NeuralKG/./src/neuralkg_ind/utils/tools.py", line 227, in gen_subgraph_datasets
    links2subgraphs(adj_list, graphs, args, max_label_value, testing)
  File "/content/NeuralKG/./src/neuralkg_ind/utils/tools.py", line 451, in links2subgraphs
    env = lmdb.open(params.db_path, map_size=map_size, max_dbs=6)
lmdb.MemoryError: dataset/CCKS_train_subgraph: Cannot allocate memory

I ran CCKS/Grail.sh on Colab, and the machine certainly has enough memory. What is causing this error?
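lmdb's `map_size` reserves virtual address space up front, so `lmdb.open` can raise MemoryError even when plenty of RAM is free, if the requested size exceeds what the sandbox will let one process reserve. A hedged sketch of capping the value that `links2subgraphs` passes to `lmdb.open` (both numbers below are illustrative assumptions, not measured limits):

```python
# map_size is a reservation of virtual address space, not of RAM.
GiB = 1 << 30
estimated = 1024 * GiB     # hypothetical oversized estimate derived from the data
platform_cap = 16 * GiB    # assumed safe upper bound for a Colab-style sandbox
map_size = min(estimated, platform_cap)
print(map_size // GiB)     # → 16
```

With the capped value, the subsequent `lmdb.open(params.db_path, map_size=map_size, max_dbs=6)` call would request a reservation the platform can actually grant.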

Question

Has the maintainer tried the installation method currently on the official website? Following it step by step gives
pkg_resources.DistributionNotFound: The 'python' distribution was not found and is required by the application
Have you run into this problem? Could you provide the exact package versions that work?
