eva's Introduction

EVA: Entity Visual Alignment

Entity Alignment is the task of linking entities with the same real-world identity from different knowledge graphs. EVA is a set of algorithms that leverage images in knowledge graphs for facilitating Entity Alignment.

This repo holds code for reproducing the models presented in our AAAI 2021 paper: Visual Pivoting for (Unsupervised) Entity Alignment [arxiv][aaai].

Data

Download the data used in the experiments (DBP15k and DWY15k, along with precomputed features) from dropbox or BaiduDisk (code: dhya); it takes 1.3GB after unzipping. Place it under data/.

Original sources of DBP15k and DWY15k:

[optional] The raw images of the entities that appear in DBP15k and DWY15k can be downloaded from dropbox (108GB after unzipping). All images are saved as title-image pairs in dictionaries and can be accessed with the following code:

import pickle

# Load the dictionary of title-image pairs for the Chinese side of DBP15k.
with open("eva_image_resources/dbp15k/zh_dbp15k_link_img_dict_full.pkl", "rb") as f:
    zh_images = pickle.load(f)
print(zh_images["http://zh.dbpedia.org/resource/香港有線電視"].size)

Dataset Descriptions

We use the DWY15k dataset as an example (files not used in experiments are omitted).

data/DWY_data/
├── dwy15k_dense_sf_vec.npy: surface form vectors encoded by fastText (dense split)
├── dwy15k_norm_sf_vec.npy: surface form vectors encoded by fastText (normal split)
├── dbp_wd_15k_V1/: normal split
│   ├── mapping/
│   │   ├── 0_3/: the third split (used across all experiments)
│   │   │   ├── ent_ids_1: mapping between entity names and ids for graph 1
│   │   │   ├── ent_ids_2: mapping between entity names and ids for graph 2
│   │   │   ├── rel_ids_1: mapping between relation names and ids for graph 1
│   │   │   ├── rel_ids_2: mapping between relation names and ids for graph 2
│   │   │   ├── ill_ent_ids: inter-lingual links (specified by ids)
│   │   │   ├── triples_1: a list of tuples in the form of (head, relation, tail) for graph 1 (specified by ids)
│   │   │   ├── triples_2: a list of tuples in the form of (head, relation, tail) for graph 2 (specified by ids)
│   │   │   ├── ...
│   │   ├── ...
│   ├── ...
├── dbp_wd_15k_V2/: dense split
│   ├── ...
data/pkls/
├── dbpedia_wikidata_15k_norm_GA_id_img_feature_dict.pkl: mapping from entity names to image features for DWY15k (normal)
│   ├── ...
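
The files in a split directory can be read with a few lines of Python. The sketch below assumes that each file is plain text with one whitespace- or tab-separated record per line (an `id<TAB>name` pair in ent_ids_*, a pair of ids in ill_ent_ids, and a `head relation tail` id triple in triples_*). That layout is an assumption based on how DBP15k-style data is commonly distributed, so check it against the downloaded files. The helper names `read_id_map`, `read_int_tuples` and `load_split` are hypothetical and are not part of the EVA code.

```python
from pathlib import Path


def read_id_map(path):
    """Read an ent_ids_* or rel_ids_* file into a {id: name} dict.

    Assumes one `id<TAB>name` record per line (unverified assumption).
    """
    id_to_name = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            idx, name = line.split("\t", 1)
            id_to_name[int(idx)] = name
    return id_to_name


def read_int_tuples(path):
    """Read ill_ent_ids or triples_* into a list of integer tuples.

    Assumes every field on a line is an integer id (unverified assumption).
    """
    rows = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            fields = line.split()
            if fields:
                rows.append(tuple(int(x) for x in fields))
    return rows


def load_split(split_dir):
    """Load the entity maps, seed links and triples of one split directory."""
    split_dir = Path(split_dir)
    return {
        "ent_ids_1": read_id_map(split_dir / "ent_ids_1"),
        "ent_ids_2": read_id_map(split_dir / "ent_ids_2"),
        "ill_ent_ids": read_int_tuples(split_dir / "ill_ent_ids"),
        "triples_1": read_int_tuples(split_dir / "triples_1"),
        "triples_2": read_int_tuples(split_dir / "triples_2"),
    }
```

With the directory layout shown above, `load_split("data/DWY_data/dbp_wd_15k_V1/mapping/0_3")` would return the contents of the normal split.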

Environment

The code has been tested with Python 3.7 and torch 1.7.0.

Use EVA

Run the full model on DBP15k:

./run_dbp15k.sh 0 2020 fr_en

where 0 specifies the GPU device, 2020 is a random seed and fr_en sets the language pair.

Similarly, you can run the full model on DWY15k:

./run_dwy15k.sh 0 2020 1

where the first two arguments are the same as before, and the third specifies whether to use the normal (1) or dense (2) split.

To run without iterative learning:

./run_dbp15k_no_il.sh 0 2020 fr_en
./run_dwy15k_no_il.sh 0 2020 1

To run the unsupervised setting on DBP15k:

./run_dbp15k_unsup.sh 0 2020 fr_en
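
To repeat a run over several seeds, the commands above can be generated from Python. This is a minimal sketch that assumes the scripts take exactly the three positional arguments shown above (GPU id, seed, then language pair or split id). The helper names `build_command` and `sweep` are hypothetical, and any seed other than 2020 is only an illustrative value.

```python
import subprocess


def build_command(script, gpu, seed, target):
    """Return the argument list for one run script.

    `target` is a language pair such as "fr_en" for the DBP15k scripts,
    or the split id (1 = normal, 2 = dense) for the DWY15k scripts.
    """
    return [f"./{script}", str(gpu), str(seed), str(target)]


def sweep(script, gpu, seeds, targets, dry_run=True):
    """Build one command per (target, seed) pair; run them unless dry_run."""
    commands = [build_command(script, gpu, s, t) for t in targets for s in seeds]
    for cmd in commands:
        if dry_run:
            # Only show what would be executed.
            print(" ".join(cmd))
        else:
            # Stop at the first failing run.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `sweep("run_dbp15k.sh", 0, [2020, 2021], ["fr_en"])` prints the two commands without launching anything; pass `dry_run=False` from the repository root to execute them one after another.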

Acknowledgement

Our code is adapted from KECG. We thank the authors for open-sourcing KECG.

Citation

@inproceedings{liu2021visual,
  title={Visual Pivoting for (Unsupervised) Entity Alignment},
  author={Liu, Fangyu and Chen, Muhao and Roth, Dan and Collier, Nigel},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={35},
  number={5},
  pages={4257--4266},
  year={2021}
}

License

EVA is MIT licensed. See the LICENSE file for details.

eva's Issues

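Question about iterative learning

Hello, I introduced the iterative learning method described in your paper into my own model, but the results are worse than without iterative learning. In theory, iterative learning should improve model performance, yet my experimental results are disappointing, which confuses me. Could this be caused by my model overfitting early in training? I cannot think of any other cause for now. I would greatly appreciate your guidance!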

Dataset question

Hello, why does the mapping in the DBP15k dataset differ from the one in the JAPE paper?

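Question about the experimental results versus the results in the paper

Hello, when reproducing your code I found that the results are reported in two directions, "l2r" and "r2l", whereas the paper reports only one direction, e.g. DBP15K zh-en. Are the results in the paper obtained by averaging "l2r" and "r2l"?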

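Question about evaluation metrics

Hello, after reading your paper I became very interested in your research. I have one question, though. Your experimental setup uses Hits@k and MRR to evaluate model performance, which are also the mainstream metrics for the entity alignment task. However, some entity alignment models are evaluated with Precision, Recall and F1. What is the difference between these two evaluation schemes, and is there a conversion relationship between them, for example between Hits@1 and Precision?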
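
For reference, the two ranking metrics named in this question can be computed from a similarity matrix as sketched below. This illustrates the standard definitions and is not taken from the EVA code; it assumes that `sim[i][j]` scores source entity i against target entity j and that the gold counterpart of source i is target i. The function names are hypothetical.

```python
def gold_ranks(sim):
    """Rank (1 = best) of the gold target for every source entity.

    Assumes the gold counterpart of source i is target i, so the gold
    score of row i sits on the diagonal.
    """
    ranks = []
    for i, row in enumerate(sim):
        gold = row[i]
        # Rank = 1 + number of candidates scored strictly higher than gold.
        ranks.append(1 + sum(1 for score in row if score > gold))
    return ranks


def hits_at_k(ranks, k):
    """Fraction of source entities whose gold target is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1 / rank over all source entities."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def transpose(sim):
    """Swap source and target roles to score the reverse direction."""
    return [list(col) for col in zip(*sim)]
```

Under these definitions, when a system returns exactly one prediction for every test entity and each test entity has exactly one gold counterpart, Hits@1 has the same value as precision and as recall. Passing `transpose(sim)` to `gold_ranks` evaluates the opposite direction of the same matrix.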

Dataset

Hello, could you provide the raw images dataset? The link you gave is no longer accessible. Looking forward to your reply!
