adamf-mat's Introduction

Unleashing the Power of Imbalanced Modality Information for Multi-modal Knowledge Graph Completion

Multi-modal knowledge graph completion (MMKGC) aims to predict missing triples in multi-modal knowledge graphs by incorporating structural, visual, and textual information of entities into the discriminant models. The information from different modalities works together to measure triple plausibility. Existing MMKGC methods overlook the imbalance of modality information among entities, resulting in inadequate modal fusion and inefficient utilization of the raw modality information. To address these problems, we propose Adaptive Multi-modal Fusion and Modality Adversarial Training (AdaMF-MAT) to unleash the power of imbalanced modality information for MMKGC. AdaMF-MAT achieves multi-modal fusion with adaptive modality weights and further generates adversarial samples through modality-adversarial training to enhance the imbalanced modality information. Our approach is a co-design of the MMKGC model and training strategy that outperforms 19 recent MMKGC methods and achieves new state-of-the-art results on three public MMKGC benchmarks.
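To make the idea of "fusion with adaptive modality weights" concrete, here is a minimal sketch in plain Python. It is not the implementation from this repository: the function names (`softmax`, `adaptive_fuse`), the use of one scalar score per modality, and the softmax normalization are all assumptions chosen for illustration, and the paper's actual formulation may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(modal_embeddings, modal_scores):
    """Fuse the modality embeddings of one entity into a single vector.

    modal_embeddings: dict mapping a modality name to a list of floats
                      (all lists must have the same length).
    modal_scores:     dict mapping a modality name to an unnormalized score.
                      ASSUMPTION: in a real model these scores would be
                      learned per entity; here they are plain inputs.
    Returns (fused_vector, weights_by_modality).
    """
    names = list(modal_embeddings)
    weights = softmax([modal_scores[n] for n in names])
    dim = len(modal_embeddings[names[0]])
    fused = [0.0] * dim
    for name, w in zip(names, weights):
        for i, value in enumerate(modal_embeddings[name]):
            fused[i] += w * value
    return fused, dict(zip(names, weights))
```

With equal scores, the fused vector is simply the mean of the modality embeddings; raising one modality's score shifts the result toward that modality, which is the behaviour an adaptive weighting is meant to provide for entities whose modalities are unevenly informative.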

🌈 Model Architecture

🔔 News

2024-04 We released a preprint of our new paper, MyGO: Discrete Modality Information as Fine-Grained Tokens for Multi-modal Knowledge Graph Completion.
2024-03 We released the repo for our paper NativE: Multi-modal Knowledge Graph Completion in the Wild, SIGIR 2024.
2024-02 Our paper has been accepted by LREC-COLING 2024.

💻 Data preparation

We use the MMKG datasets proposed in MMRNS. You can refer to this repo to download the multi-modal embeddings of the MMKGs and put them in embeddings/. We also provide a processed version of the multi-modal embeddings, which you can download from Google Drive.
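Before starting a long training run, it can help to confirm that the embedding files are where the code expects them. The sketch below is an illustration only: the helper `missing_embeddings` is hypothetical, and the file naming pattern `<dataset>-visual.pth` / `<dataset>-textual.pth` is an assumption, so adjust it to match the files you actually downloaded.

```python
from pathlib import Path


def missing_embeddings(dataset, emb_dir="embeddings", suffixes=("visual", "textual")):
    """Return the expected embedding files that are not present.

    ASSUMPTION: files are named "<dataset>-<suffix>.pth" inside emb_dir.
    This naming is a guess for illustration, not taken from the repo code.
    """
    root = Path(emb_dir)
    expected = [root / f"{dataset}-{suffix}.pth" for suffix in suffixes]
    return [str(path) for path in expected if not path.is_file()]


if __name__ == "__main__":
    missing = missing_embeddings("DB15K")
    if missing:
        print("Missing embedding files:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected embedding files were found.")
```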

🚀 Training and Inference

You can use the shell scripts in scripts/ to run the experiments. For example, the following script runs an experiment on DB15K:

DATA=DB15K
EMB_DIM=250
NUM_BATCH=1024
MARGIN=12
LR=1e-4
LRG=1e-4
NEG_NUM=128
EPOCH=1000

CUDA_VISIBLE_DEVICES=0 nohup python run_adamf_mat.py -dataset=$DATA \
  -batch_size=$NUM_BATCH \
  -margin=$MARGIN \
  -epoch=$EPOCH \
  -dim=$EMB_DIM \
  -lrg=$LRG \
  -mu=0 \
  -save=./checkpoint/$DATA-$NUM_BATCH-$EMB_DIM-$NEG_NUM-$MARGIN-$LR-$EPOCH \
  -neg_num=$NEG_NUM \
  -learning_rate=$LR > $DATA-$EMB_DIM-$NUM_BATCH-$NEG_NUM-$MARGIN-$EPOCH.txt &
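When sweeping over several datasets or hyperparameters, it can be convenient to assemble the same command from Python. The sketch below is not part of the repository: `build_command` is a hypothetical helper that only mirrors the flags and the checkpoint naming pattern used in the script above. Learning rates are passed as strings so that values such as `1e-4` appear in the checkpoint path exactly as written.

```python
def build_command(dataset="DB15K", emb_dim=250, batch_size=1024, margin=12,
                  lr="1e-4", lrg="1e-4", neg_num=128, epoch=1000, mu=0):
    """Build the argument list for run_adamf_mat.py.

    Only flags that appear in the shell script above are used.
    """
    # Same naming pattern as the -save argument in the shell script.
    save = (f"./checkpoint/{dataset}-{batch_size}-{emb_dim}-"
            f"{neg_num}-{margin}-{lr}-{epoch}")
    return [
        "python", "run_adamf_mat.py",
        f"-dataset={dataset}",
        f"-batch_size={batch_size}",
        f"-margin={margin}",
        f"-epoch={epoch}",
        f"-dim={emb_dim}",
        f"-lrg={lrg}",
        f"-mu={mu}",
        f"-save={save}",
        f"-neg_num={neg_num}",
        f"-learning_rate={lr}",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(build_command()))
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root, with `CUDA_VISIBLE_DEVICES` set in the environment to choose the GPU.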

🤝 Cite:

Please consider citing this paper if you use the code from our work. Thanks a lot :)


@misc{zhang2024unleashing,
      title={Unleashing the Power of Imbalanced Modality Information for Multi-modal Knowledge Graph Completion}, 
      author={Yichi Zhang and Zhuo Chen and Lei Liang and Huajun Chen and Wen Zhang},
      year={2024},
      eprint={2402.15444},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}

adamf-mat's Issues

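Hello, I have some questions about configuration

When I tried to run the code, my computer reported an error saying that a file is not a valid Win32 application. The file in question is base.so, and my machine runs 64-bit Windows. From searching online, I learned that base.so is a Linux shared library. I then tried running the code with 32-bit Python, but that produced a "Microsoft Visual C++ Redistributable is not installed" error. Did you run the code on Linux? Did you use a 32-bit system? Is there a Windows equivalent of base.so?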

[ASAP] About the raw data used in the paper's code

Hello, I have some questions about the raw data used in this paper:

Did you use the embeddings that were provided beforehand, or did you generate them from the raw data?
If you used raw data, where did you retrieve it, and did you check it before generating the embeddings? I checked and found that the number of entities in the dataset you referenced does not match the reported statistics.
I tried to scrape data from the links you provided in /benchmarks, but the MKG-W data is broken, and MKG-Y doesn't contain links to scrape. Can you check this?

Thanks! Hope to receive your answer!

Would you mind providing the -textual.pth and -visual.pth files

I see that MMRNS only provides the *.h5 embedding files. Would you mind providing the *.pth files you used in the paper? Thanks.

error

When I was training, I got the following error:
Namespace(adv_num=1, adv_temp=2.0, batch_size=1024, con_temp=0, dataset='FB15K', dim=250, epoch=1000, img_dim=4096, lamda=0, learning_rate=0.0001, lrg=0.0001, margin=12.0, missing_rate=0.8, mu=0.0, neg_num=128, postfix='', save='./checkpoint/DB15K-1024-250-128-12-1e-4-1000', seed=42, visual='random')
Input Files Path : ./benchmarks/FB15K/
The toolkit is importing datasets.
Segmentation fault (core dumped)

zjukg / adamf-mat