rumordetection's Introduction

This repository contains the source code released for the paper:

Chunyuan Yuan, Qianwen Ma, Wei Zhou, Jizhong Han, Songlin Hu. Jointly embedding the local and global relations of heterogeneous graph for rumor detection. In 19th IEEE International Conference on Data Mining, IEEE ICDM 2019.

Dependencies:

Gensim==3.7.2

Jieba==0.39

Scikit-learn==0.21.2

Pytorch==1.4.0

Datasets

The main directory contains one directory for the Weibo dataset and one for each of the two Twitter datasets, twitter15 and twitter16. Each directory contains the following files (shown here for twitter15):

  • twitter15.train, twitter15.dev, and twitter15.test files: These files provide the training, development, and test samples in the format: 'source tweet ID \t source tweet content \t label'

  • twitter15_graph.txt file: This file provides the content of each source post's tree in the format: 'source tweet ID \t userID1:weight1 userID2:weight2 ...'
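The two line formats above can be read with a few lines of Python. This is only a minimal sketch based on the format strings given here, not the loader used in run.py. The function names and the sample values in the usage note are hypothetical.

```python
from typing import Dict, Tuple


def parse_sample_line(line: str) -> Tuple[str, str, str]:
    """Split one line of a .train/.dev/.test file into (tweet_id, content, label).

    Assumes the tab-separated layout described above. The first and last
    fields are split off separately so a stray tab in the content is kept.
    """
    tweet_id, rest = line.rstrip("\n").split("\t", 1)
    content, label = rest.rsplit("\t", 1)
    return tweet_id, content, label


def parse_graph_line(line: str) -> Tuple[str, Dict[str, float]]:
    """Split one line of a *_graph.txt file into (tweet_id, {user_id: weight}).

    Assumes space-separated 'userID:weight' pairs after the first tab.
    """
    tweet_id, neighbours = line.rstrip("\n").split("\t", 1)
    weights: Dict[str, float] = {}
    for pair in neighbours.split():
        user_id, weight = pair.rsplit(":", 1)
        weights[user_id] = float(weight)
    return tweet_id, weights
```

For example, parse_graph_line("1001\tu1:1.0 u2:0.5") would yield the ID "1001" and a dictionary mapping "u1" and "u2" to their weights. These IDs and weights are made up for illustration.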

These datasets were preprocessed to meet our requirements. The original datasets are available at https://www.dropbox.com/s/7ewzdrbelpmrnxu/rumdetect2017.zip?dl=0 (Twitter) and http://alt.qcri.org/~wgao/data/rumdect.zip (Weibo).

If you want to preprocess the datasets yourself, you can use the word2vec embeddings used in our work. The pretrained word2vec embeddings are available at https://drive.google.com/drive/folders/1IMOJCyolpYtoflEqQsj3jn5BYnaRhsiY?usp=sharing.

Reproduce the experimental results:

  1. Create an empty directory: checkpoint/
  2. Run the script run.py
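The first step can also be done from Python. This is a minimal sketch, assuming the checkpoint/ directory is expected in the directory from which run.py is started. The helper name ensure_checkpoint_dir is hypothetical and is not part of this repository.

```python
from pathlib import Path


def ensure_checkpoint_dir(root: str = ".") -> Path:
    """Create <root>/checkpoint if it does not exist and return its path."""
    checkpoint = Path(root) / "checkpoint"
    # exist_ok=True makes the call safe to repeat on later runs.
    checkpoint.mkdir(parents=True, exist_ok=True)
    return checkpoint
```

Once the directory exists, run.py can be started from the same location, for example with: python run.py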

Citation

If you find this code useful in your research, please cite our paper:

@inproceedings{rumor_yuan_2019,
  title={Jointly embedding the local and global relations of heterogeneous graph for rumor detection},
  author={Yuan, Chunyuan and Ma, Qianwen and Zhou, Wei and Han, Jizhong and Hu, Songlin},
  booktitle={The 19th IEEE International Conference on Data Mining},
  year={2019},
  organization={IEEE}
}

rumordetection's People

Contributors

chunyuany


rumordetection's Issues

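How is the cutoff time for early detection set?

I am new to rumor detection and am confused about how the cutoff time for early detection is set. The earliest paper I can find that describes the concept is "Enquiring Minds: Early Detection of Rumors in Social Media from Enquiry Posts" (2015), but many later papers that mention early-detection experiments do not explain in detail how those experiments were run.
Any advice would be appreciated.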

Only Text

Hello, how can I run the Weibo dataset using only the text content? Could you share the model code that does not require the graph?

edge between users or edge between tweets

Hi chunyuan, the paper says, "we connect the user nodes if they participate in common microblogs and link the nodes of source tweets by their common users". When I check the Weibo dataset, the file 'weibo_graph.txt' does not contain links between two users or between two source tweets. It only contains 4664 records between tweets and users. The same is true of twitter15_graph.txt and twitter16_graph.txt.
So, can GLAN still work without links between two users or between two source tweets?

about reproducing the paper

Hello, I encountered a problem when reproducing your paper. You mentioned that the pytorch version used is 1.1.0, but the torch_geometric used in your code does not support torch 1.1.0, so where is the problem?

paper equations vs code

Hi,

I have read your paper. It says that the conv features are first derived from the w2v embeddings, that multi-head attention is then applied over them to arrive at the final message embeddings, and that GRE is then applied to obtain another representation. In your code, however, the conv features are fed directly to the classifier.
Please clarify:

    conv_feature = torch.cat(conv_block, dim=1)
    features = self.dropout(conv_feature)

    a1 = self.relu(self.fc1(features))
    d1 = self.dropout(a1)

    output = self.fc2(d1)
    return output

where fc2 is defined as: self.fc2 = nn.Linear(300, config['num_classes'])

I am writing a survey paper that will include your results. Please verify the code, or share updated code that matches the paper.

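How weibo_graph.txt is created

Hello, in the weibo_graph.txt file, how are the weights after each user ID computed? Also, not every user ID is listed in the graph.txt file. What criterion did you use to select them?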

Data preprocessing question

How can I obtain the weibo_graph.txt file in your dataset? And how can I train on my own dataset?

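A question about running the code

I am new to rumor detection. My environment is set up, but running run.py reports the error [Errno 2] No such file or directory: 'checkpoint/weights.best.weibo.glan'. Have you encountered this error? Shouldn't this file be generated automatically? Thanks.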

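Does your data include replies?

I read your paper, and it does use replies, but the provided data and code do not seem to contain any reply content. JingMa's Twitter replies appear to have been released only as TF-IDF-processed content.
Have I misunderstood something? I am not sure whether I have described the problem clearly. I would appreciate an answer. Thanks.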

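Questions about the paper and the code

1. The paper mentions "build the connection inside between source microblog and retweet", but the code does not contain this step.
code:
X_text = self.mh_attention(X_text, X_text, X_text)
All of the arguments passed in are the text of the source posts.
2. The paper mentions using user features, but the data does not seem to contain any. How are the user features used?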

Some problems in the paper

  1. How is the value r in each target:r pair in *graph.txt obtained? Is it a value normalized by time?
  2. In the paper, you say "To build the connection inside between source microblog and retweet, we first use multi-head attention to refine the representation of every retweet", but I could not find the construction of this relationship in the code. The code only feeds the source tweet directly into the multi-head attention. Can I understand this as modeling only the semantic relationships within the tweet?
  3. In Part C of the paper, you describe introducing user information u, but I could not find anything related to it in the source code. Can you provide details?
