About the Facebook dataset - what do the node features represent, and how were they generated? The paper mentions that the features are extracted from site descriptions. Does this mean they're text features, and if so which text representation or embedding did you use?
I downloaded the datasets (GitHub) from SNAP, but I'm now confused about the features in .json format.
Have they been preprocessed already so that they can be put into use without further processing?
Or do I need to understand what each dimension in the features means?
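For what it's worth, one quick way to inspect the file is to load it and expand it into a binary matrix. The sketch below assumes (this is my guess about the format, not something confirmed by the authors) that the JSON maps each node id to a list of integer feature indices, so every listed index is a feature that is "on" for that node. The small `raw` string is a made-up stand-in for the real file contents.

```python
import json

# Stand-in for the contents of the features .json file (hypothetical values).
# Assumed format: {"node_id": [feature_index, ...], ...}
raw = '{"0": [1, 3], "1": [0], "2": [2, 3]}'

features = json.loads(raw)

# Number of columns = largest feature index seen + 1.
num_features = 1 + max(idx for idxs in features.values() for idx in idxs)

# Order rows by numeric node id so that row i corresponds to node i.
node_ids = sorted(features, key=int)

# Binary indicator matrix: matrix[i][j] == 1 if node i has feature j.
matrix = [[0] * num_features for _ in node_ids]
for row, node in enumerate(node_ids):
    for idx in features[node]:
        matrix[row][idx] = 1

for node, row in zip(node_ids, matrix):
    print(node, row)
```

If that assumption holds, the features are usable as they are, and the individual dimensions only matter if you want to interpret them.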
As I understand it, the adjacency matrix of an undirected graph should be symmetric. However, I find that this is not the case for the GitHub Social Network (http://snap.stanford.edu/data/github-social.html).
I tried to visualize it as follows.
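Independently of any plot, a numeric check may be easier to reason about. One possible explanation, which is only an assumption on my side, is that the edge list stores each undirected edge once, so filling the matrix in one direction gives an asymmetric result. A minimal sketch with a made-up edge list (the helper names `build_adjacency` and `is_symmetric` are hypothetical):

```python
# Hypothetical edge list, each undirected edge written once as (id_1, id_2).
edges = [(0, 1), (0, 2), (1, 3)]
n = 1 + max(max(u, v) for u, v in edges)


def build_adjacency(edge_list, size, symmetric):
    """Dense 0/1 adjacency matrix; mirror each edge when symmetric is True."""
    adj = [[0] * size for _ in range(size)]
    for u, v in edge_list:
        adj[u][v] = 1
        if symmetric:
            adj[v][u] = 1
    return adj


def is_symmetric(adj):
    """True if adj[i][j] == adj[j][i] for every pair (i, j)."""
    size = len(adj)
    return all(adj[i][j] == adj[j][i] for i in range(size) for j in range(size))


one_way = build_adjacency(edges, n, symmetric=False)
both_ways = build_adjacency(edges, n, symmetric=True)

print(is_symmetric(one_way))    # False: edges were only written in one direction
print(is_symmetric(both_ways))  # True: each edge was mirrored
```

If the real file behaves like `one_way`, adding the transpose (or mirroring each edge while loading) should restore symmetry.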
I have one question about the file "DE_target.csv". There are several files like this one in the repository.
There are several columns in this file, including "id", "days", "mature", "view", "partner", and "new_id". I am curious about which column indicates the label of a node, that is, whether a streamer uses explicit language.
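To make the question concrete, here is a minimal sketch of how I would read the labels. It assumes, without confirmation from the authors, that "mature" is the label column and that "new_id" is the node index used elsewhere in the dataset. The rows are made up; only the column names are taken from the file as listed above.

```python
import csv
import io

# Made-up stand-in for DE_target.csv; the values are hypothetical.
sample = io.StringIO(
    "id,days,mature,view,partner,new_id\n"
    "1001,500,True,1200,False,0\n"
    "1002,300,False,80,False,1\n"
    "1003,900,True,50000,True,2\n"
)

rows = list(csv.DictReader(sample))

# Assumption: "mature" holds the binary label, "new_id" the node index.
labels = {int(r["new_id"]): r["mature"] == "True" for r in rows}

print(labels)
```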
Thank you for your excellent work! I would be very grateful if you could answer my question: what is the meaning of the numbers in the node feature JSON file, for example MUSAE/input/features/git.json? I guess that each vector in the JSON corresponds to a node, and you mention in the manuscript that "Node features are location, starred repositories, employer and e-mail address". How can I turn this information into the numbers in the JSON file?
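To illustrate what I have in mind, one generic way to get from categorical attributes to integer lists is to assign every distinct (field, value) pair an index from a vocabulary. This is only a sketch of a common encoding, not necessarily how the authors produced git.json, and the attribute values below are invented.

```python
import json

# Invented raw attributes per node (hypothetical data).
raw_attributes = {
    0: {"location": "Berlin", "employer": "ACME"},
    1: {"location": "Paris", "employer": "ACME"},
    2: {"location": "Berlin", "employer": "Initech"},
}

vocabulary = {}  # maps "field=value" tokens to integer feature indices
encoded = {}     # maps node id to the sorted list of its feature indices

for node, attrs in raw_attributes.items():
    indices = []
    for field, value in attrs.items():
        token = f"{field}={value}"
        # Give each previously unseen token the next free index.
        if token not in vocabulary:
            vocabulary[token] = len(vocabulary)
        indices.append(vocabulary[token])
    encoded[node] = sorted(indices)

# json.dumps converts the integer node ids to string keys.
print(json.dumps(encoded))
```

Under this scheme two nodes share a number exactly when they share an attribute value, which is the behaviour I am trying to confirm for the real file.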