Hi,
First of all, I would like to genuinely thank you for your incredibly clear and detailed guidance. To be honest, I am very new to the field of GNNs and started delving into it only a few weeks ago, so I still have a lot of questions about how they operate.
In the data description section, node features are described as being extracted from each user's location, starred repositories, employer, and email address, so I expected the features to be text or something similar. However, in the musae_git_features.json file, the features are numerical vectors. I also looked at various other datasets, and their node features have a similar form. I genuinely do not understand how such raw data is processed into numerical vectors that can serve as input for GNNs.
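To make the question concrete, here is a minimal sketch of the kind of preprocessing I imagine. It is only my guess and not necessarily how this dataset was built: the toy records, the field names, and the helper `tokens_for` are all hypothetical. The idea is to turn every distinct attribute value into a token, give each token an integer index, and then represent each node either as a list of active indices or as a multi-hot vector.

```python
# Hypothetical raw attributes for three users. This toy data is invented
# for illustration and is NOT taken from the actual dataset.
raw_nodes = {
    0: {"location": "Berlin", "employer": "Acme", "starred": ["numpy", "torch"]},
    1: {"location": "Paris", "employer": "Acme", "starred": ["torch"]},
    2: {"location": "Berlin", "employer": None, "starred": []},
}


def tokens_for(attrs):
    """Turn one user's raw attributes into a list of string tokens (hypothetical helper)."""
    tokens = []
    for field in ("location", "employer"):
        if attrs.get(field):  # skip missing or empty values
            tokens.append(f"{field}={attrs[field]}")
    tokens.extend(f"starred={repo}" for repo in attrs.get("starred", []))
    return tokens


# Vocabulary: every distinct token gets a column index (sorted for determinism).
all_tokens = sorted({t for attrs in raw_nodes.values() for t in tokens_for(attrs)})
vocab = {token: index for index, token in enumerate(all_tokens)}

# Sparse form: each node maps to the indices of the features it has.
sparse_features = {
    node: sorted(vocab[t] for t in tokens_for(attrs))
    for node, attrs in raw_nodes.items()
}

# Dense form: one multi-hot row per node, one column per vocabulary entry.
feature_matrix = [
    [1 if col in sparse_features[node] else 0 for col in range(len(vocab))]
    for node in sorted(raw_nodes)
]

print(vocab)
print(sparse_features)
print(feature_matrix)
```

If this is roughly right, then the numbers in the feature file would simply identify which attribute values a user has, rather than carrying any numeric meaning of their own. Is that the correct way to think about it?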
Thank you so much!