datasets's Introduction

Benedek A. Rozemberczki/ Homepage / Twitter / GitHub / Google Scholar

Welcome stranger

  • ⏰ Currently working on machine learning for drug discovery.
  • 🤖 I would love to collaborate on the machine learning libraries ChemicalX and RexMex.

datasets's Issues

deepwalk

Hello, can a labeled dataset be used as input for DeepWalk? And how can it be converted into a .mat file?

Features in Twitch dataset

Hi,
I have a question about the features.json file. Could you tell me what the features represent?

Thanks

[GitHub Web-ML] How node features are created

Hi,
First of all, I would like to genuinely thank you for your incredibly clear and detailed guidance. To be honest, I am very new to the field of GNNs and only started delving into it a few weeks ago, so I still have a lot of questions about how they operate.

In the data description section, node features are described as being extracted based on the location, starred repositories, employer, and email address. Therefore, I think features should be text or something similar. However, in the musae_git_features.json file, the features are numerical vectors. I also looked into various other datasets, and node features have a similar form. I genuinely do not understand how to process these features from raw data into numerical vectors that can serve as input for GNNs.

Thank you so much!

Question about the LastfmAsia and DeezerEurope datasets

I have a question about the LastfmAsia and DeezerEurope datasets from your CIKM 2020 paper, which I found on SNAP. These datasets are provided with node features which are “extracted based on the artists liked by the users”. Does this mean that each number in the vector associated with a node corresponds to the id of an artist that the user liked? Or is the vector a more abstract embedding? I am referring to the files lastfm_asia_features.json and deezer_europe_features.json.

Thanks in advance!

Twitch Social Network dataset: Target

The target files in the Twitch social network dataset contain the following columns: id, days, mature, views, partner, new_id. Could you please provide some information about these values, and also point out which column is used for the node classification task?

MUSAE-Twitch dataset features

Hello,
Thank you for your work!

For the features.json file in the Twitch dataset, is there a reference for what the feature indices in the values list specifically represent? For example, it is mentioned that the features are extracted from games played. Do some of the values in the list represent ids of the games played by that user? Is there a way to get information on what each value corresponds to?

Features in Github Web-ML not of same length

Hi,

The features in git_web_ml/git_feature.json are not all the same length. Should the shorter ones be padded with 0s at the end? Or is there a feature matrix of shape (node_num, feature_num), or a sparse matrix with node_id, feature_id and value?

Thank you!
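
One possible reading, which the thread above does not confirm, is that each list holds the indices of the features that are active for that node, so lists differ in length simply because nodes have different numbers of active features. Under that assumption, padding with 0s would not be appropriate, and the lists can be expanded into a (node_num, feature_num) multi-hot matrix instead. A minimal sketch follows; the helper name `to_multi_hot` and the toy JSON are hypothetical and are not taken from the dataset.

```python
import json


def to_multi_hot(features):
    """Expand {node_id: [feature_index, ...]} into a dense 0/1 matrix.

    Assumes (not confirmed by the dataset author) that every value in a
    list is the index of an active binary feature for that node.
    """
    # Sort node ids numerically so that row i corresponds to the i-th node.
    node_ids = sorted(features, key=int)
    # The matrix width is the largest feature index seen, plus one.
    num_features = 1 + max(
        (idx for indices in features.values() for idx in indices),
        default=-1,
    )
    matrix = []
    for node in node_ids:
        row = [0] * num_features
        for idx in features[node]:
            row[idx] = 1
        matrix.append(row)
    return node_ids, matrix


# Toy stand-in for the JSON file: three nodes with ragged feature lists.
raw = json.loads('{"0": [1, 3], "1": [0], "2": [2, 3, 4]}')
node_ids, matrix = to_multi_hot(raw)
```

For large graphs a dense matrix may not fit in memory; the same (row, column) pairs could be passed to a sparse-matrix constructor instead.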
