
comp90051-20-project-1's Introduction

COMP90051-20-Project-1

Team Info

Team Num: 65

Team Name: Time Is Money

Teammates:

  • Zhizhang LIN
  • Xiaowen JIN
  • Wei LI

Data Set

Number of UserIDs: 20,000
Number of Followees: 4,867,136
Number of Edges/Links: 24,004,361

Kaggle Result

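| Individual Feature | Type 1  | Type 2  | Type 3  |
|--------------------|---------|---------|---------|
| Jaccard            | 0.73706 | 0.78049 | 0.87331 |
| Cosine             | 0.79743 | 0.69522 | 0.90504 |
| Common Neighbors   | 0.73067 | 0.68997 | 0.62689 |
| Adar               | 0.80140 | -       | 0.64677 |
| KNN1               | 0.48147 |         |         |
| KNN2               | 0.43365 |         |         |
| KNN3               | 0.43693 |         |         |
| KNN4               | 0.43676 |         |         |

| Model | AUC     |
|-------|---------|
| RF    | 0.85922 |
| LR    | 0.79229 |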
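
This README does not say how the individual features were computed. As a rough illustration only, the sketch below uses the standard textbook definitions of Jaccard, cosine (Salton), common neighbors, and Adamic-Adar, applied to sets of followees. The toy graph, the `followees` dictionary, and all function names are hypothetical and not taken from the project code, and the Type 1/2/3 variants in the table are not modeled because they are not defined here.

```python
import math

# Hypothetical toy follower graph: user -> set of accounts that user follows.
followees = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": set(),
}


def common_neighbors(x, y):
    """Number of accounts present in both neighbor sets."""
    return len(x & y)


def jaccard(x, y):
    """Size of the intersection divided by size of the union (0 if both empty)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def cosine(x, y):
    """Salton cosine similarity for binary neighbor sets."""
    if not x or not y:
        return 0.0
    return len(x & y) / math.sqrt(len(x) * len(y))


def adamic_adar(x, y, graph):
    """Sum of 1 / log(degree) over shared neighbors.

    Degree here is the size of the shared neighbor's own followee set
    (an assumption). Neighbors with degree <= 1 are skipped, since
    log(1) = 0 would cause a division by zero.
    """
    score = 0.0
    for z in x & y:
        degree = len(graph.get(z, ()))
        if degree > 1:
            score += 1.0 / math.log(degree)
    return score


def pair_features(u, v, graph):
    """Feature dictionary for a candidate edge u -> v."""
    x, y = graph.get(u, set()), graph.get(v, set())
    return {
        "common_neighbors": common_neighbors(x, y),
        "jaccard": jaccard(x, y),
        "cosine": cosine(x, y),
        "adamic_adar": adamic_adar(x, y, graph),
    }


features = pair_features("a", "b", followees)
print(features)
```

Each score can be used on its own to rank candidate edges, or the whole dictionary can be fed as a feature vector to a classifier such as the RF and LR models listed above.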

Score and Comments

Kaggle competition: 15.66/16

Final report: 11.2/14

Total: 26.8/30

Comment in critical analysis (7.2/9):

Good work! The report covers the key aspects: sampling, feature generation, learning and model selection. While the features considered were relatively simple, they performed surprisingly well - especially "cosine similarity". A couple of classifiers were considered, including a non-linear one. Tuning was done via cross-validation to avoid overfitting, however it's unclear how the final model was selected (test AUCs are reported, so may be overfitting). The sampling was naive - missing edges were treated as fake and there was no attempt to match the test set. It was great that you mentioned future directions to explore - including node embeddings.

Comment in clarity and structure (4/5):

The report was a pleasure to read. It was well-structured and clear. I appreciated the use of tables to summarise the results and features. Some space could have been put to better use expanding on insights/motivations etc., rather than defining well-known concepts (e.g. random forest, logistic regression, ROC-AUC). Good referencing.

comp90051-20-project-1's People

Contributors

jxw1998, lowspace, viiceslin


comp90051-20-project-1's Issues

How to write a good conclusion part?

Example:

Despite the large number of algorithms and ways to permute such algorithms, link prediction in large graphs is still a very challenging problem. Success in the IJCNN Social Network Challenge required a balance between feasibility, simplicity, and accuracy. There were many promising methods which were not tested due to hardware or time constraints. Our link prediction framework achieved high sensitivity and specificity when separating real from fake edges in a given test set, yet this is just one piece of the real-world link prediction problem.
It is common practice to abstract social networks as graphs and develop highly general methods for characterizing them. Unsurprisingly, few of these methods work best “out of the box.” Networks possess different underlying dynamics of growth and attachment, while their edges can symbolize many forms of connections. The strength of our approach came not just from the breadth and depth of the individual methods, but also from empirically testing variations and permutations of those methods on the Flickr graph. Meta methods provided some of the best performing features in our approach, illustrating the importance of capturing a taxonomy of connections in otherwise identical graph edges.

REF: https://ieeexplore.ieee.org/document/6033365
