influencemap's Introduction

Influencemap Project @ ANU

Influence in academic communities has been an area of interest for researchers. This can be seen in the popularity of applications like Google Scholar and in the various metrics created for ranking papers, authors, conferences, etc.

We aim to provide a visualisation tool which allows users to easily search and visualise the flow of academic influence. Our visualisation maps influence in the form of an influence flower. We calculate influence as a function of the number of citations between two entities (see the Influence section below for our definition).

The node in the centre of the flower denotes the ego entity, the entity with respect to which influence is measured. The leaf nodes are the most influential entities with respect to the ego. (We define the ego as a collection of papers. If the ego is an author, it is the collection of papers that the author has authored.)

Each edge of the graph signifies the flow of influence to or from the ego node, and the strength of this relation is reflected in the thickness of the edge. The red edges denote the influence the ego has on the outer entities (an outer entity citing a paper by the ego). The blue edges denote the influence the outer entities have on the ego (the ego citing a paper by one of the outer entities).

The colour of the outer nodes signifies the ratio of influence in and out. A blue node indicates that the associated entity has influenced the ego more than the ego has influenced it. Likewise, a red node indicates that the ego has influenced the node's entity more than that entity has influenced the ego.
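The colour rule can be read as a comparison of the two influence scores attached to an outer node. The following is a minimal sketch of that reading, not the project's rendering code: the function name `node_colour` and the `"neutral"` result for ties are assumptions made for illustration.

```python
def node_colour(influence_on_ego, influence_by_ego):
    """Classify an outer node from its two influence scores.

    influence_on_ego: how much the outer entity has influenced the ego (blue edge).
    influence_by_ego: how much the ego has influenced the outer entity (red edge).
    """
    if influence_on_ego > influence_by_ego:
        return "blue"     # the entity has influenced the ego more
    if influence_by_ego > influence_on_ego:
        return "red"      # the ego has influenced the entity more
    return "neutral"      # assumption: the text does not say how ties are drawn


print(node_colour(1.25, 0.5))  # blue
print(node_colour(0.5, 1.25))  # red
```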

We define two entities to be coauthors if the entities have contributed to the same paper. Coauthors of the ego are signified by nodes with greyed out names.

Citation Information

Minjeong Shin, Alexander Soen, Benjamin T. Readshaw, Stephen M. Blackburn, Mitchell Whitelaw, Lexing Xie. Influence Flowers of Academic Entities. IEEE Conference on Visual Analytics Science & Technology (VAST), 2019

Influence

To quantify academic influence, we define influence as a function of paper citations. Each citation which the ego is a part of contributes to the overall influence map of the ego. To prevent papers with a large number of contributing entities from creating an overwhelming amount of influence, we normalise the influence contribution by the number of entities in the cited paper.

For example, consider the following four-paper database, where the only entities we consider are authors.

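| Name         | Paper            | No. of authors | Cites papers                  |
|--------------|------------------|----------------|-------------------------------|
| John Smith   | Algorithms       | 2              | [Linear Algebra]              |
| John Smith   | Machine Learning | 3              | [Linear Algebra, Computation] |
| Maria Garcia | Linear Algebra   | 2              | None                          |
| Maria Garcia | Computation      | 4              | [Algorithms]                  |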

In this case, John's influence on Maria is 0.5 (John's paper Algorithms has a weight of 0.5 and was cited once by Maria).

On the other hand, Maria's influence on John is 1.25 (Linear Algebra has a weight of 0.5 and was cited twice by John; Computation has a weight of 0.25 and was cited once by John).
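The arithmetic in this example can be reproduced with a few lines of Python. This is a minimal sketch under the definition given above (each citation contributes 1 divided by the number of authors of the cited paper); the data layout and the function name `pairwise_influence` are illustrative assumptions, not the project's code.

```python
from collections import defaultdict

# The four-paper example database. Only the named author of each paper is
# tracked as an entity; "n_authors" is the paper's total author count.
PAPERS = {
    "Algorithms":       {"entities": ["John Smith"],   "n_authors": 2,
                         "cites": ["Linear Algebra"]},
    "Machine Learning": {"entities": ["John Smith"],   "n_authors": 3,
                         "cites": ["Linear Algebra", "Computation"]},
    "Linear Algebra":   {"entities": ["Maria Garcia"], "n_authors": 2,
                         "cites": []},
    "Computation":      {"entities": ["Maria Garcia"], "n_authors": 4,
                         "cites": ["Algorithms"]},
}


def pairwise_influence(papers):
    """Return influence[a][b], the influence of entity a on entity b."""
    influence = defaultdict(lambda: defaultdict(float))
    for citing in papers.values():
        for cited_title in citing["cites"]:
            cited = papers[cited_title]
            # Normalise by the number of authors of the cited paper.
            weight = 1.0 / cited["n_authors"]
            for influencer in cited["entities"]:
                for influenced in citing["entities"]:
                    influence[influencer][influenced] += weight
    return influence


influence = pairwise_influence(PAPERS)
print(influence["John Smith"]["Maria Garcia"])  # 0.5
print(influence["Maria Garcia"]["John Smith"])  # 1.25
```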

We aggregate the pairwise influence of entities associated with the papers of the ego to generate the nodes of a flower. Each flower's outer nodes can be a collection of several types of entities. In our influence flower application, we present four different flower types:

  1. Author outer nodes
  2. Venue (conferences or journals) outer nodes
  3. Author Affiliation outer nodes
  4. Paper topic outer nodes

Filtering self-citations

We define self-citation between a citing paper and a cited paper as a relation dependent on the ego. A paper citation is a self-citation if both papers have the ego as an author (or, for other ego types, as a venue, an institution, or a topic).
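As an illustration of this definition, a self-citation check only needs the set of entities attached to each side of a citation. The sketch below assumes a hypothetical record layout (a pair of entity sets per citation) and hypothetical names such as "A. Coauthor"; it is not taken from the project's code.

```python
def is_self_citation(citing_entities, cited_entities, ego):
    """A citation is a self-citation when the ego is attached to both papers."""
    return ego in citing_entities and ego in cited_entities


# Hypothetical citation records:
# (entities on the citing paper, entities on the cited paper).
citations = [
    ({"John Smith", "A. Coauthor"}, {"John Smith"}),    # ego on both sides
    ({"John Smith"}, {"Maria Garcia"}),                 # ego cites someone else
    ({"Maria Garcia"}, {"John Smith", "A. Coauthor"}),  # someone else cites the ego
]

ego = "John Smith"
kept = [c for c in citations if not is_self_citation(c[0], c[1], ego)]
print(len(kept))  # 2
```

The same check works for venue, institution, or topic egos if the entity sets hold venues, institutions, or topics instead of author names.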

Filtering co-contributors

With co-contributors filtered out, the Influence Flower is able to capture less obvious influence from outside one’s co-author network. We define two entities to be co-contributors if the entities have contributed to the same paper. For the venue type entity, co-contribution indicates that the ego has published a paper in the venue. For the topic type entity, it means that the ego has written a paper on the topic. Co-contributors of the ego are indicated by nodes with greyed out names.
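A minimal sketch of how co-contributors could be identified for an author ego follows. It assumes each paper is available as a set of contributing entities; the names, the data, and the function `co_contributors` are hypothetical and only illustrate the definition above.

```python
def co_contributors(ego, paper_entities):
    """Return every entity that shares at least one paper with the ego."""
    shared = set()
    for entities in paper_entities:
        if ego in entities:
            shared |= set(entities) - {ego}
    return shared


# Hypothetical papers, each given as its set of contributing entities.
papers = [
    {"John Smith", "Wei Chen"},
    {"John Smith", "Wei Chen", "Ana Lopez"},
    {"Maria Garcia", "Ana Lopez"},
]

greyed = co_contributors("John Smith", papers)

# Outer nodes that are co-contributors would be drawn with greyed out names,
# or dropped entirely when the co-contributor filter is switched on.
for name in ["Wei Chen", "Maria Garcia", "Ana Lopez"]:
    label = "greyed out" if name in greyed else "normal"
    print(f"{name}: {label}")
```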

Other candidate definitions of influence

We have described influence as the sum of citations from one person (or venue or affiliation) to another, weighted by the number of authors in the cited paper. Similar methods, using combinations of different weighting schemes, were considered early on in the project. We looked at the eight combinations obtained by switching each of three weightings on or off:

  1. Weighting by the number of authors on the citing paper;
  2. Weighting by the number of authors on the cited paper; and
  3. Weighting by the number of papers referenced by the citing paper.

Due to the lack of a ground truth value of influence to compare these definitions to, we evaluated the eight combinations of these weightings empirically by discussing with researchers which of the definitions produced flowers that most accurately reflected their opinions of who they have influenced and been influenced by.
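The eight candidate definitions can be enumerated as on/off choices over the three weightings. The sketch below assumes that each enabled weighting divides the citation's contribution by the corresponding count and that enabled weightings multiply together; the text does not spell out how they were combined, so treat this as one plausible reading. The function name and the sample counts are hypothetical.

```python
from itertools import product


def citation_weight(n_citing_authors, n_cited_authors, n_references,
                    by_citing_authors, by_cited_authors, by_references):
    """Weight of one citation under a chosen combination of the three weightings."""
    weight = 1.0
    if by_citing_authors:
        weight /= n_citing_authors   # weighting 1: authors on the citing paper
    if by_cited_authors:
        weight /= n_cited_authors    # weighting 2: authors on the cited paper
    if by_references:
        weight /= n_references       # weighting 3: references in the citing paper
    return weight


# Hypothetical citation: 2 citing authors, 4 cited authors, 10 references.
for flags in product([False, True], repeat=3):
    print(flags, citation_weight(2, 4, 10, *flags))
```

The combination `(False, True, False)` corresponds to the definition used in the influence flower, which weights only by the number of authors on the cited paper.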

Other definitions of influence which have not yet been explored with this data include existing measures for node centrality in graphs. By using citation data from MAG to define a directed graph where nodes represent authors, venues or affiliations, and edges are derived from citations between nodes, we could explore using metrics such as closeness, betweenness and eigenvector centrality. These metrics are more appropriate for defining the influence of an entity relative to the whole network.
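To make the idea concrete, here is a small pure-Python sketch of one such metric, closeness centrality, on a toy directed graph. Everything in it is an assumption for illustration: the graph is made up, edges point from the cited entity to the citing entity, and closeness is computed over outgoing paths with the scaling (reached / (n - 1)) * (reached / total distance). It has not been applied to MAG data.

```python
from collections import deque


def out_closeness(graph, source):
    """Closeness of `source` over outgoing paths in an unweighted directed graph."""
    # Breadth-first search gives shortest path lengths from the source.
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    reached = len(dist) - 1          # nodes reachable from the source
    if reached == 0:
        return 0.0
    total = sum(dist.values())       # sum of shortest path lengths
    n = len(graph)
    # Scale by the fraction of the graph that is reachable, so a node that
    # reaches only one neighbour does not outrank a widely connected one.
    return (reached / (n - 1)) * (reached / total)


# Hypothetical graph: an edge X -> Y means "Y cites X", i.e. X influences Y.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["D"], "D": []}

for entity in graph:
    print(entity, out_closeness(graph, entity))
```

Unlike the pairwise scores used in the flower, a value like this depends on the whole graph, which is why such metrics suit network-wide rankings better than ego-centred views.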

Data

We use the Microsoft Academic Graph (MAG) dataset for our visualisation. The dataset is a large curated collection of publications indexed by Bing. From MAG, we use the following fields of the paper entries in the dataset:

  • Citation links
  • Authors
  • Conferences
  • Journals
  • Author Affiliations

The current influencemap is based on a MAG snapshot from 2021-12-06. As MAG has been deprecated by Microsoft, we are working on replacing the data source with a combination of Semantic Scholar and OpenAlex. The new update will be released soon and will include later publications.

influencemap's People

Contributors

alexandersoen, bennoxxx, csmetrics, davidzhang73, lexingxie, nbgl, shinminjeong, steveblackburn

influencemap's Issues

Gather data for a person's publications

  1. document how to get the papers authored by one person in MAG
  2. gather a few examples and see how "clean" the data is (e.g. Brian Schmidt, Steve Blackburn, Brian Anderson?)
  3. figure out how to address these (merging? cross check with DBLP)

Flower graph for a person

Generating a flower graph for a person (e.g. Brian Schmidt)
This breaks down into several separate, smaller issues.

Display for comment system

Integration of comments with flower.
Have a:

  • Separate page?
  • Table below flower

Also have the ability to add and delete comments depending on user credentials.

Test webhook

This issue is for testing if the webhook works.

show reference over time

probably plot the publication year of the papers being published.
e.g. minjeong started being influenced by visualisation field (e.g. published in 1970s) from 2017

issues: cluttering the display
how much is the value of this info?

Internal server error

Hello, I can no longer search or create new flowers. I think there's a problem with the server as I also get an internal server error whenever I click on a profile.

Enhance search bar auto complete

  • Allow the search bar to search for and display non authors as acronyms
  • Increase the size of the option display box
  • See if there are any possible efficiency improvements

Comment system

Need a system to generate comments.
Databased into ES (Django sqlite?).
Have ids based on:

  • Person commenting
  • Flower commenting on (including the current properties the flower is set on)

Fix progressbar

Fix progress bar to

  • correctly return the progress percentage
  • show between every step
  • disappear as soon as the result comes out

More user input to select flower nodes

  1. Intermediary step between scoring and drawing flower to display statistics and scores. The user can then select the nodes to populate the flower here.

  2. Allow node deletion in interactive flower. Flower re-populates automatically.

Node information

  • Change Information boxes to be accessed by hovering.
  • Add venue names
  • More paper information (Authors + Affiliations)

discussion: computing influence scores 2017-12-15

TODO

  • option to separate co-authors flower vs everyone else flower ✓
  • option to exclude self-citations ✓

REVISIT LATER

  • normalize citation credit made by one paper? 1/(# of references)

  • weight citation edge by the number of citations that the citing paper has?
    (this has a downside of discounting more recent papers)

    • are there any ideas for weighting papers by age?

Multiple user handling

Explore how to allow multiple users to access the website.
Currently uses Django sessions, but this has proven to be unreliable.

Integrate login service

Aim is to have a login service to moderate the addition and deletion of flower comments.
Need to explore the different options:

  • Django ~ Disqus
  • Login tokens
  • Own system
