medialab / bibliotools3.0
Modification of bibliotools 2.2 from Sébastian Grauwin
License: Apache License 2.0
Develop a minimal user-interface for the script, notably to:
Currently the script works on ISI Web of Science data.
It would be interesting to develop a parser for Scopus (http://www.scopus.com) as well
In the config of the script it is possible to define two different thresholds for each type of node:
The script does not distinguish between
In particular:
It would save a lot of time if the default visualisation of the networks were automated (scripted with the Gephi Toolkit, Sigma or another tool).
In particular, these are the operations to script:
SPATIALISATION
LinLog Mode (yes)
Scaling (0.02)
Prevent Overlap (no)
gravity (1)
COLORS (as named in Gephi)
reference: light grey
institutions: olivedrab
authors: gold
keywords: hot pink
subjects: dark orange
countries: powder blue
RANKING (according to the occurrence count)
5-150
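As a starting point for scripting these defaults, the settings above could be kept in one place and applied to the node list before export. The following is only a sketch: the names `LAYOUT`, `NODE_COLORS`, `node_size` and `style_nodes` are hypothetical, the hex codes assume that the Gephi colour names match the standard CSS/X11 names, and the 5-150 ranking is assumed to be a linear mapping of the occurrence count.

```python
# Hypothetical names throughout; only the values come from the list above.
LAYOUT = {
    "linlog_mode": True,
    "scaling": 0.02,
    "prevent_overlap": False,
    "gravity": 1,
}

# Assumption: Gephi colour names correspond to the CSS/X11 named colours.
NODE_COLORS = {
    "reference": "#D3D3D3",     # light grey
    "institutions": "#6B8E23",  # olivedrab
    "authors": "#FFD700",       # gold
    "keywords": "#FF69B4",      # hot pink
    "subjects": "#FF8C00",      # dark orange
    "countries": "#B0E0E6",     # powder blue
}

SIZE_MIN, SIZE_MAX = 5, 150


def node_size(occ, occ_min, occ_max):
    """Linearly map an occurrence count onto the 5-150 size range."""
    if occ_max == occ_min:
        # All nodes have the same count: fall back to the smallest size.
        return float(SIZE_MIN)
    ratio = (occ - occ_min) / (occ_max - occ_min)
    return SIZE_MIN + ratio * (SIZE_MAX - SIZE_MIN)


def style_nodes(nodes):
    """Return copies of the nodes (dicts with 'type' and 'occ')
    with 'color' and 'size' attributes added."""
    if not nodes:
        return []
    counts = [n["occ"] for n in nodes]
    lo, hi = min(counts), max(counts)
    return [
        dict(n, color=NODE_COLORS[n["type"]], size=node_size(n["occ"], lo, hi))
        for n in nodes
    ]
```

The styled attributes could then be written into the exported graph file so that the maps open with the default look, leaving only the layout itself to be run in Gephi.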
Blacklist
Users often want to exclude some entities from all the extracted graphs. A typical case is the exclusion of the keywords that were present in the original query and are therefore, by construction, connected to almost all the items in the graph.
It would save users a lot of time to be able to define these entities once and for all in a blacklist to be applied to all the extracted graphs.
Whitelist
Researchers or experts may want to include in the maps some items that would be excluded by the selected thresholds. It would therefore be useful to be able to 'impose' some items on the networks.
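A minimal sketch of how the blacklist and whitelist could interact with the occurrence threshold is given below. The function name `select_nodes` and its parameters are hypothetical and do not exist in the script. The precedence rule (a blacklisted item is dropped even if it is also whitelisted) is an assumption that would need to be agreed on.

```python
def select_nodes(occurrences, threshold, blacklist=(), whitelist=()):
    """Return the set of labels to keep.

    occurrences: dict mapping a node label to its occurrence count.
    threshold:   minimal occurrence count for a node to be kept.
    blacklist:   labels that are always excluded.
    whitelist:   labels that are kept even below the threshold.
    """
    blacklist = set(blacklist)
    whitelist = set(whitelist)
    kept = set()
    for label, occ in occurrences.items():
        if label in blacklist:
            # Assumption: the blacklist wins over the whitelist.
            continue
        if occ >= threshold or label in whitelist:
            kept.add(label)
    return kept
```

For example, with `{"climate": 500, "co2": 40, "albedo": 2}`, a threshold of 10, `"climate"` blacklisted as a query keyword and `"albedo"` whitelisted, the call keeps `"co2"` and `"albedo"`.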
Connect the Paul-optimised version of the script with Sebastien's algorithm to detect and analyse the communities (see http://sebastian-grauwin.com/BIBLIOMAP/ and click on the nodes).
NB. this would be super helpful for the CO2 project
It would be nice to have the titles of the journals appearing in the corpus as a new type of node (apparently the parser is able to parse them, but they are not exploited).
While analysing them with the PhD students from Leuven, we found something odd in the maps you sent me last Thursday.
The references are laid out well, but there is a problem with the metadata.
On quite a few of the maps (notably those of the last time slices), almost all the metadata nodes end up in the centre (because they are uniformly linked to all parts of the reference graph).
At first I thought it was a problem with the thresholds, but then we ran the extraction of the same maps (with the same thresholds) on Kari's computer and the problem does not appear (on the contrary, the metadata are laid out where one would expect to find them).
Since the problem seems to affect only the last time slices, I wonder whether it might be due to the parallelisation.
Extracting comparable networks (in the sense of having roughly the same number of nodes) from time-spans containing very different numbers of bibliographical notices requires setting different filtering thresholds: lower for time-spans containing fewer nodes, higher for time-spans containing more nodes.
A more systematic way of doing this may be to use the average and the quartiles.
Instead of filtering all the nodes with an occurrence count ("occ") lower than N or the edges with a weight ("weight") lower than N, we could filter all the nodes and edges with an occurrence count or weight lower than the average (or the 1st quartile or the 3rd quartile).
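The idea can be sketched with Python's standard `statistics` module. The function names `adaptive_threshold` and `filter_by_rule` and the `rule` parameter are hypothetical. The "inclusive" quantile method is one possible choice among several, and quartiles need at least two values on older Python versions.

```python
import statistics


def adaptive_threshold(values, rule="mean"):
    """Compute a cutoff from the distribution of occurrence counts or weights.

    rule: "mean", "q1" (1st quartile) or "q3" (3rd quartile).
    """
    if rule not in ("mean", "q1", "q3"):
        raise ValueError("rule must be 'mean', 'q1' or 'q3'")
    values = list(values)
    if rule == "mean":
        return statistics.mean(values)
    # quantiles(n=4) returns the three cut points Q1, median, Q3.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1 if rule == "q1" else q3


def filter_by_rule(items, rule="mean"):
    """Keep the entries of `items` (label -> occ or weight) whose value
    is not lower than the cutoff computed for the chosen rule."""
    cutoff = adaptive_threshold(items.values(), rule)
    return {label: value for label, value in items.items() if value >= cutoff}
```

Because the cutoff is derived from each time-span's own distribution, the same rule (for instance "q3") can be reused across time-spans of very different sizes. Whether it actually produces networks of comparable size would still have to be checked on real corpora.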